{
  "name": "Simple eval for legal benchmarking",
  "nodes": [
    {
      "id": "5c25edcc-987c-4b60-b472-6c476dc866c6",
      "name": "When clicking ‘Test workflow’",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [
        -580,
        440
      ]
    },
    {
      "id": "60521a1e-3f00-4dd6-9651-a5f9ae1659ab",
      "name": "Loop Over Items",
      "type": "n8n-nodes-base.splitInBatches",
      "position": [
        -20,
        -180
      ]
    },
    {
      "id": "8e0fd8c4-51a1-44fb-8524-034df4fd4fe5",
      "name": "Wait",
      "type": "n8n-nodes-base.wait",
      "position": [
        1520,
        340
      ]
    },
    {
      "id": "0e457acc-5047-43ed-977f-f08714498ffc",
      "name": "OpenRouter Chat Model",
      "type": "@n8n/n8n-nodes-langchain.lmChatOpenRouter",
      "position": [
        740,
        320
      ]
    },
    {
      "id": "46ce90ac-6b58-4770-a6bd-48fccefc8713",
      "name": "Extract from File",
      "type": "n8n-nodes-base.extractFromFile",
      "position": [
        500,
        20
      ]
    },
    {
      "id": "9e76d679-6f4b-419b-af36-771aba5b9e98",
      "name": "Google Drive",
      "type": "n8n-nodes-base.googleDrive",
      "position": [
        120,
        20
      ]
    },
    {
      "id": "8f2fc75f-ea5d-4700-b58b-b5f5e3e360e5",
      "name": "Get Tests",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        -420,
        440
      ]
    },
    {
      "id": "49b0cc43-30d3-4331-a608-4f51c9ac0efd",
      "name": "Structured Output Parser",
      "type": "@n8n/n8n-nodes-langchain.outputParserStructured",
      "position": [
        900,
        320
      ]
    },
    {
      "id": "cf3cce00-33c4-4bcd-a2a2-75d7e49439d3",
      "name": "Update Results",
      "type": "n8n-nodes-base.googleSheets",
      "position": [
        1220,
        140
      ]
    },
    {
      "id": "80a9da50-93df-476a-a39f-ffece642a934",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -540,
        220
      ],
      "parameters": {
        "width": 360,
        "height": 180,
        "content": "## 1. Fetch test cases\nWe start by grabbing our list of test cases stored in a Google Sheet [here](https://docs.google.com/spreadsheets/d/10l_gMtPsge00eTTltGrgvAo54qhh3_twEDsETrQLAGU/edit?usp=sharing)"
      }
    },
    {
      "id": "2bba0cc9-460d-48c3-bcc6-a79c2b00b23e",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -80,
        -400
      ],
      "parameters": {
        "width": 260,
        "height": 200,
        "content": "## 2. Loop through our data\nWe use a loop here so that we can watch the workflow process each item one by one."
      }
    },
    {
      "id": "7ccb6426-e79e-4cdc-ac5f-59c022f8978d",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        200,
        -180
      ],
      "parameters": {
        "width": 360,
        "height": null,
        "content": "## 3. Grab the PDF as text\nWe download the PDF from the Google Drive link in the Google Sheet, extracting the file as text for the next step. We filter out any files that do not return data."
      }
    },
    {
      "id": "6dc17e29-ea6c-460b-b0e0-99cc41977891",
      "name": "Sticky Note3",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        700,
        -180
      ],
      "parameters": {
        "width": 360,
        "height": 280,
        "content": "## 4. Judge LLM outputs\nOur prompt judges the LLM input/output and decides if the LLM passed the test. We also ask for a reason why the judge made its decision, which we can use to refine our eval later."
      }
    },
    {
      "id": "1fd001a6-e67a-406c-b81c-c0a454ecd73e",
      "name": "LLM Judge",
      "type": "@n8n/n8n-nodes-langchain.chainLlm",
      "position": [
        740,
        140
      ]
    },
    {
      "id": "2e94c252-9973-4915-ab21-838681e7ea5c",
      "name": "Sticky Note4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1160,
        -80
      ],
      "parameters": {
        "width": 260,
        "height": 180,
        "content": "## 5. Update results\nWe create a new row in our output sheet, containing our original data together with the judge decision/reasoning."
      }
    },
    {
      "id": "05d7a5c0-3ad5-4ec1-ae3f-f533e16f8027",
      "name": "Sticky Note5",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1460,
        140
      ],
      "parameters": {
        "width": 220,
        "height": 180,
        "content": "## 6. Pause between runs\nWe wait half a second between runs so that we don't hit the OpenRouter API rate limit and get an error."
      }
    },
    {
      "id": "099b10f3-c9cf-4fa0-9a07-ccb73ea11542",
      "name": "Sticky Note6",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -600,
        640
      ],
      "parameters": {
        "width": 460,
        "height": 280,
        "content": "## Data format\nOur Tests Sheet contains the following columns:\n- ID: A unique identifier for each row\n- Test No.: The test that the LLM was given\n- AI Platform: The LLM that was given the test\n- Rele"
      }
    },
    {
      "id": "65532c6d-5638-432f-8f6a-a5a02b1aaf08",
      "name": "Is PDF?",
      "type": "n8n-nodes-base.if",
      "position": [
        -260,
        440
      ]
    },
    {
      "id": "17aa2c76-7ea7-41fe-84ed-4e485ef4db5e",
      "name": "Is a file?",
      "type": "n8n-nodes-base.if",
      "position": [
        300,
        20
      ]
    }
  ],
  "connections": {
    "Wait": {
      "main": [
        [
          {
            "node": "Loop Over Items",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Is PDF?": {
      "main": [
        [
          {
            "node": "Loop Over Items",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Get Tests": {
      "main": [
        [
          {
            "node": "Is PDF?",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "LLM Judge": {
      "main": [
        [
          {
            "node": "Update Results",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Is a file?": {
      "main": [
        [
          {
            "node": "Extract from File",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Loop Over Items",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Google Drive": {
      "main": [
        [
          {
            "node": "Is a file?",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Update Results": {
      "main": [
        [
          {
            "node": "Wait",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Loop Over Items": {
      "main": [
        [],
        [
          {
            "node": "Google Drive",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Extract from File": {
      "main": [
        [
          {
            "node": "LLM Judge",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "OpenRouter Chat Model": {
      "ai_languageModel": [
        [
          {
            "node": "LLM Judge",
            "type": "ai_languageModel",
            "index": 0
          }
        ]
      ]
    },
    "Structured Output Parser": {
      "ai_outputParser": [
        [
          {
            "node": "LLM Judge",
            "type": "ai_outputParser",
            "index": 0
          }
        ]
      ]
    },
    "When clicking ‘Test workflow’": {
      "main": [
        [
          {
            "node": "Get Tests",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}