{
  "name": "Process large documents with OCR using SubworkflowAI and Gemini",
  "nodes": [
    {
      "id": "1ef38f7c-61dd-419f-9b6d-c7ed5224d993",
      "name": "Wait",
      "type": "n8n-nodes-base.wait",
      "position": [
        288,
        32
      ]
    },
    {
      "id": "7a2fbe88-a859-4143-b96d-de8d213d1006",
      "name": "Check Job Status",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        128,
        32
      ]
    },
    {
      "id": "87fe7e38-78ec-4f3b-a7e8-331af339dc04",
      "name": "Get Dataset Items",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        832,
        -128
      ]
    },
    {
      "id": "93ecd55d-9f01-4c6a-acb7-94184ad54781",
      "name": "Get Dataset",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        608,
        -128
      ]
    },
    {
      "id": "1e9fa5b0-ca47-4d44-8818-a7251173346f",
      "name": "When clicking ‘Execute workflow’",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [
        -736,
        32
      ]
    },
    {
      "id": "54c34849-bbbb-41e2-a182-1ec021939d5b",
      "name": "Download file",
      "type": "n8n-nodes-base.googleDrive",
      "position": [
        -544,
        32
      ]
    },
    {
      "id": "39367d95-269a-47ed-a23f-4635e4d1fa4f",
      "name": "Split Out",
      "type": "n8n-nodes-base.splitOut",
      "position": [
        1232,
        -128
      ]
    },
    {
      "id": "2b1ac563-43c5-4379-8ce2-d06f14897e42",
      "name": "Sticky Note",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -832,
        -160
      ],
      "parameters": {
        "width": 672,
        "height": 416,
        "content": "### 1. Upload Binary File to Extract API\n[Extract API Documentation](https://docs.subworkflow.ai/api-reference/post-v1-extract)\n\nOur workflow starts with uploading our document to the SubworkflowAI se"
      }
    },
    {
      "id": "6e6fd235-f260-4733-a31f-54f04ae3efc3",
      "name": "Extract API",
      "type": "n8n-nodes-base.httpRequest",
      "position": [
        -352,
        32
      ]
    },
    {
      "id": "3c7d2ca3-8496-4136-82a9-db84baefa658",
      "name": "Sticky Note1",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -128,
        -160
      ],
      "parameters": {
        "width": 608,
        "height": 416,
        "content": "### 2. Poll for Async \"Extract\" Job to Complete\n[Jobs API Documentation](https://docs.subworkflow.ai/api-reference/get-v1-jobs-id)\n\nWhilst the extract API is busy with our file, the \"job\" record assoc"
      }
    },
    {
      "id": "a0cfd795-01c0-4e10-ac86-497b300899e0",
      "name": "Sticky Note2",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        512,
        -368
      ],
      "parameters": {
        "width": 544,
        "height": 432,
        "content": "### 3. Fetch Resulting Dataset and Get Dataset Items\n[Datasets API Documentation](https://docs.subworkflow.ai/api-reference/get-v1-datasets)\n\nOnce the extract process is done, we can safely access the"
      }
    },
    {
      "id": "1ede59b6-24af-4a33-bac4-50bc21d60a2f",
      "name": "Sticky Note4",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1072,
        -368
      ],
      "parameters": {
        "width": 640,
        "height": 432,
        "content": "### 4. Example VLM Use-Case - Document OCR\n[Learn more about the Gemini node](https://docs.n8n.io/integrations/builtin/app-nodes/n8n-nodes-langchain.googlegemini/)\n\nFinally, the DatasetItems API provi"
      }
    },
    {
      "id": "273556be-9941-46ea-985e-94795949a741",
      "name": "Job Complete?",
      "type": "n8n-nodes-base.if",
      "position": [
        -48,
        32
      ]
    },
    {
      "id": "65566edc-f596-4aae-ae57-37393d41f7b9",
      "name": "Document OCR via VLM",
      "type": "@n8n/n8n-nodes-langchain.googleGemini",
      "position": [
        1440,
        -128
      ]
    },
    {
      "id": "00a67bf0-7b51-4bd6-bc2d-dbb25b1ce907",
      "name": "Sticky Note6",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        1744,
        -32
      ],
      "parameters": {
        "width": 416,
        "height": 96,
        "content": "### Next Steps\nLet's next check out the Search API!\nhttps://docs.subworkflow.ai/api-reference/post-v1-search"
      }
    },
    {
      "id": "780ae573-6d15-4cbc-8c5f-0e1893c6a9bc",
      "name": "Sticky Note7",
      "type": "n8n-nodes-base.stickyNote",
      "position": [
        -1360,
        -784
      ],
      "parameters": {
        "width": 480,
        "height": 1232,
        "content": "[![banner](https://cdn.subworkflow.ai/marketing/banner-300x100.png#full-width)](https://subworkflow.ai?utm=n8n)\n## Working with Large Documents In Your VLM OCR Workflow\n\nDocument workflows are popular"
      }
    }
  ],
  "connections": {
    "Wait": {
      "main": [
        [
          {
            "node": "Job Complete?",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Split Out": {
      "main": [
        [
          {
            "node": "Document OCR via VLM",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Extract API": {
      "main": [
        [
          {
            "node": "Job Complete?",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Get Dataset": {
      "main": [
        [
          {
            "node": "Get Dataset Items",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Download file": {
      "main": [
        [
          {
            "node": "Extract API",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Job Complete?": {
      "main": [
        [
          {
            "node": "Get Dataset",
            "type": "main",
            "index": 0
          }
        ],
        [
          {
            "node": "Check Job Status",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Check Job Status": {
      "main": [
        [
          {
            "node": "Wait",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Get Dataset Items": {
      "main": [
        [
          {
            "node": "Split Out",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "When clicking ‘Execute workflow’": {
      "main": [
        [
          {
            "node": "Download file",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}