Invoice data extraction with LlamaParse and OpenAI — n8n Workflow

High complexity · Trigger · 17 nodes · 🏷️ Invoice Processing · 👁 37,004 views · by Jimleuk

Overview

This n8n workflow automates parsing and extracting data from PDF invoices. With it, accounts and finance teams can save significant time and cost.

Read the Blog: https://blog.n8n.io/how-to-extract-data-from-pdf-to-excel-spreadsheet-advance-parsing-with-n8n-io-and-llamaparse/

How it works

This workflow watches an email inbox for incoming invoices from suppliers. It downloads the attached PDFs and processes them through a third party…

Nodes used

Google Sheets, HTTP Request, Gmail, Basic LLM Chain, OpenAI Model, Structured Output Parser

Workflow Preview

1. Watch for Invoice Emails
Read more about Gmail Triggers
The Gmail node can watch for all incoming messages…
2. Advanced PDF Processing with LlamaParse
Read more about using HTTP Requests
LlamaIndex's LlamaCloud is a cloud…
3. Use LLMs to Extract Values from Data
Read more about Basic LLM Chain
Large language models are…
🙋‍♂️ Why not just use the built-in PDF conve…
A common issue with PDF-to-text convertors is that the…
4. Add Label to Avoid Duplication
Read more about working with Gmail
To finish off the workflow, we'll add the "invoice…
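The labelling step can be read as a simple idempotency check: an email is processed only if it has a PDF attachment and does not yet carry the "invoice synced" label. The following is a minimal sketch of that idea, not the workflow's actual node logic. The `Email` class and both function names are hypothetical and exist only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Label name taken from the 'Add "invoice synced" Label' node.
SYNCED_LABEL = "invoice synced"


@dataclass
class Email:
    """Hypothetical, simplified view of a Gmail message (illustration only)."""
    subject: str
    label_names: list[str] = field(default_factory=list)
    attachment_names: list[str] = field(default_factory=list)


def should_process(email: Email) -> bool:
    """Return True for emails that have a PDF and are not yet labelled."""
    already_synced = SYNCED_LABEL in email.label_names
    has_pdf = any(name.lower().endswith(".pdf") for name in email.attachment_names)
    return has_pdf and not already_synced


def mark_synced(email: Email) -> None:
    """Add the label after successful processing so later runs skip the email."""
    if SYNCED_LABEL not in email.label_names:
        email.label_names.append(SYNCED_LABEL)
```

Because the label is only added after processing succeeds, a failed run leaves the email eligible for the next attempt.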
Try Me Out!
This workflow does the following:
* Waits for email invoices with PDF attachments.
* Uses the LlamaParse service to convert the invoice PD…
* Uses an LLM to ex…
Need more attributes?
Change it here!
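To make the idea of a structured output concrete, here is a minimal sketch of checking an LLM's JSON reply against a fixed list of attributes before it is written to a sheet. This is an assumption-laden illustration, not the workflow's parser: the field names in `INVOICE_FIELDS` and the function `parse_invoice_fields` are hypothetical, since the template's real attribute list is not shown here.

```python
import json

# Hypothetical attribute list; the template's real fields are not shown on this page.
INVOICE_FIELDS = {
    "invoice_number": str,
    "invoice_date": str,
    "supplier_name": str,
    "total": float,
}


def parse_invoice_fields(llm_output: str) -> dict:
    """Parse the model's JSON reply and check it against INVOICE_FIELDS."""
    data = json.loads(llm_output)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    row = {}
    for name, expected_type in INVOICE_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        value = data[name]
        # Accept whole numbers where a float is expected (e.g. a total of 120).
        if expected_type is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, expected_type):
            raise ValueError(f"field {name!r} should be {expected_type.__name__}")
        row[name] = value
    return row
```

Adding an attribute then means adding one entry to the field list, which mirrors the "Need more attributes?" note.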
🚨Required
* Set Your Google Sheet URL here
* Set the Name of your Sheet
Don't use GSheets?
Swap this for Excel, Airtable or a Database!
🚨Required
* Change the email filters here!
Nodes on the canvas:
* OpenAI Model
* Structured Output Parser
* Upload to LlamaParse
* Receiving Invoices
* Append to Reconciliation…
* Get Processing Status
* Wait to stay within serv…
* Is Job Ready?
* Add "invoice synced" Label
* Get Parsed Invoice Data
* Map Output
* Apply Data Extraction Ru…
* Should Process Email?
* Split Out Labels
* Get Labels Names
* Combine Label Names
* Email with Label Names

17 nodes, 18 edges
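The nodes "Upload to LlamaParse", "Get Processing Status", "Is Job Ready?", the Wait node and "Get Parsed Invoice Data" together suggest a submit-then-poll pattern. The sketch below shows that pattern in isolation, under stated assumptions: `wait_for_job` is a hypothetical helper, the status strings `"SUCCESS"` and `"ERROR"` are placeholders rather than confirmed API values, and the status lookup is passed in as a callable so no real endpoint is assumed.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[], str],
    *,
    ready_status: str = "SUCCESS",            # assumed placeholder value
    failed_statuses: frozenset = frozenset({"ERROR"}),  # assumed placeholder value
    poll_seconds: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll get_status() until the job is ready; return the number of polls made."""
    for attempt in range(1, max_attempts + 1):
        status = get_status()           # "Get Processing Status"
        if status == ready_status:      # "Is Job Ready?"
            return attempt
        if status in failed_statuses:
            raise RuntimeError(f"parse job failed with status {status!r}")
        sleep(poll_seconds)             # pause between polls, as the Wait node does
    raise TimeoutError(f"job not ready after {max_attempts} polls")
```

Capping the number of attempts keeps a stuck job from looping forever, and injecting `sleep` makes the loop easy to exercise without real delays.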

How it Works

  1. Trigger: the workflow starts with a Gmail Trigger.
  2. Process: data flows through 17 nodes, including Aggregate, Basic LLM Chain and Gmail.
  3. Output: the workflow completes its automation and delivers the result to the configured destination, a Google Sheet by default.

Node Details (17)

  1. Google Sheets (googleSheets)
  2. HTTP Request (httpRequest)
  3. Gmail (gmail)
  4. Basic LLM Chain (n8n-nodes-langchain.chainLlm)
  5. OpenAI Model (n8n-nodes-langchain.lmOpenAi)
  6. Structured Output Parser (n8n-nodes-langchain.outputParserStructured)

How to Import This Workflow

  1. Click the Download JSON button on the right to save the workflow file.
  2. Open your n8n instance. Go to Workflows → New → Import from file.
  3. Select the downloaded invoice-data-extraction-with-llamaparse-and-openai file and click Import.
  4. Set up credentials for each service node (API keys, OAuth, etc.).
  5. Click Test Workflow to verify everything works, then activate it.

Or paste directly in n8n → Import from JSON:

{ "name": "Invoice data extraction with LlamaParse and OpenAI", "nodes": [...], ...}
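Before importing, it can be useful to sanity-check the downloaded file. The following is a small sketch that reads the workflow JSON and counts nodes by type. It assumes the top-level `name` and `nodes` keys shown above, and additionally assumes each node object carries a `type` key. `summarize_workflow` is a hypothetical helper, not part of n8n.

```python
import json
from collections import Counter


def summarize_workflow(raw_json: str) -> tuple[str, Counter]:
    """Return the workflow name and a count of nodes per type."""
    workflow = json.loads(raw_json)
    # Assumption: each entry in "nodes" is an object with a "type" key.
    node_types = Counter(
        node.get("type", "unknown") for node in workflow.get("nodes", [])
    )
    return workflow.get("name", ""), node_types
```

For this template, the per-type counts should add up to the 17 nodes listed on this page.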

Integrations

aggregate, chainllm, gmail, gmailtrigger, googlesheets, httprequest, if, lmopenai, merge, outputparserstructured, set, splitout, switch, wait


Created by

Jimleuk (@jimleuk)


