Extract Invoice Data from PDFs to JSON with Gemini AI and XML Transformation - n8n Workflow
Overview
This n8n workflow converts PDF invoices into structured, ready-to-use JSON using AI and XML transformation, without writing any code.
How it works
- Upload form: The user uploads a PDF file.
- Text extraction: The PDF content is extracted as plain text.
- XML schema definition: A standard invoice structure is defined with fields such as:
  - Invoice number
  - Customer and issuer details
  - Items with description, quantity, and price
  - Totals and taxes
  - Bank account details
- AI (
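The XML-to-JSON step can be illustrated outside n8n with a minimal Python sketch. It assumes the AI step returns an invoice as XML. The tag names, the sample values, and the helper names (`element_to_value`, `invoice_xml_to_json`, `LIST_CONTAINERS`) are hypothetical and are not taken from the workflow itself.

```python
import json
import xml.etree.ElementTree as ET

# Made-up sample of what a filled-in invoice XML could look like.
# Tag names and values are assumptions for illustration only.
SAMPLE_XML = """
<invoice>
  <invoice_number>INV-001</invoice_number>
  <customer><name>Acme Corp</name></customer>
  <issuer><name>Example Supplier</name></issuer>
  <items>
    <item><description>Widget</description><quantity>2</quantity><price>10.50</price></item>
    <item><description>Gadget</description><quantity>1</quantity><price>5.00</price></item>
  </items>
  <totals><subtotal>26.00</subtotal><tax>5.46</tax><total>31.46</total></totals>
  <bank_account>XX00-PLACEHOLDER</bank_account>
</invoice>
"""

# Containers whose children should always become a JSON array,
# even when only one child is present (assumed convention).
LIST_CONTAINERS = {"items"}


def element_to_value(elem):
    """Recursively convert an XML element to a string, list, or dict."""
    children = list(elem)
    if not children:
        # Leaf element: keep the text as a string (no type coercion here).
        return (elem.text or "").strip()
    if elem.tag in LIST_CONTAINERS:
        return [element_to_value(child) for child in children]
    return {child.tag: element_to_value(child) for child in children}


def invoice_xml_to_json(xml_text):
    """Parse invoice XML and return it as a formatted JSON string."""
    root = ET.fromstring(xml_text.strip())
    return json.dumps({root.tag: element_to_value(root)}, indent=2)


print(invoice_xml_to_json(SAMPLE_XML))
```

All values stay as strings in this sketch. Converting quantities and prices to numbers would be a separate step.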
How it Works
- 1. Trigger: The workflow starts with a form trigger, where the user uploads a PDF.
- 2. Process: Data flows through 6 nodes, including extractFromFile, formTrigger, and googleGemini.
- 3. Output: The workflow completes and delivers the resulting JSON to the configured destination.
Node Details (6)
Google Gemini
n8n-nodes-langchain.googleGemini
How to Import This Workflow
- 1. Click the Download JSON button on the right to save the workflow file.
- 2. Open your n8n instance. Go to Workflows → New → Import from file.
- 3. Select the downloaded extract-invoice-data-from-pdfs-to-json-with-gemini-ai-and-xml-transformation file and click Import.
- 4. Set up credentials for each service node (API keys, OAuth, etc.).
- 5. Click Test Workflow to verify everything works, then activate it.
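Before importing, the downloaded file can be inspected to see which nodes it contains. The following Python sketch assumes the export is a JSON object with a top-level `nodes` array whose entries carry a `type` field. The function name `summarize_workflow` and the sample export are made up for illustration.

```python
import json
from collections import Counter


def summarize_workflow(workflow_json):
    """Return (node_count, Counter of node types) for a workflow export."""
    workflow = json.loads(workflow_json)
    nodes = workflow.get("nodes", [])
    types = Counter(node.get("type", "unknown") for node in nodes)
    return len(nodes), types


# Tiny made-up export for illustration; a real file has more fields per node.
SAMPLE_EXPORT = json.dumps({
    "name": "example",
    "nodes": [
        {"name": "Upload form", "type": "n8n-nodes-base.formTrigger"},
        {"name": "Extract text", "type": "n8n-nodes-base.extractFromFile"},
    ],
    "connections": {},
})

count, types = summarize_workflow(SAMPLE_EXPORT)
print(count, dict(types))
```

For a real file, the same function can be called on the file's contents, read with `open(path, encoding="utf-8").read()`, where `path` is wherever the download was saved.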
Or paste directly in n8n → Import from JSON:
Created by
Mauricio Perera
@rckflr