Build a PDF Document RAG System with Mistral OCR, Qdrant and Gemini AI — n8n Workflow

High complexity · Trigger · 29 nodes · AI · 20,271 views · by Davide Boizza

Overview

This workflow processes PDF documents with Mistral's OCR, stores the extracted text in a Qdrant vector database, and enables Retrieval-Augmented Generation (RAG) for answering questions about those documents.

Once configured, the workflow automates document ingestion, vectorization, and intelligent querying, enabling powerful RAG applications.
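The OCR stage appears in the workflow as three nodes: Mistral Upload, Mistral Signed URL, and Mistral DOC OCR. A minimal sketch of the last two calls is given here. It only builds request descriptions and sends nothing. The base URL, endpoint paths, model name, and field names reflect one reading of Mistral's public API and are assumptions to verify against the current documentation; the helper names are hypothetical.

```python
import json

# Assumed base URL for the Mistral API; verify before use.
MISTRAL_BASE = "https://api.mistral.ai/v1"


def signed_url_request(file_id: str, api_key: str) -> dict:
    """Describe the GET call that asks for a temporary URL to an uploaded file."""
    return {
        "method": "GET",
        "url": f"{MISTRAL_BASE}/files/{file_id}/url",
        "headers": {"Authorization": f"Bearer {api_key}"},
    }


def ocr_request(document_url: str, api_key: str) -> dict:
    """Describe the POST call that runs OCR on the document behind a URL."""
    payload = {
        "model": "mistral-ocr-latest",  # assumed model identifier
        "document": {"type": "document_url", "document_url": document_url},
    }
    return {
        "method": "POST",
        "url": f"{MISTRAL_BASE}/ocr",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps(payload),
    }


def pages_to_text(ocr_response: dict) -> list:
    """Return the markdown of each page in page order (assumed response shape)."""
    pages = sorted(ocr_response.get("pages", []), key=lambda p: p.get("index", 0))
    return [p.get("markdown", "") for p in pages]
```

Keeping one text entry per page matches the per-page loop suggested by the Loop Over Items and Set page nodes, though the workflow's actual Code node may reshape the data differently.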

Benefits

End-to-End Automation: No manual interaction is needed; documents are read, processed, and made queryable.

Nodes used

HTTP Request, Google Drive, Code, Summarization Chain, Question and Answer Chain, Embeddings OpenAI, Vector Store Retriever, Token Splitter, Default Data Loader, Qdrant Vector Store, Google Gemini Chat Model

Workflow preview

STEP 1
Create Qdrant Collection
Change:
- QDRANTURL
- COLLECTION
STEP 2
Documents vectorization with Qdrant and Google Drive
Change:
- QDRANTURL
- COLLECTION
STEP 3
If you want a "light" and faster RAG with the main cont…
STEP 4
Test the RAG

Nodes on the canvas (29 nodes, 26 edges):
- Mistral Upload
- Mistral Signed URL
- Mistral DOC OCR
- When clicking ‘Test work…
- Loop Over Items
- Refresh collection
- Embeddings OpenAI
- Default Data Loader
- Token Splitter
- When chat message received
- Question and Answer Chain
- Google Gemini Chat Model
- Vector Store Retriever
- Qdrant Vector Store1
- Embeddings OpenAI1
- Code
- Wait
- Qdrant Vector Store
- Loop Over Items1
- Execute Workflow
- When Executed by Another…
- Edit Fields1
- Create collection
- Summarization Chain
- Google Gemini Chat Model1
- Set page
- Set summary
- Search PDFs
- Get PDF
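STEP 1 asks for two values, QDRANTURL and COLLECTION, before the collection is created. A minimal sketch of what such a request could look like against Qdrant's REST API follows. The vector size of 1536 and the Cosine distance are assumptions, not values taken from the workflow; the size has to match whatever embedding model is configured in the Embeddings OpenAI node. The function name is hypothetical, and the request is only built, not sent.

```python
import json
import urllib.request


def build_create_collection_request(qdrant_url, collection, api_key,
                                    vector_size=1536, distance="Cosine"):
    """Build (but do not send) a PUT request that creates a Qdrant collection.

    vector_size and distance are illustrative defaults, not workflow values.
    """
    body = {"vectors": {"size": vector_size, "distance": distance}}
    url = f"{qdrant_url.rstrip('/')}/collections/{collection}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json", "api-key": api_key},
    )


# Usage: replace the placeholders, then send with urllib.request.urlopen(req).
# req = build_create_collection_request("https://QDRANTURL", "COLLECTION", "YOUR_KEY")
```

The same two placeholders recur in STEP 2, so both steps must point at the same Qdrant instance and collection name.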

How it works

  1. Trigger

    The workflow starts with a trigger node.

  2. Processing

    Data flows through 29 nodes, connecting chainretrievalqa, chainsummarization, and chattrigger.

  3. Output

    The workflow completes its automation and delivers the result to the configured destination.
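The retrieval question-answering step (chainretrievalqa) can be pictured as two operations: rank stored chunks by similarity to the question's embedding, then hand the best chunks to the chat model as context. The sketch below illustrates that idea with plain Python and hand-made vectors. In the workflow the ranking is done by Qdrant and the embeddings come from OpenAI, so the function names and the prompt wording here are purely illustrative assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec, stored, k=3):
    """Return the texts of the k stored (text, vector) pairs closest to query_vec."""
    ranked = sorted(stored, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question, contexts):
    """Assemble a prompt that restricts the model to the retrieved context."""
    joined = "\n---\n".join(contexts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\nQuestion: {question}"
    )


# Toy data: three chunks with 2-dimensional stand-in embeddings.
chunks = [("invoice terms", [1.0, 0.0]), ("holiday policy", [0.0, 1.0]),
          ("payment deadline", [0.9, 0.1])]
best = top_k([1.0, 0.0], chunks, k=2)
prompt = build_prompt("When is payment due?", best)
```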

Node details (29)

  1. HTTP Request: httpRequest
  2. Google Drive: googleDrive
  3. Code: code
  4. Summarization Chain: n8n-nodes-langchain.chainSummarization
  5. Question and Answer Chain: n8n-nodes-langchain.chainRetrievalQa
  6. Embeddings OpenAI: n8n-nodes-langchain.embeddingsOpenAi
  7. Vector Store Retriever: n8n-nodes-langchain.retrieverVectorStore
  8. Token Splitter: n8n-nodes-langchain.textSplitterTokenSplitter
  9. Default Data Loader: n8n-nodes-langchain.documentDefaultDataLoader
  10. Qdrant Vector Store: n8n-nodes-langchain.vectorStoreQdrant
  11. Google Gemini Chat Model: n8n-nodes-langchain.lmChatGoogleGemini
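The Token Splitter node cuts the OCR text into overlapping chunks before the chunks are embedded. The sketch below shows the general idea of overlapping windows. It treats whitespace-separated words as tokens, which is a simplifying assumption: the real node relies on a model tokenizer, and the chunk size and overlap used in this workflow are not stated on this page, so the defaults here are hypothetical.

```python
def split_tokens(text, chunk_size=200, overlap=20):
    """Split text into chunks of at most chunk_size tokens, sharing `overlap` tokens.

    Tokens are approximated by whitespace-separated words (an assumption).
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    tokens = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # the window already reached the end of the text
    return chunks


sample_chunks = split_tokens("a b c d e f g", chunk_size=4, overlap=1)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some tokens twice.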

How to import this workflow

  1. Click the Download JSON button on the right to save the workflow file.
  2. Open your n8n instance. Go to Workflows → New → Import from File.
  3. Select the downloaded file build-a-pdf-document-rag-system-with-mistral-ocr-qdrant-and-gemini-ai and click Import.
  4. Set up credentials for each service node (API keys, OAuth, etc.).
  5. Click Test workflow to check that everything works, then activate it.

Or paste directly into n8n → Import from JSON:

{ "name": "Build a PDF Document RAG System with Mistral OCR, Qdrant and Gemini AI", "nodes": [...], ...}
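Before importing, it can be useful to confirm that the downloaded file parses as JSON and contains the expected number of nodes. The sketch below assumes the export has a top-level "name" and a "nodes" array whose entries carry a "type" field, which matches the outline shown above; the function name and the sample document are hypothetical, and the elided snippet above is not valid JSON on its own.

```python
import json
from collections import Counter


def summarize_workflow(raw_json):
    """Parse an exported workflow and report its name, node count, and node types."""
    workflow = json.loads(raw_json)
    nodes = workflow.get("nodes", [])
    type_counts = Counter(node.get("type", "unknown") for node in nodes)
    return {
        "name": workflow.get("name"),
        "node_count": len(nodes),
        "types": dict(type_counts),
    }


# Hypothetical two-node sample, standing in for the real downloaded file.
sample = json.dumps({
    "name": "Demo",
    "nodes": [
        {"name": "Wait", "type": "n8n-nodes-base.wait"},
        {"name": "Code", "type": "n8n-nodes-base.code"},
    ],
})
summary = summarize_workflow(sample)
```

For the real file, a node count other than 29 would suggest a truncated or modified download.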

Integrations

chainretrievalqa, chainsummarization, chattrigger, code, documentdefaultdataloader, embeddingsopenai, executeworkflow, executeworkflowtrigger, googledrive, httprequest, lmchatgooglegemini, manualtrigger, retrievervectorstore, set, splitinbatches, textsplittertokensplitter, vectorstoreqdrant, wait


Created by

Davide Boizza (@n3witalia)


