API Schema Extractor: n8n workflow

High complexity · Trigger · 82 nodes · Engineering · 23,830 views · by Polina Medvedieva

Overview

This workflow automates the process of discovering and extracting APIs from various services, followed by generating custom schemas. It works in three distinct stages: research, extraction, and schema generation, with each stage tracking progress in a Google Sheet.
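
The stage tracking described above can be pictured as a small state table. The following is a minimal Python sketch, not the workflow's actual logic: the column names (`research`, `extract`, `generate`) and the status values (`pending`, `ok`, `error`) are assumptions made for illustration, and the rows stand in for lines of the Google Sheet.

```python
from typing import Dict, List, Optional

# Stage order taken from the overview; the column names are hypothetical.
STAGES = ["research", "extract", "generate"]


def next_stage(row: Dict[str, str]) -> Optional[str]:
    """Return the first stage that is not finished yet, or None if nothing can run."""
    for stage in STAGES:
        status = row.get(stage, "pending")
        if status == "error":
            return None  # a failed stage blocks the later ones
        if status != "ok":
            return stage
    return None  # every stage is already "ok"


def pending_for(rows: List[Dict[str, str]], stage: str) -> List[str]:
    """List the services whose next runnable stage is `stage`."""
    return [row["service"] for row in rows if next_stage(row) == stage]


# Illustrative rows standing in for the tracking sheet.
rows = [
    {"service": "alpha", "research": "ok", "extract": "pending", "generate": "pending"},
    {"service": "beta", "research": "pending", "extract": "pending", "generate": "pending"},
    {"service": "gamma", "research": "ok", "extract": "ok", "generate": "pending"},
    {"service": "delta", "research": "error", "extract": "pending", "generate": "pending"},
]

for stage in STAGES:
    print(stage, pending_for(rows, stage))
```

Keeping one status column per stage means each stage can be re-run on its own and only picks up the services that are ready for it.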

🙏 Jim Le deserves major kudos for helping to build this sophisticated three-stage workflow that cleverly automates API documentation processing using a smart combination of web scraping, vector search, and LLM technologies.

How it w

Nodes used

Google Sheets, HTTP Request, Google Drive, Code, Recursive Character Text Splitter, Default Data Loader, Qdrant Vector Store, Embeddings Google Gemini, Google Gemini Chat Model, Text Classifier, Information Extractor

Workflow preview

Stage 1 - Research for API Documentation
- Fetch a list of services pending research from Databa
- Uses a search engine (Google) to find API Documentati
- Uses W
Stage 2 - Extract API Operations From Documen
- Fetch a list of services pending extraction from Data
- Query Vector store (Qdrant) to figure out service's p
Stage 3 - Generate Custom Schema From API Ope
- Fetch a list of services pending generation from Data
- Fetch all API operations for each service from Databa
Stage 1 - Subworkflow
Stage 2 - Subworkflow
Stage 3 - Subworkflow
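
As a rough illustration of Stage 3, the sketch below groups a flat list of API operations into one nested document per service. It is a Python sketch under stated assumptions rather than the workflow's own Code node: the field names (`resource`, `operation`, `method`, `url`) and the output shape are hypothetical, chosen only to show the grouping step.

```python
import json
from collections import defaultdict
from typing import Dict, List


def build_schema(service: str, operations: List[Dict[str, str]]) -> Dict:
    """Group operations by resource into a nested, JSON-serialisable dict."""
    by_resource: Dict[str, List[Dict[str, str]]] = defaultdict(list)
    for op in operations:
        by_resource[op["resource"]].append(
            {"operation": op["operation"], "method": op["method"], "url": op["url"]}
        )
    return {
        "service": service,
        "resources": [
            {"resource": name, "operations": ops}
            for name, ops in sorted(by_resource.items())
        ],
    }


# Made-up operations, standing in for rows read back from the database.
operations = [
    {"resource": "users", "operation": "list", "method": "GET", "url": "/users"},
    {"resource": "users", "operation": "create", "method": "POST", "url": "/users"},
    {"resource": "orders", "operation": "get", "method": "GET", "url": "/orders/{id}"},
]

schema = build_schema("example-service", operations)
print(json.dumps(schema, indent=2))
```

Sorting the resources keeps the generated file stable between runs, which makes diffs of the uploaded schema easier to read.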
Nodes on the canvas:
When clicking ‘Test work…
Web Search For API Schema
Scrape Webpage Contents
Results to List
Recursive Character Text…
Content Chunking @ 50k C…
Split Out Chunks
Default Data Loader
Set Embedding Variables
Execute Workflow Trigger
Execution Data
EventRouter
Google Gemini Chat Model
Successful Runs
For Each Document...
Embeddings Google Gemini
Has API Documentation?
Store Document Embeddings
Embeddings Google Gemini1
Google Gemini Chat Model1
Extract API Operations
Search in Relevant Docs
Wait
Remove Dupes
Filter Results
Research
Has Results?
Response Empty
Response OK
Combine Docs
Template to List
Query Templates
Google Gemini Chat Model2
For Each Template...
Query & Docs
Identify Service Products
Extract API Templates
Embeddings Google Gemini2
Search in Relevant Docs1
Combine Docs1
Query & Docs1
For Each Template...1
Merge Lists
Remove Duplicates
Append Row
Response OK1
Has Operations?
Response Empty1
Research Pending
Research Result
Research Error
Extract Pending
Research Event
Extract Event
Extract
Extract Result
Extract Error
Get API Operations
Contruct JSON Schema
Upload to Drive
Set Upload Fields
Response OK2
Generate Event
Generate Pending
Generate
Generate Error
Generate Result
Get All Extract
Get All Research
For Each Research...
For Each Extract...
Wait1
All Research Done?
All Extract Done?
Get All Generate
All Generate Done?
For Each Generate...
Wait2
Has Results?1
Response Scrape Error
Has Results?3
Response No API Docs
82 nodes, 91 edges

How it works

  1. Trigger

     The workflow is started by a trigger.

  2. Processing

     Data passes through 82 nodes, including aggregate, code, and documentdefaultdataloader.

  3. Output

     The workflow completes the automation and delivers the result to the configured destination.

Node details (82)

  1. Google Sheets (googleSheets)
  2. HTTP Request (httpRequest)
  3. Google Drive (googleDrive)
  4. Code (code)
  5. Recursive Character Text Splitter (n8n-nodes-langchain.textSplitterRecursiveCharacterTextSplitter)
  6. Default Data Loader (n8n-nodes-langchain.documentDefaultDataLoader)
  7. Qdrant Vector Store (n8n-nodes-langchain.vectorStoreQdrant)
  8. Embeddings Google Gemini (n8n-nodes-langchain.embeddingsGoogleGemini)
  9. Google Gemini Chat Model (n8n-nodes-langchain.lmChatGoogleGemini)
  10. Text Classifier (n8n-nodes-langchain.textClassifier)
  11. Information Extractor (n8n-nodes-langchain.informationExtractor)

How to import this workflow

  1. Click the Download JSON button on the right to save the workflow file.
  2. Open your n8n instance. Go to Workflows → New → Import from file.
  3. Select the downloaded api-schema-extractor file and click Import.
  4. Configure credentials for each service node (API keys, OAuth, etc.).
  5. Click Test workflow to verify that it runs correctly, then activate it.

Or paste directly into n8n → Import from JSON:

{ "name": "API Schema Extractor", "nodes": [...], ...}
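
Before importing, it can help to sanity-check the downloaded file. The Python sketch below assumes the usual n8n export layout, where the top-level object has a `name` and a `nodes` array whose entries carry `name` and `type`. The sample document, its node names, and its type strings are made up for the example and are not taken from this workflow's JSON.

```python
import json
from collections import Counter
from typing import Dict


def summarize_workflow(raw: str) -> Dict:
    """Parse workflow JSON and report its name, node count and node types."""
    data = json.loads(raw)
    nodes = data.get("nodes", [])
    # Keep only the last dotted segment of the type, e.g. "pkg.httpRequest" -> "httpRequest".
    types = Counter(node.get("type", "unknown").split(".")[-1] for node in nodes)
    return {"name": data.get("name"), "node_count": len(nodes), "types": dict(types)}


# Tiny stand-in document; a real export contains many more fields.
sample = json.dumps({
    "name": "API Schema Extractor",
    "nodes": [
        {"name": "Node A", "type": "example.httpRequest"},
        {"name": "Node B", "type": "example.googleSheets"},
        {"name": "Node C", "type": "example.httpRequest"},
    ],
    "connections": {},
})

summary = summarize_workflow(sample)
print(summary["name"], summary["node_count"], summary["types"])
```

For the real file, the same function could be called on the file's contents, and the reported node count compared with the 82 nodes listed on this page.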

Integrations

aggregate, code, documentdefaultdataloader, embeddingsgooglegemini, executeworkflow, executeworkflowtrigger, executiondata, filter, googledrive, googlesheets, httprequest, if, informationextractor, lmchatgooglegemini, manualtrigger, removeduplicates, set, splitinbatches, splitout, switch

Created by

Polina Medvedieva (@polina-n8n)
