Data Validator

Perform comprehensive data quality checks on datasets: validate schemas, detect anomalies, find duplicates, and enforce data contracts. Essential for ETL pipelines where bad data silently corrupts downstream analytics and dashboards.

Overview

The Data Validator is a skill for AI agents, published in the TerminalSkills/skills repository on GitHub (72 stars). It lets coding-focused agents such as Claude, Gemini, and Codex check that incoming data meets predefined standards before it is processed further or visualized. The skill covers schema validation, anomaly detection, duplicate identification, and data contract enforcement, which helps keep bad data in automated workflows and ETL pipelines from silently corrupting downstream analytics and dashboards.

Use Cases

Verifying dataset schemas against predefined contracts during ETL pipeline execution.
Identifying statistical anomalies and duplicate records in raw data files.
Ensuring data quality before feeding information into analytics dashboards.
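
The use cases above can be illustrated with a minimal Python sketch. It is not taken from the skill's own scripts. The contract format, the function names (check_schema, find_duplicates, find_anomalies), and the column names are all hypothetical, and the anomaly check uses a simple z-score rule as one possible technique among many.

```python
from statistics import mean, stdev

# Hypothetical data contract: column name -> expected Python type.
CONTRACT = {"order_id": int, "customer": str, "amount": float}


def check_schema(rows, contract):
    """Return (row_index, message) pairs for rows that violate the contract."""
    errors = []
    for i, row in enumerate(rows):
        missing = contract.keys() - row.keys()
        if missing:
            errors.append((i, f"missing columns: {sorted(missing)}"))
        for col, expected in contract.items():
            if col in row and not isinstance(row[col], expected):
                errors.append(
                    (i, f"{col}: expected {expected.__name__}, "
                        f"got {type(row[col]).__name__}")
                )
    return errors


def find_duplicates(rows, key):
    """Return indices of rows whose key value already appeared in an earlier row."""
    seen = set()
    dupes = []
    for i, row in enumerate(rows):
        value = row.get(key)
        if value in seen:
            dupes.append(i)
        else:
            seen.add(value)
    return dupes


def find_anomalies(rows, column, threshold=3.0):
    """Return indices whose value is more than `threshold` sample
    standard deviations away from the column mean (z-score rule)."""
    values = [
        (i, row[column])
        for i, row in enumerate(rows)
        if isinstance(row.get(column), (int, float))
    ]
    if len(values) < 2:
        return []  # not enough data to estimate spread
    mu = mean(v for _, v in values)
    sigma = stdev(v for _, v in values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in values if abs(v - mu) / sigma > threshold]
```

In a pipeline, a sketch like this would run on each batch before loading, and the batch would be rejected or quarantined if any of the three functions returns a non-empty list. Note that a z-score threshold of 3 can only trigger on batches of roughly a dozen rows or more, so small samples need a lower threshold or a different rule.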

Install Notes

# Review source first
open https://github.com/TerminalSkills/skills/blob/main/skills/data-validator/SKILL.md

Copy or clone the skill folder into your agent skills directory after reviewing its instructions and scripts.

Security Notes

Users should ensure that the AI agent has appropriate read permissions for the datasets being analyzed. When processing sensitive or regulated information, verify that the agent's environment complies with local data privacy standards, as the skill interacts directly with dataset contents to perform validation and anomaly detection.
