Great Expectations Data Validation
Validate data quality with Great Expectations 1.x. Generates Expectation Suites, Checkpoints, Data Docs, and integrations with Airflow, dbt, and CI pipelines to catch bad data before it reaches downstream consumers.
This skill writes Expectation Suites for tabular sources (Postgres, Snowflake, Spark, Pandas), configures Fluent Datasources, builds Checkpoints that fail CI when validation fails, generates Data Docs as static HTML, and integrates Great Expectations into Airflow, Prefect, or Dagster pipelines. It also migrates legacy V2 YAML-config projects to the V1 Python-first API.
When to use
Use when adding data quality gates to pipelines, catching schema drift in upstream sources, generating data documentation for stakeholders, or migrating a Great Expectations project from V2 to V1.
Examples
Validate a Snowflake table
Suite with row-count, null, and uniqueness checks
Generate a Great Expectations suite for our orders Snowflake table: order_id unique, customer_id not null, total_amount between $0 and $100k, and row count within 10% of yesterday's count
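A minimal sketch of the kind of suite this prompt might produce, assuming the Great Expectations 1.x Python API. The suite name orders_suite, the helper names, and the 100,000 upper bound written as a plain number are illustrative assumptions; connecting the Snowflake Data Source and looking up yesterday's row count are left out.

```python
from __future__ import annotations


def row_count_bounds(yesterday_count: int, tolerance_pct: int = 10) -> tuple[int, int]:
    """Inclusive row-count range within +/- tolerance_pct of yesterday's count."""
    # Integer arithmetic avoids float rounding; the lower bound is rounded up
    # and the upper bound down so both ends stay inside the tolerance.
    lower = -(-yesterday_count * (100 - tolerance_pct) // 100)
    upper = yesterday_count * (100 + tolerance_pct) // 100
    return lower, upper


def build_orders_suite(context, yesterday_count: int):
    """Add the four checks from the prompt to a new suite (names are hypothetical)."""
    # Imported here so row_count_bounds stays usable without GX installed.
    import great_expectations as gx

    suite = context.suites.add(gx.ExpectationSuite(name="orders_suite"))
    lower, upper = row_count_bounds(yesterday_count)
    for expectation in (
        gx.expectations.ExpectColumnValuesToBeUnique(column="order_id"),
        gx.expectations.ExpectColumnValuesToNotBeNull(column="customer_id"),
        gx.expectations.ExpectColumnValuesToBeBetween(
            column="total_amount", min_value=0, max_value=100_000
        ),
        gx.expectations.ExpectTableRowCountToBeBetween(
            min_value=lower, max_value=upper
        ),
    ):
        suite.add_expectation(expectation)
    return suite
```

With a context from gx.get_context(), calling build_orders_suite(context, yesterday_count=1000) would register a suite whose row-count check accepts 900 to 1100 rows. The row-count bounds are fixed when the suite is built, so a daily pipeline would need to rebuild or update that expectation on each run.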
CI gate on data quality
Fail GitHub Actions when validation breaks
Wire Great Expectations into a GitHub Actions workflow that fails the build if any Checkpoint validation fails, and posts the Data Docs URL in a PR comment
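One way the CI gate could work is a small Python entry point that runs each Checkpoint and turns the results into a process exit code, since a GitHub Actions step fails when its command exits non-zero. This is a sketch assuming the Great Expectations 1.x API and a file-backed project in the working directory; the function names and the checkpoint names passed in are hypothetical, and posting the Data Docs URL is not covered here.

```python
from __future__ import annotations


def exit_code(results: dict[str, bool]) -> int:
    """Return 0 only if every Checkpoint passed; 1 otherwise, which fails the CI step."""
    return 0 if all(results.values()) else 1


def run_checkpoints(names: list[str]) -> dict[str, bool]:
    """Run each named Checkpoint and record whether it succeeded."""
    # Imported here so exit_code stays usable without GX installed.
    import great_expectations as gx

    # Assumes a file-backed GX project exists in the current directory.
    context = gx.get_context(mode="file")
    return {name: context.checkpoints.get(name).run().success for name in names}


def main(argv: list[str]) -> int:
    """Run the Checkpoints named in argv, print a summary, return the exit code."""
    results = run_checkpoints(argv)
    for name, ok in results.items():
        print(f"{name}: {'passed' if ok else 'FAILED'}")
    return exit_code(results)
```

A workflow step could invoke this through a script that ends with sys.exit(main(sys.argv[1:])), passing the Checkpoint names as arguments. Collecting every result before exiting means the log lists all failing Checkpoints instead of stopping at the first one.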