Great Expectations Data Validation

Validate data quality with Great Expectations 1.x. Generates Expectation Suites, Checkpoints, Data Docs, and integrations with Airflow, dbt, and CI pipelines to catch bad data before it reaches downstream consumers.

This skill writes Expectation Suites for tabular sources (Postgres, Snowflake, Spark, Pandas), configures Fluent Datasources, builds Checkpoints that fail CI when validation fails, generates Data Docs as static HTML, and integrates Great Expectations (GE) into Airflow, Prefect, and Dagster pipelines. It also migrates legacy V2 YAML-config projects to the V1 Python-first API.

data-quality validation testing great-expectations observability

When to use

Use when adding data quality gates to pipelines, catching schema drift in upstream sources, generating data documentation for stakeholders, or migrating from GE V2 to V1.

Examples

Validate a Snowflake table

Suite with row-count, null, and uniqueness checks

Generate a Great Expectations suite for our Snowflake orders table: order_id unique, customer_id not null, total_amount between $0 and $100k, and row count within 10% of yesterday's count
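A suite matching that prompt might look roughly like the sketch below. It assumes the Great Expectations 1.x Python API (`context.suites.add` and the expectation classes in `great_expectations.expectations`). The suite name `orders_suite`, the helper names, and the way yesterday's row count is supplied are hypothetical, not part of the skill's actual output.

```python
def row_count_bounds(yesterday_count: int, tolerance_pct: int = 10) -> tuple[int, int]:
    """Return (min, max) row counts within tolerance_pct percent of yesterday's count."""
    # Integer arithmetic avoids float rounding surprises such as 1000 * 1.1.
    low = yesterday_count * (100 - tolerance_pct) // 100          # rounds down
    high = (yesterday_count * (100 + tolerance_pct) + 99) // 100  # rounds up
    return low, high


def build_orders_suite(context, yesterday_count: int):
    """Build the orders suite on an existing GE 1.x data context (assumed API)."""
    # Imported lazily so row_count_bounds stays usable without GE installed.
    import great_expectations as gx
    import great_expectations.expectations as gxe

    low, high = row_count_bounds(yesterday_count)

    # "orders_suite" is a hypothetical name chosen for this sketch.
    suite = context.suites.add(gx.ExpectationSuite(name="orders_suite"))
    suite.add_expectation(gxe.ExpectColumnValuesToBeUnique(column="order_id"))
    suite.add_expectation(gxe.ExpectColumnValuesToNotBeNull(column="customer_id"))
    suite.add_expectation(
        gxe.ExpectColumnValuesToBeBetween(
            column="total_amount", min_value=0, max_value=100_000
        )
    )
    suite.add_expectation(
        gxe.ExpectTableRowCountToBeBetween(min_value=low, max_value=high)
    )
    return suite
```

Here `context` would come from `gx.get_context()`, and `yesterday_count` from wherever the pipeline records the previous day's load size. Computing the bounds outside the suite keeps the 10% rule easy to unit test on its own.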

CI gate on data quality

Fail GitHub Actions when validation breaks

Wire Great Expectations into a GitHub Actions workflow that fails the build if any Checkpoint validation fails and posts the Data Docs URL as a PR comment
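The failing-build half of that prompt usually comes down to a script whose exit code mirrors the Checkpoint result, since a CI step fails on any non-zero exit. The sketch below assumes the Great Expectations 1.x API (`gx.get_context(mode="file")`, `context.checkpoints.get`, and a `success` attribute on the Checkpoint result). The Checkpoint name `orders_checkpoint` and the function names are hypothetical, and posting the Data Docs URL to the PR is left out.

```python
def run_gate(checkpoint) -> int:
    """Run a Checkpoint and map its result to a process exit code."""
    result = checkpoint.run()
    # 0 lets the CI step pass; any non-zero value fails it.
    return 0 if result.success else 1


def main() -> int:
    # Imported lazily so run_gate can be tested with a stub Checkpoint.
    import great_expectations as gx

    # Assumes a file-backed GE project is checked out in the working directory.
    context = gx.get_context(mode="file")
    # "orders_checkpoint" is a hypothetical name used for this sketch.
    checkpoint = context.checkpoints.get("orders_checkpoint")
    return run_gate(checkpoint)
```

A workflow step would then call `sys.exit(main())` from its entry point so that a failed validation fails the job. Keeping `run_gate` separate from context loading means the pass/fail mapping can be checked without a live datasource.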