Data Pipeline & ETL
Builds data pipelines and ETL workflows. Generates extraction scripts, transformation logic, loading procedures, and scheduling for moving data between APIs, databases, warehouses, and file systems.
This skill helps you build reliable data pipelines. It generates extraction scripts for APIs and databases, implements transformation logic with error handling, creates loading procedures for data warehouses, and sets up scheduling so that runs are idempotent (re-running a job does not duplicate data). It supports Python (pandas), SQL, and Node.js pipelines.
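To make "idempotent processing" concrete, here is a minimal sketch of a load step that can be re-run safely. It uses SQLite from the Python standard library as a stand-in for a real warehouse; the `events` table, its columns, and the `load_events` function are hypothetical names chosen for illustration, not something the skill is guaranteed to produce.

```python
import sqlite3


def load_events(conn, rows):
    """Load (event_id, user_id, amount) tuples; safe to re-run with the same batch.

    Hypothetical schema for illustration. Rows are keyed on event_id, so a
    repeated batch overwrites existing rows instead of adding duplicates.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "event_id TEXT PRIMARY KEY, user_id TEXT NOT NULL, amount REAL NOT NULL)"
    )
    with conn:  # one transaction: the whole batch commits or rolls back
        conn.executemany(
            "INSERT OR REPLACE INTO events (event_id, user_id, amount) "
            "VALUES (?, ?, ?)",
            rows,
        )
    # Return the table size so callers can see that re-runs do not grow it.
    return conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Calling `load_events` twice with the same batch leaves the row count unchanged, which is the property a scheduler needs when it retries a failed run. A production warehouse would typically use its own merge or upsert statement instead of `INSERT OR REPLACE`.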
When to use
Use when moving data between systems, building analytics pipelines, syncing databases with warehouses, processing CSV/JSON feeds, or implementing incremental data loads.
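One common way to implement an incremental load is a high-water mark: remember the latest `updated_at` value already processed and extract only newer records on the next run. The sketch below assumes each record is a dict with an `updated_at` datetime; that field name and the `extract_incremental` function are assumptions made for illustration.

```python
from datetime import datetime, timezone


def extract_incremental(records, watermark):
    """Return (records newer than watermark, new watermark).

    Assumes every record carries a timezone-aware `updated_at` datetime
    (a hypothetical field name). If nothing is new, the watermark is
    returned unchanged so the next run starts from the same point.
    """
    fresh = [r for r in records if r["updated_at"] > watermark]
    new_watermark = max((r["updated_at"] for r in fresh), default=watermark)
    return fresh, new_watermark
```

The returned watermark should be persisted only after the load step succeeds; otherwise a failed run could skip records on the retry.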
Examples
API to warehouse pipeline
Sync API data to a data warehouse
Build a Python pipeline that extracts user events from our API, transforms them into a star schema, and loads them into BigQuery with incremental processing
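As a rough illustration of the "transform into a star schema" step in this prompt, the sketch below splits flat event records into a user dimension and an event fact table linked by a surrogate key. The field names (`user_id`, `country`, `event_type`, `occurred_at`) and the `to_star_schema` function are hypothetical; the real API's payload will differ, and the BigQuery load is left out.

```python
def to_star_schema(events):
    """Split flat event dicts into (dim_users, fact_events).

    Field names are assumptions for illustration. Each distinct user_id gets
    one dimension row with a surrogate `user_key`; fact rows reference that
    key instead of repeating the user's attributes.
    """
    dim_users = {}  # user_id -> dimension row
    fact_events = []
    for event in events:
        user_id = event["user_id"]
        if user_id not in dim_users:
            dim_users[user_id] = {
                "user_key": len(dim_users) + 1,  # surrogate key, 1-based
                "user_id": user_id,
                "country": event["country"],
            }
        fact_events.append({
            "event_id": event["event_id"],
            "user_key": dim_users[user_id]["user_key"],
            "event_type": event["event_type"],
            "occurred_at": event["occurred_at"],
        })
    return list(dim_users.values()), fact_events
```

This version keeps the first-seen attributes for each user; handling attributes that change over time (slowly changing dimensions) would need additional logic.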
CSV processing pipeline
Process and clean incoming CSV files
Create a pipeline that watches an S3 bucket for CSV uploads, validates and cleans the data, deduplicates it, and loads it into PostgreSQL with error reporting
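The validate, clean, and deduplicate steps from this prompt might look something like the sketch below, which uses only the standard library. The required columns (`id`, `email`, `amount`) and the `clean_csv` function are assumptions for illustration; watching S3 and loading into PostgreSQL are omitted.

```python
import csv
import io

# Hypothetical required columns for this example feed.
REQUIRED = ("id", "email", "amount")


def clean_csv(text):
    """Return (valid_rows, errors) for CSV text with a header row.

    errors is a list of (line_number, message) tuples, so rejected rows
    can be reported rather than silently dropped.
    """
    valid, errors, seen_ids = [], [], set()
    reader = csv.DictReader(io.StringIO(text))
    for raw in reader:
        line_no = reader.line_num  # physical line of the row just read
        # Trim whitespace; treat absent columns as empty strings.
        row = {k: (raw.get(k) or "").strip() for k in REQUIRED}
        missing = [k for k in REQUIRED if not row[k]]
        if missing:
            errors.append((line_no, "missing " + ", ".join(missing)))
            continue
        try:
            amount = float(row["amount"])
        except ValueError:
            errors.append((line_no, "amount is not a number"))
            continue
        if row["id"] in seen_ids:  # keep the first occurrence of each id
            errors.append((line_no, "duplicate id " + row["id"]))
            continue
        seen_ids.add(row["id"])
        valid.append({"id": row["id"], "email": row["email"].lower(), "amount": amount})
    return valid, errors
```

Returning the rejected rows alongside the clean ones keeps the error report separate from the load, so a bad file can be surfaced without blocking valid records.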