dlt Data Loading
Build Python ELT pipelines with dlt (data load tool). Generates source functions, resource definitions, schema inference, incremental loads, and destination configs for warehouses, lakes, and vector DBs.
dlt is a Python library that loads data from APIs, databases, and files into a destination, inferring and evolving the schema as it goes. This skill writes @dlt.source and @dlt.resource functions, configures incremental loading with the append or merge write disposition, sets up the sql_database, rest_api, and filesystem sources, deploys pipelines to Airflow, GitHub Actions, or Modal, and lands data in DuckDB, BigQuery, Snowflake, Iceberg, or Weaviate.
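As a rough illustration of the decorators and incremental merge loading described above, here is a minimal sketch. It assumes dlt is installed with the DuckDB extra; SAMPLE_ROWS, rows_changed_since, the resource name tickets, and the source name helpdesk are hypothetical stand-ins for a real upstream system, not part of dlt.

```python
from typing import Iterable, Iterator, Optional

# Hypothetical in-memory stand-in for an upstream API or table.
SAMPLE_ROWS = [
    {"id": 1, "status": "open", "updated_at": "2024-05-01T09:00:00Z"},
    {"id": 2, "status": "closed", "updated_at": "2024-05-03T12:30:00Z"},
    {"id": 3, "status": "open", "updated_at": "2024-05-07T08:15:00Z"},
]


def rows_changed_since(rows: Iterable[dict], since: Optional[str]) -> Iterator[dict]:
    """Yield rows whose updated_at is at or after `since`.

    ISO-8601 timestamps in one fixed format compare correctly as strings.
    """
    for row in rows:
        if since is None or row["updated_at"] >= since:
            yield row


def run_demo():
    # Imported lazily so the helper above can be used without dlt installed.
    import dlt

    @dlt.source(name="helpdesk")
    def helpdesk():
        # merge + primary_key: rows with an existing id are updated, new ids are inserted.
        @dlt.resource(name="tickets", primary_key="id", write_disposition="merge")
        def tickets(
            updated_at=dlt.sources.incremental(
                "updated_at", initial_value="2024-01-01T00:00:00Z"
            ),
        ):
            # last_value is the highest cursor seen in earlier runs (or initial_value).
            yield from rows_changed_since(SAMPLE_ROWS, updated_at.last_value)

        return tickets

    pipeline = dlt.pipeline(
        pipeline_name="helpdesk_demo",
        destination="duckdb",
        dataset_name="helpdesk_data",
    )
    return pipeline.run(helpdesk())
```

Calling run_demo() a second time should extract only rows at or after the stored cursor, and the merge disposition keeps one row per id in the destination table.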
When to use
Use when building Python-first ELT without the overhead of full orchestrators, replacing custom extraction scripts, loading SaaS APIs into a warehouse, or feeding vector DBs from production sources.
Examples
REST API to warehouse
Incremental pull with merge strategy
Build a dlt pipeline that pulls paginated data from a REST API incrementally, using updated_at as the cursor, merges it into a Snowflake destination with id as the primary key, and runs daily on GitHub Actions.
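One possible shape for the pipeline this prompt asks for, sketched with dlt's declarative rest_api source. The base URL, the orders endpoint, and the updated_since query parameter are placeholders for whatever the real API exposes, and build_rest_config is a hypothetical helper. No paginator is set here on the assumption that dlt can detect the pagination style; an explicit paginator entry may be needed for some APIs. Snowflake credentials are assumed to come from secrets.toml or environment variables.

```python
def build_rest_config(base_url: str, start: str) -> dict:
    """Build a rest_api source config with an incremental cursor on updated_at."""
    return {
        "client": {"base_url": base_url},
        # Applied to every resource: upsert on id instead of appending duplicates.
        "resource_defaults": {"primary_key": "id", "write_disposition": "merge"},
        "resources": [
            {
                "name": "orders",  # placeholder endpoint name
                "endpoint": {
                    "path": "orders",
                    "params": {
                        # Placeholder query parameter; dlt fills it with the stored cursor.
                        "updated_since": {
                            "type": "incremental",
                            "cursor_path": "updated_at",
                            "initial_value": start,
                        },
                    },
                },
            }
        ],
    }


def run_daily_load():
    # Imported lazily so the config builder can be inspected without dlt installed.
    import dlt
    from dlt.sources.rest_api import rest_api_source

    source = rest_api_source(
        build_rest_config("https://api.example.com/v1/", "2024-01-01T00:00:00Z")
    )
    pipeline = dlt.pipeline(
        pipeline_name="orders_to_snowflake",
        destination="snowflake",
        dataset_name="orders_raw",
    )
    return pipeline.run(source)
```

For the daily schedule, the dlt deploy command with the github-action target and a cron expression passed via --schedule can generate a workflow file; the exact flags are worth checking against the installed dlt version.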
Database incremental load to lakehouse
SQL source to Iceberg
Write a dlt source for our Postgres database using sql_database, with a per-table incremental cursor and a destination config that writes Apache Iceberg tables to S3.
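A sketch of how this prompt might be answered, assuming a recent dlt release with the sql_database source and Iceberg support in the filesystem destination. The table names, cursor columns, and the function load_postgres_to_iceberg are hypothetical. This is cursor-based incremental loading: it picks up inserted and updated rows through a timestamp column, and unlike log-based CDC it does not see deletes.

```python
# Hypothetical mapping of table name -> column used as the incremental cursor.
CURSOR_COLUMNS = {
    "orders": "updated_at",
    "customers": "modified_at",
}


def load_postgres_to_iceberg(connection_url: str, bucket_url: str):
    # Imported lazily so the mapping above can be reviewed without dlt installed.
    import dlt
    from dlt.sources.sql_database import sql_database

    # Reflect only the listed tables from Postgres.
    source = sql_database(credentials=connection_url, table_names=list(CURSOR_COLUMNS))

    # Attach an incremental cursor to each table so later runs read only newer rows.
    for table, cursor in CURSOR_COLUMNS.items():
        source.resources[table].apply_hints(
            incremental=dlt.sources.incremental(cursor)
        )

    pipeline = dlt.pipeline(
        pipeline_name="postgres_to_iceberg",
        # bucket_url would look like "s3://my-bucket/lake"; AWS credentials are
        # assumed to come from secrets.toml or the environment.
        destination=dlt.destinations.filesystem(bucket_url=bucket_url),
        dataset_name="lakehouse_raw",
    )
    # table_format asks the filesystem destination to write Iceberg tables.
    return pipeline.run(source, table_format="iceberg")
```

The write disposition is left at its default here; whether merge is available for Iceberg tables depends on the dlt version, so that choice is best confirmed against the destination documentation.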