DuckDB Analytics
Run fast analytical queries on Parquet, CSV, and JSON with DuckDB. Generates in-process pipelines, S3/HTTPFS reads, MotherDuck deploys, and pandas/polars integration for local-first analytics.
DuckDB is a columnar, in-process database, often described as SQLite for analytics. This skill writes pipelines that read Parquet/CSV directly from S3, join them in memory, export the results, and migrate queries between local DuckDB and MotherDuck cloud.
duckdb analytics olap parquet sql
When to use
Use for ad-hoc analytics on flat files, replacing pandas when data has outgrown comfortable in-memory DataFrame work but still fits on a single machine, building local-first dashboards, or running CI data quality checks without provisioning a warehouse.
Examples
Query Parquet directly from S3
Skip the warehouse for one-off analysis
Write a DuckDB query that joins three Parquet files on S3 via httpfs and exports the result to CSV without loading the full datasets into memory
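A minimal sketch of what a response to this prompt might look like, using the DuckDB Python package. The bucket paths, table aliases, and column names (order_id, customer_id, product_id, amount, customer_name, product_name) are hypothetical, and the region default is an assumption; credentials are expected to be configured separately. The query is wrapped in COPY ... TO so the result is written to CSV by DuckDB rather than fetched into Python objects.

```python
def sql_literal(value: str) -> str:
    """Quote a Python string as a SQL string literal (doubles single quotes)."""
    return "'" + value.replace("'", "''") + "'"


def build_s3_join_export_sql(orders_uri: str, customers_uri: str,
                             products_uri: str, out_csv: str) -> str:
    """Build one COPY statement that joins three Parquet files and writes CSV.

    Column names are illustrative assumptions, not taken from any real schema.
    """
    lines = [
        "COPY (",
        "  SELECT o.order_id, c.customer_name, p.product_name, o.amount",
        f"  FROM read_parquet({sql_literal(orders_uri)}) AS o",
        f"  JOIN read_parquet({sql_literal(customers_uri)}) AS c",
        "    ON o.customer_id = c.customer_id",
        f"  JOIN read_parquet({sql_literal(products_uri)}) AS p",
        "    ON o.product_id = p.product_id",
        f") TO {sql_literal(out_csv)} (FORMAT CSV, HEADER);",
    ]
    return "\n".join(lines)


def run_export(sql: str, region: str = "us-east-1") -> None:
    """Execute the statement in an in-memory DuckDB with httpfs loaded.

    Requires the third-party `duckdb` package and network access to S3.
    The region value is an assumption; adjust it to match the bucket.
    """
    import duckdb  # imported lazily so the SQL builder works without it

    con = duckdb.connect()
    con.execute("INSTALL httpfs")
    con.execute("LOAD httpfs")
    con.execute(f"SET s3_region = {sql_literal(region)}")
    con.execute(sql)
    con.close()
```

To try it, pass three s3:// URIs and a local output path to build_s3_join_export_sql, inspect the generated statement, then hand it to run_export. Keeping the SQL builder separate from execution makes the statement easy to review or log before anything touches S3.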
Replace pandas in an ETL job
Speed up a slow pandas pipeline
Rewrite this pandas join+aggregation pipeline as DuckDB SQL over the same CSVs; the pandas version runs out of memory at scale
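As an illustration of the kind of rewrite this prompt asks for, the sketch below assumes a hypothetical pandas pipeline of the form orders.merge(customers, on="customer_id").groupby("region").agg(n_orders=("order_id", "count"), total_amount=("amount", "sum")). The file names, column names, memory limit, and spill directory are all assumptions. DuckDB reads the CSVs itself and writes the aggregate to Parquet, with a memory cap and a temp directory set so larger-than-memory intermediates can spill to disk.

```python
def sql_literal(value: str) -> str:
    """Quote a Python string as a SQL string literal (doubles single quotes)."""
    return "'" + value.replace("'", "''") + "'"


def build_join_agg_sql(orders_csv: str, customers_csv: str,
                       out_parquet: str) -> str:
    """Build a COPY statement equivalent to the assumed merge + groupby.

    count(o.order_id) skips NULLs, matching pandas' "count" aggregation.
    """
    lines = [
        "COPY (",
        "  SELECT c.region,",
        "         count(o.order_id) AS n_orders,",
        "         sum(o.amount) AS total_amount",
        f"  FROM read_csv_auto({sql_literal(orders_csv)}) AS o",
        f"  JOIN read_csv_auto({sql_literal(customers_csv)}) AS c",
        "    ON o.customer_id = c.customer_id",
        "  GROUP BY c.region",
        f") TO {sql_literal(out_parquet)} (FORMAT PARQUET);",
    ]
    return "\n".join(lines)


def run_with_memory_cap(sql: str, memory_limit: str = "4GB",
                        temp_dir: str = "duckdb_spill") -> None:
    """Run the statement with a memory cap and a spill directory.

    Requires the third-party `duckdb` package. The limit and directory
    are placeholder values chosen for this sketch.
    """
    import duckdb  # imported lazily so the SQL builder works without it

    con = duckdb.connect()
    con.execute(f"SET memory_limit = {sql_literal(memory_limit)}")
    con.execute(f"SET temp_directory = {sql_literal(temp_dir)}")
    con.execute(sql)
    con.close()
```

Calling build_join_agg_sql("orders.csv", "customers.csv", "summary.parquet") and passing the result to run_with_memory_cap would run the whole join and aggregation inside DuckDB, so no full DataFrame is materialized in Python.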