Apache Flink Streaming

Build stateful stream-processing applications with Apache Flink. Generates DataStream and Table API jobs, SQL pipelines, watermark strategies, exactly-once sinks, and Kubernetes Operator deployments.

This skill writes Flink jobs in Java, Scala, Python (PyFlink), and Flink SQL. It covers the DataStream API with stateful operators and ProcessFunctions, the Table/SQL API with CDC and the Hive catalog, watermarks and event-time handling, RocksDB state backend tuning, Kafka sources and sinks with exactly-once delivery, and deployment via the Flink Kubernetes Operator. It also covers the managed offerings Confluent Cloud Flink and Amazon Managed Service for Flink.

flink streaming real-time stateful kafka

When to use

Use when building large-scale streaming pipelines with strict consistency guarantees, complex event processing, streaming SQL on Kafka, or migrating from Spark Streaming to a true streaming engine.

Examples

Flink SQL CDC pipeline

Postgres CDC to Iceberg with joins

Write a Flink SQL job that reads CDC streams from two Postgres tables via Debezium, joins them, and writes to an Iceberg table on S3 with checkpointing every minute
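
To make the CDC part of this prompt concrete, here is a minimal plain-Python sketch of what "reading a change stream and joining it" means. It is not Flink code and not the job this skill would generate. It assumes Debezium-style change events with an `op` field (`c`, `r`, `u`, `d`) and `before`/`after` row images. The `orders` and `customers` tables, their columns, and both function names are hypothetical. A real Flink SQL join keeps both sides in managed state and emits incremental updates; this sketch simply recomputes the join from the materialized tables.

```python
def apply_change(table, event, key_field):
    """Apply one Debezium-style change event to an in-memory table.

    `table` is a dict keyed by primary key; `event` carries "op" plus
    "before"/"after" row images (assumed envelope shape).
    """
    op = event["op"]
    if op in ("c", "r", "u"):  # create, snapshot read, update: upsert the new image
        row = event["after"]
        table[row[key_field]] = row
    elif op == "d":  # delete: remove the row identified by the old image
        table.pop(event["before"][key_field], None)
    else:
        raise ValueError(f"unsupported op: {op!r}")


def join_orders_customers(orders, customers):
    """Inner-join the current state of two hypothetical tables on customer_id."""
    return [
        {**order, "customer_name": customers[order["customer_id"]]["name"]}
        for order in orders.values()
        if order["customer_id"] in customers
    ]


orders, customers = {}, {}
apply_change(customers, {"op": "c", "before": None,
                         "after": {"customer_id": 7, "name": "Ada"}}, "customer_id")
apply_change(orders, {"op": "c", "before": None,
                      "after": {"order_id": 1, "customer_id": 7, "total": 40}}, "order_id")
joined_after_insert = join_orders_customers(orders, customers)

# An update replaces the customer row, so the joined result changes with it.
apply_change(customers, {"op": "u",
                         "before": {"customer_id": 7, "name": "Ada"},
                         "after": {"customer_id": 7, "name": "Ada L."}}, "customer_id")
joined_after_update = join_orders_customers(orders, customers)

# A delete removes the customer, so the inner join no longer matches the order.
apply_change(customers, {"op": "d",
                         "before": {"customer_id": 7, "name": "Ada L."},
                         "after": None}, "customer_id")
joined_after_delete = join_orders_customers(orders, customers)
```

The point of the sketch is that updates and deletes on either input change the join result, which is why the sink (Iceberg in the prompt) has to accept upserts and deletes rather than appends only.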

Stateful pattern detection

Detect fraud sequences with DataStream API

Build a Flink DataStream job that detects card-not-present fraud patterns: 3+ failed auths within 60s on the same card, with keyed state, watermarks, and exactly-once Kafka sink
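
The detection rule in this prompt can be sketched in plain Python to show the per-card logic on its own. This is an illustration only, not Flink API code: the dictionary stands in for keyed state, and in a real job the same logic would typically live in a keyed process function with managed state and timers. The class name, field names, and alert shape are hypothetical. The sketch assumes events for a given card arrive in event-time order (handling late or out-of-order events is what watermarks are for) and treats a failure exactly 60 s old as still inside the window.

```python
from collections import defaultdict, deque

WINDOW_MS = 60_000  # 60 s window from the prompt
THRESHOLD = 3       # 3 or more failed auths raise an alert


class FailedAuthDetector:
    """Plain-Python stand-in for a keyed operator: one timestamp deque per card."""

    def __init__(self, window_ms=WINDOW_MS, threshold=THRESHOLD):
        self.window_ms = window_ms
        self.threshold = threshold
        # Simulated keyed state: card_id -> event-time timestamps of failed auths.
        self.failures = defaultdict(deque)

    def process(self, card_id, event_time_ms, success):
        """Handle one auth event; return an alert dict, or None if no alert."""
        if success:
            return None
        window = self.failures[card_id]
        window.append(event_time_ms)
        # Evict failures strictly older than the window, relative to this event.
        while window and window[0] < event_time_ms - self.window_ms:
            window.popleft()
        if len(window) >= self.threshold:
            return {"card_id": card_id, "count": len(window), "last_ts": event_time_ms}
        return None


detector = FailedAuthDetector()
events = [
    ("card-1", 0, False),
    ("card-2", 5_000, False),    # different card, separate state
    ("card-1", 20_000, False),
    ("card-1", 50_000, False),   # third failure within 60 s: alert
    ("card-1", 115_000, False),  # earlier failures have aged out: no alert
]
alerts = []
for card_id, ts, success in events:
    alert = detector.process(card_id, ts, success)
    if alert is not None:
        alerts.append(alert)
```

Keeping only the timestamps inside the window bounds the state per card, which matters when the key space is every card seen by the system.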
