Prometheus & Alertmanager
Set up metrics collection and alerting with Prometheus. Generates PromQL queries, recording rules, alerting rules, Alertmanager routing configs, and custom exporters for application monitoring.
This skill helps you build observability stacks with Prometheus. It generates PromQL queries for dashboards and alerts, creates recording rules for expensive computations, configures Alertmanager with routing trees and notification channels, writes custom exporters in Go or Python, and sets up service discovery for Kubernetes. It also covers Thanos and Mimir for long-term storage and multi-cluster setups.
When to use
Use when setting up Prometheus monitoring, writing PromQL queries, configuring alerting rules, building custom exporters, or designing multi-cluster observability with Thanos or Mimir.
Examples
SLO alerting
Create SLO-based alerts with burn rate windows
Define Prometheus alerting rules for a 99.9% availability SLO using a multi-window burn-rate approach with 5m, 30m, 1h, and 6h windows
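To make the arithmetic behind this prompt concrete, here is a minimal Python sketch that derives burn-rate thresholds and builds the matching PromQL expressions. It makes several assumptions that do not come from the prompt: a 30-day SLO period, the window pairing and budget fractions commonly used for multi-window alerts (1h with 5m at 2% of the budget, 6h with 30m at 5%), and a hypothetical `http_requests_total` counter with a `code` label. Adjust all of these to your own metrics.

```python
SLO = 0.999                 # 99.9% availability target from the prompt
ERROR_BUDGET = 1 - SLO      # allowed error ratio, about 0.001
PERIOD_HOURS = 30 * 24      # assumption: 30-day SLO period


def burn_rate(budget_fraction, window_hours, period_hours=PERIOD_HOURS):
    """Burn rate that spends `budget_fraction` of the budget in `window_hours`."""
    return budget_fraction * period_hours / window_hours


def error_ratio(window):
    """PromQL error ratio over `window`; the metric and label names are hypothetical."""
    return (
        f'sum(rate(http_requests_total{{code=~"5.."}}[{window}]))'
        f" / sum(rate(http_requests_total[{window}]))"
    )


def burn_rate_expr(long_window, short_window, factor, budget=ERROR_BUDGET):
    """Alert expression that fires only when both windows exceed factor * budget."""
    threshold = factor * budget
    return (
        f"({error_ratio(long_window)} > {threshold:.5f})"
        f" and ({error_ratio(short_window)} > {threshold:.5f})"
    )


# Assumed pairing: (long window, short window, budget fraction, long window in hours)
PAIRS = [("1h", "5m", 0.02, 1), ("6h", "30m", 0.05, 6)]

EXPRESSIONS = [
    burn_rate_expr(long_w, short_w, burn_rate(fraction, hours))
    for long_w, short_w, fraction, hours in PAIRS
]
```

Each string in `EXPRESSIONS` could serve as the `expr` of one alerting rule. Under these assumptions the 1h/5m pair fires when the error ratio exceeds roughly 1.44% in both windows; the short window is there so the alert resolves quickly once errors stop.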
Custom exporter
Build a Prometheus exporter for application metrics
Write a Prometheus exporter in Go that exposes custom business metrics: active users, transaction volume, queue depth, and cache hit ratio
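The prompt above asks for Go; as a language-neutral illustration of what such an exporter serves, here is a minimal Python sketch that uses only the standard library to render the four metrics in the Prometheus text exposition format. The metric names, the sample values, and the `collect` function are all hypothetical, and a real exporter would more likely use an official client library than hand-rolled rendering.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# (name, type, help text) for each metric; all names are hypothetical.
METRICS = [
    ("app_active_users", "gauge", "Users with an active session."),
    ("app_transactions_total", "counter", "Transactions processed since start."),
    ("app_queue_depth", "gauge", "Jobs waiting in the work queue."),
    ("app_cache_hit_ratio", "gauge", "Cache hits divided by lookups (0 to 1)."),
]


def collect():
    """Placeholder for reading live values from the application (made-up data)."""
    return {
        "app_active_users": 42,
        "app_transactions_total": 1337,
        "app_queue_depth": 7,
        "app_cache_hit_ratio": 0.93,
    }


def render_metrics(values):
    """Render the values as HELP, TYPE, and sample lines for each metric."""
    lines = []
    for name, metric_type, help_text in METRICS:
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} {metric_type}")
        lines.append(f"{name} {values[name]}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Serve only the conventional /metrics path.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(collect()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block and serve /metrics on the given port (the port is an arbitrary choice)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint, which Prometheus could then scrape at `/metrics`. One design note: exposing cache hits and misses as two counters and computing the ratio in PromQL is often preferable to a precomputed ratio gauge, because counters can be aggregated across instances and time ranges.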