⚙️ Engineering
Verified
Incident Response
Run incident response: triage, communicate, resolve, and write a blameless postmortem.
incident reliability postmortem on-call
When to use
Use when someone says 'we have an incident' or 'production is down', when an alert needs a severity assessment, when a status update is due mid-incident, or when writing a blameless postmortem after resolution. Guides triage, communication, mitigation, and the retrospective.
Examples
Triage an active incident
Assess severity and coordinate initial response
We have an incident: checkout is failing for ~30% of users. Error rate spiked 10 minutes ago. Help me triage.
Draft a status update
Write a customer-facing or internal incident communication
Write a status page update. We identified DB connection pool exhaustion as the cause, deployed a fix, and are now monitoring recovery.
Write a postmortem
Document the incident, root cause, and action items
Write a blameless postmortem for yesterday's 45-minute outage. The root cause was a missing index on the orders table after a migration.