Advanced Writing #post-mortem #incident #sre

Post-mortems & Incident Writing

3 exercises — write professional incident timelines, client-facing outage notifications, and actionable "what we'll do differently" sections.

0 / 3 completed
Post-mortem writing principles
  • Timeline: Precise UTC timestamps · specific tech details · shows gaps (where time was lost)
  • Client notification: Impact in customer terms · no data loss / data safe · no internal blame · follow-up commitment
  • Action items: Owner + due date + specific measurable change · systemic fixes preferred over behaviour changes
  • Blameless: Focus on systems and processes, not on individuals
1 / 3

Incident scenario: On 2024-09-14, the payment service experienced a 47-minute outage (14:32–15:19 UTC) caused by a misconfigured load balancer rule deployed at 14:28 UTC. The deploy was not rolled back for 44 minutes because the on-call engineer was in a meeting with phone on silent.

Which post-mortem timeline section is written most effectively?