Post-mortems & Incident Writing

3 exercises — write professional incident timelines, client-facing outage notifications, and actionable "what we'll do differently" sections.

Post-mortem writing principles

Timeline: Precise UTC timestamps · specific technical details · shows gaps (where time was lost)
Client notification: Impact in customer terms · clear statement on whether data was lost or remains safe · no internal blame · follow-up commitment
Action items: Owner + due date + specific measurable change · systemic fixes preferred over behaviour changes
Blameless: Focus on systems and processes, not on individuals
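
The "shows gaps" principle for timelines can be checked mechanically. The following is a minimal Python sketch, assuming the three UTC timestamps from the payment-service scenario used in these exercises. The event labels and the helper names (parse_utc, gaps_in_minutes) are illustrative assumptions, not part of any real tool.

```python
from datetime import datetime, timezone

# Timestamps come from the payment-service scenario (2024-09-14, UTC).
# The event labels are illustrative wording, not quoted from the scenario.
TIMELINE = [
    ("14:28", "Load balancer rule deployed"),
    ("14:32", "Payment service outage begins"),
    ("15:19", "Payment service recovers"),
]


def parse_utc(hhmm: str, day: str = "2024-09-14") -> datetime:
    """Parse an HH:MM string on the given day as a timezone-aware UTC datetime."""
    parsed = datetime.strptime(f"{day} {hhmm}", "%Y-%m-%d %H:%M")
    return parsed.replace(tzinfo=timezone.utc)


def gaps_in_minutes(timeline):
    """Return (from_label, to_label, minutes) for each pair of consecutive events."""
    result = []
    for (t0, label0), (t1, label1) in zip(timeline, timeline[1:]):
        delta = parse_utc(t1) - parse_utc(t0)
        result.append((label0, label1, int(delta.total_seconds() // 60)))
    return result


for start, end, minutes in gaps_in_minutes(TIMELINE):
    print(f"{start} -> {end}: {minutes} min")
```

Listing the minutes between consecutive entries makes the longest gap stand out, which is usually where the timeline needs more detail about what was happening and why time was lost.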

1 / 3

Incident scenario: On 2024-09-14, the payment service experienced a 47-minute outage (14:32–15:19 UTC) caused by a misconfigured load balancer rule deployed at 14:28 UTC. The deploy was not rolled back for 44 minutes because the on-call engineer was in a meeting with their phone on silent.

Which post-mortem timeline section is written most effectively?

2 / 3

After a 47-minute payment outage affecting enterprise customers, your team needs to send a client-facing status update. Which message is written most professionally for external stakeholders?

3 / 3

The team is writing the "What we will do differently" section of the post-mortem for the payment outage. Which version demonstrates the best engineering improvement mindset?