Beginner Incident Response #SLA #SRE #MTTR

On-Call Vocabulary

3 exercises — master the essential metrics and terms every on-call engineer needs: SLI/SLO/SLA, MTTR/MTTD, and severity levels.

Quick reference: on-call metrics
  • SLI — what you measure (e.g. error rate, latency)
  • SLO — your internal target (e.g. 99.9% success rate)
  • SLA — the contractual commitment to customers (usually looser than the SLO, e.g. 99.5%, so the internal target is missed before the contract is breached)
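  • MTTD — mean time to detect: average time from failure start → first alert / detection
  • MTTR — mean time to recovery: average time from failure start → full recovery (some teams start the clock at detection instead)
  • MTBF — mean time between failures: average time between incidents (higher = more stable)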
  • P0–P4 — lower number = more severe (P0 = all-hands, P4 = cosmetic)
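The metrics in the quick reference can be sketched as simple arithmetic over incident timestamps. The following is a minimal illustration, not a standard implementation: the `Incident` record, the function names, and the sample data are all hypothetical, MTTR is measured from failure start (as in the list above), and MTBF is assumed to mean the gap between one recovery and the next failure start.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass
class Incident:
    """Hypothetical incident record; field names are assumptions."""
    started: datetime    # when the failure began
    detected: datetime   # first alert or detection
    recovered: datetime  # full recovery


def _minutes(delta):
    return delta.total_seconds() / 60


def mttd(incidents):
    """Mean time to detect, in minutes (failure start -> detection)."""
    return mean([_minutes(i.detected - i.started) for i in incidents])


def mttr(incidents):
    """Mean time to recovery, in minutes (failure start -> full recovery)."""
    return mean([_minutes(i.recovered - i.started) for i in incidents])


def mtbf(incidents):
    """Mean gap, in minutes, between one recovery and the next failure start.

    Assumption: other definitions of MTBF exist; this one needs at least
    two incidents.
    """
    if len(incidents) < 2:
        raise ValueError("MTBF needs at least two incidents")
    ordered = sorted(incidents, key=lambda i: i.started)
    gaps = [
        _minutes(later.started - earlier.recovered)
        for earlier, later in zip(ordered, ordered[1:])
    ]
    return mean(gaps)


def meets_slo(good_events, total_events, slo_target):
    """Compare a success-rate SLI (good / total) against an SLO target."""
    sli = good_events / total_events
    return sli >= slo_target


# Made-up sample data for illustration only.
incidents = [
    Incident(
        started=datetime(2024, 1, 1, 10, 0),
        detected=datetime(2024, 1, 1, 10, 5),
        recovered=datetime(2024, 1, 1, 10, 45),
    ),
    Incident(
        started=datetime(2024, 1, 3, 10, 45),
        detected=datetime(2024, 1, 3, 10, 55),
        recovered=datetime(2024, 1, 3, 11, 15),
    ),
]

print(mttd(incidents))   # 7.5 minutes
print(mttr(incidents))   # 37.5 minutes
print(mtbf(incidents))   # 2880.0 minutes (48 hours)
print(meets_slo(99_950, 100_000, 0.999))  # True: 99.95% meets a 99.9% SLO
```

Keeping the SLI as a plain ratio and the SLO as a threshold mirrors the distinction in the list: the SLI is the measurement, the SLO is the target it is compared against.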
Exercise 1 of 3
A product manager asks: "What's the difference between an SLA, SLO, and SLI?" Which definition set is correct?