Advanced Vocabulary #observability #prometheus #otel #sre

Observability & Monitoring Vocabulary

5 exercises covering the three pillars (metrics, logs, traces), Prometheus metric types, cardinality, OpenTelemetry spans and context propagation, and SLI/SLO/SLA/error budgets.

Observability vocabulary quick reference
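  • Metrics = what is happening (aggregated numbers over time) · Logs = what happened in detail (discrete, timestamped events) · Traces = where the time went (one request's journey across services)
  • Counter = only increases (resets to zero on restart) · Gauge = goes up and down · Histogram = observations counted into buckets (used to estimate quantiles such as p99 latency)
  • Cardinality = number of unique time series (distinct label-value combinations); high-cardinality labels such as user_id risk running the metrics server out of memory (OOM)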
  • Trace = end-to-end request · Span = one unit of work · context propagation = passing trace/span IDs across services
  • SLI = measured metric · SLO = internal target · SLA = customer contract · error budget = 100% - SLO
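Two of the items above reduce to simple arithmetic, so a short worked sketch may help. The function names, the 30-day window, and the label counts below are illustrative assumptions, not values from any real system or library.

```python
from math import prod


def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability for an availability SLO.

    Applies error budget = 100% - SLO to a rolling window
    (the 30-day default is an assumption).
    """
    budget_fraction = (100.0 - slo_percent) / 100.0
    return budget_fraction * window_days * 24 * 60


def worst_case_series(label_value_counts: dict[str, int]) -> int:
    """Upper bound on time series for one metric.

    Every distinct combination of label values is its own series,
    so the worst case is the product of the per-label value counts.
    """
    return prod(label_value_counts.values())


# A 99.9% SLO over 30 days leaves roughly 43.2 minutes of budget.
budget = error_budget_minutes(99.9)

# Hypothetical label sets: adding user_id multiplies the series count.
low = worst_case_series({"method": 5, "status": 10, "endpoint": 50})
high = worst_case_series(
    {"method": 5, "status": 10, "endpoint": 50, "user_id": 100_000}
)

print(f"error budget: {budget:.1f} min")
print(f"series without user_id: {low}")
print(f"series with user_id: {high}")
```

The product is a worst case: real systems rarely emit every combination, but a single unbounded label such as user_id is usually enough to dominate the total.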
Exercise 1 of 5
A postmortem document says: "The incident was not detected for 23 minutes because our three pillars of observability — metrics, logs, and traces — were not correlated." What does each pillar tell you and how do they differ?