Intermediate–Advanced 15 terms

DevOps & Cloud

Vocabulary for CI/CD pipelines, deployment strategies, SRE practices, and cloud infrastructure.

  • Pipeline /ˈpaɪplaɪn/

    An automated sequence of stages (build → test → deploy) that takes code from commit to production.

    "Every pull request triggers the CI pipeline — it must pass all stages before the branch can be merged."
  • Artefact /ˈɑːtɪfækt/

    A packaged output of the build stage (Docker image, .jar, .zip) that is versioned and promoted through environments.

    "The build artefact is a Docker image tagged with the commit SHA — the same image is deployed to staging and production."
  • Container /kənˈteɪnər/

    A lightweight, isolated runtime environment that packages an application with its dependencies using OS-level virtualisation.

    "Running the app in a container eliminates the 'works on my machine' problem — the same image runs locally and in production."
  • Orchestration /ˌɔːkɪˈstreɪʃən/

    Automated management of containerised workloads: scheduling, scaling, self-healing, and service discovery (e.g. Kubernetes).

    "Kubernetes handles orchestration — when a pod crashes, the controller detects it and schedules a replacement automatically."
  • Infrastructure as Code /ˈɪnfrəstrʌktʃər æz kəʊd/

    Managing and provisioning infrastructure through machine-readable configuration files (Terraform, Pulumi, CloudFormation) rather than manual processes.

    "All our AWS resources are defined in Terraform — we can spin up an identical staging environment with a single apply command."
  • Blue-Green Deployment /bluː ɡriːn dɪˈplɔɪmənt/

    A release strategy that runs two identical environments (blue = live, green = new). Traffic is switched atomically, enabling instant rollback.

    "We use blue-green deployments to achieve zero downtime — if smoke tests fail on green, we flip the load balancer back to blue."
  • Canary Release /kəˈneəri rɪˈliːs/

    A deployment strategy that routes a small percentage of traffic to a new version before a full rollout, limiting blast radius.

    "We deployed the new recommendation engine as a canary to 5% of users — after 24 hours with no error rate increase, we ramped to 100%."
  • Feature Flag /ˈfiːtʃər flæɡ/

    A runtime toggle that enables or disables features without deploying new code, allowing dark launches and A/B tests.

    "The new checkout flow is behind a feature flag — we can enable it for beta users without a separate release."
  • SLI / SLO / SLA /ɛsɛlˈaɪ / ɛslˈəʊ / ɛslˈeɪ/

    SLI = measurement (e.g. request latency). SLO = internal target (p99 < 200ms). SLA = contract with customers (99.9% uptime).

    "Our SLO is 99.9% availability — the SLI (current measurement) shows 99.95%, so we have error budget to spend on experiments."
  • Error Budget /ˈerər ˈbʌdʒɪt/

    The allowed downtime or unreliability within an SLO period. When exhausted, new deployments are paused until reliability recovers.

    "We burned through 80% of our monthly error budget in one incident — feature releases are frozen until we address the root cause."
  • Observability /ɒbˌzɜːvəˈbɪlɪti/

    The ability to infer the internal state of a system from its external outputs: logs, metrics, and distributed traces.

    "We improved observability by adding structured logging, Prometheus metrics, and Jaeger tracing to every service."
  • Distributed Trace /dɪˈstrɪbjuːtɪd treɪs/

    A record of a request's path across multiple services, showing latency at each hop to identify performance bottlenecks.

    "The distributed trace showed that 90% of the latency came from a synchronous call to the inventory service — we made it async."
  • On-Call /ɒn kɔːl/

    A rotation where engineers are responsible for responding to production incidents outside business hours.

    "I'm on-call this week — my PagerDuty is configured to escalate to my phone if a P1 alert fires."
  • Runbook /ˈrʌnbʊk/

    A documented set of procedures for handling a specific operational task or incident type.

    "The runbook for database failover has 12 steps — anyone on-call can follow it without needing the original author."
  • Postmortem /ˈpəʊstmɔːtəm/

    A blameless written analysis of an incident covering timeline, root cause, impact, and action items to prevent recurrence.

    "The postmortem identified three contributing factors — we opened tickets for all three with owners and due dates."

Ready to practice?

Test your knowledge of these terms in the interactive exercise.

Start exercise →