Intermediate · 15 terms

Cloud Infrastructure

Core vocabulary for cloud engineers: VPC, availability zones, auto-scaling, IAM, serverless, IaC, managed services, and cost concepts across AWS, GCP, and Azure.

  • Region /ˈriːdʒən/

    A geographic area where a cloud provider operates data centres. AWS has regions like us-east-1, eu-west-1. Each region is independent — data stored in one region does not leave it unless you explicitly replicate it. Choosing a region affects latency, compliance, and service availability.

    "We deploy to eu-west-1 for our European users — GDPR requires that personal data stays within the EU, and this region provides the lowest latency to our primary customer base in Germany and France."
  • Availability Zone (AZ) /əˌveɪləˈbɪlɪti zoʊn/

    An isolated data centre (or cluster of data centres) within a cloud region. Multiple AZs in a region are physically separated but connected by low-latency links. Deploying across multiple AZs ensures high availability — a failure in one AZ does not take down your service.

    "We deploy our application across three AZs in the same region — if a fire or power failure affects one data centre, the load balancer automatically routes traffic to the remaining two."
  • VPC (Virtual Private Cloud) /viː piː siː/

    A logically isolated virtual network within a cloud provider. You define IP address ranges, subnets (public and private), route tables, and gateways. Resources in a VPC are not accessible from the internet by default — you must explicitly allow traffic.

    "Our database instances live in private subnets with no internet route — only the application servers in the public subnet can reach the database, and only via the VPC's internal IP range."
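    The subnet layout described above can be sketched with the standard library's `ipaddress` module. The CIDR ranges here are placeholders, not taken from the original.

```python
import ipaddress

# Hypothetical VPC layout: a /16 VPC carved into /24 subnets.
vpc = ipaddress.ip_network("10.0.0.0/16")
subnets = list(vpc.subnets(new_prefix=24))

public = subnets[0]      # e.g. 10.0.0.0/24 — load balancers, app servers
private_db = subnets[1]  # e.g. 10.0.1.0/24 — databases, no internet route

print(public, private_db, len(subnets))
```

    A /16 yields 256 possible /24 subnets; route tables and gateways then decide which of them can reach the internet.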
  • Load Balancer /loʊd ˈbælənsər/

    A service that distributes incoming traffic across multiple backend instances. Types: Application Load Balancer (ALB — HTTP/HTTPS, path-based routing), Network Load Balancer (NLB — TCP/UDP, ultra-low latency), Classic Load Balancer (legacy). Performs health checks and stops routing to unhealthy instances.

    "The ALB routes /api/* requests to our backend fleet and /* to the static frontend. Health checks run every 10 seconds — if an instance fails 3 consecutive checks, it is removed from the target group."
  • Auto-scaling /ˈɔːtoʊ ˈskeɪlɪŋ/

    Automatically adjusting the number of running instances based on demand metrics (CPU, request rate, queue depth). Scale-out: add instances when load increases. Scale-in: remove instances when load drops. Improves availability and reduces cost compared to static provisioning.

    "Our auto-scaling group scales out when CPU exceeds 70% for 2 minutes and scales in when it drops below 30% for 10 minutes — during peak traffic we run 12 instances; overnight it drops to 2."
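    The policy in the quote reduces to a threshold rule with floor and ceiling. The numbers below mirror the quote but are otherwise illustrative.

```python
# Toy scaling policy: scale out above 70% CPU, scale in below 30%,
# bounded by the group's min/max size.
MIN_INSTANCES, MAX_INSTANCES = 2, 12

def desired_count(current: int, cpu_percent: float) -> int:
    if cpu_percent > 70:
        return min(current + 1, MAX_INSTANCES)   # scale-out
    if cpu_percent < 30:
        return max(current - 1, MIN_INSTANCES)   # scale-in
    return current                               # within band: no change

print(desired_count(4, 85))   # 5  (scale out)
print(desired_count(4, 20))   # 3  (scale in)
print(desired_count(2, 10))   # 2  (floor holds)
```

    Real auto-scaling groups add cooldown periods (the "for 2 minutes" / "for 10 minutes" in the quote) so brief spikes don't cause thrashing.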
  • Managed Service /ˈmænɪdʒd ˈsɜːrvɪs/

    A cloud-provider-operated service where the provider handles infrastructure management: patching, backups, scaling, and availability. You use the service, not the underlying servers. Examples: RDS (managed database), S3 (managed object storage), SQS (managed message queue), ElastiCache (managed Redis).

    "We replaced our self-hosted PostgreSQL cluster with Amazon RDS — we no longer manage OS patches, backups, or failover. The time saved goes into feature development, not infrastructure maintenance."
  • IAM (Identity and Access Management) /aɪ eɪ em/

    The cloud permission system that controls who (users, services, roles) can do what (actions) on which resources. Follows the principle of least privilege — each service should have only the permissions it needs and no more. Misconfigured IAM is a leading cause of cloud security incidents.

    "Our Lambda function has an IAM role that allows only s3:GetObject on the specific bucket it reads from — it cannot list buckets, write files, or access any other AWS service, limiting blast radius if the function is compromised."
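    The least-privilege policy in the quote looks roughly like this as an IAM policy document. The bucket name is a placeholder.

```python
import json

# Sketch of the least-privilege policy described above. "2012-10-17" is
# the standard IAM policy language version; the bucket ARN is hypothetical.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject"],                        # read objects, nothing else
        "Resource": "arn:aws:s3:::example-input-bucket/*",
    }],
}
print(json.dumps(policy, indent=2))
```

    Everything not explicitly allowed is denied, so this role cannot list buckets, write objects, or touch any other service.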
  • Serverless /ˈsɜːrvərles/

    An execution model where you deploy code (functions) without managing servers. The cloud provider allocates resources on demand, scales automatically, and charges per execution rather than for idle servers. Examples: AWS Lambda, Google Cloud Functions, Azure Functions. Cold start: extra latency on the first invocation after an idle period.

    "Our image resizing pipeline is serverless — each upload triggers a Lambda function that creates thumbnails. We pay only for the seconds of execution time, not for idle capacity between uploads."
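    A Lambda-style handler for the thumbnail example can be sketched so it runs locally; the event shape here simplifies an S3 put notification, and the key names are assumptions.

```python
# Minimal serverless-style handler: receives an event, returns a result.
# In a real Lambda, the runtime calls this on each upload notification.
def handler(event, context=None):
    key = event["s3"]["object"]["key"]
    # Real code would download the image, resize it, and upload the copies.
    return {"thumbnail_key": f"thumbnails/{key}", "status": "created"}

print(handler({"s3": {"object": {"key": "uploads/cat.jpg"}}}))
```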
  • Object Storage /ˈɒbdʒɪkt ˈstɔːrɪdʒ/

    Flat storage of objects accessed via HTTP API (no filesystem hierarchy). Each object has a unique key and metadata. Effectively unlimited in capacity, highly durable (99.999999999% — "11 nines" — on S3), and cheap for large volumes. Examples: S3, GCS, Azure Blob Storage. Contrast: block storage (EBS — for databases), file storage (EFS — for shared filesystems).

    "All user-uploaded files go to S3 — we generate a pre-signed URL valid for 1 hour when the client needs to upload directly, so the file never passes through our API servers and we avoid bandwidth costs."
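    The idea behind a pre-signed URL is that the server signs the key plus an expiry with a secret, so the storage service can verify the link without consulting the issuer. This is a conceptual HMAC sketch, not the real S3 SigV4 scheme, and all names are placeholders.

```python
import hashlib
import hmac

SECRET = b"server-side-secret"   # placeholder signing secret

def presign(key: str, expires_at: int) -> str:
    """Return an upload URL carrying an expiry and a signature over key+expiry."""
    sig = hmac.new(SECRET, f"{key}:{expires_at}".encode(), hashlib.sha256).hexdigest()
    return f"/upload/{key}?expires={expires_at}&sig={sig}"

def verify(key: str, expires_at: int, sig: str, now: int) -> bool:
    """Accept only if the signature matches and the link has not expired."""
    expected = hmac.new(SECRET, f"{key}:{expires_at}".encode(), hashlib.sha256).hexdigest()
    return now < expires_at and hmac.compare_digest(sig, expected)

url = presign("avatar.png", expires_at=1_700_003_600)
sig = url.split("sig=")[1]
print(verify("avatar.png", 1_700_003_600, sig, now=1_700_000_000))  # valid, unexpired
```

    Tampering with the key or the expiry invalidates the signature, which is why the client can be trusted to upload directly.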
  • CDN (Content Delivery Network) /siː diː en/

    A globally distributed network of edge servers that cache content close to end users. Reduces latency for static assets (images, JS, CSS) and can absorb DDoS traffic. Examples: CloudFront, Cloudflare, Fastly, Akamai. Origin: the source server the CDN pulls from when the cache misses.

    "Our images are served via CloudFront — the edge node in Frankfurt serves German users without a round-trip to our US origin. Cache hit ratio is 94%, saving significant bandwidth costs."
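    The bandwidth saving follows directly from the cache-hit ratio: only misses travel to the origin. The monthly volume below is a hypothetical figure, not from the quote.

```python
# Rough origin-offload estimate from a 94% cache-hit ratio.
hit_ratio = 0.94
monthly_gb = 5000                                  # assumed monthly traffic volume
origin_gb = round(monthly_gb * (1 - hit_ratio), 1) # only misses hit the origin
print(origin_gb)
```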
  • Infrastructure as Code (IaC) /ˈɪnfrəstrʌktʃər æz koʊd/

    Defining and managing cloud infrastructure through code files rather than the console UI. Changes are version-controlled, peer-reviewed, and applied by automation. Examples: Terraform (multi-cloud), CloudFormation (AWS), Pulumi (code-native), CDK (AWS).

    "All our AWS infrastructure is managed via Terraform — no one clicks in the console. If an engineer accidentally deletes resources, we restore them with terraform apply. Every infra change goes through a PR review."
  • Spot Instance / Preemptible VM /spɒt ˈɪnstəns/

    Discounted cloud instances (60–90% cheaper) that can be interrupted by the cloud provider on short notice (typically 2 minutes warning) when capacity is needed for on-demand customers. Suitable for fault-tolerant, stateless, or batch workloads that can handle interruption.

    "Our ML training jobs run on Spot Instances — each job checkpoints progress every 10 minutes. If the instance is interrupted, the next one resumes from the last checkpoint. We save 75% on compute costs."
  • Egress Cost /ˈiːɡres kɒst/

    The charge for data leaving a cloud provider's network (egress = outbound); ingress (data in) is typically free. Egress costs are a significant budget item for data-heavy applications and a major concern for multi-cloud or cloud exit strategies.

    "Moving 1TB from S3 to our office costs about $90 in egress fees — we now export each dataset once and cache it locally instead of re-downloading. Reads within AWS (EC2 to S3 in the same region) incur no transfer charge."
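    The $90-per-TB figure follows from a per-gigabyte rate. The calculation below assumes $0.09/GB, which matches the first AWS pricing tier at the time of writing; actual tiers vary by volume and region.

```python
# Back-of-envelope egress cost at an assumed $0.09/GB rate.
RATE_PER_GB = 0.09

def egress_cost_usd(gigabytes: float) -> float:
    return round(gigabytes * RATE_PER_GB, 2)

print(egress_cost_usd(1024))   # roughly 1 TB
```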
  • Health Check /helθ tʃek/

    A periodic probe sent to a service instance to verify it is responding correctly. Load balancers, orchestrators (Kubernetes), and monitoring systems use health checks to route traffic only to healthy instances. A common pattern is an HTTP /health endpoint returning 200. Liveness: is the process running? Readiness: is it ready to take traffic?

    "Our /health endpoint returns 200 only when the database connection is alive — this means the load balancer stops sending traffic to an instance that has lost its DB connection, preventing cascading failures."
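    A readiness-style handler like the one in the quote can be sketched as: return 200 only when the dependency responds. `db_ping` is a stand-in for a real connection check.

```python
# Sketch of a readiness-style /health handler: 200 only when the DB
# connection is alive; 503 tells the load balancer to stop routing here.
def health_handler(db_ping) -> tuple[int, str]:
    try:
        db_ping()
        return 200, "ok"
    except Exception:
        return 503, "db unavailable"

def broken_ping():
    raise ConnectionError("db down")

print(health_handler(lambda: None))   # healthy dependency
print(health_handler(broken_ping))    # lost DB connection
```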
  • SLA / Uptime SLA /es el eɪ/

    Service Level Agreement — a contractual commitment to availability. 99.9% ("three nines") = 8.76 hours downtime/year. 99.99% ("four nines") = 52.6 minutes/year. 99.999% ("five nines") = 5.3 minutes/year. Cloud providers publish SLAs per service; you build your product SLA on top of them.

    "AWS RDS Multi-AZ has a 99.95% SLA — with two dependent services at 99.95% each, our combined availability target must account for the multiplicative effect when designing our own SLA commitment to customers."
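    Both the downtime figures and the multiplicative effect in the quote are straightforward arithmetic:

```python
# Downtime-per-year arithmetic from the nines table above, plus the
# multiplicative effect of stacking serial dependencies.
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600

def downtime_minutes(availability: float) -> float:
    return round((1 - availability) * MINUTES_PER_YEAR, 1)

print(downtime_minutes(0.999))    # "three nines": 525.6 min ≈ 8.76 h
print(downtime_minutes(0.9999))   # "four nines": 52.6 min

# Two serial dependencies at 99.95% each multiply:
combined = 0.9995 * 0.9995
print(round(combined * 100, 3))   # lower than either dependency alone
```

    This is why a product built on two 99.95% services cannot honestly promise 99.95% itself.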