ML Platform Engineer
ML Platform Engineers build the internal infrastructure that enables data scientists and ML engineers to train, deploy, monitor, and iterate on machine learning models at scale. Their English work spans technical documentation (feature store design specs, platform runbooks), cross-functional communication (explaining drift detection to product managers, presenting experiment tracking governance to compliance), and internal developer advocacy (teaching teams to use the platform correctly). This path focuses on the vocabulary of MLOps infrastructure from a platform ownership perspective.
Topics covered
- Feature store design
- Model registry & versioning
- Model drift detection
- Experiment tracking
- Batch vs real-time inference
- ML governance & reproducibility
Vocabulary spotlight
4 terms every ML Platform Engineer should know in English:
Training-serving skew: the divergence between feature computation in the training pipeline (usually Python/pandas) and in the production serving layer (often a different language or framework), which causes a model to perform differently in production than in evaluation
"The training-serving skew was traced to a normalisation function implemented differently in the Spark training job and the Java serving microservice — the feature store solved this by making both use the same feature definition."
Point-in-time correctness: the property of retrieving feature values as they existed at the moment of a historical prediction, not their current values; required to prevent future data from leaking into training examples
"Our feature store enforces point-in-time correctness for all training queries: the system joins features at the timestamp of each label, not the latest available value."
Concept drift: a change in the statistical relationship between input features and the target variable over time, causing a model trained on historical data to degrade in production; ground-truth labels are required to detect it
"Concept drift was confirmed after the marketing team changed the customer segmentation strategy — the churn model's predictions degraded because the same features now corresponded to different behaviour patterns."
Model SBOM: a Software Bill of Materials applied to ML artefacts; a manifest of the model's training data version, code version, framework dependencies, and hardware environment, used for reproducibility and compliance auditing
"Our platform generates an SBOM for every model promoted to production, enabling full reproducibility of any training run and compliance with the organisation's AI governance policy."
📚 Vocabulary Reference
Key terms organised by category for ML Platform Engineers:
- Feature Store
- Model Lifecycle
- Drift & Monitoring
- Inference Infrastructure
Recommended exercises
Real-world scenarios you'll practise
- Explaining training-serving skew to a data science team and presenting the feature store as the solution during a platform onboarding session
- Writing a design spec for the experiment tracking governance policy: what must be logged before a model can be promoted to the registry
- Presenting model drift detection monitoring to a product manager: explaining data drift vs. concept drift in non-technical terms
- Justifying the two-store architecture (offline + online) for the feature store to a VP Engineering during a platform budget review