Intermediate Vocabulary

Data Science & ML Vocabulary

5 exercises covering essential vocabulary for data scientists, ML engineers, and analysts: model evaluation metrics, pipeline terminology, and the concepts you need to discuss AI systems in English.

Core ML vocabulary clusters
  • Model quality: overfitting, underfitting, generalisation, bias-variance trade-off
  • Evaluation: accuracy, precision, recall, F1 score, AUC-ROC, confusion matrix
  • Training: hyperparameter, epoch, batch size, learning rate, gradient descent
  • Data: feature, label, training/validation/test split, feature engineering, normalisation
  • Infrastructure: ML pipeline, feature store, model registry, experiment tracking
  • Explainability: interpretability, SHAP, LIME, feature importance, attention
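
To make the evaluation terms above concrete, here is a minimal sketch in plain Python that derives accuracy, precision, recall, and F1 score from the four cells of a binary confusion matrix. The function names (confusion_counts, evaluate) and the sample labels are hypothetical, chosen only for illustration; in practice a library such as scikit-learn would usually compute these for you.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count the four confusion-matrix cells for a binary problem.

    Returns (tp, fp, fn, tn). The function name and argument order are
    illustrative assumptions, not taken from any particular library.
    """
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # true positive: predicted positive, actually positive
        elif pred == positive:
            fp += 1  # false positive: predicted positive, actually negative
        elif truth == positive:
            fn += 1  # false negative: predicted negative, actually positive
        else:
            tn += 1  # true negative: predicted negative, actually negative
    return tp, fp, fn, tn


def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of everything predicted positive, how much was correct?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything actually positive, how much did we find?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration: 4 positives and 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

print(confusion_counts(y_true, y_pred))  # (tp, fp, fn, tn)
print(evaluate(y_true, y_pred))
```

With these sample labels the model finds 3 of the 4 positives and raises 1 false alarm, so precision and recall are both 0.75 while accuracy is 0.8. On imbalanced data the gap between accuracy and the other metrics tends to be much larger, which is why the vocabulary distinguishes them.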
Exercise 1 of 5
A data scientist explains their work to a colleague. Which sentence correctly describes overfitting?