AWS MLA-C01 Glossary: ML Pipeline, Drift, and Endpoint Terms

March 30, 2026

AWS MLA-C01 glossary of ML pipeline, drift, endpoint, deployment, and monitoring terms.


Use this glossary when SageMaker and MLOps terms start to blur together. Keep it beside the cheat sheet and resources rather than treating it as a substitute for study.

High-yield terms

Term	Short meaning	Why it matters on MLA-C01
MLOps	Deployment, monitoring, versioning, and lifecycle discipline for ML systems	The exam leans more toward engineering and operations than pure modeling theory
Feature store	Managed store for reusable model features	Reduces train/serve skew by sharing one feature definition across training and inference
Endpoint	Hosted inference interface for serving model predictions	Central to real-time, async, and multi-model serving choices
Batch transform	Offline inference over a dataset rather than real-time requests	Tested against real-time and async serving patterns
Model registry	Managed inventory of model versions and approval states	Critical for rollback, traceability, and safe promotion
Drift	Production data or behavior changing away from the training or expected pattern	Core monitoring concept in live ML systems
Clarify	SageMaker tool for explainability and bias-related analysis	Often appears in fairness and explainability questions
Model Monitor	SageMaker monitoring capability for production models	Strongest first answer for drift and data-quality monitoring
Pipeline	Orchestrated ML workflow such as prepare, train, validate, and deploy	Key to retraining, CI/CD, and repeatability
Shadow deployment	Mirroring a copy of production traffic to a new model whose responses are logged rather than returned to callers	Safer comparison than blind promotion
Blue/green deployment	Rollout pattern that stands up a separate replacement (green) fleet and shifts traffic to it from the current (blue) fleet	Reduces blast radius during rollout and keeps rollback simple
Inference Recommender	SageMaker capability that benchmarks a model and recommends instance types and endpoint configurations	Helps connect model serving choices to cost and capacity
Data Wrangler	Managed data-prep workflow for transformations and feature work	Strong answer in data preparation questions
Hyperparameter tuning	Systematic search across training settings	Distinct from model choice and deployment tuning
Multi-model endpoint	Shared endpoint that serves several low-traffic models	Cost and serving-fit concept, not a training concept
VPC isolation	Keeping inference resources inside private network boundaries	Common security and deployment control on MLA-C01
Train/serve skew	Difference between how features are built for training versus live inference	A classic feature-store and data-pipeline problem
Baseline	Reference dataset or statistics used to compare later inference behavior	Central to drift and monitoring questions
Ground truth	Actual observed outcome used later to assess prediction quality	Needed to reason about delayed quality evaluation
Lineage	Record of how data, code, training, and models relate across versions	Helps with auditability, rollback, and governance
Serverless inference	Managed inference that scales to zero when idle	Often the right answer for spiky, low-duty-cycle traffic that can tolerate cold starts
Async inference	Inference pattern where requests are queued and results are delivered later rather than in the same call	Better fit for long-running, large-payload, or bursty requests than always-on real-time serving
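
To make the baseline and drift rows concrete, here is a minimal sketch of comparing live inputs against baseline statistics. It is plain Python, not the Model Monitor API; the helper names `summarize` and `drifted`, the sample values, and the two-standard-deviation threshold are illustrative assumptions.

```python
from statistics import mean, pstdev

def summarize(values):
    """Capture baseline statistics for one numeric feature."""
    return {"mean": mean(values), "std": pstdev(values)}

def drifted(baseline, live_values, max_shift_in_std=2.0):
    """Flag drift when the live mean moves too far from the baseline mean.

    The threshold (in baseline standard deviations) is an assumption
    chosen for illustration, not a value taken from any AWS service.
    """
    live_mean = mean(live_values)
    if baseline["std"] == 0:
        return live_mean != baseline["mean"]
    shift = abs(live_mean - baseline["mean"]) / baseline["std"]
    return shift > max_shift_in_std

# Baseline comes from training-time data (hypothetical values).
training_ages = [30, 32, 35, 31, 34, 33]
baseline = summarize(training_ages)

similar_live = [31, 33, 34, 32]   # resembles the training data
shifted_live = [52, 55, 58, 54]   # population has moved upward

print(drifted(baseline, similar_live))
print(drifted(baseline, shifted_live))
```

The point to carry into the exam is the order of operations: without a stored baseline there is nothing to compare live data against, so the baseline has to exist before drift can be detected.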

Commonly confused pairs

Pair	Keep this distinction clear
online inference vs batch transform	low-latency serving versus offline dataset scoring
drift vs bias	changing production behavior versus unfair or skewed model behavior
registry vs endpoint	managed version catalog versus live serving target
monitoring vs rollback	detecting trouble versus returning to a safer version
feature store vs raw training data	reusable engineered features versus general source data
batch vs async inference	offline scoring of a whole dataset versus queued per-request serving with a delayed response
model quality vs infra health	whether the predictions are still good versus whether the platform is still healthy
Clarify vs Model Monitor	fairness and explainability analysis versus production drift/data monitoring
feature engineering vs labeling	improving input signal versus creating target values
train/serve skew vs drift	inconsistent feature logic versus production behavior changing over time
lineage vs registry	end-to-end record of artifacts and steps versus governed model version catalog
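
The registry vs endpoint and monitoring vs rollback pairs can be sketched with a toy in-memory registry. This is an illustration only, not the SageMaker SDK: `ToyRegistry` and its methods are hypothetical, and the status strings are modeled on the approval states used by SageMaker Model Registry.

```python
from dataclasses import dataclass, field

@dataclass
class ModelVersion:
    version: int
    status: str = "PendingManualApproval"

@dataclass
class ToyRegistry:
    """Hypothetical catalog of model versions; it never serves traffic."""
    versions: list = field(default_factory=list)

    def register(self):
        model = ModelVersion(version=len(self.versions) + 1)
        self.versions.append(model)
        return model

    def set_status(self, version, status):
        self.versions[version - 1].status = status

    def latest_approved(self):
        approved = [m.version for m in self.versions if m.status == "Approved"]
        return approved[-1] if approved else None

registry = ToyRegistry()
registry.register()
registry.set_status(1, "Approved")
registry.register()
registry.set_status(2, "Approved")

# The "endpoint" is just whatever version is currently being served.
serving_version = registry.latest_approved()
print(serving_version)

# Monitoring only detects trouble; rollback is a separate action that
# rejects the bad version and redeploys the last approved one.
registry.set_status(2, "Rejected")
serving_version = registry.latest_approved()
print(serving_version)
```

The registry remembers every version and its state, while the endpoint only knows what it is serving right now. That separation is what makes rollback a lookup rather than a retraining job.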

If three terms blur together

Cluster	Fast separation
endpoint / registry / pipeline	endpoint serves, registry tracks versions, pipeline orchestrates workflow
drift / data quality / infra issue	drift means behavior changed over time, data quality means input is malformed or incomplete, infra issue means the platform is slow or unstable
Data Wrangler / Feature Store / Model Monitor	Data Wrangler prepares data, Feature Store stores and serves reusable features, Model Monitor watches live inference data
real-time / async / batch	real-time answers now, async answers later, batch scores offline datasets
IAM / VPC isolation / encryption	IAM controls who can act, VPC isolation controls which network paths can reach the resource, encryption protects the data itself
registry / lineage / approval	registry stores model versions, lineage shows how they were produced, approval decides whether they move forward
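
The real-time / async / batch separation can be written as a small decision function. This is a study aid under stated assumptions: `pick_serving_pattern` and its three boolean inputs are hypothetical simplifications of the glossary rows, and real questions usually add cost, payload size, and latency details.

```python
def pick_serving_pattern(offline_dataset, needs_immediate_response,
                         idle_most_of_the_time):
    """Map a traffic description to the serving pattern it usually signals.

    Simplified on purpose: the order of the checks mirrors the glossary,
    not any official AWS decision tree.
    """
    if offline_dataset:
        # Scoring a stored dataset, no live callers waiting.
        return "batch transform"
    if not needs_immediate_response:
        # Callers submit a request and collect the result later.
        return "async inference"
    if idle_most_of_the_time:
        # Spiky, low-duty-cycle traffic that still wants a direct answer.
        return "serverless inference"
    return "real-time endpoint"

print(pick_serving_pattern(True, False, False))
print(pick_serving_pattern(False, False, False))
print(pick_serving_pattern(False, True, True))
print(pick_serving_pattern(False, True, False))
```

Reading the checks top to bottom is a reasonable way to eliminate answers: rule out batch first, then async, and only then argue about endpoint type.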

One-sentence memory hooks

If the question is about reuse across training and inference, think Feature Store.
If the question is about comparing or approving model versions, think Model Registry.
If the question is about live drift or production input quality, think Model Monitor.
If the question is about fairness or explainability, think Clarify.
If the question is about costly always-on serving for light traffic, think endpoint type (serverless, async, or multi-model) before instance size.
If the question is about safe release, think staged rollout plus rollback, not just “deploy latest”.
If the question is about comparing now to earlier behavior, think baseline before you think “just add more metrics”.
If the question is about late-arriving actual outcomes, think ground truth collection before you claim you can measure live model quality immediately.
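
The ground truth hook is easier to remember with a small example. The sketch below assumes predictions and later-arriving labels share a request id; the function name `accuracy_so_far` and the sample data are hypothetical.

```python
def accuracy_so_far(predictions, ground_truth):
    """Score only the predictions whose actual outcome has arrived.

    Both arguments are dicts keyed by request id. Returns a tuple of
    (accuracy, number_of_labeled_predictions); accuracy is None while
    no ground truth is available.
    """
    labeled = [rid for rid in predictions if rid in ground_truth]
    if not labeled:
        return None, 0
    correct = sum(predictions[rid] == ground_truth[rid] for rid in labeled)
    return correct / len(labeled), len(labeled)

# Four predictions were served, but only two outcomes are known so far.
predictions = {"r1": 1, "r2": 0, "r3": 1, "r4": 1}
ground_truth = {"r1": 1, "r2": 1}

accuracy, labeled_count = accuracy_so_far(predictions, ground_truth)
print(accuracy, labeled_count)  # 0.5 2

# Before any labels arrive, quality cannot be measured at all.
print(accuracy_so_far(predictions, {}))  # (None, 0)
```

Model quality is only measurable on the labeled subset, which is why input drift is often the earliest available warning signal.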

Operational clusters worth keeping straight

Cluster	What it usually signals on the exam
data quality / leakage / train-serve skew	fix the pipeline and feature logic before changing the model
tuning / evaluation / approval	improve the model and decide whether it is promotable
endpoint fit / autoscaling / Inference Recommender	match serving shape and cost to traffic reality
drift / baseline / ground truth	decide whether the model is still behaving acceptably in production
IAM / VPC / KMS / secrets	separate identity, network boundary, encryption, and secret handling
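
For the first cluster, train/serve skew is a bug in feature logic rather than a change in the world. A minimal sketch, assuming two hypothetical feature builders (`training_features` and `serving_features`) that were meant to be identical:

```python
import math

def training_features(record):
    """Batch pipeline: log-scales the amount before training."""
    return {"amount": math.log1p(record["amount"]), "items": record["items"]}

def serving_features(record):
    """Live path: the log transform was forgotten, which creates skew."""
    return {"amount": record["amount"], "items": record["items"]}

def skewed_features(record, tolerance=1e-9):
    """List features whose training and serving values disagree."""
    train = training_features(record)
    serve = serving_features(record)
    return sorted(
        name for name in train
        if name not in serve or abs(train[name] - serve[name]) > tolerance
    )

sample = {"amount": 100.0, "items": 3}
print(skewed_features(sample))  # ['amount']
```

The same raw record produces different feature values on the two paths, so the fix belongs in the pipeline, for example by sharing one feature definition, and not in the model.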

If the confusion is really about…

Topic family	Best page to revisit
deployment and MLOps quick rules	Cheat Sheet
official AWS facts and service docs	Resources
pacing and review order	Study Plan
overall exam framing	Guide root
training, tuning, and model versioning	2.2 Training, Tuning & Versions
drift, A/B testing, and live monitoring	4.1 Monitoring, Drift & A/B
endpoint shapes and scaling	3.1 Endpoints & Containers

Revised on Monday, June 15, 2026
