MLA-C01 Model Monitoring, Drift, Data Quality and A/B Testing Guide

April 1, 2026

Study MLA-C01 Model Monitoring, Drift, Data Quality and A/B Testing: key concepts, common traps, and exam decision cues.


This lesson is about knowing when the model has started to behave differently in production. AWS expects ML engineers to monitor data quality and inference behavior, detect drift, watch for workflow anomalies, and compare live variants safely before major rollout decisions.

Drift: Change in the incoming data or model behavior over time that can reduce inference quality.

A/B test: Controlled comparison where two variants receive different portions of traffic so their outcomes can be compared.

Shadow comparison: Evaluation pattern where a candidate model sees production-like traffic without becoming the primary live answer path.
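The difference between the last two definitions can be sketched in a few lines of Python. This is a minimal illustration, not a SageMaker API: the `ab_route` and `shadow_route` helpers, the callable "models", and the log structure are all hypothetical names chosen for the example.

```python
import random
from typing import Callable, Dict, List, Tuple

# A "model" here is just a callable that maps a request to a score (assumption).
Model = Callable[[dict], float]


def ab_route(request: dict, current: Model, candidate: Model,
             candidate_share: float, rng: random.Random) -> Tuple[str, float]:
    """A/B test: the candidate answers a share of live traffic."""
    if rng.random() < candidate_share:
        return "candidate", candidate(request)
    return "current", current(request)


def shadow_route(request: dict, current: Model, candidate: Model,
                 shadow_log: List[Dict]) -> Tuple[str, float]:
    """Shadow comparison: the current model always answers;
    the candidate's prediction is only recorded for later analysis."""
    live = current(request)
    shadow_log.append({"request": request, "live": live,
                       "shadow": candidate(request)})
    return "current", live
```

In the A/B path some users receive the candidate's answer, so a bad candidate has real impact on that slice. In the shadow path no user ever does, which is why shadow is the safer first step when only a comparison is needed.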

What AWS is really testing here

AWS wants you to separate:

data-quality monitoring from infrastructure monitoring
drift detection from ordinary endpoint health checks
A/B testing from a blind full cutover
workflow anomaly detection from pure model-quality metrics

Start with the signal that proves the problem

    flowchart TD
        A["Unexpected production behavior"] --> B{"Model output degraded?"}
        B -->|Yes| C["Check drift, data quality, and live model behavior"]
        B -->|No| D{"Release comparison needed?"}
        D -->|Yes| E["Use A/B or shadow comparison"]
        D -->|No| F["Check infrastructure or workflow lane first"]

The exam usually punishes candidates who answer with generic monitoring language before deciding whether the issue is drift, bad incoming data, or unsafe model promotion.

Strongest-first chooser

If the stem is mainly about…	Strongest first lane
changes in feature distributions or live inference input shape	data-quality and drift monitoring
whether a new model variant is safer than the current one	A/B test or shadow comparison
detecting degrading output quality over time	model-behavior monitoring
strange failures in the pipeline or monitoring workflow itself	workflow anomaly detection
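For the first row of the chooser, a data-quality check asks whether each live record still matches what the model was trained on. The sketch below is a hand-rolled illustration under stated assumptions: the `BASELINE` feature names and ranges are hypothetical, and a managed service such as SageMaker Model Monitor would derive comparable constraints from a baseline dataset and not from a hard-coded dictionary.

```python
import math
from typing import Any, Dict, List

# Hypothetical baseline: allowed numeric range per expected feature.
BASELINE: Dict[str, Dict[str, float]] = {
    "age": {"min": 0.0, "max": 120.0},
    "price": {"min": 0.0, "max": 10_000.0},
}


def check_record(record: Dict[str, Any],
                 baseline: Dict[str, Dict[str, float]] = BASELINE) -> List[str]:
    """Return a list of data-quality violations for one inference input."""
    violations: List[str] = []
    for name, rule in baseline.items():
        value = record.get(name)
        if value is None:
            violations.append(f"{name}: missing")
        elif isinstance(value, bool) or not isinstance(value, (int, float)):
            violations.append(f"{name}: wrong type {type(value).__name__}")
        elif math.isnan(value):
            violations.append(f"{name}: NaN")
        elif not rule["min"] <= value <= rule["max"]:
            violations.append(f"{name}: out of range ({value})")
    # Features the model never saw during training are also a shape change.
    for name in sorted(set(record) - set(baseline)):
        violations.append(f"{name}: unexpected feature")
    return violations
```

A rising violation rate is an input-quality signal. It can appear while the endpoint itself reports normal latency and error rates.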

Monitoring lanes are different

Lane	Main question
Data-quality monitoring	Are live inputs still well-formed and usable?
Drift detection	Has the input or output pattern shifted enough to threaten quality?
A/B or shadow testing	Is the new model actually better or safer than the current one?
Infra monitoring	Is the serving platform healthy?

If the question is about why outputs are worsening even though the endpoint is up and responding normally, start in the drift or data-quality lane, not the infrastructure lane.
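One common way to put a number on "has the input pattern shifted" is the population stability index (PSI), which compares how a feature is distributed across bins in a baseline sample and in a live sample. The sketch below is a generic illustration, not what any AWS service computes internally; the `psi` function, the equal-width binning, and the small floor that avoids `log(0)` are choices made for this example.

```python
import math
from typing import List, Sequence


def psi(baseline: Sequence[float], live: Sequence[float], bins: int = 10) -> float:
    """Population stability index of `live` against `baseline`.

    Bins are equal-width over the baseline range; live values outside
    that range are clamped into the first or last bin.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each share so empty bins do not produce log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    base_shares, live_shares = shares(baseline), shares(live)
    return sum((l - b) * math.log(l / b)
               for b, l in zip(base_shares, live_shares))
```

A frequently quoted rule of thumb treats PSI below about 0.1 as stable and above about 0.25 as a significant shift, but those cut-offs are conventions, not guarantees. The score says the input changed; it does not by itself say the model's outputs got worse.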

If you keep missing questions in this lesson

Symptom	What is usually going wrong	Fix first
every monitoring answer looks plausible	you are not separating model signals from platform signals	ask whether the issue is output quality, input quality, or host health
drift questions feel vague	you are not anchoring on change over time in production data or behavior	ask what changed since the model last performed well
A/B and shadow tests blur together	you are not deciding whether the candidate should influence live outcomes yet	if safe comparison is needed without full live impact, favor shadow first
you keep promoting too quickly	you are ignoring controlled validation before rollout	ask whether the candidate model has been compared under realistic live conditions

Common traps

Trap	Better reading
“Endpoint is healthy, so the model is healthy.”	Infrastructure health does not prove output quality or data stability.
“Drift means the model should be replaced immediately.”	Drift is a signal for analysis, retraining, or controlled comparison, not blind promotion.
“A/B testing and shadow comparison are the same.”	A/B affects live outcomes for some traffic; shadow is safer when you only need comparison first.
“Monitoring is complete once the dashboard exists.”	MLA-C01 expects monitoring that leads to operational decisions, not just charts.

Harder scenario

A recommendation model’s endpoint latency looks normal, but click-through performance has dropped over the last month as the incoming product catalog and user behavior shifted. A new candidate model is available, but the team does not want to expose all users to it immediately.

The strongest first response is to work in the drift and controlled comparison lane: inspect the live-data shift, then use a safe A/B or shadow-style comparison before a full cutover. The core issue is not host health. It is changing production behavior.
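If the team runs an A/B comparison on click-through rate, one simple way to judge the result is a two-proportion z-test. The sketch below uses only the standard library. The function name and the click and view counts in the usage lines are hypothetical and are not taken from the scenario.

```python
import math
from typing import Tuple


def two_proportion_z(clicks_a: int, views_a: int,
                     clicks_b: int, views_b: int) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in click-through
    rate between variant B (candidate) and variant A (current)."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value under the normal approximation: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: current model 400/10000 clicks, candidate 500/10000.
z, p = two_proportion_z(400, 10_000, 500, 10_000)
print(f"z={z:.2f}, p={p:.4f}")
```

A small p-value supports promoting the candidate to more traffic; a large one suggests collecting more data or keeping the current model. The normal approximation assumes reasonably large samples in both variants.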

Decision order that usually wins

Separate platform health, model drift, and live variant comparison.
If the endpoint is healthy but prediction quality is degrading, think drift and model-monitoring before infrastructure resizing.
If the question is about watching live data quality and model behavior, think Amazon SageMaker Model Monitor.
If the team wants evidence before switching all traffic, think A/B or controlled live testing.
Do not confuse service stability with model quality stability.
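On SageMaker, an A/B split is commonly expressed as two production variants with different weights in one endpoint configuration. The sketch below only builds the request payload and does not call AWS. The helper name, model names, default instance type, and traffic share are hypothetical; the field names follow the `CreateEndpointConfig` request shape as I understand it and should be checked against the current API reference.

```python
from typing import Any, Dict


def ab_endpoint_config(name: str, current_model: str, candidate_model: str,
                       candidate_share: float,
                       instance_type: str = "ml.m5.large") -> Dict[str, Any]:
    """Build a payload that splits traffic between two model variants."""
    if not 0.0 < candidate_share < 1.0:
        raise ValueError("candidate_share must be strictly between 0 and 1")

    def variant(variant_name: str, model_name: str, weight: float) -> Dict[str, Any]:
        return {
            "VariantName": variant_name,
            "ModelName": model_name,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
            # Traffic is distributed in proportion to the variant weights.
            "InitialVariantWeight": weight,
        }

    return {
        "EndpointConfigName": name,
        "ProductionVariants": [
            variant("current", current_model, 1.0 - candidate_share),
            variant("candidate", candidate_model, candidate_share),
        ],
    }


config = ab_endpoint_config("rec-ab-test", "rec-model-v1", "rec-model-v2", 0.1)
# With credentials configured, this could be passed on as:
#   boto3.client("sagemaker").create_endpoint_config(**config)
```

Starting the candidate at a small weight keeps most users on the current model while evidence accumulates. The same request also has a `ShadowProductionVariants` field for shadow deployments; verify its constraints in the API reference before relying on it.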


Revised on Monday, June 15, 2026
