MLA-C01 Model Monitoring, Drift, Data Quality and A/B Testing Guide

Study MLA-C01 Model Monitoring, Drift, Data Quality and A/B Testing: key concepts, common traps, and exam decision cues.

This lesson is about knowing when the model has started to behave differently in production. AWS expects ML engineers to monitor data quality and inference behavior, detect drift, watch for workflow anomalies, and compare live variants safely before major rollout decisions.

Drift: Change in the incoming data or model behavior over time that can reduce inference quality.
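To make "change in the incoming data" concrete, here is a minimal sketch of one common drift score, the Population Stability Index (PSI), computed between a baseline sample and a live sample of a single numeric feature. The function name `psi`, the bin count, and the smoothing constant `eps` are assumptions chosen for illustration; this is not how any particular AWS service computes drift.

```python
import math

def psi(baseline, live, bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples.

    Bins are equal-width over the baseline range; live values outside
    that range are clamped into the first or last bin.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant features

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[max(0, min(bins - 1, idx))] += 1
        total = len(values)
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / total, eps) for c in counts]

    p = proportions(baseline)
    q = proportions(live)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))
```

A score near zero means the live distribution still resembles the baseline. Any alert threshold is a team decision, and a high score is a prompt to investigate rather than proof that quality has dropped.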

A/B test: Controlled comparison where two variants receive different portions of traffic so their outcomes can be compared.

Shadow comparison: Evaluation pattern where a candidate model sees production-like traffic without becoming the primary live answer path.
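The difference between the two patterns is easiest to see in routing logic. The sketch below is a toy, in-process illustration: `ab_route`, `shadow_route`, `candidate_share`, and `shadow_log` are hypothetical names, and the models are plain callables rather than real endpoints.

```python
import random

def ab_route(request, current, candidate, candidate_share=0.1, rng=random):
    """A/B: a share of live traffic is actually answered by the candidate."""
    if rng.random() < candidate_share:
        return "candidate", candidate(request)
    return "current", current(request)

def shadow_route(request, current, candidate, shadow_log):
    """Shadow: the current model always answers; the candidate's output
    is only recorded for offline comparison."""
    live_answer = current(request)
    shadow_log.append((request, live_answer, candidate(request)))
    return "current", live_answer
```

In the A/B path some users receive the candidate's answer, so outcomes such as clicks can be compared directly. In the shadow path no user ever sees the candidate's answer, which is why it is the lower-risk first step.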

What AWS is really testing here

AWS wants you to separate:

  • data-quality monitoring from infrastructure monitoring
  • drift detection from ordinary endpoint health checks
  • A/B testing from a blind full cutover
  • workflow anomaly detection from pure model-quality metrics

Start with the signal that proves the problem

    flowchart TD
      A["Unexpected production behavior"] --> B{"Model output degraded?"}
      B -->|Yes| C["Check drift, data quality, and live model behavior"]
      B -->|No| D{"Release comparison needed?"}
      D -->|Yes| E["Use A/B or shadow comparison"]
      D -->|No| F["Check infrastructure or workflow lane first"]

The exam usually punishes candidates who answer with generic monitoring language before deciding whether the issue is drift, bad incoming data, or unsafe model promotion.
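The flowchart above reduces to two yes/no questions asked in a fixed order. A minimal sketch, with the hypothetical function name `first_lane` and the lane labels taken from the chart:

```python
def first_lane(output_degraded: bool, release_comparison_needed: bool) -> str:
    """Mirror the flowchart: check output quality before anything else."""
    if output_degraded:
        return "drift, data quality, and live model behavior"
    if release_comparison_needed:
        return "A/B or shadow comparison"
    return "infrastructure or workflow"
```

The ordering is the point: degraded output sends you to the model-signal lane even when a release comparison is also on the table.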

Strongest-first chooser

| If the stem is mainly about… | Strongest first lane |
| --- | --- |
| changes in feature distributions or live inference input shape | data-quality and drift monitoring |
| whether a new model variant is safer than the current one | A/B test or shadow comparison |
| detecting degrading output quality over time | model-behavior monitoring |
| strange failures in the pipeline or monitoring workflow itself | workflow anomaly detection |

Monitoring lanes are different

| Lane | Main question |
| --- | --- |
| Data-quality monitoring | Are live inputs still well-formed and usable? |
| Drift detection | Has the input or output pattern shifted enough to threaten quality? |
| A/B or shadow testing | Is the new model actually better or safer than the current one? |
| Infra monitoring | Is the serving platform healthy? |

If the question is about why outputs are worsening even though the endpoint is up, it is usually not an infrastructure question first.
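An endpoint can return HTTP 200 for every request while the inputs it receives are missing fields or out of range. The sketch below shows what a data-quality check asks, independent of host health. The schema format (feature name mapped to expected type, minimum, and maximum) and the function name are assumptions for illustration, not a managed service's constraint format.

```python
def data_quality_violations(record, schema):
    """Return human-readable violations for one inference input.

    schema maps feature name -> (expected type, min, max).
    """
    violations = []
    for name, (expected_type, lo, hi) in schema.items():
        if name not in record or record[name] is None:
            violations.append(f"{name}: missing")
            continue
        value = record[name]
        if not isinstance(value, expected_type):
            violations.append(f"{name}: expected {expected_type.__name__}")
            continue
        if not lo <= value <= hi:
            violations.append(f"{name}: {value} outside [{lo}, {hi}]")
    return violations
```

A rising violation rate is a model-input signal. CPU, memory, and latency dashboards will not show it.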

If you keep missing questions in this lesson

| Symptom | What is usually going wrong | Fix first |
| --- | --- | --- |
| every monitoring answer looks plausible | you are not separating model signals from platform signals | ask whether the issue is output quality, input quality, or host health |
| drift questions feel vague | you are not anchoring on change over time in production data or behavior | ask what changed since the model last performed well |
| A/B and shadow tests blur together | you are not deciding whether the candidate should influence live outcomes yet | if safe comparison is needed without full live impact, favor shadow first |
| you keep promoting too quickly | you are ignoring controlled validation before rollout | ask whether the candidate model has been compared under realistic live conditions |

Common traps

| Trap | Better reading |
| --- | --- |
| “Endpoint is healthy, so the model is healthy.” | Infrastructure health does not prove output quality or data stability. |
| “Drift means the model should be replaced immediately.” | Drift is a signal for analysis, retraining, or controlled comparison, not blind promotion. |
| “A/B testing and shadow comparison are the same.” | A/B affects live outcomes for some traffic; shadow is safer when you only need comparison first. |
| “Monitoring is complete once the dashboard exists.” | MLA-C01 expects monitoring that leads to operational decisions, not just charts. |

Harder scenario

A recommendation model’s endpoint latency looks normal, but click-through performance has dropped over the last month as the incoming product catalog and user behavior shifted. A new candidate model is available, but the team does not want to expose all users to it immediately.

The strongest first response is to work in the drift and controlled-comparison lanes: inspect the live-data shift, then run a shadow comparison or a limited-traffic A/B test before any full cutover. The core issue is not host health; it is changing production behavior.
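If the team does run a limited A/B test on click-through rate, the comparison should rest on more than eyeballing two percentages. Below is a minimal sketch of a two-proportion z-test using only the standard library. The function name `ctr_z_test` is an assumption, and the test assumes large, independent samples with at least some clicks in the pooled data.

```python
import math

def ctr_z_test(clicks_a, views_a, clicks_b, views_b):
    """Two-sided two-proportion z-test for variant B versus variant A.

    Returns (z, p_value); positive z means B has the higher click rate.
    """
    p_a = clicks_a / views_a
    p_b = clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A small p-value suggests the difference is unlikely to be noise, but it does not say whether the lift is large enough to justify promotion. That remains an operational decision.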

Decision order that usually wins

  1. Separate platform health, model drift, and live variant comparison.
  2. If the endpoint is healthy but prediction quality is degrading, think drift and model-monitoring before infrastructure resizing.
  3. If the question is about watching live data quality and model behavior, think Amazon SageMaker Model Monitor.
  4. If the team wants evidence before switching all traffic, think A/B or controlled live testing.
  5. Do not confuse service stability with model quality stability.
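For step 4, SageMaker real-time endpoints can split traffic across weighted production variants. The sketch below only builds the `ProductionVariants` list in the shape used by the `create_endpoint_config` request; it does not call AWS. The helper name, variant names, default weight, instance count, and instance type are all illustrative assumptions.

```python
def ab_production_variants(current_model, candidate_model,
                           candidate_weight=0.1,
                           instance_type="ml.m5.large"):
    """Build a two-variant traffic split (model names are placeholders)."""
    def variant(name, model, weight):
        return {
            "VariantName": name,
            "ModelName": model,
            "InitialInstanceCount": 1,
            "InstanceType": instance_type,
            "InitialVariantWeight": weight,
        }
    return [
        variant("current", current_model, 1.0 - candidate_weight),
        variant("candidate", candidate_model, candidate_weight),
    ]
```

Traffic is distributed in proportion to the weights, so a small candidate weight limits exposure while evidence accumulates.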

Revised on Sunday, May 10, 2026