AWS MLA-C01 deployment guide covering endpoints, containers, networking, orchestration, and retraining decisions.
This chapter is where MLA-C01 tests whether you can get a model into production safely and keep the workflow repeatable. AWS expects ML engineers to choose the right endpoint type, provision compute sensibly, automate infrastructure, and wire CI/CD or retraining flows that still allow rollback.
AWS currently weights Deployment and Orchestration of ML Workflows at 22% of scored content.
This domain tests whether your ML solution can survive contact with production. Strong answers here:
| Lesson | Focus |
|---|---|
| 3.1 Endpoints & Containers | Learn how to match real-time, async, batch, multi-model, and container choices to the deployment requirement. |
| 3.2 IaC, Autoscaling & Networking | Learn how provisioning, autoscaling, VPC placement, and endpoint resource controls shape production behavior. |
| 3.3 ML CI/CD & Retraining | Learn how pipelines, retraining flows, tests, and rollback strategy keep ML delivery repeatable. |
| If the question is really about… | Go first to… |
|---|---|
| real-time vs async vs batch, CPU vs GPU, containers, SageMaker endpoints, ECS, EKS, Lambda, or edge optimization | 3.1 Endpoint Types, Containers, Deployment Targets & Trade-Offs |
| CloudFormation, CDK, VPC-hosted endpoints, autoscaling policies, or inference capacity sizing | 3.2 IaC, Autoscaling, VPC Hosting & Resource Provisioning |
| CodePipeline, CodeBuild, EventBridge retraining, tests, deployment flow, or rollback | 3.3 ML CI/CD, Orchestration, Retraining & Rollback |
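The first routing row depends on classifying the request pattern before naming a service. The following is a minimal Python sketch of that triage, not an official decision rule. The `Workload` type, the `choose_inference_pattern` function, and both thresholds are hypothetical; the 6 MB and 60 second cutoffs are illustrative placeholders, so check the current SageMaker quotas before relying on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of an inference workload."""
    needs_immediate_response: bool  # caller blocks waiting for the prediction
    payload_mb: float               # size of a single request payload
    max_processing_seconds: float   # worst-case time to score one request
    arrives_as_dataset: bool        # input is a stored dataset, not live requests


# Illustrative thresholds only (assumptions, not authoritative service limits).
REALTIME_MAX_PAYLOAD_MB = 6.0
REALTIME_MAX_SECONDS = 60.0


def choose_inference_pattern(w: Workload) -> str:
    """Return 'batch', 'real-time', or 'async' for the given workload."""
    # Stored datasets with no waiting caller fit offline batch scoring.
    if w.arrives_as_dataset and not w.needs_immediate_response:
        return "batch"
    # Small, fast requests with a waiting caller fit a real-time endpoint.
    if (
        w.needs_immediate_response
        and w.payload_mb <= REALTIME_MAX_PAYLOAD_MB
        and w.max_processing_seconds <= REALTIME_MAX_SECONDS
    ):
        return "real-time"
    # Everything else (large payloads, long processing, no waiting caller)
    # is queued and processed asynchronously.
    return "async"


if __name__ == "__main__":
    chat = Workload(True, 0.5, 0.2, False)
    video = Workload(False, 200.0, 600.0, False)
    nightly = Workload(False, 50.0, 5.0, True)
    for name, w in [("chat", chat), ("video", video), ("nightly", nightly)]:
        print(name, "->", choose_inference_pattern(w))
```

The point of the sketch is the order of the questions: latency and request shape are settled first, and only then does a service name (SageMaker endpoint, ECS, EKS, Lambda) enter the answer.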
| Symptom | What is usually going wrong | Fix first |
|---|---|---|
| every serving option sounds plausible | you are not classifying the latency and request pattern first | rework 3.1 and decide real-time vs async vs batch before naming a service |
| autoscaling and provisioning answers blur together | you are mixing baseline capacity, network placement, and runtime elasticity | rework 3.2 and separate static provisioning choices from dynamic scaling behavior |
| CI/CD questions feel too DevOps-heavy | you are missing the ML-specific parts: validation, registry, retraining, and rollback | rework 3.3 and track what is unique about model delivery versus generic app delivery |
| you keep choosing complex orchestration | you are undervaluing repeatability and safe rollback | prefer the simpler repeatable path that still meets the production requirement |
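The second symptom row, separating static provisioning from dynamic scaling, can be made concrete with a small sketch. The helper below is hypothetical and only builds request dictionaries; it does not call AWS. The key names follow the Application Auto Scaling request shape for a SageMaker endpoint variant as commonly documented, and the endpoint name, variant name, and target value are made-up examples.

```python
from typing import Any, Dict, Tuple


def build_scaling_requests(
    endpoint_name: str,
    variant_name: str,
    min_instances: int,
    max_instances: int,
    target_invocations_per_instance: float,
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Build (scalable_target, scaling_policy) request dictionaries.

    Static choice:  the min/max instance bounds (baseline capacity).
    Dynamic choice: the target-tracking policy that moves within those bounds.
    """
    if not 0 <= min_instances <= max_instances:
        raise ValueError("require 0 <= min_instances <= max_instances")

    # Fields shared by both requests identify the variant being scaled.
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }

    # Static provisioning: the capacity floor and ceiling.
    scalable_target = {
        **common,
        "MinCapacity": min_instances,
        "MaxCapacity": max_instances,
    }

    # Runtime elasticity: track invocations per instance toward a target.
    scaling_policy = {
        **common,
        "PolicyName": f"{variant_name}-invocations-target-tracking",  # hypothetical name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(target_invocations_per_instance),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
        },
    }
    return scalable_target, scaling_policy


if __name__ == "__main__":
    target, policy = build_scaling_requests("demo-endpoint", "AllTraffic", 1, 4, 70)
    print(target)
    print(policy)
```

With credentials available, these dictionaries would typically be passed to the boto3 `application-autoscaling` client as `register_scalable_target(**target)` and `put_scaling_policy(**policy)`. Keeping the two requests separate mirrors the distinction the table asks for: the bounds are a provisioning decision, and the policy is a runtime behavior.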
Make sure you can explain:
Then move to 4. Operations, where AWS expects you to operate the full system after it is live.