Study SOA-C03 Scaling, Elasticity and Caching: key concepts, common traps, and exam decision cues.
This lesson is about keeping workloads responsive without throwing random capacity at the wrong bottleneck. SOA-C03 expects you to read the symptom first, then decide whether the fix belongs in the compute tier, the cache tier, or the managed-database tier.
Elasticity: The ability to add or remove capacity as demand changes, instead of provisioning for peak load all the time.
Caching layer: Faster read path that serves repeated requests without hitting the origin or database every time.
AWS wants you to recognize which lane each symptom points to:
| Symptom | Strongest first lane | Why |
|---|---|---|
| Stateless web tier cannot keep up with burst traffic | Auto Scaling policy review | The core issue is changing request volume across horizontally scalable instances. |
| Same objects or pages are requested repeatedly from many locations | CloudFront or ElastiCache | Repeated reads usually mean cache relief is stronger than raw compute growth. |
| Database reads are saturated while application hosts stay healthy | Read scaling or cache in front of the data tier | Adding app instances does not fix a read-bound database. |
| Queue backlog rises during bursts but workers are otherwise fine | Queue-depth-driven worker scaling | The issue is elasticity of consumers, not the producer path. |
| One large instance is consistently maxed by a stateful workload | Vertical resize or workload redesign | Not every workload can scale out cleanly. |
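
For the queue-backlog row, one common way to reason about consumer elasticity is backlog per worker: how many messages a single worker can drain within the latency you are willing to accept. The sketch below is only an illustration of that arithmetic, not an AWS API. The function name, its parameters, and the default bounds are all assumptions made for the example.

```python
import math


def desired_workers(queue_depth: int,
                    avg_seconds_per_message: float,
                    acceptable_latency_seconds: float,
                    min_workers: int = 1,
                    max_workers: int = 20) -> int:
    """Estimate worker count from queue backlog (hypothetical helper)."""
    if avg_seconds_per_message <= 0 or acceptable_latency_seconds <= 0:
        raise ValueError("timings must be positive")
    # Messages one worker can process within the latency budget.
    backlog_per_worker = acceptable_latency_seconds / avg_seconds_per_message
    needed = math.ceil(queue_depth / backlog_per_worker)
    # Clamp to the assumed minimum and maximum group size.
    return max(min_workers, min(max_workers, needed))
```

With 1,500 visible messages, 0.5 seconds per message, and a 60-second latency budget, each worker can absorb 120 messages, so the estimate is 13 workers. The point for the exam is that the scaling signal is queue depth relative to fleet size, not CPU on the producers.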

Match each need to its strongest first service or pattern:

| Need | Strongest first service or pattern | Why it fits |
|---|---|---|
| Dynamic stateless compute capacity | EC2 Auto Scaling with target tracking or step scaling | Lets capacity follow load signals automatically. |
| Global repeated content requests | CloudFront | Reduces origin load and shortens the network path to the viewer. |
| Low-latency repeated application reads | ElastiCache | Offloads hot reads from the database tier. |
| Managed NoSQL throughput elasticity | DynamoDB auto scaling or on-demand mode | Database scaling is built into the service model. |
| Managed relational read relief | Read replicas or cache | Relational read pressure is relieved by replicas or caching, not by adding app-tier instances. |
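
To make the first row concrete, here is a minimal sketch of the request shape for a target tracking policy on an EC2 Auto Scaling group. It only builds the keyword arguments, so it runs without AWS credentials. The helper name, the policy naming convention, and the example target value are assumptions. The keys follow the Auto Scaling `PutScalingPolicy` API.

```python
def target_tracking_policy_kwargs(asg_name: str, target_cpu_percent: float) -> dict:
    """Build keyword arguments for a CPU target tracking policy (hypothetical helper)."""
    if not 0 < target_cpu_percent <= 100:
        raise ValueError("target_cpu_percent must be in (0, 100]")
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target",  # assumed naming convention
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                # Average CPU across the group's instances.
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": float(target_cpu_percent),
            "DisableScaleIn": False,
        },
    }
```

With boto3 installed and credentials configured, the result could be passed as `boto3.client("autoscaling").put_scaling_policy(**kwargs)`. Target tracking suits the stateless lane because the group adds or removes instances to hold the metric near the target, with no hand-built alarm thresholds.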

```mermaid
flowchart TD
    A["Performance issue under load"] --> B{"What is actually saturated?"}
    B -->|"Stateless app tier"| C["Review Auto Scaling policy, health checks, cooldowns, and launch capacity"]
    B -->|"Repeated reads or hot content"| D["Add or tune CloudFront or ElastiCache"]
    B -->|"Managed database reads or throughput"| E["Use database-native scaling or offload reads"]
    B -->|"Single stateful node"| F["Resize vertically or redesign the bottleneck"]
```

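
The same decision flow can be written as a small lookup, which may help when drilling the lanes from memory. This is just a restatement of the flowchart in code. The enum and function names are hypothetical.

```python
from enum import Enum


class Saturated(Enum):
    """What is actually saturated, mirroring the flowchart branches."""
    STATELESS_APP_TIER = "Stateless app tier"
    REPEATED_READS = "Repeated reads or hot content"
    MANAGED_DATABASE = "Managed database reads or throughput"
    SINGLE_STATEFUL_NODE = "Single stateful node"


# First lane to investigate for each saturated component.
FIRST_LANE = {
    Saturated.STATELESS_APP_TIER:
        "Review Auto Scaling policy, health checks, cooldowns, and launch capacity",
    Saturated.REPEATED_READS:
        "Add or tune CloudFront or ElastiCache",
    Saturated.MANAGED_DATABASE:
        "Use database-native scaling or offload reads",
    Saturated.SINGLE_STATEFUL_NODE:
        "Resize vertically or redesign the bottleneck",
}


def first_lane(saturated: Saturated) -> str:
    """Return the first remediation lane for the saturated component."""
    return FIRST_LANE[saturated]
```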
The exam often tries to blur these lanes. For example, if the database is saturated, adding web servers can make the incident worse, because each new server generates even more reads against the same bottleneck.
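
A cache in front of the data tier attacks that read pressure directly. The sketch below shows the cache-aside read pattern under simplifying assumptions: a plain dictionary stands in for an ElastiCache cluster, `load_from_db` is a hypothetical callable standing in for a database query, and there is no TTL or invalidation, which a real deployment would need.

```python
class CacheAside:
    """Cache-aside reads: check the cache first, query the database only on a miss."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db  # stand-in for a database query
        self._cache = {}                   # stand-in for an ElastiCache cluster
        self.db_reads = 0                  # counts reads that reached the database

    def get(self, key):
        if key in self._cache:
            # Cache hit: the database is not touched.
            return self._cache[key]
        # Cache miss: read from the database, then populate the cache.
        self.db_reads += 1
        value = self._load_from_db(key)
        self._cache[key] = value
        return value
```

If 100 requests ask for the same key, only the first one reaches the database. That is why repeated-read symptoms usually point to the cache lane rather than to more application hosts.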
| Trap | Better thinking |
|---|---|
| “Traffic went up, so add bigger instances first.” | Start by asking whether the tier is horizontally scalable and whether the signal actually points to compute. |
| “CloudFront is just for static websites.” | CloudFront is also a scaling and latency tool for cacheable content in front of dynamic origins. |
| “Multi-AZ means the workload scales automatically.” | Multi-AZ is primarily an availability control, not an elasticity control. |
| “If the app is slow, the database must be fine because it is managed.” | Managed does not mean infinite throughput or zero tuning decisions. |