SOA-C03 Scaling, Elasticity and Caching Guide

April 1, 2026

Study SOA-C03 Scaling, Elasticity and Caching: key concepts, common traps, and exam decision cues.

This lesson covers SOA-C03 Task 2.1: implementing scalability and elasticity. AWS is testing whether you can follow demand without turning every traffic problem into a brute-force compute problem. Strong answers identify whether the real issue is stateless compute capacity, repeated reads, queue backlog, or managed database pressure, then choose the service that relieves that exact bottleneck.

Elasticity: Ability to add or remove capacity as demand changes instead of permanently provisioning for peak load.

Horizontal scaling: Adding more instances, tasks, or workers rather than making one node bigger.

Cache offload: Serving repeated reads from a fast intermediate layer so the origin or database is hit less often.

What AWS is really testing here

AWS wants you to recognize:

when the right answer is Auto Scaling instead of vertical resizing
when repeated reads point first to CloudFront or ElastiCache
when database scaling belongs in the RDS or DynamoDB lane
when queue depth and worker throughput should control elasticity
when a workload is not naturally horizontally scalable and needs a different answer

Start with the bottleneck, not the service name

Symptom pattern	Strongest first lane	Why
Stateless web or API tier cannot keep up with changing request volume	EC2 Auto Scaling or container/task scaling	Demand-following compute is the actual requirement.
Same objects or responses are requested repeatedly from many users	CloudFront or ElastiCache	The fastest path is often to avoid hitting the origin every time.
Worker backlog grows in a queue-based system	consumer scaling from queue depth	Queue length is a better scaling signal than front-end CPU.
Application hosts look healthy but managed database reads are saturated	cache or database read scaling	Adding more app instances would amplify the wrong tier.
One stateful component is the limit and cannot scale out cleanly	vertical resize or redesign	Not every bottleneck is a horizontal-scaling problem.

The main service-choice patterns

Need	Strongest first choice	Why it fits
Stateless compute should follow changing load	Auto Scaling target tracking or step scaling	This is the core elasticity control for compute environments.
Global repeated content requests should stop hammering the origin	CloudFront	Edge caching reduces origin load and often improves latency too.
Low-latency repeated application reads should avoid the database	ElastiCache	It removes hot-read pressure from the data tier.
NoSQL throughput should adjust with demand	DynamoDB on-demand or auto scaling	The service already exposes elasticity in a managed model.
Relational read pressure is the issue	read replicas, proxy design, or cache	Relational scaling is not the same as adding app servers.
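
As a rough illustration of the first row, the sketch below builds the request parameters for a target tracking policy on an EC2 Auto Scaling group. The group name, the policy naming scheme, and the 50 percent target are assumptions made for this example; the field names follow the EC2 Auto Scaling `PutScalingPolicy` request shape. The block only builds and prints the parameters, it does not call AWS.

```python
import json


def target_tracking_policy(group_name: str, target_cpu_percent: float) -> dict:
    """Build request parameters for a CPU-based target tracking policy.

    The returned dict is shaped for the EC2 Auto Scaling PutScalingPolicy
    operation. The policy name format is a hypothetical convention.
    """
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-cpu-target",  # hypothetical naming scheme
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                # Average CPU across the group; other predefined metrics exist.
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            # The group adds or removes instances to hold the metric near this value.
            "TargetValue": target_cpu_percent,
        },
    }


# "web-asg" and 50.0 are illustrative values, not recommendations.
params = target_tracking_policy("web-asg", 50.0)
print(json.dumps(params, indent=2))
```

In a real account these parameters would typically be passed to an SDK or CLI call. The point for the exam is that target tracking names a metric and a target value and lets the service work out the instance count.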

CloudFront vs ElastiCache

SOA-C03 likes to test both as “caching,” but they solve different problems.

If the question is mainly about…	Think first about…
caching content for distributed users over the network edge	CloudFront
offloading repeated dynamic reads close to the application	ElastiCache
reducing database read pressure from hot objects or sessions	ElastiCache
reducing origin hits for cacheable responses and assets	CloudFront
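
To make the ElastiCache rows concrete, here is a minimal cache-aside sketch. It is an illustration under stated assumptions, not how any particular application does it: an in-memory `TTLCache` class stands in for an ElastiCache cluster, and `load_from_database` is a hypothetical function that simulates a relational read and counts how often it runs.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache such as ElastiCache (assumption)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped so the next read goes to the database.
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)


db_reads = 0  # counts simulated database reads


def load_from_database(key):
    """Hypothetical database read; a real one would run a query."""
    global db_reads
    db_reads += 1
    return f"row-for-{key}"


def get_with_cache_aside(cache, key):
    """Check the cache first; on a miss, read the database and populate the cache."""
    value = cache.get(key)
    if value is None:
        value = load_from_database(key)
        cache.set(key, value)
    return value


cache = TTLCache(ttl_seconds=60)
results = [get_with_cache_aside(cache, "product:42") for _ in range(5)]
print(f"application reads: {len(results)}, database reads: {db_reads}")
```

Five application reads of the same hot key cause one database read here. CloudFront applies the same idea one layer out, at the network edge, for cacheable responses and assets.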

Scaling signals matter

Strong SOA-C03 answers do not just say “use Auto Scaling.” They choose the signal that best represents useful work.

Better signals often include

request count
target response time
queue depth
consumer lag
database or backend-specific read pressure

Weaker signals often include

unrelated host metrics that do not map well to demand
alarms that react too late
signals from the wrong tier
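
The queue-depth signal can be sketched as a backlog-per-worker calculation. This is a simplified model, assuming each worker processes one message at a time at a steady rate; the function name, the parameters, and the minimum and maximum bounds are illustrative choices, not an AWS API.

```python
import math


def desired_workers(queue_depth, ms_per_message, max_latency_ms,
                    min_workers=1, max_workers=20):
    """Estimate the worker count needed to drain a queue within a latency goal.

    Times are integer milliseconds to avoid floating-point rounding surprises.
    """
    # How many queued messages one worker can hold and still meet the goal.
    acceptable_backlog_per_worker = max(1, max_latency_ms // ms_per_message)
    needed = math.ceil(queue_depth / acceptable_backlog_per_worker)
    # Clamp to the configured bounds, as a scaling group would.
    return max(min_workers, min(max_workers, needed))


# Illustrative numbers: 1,500 queued messages, 100 ms each, 10 s latency goal.
print(desired_workers(queue_depth=1500, ms_per_message=100, max_latency_ms=10_000))
```

With these numbers one worker can tolerate a backlog of 100 messages, so 1,500 queued messages call for 15 workers. Front-end CPU would not reveal that, which is why the queue is the stronger signal for this tier.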

Common traps

Trap	Better thinking
“Traffic spike means use a larger instance.”	First ask whether the workload is stateless and better solved by scaling out.
“CloudFront is only a content-delivery feature.”	On this exam, CloudFront is also a reliability and elasticity tool because it protects origins from repeated demand.
“If the app is slow, scale the app tier.”	If the data tier is saturated, adding app instances can make the problem worse.
“Multi-AZ and Auto Scaling solve the same problem.”	Multi-AZ is about availability; Auto Scaling is about elasticity.

Sample exam question

A web application runs on an Auto Scaling group and stores hot session and lookup data in a relational database. During predictable daily spikes, the web instances still have headroom, but the database read workload rises sharply and overall latency increases.

Which action is strongest first?

1. Increase the minimum number of web instances only
2. Add a caching layer or database read-scaling pattern to reduce read pressure
3. Replace Route 53 with CloudTrail
4. Reduce CloudWatch alarm periods

Correct answer: 2

Why: The stem says the web tier still has headroom while the database read path is saturated. The strongest first move is to reduce database read pressure rather than scale the wrong tier.
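
A quick back-of-the-envelope calculation shows why the cache option wins. The request rate and hit ratios below are made-up numbers for illustration, and the model assumes every cache miss becomes exactly one database read.

```python
def database_reads_per_second(total_reads_per_second: int, cache_hit_percent: int) -> float:
    """Reads that still reach the database after a cache absorbs the hits."""
    if not 0 <= cache_hit_percent <= 100:
        raise ValueError("cache_hit_percent must be between 0 and 100")
    return total_reads_per_second * (100 - cache_hit_percent) / 100


# Hypothetical peak of 10,000 application reads per second.
for hit_percent in (0, 50, 90):
    remaining = database_reads_per_second(10_000, hit_percent)
    print(f"hit ratio {hit_percent}%: {remaining:.0f} database reads per second")
```

At a 90 percent hit ratio, 10,000 application reads per second become 1,000 at the database. Adding web instances changes neither number, which is why option 1 does not address the saturated tier.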

Decision order that usually wins

1. Decide whether the main problem is stateless compute demand, repeated reads, queue-driven worker pressure, or database throughput.
2. If it is stateless compute demand, stay in the Auto Scaling lane.
3. If it is repeated reads, think cache before more app servers.
4. If it is backend or database pressure, optimize the data tier instead of blindly scaling the front end.
5. If the workload is not horizontally scalable, consider vertical fit or redesign instead of forcing an elasticity pattern that does not match the architecture.
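
The decision order above can be written as a small lookup, shown here as a study aid rather than a tool. The `Bottleneck` enum and the lane strings are names invented for this sketch; the queue entry is taken from the symptom table earlier in this lesson.

```python
from enum import Enum


class Bottleneck(Enum):
    """The main problem identified in the first step (hypothetical names)."""
    STATELESS_COMPUTE = "stateless compute demand"
    REPEATED_READS = "repeated reads"
    QUEUE_BACKLOG = "queue-driven worker pressure"
    DATABASE_PRESSURE = "backend or database pressure"
    NOT_HORIZONTALLY_SCALABLE = "workload cannot scale out cleanly"


# Maps each identified bottleneck to the lane to consider first.
FIRST_LANE = {
    Bottleneck.STATELESS_COMPUTE: "Auto Scaling",
    Bottleneck.REPEATED_READS: "cache (CloudFront or ElastiCache) before more app servers",
    Bottleneck.QUEUE_BACKLOG: "consumer scaling from queue depth",
    Bottleneck.DATABASE_PRESSURE: "optimize the data tier (cache or read scaling)",
    Bottleneck.NOT_HORIZONTALLY_SCALABLE: "vertical resize or redesign",
}


def first_lane(bottleneck: Bottleneck) -> str:
    """Return the lane to evaluate first for the identified bottleneck."""
    return FIRST_LANE[bottleneck]


for b in Bottleneck:
    print(f"{b.value}: {first_lane(b)}")
```

The lookup takes the bottleneck as its input on purpose: the hard part of these questions is naming the bottleneck from the stem, and the service choice follows from that.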

Revised on Monday, June 15, 2026
