Study DVA-C02 Logs, Metrics, Alerts, and Health Checks: key concepts, common traps, and exam decision cues.
This lesson is about building observability into the application instead of hoping infrastructure will explain everything later. DVA-C02 wants developers to understand logging strategy, custom metrics, alerts, tracing annotations, and the difference between liveness and readiness signals.
Structured logging: Logging approach where fields such as user ID, request ID, action, and status are emitted in a machine-readable format.
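As a minimal sketch of structured logging, the following uses only Python's standard `logging` and `json` modules to emit one JSON object per log line. The field names (`request_id`, `user_id`, `action`, `status`) and the logger name `orders` are illustrative assumptions, not anything prescribed by AWS or the exam.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    # Hypothetical context fields, chosen to mirror the definition above.
    CONTEXT_FIELDS = ("request_id", "user_id", "action", "status")

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Values passed via `extra=` become attributes on the record.
        for field in self.CONTEXT_FIELDS:
            if hasattr(record, field):
                entry[field] = getattr(record, field)
        return json.dumps(entry)


def build_logger(name: str = "orders", stream=sys.stdout) -> logging.Logger:
    """Create a logger whose only handler writes JSON lines to `stream`."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.info(
        "order placed",
        extra={
            "request_id": "req-123",
            "user_id": "u-42",
            "action": "create_order",
            "status": "success",
        },
    )
```

Because every line is a JSON object with stable keys, a log query tool can filter on a field such as `request_id` directly instead of pattern-matching free-form text.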
Health check: Probe or endpoint used to determine whether an application instance is alive, ready, or safe to receive traffic.
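One way to picture the liveness and readiness split is with two separate probe paths. The sketch below uses Python's standard `http.server`; the paths `/healthz` and `/readyz`, the `AppState` flags, and port 8080 are all assumptions made for illustration.

```python
import json
from dataclasses import dataclass
from http.server import BaseHTTPRequestHandler, HTTPServer


@dataclass
class AppState:
    """Hypothetical flags the application would update during startup."""
    cache_warmed: bool = False
    database_reachable: bool = False


STATE = AppState()


def check(path: str, state: AppState) -> tuple[int, dict]:
    """Map a probe path to an HTTP status code and a JSON-serializable body."""
    if path == "/healthz":
        # Liveness: the process is running and able to answer at all.
        return 200, {"status": "alive"}
    if path == "/readyz":
        # Readiness: dependencies are usable, so receiving traffic is safe.
        ready = state.cache_warmed and state.database_reachable
        return (200 if ready else 503), {"status": "ready" if ready else "not ready"}
    return 404, {"error": "unknown path"}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        code, body = check(self.path, STATE)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Blocking call; start it only when a real probe target is wanted."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

In this sketch a freshly started instance answers 200 on `/healthz` but 503 on `/readyz` until both flags are set, which is the behavior a load balancer or orchestrator would rely on to hold traffic back during warm-up.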
AWS wants you to know which control fits each need:
| Need | Strongest first control | Why |
|---|---|---|
| Search and correlate failures quickly | Structured logging | Queryable fields outperform free-form text under pressure. |
| Measure business or app-specific behavior | Custom metrics | Infra defaults do not capture app semantics. |
| Alert only when action is required | Thresholds tied to meaningful symptoms | Actionable alerts reduce noise fatigue. |
| Know whether the process is alive | Liveness health check | This is not the same as traffic readiness. |
| Know whether the instance can safely receive requests | Readiness health check | Prevents sending traffic to an instance that is still warming up or whose dependencies are broken. |
| Connect application logs to metrics | Embedded Metric Format (EMF) or deliberate structured emission | Bridges application events into CloudWatch metric workflows. |
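To make the last row concrete, here is a hedged sketch that builds a CloudWatch Embedded Metric Format record by hand as a JSON log line. The helper name `emf_record`, the namespace `CheckoutService`, the metric `OrderLatency`, and the `Service` dimension are hypothetical; in practice an EMF client library may be preferable to hand-built JSON.

```python
import json
import time


def emf_record(namespace, metric_name, value, unit, dimensions, extra=None):
    """Build one EMF log line (a JSON string) carrying a single metric.

    `dimensions` is a dict of dimension name to value; `extra` holds
    additional searchable fields that are logged but not turned into metrics.
    """
    record = dict(extra or {})
    record.update(dimensions)
    record[metric_name] = value
    record["_aws"] = {
        "Timestamp": int(time.time() * 1000),  # milliseconds since epoch
        "CloudWatchMetrics": [
            {
                "Namespace": namespace,
                # One dimension set, listing the dimension key names.
                "Dimensions": [list(dimensions.keys())],
                "Metrics": [{"Name": metric_name, "Unit": unit}],
            }
        ],
    }
    return json.dumps(record)


if __name__ == "__main__":
    # Illustrative values only.
    print(
        emf_record(
            "CheckoutService",
            "OrderLatency",
            182,
            "Milliseconds",
            {"Service": "checkout"},
            extra={"request_id": "req-123"},
        )
    )
```

The same line serves both purposes: the `_aws` block tells CloudWatch which top-level keys to extract as metrics, while fields such as `request_id` stay available for log queries. In AWS Lambda, writing such a line to standard output is typically enough for it to reach CloudWatch Logs; other runtimes generally need an agent or a logging integration, which this sketch does not cover.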
flowchart LR
A["Application code"] --> B["Structured logs"]
A --> C["Custom metrics / EMF"]
A --> D["Tracing annotations"]
A --> E["Health endpoints"]
C --> F["Dashboards and alerts"]
E --> G["Traffic decisions and deployment safety"]
Strong answers usually keep these pairs distinct:
| Pair | How to separate them |
|---|---|
| logging vs monitoring | event detail vs aggregate signal and thresholds |
| custom metrics vs raw log search | measurable trends and alarms vs ad hoc investigation detail |
| liveness vs readiness | process alive vs safe to receive traffic |
| alerting vs dashboarding | proactive notification vs operator inspection surface |
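The idea of alerting only on meaningful symptoms can be illustrated with an "M out of N datapoints" rule, similar in spirit to the evaluation-period and datapoints-to-alarm settings on CloudWatch alarms. The function below is a simplified illustration, not CloudWatch's actual evaluation logic; the name `should_alarm` and the sample error rates are assumptions.

```python
def should_alarm(datapoints, threshold, evaluation_periods, datapoints_to_alarm):
    """Return True when enough recent datapoints breach the threshold.

    Only the last `evaluation_periods` values are considered, and at least
    `datapoints_to_alarm` of them must exceed `threshold`.
    """
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        # Simplification: with too little data, do not alarm.
        return False
    breaching = sum(1 for value in recent if value > threshold)
    return breaching >= datapoints_to_alarm


if __name__ == "__main__":
    # Hypothetical error-rate percentages, one per period.
    single_spike = [0.5, 0.7, 6.2, 0.4, 0.6]
    sustained = [0.5, 6.1, 7.4, 0.9, 8.3]
    print(should_alarm(single_spike, 5.0, 3, 2))
    print(should_alarm(sustained, 5.0, 3, 2))
```

Requiring two breaches within three periods means an isolated spike stays visible on a dashboard without paging anyone, while a sustained problem triggers a notification.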