Study Confluent CCAAK Cluster Health: key concepts, common traps, and exam decision cues.
This lesson matters because not every Kafka warning has the same severity. The exam expects you to distinguish replication lag, leadership failure, and controller instability before you choose a response.
Health-signal chooser
| Signal | Strongest first reading |
| --- | --- |
| URP increasing | followers are not keeping up safely |
| offline partitions | leadership availability is impaired |
| controller churn | cluster coordination is unstable |
| healthy traffic but strange client symptoms | observability and path evidence need correlation |
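As a rough illustration of how these signals might be read from broker metrics, the sketch below maps three of them to standard Apache Kafka JMX MBeans and derives the first readings from a short series of samples. The snapshot format, the `first_readings` helper, and the use of a cluster-wide `ActiveControllerCount` other than 1 as a proxy for controller churn are assumptions made for this example, not part of any Kafka or Confluent API.

```python
# Standard Kafka JMX MBean names for three of the signals above.
SIGNAL_MBEANS = {
    "urp": "kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions",
    "offline_partitions": "kafka.controller:type=KafkaController,name=OfflinePartitionsCount",
    "active_controller": "kafka.controller:type=KafkaController,name=ActiveControllerCount",
}


def first_readings(samples):
    """Return the strongest first readings for a series of metric samples.

    samples is a hypothetical format: a list of dicts, oldest first, keyed
    like SIGNAL_MBEANS, where each value is already summed across the cluster.
    """
    readings = []
    latest = samples[-1]

    # URP increasing: the latest value is above zero and above the oldest sample.
    if latest["urp"] > 0 and latest["urp"] > samples[0]["urp"]:
        readings.append("followers are not keeping up safely")

    # Any offline partition means some partition currently has no leader.
    if latest["offline_partitions"] > 0:
        readings.append("leadership availability is impaired")

    # Assumption: a healthy cluster reports exactly one active controller in
    # every sample, so any other value is treated as a sign of churn.
    if any(s["active_controller"] != 1 for s in samples):
        readings.append("cluster coordination is unstable")

    return readings
```

An empty result does not prove the cluster is healthy. It only means none of these three signals fired, which is the case the last table row covers.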
What the exam is really testing
| If the scenario shows… | Strong reading |
| --- | --- |
| URP only | replication health is degraded but not necessarily offline |
| offline partitions | more severe availability problem |
| repeated leadership changes | controller or broker stability is under test |
| unclear symptoms | metrics, logs, and health indicators should be correlated before action |
Common traps
| Trap | Better rule |
| --- | --- |
| treating URP as a consumer backlog problem | URP is a replication-side signal |
| treating offline partitions like normal lag | missing leader availability is more severe |
| skipping observability and going straight to config changes | evidence-first diagnosis is safer |
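The first two traps come from mixing up three different quantities. The minimal sketch below keeps them apart using a hypothetical `PartitionState` record. The field names are chosen for this example and are not a Kafka client API. The definitions themselves are the usual ones: a partition is under-replicated when its in-sync replica set is smaller than its assigned replica set, it is offline when it has no leader, and consumer lag is the gap between the log end offset and a group's committed offset.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PartitionState:
    # Hypothetical snapshot of one partition, for illustration only.
    replicas: List[int]       # broker ids assigned to the partition
    isr: List[int]            # broker ids currently in sync
    leader: Optional[int]     # None when no leader is available
    log_end_offset: int       # next offset to be written on the leader
    committed_offset: int     # last committed offset for one consumer group


def is_under_replicated(p: PartitionState) -> bool:
    # Replication-side signal: fewer in-sync replicas than assigned replicas.
    return len(p.isr) < len(p.replicas)


def is_offline(p: PartitionState) -> bool:
    # Availability signal: the partition has no leader to serve it.
    return p.leader is None


def consumer_lag(p: PartitionState) -> int:
    # Consumer-side signal: how far the group is behind the end of the log.
    return max(p.log_end_offset - p.committed_offset, 0)
```

A partition can be under-replicated while consumer lag is zero, and a fully replicated partition can carry a large consumer lag, which is why URP should not be read as a backlog problem.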
Decision order that usually wins
1. Rank the symptom first: offline partitions are usually more severe than ordinary URP growth.
2. Treat URP as replication-health evidence, not as consumer lag.
3. If leadership is flapping, think controller or broker instability before tuning topics.
4. Correlate metrics, logs, and recent events before making broad config changes.
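The decision order above can be sketched as a small triage function. This is one way to encode the ranking under stated assumptions, not a prescribed procedure. The function name, its parameters, and the `flapping_threshold` default are all hypothetical, and a real threshold would depend on the cluster.

```python
def triage(offline_partitions, urp, recent_leader_elections, flapping_threshold=5):
    """Return investigation steps in the order suggested by the list above.

    All inputs are plain counts supplied by the caller. flapping_threshold is
    an illustrative value, not a Kafka default.
    """
    steps = []

    # 1. Offline partitions outrank ordinary URP growth.
    if offline_partitions > 0:
        steps.append("offline partitions: restore leader availability first")

    # 2. URP is replication-health evidence, not consumer lag.
    if urp > 0:
        steps.append("URP: check follower and broker replication health")

    # 3. Frequent leadership changes point at controller or broker stability.
    if recent_leader_elections >= flapping_threshold:
        steps.append("leadership flapping: check controller and broker stability before tuning topics")

    # 4. Evidence comes before broad configuration changes, in every case.
    steps.append("correlate metrics, logs, and recent events before changing configuration")
    return steps
```

For example, `triage(offline_partitions=2, urp=10, recent_leader_elections=0)` lists the offline-partition step before the URP step, and the correlation step always comes last.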