Confluent CCDAK Cheat Sheet: Kafka Dev, Producers, Consumers, and Streams

April 13, 2026

Confluent CCDAK cheat sheet for Kafka dev, producers, consumers, streams, traps, and final review.


Offset: Position marker for a record within a partition.

Consumer group: Set of consumers that divide a topic's partitions among themselves and track a committed offset for each partition.

Idempotence: Producer behavior that prevents duplicates caused by retries.

CCDAK answer sequence

Use this sequence when the question stem mixes stream processing, connectors, SQL, or deployment.

    flowchart TD
	  S["Scenario"] --> D["Classify the data-app lane"]
	  D --> S2["Check streams, connectors, or SQL fit"]
	  S2 --> O["Check deployment, scale, or failure behavior"]
	  O --> V["Verify output and operational evidence"]

Fast lane picker

If the question is really about…	Focus first on…	Strongest first move
ordering or partition behavior	keys, partitions, group parallelism	decide what must stay together before thinking about throughput
delivery guarantees	commit timing, idempotence, transactions, `acks`	decide what may be lost or duplicated
producer performance	batching, compression, retries, in-flight safety	separate latency tuning from semantics
consumer instability	lag, rebalances, polling cadence, commit strategy	decide whether the issue is throughput or liveness
schema compatibility	format choice, compatibility mode, shared-topic safety	protect consumers before optimizing payload shape

Producer to consumer mental model

    flowchart LR
	  P["Producer"] --> T["Topic"]
	  T --> P0["Partition 0"]
	  T --> P1["Partition 1"]
	  P0 --> G["Consumer group"]
	  P1 --> G
	  G --> C1["Consumer 1"]
	  G --> C2["Consumer 2"]

What to notice:

ordering exists only inside a single partition
keys usually decide partition affinity
group scaling depends on partition count, not on wishful parallelism
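
The points above can be sketched as a small simulation. This is not Kafka's partitioner: `partition_for` is a hypothetical helper that uses CRC32 as a stand-in for the murmur2 hash in Kafka's default partitioner, and the partition count of 3 is an assumption. The only property it shares with the real thing is that a stable key always maps to the same partition.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # assumed partition count for the illustration


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Hypothetical stand-in for a key-based partitioner (CRC32, not murmur2)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(events):
    """Append each (key, value) event to the log of the partition its key maps to."""
    logs = defaultdict(list)
    for key, value in events:
        logs[partition_for(key)].append((key, value))
    return dict(logs)


events = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-2", "cancelled"),
    ("order-1", "shipped"),
]
logs = route(events)

# All events for one key sit in one partition, in the order they were sent.
order_1 = [v for k, v in logs[partition_for("order-1")] if k == "order-1"]
```

Nothing in this model orders `order-1` events relative to `order-2` events unless both keys happen to share a partition, which is why cross-partition ordering cannot be assumed.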

Ordering, keys, and throughput chooser

Requirement	Strongest first fit	Why	Common trap
preserve ordering for one entity	stable key	same key routes to same partition	increasing consumers and expecting cross-partition ordering
increase parallelism	more partitions	more shards let more consumers work	scaling consumers beyond partition count
reduce hot partitions	improve key distribution or partitioner strategy	skew, not raw throughput, is the issue	adding brokers when the key design is wrong
predictable group scaling	partitions at least equal to peak consumer count	extra consumers idle otherwise	blaming lag on the consumer library first
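
The scaling rows above can be checked with a toy assignment. `assign` is a hypothetical round-robin helper, a simplification of what a real group assignor does, but it shows the same limit: a partition is owned by at most one consumer in a group, so consumers beyond the partition count receive nothing.

```python
def assign(partitions, consumers):
    """Round-robin partitions over consumers (simplified stand-in for a group assignor)."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2]

small_group = assign(partitions, ["c1", "c2"])
big_group = assign(partitions, ["c1", "c2", "c3", "c4", "c5"])

# Consumers that own no partition do no work.
idle = [consumer for consumer, owned in big_group.items() if not owned]
```

With three partitions, the five-member group leaves `c4` and `c5` idle; adding partitions, not consumers, is what raises the parallelism ceiling.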

Producer chooser

Goal	Strongest first settings	Why
strongest delivery safety	`acks=all` + idempotence + replication factor and `min.insync.replicas` above 1	protects against retry duplicates and against acknowledgements backed by a single replica
lowest latency	low `linger.ms`, smaller batches	more sends, less wait
highest throughput	larger batches, `linger.ms`, compression	better network efficiency
safer retry behavior	idempotence enabled	avoids duplicate writes from retry paths
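
Why idempotence removes retry duplicates can be illustrated with a toy log. Kafka's idempotent producer relies on a producer ID plus per-partition sequence numbers; the `PartitionLog` class below is a hypothetical, heavily simplified model of that idea, not broker code.

```python
class PartitionLog:
    """Toy partition log that can drop records it has already seen from a producer."""

    def __init__(self, dedupe: bool):
        self.dedupe = dedupe
        self.records = []
        self.last_seq = {}  # producer_id -> highest sequence number appended

    def append(self, producer_id: str, seq: int, value: str) -> None:
        if self.dedupe and seq <= self.last_seq.get(producer_id, -1):
            return  # duplicate retry: accept it without appending again
        self.records.append(value)
        self.last_seq[producer_id] = seq


def send_with_lost_ack(log: PartitionLog) -> list:
    """Send two records; the first acknowledgement is 'lost', so the send is retried."""
    log.append("p1", 0, "a")
    log.append("p1", 0, "a")  # retry of the same sequence number
    log.append("p1", 1, "b")
    return log.records


without_idempotence = send_with_lost_ack(PartitionLog(dedupe=False))
with_idempotence = send_with_lost_ack(PartitionLog(dedupe=True))
```

The retry itself is identical in both runs; only the log's ability to recognise the repeated sequence number changes the outcome.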

Setting	What it changes	High-yield reminder
`acks`	acknowledgement level	semantics and durability, not just speed
`linger.ms`	batching delay	throughput vs latency trade-off
`batch.size`	batch capacity	larger batches can improve efficiency
`compression.type`	payload compression	network and storage efficiency
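`retries`	resend attempts	without idempotence, a retry after a lost acknowledgement can write a duplicate
`max.in.flight.requests.per.connection`	unacknowledged requests in flight per connection	without idempotence, values above 1 can reorder records on retry; idempotence requires 5 or fewer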
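
As a rough sketch, the two ends of the chooser can be written as plain configuration dictionaries. The property names are standard Kafka producer settings; the numeric values, the `lz4` choice, and the `localhost:9092` address are illustrative assumptions, not recommendations.

```python
# Assumed placeholder address; replace with your own cluster.
BOOTSTRAP = "localhost:9092"

# Settings that change delivery semantics.
safety_first = {
    "bootstrap.servers": BOOTSTRAP,
    "acks": "all",
    "enable.idempotence": True,
    "max.in.flight.requests.per.connection": 5,
}

# Settings that mostly change performance (values are illustrative).
throughput_first = {
    "bootstrap.servers": BOOTSTRAP,
    "linger.ms": 20,
    "batch.size": 65536,  # bytes
    "compression.type": "lz4",
}

semantic_keys = set(safety_first) - {"bootstrap.servers"}
performance_keys = set(throughput_first) - {"bootstrap.servers"}
```

Keeping the two groups separate mirrors the exam habit of deciding semantics first and tuning latency or throughput afterwards; in practice the two dictionaries would usually be merged.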

Consumer and commit chooser

Requirement	Strongest first fit	Why
simplest behavior with less control	auto commit	easy but weaker control over semantics
process then mark progress	manual commit after successful handling	common at-least-once pattern
read existing data when the group has no committed offsets	`auto.offset.reset=earliest`	the setting applies only when no valid committed offset exists; the default `latest` starts at the end of the log
reduce rebalance churn during long processing	tune `max.poll.interval.ms` and processing design	liveness must survive real work
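
The `auto.offset.reset` row is a frequent trap, so here is a minimal model of where a consumer starts reading. `starting_offset` is a hypothetical helper; it ignores the out-of-range case and only captures the rule that the reset policy is consulted when the group has no committed offset.

```python
from typing import Optional


def starting_offset(committed: Optional[int], reset: str,
                    log_start: int, log_end: int) -> int:
    """Return the offset a consumer would start from in this simplified model."""
    if committed is not None:
        return committed  # a committed offset always wins over the reset policy
    if reset == "earliest":
        return log_start
    if reset == "latest":
        return log_end
    raise LookupError("no committed offset and no usable reset policy")


# New group, partition currently holds offsets 0..99 (log end offset is 100).
new_group_earliest = starting_offset(None, "earliest", log_start=0, log_end=100)
new_group_latest = starting_offset(None, "latest", log_start=0, log_end=100)

# Existing group that committed offset 42: the policy is not consulted.
existing_group = starting_offset(42, "earliest", log_start=0, log_end=100)
```

Changing the policy on a group that already has committed offsets therefore does not trigger a replay.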

Semantics	How you get it	Risk
at-most-once	commit before processing	loss possible
at-least-once	process then commit	duplicates possible
exactly-once in Kafka workflows	idempotent producer + transactions + `read_committed`	more complexity, stronger guarantees inside supported patterns
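
The first two rows differ only in the order of two steps, which a short simulation makes concrete. `run` is a hypothetical helper with no Kafka client involved: it processes a list of records, crashes once between the two steps at a chosen offset, then restarts from the last committed offset.

```python
def run(records, commit_first: bool, crash_at: int):
    """Consume with one simulated crash, restart, and return the handled records."""
    committed = 0  # next offset to read after a restart
    handled = []   # records whose handler finished

    def consume(crash_index: int) -> None:
        nonlocal committed
        for offset in range(committed, len(records)):
            if commit_first:
                committed = offset + 1           # step 1: commit
                if offset == crash_index:
                    raise RuntimeError("simulated crash")
                handled.append(records[offset])  # step 2: process
            else:
                handled.append(records[offset])  # step 1: process
                if offset == crash_index:
                    raise RuntimeError("simulated crash")
                committed = offset + 1           # step 2: commit

    try:
        consume(crash_at)
    except RuntimeError:
        pass  # the consumer restarts below
    consume(crash_index=-1)  # second run never crashes
    return handled


at_most_once = run(["a", "b", "c"], commit_first=True, crash_at=1)
at_least_once = run(["a", "b", "c"], commit_first=False, crash_at=1)
```

Committing first loses `b`; processing first handles `b` twice, which is why at-least-once consumers need idempotent handlers.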

Rebalance and lag symptom table

Symptom	First things to check	Common trap
lag rising steadily	partitions, processing speed, downstream bottleneck	tuning schemas when the consumer is simply too slow
frequent rebalances	poll interval, session timeout, unstable membership	blaming brokers first
duplicates observed	retries without idempotence, replay after failure, handler not idempotent	assuming Kafka alone guarantees end-to-end exactly-once
out-of-order events	missing or inconsistent key, cross-partition comparison	treating the topic like a single global queue
idle consumers in a group	partition count too low	adding more consumers instead of more partitions
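
When lag is the symptom, it helps to look at it per partition rather than as one total. The sketch below assumes lag is defined as log end offset minus committed offset; `partition_lag` is a hypothetical helper and the offsets are made-up numbers.

```python
def partition_lag(end_offsets: dict, committed: dict) -> dict:
    """Lag per partition: log end offset minus the group's committed offset."""
    return {p: end_offsets[p] - committed[p] for p in end_offsets}


end_offsets = {0: 120, 1: 80, 2: 300}  # illustrative log end offsets
committed = {0: 120, 1: 75, 2: 100}    # illustrative committed offsets

lag = partition_lag(end_offsets, committed)
total_lag = sum(lag.values())
worst_partition = max(lag, key=lag.get)
```

Here almost all of the lag sits on partition 2, which points at key skew or one slow consumer rather than a group that is too small overall.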

Schema and serialization chooser

Format	Strongest use	Why
JSON	prototyping or loose contracts	easy to read, weaker schema discipline
Avro	evolving event contracts	compact with strong schema support
Protobuf	strong typed cross-language contracts	compact and explicit, but more tooling overhead

Compatibility mode	What it protects	Simple reading
BACKWARD	new consumers reading old data	Schema Registry default; upgrade consumers before producers
FORWARD	old consumers reading new data	protects older readers; upgrade producers before consumers
FULL	both directions	strictest shared-topic choice
NONE	no compatibility guarantee	fastest iteration, highest contract risk
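
The direction of each mode is easy to mix up, so here is a deliberately simplified check. It models a schema as a mapping of field names to options and assumes Avro-style resolution where a reader field missing from the written data needs a default; type changes and aliases are ignored. `can_read`, `backward`, and `forward` are hypothetical helpers, not Schema Registry APIs.

```python
def can_read(reader: dict, writer: dict) -> bool:
    """True if every reader field was written or has a default to fall back on."""
    return all(name in writer or "default" in spec for name, spec in reader.items())


def backward(new: dict, old: dict) -> bool:
    """New schema can read data written with the old schema."""
    return can_read(reader=new, writer=old)


def forward(new: dict, old: dict) -> bool:
    """Old schema can read data written with the new schema."""
    return can_read(reader=old, writer=new)


v1 = {"id": {}, "amount": {}}
add_optional = {"id": {}, "amount": {}, "currency": {"default": "USD"}}
add_required = {"id": {}, "amount": {}, "currency": {}}
drop_required = {"id": {}}

results = {
    "add_optional": (backward(add_optional, v1), forward(add_optional, v1)),
    "add_required": (backward(add_required, v1), forward(add_required, v1)),
    "drop_required": (backward(drop_required, v1), forward(drop_required, v1)),
}
```

In this model only the optional-field addition passes both checks, which is the kind of change FULL compatibility allows.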

High-confusion pairs

Pair	Keep this distinction clear
key vs partition	key influences partition choice; partition is the actual ordering boundary
auto commit vs manual commit	convenience vs semantic control
at-least-once vs exactly-once	duplicate-tolerant processing vs stronger transactional guarantees
idempotence vs transaction	safe retries for one producer vs atomic multi-write workflow
lag vs rebalance	throughput problem vs group coordination/liveness problem
schema format vs compatibility mode	payload encoding choice vs evolution rule

Last 15-minute review

Review this	Because it fixes…
ordering is per partition, not per topic	architecture misunderstandings
keys, partitions, and group scaling rules	throughput and lag misses
commit timing and semantics	delivery-guarantee mistakes
idempotence and transactions	duplicate-vs-EOS confusion
rebalances and poll timing	consumer-stability misses
Avro/Protobuf/JSON plus compatibility modes	schema evolution distractors

What strong answers usually do

protect ordering, duplication, and replay behavior before optimizing raw throughput
separate producer guarantees from consumer coordination
distinguish settings that change semantics from settings that mostly change performance
reason from partition and group behavior instead of memorizing isolated config flags

Revised on Monday, June 15, 2026
