Confluent CCDAK Cheat Sheet: Kafka Dev, Producers, Consumers, and Streams


Use this for last-mile review. Pair it with the Resources for coverage, the Glossary for term refresh, and the matching Confluent practice flow on MasteryExamPrep.com when you want timed reps.

Offset: Position marker for a record within a partition.

Consumer group: Coordinated set of consumers that divide a topic's partitions among themselves and track a committed offset for each partition.

Idempotence: Producer behavior that prevents duplicates caused by retries.

CCDAK answer sequence

Use this when the stem mixes stream processing, connectors, SQL, or deployment.

    flowchart TD
        S["Scenario"] --> D["Classify the data-app lane"]
        D --> S2["Check streams, connectors, or SQL fit"]
        S2 --> O["Check deployment, scale, or failure behavior"]
        O --> V["Verify output and operational evidence"]

Fast lane picker

| If the question is really about… | Focus first on… | Strongest first move |
| --- | --- | --- |
| ordering or partition behavior | keys, partitions, group parallelism | decide what must stay together before thinking about throughput |
| delivery guarantees | commit timing, idempotence, transactions, acks | decide what may be lost or duplicated |
| producer performance | batching, compression, retries, in-flight safety | separate latency tuning from semantics |
| consumer instability | lag, rebalances, polling cadence, commit strategy | decide whether the issue is throughput or liveness |
| schema compatibility | format choice, compatibility mode, shared-topic safety | protect consumers before optimizing payload shape |

Producer to consumer mental model

    flowchart LR
        P["Producer"] --> T["Topic"]
        T --> P0["Partition 0"]
        T --> P1["Partition 1"]
        P0 --> G["Consumer group"]
        P1 --> G
        G --> C1["Consumer 1"]
        G --> C2["Consumer 2"]

What to notice:

  • ordering exists only inside a single partition
  • keys usually decide partition affinity
  • group scaling depends on partition count, not on wishful parallelism
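
To make the key-to-partition point concrete, here is a minimal sketch in plain Python with no Kafka client involved. The `partition_for` helper and the use of CRC32 are assumptions for illustration only; real clients use their own partitioner and hash function, but the principle is the same: a stable hash of the key, modulo the partition count.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a record key to a partition with a stable hash.

    CRC32 is a stand-in here; it is not the hash a real Kafka client uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


events = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "paid"),
    ("order-1", "shipped"),
]

# Append each event to the partition its key hashes to.
partitions = {p: [] for p in range(3)}
for key, value in events:
    partitions[partition_for(key, 3)].append((key, value))

# All events for order-1 land in one partition, in production order.
order_1_partition = partition_for("order-1", 3)
order_1_events = [v for k, v in partitions[order_1_partition] if k == "order-1"]
print(order_1_events)  # ['created', 'paid', 'shipped']
```

Because the mapping depends on the partition count, changing the number of partitions can move a key to a different partition, which is why per-entity ordering and partition count are decided together.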

Ordering, keys, and throughput chooser

| Requirement | Strongest first fit | Why | Common trap |
| --- | --- | --- | --- |
| preserve ordering for one entity | stable key | same key routes to same partition | increasing consumers and expecting cross-partition ordering |
| increase parallelism | more partitions | more shards let more consumers work | scaling consumers beyond partition count |
| reduce hot partitions | improve key distribution or partitioner strategy | skew, not raw throughput, is the issue | adding brokers when the key design is wrong |
| predictable group scaling | partitions at least equal to peak consumer count | extra consumers idle otherwise | blaming lag on the consumer library first |
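
The group-scaling rows can be illustrated with a toy assignment function. This is a sketch only: `assign_round_robin` is a hypothetical helper, not a Kafka API, and real assignors are more involved, but the invariant it shows holds generally: each partition goes to at most one consumer in the group, so consumers beyond the partition count receive nothing.

```python
def assign_round_robin(partitions, consumers):
    """Hand out partitions to consumers one at a time, in order."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


three_partitions = [0, 1, 2]

# Two consumers share three partitions.
small_group = assign_round_robin(three_partitions, ["c1", "c2"])
print(small_group)  # {'c1': [0, 2], 'c2': [1]}

# Four consumers, three partitions: the fourth consumer sits idle.
large_group = assign_round_robin(three_partitions, ["c1", "c2", "c3", "c4"])
print(large_group)  # {'c1': [0], 'c2': [1], 'c3': [2], 'c4': []}
```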

Producer chooser

| Goal | Strongest first settings | Why |
| --- | --- | --- |
| strongest delivery safety | acks=all + idempotence + healthy replication (replication factor and min.insync.replicas sized to survive a broker loss) | protects against retry duplicates and weak acknowledgements |
| lowest latency | low linger.ms, smaller batches | more sends, less wait |
| highest throughput | larger batch.size, higher linger.ms, compression | better network efficiency |
| safer retry behavior | idempotence enabled | avoids duplicate writes from retry paths |

| Setting | What it changes | High-yield reminder |
| --- | --- | --- |
| acks | acknowledgement level | semantics and durability, not just speed |
| linger.ms | batching delay | throughput vs latency trade-off |
| batch.size | batch capacity | larger batches can improve efficiency |
| compression.type | payload compression | network and storage efficiency |
| retries | resend attempts | without idempotence, a retry can duplicate or reorder records |
| max.in.flight.requests.per.connection | unacknowledged requests per connection | can affect ordering and safe retry behavior; idempotence requires 5 or fewer |
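
One way to keep semantics and tuning apart is to hold them in separate config fragments. The sketch below uses plain Python dictionaries keyed by the producer setting names from the tables; the broker address, the numeric values, and the `build_config` helper are illustrative assumptions, not recommendations, and no client library is imported.

```python
# Settings that change delivery semantics.
SAFE_PRODUCER = {
    "bootstrap.servers": "localhost:9092",  # hypothetical address
    "acks": "all",
    "enable.idempotence": True,
    "max.in.flight.requests.per.connection": 5,
}

# Settings that mostly change performance (values are illustrative).
THROUGHPUT_OVERRIDES = {
    "linger.ms": 20,
    "batch.size": 131072,
    "compression.type": "lz4",
}
LOW_LATENCY_OVERRIDES = {
    "linger.ms": 0,
    "batch.size": 16384,
}

SEMANTIC_KEYS = {
    "acks",
    "enable.idempotence",
    "max.in.flight.requests.per.connection",
}


def build_config(base, overrides):
    """Layer tuning overrides on a base, refusing to touch semantic keys."""
    clash = SEMANTIC_KEYS & overrides.keys()
    if clash:
        raise ValueError(f"tuning profile changes semantics: {sorted(clash)}")
    return {**base, **overrides}


throughput_config = build_config(SAFE_PRODUCER, THROUGHPUT_OVERRIDES)
latency_config = build_config(SAFE_PRODUCER, LOW_LATENCY_OVERRIDES)
print(throughput_config["acks"], throughput_config["linger.ms"])  # all 20
```

Both profiles keep `acks=all` and idempotence; only batching and compression differ, which mirrors the "separate latency tuning from semantics" advice.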

Consumer and commit chooser

| Requirement | Strongest first fit | Why |
| --- | --- | --- |
| simplest behavior with less control | auto commit | easy but weaker control over semantics |
| process then mark progress | manual commit after successful handling | common at-least-once pattern |
| avoid consuming from the end on first start | auto.offset.reset=earliest when replay is needed | a group with no committed offsets then starts from the oldest retained record |
| reduce rebalance churn during long processing | tune max.poll.interval.ms and processing design | liveness must survive real work |

| Semantics | How you get it | Risk |
| --- | --- | --- |
| at-most-once | commit before processing | loss possible |
| at-least-once | process then commit | duplicates possible |
| exactly-once in Kafka workflows | idempotent producer + transactions + isolation.level=read_committed | more complexity, stronger guarantees inside supported patterns |
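
The difference between at-most-once and at-least-once comes down to what happens when a consumer crashes between the commit and the handler. The simulation below is a sketch in plain Python; `run_with_crash` is a hypothetical helper that models one crash between the two steps, followed by a restart from the last committed offset.

```python
def run_with_crash(records, commit_first, crash_at):
    """Consume records, crash between commit and handling at `crash_at`,
    then restart from the committed offset. Returns everything handled."""
    handled = []
    committed = 0  # next offset to read after a restart

    for offset in range(len(records)):
        if commit_first:
            committed = offset + 1           # progress marked first
            if offset == crash_at:
                break                        # crash before handling
            handled.append(records[offset])
        else:
            handled.append(records[offset])  # handled first
            if offset == crash_at:
                break                        # crash before commit
            committed = offset + 1

    # Restart: resume from the last committed offset.
    for offset in range(committed, len(records)):
        handled.append(records[offset])
    return handled


records = ["a", "b", "c", "d"]

at_most_once = run_with_crash(records, commit_first=True, crash_at=1)
print(at_most_once)   # ['a', 'c', 'd']  -> 'b' is lost

at_least_once = run_with_crash(records, commit_first=False, crash_at=1)
print(at_least_once)  # ['a', 'b', 'b', 'c', 'd']  -> 'b' is duplicated
```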

Rebalance and lag symptom table

| Symptom | First things to check | Common trap |
| --- | --- | --- |
| lag rising steadily | partitions, processing speed, downstream bottleneck | tuning schemas when the consumer is simply too slow |
| frequent rebalances | poll interval, session timeout, unstable membership | blaming brokers first |
| duplicates observed | retries without idempotence, replay after failure, handler not idempotent | assuming Kafka alone guarantees end-to-end exactly-once |
| out-of-order events | missing or inconsistent key, cross-partition comparison | treating the topic like a single global queue |
| idle consumers in a group | partition count too low | adding more consumers instead of more partitions |
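
For the "handler not idempotent" row, a common remedy is to make the handler safe to call twice with the same event. The sketch below assumes each event carries a unique `event_id` (an assumption, not something Kafka provides for you), and keeps the seen-set in memory; a real system would persist it together with the state it protects.

```python
class IdempotentHandler:
    """Apply each event at most once, keyed by a unique event ID."""

    def __init__(self):
        self.seen = set()
        self.totals = {}

    def handle(self, event_id, account, amount):
        """Return True if the event was applied, False if it was a replay."""
        if event_id in self.seen:
            return False
        self.seen.add(event_id)
        self.totals[account] = self.totals.get(account, 0) + amount
        return True


handler = IdempotentHandler()
deliveries = [
    ("evt-1", "acct-9", 50),
    ("evt-2", "acct-9", 25),
    ("evt-2", "acct-9", 25),  # redelivered after a replay
]
applied = [handler.handle(*delivery) for delivery in deliveries]
print(applied, handler.totals)  # [True, True, False] {'acct-9': 75}
```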

Schema and serialization chooser

| Format | Strongest use | Why |
| --- | --- | --- |
| JSON | prototyping or loose contracts | easy to read, weaker schema discipline |
| Avro | evolving event contracts | compact with strong schema support |
| Protobuf | strong typed cross-language contracts | compact and explicit, but more tooling overhead |

| Compatibility mode | What it protects | Simple reading |
| --- | --- | --- |
| BACKWARD | new consumers reading old data | safer default in many event systems, and the Schema Registry default |
| FORWARD | old consumers reading new data | protects older readers |
| FULL | both directions | strictest shared-topic choice |
| NONE | no compatibility guarantee | fastest iteration, highest contract risk |
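
The compatibility modes can be reasoned about with a deliberately simplified model: a schema is a mapping from field name to whether that field has a default, and a reader can decode a writer's data if every reader field is either present in the writer schema or has a default. This is a sketch of the Avro-style rule for added and removed fields only; type changes and the registry's actual checker are out of scope, and all function names are hypothetical.

```python
def can_read(reader, writer):
    """True if the reader schema can decode data written with the writer schema."""
    return all(
        name in writer or has_default
        for name, has_default in reader.items()
    )


def is_backward(new, old):
    """New schema reads data written with the old schema."""
    return can_read(reader=new, writer=old)


def is_forward(new, old):
    """Old schema reads data written with the new schema."""
    return can_read(reader=old, writer=new)


def is_full(new, old):
    return is_backward(new, old) and is_forward(new, old)


# field name -> has a default value
v1 = {"id": False, "amount": False}
v2_optional = {"id": False, "amount": False, "currency": True}
v2_required = {"id": False, "amount": False, "currency": False}
v2_removed = {"id": False}

print(is_full(v2_optional, v1))                                   # True
print(is_backward(v2_required, v1), is_forward(v2_required, v1))  # False True
print(is_backward(v2_removed, v1), is_forward(v2_removed, v1))    # True False
```

Adding a field with a default passes in both directions, which is why it is the usual safe move on a shared topic.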

High-confusion pairs

| Pair | Keep this distinction clear |
| --- | --- |
| key vs partition | key influences partition choice; partition is the actual ordering boundary |
| auto commit vs manual commit | convenience vs semantic control |
| at-least-once vs exactly-once | duplicate-tolerant processing vs stronger transactional guarantees |
| idempotence vs transaction | safe retries for one producer vs atomic multi-write workflow |
| lag vs rebalance | throughput problem vs group coordination/liveness problem |
| schema format vs compatibility mode | payload encoding choice vs evolution rule |
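
For the idempotence half of the "idempotence vs transaction" pair, the toy model below shows how sequence numbers let a log drop a retried write. It is a heavily simplified sketch: `PartitionLog` is a hypothetical class, and real brokers track more state (for example producer epochs and out-of-order sequences). Transactions, which make writes across several partitions atomic, are not modeled here.

```python
class PartitionLog:
    """Toy partition log that ignores retried writes by sequence number."""

    def __init__(self):
        self.records = []
        self.last_sequence = {}  # producer_id -> last sequence appended

    def append(self, producer_id, sequence, value):
        """Return True if appended, False if recognized as a duplicate retry."""
        if sequence <= self.last_sequence.get(producer_id, -1):
            return False
        self.records.append(value)
        self.last_sequence[producer_id] = sequence
        return True


log = PartitionLog()
results = [
    log.append("p1", 0, "created"),
    log.append("p1", 1, "paid"),
    log.append("p1", 1, "paid"),  # retry after a lost acknowledgement
]
print(results, log.records)  # [True, True, False] ['created', 'paid']
```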

Last 15-minute review

| Review this | Because it fixes… |
| --- | --- |
| ordering is per partition, not per topic | architecture misunderstandings |
| keys, partitions, and group scaling rules | throughput and lag misses |
| commit timing and semantics | delivery-guarantee mistakes |
| idempotence and transactions | duplicate-vs-EOS confusion |
| rebalances and poll timing | consumer-stability misses |
| Avro/Protobuf/JSON plus compatibility modes | schema evolution distractors |

What strong answers usually do

  • protect ordering, duplication, and replay behavior before optimizing raw throughput
  • separate producer guarantees from consumer coordination
  • distinguish settings that change semantics from settings that mostly change performance
  • reason from partition and group behavior instead of memorizing isolated config flags

Revised on Sunday, May 10, 2026