Thursday, April 9, 2026

Apache Kafka Interview Questions and Answers

Real-Time Streaming Interview Prep

Apache Kafka Interview Questions & Answers (2026)

The go-to guide for Kafka interview questions — from core architecture to real-world streaming scenarios asked at Netflix, LinkedIn, Uber, and other top tech companies.

📅 Updated April 2026  |  ⏱ 13 min read  |  🎯 All Levels

Kafka is now the standard for real-time data pipelines. If you're interviewing for a Senior Data Engineer, Platform Engineer, or Backend Engineer role — Kafka questions are almost guaranteed. Here's everything you need to know.

KAFKA ARCHITECTURE OVERVIEW
─────────────────────────────────────────────
Producers ──► [ Topic: orders ]
                  ├── Partition 0 ──► Consumer Group A (Consumer 1)
                  ├── Partition 1 ──► Consumer Group A (Consumer 2)
                  └── Partition 2 ──► Consumer Group B (Consumer 1)
─────────────────────────────────────────────
ZooKeeper / KRaft ──► Broker Coordination
Brokers: 3 (each stores partition replicas)

1. Core Architecture & Concepts

Q1. What is the difference between a Kafka Topic and a Partition?
Topic — A logical category/feed that producers write to and consumers read from. Think of it like a database table name.

Partition — A physical subdivision of a topic stored on a broker. Each partition is an ordered, immutable log of records. Partitions enable:
Parallelism — Multiple consumers can read different partitions simultaneously.
Scalability — Partitions are distributed across brokers.
Ordering guarantee — Order is guaranteed within a partition, NOT across partitions.

Key rule: A topic with N partitions can have at most N consumers in one consumer group actively reading at the same time.
Q2. What is a Kafka Broker and what does ZooKeeper (or KRaft) do?
Broker — A Kafka server that stores partition data and serves producer/consumer requests. A Kafka cluster typically has 3+ brokers for fault tolerance.

ZooKeeper (legacy) — Managed broker metadata, leader election, and cluster coordination. Required in Kafka versions before 2.8.

KRaft (Kafka 3.3+ GA) — Kafka's own built-in consensus protocol replacing ZooKeeper. Eliminates the operational complexity of maintaining a separate ZooKeeper cluster. KRaft is now the default and recommended mode.
Q3. What is a Kafka Consumer Group and why does it matter?
A Consumer Group is a set of consumers that cooperatively consume a topic. Kafka ensures each partition is read by exactly one consumer in the group at a time.

Key behaviours:
• If you have 4 partitions and 2 consumers → each consumer reads 2 partitions.
• If you have 4 partitions and 5 consumers → 1 consumer is idle (no partition for it).
• Multiple consumer groups can read the same topic independently (fan-out).

Rebalancing — When a consumer joins or leaves the group, Kafka triggers a rebalance to reassign partitions. During a rebalance, consumption pauses briefly.
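The assignment rules above can be sketched in a few lines of Python. This is a simplified round-robin-style assignor for illustration only — the real protocol is coordinated by the group coordinator broker:

```python
def assign_partitions(num_partitions: int, consumers: list[str]) -> dict[str, list[int]]:
    """Simplified round-robin assignment: each partition goes to exactly
    one consumer in the group; surplus consumers get nothing."""
    consumers = sorted(consumers)
    assignment = {c: [] for c in consumers}
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment

# 4 partitions, 2 consumers: each consumer reads 2 partitions.
print(assign_partitions(4, ["c1", "c2"]))   # {'c1': [0, 2], 'c2': [1, 3]}
# 4 partitions, 5 consumers: one consumer sits idle.
print(assign_partitions(4, ["c1", "c2", "c3", "c4", "c5"]))
```

Note how with 5 consumers and 4 partitions, `c5` receives an empty list — the "idle consumer" case from the bullet points above.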

2. Producer & Consumer Deep Dive

Q4. What does acks=all mean in Kafka producer configuration?
The acks setting controls when the producer considers a message "successfully sent":

acks value   | Meaning                                | Risk
0            | Fire and forget — no acknowledgement   | Data loss possible
1            | Leader broker acknowledges             | Loss if leader fails before replication
all (or -1)  | All in-sync replicas (ISR) acknowledge | Slowest, but strongest durability
For financial or critical data pipelines: always use acks=all with min.insync.replicas=2.
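The durability-first setup described above might look like this as a plain config map (Java-client property names; note that min.insync.replicas is a broker/topic setting, not a producer property — a common interview trap):

```python
# Producer-side settings (Java-client property names).
producer_config = {
    "acks": "all",               # wait for all in-sync replicas
    "enable.idempotence": True,  # broker dedupes producer retries
    "retries": 2147483647,       # keep retrying transient errors
}

# Broker/topic-side setting — configured on the topic, not the producer.
topic_config = {"min.insync.replicas": "2"}
```

With this combination, a write only succeeds while at least 2 replicas are in sync, which is what gives the "tolerates one broker failure" guarantee.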
Q5. How does Kafka decide which partition a message goes to?
With a key: Kafka hashes the message key with murmur2 and applies (hash & 0x7fffffff) % numPartitions (the mask keeps the hash non-negative). The same key always maps to the same partition — guaranteeing order for related messages (e.g., all events for user ID 123).

Without a key: Kafka uses a sticky partitioner (default since Kafka 2.4) — batches messages to the same partition until the batch is full, then rotates. Previously used round-robin.

Custom partitioner: You can implement your own to route messages based on business logic (e.g., send high-priority orders to partition 0).
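The keyed path can be sketched in pure Python. This is a port of the Java client's 32-bit murmur2 hash (seed 0x9747b28c) plus the positive-mask-and-modulo step — a study aid, not a drop-in replacement for the real DefaultPartitioner:

```python
def murmur2(data: bytes) -> int:
    """Pure-Python port of the Kafka Java client's 32-bit murmur2."""
    length, seed, m, mask = len(data), 0x9747B28C, 0x5BD1E995, 0xFFFFFFFF
    h = (seed ^ length) & mask
    i = 0
    while length - i >= 4:  # mix 4 bytes at a time
        k = data[i] | (data[i+1] << 8) | (data[i+2] << 16) | (data[i+3] << 24)
        k = (k * m) & mask
        k ^= k >> 24
        k = (k * m) & mask
        h = ((h * m) & mask) ^ k
        i += 4
    rem = length - i        # fold in the 1-3 trailing bytes
    if rem >= 3:
        h ^= data[i+2] << 16
    if rem >= 2:
        h ^= data[i+1] << 8
    if rem >= 1:
        h = ((h ^ data[i]) * m) & mask
    h ^= h >> 13            # final avalanche
    h = (h * m) & mask
    return h ^ (h >> 15)

def partition_for(key: bytes, num_partitions: int) -> int:
    """Mask to non-negative, then modulo — same key, same partition."""
    return (murmur2(key) & 0x7FFFFFFF) % num_partitions

p = partition_for(b"user-123", 3)
# Every event keyed "user-123" lands on this same partition.
assert partition_for(b"user-123", 3) == p
```

The key takeaway for interviews: the mapping is deterministic, so repartitioning a topic (changing numPartitions) silently breaks key-to-partition affinity.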
Q6. What is the difference between at-most-once, at-least-once, and exactly-once delivery?
Delivery semantic   | Risk                   | How to achieve in Kafka
At-most-once        | Messages can be lost   | Auto-commit offsets before processing
At-least-once       | Duplicates possible    | Commit after processing (most common)
Exactly-once (EOS)  | No loss, no duplicates | Idempotent producer + transactional API
Most production systems use at-least-once with idempotent consumers. True exactly-once requires Kafka Transactions and adds latency overhead.
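The at-least-once failure mode can be demonstrated without a broker. This hypothetical in-memory consumer loop processes first, commits last, and "crashes" before the commit — showing why the DB write must be an idempotent upsert:

```python
def process_batch(records, db, committed_offset, crash_before_commit=False):
    """At-least-once loop: process first, commit the offset last.
    A crash after processing but before committing means the same
    records are redelivered on restart — so writes must be upserts."""
    for msg_id, payload in records:
        db[msg_id] = payload          # upsert: safe to run twice
    if crash_before_commit:
        return committed_offset       # offset never committed
    return committed_offset + len(records)

db = {}
records = [("m1", "a"), ("m2", "b")]
offset = process_batch(records, db, 0, crash_before_commit=True)  # crash!
offset = process_batch(records, db, offset)   # redelivery after restart
print(db)      # each message stored once despite duplicate delivery
print(offset)  # 2
```

Because the handler keys the write on the message ID, the duplicate delivery is harmless — exactly the "idempotent consumer" pattern the answer above recommends.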

3. Reliability, Replication & Offsets

Q7. What is an offset in Kafka? Who manages it?
An offset is a monotonically increasing integer that uniquely identifies each message within a partition. Kafka never deletes messages based on consumption — it retains them based on retention.ms (default 7 days).

Who manages offsets?
• Kafka itself stores committed offsets in an internal topic called __consumer_offsets (since Kafka 0.9).
• Consumers commit their offset after processing to track progress.
• If a consumer restarts, it resumes from its last committed offset.

auto.offset.reset — Controls what happens when there's no committed offset: earliest (read from beginning) or latest (read only new messages).
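The resume logic reduces to a few lines (a simplified model of one partition, ignoring offset expiry and other edge cases):

```python
def starting_offset(committed, log_start, log_end, auto_offset_reset="latest"):
    """Where a consumer resumes in one partition: the committed offset
    if one exists, otherwise wherever auto.offset.reset points."""
    if committed is not None:
        return committed
    return log_start if auto_offset_reset == "earliest" else log_end

assert starting_offset(42, 0, 100) == 42              # resume from commit
assert starting_offset(None, 0, 100, "earliest") == 0 # replay everything
assert starting_offset(None, 0, 100, "latest") == 100 # only new messages
```

A classic gotcha this model makes visible: a brand-new consumer group with the default latest silently skips all existing data.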
Q8. What is replication in Kafka and what is an ISR?
Each Kafka partition has one Leader and N-1 Follower replicas on different brokers. Producers write to the Leader; Followers replicate.

ISR (In-Sync Replicas) — The set of replicas that are caught up with the Leader within replica.lag.time.max.ms. If a follower falls too far behind, it's removed from the ISR.

Typical production config:
replication.factor=3, min.insync.replicas=2, acks=all

This means: 3 copies of data, requires at least 2 replicas to acknowledge writes — tolerates 1 broker failure with zero data loss.
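The ISR-membership and write-acceptance rules above can be modeled directly (a toy simulation; real brokers track lag continuously per follower fetch):

```python
def in_sync_replicas(replica_lag_ms, max_lag_ms=30_000):
    """Replicas within replica.lag.time.max.ms of the leader stay
    in the ISR; stragglers are dropped."""
    return {r for r, lag in replica_lag_ms.items() if lag <= max_lag_ms}

def write_accepted(isr, min_insync_replicas=2):
    """With acks=all, a write succeeds only while the ISR still
    satisfies min.insync.replicas."""
    return len(isr) >= min_insync_replicas

lags = {"broker-1": 0, "broker-2": 120, "broker-3": 45_000}  # ms behind
isr = in_sync_replicas(lags)   # broker-3 falls out of the ISR
print(write_accepted(isr))     # True: still 2 in-sync, writes succeed
```

If a second replica also lagged out, the ISR would shrink below min.insync.replicas and producers using acks=all would start receiving NotEnoughReplicas errors — availability is traded for durability.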

4. Performance & Tuning

⚠️ Senior Interview Territory: Performance tuning questions separate junior from senior candidates. Know these settings and when to use them.
Q9. How would you increase Kafka throughput for a high-volume producer?
Batching: Increase batch.size (default 16KB → try 64KB–256KB) and linger.ms (add small delay to fill batches).

Compression: Set compression.type=snappy or lz4 — dramatically reduces network and disk I/O.

Increase partitions: More partitions = more parallelism = more producers writing simultaneously.

Async sends: Use async producer with a callback instead of blocking on each send.

buffer.memory: Increase from 32MB to 64–128MB to reduce producer back-pressure.
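Pulled together, a hypothetical throughput-oriented configuration might look like this (Java-client property names; the exact numbers depend on message size and your latency budget, so treat them as starting points, not gospel):

```python
# Illustrative high-throughput producer settings.
throughput_config = {
    "batch.size": 131072,        # 128 KB batches (default is 16 KB)
    "linger.ms": 10,             # wait up to 10 ms to fill a batch
    "compression.type": "lz4",   # shrink network and disk I/O
    "buffer.memory": 134217728,  # 128 MB send buffer vs 32 MB default
}
```

The batching settings trade a few milliseconds of latency for far fewer, larger requests — usually the single biggest throughput win.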
Q10. What causes consumer lag and how do you fix it?
Consumer lag = the difference between the latest offset in a partition and the consumer's current offset. High lag means your consumers are falling behind producers.

Causes:
• Consumer processing is too slow (heavy computation, slow DB writes).
• Too few consumer instances for the number of partitions.
• Frequent rebalances causing pause time.

Fixes:
• Scale out — add more consumers (up to the number of partitions).
• Optimize consumer processing — batch DB writes, async processing.
• Increase max.poll.records to process more records per poll.
• Monitor with Kafka's kafka-consumer-groups.sh --describe or a tool like Burrow.
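The lag definition above is just subtraction per partition, mirroring what kafka-consumer-groups.sh --describe reports:

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Per-partition lag = latest offset minus the group's
    committed offset (uncommitted partitions count from 0)."""
    return {p: log_end_offsets[p] - committed_offsets.get(p, 0)
            for p in log_end_offsets}

lag = consumer_lag({0: 1000, 1: 500}, {0: 990, 1: 500})
print(lag)                 # {0: 10, 1: 0}
print(sum(lag.values()))   # 10 — total lag across the topic
```

In practice you alert on total lag trending upward over time, not on any single snapshot — a brief spike during a rebalance is normal.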

5. Scenario-Based Questions

Q11. Design a real-time order processing system using Kafka.
Architecture:

1. Order Service (Producer) — Publishes order events to orders topic with order_id as key (ensures all events for the same order go to the same partition).

2. Kafka Topics: orders-created → orders-validated → orders-fulfilled

3. Consumer Microservices:
• Validation Service — reads from orders-created, validates stock/payment, publishes to orders-validated.
• Fulfillment Service — reads from orders-validated, triggers shipping.
• Notification Service — reads both topics, sends emails/SMS.

4. Reliability: acks=all, idempotent producers, dead-letter topic for failed orders.
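The validation stage with its dead-letter route can be sketched like this (validate and publish are hypothetical stand-ins for the real stock/payment check and producer send):

```python
def handle_order(event, validate, publish):
    """Validation stage of the pipeline: good orders advance to
    orders-validated; failures go to a dead-letter topic so one bad
    record never stalls the main flow. Keyed by order_id so all
    events for an order stay on one partition, in order."""
    try:
        validate(event)
        publish("orders-validated", key=event["order_id"], value=event)
    except ValueError as err:
        publish("orders-dlt", key=event["order_id"],
                value={**event, "error": str(err)})

def validate(event):
    # Hypothetical rule standing in for stock/payment checks.
    if event["qty"] <= 0:
        raise ValueError("invalid quantity")

sent = []
publish = lambda topic, key, value: sent.append((topic, key))
handle_order({"order_id": "o1", "qty": 2}, validate, publish)
handle_order({"order_id": "o2", "qty": -1}, validate, publish)
print(sent)  # [('orders-validated', 'o1'), ('orders-dlt', 'o2')]
```

Records on orders-dlt carry the original payload plus the error, so they can be inspected and replayed once the underlying issue is fixed.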
Q12. How would you handle duplicate messages in a Kafka consumer?
At-least-once delivery means duplicates can happen (consumer crashes after processing but before committing offset). Strategies to handle this:

1. Idempotent Processing — Design your processing logic to be safe to run twice (e.g., upsert to DB using the message ID as the primary key).

2. Deduplication Store — Track processed message IDs in Redis with a short TTL. If seen before, skip processing.

3. Exactly-Once Semantics (EOS) — Use Kafka's transactional API with enable.idempotence=true for true end-to-end exactly-once guarantees within the Kafka ecosystem.
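Strategy 2 can be sketched with an in-memory stand-in for the Redis-with-TTL pattern (in production you would use Redis SET with EX/NX so the check survives consumer restarts):

```python
import time

class DedupStore:
    """In-memory sketch of the dedup pattern: remember message IDs
    for `ttl` seconds and skip anything seen within that window."""
    def __init__(self, ttl=300, clock=time.monotonic):
        self.ttl, self.clock, self.seen = ttl, clock, {}

    def first_time(self, msg_id):
        now = self.clock()
        expiry = self.seen.get(msg_id)
        if expiry is not None and expiry > now:
            return False              # duplicate inside the TTL window
        self.seen[msg_id] = now + self.ttl
        return True

store = DedupStore(ttl=300)
print(store.first_time("m1"))  # True: first delivery, process it
print(store.first_time("m1"))  # False: duplicate, skip
```

The TTL keeps memory bounded: it only needs to outlast the window in which Kafka could plausibly redeliver (rebalance plus retry time), not the topic's full retention.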

🎯 Master More Data Engineering Interview Topics

Practice 100-question interactive quizzes on SQL, Spark, PySpark, Hadoop and more — completely free. Then get the 300Q PDF bundle for offline deep prep.

Visit the Blog → Get the PDF Bundle

© 2026 InterviewQuestionsToLearn.com