Kafka Consumer Groups Basics

🐥 the most basic 101 + extra resources

Consumer Groups

A consumer is an application that leverages the Kafka client library (in particular, its KafkaConsumer class) to consume messages from Kafka and do something with them.

A Kafka topic may have more messages than a single app would ever be able to process - so we need to scale the consumption up.

Enter Consumer Groups.

A consumer group is a collection of consumer applications that work in tandem to consume from the same topic(s).

The apps don’t talk to each other directly. Instead, a broker places every consumer that is configured with the same group.id into the same consumer group.
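As a minimal sketch of that idea, the snippet below models how consumers that never talk to each other still land in one group purely through a shared group id setting (spelled `group.id` in the Java client’s property naming). The `group_members` helper and the client names are hypothetical and only illustrate the grouping; this is not the Kafka client API.

```python
from collections import defaultdict


def group_members(consumer_configs):
    """Group client ids by their configured group id, roughly as a broker would see them."""
    groups = defaultdict(list)
    for client_id, config in consumer_configs.items():
        groups[config["group.id"]].append(client_id)
    return {group: sorted(members) for group, members in groups.items()}


# Hypothetical clients: two instances of one app, one instance of another.
configs = {
    "billing-1": {"group.id": "billing"},
    "billing-2": {"group.id": "billing"},
    "audit-1": {"group.id": "audit"},
}
groups = group_members(configs)
print(groups)
```

The two billing instances share a group and split the work; the audit app sits in its own group and reads the topic independently.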

Coordination

When we have N consumers in the same group, we need to make sure they’re coordinating their work appropriately.

At any one time, the goal is to have at most one consumer in the group reading from a given partition - you wouldn’t want two members reading duplicate records from the same partition.

This is done through the consumer group protocol:

  1. consumer clients join the group before consuming anything.

  2. to join the group, they talk to a specific broker - called the Group Coordinator.

The Group Coordinator’s job is to maintain the membership of the group - which consumers are part of it. Because everything group-related is owned by one broker, it becomes possible to avoid these duplicate reads in the happy path.
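The one-reader-per-partition goal can be sketched with a simplified round-robin assignment. This is a toy model under our own assumptions: `assign_round_robin` and the consumer names are hypothetical, and the real client ships several assignment strategies that are more involved than this.

```python
def assign_round_robin(partitions, consumers):
    """Give every partition exactly one owner within the group."""
    if not consumers:
        return {}
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    for index, partition in enumerate(sorted(partitions)):
        owner = members[index % len(members)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2, 3, 4, 5]
assignment = assign_round_robin(
    partitions, ["consumer-a", "consumer-b", "consumer-c", "consumer-d"]
)
print(assignment)

# With more consumers than partitions, the extra consumer gets nothing to read.
idle_example = assign_round_robin([0, 1], ["consumer-a", "consumer-b", "consumer-c"])
print(idle_example)
```

Note the second case: since a partition has only one owner in the group, adding more consumers than there are partitions leaves some of them idle.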

This Group Coordinator broker is also the one that helps the consumers save their offsets (an operation called an offset commit). This data is stored in an internal topic named __consumer_offsets.

This topic also contains metadata about the group, so that the coordinator can fail over appropriately to another broker if the original one dies.
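A rough sketch of how a group could be mapped to its coordinator: to our knowledge, the broker hashes the group id onto one partition of __consumer_offsets, and the broker leading that partition acts as the group’s coordinator. The function names below are hypothetical, and the partition count of 50 is an assumption (it is the default of the `offsets.topic.num.partitions` broker setting).

```python
def java_string_hash(text):
    """Reproduce Java's String.hashCode() as a signed 32-bit integer.

    Assumes all characters are in the Basic Multilingual Plane.
    """
    value = 0
    for char in text:
        value = (value * 31 + ord(char)) & 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def offsets_partition_for(group_id, num_offsets_partitions=50):
    """Pick the __consumer_offsets partition that holds this group's data."""
    hashed = java_string_hash(group_id)
    # The smallest 32-bit integer has no positive counterpart, so map it to 0.
    non_negative = 0 if hashed == -2**31 else abs(hashed)
    return non_negative % num_offsets_partitions


partition = offsets_partition_for("billing")
print(f"group 'billing' is stored in __consumer_offsets partition {partition}")
```

Because the mapping depends only on the group id, every consumer of the group ends up at the same partition, and when that partition’s leadership moves to another broker, the coordinator role moves with it.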

Consumer Group Rebalance

We rebalance when we want to move partition ownership from one consumer to another.

A consumer group rebalance is the act of having every member re-join the group. Through this process, each consumer receives the assigned partitions it should consume from.

There are 6 reasons why a group can be forced to rebalance:

  • a consumer joins the group (sends a JoinGroup request).

  • a consumer shuts down gracefully.

    • it leaves the group via a LeaveGroup request

  • more than max.poll.interval.ms passes between two Consumer#poll() calls (the consumer is considered stalled and leaves the group).

  • a consumer dies (the coordinator receives no heartbeat from it within session.timeout.ms).

  • the Consumer#enforceRebalance API is called.

  • a new partition is added to a topic that the group is subscribed to.
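The two timeout-driven reasons in the list above are easy to mix up, so here is a minimal sketch that tells them apart. `MemberTimers` and `rebalance_reason` are hypothetical names, and the default values are assumptions meant to match recent Java client defaults (45 s session timeout, 5 min poll interval); check your own client version.

```python
from dataclasses import dataclass


@dataclass
class MemberTimers:
    last_heartbeat_ms: int  # when the coordinator last heard a heartbeat
    last_poll_ms: int       # when the application last called poll()


def rebalance_reason(timers, now_ms, session_timeout_ms=45_000, max_poll_interval_ms=300_000):
    """Return why this member would trigger a rebalance, or None if it is healthy."""
    if now_ms - timers.last_heartbeat_ms > session_timeout_ms:
        # No heartbeat at all: the process is presumed dead.
        return "session_timeout"
    if now_ms - timers.last_poll_ms > max_poll_interval_ms:
        # Heartbeats still arrive, but the app stopped calling poll().
        return "poll_timeout"
    return None


healthy = rebalance_reason(MemberTimers(98_000, 90_000), now_ms=100_000)
stuck = rebalance_reason(MemberTimers(399_000, 50_000), now_ms=400_000)
dead = rebalance_reason(MemberTimers(40_000, 40_000), now_ms=100_000)
print(healthy, stuck, dead)
```

The design point: heartbeats prove the process is alive, while poll() calls prove it is actually making progress, which is why the two are tracked with separate timeouts.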

Heartbeats

Each consumer maintains a heartbeat to the group coordinator - sending a heartbeat request every heartbeat.interval.ms.

This is the main way that the coordinator communicates the need for a rebalance to the consumer.

For example, a consumer restarts and a rebalance is needed - how do the other consumers realize this?

The Coordinator responds with an error in the heartbeat response.

Figure: example rebalance sequence caused by a consumer restarting.

The lower the heartbeat interval, the sooner your consumers notice that a rebalance is needed → the sooner they’ll complete it. The trade-off is more heartbeat requests sent to the coordinator.
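The heartbeat exchange can be sketched as a toy coordinator. This is a heavily simplified model, not the real protocol implementation: the `Coordinator` class is hypothetical, and only the error names (NONE, REBALANCE_IN_PROGRESS, UNKNOWN_MEMBER_ID) are borrowed from Kafka’s protocol errors.

```python
class Coordinator:
    """Toy group coordinator: tracks members and who still has to re-join."""

    def __init__(self):
        self.members = set()
        self.pending_rejoin = set()

    def join(self, member_id):
        if member_id not in self.members:
            # A new member changes the group, so everyone else must re-join.
            self.members.add(member_id)
            self.pending_rejoin = self.members - {member_id}
        else:
            # An existing member re-joined as part of the rebalance.
            self.pending_rejoin.discard(member_id)

    def heartbeat(self, member_id):
        """Return the error carried in the heartbeat response."""
        if member_id not in self.members:
            return "UNKNOWN_MEMBER_ID"
        if member_id in self.pending_rejoin:
            return "REBALANCE_IN_PROGRESS"
        return "NONE"


coordinator = Coordinator()
coordinator.join("consumer-a")
before = coordinator.heartbeat("consumer-a")      # steady state
coordinator.join("consumer-b")                    # a second consumer shows up
during = coordinator.heartbeat("consumer-a")      # consumer-a is told to re-join
coordinator.join("consumer-a")
after = coordinator.heartbeat("consumer-a")       # back to steady state
print(before, during, after)
```

In this model a consumer only learns about the rebalance on its next heartbeat, so the notification delay is bounded by the heartbeat interval.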


Those are the basics that fit in 2 minutes.


Apache®, Apache Kafka®, Kafka, and the Kafka logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.