Introduction
Kafka consumers commit their processed offsets to the __consumer_offsets internal topic, managed by a designated group coordinator broker. When this broker becomes unavailable, offset commits fail, and consumers continue processing without recording their progress. If the consumer restarts before the coordinator recovers, it will reprocess all messages since the last successful commit.
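As a rough illustration of how a group ends up tied to one broker: to my understanding, the broker maps a group ID to a `__consumer_offsets` partition by hashing the ID and taking it modulo the partition count, and the leader of that partition acts as the group's coordinator. The sketch below reproduces that arithmetic. The class name `CoordinatorPartition` is hypothetical, and the value 50 assumes the default `offsets.topic.num.partitions`; check your broker configuration before relying on the result.

```java
// Hypothetical helper, not part of the Kafka client API.
final class CoordinatorPartition {

    // Absolute value that maps Integer.MIN_VALUE to 0 instead of leaving it negative.
    static int abs(int n) {
        return n == Integer.MIN_VALUE ? 0 : Math.abs(n);
    }

    // Index of the __consumer_offsets partition assumed to hold this group's offsets.
    static int partitionFor(String groupId, int offsetsTopicPartitions) {
        return abs(groupId.hashCode()) % offsetsTopicPartitions;
    }

    public static void main(String[] args) {
        String groupId = args.length > 0 ? args[0] : "my-consumer-group";
        int partitions = 50; // assumed default for offsets.topic.num.partitions
        System.out.println("__consumer_offsets-" + partitionFor(groupId, partitions));
    }
}
```

If the leader of the printed partition is the broker that went down, every commit from that group is affected until a new leader is elected.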
Symptoms
- Consumer logs show `CommitFailedException: Group coordinator not available`
- Offset commit latency increases to timeout levels
- `__consumer_offsets` topic partition leader is unavailable
- Consumers continue processing but offsets are not persisted
- On consumer restart, messages are reprocessed from the last committed offset
Common Causes
- Group coordinator broker crashed or was taken offline for maintenance
- `__consumer_offsets` topic partition leader election failed
- Network partition between consumers and the coordinator broker
- Broker overload causing coordinator request processing to time out
- Consumer group coordinator migration during rebalance taking too long
Step-by-Step Fix
1. Identify the group coordinator for the affected consumer group: Find which broker is the coordinator. The `--state` option adds the coordinator's host and broker ID to the output.

   ```bash
   kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
     --describe --group my-consumer-group --state
   ```

2. Check coordinator broker health: Verify the coordinator broker is running and responsive.

   ```bash
   # List the partition leaders of __consumer_offsets; the coordinator is the
   # leader of the partition that holds the group's offsets
   kafka-topics.sh --bootstrap-server localhost:9092 \
     --describe --topic __consumer_offsets
   # Check broker status
   kafka-broker-api-versions.sh --bootstrap-server coordinator-broker:9092
   ```

3. Restart the coordinator broker if it is down: Restore the coordinator service.

   ```bash
   systemctl restart kafka
   # Wait for the broker to rejoin the cluster, then confirm it responds
   kafka-broker-api-versions.sh --bootstrap-server localhost:9092 | grep "coordinator-broker"
   ```

4. Re-commit offsets once the coordinator is available: Running consumers record their progress again on their next successful commit. For a stopped group, the stored offsets can be rewritten with the consumer group tool.

   ```bash
   # Requires the group to have no active members; rewrites the group's
   # current committed offsets for all of its topics
   kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
     --group my-consumer-group --reset-offsets --to-current --all-topics --execute
   ```

5. Configure synchronous offset commits for critical consumers: Make each commit block, so that a failed commit surfaces as an exception before more records are processed.

   ```java
   // Use commitSync instead of commitAsync for critical processing;
   // it blocks until the commit succeeds or throws
   consumer.commitSync();
   ```
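A synchronous commit can still fail while the coordinator is down, so critical consumers usually need some policy for what to do when it throws. The following is a minimal sketch of one such policy: retry with a linearly growing pause. It assumes the commit is passed in as a plain `Runnable` (in a real consumer this might be `consumer::commitSync`). The class `CommitRetry` and its method are hypothetical names, and catching every `RuntimeException` is a simplification: a real consumer should not retry a `CommitFailedException` caused by a rebalance.

```java
import java.time.Duration;

// Hypothetical helper, not part of the Kafka client API.
final class CommitRetry {

    // Runs commitAction until it succeeds or maxAttempts is used up.
    // Returns true on success, false if every attempt failed or the thread was interrupted.
    static boolean commitWithRetry(Runnable commitAction, int maxAttempts, Duration backoff) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                commitAction.run();
                return true;
            } catch (RuntimeException e) {
                // Simplification: treats every runtime failure as retriable.
                System.err.printf("commit attempt %d/%d failed: %s%n",
                        attempt, maxAttempts, e.getMessage());
            }
            if (attempt < maxAttempts) {
                try {
                    // Pause grows linearly with the attempt number.
                    Thread.sleep(backoff.toMillis() * attempt);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }
}
```

When the helper returns `false`, the caller can decide whether to pause consumption or keep processing and accept possible reprocessing after a restart.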
Prevention
- Configure `offsets.topic.replication.factor=3` to ensure the `__consumer_offsets` topic is highly available
- Monitor group coordinator availability and alert on coordinator changes
- Use `enable.auto.commit=false` with explicit `commitSync()` for critical processing pipelines
- Implement offset tracking in an external store (database) as a backup to Kafka's internal offsets
- Set `offsets.retention.minutes` to at least 43200 (30 days) for production consumer groups
- Distribute `__consumer_offsets` partition leaders across multiple brokers
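The external offset store suggested above could look roughly like the sketch below. It is an in-memory stand-in, assuming a real deployment would back it with a database table keyed by group, topic, and partition. The class `ExternalOffsetStore` and its methods are hypothetical names chosen for illustration.

```java
import java.util.Map;
import java.util.OptionalLong;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a database-backed offset store.
final class ExternalOffsetStore {

    // Maps "group|topic|partition" to the next offset to read.
    private final Map<String, Long> nextOffsets = new ConcurrentHashMap<>();

    private static String key(String group, String topic, int partition) {
        return group + "|" + topic + "|" + partition;
    }

    // Records that the record at processedOffset is done; the stored position never moves backwards.
    void recordProcessed(String group, String topic, int partition, long processedOffset) {
        nextOffsets.merge(key(group, topic, partition), processedOffset + 1, Long::max);
    }

    // Offset to resume from, or empty if nothing has been recorded for this partition.
    OptionalLong resumeOffset(String group, String topic, int partition) {
        Long next = nextOffsets.get(key(group, topic, partition));
        return next == null ? OptionalLong.empty() : OptionalLong.of(next);
    }
}
```

On restart, a consumer could look up `resumeOffset` for each assigned partition and call `consumer.seek` with the stored value, falling back to Kafka's committed offset when the store has no entry.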