Introduction

When a consumer group loses its committed offsets -- due to offset expiration, manual reset, or configuration change -- setting auto.offset.reset=earliest causes the consumer to replay all available messages from the beginning of the log. If downstream systems are not idempotent, this replay creates duplicate records, corrupted aggregates, and incorrect business state.
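The behavior described above is controlled by the consumer's `auto.offset.reset` setting. A minimal, illustrative consumer configuration fragment (the group name is a placeholder):

```properties
group.id=my-consumer-group
# With no committed offset (expired or reset), 'earliest' replays the
# entire log; 'latest' resumes from the current head instead.
auto.offset.reset=earliest
```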

Symptoms

  • Downstream database contains duplicate records after consumer restart
  • Aggregate metrics show a sudden spike followed by a drop as duplicates first inflate counts and then normalize
  • Consumer logs show processing of events with timestamps far in the past
  • Idempotency keys conflict causing constraint violation errors in downstream databases
  • Error message: Duplicate key entry or UNIQUE constraint violation

Common Causes

  • Consumer group offsets expired due to offsets.retention.minutes being too short
  • Manual offset reset command run with --reset-offsets --to-earliest instead of --to-latest
  • New consumer group joining an existing topic with auto.offset.reset=earliest
  • Kafka consumer group coordinator failover causing offset commit loss
  • Offset topic itself compacted or deleted due to misconfigured retention

Step-by-Step Fix

  1. Check current committed offsets before any reset: document the current state to enable recovery.

     ```bash
     kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
       --describe --group my-consumer-group --offsets
     ```

  2. Stop the consumer to prevent further duplicate processing: halt processing immediately.

     ```bash
     kubectl scale deployment/consumer-service --replicas=0
     ```

  3. Reset offsets to the latest position so the group resumes from the current head and avoids replaying historical messages.

     ```bash
     kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
       --group my-consumer-group --topic my-topic \
       --reset-offsets --to-latest --execute
     ```

  4. Clean up duplicate data in downstream systems: run a deduplication query (PostgreSQL syntax shown) that keeps the most recent row per event.

     ```sql
     DELETE FROM events t1 USING events t2
     WHERE t1.id < t2.id
       AND t1.event_id = t2.event_id;
     ```

  5. Restart the consumer and verify no further duplicates: monitor the consumer for clean processing.

     ```bash
     kubectl scale deployment/consumer-service --replicas=3
     kubectl logs -l app=consumer-service --tail=100 | grep -iE "duplicate|error"
     ```
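The deduplication logic in step 4 can be sanity-checked before running it against production data. A minimal sketch using Python's built-in sqlite3 (table and column names mirror the example; since SQLite's DELETE has no USING clause, an equivalent NOT IN form expresses the same keep-the-newest-row rule):

```python
import sqlite3

# In-memory table mirroring the hypothetical events table from the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, event_id TEXT)")

# Simulate a replay: events 'a' and 'b' were each processed twice.
conn.executemany(
    "INSERT INTO events (event_id) VALUES (?)",
    [("a",), ("b",), ("a",), ("b",), ("c",)],
)

# Equivalent of the DELETE ... USING query: keep only the newest row
# (highest id) for each event_id, delete the rest.
conn.execute(
    "DELETE FROM events WHERE id NOT IN "
    "(SELECT MAX(id) FROM events GROUP BY event_id)"
)

remaining = conn.execute(
    "SELECT event_id, COUNT(*) FROM events GROUP BY event_id"
).fetchall()
print(sorted(remaining))  # each event_id now appears exactly once
```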

Prevention

  • Set auto.offset.reset=latest by default and only use earliest for explicit recovery scenarios
  • Implement idempotent downstream processing using unique event IDs and database upserts
  • Extend offsets.retention.minutes to at least 30 days for critical consumer groups
  • Enable periodic offset backup to an external store for disaster recovery
  • Require confirmation prompts or approval workflows for manual offset reset operations in production
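The idempotent-processing recommendation above can be sketched with Python's built-in sqlite3: a unique constraint on the event ID plus an upsert makes replays harmless. Table and event names are illustrative, not taken from any real system:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE processed_events ("
    "  event_id TEXT PRIMARY KEY,"  # unique event ID from the message
    "  payload  TEXT"
    ")"
)

def handle(event_id: str, payload: str) -> None:
    # ON CONFLICT DO NOTHING: a replayed message is silently skipped,
    # so reprocessing the same offsets cannot create duplicate rows.
    conn.execute(
        "INSERT INTO processed_events (event_id, payload) VALUES (?, ?) "
        "ON CONFLICT(event_id) DO NOTHING",
        (event_id, payload),
    )

batch = [("evt-1", "x"), ("evt-2", "y")]
for eid, p in batch:
    handle(eid, p)
for eid, p in batch:  # simulate a full replay of the same batch
    handle(eid, p)

count = conn.execute("SELECT COUNT(*) FROM processed_events").fetchone()[0]
print(count)  # still 2 after the replay
```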