Fix Message Ordering Violation After Broker Failover

Introduction

Message brokers with partitioned logs guarantee ordering within a partition: messages are delivered in the order in which they were appended. During broker failover with unclean leader election, however, the new leader may lack some of the messages the old leader had, which causes sequence gaps or reordering. This breaks consumer logic that depends on strict ordering, such as event sourcing or state machine transitions.

Symptoms

Consumers receive messages out of sequence number order
State machine transitions fail because prerequisite events are missing
Event-sourced aggregates produce incorrect state due to missing intermediate events
Consumer logs show sequence number gaps or backward jumps
Error message: Sequence number 4523 received after 4530, ordering violated

Common Causes

Unclean leader election is enabled, allowing an out-of-sync replica to become leader
Broker crashes before in-flight writes are flushed to disk, losing recent messages
Network partition causes a leader change while the in-sync replica set (ISR) is missing recent commits
Producer sends messages with acks=1 instead of acks=all, so it does not wait for full replication
Consumer processes messages asynchronously, reordering them within the application layer
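
The last cause is independent of the broker, and it can be avoided by keeping concurrency across keys while serializing work within a key. The following is a minimal sketch of that idea, not a definitive implementation. It assumes messages arrive as (key, sequence) pairs; the names process, worker, and consume_in_order are hypothetical, and the random sleep only stands in for real handler latency.

```python
import asyncio
import random


async def process(message, results):
    # Simulated handler with variable latency (stand-in for real work).
    await asyncio.sleep(random.uniform(0, 0.01))
    results.append(message)


async def worker(queue, results):
    # A single worker drains its own queue strictly in FIFO order.
    while True:
        message = await queue.get()
        if message is None:  # sentinel: no more messages for this worker
            break
        await process(message, results)


async def consume_in_order(messages, num_workers=4):
    """Process (key, seq) pairs concurrently across keys, sequentially per key."""
    queues = [asyncio.Queue() for _ in range(num_workers)]
    results = []
    tasks = [asyncio.create_task(worker(q, results)) for q in queues]
    for key, seq in messages:
        # The same key always maps to the same queue, so its order is preserved.
        await queues[hash(key) % num_workers].put((key, seq))
    for q in queues:
        await q.put(None)
    await asyncio.gather(*tasks)
    return results
```

Messages for different keys may still interleave in the results, which is usually acceptable because ordering requirements tend to apply per aggregate or per entity rather than globally.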

Step-by-Step Fix

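Extract the sequence number from each consumed message and look for gaps or backward jumps. Assuming the messages are JSON objects with a sequenceNumber field, pipe the consumer output through:

jq -r '.sequenceNumber'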
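
Reading a long column of numbers by eye is error-prone, so the check can be automated. The sketch below is one possible approach under stated assumptions: each input line is a JSON object with a sequenceNumber field (the field name used by the jq filter above), and sequence numbers are expected to increase by exactly 1. The function name find_ordering_violations is hypothetical.

```python
import json


def find_ordering_violations(lines):
    """Scan JSON message lines and return (kind, highest_seen, current) tuples.

    kind is "gap" when numbers are skipped, and "out_of_order" when a
    number is less than or equal to the highest one already seen
    (this also covers duplicates).
    """
    violations = []
    highest = None  # highest sequence number seen so far
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        seq = json.loads(line)["sequenceNumber"]
        if highest is not None:
            if seq <= highest:
                violations.append(("out_of_order", highest, seq))
            elif seq > highest + 1:
                violations.append(("gap", highest, seq))
        if highest is None or seq > highest:
            highest = seq
    return violations
```

Passing it an open file or sys.stdin works, since both iterate line by line. A stream containing 4530 followed by 4523 yields an out_of_order entry for that pair, matching the error message listed under Symptoms.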

Prevention

Always set acks=all on producers to ensure messages are replicated before acknowledgment
Set unclean.leader.election.enable=false in production to prevent an out-of-sync replica from being elected leader
Use min.insync.replicas=2 or higher, together with acks=all, so that acknowledged messages exist on multiple brokers
Implement sequence number validation in consumers that rejects out-of-order messages
For critical ordering requirements, use single-partition topics or partition by a strict ordering key
Monitor ISR shrink events as an early warning indicator of potential ordering violations
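
The three configuration settings above can be checked automatically, for example in a deployment pipeline. The following is a hedged sketch: audit_ordering_settings is a hypothetical helper, and it assumes the producer and topic settings have already been loaded into plain dictionaries. A missing key is flagged as well, on the assumption that defaults differ between client and broker versions.

```python
def audit_ordering_settings(producer, topic):
    """Return a list of warnings for settings that weaken ordering guarantees.

    Both arguments are plain dicts mapping setting name to value
    (hypothetical inputs; load them however your deployment stores them).
    """
    warnings = []

    # acks=-1 is the numeric spelling of acks=all.
    acks = str(producer.get("acks", "")).strip().lower()
    if acks not in ("all", "-1"):
        warnings.append("acks should be 'all', found %r" % acks)

    unclean = str(topic.get("unclean.leader.election.enable", "")).strip().lower()
    if unclean != "false":
        warnings.append("unclean.leader.election.enable should be false")

    try:
        min_isr = int(topic.get("min.insync.replicas", 0))
    except (TypeError, ValueError):
        min_isr = 0  # treat unparsable values as unsafe
    if min_isr < 2:
        warnings.append("min.insync.replicas should be 2 or higher")

    return warnings
```

An empty result means none of the three settings was flagged. It does not prove that ordering is safe, because application-level reordering and monitoring gaps are outside what a settings check can see.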
