Introduction

When a message broker experiences degraded performance or temporary unavailability, producer retry logic can amplify the problem. Each timed-out message triggers retries, multiplying the load on an already struggling broker. This cascading retry storm trips the circuit breaker, which then blocks all message production -- including messages that would have succeeded.
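The amplification is easy to underestimate. A back-of-the-envelope calculation (the producer count and retry settings below are illustrative, not from any particular deployment) shows how quickly naive retries multiply load:

```java
public class RetryAmplification {
    public static void main(String[] args) {
        int producers = 50;          // producer instances, all hitting timeouts
        int messagesPerSecond = 100; // per producer, under normal load
        int retries = 5;             // naive retry count, no backoff

        // Every original send plus each of its retries hits the broker.
        int normalLoad = producers * messagesPerSecond;
        int stormLoad = normalLoad * (1 + retries);

        System.out.println("Normal broker load: " + normalLoad + " req/s");
        System.out.println("Retry-storm load:   " + stormLoad + " req/s");
        // A broker already too slow for 5000 req/s now sees 30000 req/s.
    }
}
```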

Symptoms

  • Circuit breaker transitions to OPEN state after exceeding failure threshold
  • Producer logs show repeated RequestTimedOutException followed by CircuitBreakerOpenException
  • Broker CPU and network I/O spike from retry traffic amplification
  • All producer threads blocked waiting for circuit breaker half-open transition
  • Downstream services stop receiving events, causing stale data and processing gaps

Common Causes

  • Retry count set too high with no exponential backoff, overwhelming the broker
  • Circuit breaker failure threshold too low relative to normal transient error rate
  • Broker underprovisioned for peak load, causing request queue buildup and timeouts
  • Multiple producer instances all retrying simultaneously without jitter
  • No backpressure mechanism between producers and the message broker
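Several of these causes reduce to synchronized, un-jittered retries. A minimal sketch of "full jitter" exponential backoff using only the JDK (each retry sleeps a random duration in [0, base × 2^attempt), capped; the parameter values are illustrative):

```java
import java.util.concurrent.ThreadLocalRandom;

public class FullJitterBackoff {
    static long backoffMillis(int attempt, long baseMillis, long capMillis) {
        long ceiling = Math.min(capMillis, baseMillis * (1L << attempt));
        // Each client picks a random point in [0, ceiling), so retries
        // from many producers spread out instead of arriving in waves.
        return ThreadLocalRandom.current().nextLong(ceiling);
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt < 5; attempt++) {
            System.out.printf("attempt %d: wait %d ms%n",
                    attempt, backoffMillis(attempt, 500, 30_000));
        }
    }
}
```

Without the random draw, every producer that timed out at the same moment retries at the same moment, recreating the original spike on each attempt.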

Step-by-Step Fix

  1. Check circuit breaker state and configuration: Inspect the current circuit breaker status.

     ```bash
     curl -s http://producer-service:8080/actuator/health | jq '.components.circuitBreaker'
     ```
  2. Reduce retry count and add exponential backoff with jitter: Prevent retry storms from overwhelming the broker.

     ```java
     RetryConfig config = RetryConfig.custom()
         .maxAttempts(3)
         // Start at 500 ms, double each attempt, and randomize so
         // producers do not retry in lockstep
         .intervalFunction(IntervalFunction.ofExponentialRandomBackoff(500, 2.0))
         .retryExceptions(RequestTimedOutException.class)
         .build();
     Retry retry = Retry.of("producer", config);
     ```
  3. Increase producer request timeout to accommodate broker load spikes: Give the broker more time to process under load. Keep the values consistent when raising them: with Kafka-style producer configs, delivery.timeout.ms must be at least linger.ms + request.timeout.ms.

     ```properties
     request.timeout.ms=60000
     delivery.timeout.ms=120000
     linger.ms=50
     batch.size=65536
     ```
  4. Implement backpressure to throttle producers during broker degradation: Bound the number of outstanding sends.

     ```java
     // Wait for each send to complete (up to 30 s) so the producer
     // slows down with the broker instead of queuing unboundedly
     producer.send(record).get(30, TimeUnit.SECONDS);
     ```
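Blocking on every future serializes the producer. A middle ground is a bounded in-flight window: callers block only once a fixed number of sends are outstanding. A JDK-only sketch (the permit count and the `dispatch` stub are illustrative stand-ins for the real client call):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.ThreadLocalRandom;

public class BoundedInFlightSender {
    private final Semaphore inFlight;

    BoundedInFlightSender(int maxInFlight) {
        this.inFlight = new Semaphore(maxInFlight);
    }

    // Blocks the caller once maxInFlight sends are outstanding, so a
    // slow broker throttles producers instead of growing a local queue.
    CompletableFuture<Void> send(String record) throws InterruptedException {
        inFlight.acquire();
        return dispatch(record).whenComplete((r, e) -> inFlight.release());
    }

    int availablePermits() {
        return inFlight.availablePermits();
    }

    // Stand-in for the real broker client call.
    private CompletableFuture<Void> dispatch(String record) {
        return CompletableFuture.runAsync(() -> {
            try {
                Thread.sleep(ThreadLocalRandom.current().nextInt(10));
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
            }
        });
    }

    public static void main(String[] args) throws Exception {
        BoundedInFlightSender sender = new BoundedInFlightSender(4);
        for (int i = 0; i < 20; i++) {
            sender.send("record-" + i); // blocks while 4 sends are in flight
        }
        System.out.println("all sends dispatched");
    }
}
```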
  5. Monitor circuit breaker metrics and broker health: Set up alerting for early warning.

     ```bash
     # Scrape Resilience4j circuit breaker metrics from the Prometheus endpoint
     curl -s http://producer-service:8080/actuator/prometheus | grep resilience4j
     ```
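For alerting on top of that scrape, a Prometheus rule on the breaker state gauge gives early warning. The metric name below follows Resilience4j's Micrometer binding; verify it against your actual scrape output before relying on it:

```yaml
groups:
  - name: producer-circuit-breaker
    rules:
      - alert: CircuitBreakerOpen
        expr: resilience4j_circuitbreaker_state{state="open"} == 1
        for: 1m
        labels:
          severity: page
        annotations:
          summary: "Producer circuit breaker {{ $labels.name }} is open"
```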

Prevention

  • Configure exponential backoff with jitter for all producer retries to avoid synchronized retry storms
  • Set circuit breaker thresholds based on historical error rates, not theoretical expectations
  • Implement producer-side rate limiting that adapts to broker response times
  • Use separate circuit breakers per topic to prevent a single degraded topic from blocking all production
  • Monitor broker request queue depth and response latency as leading indicators of timeout risk
  • Implement graceful degradation with local message buffering during circuit breaker open states
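The per-topic isolation idea can be sketched with a tiny consecutive-failure breaker keyed by topic. This is a toy model to show the shape of the isolation; a real deployment would use a library such as Resilience4j with one named breaker instance per topic:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class PerTopicBreakers {
    static final int FAILURE_THRESHOLD = 5;
    private final Map<String, AtomicInteger> failures = new ConcurrentHashMap<>();

    // Only the degraded topic's breaker trips; other topics keep producing.
    boolean allowSend(String topic) {
        return counter(topic).get() < FAILURE_THRESHOLD;
    }

    void recordFailure(String topic) {
        counter(topic).incrementAndGet();
    }

    void recordSuccess(String topic) {
        counter(topic).set(0);
    }

    private AtomicInteger counter(String topic) {
        return failures.computeIfAbsent(topic, t -> new AtomicInteger());
    }

    public static void main(String[] args) {
        PerTopicBreakers breakers = new PerTopicBreakers();
        for (int i = 0; i < 5; i++) breakers.recordFailure("orders");
        System.out.println("orders allowed:   " + breakers.allowSend("orders"));
        System.out.println("payments allowed: " + breakers.allowSend("payments"));
        // → orders allowed: false / payments allowed: true
    }
}
```

With a single shared breaker, five timeouts on "orders" would block sends to every topic, which is exactly the failure mode the bullet above warns against.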