Introduction

A NATS client disconnects when the server restarts or the network drops the connection. This guide walks through diagnosing and resolving the disconnect step by step.

Symptoms

Typical error output:

```text
ERROR: Connection disconnected
Server: nats://nats.example.com:4222
Reason: server restart
Reconnecting...
```
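When disconnects recur, it can help to see which reasons dominate. The following is a minimal sketch, assuming the client log contains `Reason: ...` lines in the same shape as the sample output above; `count_disconnect_reasons` is a hypothetical helper name, not part of any NATS tooling.

```bash
#!/usr/bin/env bash
# Hypothetical helper: tally disconnect reasons read from stdin.
# Assumes each disconnect logs a line of the form "Reason: <text>".
count_disconnect_reasons() {
  awk '
    /^Reason: / {
      sub(/^Reason: /, "")   # keep only the reason text
      counts[$0]++
    }
    END {
      for (reason in counts) printf "%s\t%d\n", reason, counts[reason]
    }
  ' | sort
}
```

It could be fed a log file, for example `count_disconnect_reasons < client.log`, where `client.log` stands in for whatever file your client writes to.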

Common Causes

  1. Broker unreachable or cluster partition
  2. Topic or queue configuration issue
  3. Authentication or authorization failure
  4. Resource limit exceeded (memory, disk, connections)
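A quick TCP probe can separate the first cause (server unreachable) from the others. The following is a rough sketch, assuming a plain `nats://host:port` URL without embedded credentials, as in the error output above. `parse_nats_url` and `check_reachable` are hypothetical helper names; the probe relies on bash's `/dev/tcp` redirection and the coreutils `timeout` command.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of any NATS tooling).

# Split a nats://host:port URL into "host port".
# Falls back to 4222, the default NATS client port, when no port is given.
# Assumption: the URL carries no user:password@ prefix.
parse_nats_url() {
  local url="${1#nats://}"   # strip the scheme
  url="${url%%/*}"           # drop any trailing path
  local host="${url%%:*}"
  local port="4222"
  if [[ "$url" == *:* ]]; then
    port="${url##*:}"
  fi
  printf '%s %s\n' "$host" "$port"
}

# Return 0 if a TCP connection to host:port opens within 3 seconds.
check_reachable() {
  local host="$1" port="$2"
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}
```

For example, `read -r host port < <(parse_nats_url nats://nats.example.com:4222)` followed by `check_reachable "$host" "$port"` reports whether the port accepts connections at all. If it does, the remaining causes (configuration, authentication, resource limits) become more likely.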

Step-by-Step Fix

Step 1: Check Current State

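```bash
# Check NATS server status (assumes a systemd unit named nats-server)
systemctl status nats-server
# Query server statistics (assumes HTTP monitoring is enabled on port 8222)
curl -s http://localhost:8222/varz
# View publisher/subscriber logs
tail -f /var/log/messaging/*.log
```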

Step 2: Identify Root Cause

```bash
# Check server logs (assumes a systemd unit named nats-server)
journalctl -u nats-server -f
# Verify cluster routes (assumes HTTP monitoring is enabled on port 8222)
curl -s http://localhost:8222/routez
# Check current client connections
curl -s http://localhost:8222/connz
```

Step 3: Apply Primary Fix

```bash
# Primary fix: check and restart
# Verify server status (assumes a systemd unit named nats-server)
systemctl status nats-server

# Confirm the server accepts client connections (requires the nats CLI)
nats server check connection --server nats://localhost:4222

# Restart the server if needed
systemctl restart nats-server
```

Step 4: Apply Alternative Fix

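```bash
# Alternative: check configuration (path is an assumption; adjust to your install)
cat /etc/nats/nats-server.conf
# Verify the server is listening on the client port
netstat -tulpn | grep 4222
# Check disk space (matters when JetStream file storage is in use; path is an assumption)
df -h /var/lib/nats
```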

Step 5: Verify the Fix

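```bash
# Measure round-trip time to confirm clients can connect (requires the nats CLI)
nats rtt --server nats://localhost:4222
# Health check (assumes HTTP monitoring is enabled on port 8222)
curl -s http://localhost:8222/healthz
# Should report a status of "ok"
```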
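A server may need a few seconds after a restart before it answers health checks, so a single probe can give a false negative. The sketch below polls a check command until it succeeds or a deadline passes. `wait_until_ok` is a hypothetical helper name, and the check command is supplied by the caller.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a check command repeatedly until it succeeds
# or the timeout elapses. The check always runs at least once.
# Usage: wait_until_ok <timeout_seconds> <interval_seconds> <command...>
wait_until_ok() {
  local timeout_s="$1" interval_s="$2"
  shift 2
  local deadline=$(( SECONDS + timeout_s ))
  while true; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    if (( SECONDS >= deadline )); then
      return 1
    fi
    sleep "$interval_s"
  done
}
```

One possible invocation is `wait_until_ok 30 2 curl -sf http://localhost:8222/healthz`, which assumes NATS HTTP monitoring is enabled on port 8222; substitute whichever health check applies to your deployment. The `-f` flag makes curl exit non-zero on HTTP error responses, so the loop keeps waiting until the endpoint answers successfully.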

Common Pitfalls

  • Not monitoring consumer lag regularly
  • Using incorrect topic names in producers
  • Forgetting to close consumer connections
  • Not handling rebalance events properly

Best Practices

  • Set appropriate consumer timeouts
  • Use dead letter queues for failed messages
  • Monitor broker health continuously
  • Implement proper retry and backoff strategies
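The retry and backoff practice above can be sketched for surrounding shell scripts such as health checks or deploy hooks. Official NATS client libraries generally provide their own reconnect handling, so this is not a substitute for client configuration. `backoff_delay` and `retry_with_backoff` are hypothetical helper names, and the defaults (1 second base, 30 second cap) are illustrative assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical helpers illustrating capped exponential backoff.

# Print the delay in seconds before retry number $1 (1-based):
# base * 2^(attempt-1), capped at max.
backoff_delay() {
  local attempt="$1" base="${2:-1}" max="${3:-30}"
  # Guard against integer overflow for very large attempt numbers.
  if (( attempt > 20 )); then
    echo "$max"
    return 0
  fi
  local delay=$(( base * (1 << (attempt - 1)) ))
  if (( delay > max )); then
    delay="$max"
  fi
  echo "$delay"
}

# Run a command until it succeeds or max_attempts is reached.
# Usage: retry_with_backoff <max_attempts> <command...>
retry_with_backoff() {
  local max_attempts="$1"
  shift
  local attempt=1
  until "$@"; do
    if (( attempt >= max_attempts )); then
      return 1
    fi
    sleep "$(backoff_delay "$attempt")"
    attempt=$(( attempt + 1 ))
  done
}
```

With the defaults, successive delays are 1, 2, 4, 8, and 16 seconds, and every later retry waits 30 seconds. Capping the delay keeps recovery time bounded once the server is back, while the doubling avoids hammering a server that is still restarting.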