Introduction

Redis connection pool exhaustion occurs when all available connections to the Redis server are in use, preventing new connections from being established. Applications receive errors such as ERR maxclients reached, Connection refused, or connection timeouts. This indicates either that the Redis server's maxclients limit has been reached or that the client-side connection pool is misconfigured. Connection exhaustion causes application failures and caching-layer outages, and can cascade into full service unavailability.

Symptoms

  • Application logs show ERR maxclients reached or Connection pool exhausted
  • Redis CLI cannot connect: Could not connect to Redis: Connection refused
  • redis-cli info clients shows connected_clients at or near maxclients
  • New Redis operations time out waiting for an available connection
  • Application health checks fail due to Redis connectivity
  • Issue appears during traffic spikes, after deploy with connection leak, or after Redis restart

Common Causes

  • Redis maxclients limit too low for application concurrency
  • Client application connection leak (connections not returned to pool)
  • Client pool max_idle or max_active settings too low
  • Long-running Redis operations holding connections
  • Redis memory exhaustion causing connection rejections
  • Timeout settings disabled or too lenient, so stale connections never close
  • Multiple application instances sharing single Redis instance

Step-by-Step Fix

### 1. Check current Redis connection usage

Query Redis to understand connection state:

```bash
# Check client statistics
redis-cli INFO clients

# Output:
# # Clients
# connected_clients:95
# cluster_connections:0
# maxclients:10000
# client_recent_max_input_buffer:0
# client_recent_max_output_buffer:0
# blocked_clients:3
# tracking_clients:0
# clients_in_timeout_table:0

# Check all connected clients
redis-cli CLIENT LIST

# Count clients running common commands
redis-cli CLIENT LIST | grep -E "cmd=get|cmd=set|cmd=eval" | wc -l
```

Key metrics:

  • connected_clients: current active connections
  • maxclients: maximum allowed connections
  • blocked_clients: clients waiting on blocking operations (BLPOP, BRPOP)
  • Available connections = maxclients - connected_clients
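The headroom arithmetic above is easy to script; here is a minimal sketch in Python that parses INFO clients text offline (no live Redis needed; the sample values come from the output shown above):

```python
def parse_info_clients(info_text):
    """Parse 'key:value' lines from Redis INFO clients output into a dict."""
    stats = {}
    for line in info_text.splitlines():
        line = line.strip()
        if ':' in line and not line.startswith('#'):
            key, value = line.split(':', 1)
            stats[key] = int(value) if value.isdigit() else value
    return stats

def connection_headroom(stats):
    """Return (available_connections, usage_ratio) from parsed stats."""
    connected = stats['connected_clients']
    maxc = stats['maxclients']
    return maxc - connected, connected / maxc

# Example using the sample output above
sample = """# Clients
connected_clients:95
maxclients:10000
blocked_clients:3"""
available, ratio = connection_headroom(parse_info_clients(sample))
print(available, round(ratio, 4))  # 9905 0.0095
```

Feeding this the live output of `redis-cli INFO clients` gives the same numbers the dashboard thresholds later in this guide are based on.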

### 2. Identify connections by client

Find which applications are consuming connections:

```bash
# List all clients with details
redis-cli CLIENT LIST

# Output format:
# id=123 addr=192.168.1.10:45678 fd=8 name=myapp age=3600 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=26 qbuf-free=32742 obl=0 oll=0 omem=0 events=r cmd=get user=default

# Group clients by address, name, and idle time
# (field positions match the format above but shift between Redis versions)
redis-cli CLIENT LIST | awk -F'[ =]' '{print $4, $8, $12}' | sort | uniq -c | sort -rn

# Shows: count, address, name, idle seconds
```

Client type breakdown:

```bash
# Count by client name
redis-cli CLIENT LIST | grep -oP 'name=\K[^ ]+' | sort | uniq -c | sort -rn

# Count by address
redis-cli CLIENT LIST | grep -oP 'addr=\K[^ ]+' | cut -d: -f1 | sort | uniq -c | sort -rn

# Count by command
redis-cli CLIENT LIST | grep -oP 'cmd=\K[^ ]+' | sort | uniq -c | sort -rn
```

Look for:

  • A single application consuming more than 50% of connections
  • A high connection count from a single IP address
  • Idle connections (high idle value) not being released
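Because awk field positions are fragile across Redis versions, the same per-field counting can be done by parsing each line's key=value pairs instead. A minimal sketch in Python over the CLIENT LIST line format shown above (the sample lines are illustrative):

```python
from collections import Counter

def parse_client_line(line):
    """Parse one CLIENT LIST line of 'key=value' pairs into a dict."""
    return dict(pair.split('=', 1) for pair in line.split())

def count_by(lines, field):
    """Count clients grouped by a field such as 'addr', 'name', or 'cmd'."""
    return Counter(parse_client_line(l).get(field, '?') for l in lines if l.strip())

# Example over abbreviated CLIENT LIST lines
lines = [
    "id=123 addr=192.168.1.10:45678 name=myapp idle=0 cmd=get",
    "id=124 addr=192.168.1.10:45679 name=myapp idle=600 cmd=blpop",
    "id=125 addr=192.168.1.20:51000 name=worker idle=5 cmd=set",
]
print(count_by(lines, 'name'))  # Counter({'myapp': 2, 'worker': 1})
```

In practice you would feed it `redis_client.client_list()` output (redis-py already returns parsed dicts) or the raw text from `redis-cli CLIENT LIST`.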

### 3. Increase maxclients limit

If connections are legitimately needed, increase the limit:

```bash
# Check current maxclients
redis-cli CONFIG GET maxclients

# Set new limit (runtime only, lost on restart)
redis-cli CONFIG SET maxclients 20000

# Make permanent in redis.conf
# /etc/redis/redis.conf
maxclients 20000

# Restart Redis to apply the config file change
sudo systemctl restart redis

# Or for a versioned service name
sudo systemctl restart redis-6
```

System requirements for high maxclients:

```bash
# Check file descriptor limit
ulimit -n

# Redis needs one FD per client plus overhead:
# required FDs = maxclients + 32 (for internal use)

# Check system-wide limit
cat /proc/sys/fs/file-max

# Increase the per-user limit if needed
# /etc/security/limits.conf
redis soft nofile 65535
redis hard nofile 65535

# Or in the systemd unit
# /etc/systemd/system/redis.service.d/override.conf
[Service]
LimitNOFILE=65535

sudo systemctl daemon-reload
sudo systemctl restart redis
```

### 4. Check client-side connection pool configuration

Configure connection pool in application:

```python
# Python with redis-py
import redis
from redis import ConnectionPool

# WRONG: no explicit pool limits
client = redis.Redis(host='localhost', port=6379, db=0)
# Uses an implicit pool with no max_connections cap, so a traffic
# spike can open an unbounded number of connections

# CORRECT: use a bounded connection pool
pool = ConnectionPool(
    host='localhost',
    port=6379,
    db=0,
    max_connections=50,        # Max connections in pool
    decode_responses=True,
    socket_connect_timeout=5,
    socket_timeout=5,
)
client = redis.Redis(connection_pool=pool)
```

```python
# Python with asyncio (redis-py 4.2+; the old aioredis package was merged into it)
import redis.asyncio as aredis

pool = aredis.ConnectionPool.from_url(
    'redis://localhost:6379',
    max_connections=20,
    socket_connect_timeout=5,
    socket_timeout=5,
)
client = aredis.Redis(connection_pool=pool)
```

```java
// Java with Jedis
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.JedisPoolConfig;

JedisPoolConfig config = new JedisPoolConfig();
config.setMaxTotal(50);          // Max connections
config.setMaxIdle(25);           // Idle connections to keep
config.setMinIdle(5);            // Minimum idle connections
config.setMaxWaitMillis(5000);   // Wait time for a connection
config.setTestOnBorrow(true);    // Validate connection before use

JedisPool pool = new JedisPool(config, "localhost", 6379);

// Usage
try (Jedis jedis = pool.getResource()) {
    jedis.get("key");
}  // Connection automatically returned to pool
```

```javascript
// Node.js with ioredis
const Redis = require('ioredis');

const redis = new Redis({
  host: 'localhost',
  port: 6379,
  db: 0,
  maxRetriesPerRequest: 3,
  retryStrategy: (times) => {
    // Give up after 3 reconnect attempts, otherwise back off
    if (times > 3) return null;
    return Math.min(times * 100, 3000);
  },
  lazyConnect: true,  // Don't connect until first command
});

// Note: ioredis multiplexes commands over a single connection rather than
// keeping a pool; to bound concurrency, limit the number of Redis instances
// you create.
```

### 5. Fix connection leaks in application code

Ensure connections are returned to pool:

```python
# WRONG: connection not returned to pool
def wrong():
    client = redis.Redis()
    value = client.get('key')
    # Connection leaked if an exception occurs before close()
    return value

# CORRECT: use a context manager
def correct():
    with redis.Redis() as client:
        value = client.get('key')
    return value  # Connection returned to pool automatically

# CORRECT: use try/finally
def correct2():
    client = redis.Redis()
    try:
        return client.get('key')
    finally:
        client.close()
```

```java
// WRONG: connection not closed
public String wrong() {
    Jedis jedis = pool.getResource();
    String value = jedis.get("key");
    // Connection leaked if an exception occurs
    return value;
}

// CORRECT: use try-with-resources
public String correct() {
    try (Jedis jedis = pool.getResource()) {
        return jedis.get("key");
    }  // Connection automatically returned
}
```

### 6. Configure connection timeouts

Timeout settings prevent connections from being held indefinitely:

```bash
# Redis server timeout (close idle connections)
redis-cli CONFIG GET timeout

# Set timeout in seconds (0 = never time out)
redis-cli CONFIG SET timeout 300   # 5 minutes

# Make permanent in redis.conf
timeout 300
```

Client-side timeouts:

```python
# Python (redis-py)
pool = ConnectionPool(
    host='localhost',
    socket_connect_timeout=5,   # Connection establishment timeout
    socket_timeout=5,           # Per-operation timeout
    retry_on_timeout=True,      # Retry on timeout errors
)
```

```java
// Java (Jedis): connection and socket timeouts are passed to the pool
// constructor, not to JedisPoolConfig
int timeoutMillis = 5000;
JedisPool pool = new JedisPool(config, "localhost", 6379, timeoutMillis);
```

### 7. Identify and kill idle connections

Terminate idle connections to free up slots:

```bash
# Find idle connections (> 5 minutes idle): prints client id and idle seconds
redis-cli CLIENT LIST | awk -F'[ =]' '$12 > 300 {print $2, $12}'

# Kill client by ID
redis-cli CLIENT KILL ID 123

# Kill clients by address
redis-cli CLIENT KILL ADDR 192.168.1.10:45678

# Kill clients by type (normal, master, replica, pubsub)
redis-cli CLIENT KILL TYPE normal

# Kill clients connected for longer than 300 seconds (Redis 7.4+)
# (there is no idle-based KILL filter; use the script below for idle time)
redis-cli CLIENT KILL MAXAGE 300
```

Automated cleanup script:

```python
import redis

def cleanup_idle_connections(redis_client, idle_threshold=300):
    """Kill connections idle longer than threshold (seconds)."""
    clients = redis_client.client_list()
    killed = 0
    for client in clients:
        if int(client['idle']) > idle_threshold:
            try:
                redis_client.client_kill_filter(_id=client['id'])
                killed += 1
                print(f"Killed client ID {client['id']}, idle for {client['idle']}s")
            except redis.exceptions.RedisError as e:
                print(f"Failed to kill client {client['id']}: {e}")
    return killed

# Usage
client = redis.Redis()
killed = cleanup_idle_connections(client, idle_threshold=300)
print(f"Killed {killed} idle connections")
```

### 8. Check for blocked clients

Blocking operations can hold connections:

```bash
# Check blocked clients
redis-cli INFO clients | grep blocked

# List blocked clients (flag 'b' marks a client waiting in a blocking call)
redis-cli CLIENT LIST | grep 'flags=b'

# Count blocking commands in flight
redis-cli CLIENT LIST | grep -E "cmd=blpop|cmd=brpop|cmd=brpoplpush|cmd=xread" | wc -l
```

Blocking commands:

  • BLPOP, BRPOP: blocking list pop
  • BRPOPLPUSH: blocking list move
  • XREAD BLOCK: blocking stream read
  • WAIT: waiting for replica acknowledgment

Reduce blocking operation usage:

```python
# WRONG: indefinite blocking timeout
redis.blpop('queue', timeout=0)  # Blocks forever if the queue stays empty

# CORRECT: use a bounded timeout
value = redis.blpop('queue', timeout=30)  # 30-second timeout
if value is None:
    # Handle empty queue; the connection is released back to the pool
    pass
```

### 9. Implement connection pool monitoring

Set up monitoring for early detection:

```python
# Python: expose pool stats
# Note: _in_use_connections is a redis-py internal attribute (a set of
# checked-out connections) and may change between versions
import threading
from redis import ConnectionPool
from prometheus_client import Gauge

pool = ConnectionPool(host='localhost', port=6379, max_connections=50)

pool_connections = Gauge('redis_pool_connections', 'Active connections in pool')
pool_available = Gauge('redis_pool_available', 'Available connections in pool')

def monitor_pool(pool):
    in_use = len(pool._in_use_connections)
    pool_connections.set(in_use)
    pool_available.set(pool.max_connections - in_use)
    # Reschedule so the sample repeats every 10 seconds
    threading.Timer(10, monitor_pool, args=(pool,)).start()

monitor_pool(pool)
```

Redis-side monitoring:

```bash
# Export Redis metrics
redis-cli INFO clients | grep -E "connected_clients|maxclients"

# With redis_exporter for Prometheus
docker run --rm -p 9121:9121 oliver006/redis_exporter --redis.addr=localhost:6379

# Key metrics:
# redis_connected_clients
# redis_max_clients
# redis_client_recent_max_input_buffer
# redis_client_recent_max_output_buffer
```

Alert thresholds:

  • connected_clients / maxclients > 0.8: warning
  • connected_clients / maxclients > 0.9: critical
  • blocked_clients > 10: investigate blocking operations
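These thresholds translate directly into an alert rule; here is a minimal sketch in Python (threshold values and level names taken from the list above, the function name is illustrative):

```python
def connection_alert(connected_clients, maxclients, blocked_clients=0):
    """Map connection stats to an alert level using the thresholds above."""
    usage = connected_clients / maxclients
    if usage > 0.9:
        return 'critical'
    if usage > 0.8:
        return 'warning'
    if blocked_clients > 10:
        return 'investigate-blocking'
    return 'ok'

print(connection_alert(9500, 10000))  # critical
print(connection_alert(8500, 10000))  # warning
print(connection_alert(95, 10000))    # ok
```

The same logic is typically expressed as Prometheus alerting rules over redis_exporter metrics rather than in application code.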

### 10. Scale Redis for high connection counts

For very high connection requirements:

```bash
# Enable Redis Cluster for horizontal scaling
# redis.conf
cluster-enabled yes
cluster-config-file nodes.conf
cluster-node-timeout 5000

# Or use Redis Sentinel for HA
# sentinel.conf
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 60000
sentinel parallel-syncs mymaster 1
```

Connection pooling at proxy layer:

```
# Use a Redis proxy (Twemproxy, Codis, or Redis Cluster)
# The proxy maintains persistent connections to Redis;
# applications connect to the proxy with their own pools

# Twemproxy (nutcracker) configuration
# nutcracker.yml
cluster1:
  listen: 127.0.0.1:22122
  hash: fnv1a_64
  distribution: ketama
  timeout: 400
  redis: true
  servers:
    - 127.0.0.1:6379:1
```

Prevention

  • Set maxclients based on expected concurrency + headroom
  • Configure client-side connection pool with appropriate limits
  • Implement connection timeout to auto-release idle connections
  • Monitor connected_clients as leading indicator
  • Use connection pooling libraries, not raw connections
  • Implement retry logic with exponential backoff
  • Use Redis Cluster for horizontal scaling
Common Error Messages

  • **ERR maxclients reached**: Redis server connection limit hit
  • **Connection refused**: Redis not accepting connections (may be OOM or down)
  • **TimeoutError**: connection pool timed out waiting for an available connection
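The retry-with-exponential-backoff recommended in the prevention list can be sketched generically (plain Python, no Redis dependency; ConnectionError stands in for whatever pool-timeout exception your client library raises):

```python
import random
import time

def with_backoff(operation, max_attempts=5, base_delay=0.1, max_delay=3.0,
                 sleep=time.sleep):
    """Retry an operation with exponential backoff and jitter.

    When the pool is temporarily exhausted, backing off gives in-flight
    requests time to return their connections.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise
            delay = min(base_delay * (2 ** attempt), max_delay)
            sleep(delay + random.uniform(0, delay / 2))  # add jitter

# Example: an operation that fails twice, then succeeds
attempts = {'n': 0}
def flaky():
    attempts['n'] += 1
    if attempts['n'] < 3:
        raise ConnectionError("pool exhausted")
    return "value"

print(with_backoff(flaky, sleep=lambda _: None))  # value
```

The jitter prevents many clients from retrying in lockstep, which would otherwise reproduce the spike that exhausted the pool.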