Introduction

After a Redis server restart, Rails applications often encounter persistent Redis::CannotConnectError exceptions even after Redis is back online. The issue stems from stale connections in the connection pool that were established before the restart and never properly reconnected.

Symptoms

  • Redis::CannotConnectError: Error connecting to Redis on 127.0.0.1:6379
  • Errors continue after Redis is confirmed running
  • Sidekiq workers fail to process jobs
  • Rails cache reads return errors
  • Connection pool holds dead socket connections

Example error:

```
Redis::CannotConnectError: Error connecting to Redis on redis.internal:6379 (Errno::ECONNREFUSED)
  from /path/to/redis-4.8.1/lib/redis/client.rb:398:in `rescue in establish_connection'
  from /path/to/redis-4.8.1/lib/redis/client.rb:379:in `establish_connection'
  from /path/to/connection_pool-2.4.1/lib/connection_pool.rb:104:in `block in with'
```

Common Causes

  • Connection pool holds sockets connected to the old Redis instance
  • Sidekiq does not reconnect automatically after fork
  • Rails.cache holds a reference to a dead connection
  • TCP keepalive not configured, so dead connections are not detected quickly
  • Redis ACL authentication fails after password rotation during restart
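
To make the first cause concrete, here is a minimal sketch in plain Ruby with no Redis dependency. `FakeServer`, `FakeConnection`, and `TinyPool` are hypothetical stand-ins invented for illustration; they are not how the redis or connection_pool gems are implemented. The sketch only shows why a pool that reuses a pre-restart connection keeps failing, and how validating on checkout avoids that.

```ruby
# Hypothetical server whose restart invalidates every existing connection.
class FakeServer
  attr_reader :generation

  def initialize
    @generation = 0
  end

  def restart!
    @generation += 1
  end
end

# Hypothetical stand-in for a socket-backed Redis connection.
class FakeConnection
  def initialize(server)
    @server = server
    @generation = server.generation # remember which server instance we dialed
  end

  # Usable only against the server instance it was opened on.
  def ping
    raise IOError, "stale connection" unless @generation == @server.generation
    "PONG"
  end
end

# A deliberately tiny pool holding a single connection.
class TinyPool
  def initialize(&factory)
    @factory = factory
    @conn = nil
  end

  # Naive checkout: reuses whatever was created earlier, dead or not.
  def with
    @conn ||= @factory.call
    yield @conn
  end

  # Validating checkout: replaces the connection once if it fails a ping.
  def with_validation
    @conn ||= @factory.call
    begin
      @conn.ping
    rescue IOError
      @conn = @factory.call
    end
    yield @conn
  end
end

server = FakeServer.new
pool = TinyPool.new { FakeConnection.new(server) }

pool.with { |c| c.ping } # succeeds before the restart
server.restart!

naive_result =
  begin
    pool.with { |c| c.ping }
  rescue IOError => e
    e.message
  end

validated_result = pool.with_validation { |c| c.ping }
puts "naive: #{naive_result}, validated: #{validated_result}"
```

The naive checkout keeps raising after the simulated restart even though the server is reachable again, which mirrors the symptom described above; the validating checkout pays one extra round trip per checkout to detect and replace the dead connection.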

Step-by-Step Fix

  1. Check Redis connectivity from the application server:

```bash
redis-cli -h redis.internal -p 6379 -a "$REDIS_PASSWORD" ping
# Should return: PONG
```

  2. Restart the Rails application to clear connection pools:

```bash
# For Puma
kill -USR2 $(cat tmp/pids/server.pid) # Graceful restart

# For Capistrano
bundle exec cap production deploy:restart

# On Heroku
heroku restart
```

  3. Add automatic reconnection with retry logic:

```ruby
# config/initializers/redis.rb
require 'redis'

$redis = Redis.new(
  url: ENV['REDIS_URL'],
  password: ENV['REDIS_PASSWORD'],
  # Delay in seconds before each reconnect attempt (increasing backoff).
  # The array form needs redis gem 5.0+; on 4.x pass an Integer and tune
  # reconnect_delay / reconnect_delay_max instead.
  reconnect_attempts: [0.05, 0.1, 0.5, 1.0, 2.0, 5.0],
  connect_timeout: 5,
  read_timeout: 5,
  write_timeout: 5
)
```
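
To see what a list of reconnect delays amounts to, the following is a minimal sketch of retry-with-backoff in plain Ruby. The names `with_backoff`, `retry_on`, and `sleeper` are hypothetical and exist only for this illustration; the redis gem handles reconnection internally and its behavior may differ in detail.

```ruby
# Runs the block, retrying once per entry in `delays` and sleeping that many
# seconds before each retry. `retry_on` and `sleeper` are illustrative
# parameters for this sketch, not redis gem options.
def with_backoff(delays, retry_on: [IOError], sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    yield attempt
  rescue *retry_on
    delay = delays[attempt]
    raise if delay.nil? # out of retries: surface the last error
    sleeper.call(delay)
    attempt += 1
    retry
  end
end

# Simulate a server that refuses the first two attempts. The sleeper is
# replaced with a recorder so the example finishes instantly.
slept = []
reply = with_backoff([0.05, 0.1, 0.5], sleeper: ->(s) { slept << s }) do |attempt|
  raise IOError, "connection refused" if attempt < 2
  "PONG"
end

puts "reply=#{reply} after waiting #{slept.inspect}"
```

A list of delays gives N + 1 total attempts for N entries: the initial try plus one retry per delay. Once the list is exhausted the last error propagates, so callers still need to handle a prolonged outage.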

  4. Configure Sidekiq reconnection:

```ruby
# config/initializers/sidekiq.rb
Sidekiq.configure_server do |config|
  config.redis = {
    url: ENV['REDIS_URL'],
    password: ENV['REDIS_PASSWORD'],
    size: 10,
    network_timeout: 5
  }

  config.on(:startup) do
    # Warm up connections after restart
    Sidekiq.redis { |r| r.ping }
  end

  # Sidekiq has no :failure lifecycle event; register an error handler instead.
  # The optional third argument keeps the lambda callable from Sidekiq 6 and 7.
  config.error_handlers << ->(ex, _ctx, _cfg = nil) do
    if ex.is_a?(Redis::CannotConnectError)
      Rails.logger.warn "Redis connection lost, Sidekiq will auto-reconnect"
    end
  end
end
```

  5. Add a Redis health check endpoint:

```ruby
# config/routes.rb
get '/health/redis', to: 'health#redis'

# app/controllers/health_controller.rb
class HealthController < ApplicationController
  def redis
    result = $redis.ping
    if result == 'PONG'
      render json: { status: 'healthy' }, status: :ok
    else
      render json: { status: 'unhealthy', detail: result }, status: :service_unavailable
    end
  rescue Redis::BaseConnectionError => e
    # Covers CannotConnectError as well as timeouts and dropped connections
    render json: { status: 'unhealthy', detail: e.message }, status: :service_unavailable
  end
end
```
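
The decision logic of the health check can also be exercised without Rails or a live Redis. The sketch below is one way to do that, under stated assumptions: `redis_health`, `HealthyClient`, and `DeadClient` are hypothetical names, the client is anything that responds to `ping`, and the `latency_ms` field is an addition for illustration that the controller above does not return.

```ruby
# Framework-free version of the health check; `client` is anything with #ping.
def redis_health(client)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  reply = client.ping
  elapsed_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000).round(2)

  if reply == "PONG"
    { http_status: 200, body: { status: "healthy", latency_ms: elapsed_ms } }
  else
    { http_status: 503, body: { status: "unhealthy", detail: reply.to_s } }
  end
rescue StandardError => e
  # Any failure to reach Redis is reported to the caller, never raised.
  { http_status: 503, body: { status: "unhealthy", detail: e.message } }
end

# Hypothetical fakes standing in for a live and an unreachable Redis client.
HealthyClient = Struct.new(:reply) do
  def ping
    reply
  end
end

class DeadClient
  def ping
    raise IOError, "Error connecting to Redis"
  end
end

puts redis_health(HealthyClient.new("PONG")).inspect
puts redis_health(DeadClient.new).inspect
```

Keeping the check as a plain method makes it easy to unit test the three outcomes (healthy, unexpected reply, unreachable) before wiring it into a controller action.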

Prevention

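  • Use reconnect_attempts with increasing backoff delays (the array form requires redis gem 5.0+; on 4.x, combine an integer reconnect_attempts with reconnect_delay and reconnect_delay_max)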
  • Configure TCP keepalive: tcp-keepalive 60 in Redis config
  • Add health check endpoints that verify Redis connectivity
  • Monitor Redis connection count in your observability platform
  • Use Redis Sentinel or cluster for automatic failover
  • Test Redis restart scenarios in staging regularly
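
The tcp-keepalive setting listed above is applied on the Redis server. As a rough illustration of the same mechanism from the client side, the sketch below enables keepalive on a plain Ruby socket using only the standard library. It connects to a throwaway local listener rather than Redis; `enable_keepalive` is a hypothetical helper, the 60/10/3 values are assumptions, and whether your Redis client library exposes equivalent options is something to check in its own documentation.

```ruby
require "socket"

# Enable TCP keepalive on a socket. The idle/interval/count values are
# illustrative assumptions, not recommendations from the Redis documentation.
def enable_keepalive(sock, idle: 60, interval: 10, count: 3)
  sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_KEEPALIVE, true)
  # The fine-grained knobs are platform-specific, so guard on their presence.
  if Socket.const_defined?(:TCP_KEEPIDLE)
    sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_KEEPIDLE, idle)
    sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_KEEPINTVL, interval)
    sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_KEEPCNT, count)
  end
  sock
end

# Throwaway local listener so the example runs without a Redis server.
server = TCPServer.new("127.0.0.1", 0)
client = TCPSocket.new("127.0.0.1", server.addr[1])
enable_keepalive(client)

keepalive_on = client.getsockopt(Socket::SOL_SOCKET, Socket::SO_KEEPALIVE).bool
puts "keepalive enabled: #{keepalive_on}"

client.close
server.close
```

With keepalive on, the operating system probes idle connections and reports a dead peer as a socket error, which lets the application notice a vanished Redis instance sooner than waiting for the next command to time out.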