Fix Puma Phased Upgrade Timeout | Zero-Downtime Deployment

Introduction

Puma's phased restart (SIGUSR1, which is what pumactl phased-restart sends) is designed for zero-downtime deployments: the master process keeps its listening sockets open and replaces workers one at a time. However, old workers can fail to shut down if they have long-running requests, stuck threads, or database connections that do not close. The result is old and new code running side by side, memory usage that keeps climbing while the extra workers linger, and new code that never fully takes over.

Symptoms

pumactl phased-restart hangs and eventually times out
ps aux | grep puma shows workers from different app versions
Memory usage grows continuously during phased restart
New code changes not reflected after phased restart
Puma logs show Old worker 1234 did not terminate, sending SIGKILL

Check worker status:

```bash
# List all Puma processes
ps aux | grep puma

# Master process (example output)
# puma 5.6.7 (tcp://0.0.0.0:3000) [myapp]

# Workers from different times (example)
# Worker 1 (PID 1234) - started 2 hours ago (OLD)
# Worker 2 (PID 5678) - started 2 minutes ago (NEW)
```
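The manual check above can be automated by comparing worker ages. The following is a minimal sketch, assuming Linux procps `ps` (where `etimes` is elapsed seconds since start) and Puma's usual `puma: cluster worker` process title; the names `parse_puma_workers`, `stale_workers`, and the 300-second gap are illustrative choices, not part of Puma.

```ruby
# Assumption: Linux procps ps; `etimes` prints elapsed seconds since start.
# Run it with: require "open3"; output, = Open3.capture2(*PS_COMMAND)
PS_COMMAND = %w[ps -eo pid=,etimes=,args=].freeze

WorkerProcess = Struct.new(:pid, :age_seconds, :title)

# Turn raw ps output into WorkerProcess records, keeping only Puma workers.
def parse_puma_workers(ps_output)
  ps_output.each_line.filter_map do |line|
    pid, age, title = line.strip.split(/\s+/, 3)
    next unless title&.start_with?("puma: cluster worker")

    WorkerProcess.new(Integer(pid), Integer(age), title)
  end
end

# Workers much older than the newest worker probably survived a phased restart.
# max_age_gap (seconds) is a hypothetical threshold; tune it to your boot time.
def stale_workers(workers, max_age_gap: 300)
  return [] if workers.empty?

  newest_age = workers.map(&:age_seconds).min
  workers.select { |w| w.age_seconds - newest_age > max_age_gap }
end
```

Comparing against the newest worker rather than a fixed age avoids flagging every worker on a server that simply has not been deployed to for a while.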

Common Causes

Long-running requests (file uploads, report generation) blocking shutdown
Thread pool exhaustion preventing worker from finishing active requests
Database connections not released during worker shutdown hook
External HTTP calls with no timeout waiting indefinitely
worker_shutdown_timeout set too high, so old workers linger long after the restart begins
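For the external HTTP call cause, explicit timeouts keep a single slow upstream from pinning a worker past its shutdown grace period. Here is a minimal sketch using Ruby's standard Net::HTTP, whose open, read, and write timeouts each default to 60 seconds; the helper name `http_client_with_timeouts` and the chosen values are assumptions for illustration.

```ruby
require "net/http"
require "uri"

# Build a Net::HTTP client whose timeouts are well below the worker
# shutdown grace period. No connection is opened until a request is made.
def http_client_with_timeouts(url, open_timeout: 2, read_timeout: 5, write_timeout: 5)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = uri.scheme == "https"
  http.open_timeout = open_timeout   # seconds to establish the TCP connection
  http.read_timeout = read_timeout   # seconds to wait for each read
  http.write_timeout = write_timeout # seconds to wait for each write (Ruby 2.6+)
  http
end

# Usage (hypothetical URL):
#   client = http_client_with_timeouts("https://reports.example.com/export")
#   response = client.get("/export")
```

A request that exceeds these limits raises Net::OpenTimeout or Net::ReadTimeout, which the application can rescue and report instead of holding the thread open.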

Step-by-Step Fix

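1. Configure worker timeout and shutdown behavior:

```ruby
# config/puma.rb

# Seconds a worker may go without checking in with the master
# before it is treated as hung and restarted
worker_timeout 60

# Seconds a new worker is given to boot
worker_boot_timeout 30

# Grace period for workers to finish in-flight requests on shutdown,
# including old workers during a phased restart.
# After this, old workers are force-killed.
worker_shutdown_timeout 20

# Puma has no built-in memory limit setting. Pruning workers by memory
# needs a separate tool (for example the puma_worker_killer gem).
```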

2. Add proper shutdown hooks for cleanup:

```ruby
# config/puma.rb
on_worker_shutdown do
  # Close database connections
  ActiveRecord::Base.connection_pool.disconnect!

  # Background job processors such as Sidekiq normally run as separate
  # processes and are stopped by their own supervisor, not from this hook.

  # Close Redis connections
  if Rails.cache.respond_to?(:redis)
    redis = Rails.cache.redis
    redis.close if redis.respond_to?(:close)
  end

  # Flush any pending log writes
  Rails.logger.flush if Rails.logger.respond_to?(:flush)
end

on_worker_boot do
  # Reconnect database for new worker
  ActiveRecord::Base.establish_connection

  # Reconnect Redis
  Rails.cache.reconnect if Rails.cache.respond_to?(:reconnect)
end
```

3. Use a hot restart instead of a phased restart for a full reload:

```bash
# Phased restart (SIGUSR1): restarts workers one at a time (may leave old workers)
pumactl phased-restart

# Hot restart (SIGUSR2): re-executes the master and replaces all workers at once
# (requests wait briefly while the new process boots)
pumactl restart

# For deployments where code changed significantly, use a hot restart.
# A phased restart does not reload the master process, so it cannot pick up
# changes the master has already loaded (for example a Puma upgrade), and it
# is unavailable when preload_app! is enabled.
```
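Both restart modes are ultimately signals sent to the Puma master: SIGUSR1 for a phased restart and SIGUSR2 for a hot restart. When pumactl is not available, the signal can be sent directly. This is a minimal sketch, assuming the master PID is written to a pidfile; the path `tmp/pids/puma.pid` and the helper names are hypothetical and should match the `pidfile` setting in your own config.

```ruby
# Hypothetical pidfile location; align it with `pidfile` in config/puma.rb.
PIDFILE = "tmp/pids/puma.pid"

# Read the master PID from the pidfile. Raises if the file is missing
# (Errno::ENOENT) or does not contain an integer (ArgumentError).
def master_pid(pidfile)
  Integer(File.read(pidfile).strip)
end

# Ask the Puma master for a phased restart by sending SIGUSR1.
# Returns the PID that was signalled.
def request_phased_restart(pidfile)
  pid = master_pid(pidfile)
  Process.kill("USR1", pid)
  pid
end

# Usage:
#   request_phased_restart(PIDFILE)
```

Sending the signal only requests the restart. It returns immediately and says nothing about whether the old workers actually exited, so it still needs a follow-up check.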

4. Force kill stuck old workers:

```bash
# Find old workers
ps aux | grep "puma: cluster worker"

# Send SIGTERM to a specific old worker
kill -SIGTERM <old_worker_pid>

# If still running after worker_shutdown_timeout, force kill
kill -SIGKILL <old_worker_pid>

# Or use pumactl to check status
pumactl -F config/puma.rb stats
```
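The TERM-then-KILL sequence above can be scripted so the grace period is enforced consistently. The following is a sketch under stated assumptions: `stop_worker` and its helpers are illustrative names, the 20-second default mirrors the worker_shutdown_timeout used earlier, and the script must run as a user allowed to signal the worker (otherwise Process.kill raises Errno::EPERM).

```ruby
def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# Signal 0 checks for existence without delivering a signal.
def process_alive?(pid)
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false
end

# Send SIGTERM, wait up to grace_seconds for the process to exit,
# then fall back to SIGKILL. Returns :terminated, :killed, or
# :not_running (no such process at the moment a signal was sent).
def stop_worker(pid, grace_seconds: 20, poll_interval: 0.5)
  Process.kill("TERM", pid)
  deadline = monotonic_now + grace_seconds
  while monotonic_now < deadline
    return :terminated unless process_alive?(pid)

    sleep poll_interval
  end
  return :terminated unless process_alive?(pid)

  Process.kill("KILL", pid)
  :killed
rescue Errno::ESRCH
  :not_running
end

# Usage (PID taken from the ps output above):
#   stop_worker(1234, grace_seconds: 20)
```

An exited worker remains visible to signal 0 until its parent reaps it, so the wait can run slightly longer than the worker actually took to stop.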

5. Add a deployment script with a phased restart fallback:

```bash
#!/bin/bash
# deploy.sh

echo "Deploying new release..."
cd /var/www/myapp/current || exit 1

# Try phased restart first (zero downtime)
echo "Attempting phased restart..."
if bundle exec pumactl -F config/puma.rb phased-restart 2>/dev/null; then
  echo "Phased restart successful"
else
  echo "Phased restart failed, falling back to hot restart"
  bundle exec pumactl -F config/puma.rb restart

  # Wait and verify
  sleep 5
  worker_count=$(ps aux | grep "puma: cluster worker" | grep -v grep | wc -l)
  if [ "$worker_count" -lt 2 ]; then
    echo "WARNING: Not enough workers running. Starting Puma."
    # Puma 5 removed the -d (daemonize) flag, so background the process here
    # or, preferably, let a process supervisor such as systemd start it.
    nohup bundle exec puma -C config/puma.rb &
  fi
fi
```
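A successful exit from pumactl only means the restart was requested, so a deploy script benefits from polling until every worker has moved to the new phase. Below is a minimal sketch, assuming cluster-mode stats JSON with `phase`, `old_workers`, and a `worker_status` array whose entries carry `booted` and `phase` (field names as emitted by recent Puma versions; verify against your own output). `fetch_stats` is a hypothetical callable that returns that JSON string, for example by querying the control server.

```ruby
require "json"

# True when no old workers remain and every worker is booted
# on the same phase as the master.
def restart_complete?(stats)
  workers = stats.fetch("worker_status", [])
  return false if workers.empty?

  stats.fetch("old_workers", 0).zero? &&
    workers.all? { |w| w["booted"] && w["phase"] == stats["phase"] }
end

# Poll fetch_stats until the restart completes or the timeout expires.
# Returns true on completion, false on timeout.
def wait_for_restart(fetch_stats, timeout: 120, interval: 2)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if restart_complete?(JSON.parse(fetch_stats.call))
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep interval
  end
end
```

Passing the stats source in as a callable keeps the polling logic independent of how the control server is reached, and makes it easy to exercise with canned JSON.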

Prevention

Set worker_shutdown_timeout to a reasonable value (15-30 seconds)
Add on_worker_shutdown hooks to release all resources
Monitor worker memory and PID ages to detect stuck workers
Use pumactl stats in health checks to verify worker count
Configure tag in puma.rb to identify worker app version
Prefer container-based deployments (Docker) with rolling restart over phased restart
