Introduction

Gunicorn monitors worker health with a timeout mechanism. If a worker does not respond within the configured timeout (default 30 seconds), the master process sends SIGKILL to the worker and spawns a replacement. Any in-flight request is aborted, and clients see connection reset errors. The timeout exists to recover hung workers, but legitimate long-running requests (file uploads, report generation, slow database queries) can trigger false-positive kills. The key is distinguishing genuinely hung workers from requests that simply need more time.
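To make the mechanism concrete, here is a simplified model of the master's health check (illustrative only; in reality each Gunicorn worker periodically touches a per-worker temp file and the master compares its age against the timeout):

```python
import time

def find_timed_out(heartbeats, timeout, now=None):
    """Return pids of workers whose last heartbeat is older than `timeout`.

    `heartbeats` maps worker pid -> monotonic timestamp of its last
    notification to the master.
    """
    now = time.monotonic() if now is None else now
    return [pid for pid, last in heartbeats.items() if now - last > timeout]

# A worker stuck in a long request never updates its heartbeat, so from the
# master's point of view it is indistinguishable from a hung worker - which
# is exactly why legitimate slow requests get killed.
heartbeats = {12345: 0.0, 12346: 95.0}
stuck = find_timed_out(heartbeats, timeout=30, now=100.0)  # -> [12345]
```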

Symptoms

Gunicorn master logs:

```bash
[CRITICAL] WORKER TIMEOUT (pid:12345)
[WARNING] Worker with pid 12345 was signaled via SIGKILL
[INFO] Booting worker with pid: 12346
```

Client-side:

```bash
curl: (52) Empty reply from server
# OR
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
```

Application logs show abrupt termination:

```bash
[2026-04-09 10:00:30 +0000] [12345] [INFO] Starting worker
# No shutdown log - worker was killed mid-request
```

Common Causes

  • Default 30s timeout too short: Report generation or data exports take minutes
  • Blocking I/O in sync workers: Sync workers blocked on slow database or API calls
  • Deadlock in application code: Thread deadlock, database lock contention
  • Memory leak causing GC pauses: Large heap causes long garbage collection pauses
  • External service timeout not configured: Upstream service hangs without timeout
  • Too many workers causing contention: Workers competing for CPU or database connections

Step-by-Step Fix

Step 1: Tune timeout and worker configuration

```python
# gunicorn.conf.py
import multiprocessing

# Timeout - increase for long-running requests
timeout = 120            # 2 minutes instead of 30 seconds
graceful_timeout = 60    # Time to wait for in-flight requests on shutdown
keepalive = 5            # Keep connections alive for 5 seconds

# Worker configuration
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = 'gthread'  # Threaded workers for I/O-bound workloads
threads = 4               # 4 threads per worker
max_requests = 1000       # Restart worker after 1000 requests (prevent memory leaks)
max_requests_jitter = 50  # Randomize restarts to avoid all workers restarting at once
```

Step 2: Use async workers for I/O bound applications

```python
# gunicorn.conf.py for async workloads
worker_class = 'uvicorn.workers.UvicornWorker'  # For async/Starlette/FastAPI
timeout = 120

# Or use gevent for traditional WSGI with async I/O
# pip install gevent
# worker_class = 'gevent'
# worker_connections = 1000
```
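Switching the worker class only helps if handler code actually awaits: a blocking call inside an async endpoint still stalls the event loop for every request on that worker. A minimal, framework-free sketch (function names are illustrative) of offloading the blocking call:

```python
import asyncio
import time

def slow_query():
    """Stand-in for a blocking database or API call."""
    time.sleep(0.1)
    return "rows"

async def handler():
    # Run the blocking call in a thread so the event loop stays free;
    # other requests on this worker can be served while it waits.
    return await asyncio.to_thread(slow_query)

result = asyncio.run(handler())  # -> "rows"
```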

Step 3: Add request-level timeout handling

```python
from functools import wraps
import signal

class RequestTimeout(Exception):
    pass

def timeout_handler(signum, frame):
    raise RequestTimeout("Request exceeded time limit")

def request_timeout(seconds=60):
    """Decorator that limits individual request execution time.

    Note: SIGALRM is delivered only to the main thread, so this works
    with sync workers but not with threaded (gthread) workers.
    """
    def decorator(f):
        @wraps(f)
        def wrapper(*args, **kwargs):
            # Set the signal handler and a time limit
            old_handler = signal.signal(signal.SIGALRM, timeout_handler)
            signal.alarm(seconds)
            try:
                return f(*args, **kwargs)
            except RequestTimeout:
                return {"error": "Request timeout"}, 504
            finally:
                signal.alarm(0)  # Cancel the alarm
                signal.signal(signal.SIGALRM, old_handler)
        return wrapper
    return decorator

# Usage
@app.route('/export')
@request_timeout(seconds=120)
def export_data():
    return generate_large_report()
```

Prevention

  • Set timeout based on your slowest legitimate request, measured in production
  • Use async or threaded workers for I/O-bound applications
  • Implement request-level timeouts that return proper error responses instead of SIGKILL
  • Add max_requests to periodically recycle workers and prevent memory leaks
  • Monitor Gunicorn worker age and request duration with Prometheus metrics
  • Use a reverse proxy (nginx) with its own timeout as a second line of defense
  • Add structured logging to track which endpoints are approaching timeout thresholds
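The last two bullets can be combined into a small helper. The sketch below is hypothetical (names and thresholds are mine, not a Gunicorn API): it warns about any request that consumes most of the worker's timeout budget, called from an after-request hook with a duration measured via `time.monotonic()`.

```python
import logging

logger = logging.getLogger("slow_requests")

def check_duration(endpoint, duration, timeout=120, warn_fraction=0.8):
    """Warn when a request uses most of the worker timeout budget.

    `timeout` should match Gunicorn's `timeout` setting. Returns True
    when a warning was emitted, so callers can also bump a metric.
    """
    if duration > timeout * warn_fraction:
        logger.warning("slow request: endpoint=%s duration=%.1fs timeout=%ds",
                       endpoint, duration, timeout)
        return True
    return False
```

Endpoints that trip this warning repeatedly are the ones to move to a background job queue before they start tripping the worker timeout itself.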