Introduction

Gunicorn monitors worker health with a timeout mechanism. If a worker does not respond within the configured timeout (default 30 seconds), the master process sends SIGKILL to the worker and spawns a replacement. Any in-flight request is aborted, and clients see connection reset errors. The timeout exists to recover hung workers, but legitimate long-running requests (file uploads, report generation, slow database queries) can trigger false-positive kills. The key is distinguishing genuinely hung workers from requests that simply need more time.
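To make the mechanism concrete, here is a simplified model of the master's health check (illustrative only; in reality each Gunicorn worker periodically touches a per-worker temp file and the master compares its age against the timeout):

```python
import time

def find_timed_out(heartbeats, timeout, now=None):
    """Return pids of workers whose last heartbeat is older than `timeout`.

    `heartbeats` maps worker pid -> monotonic timestamp of its last
    notification to the master.
    """
    now = time.monotonic() if now is None else now
    return [pid for pid, last in heartbeats.items() if now - last > timeout]

# A worker stuck in a long request never updates its heartbeat, so from the
# master's point of view it is indistinguishable from a hung worker - which
# is exactly why legitimate slow requests get killed.
heartbeats = {12345: 0.0, 12346: 95.0}
stuck = find_timed_out(heartbeats, timeout=30, now=100.0)  # -> [12345]
```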

Symptoms

Gunicorn master logs:

```bash
[CRITICAL] WORKER TIMEOUT (pid:12345)
[WARNING] Worker with pid 12345 was signaled via SIGKILL
[INFO] Booting worker with pid: 12346
```

Client-side:

```bash
curl: (52) Empty reply from server
# OR
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
```

Application logs show abrupt termination:

```bash
[2026-04-09 10:00:30 +0000] [12345] [INFO] Starting worker
# No shutdown log - worker was killed mid-request
```

Common Causes

  • Default 30s timeout too short: Report generation or data exports take minutes
  • Blocking I/O in sync workers: Sync workers blocked on slow database or API calls
  • Deadlock in application code: Thread deadlock, database lock contention
  • Memory leak causing GC pauses: Large heap causes long garbage collection pauses
  • External service timeout not configured: Upstream service hangs without timeout
  • Too many workers causing contention: Workers competing for CPU or database connections

Step-by-Step Fix

Step 1: Tune timeout and worker configuration

```python
# gunicorn.conf.py
import multiprocessing

# Timeout - increase for long-running requests
timeout = 120            # 2 minutes instead of 30 seconds
graceful_timeout = 60    # Time to wait for in-flight requests on shutdown
keepalive = 5            # Keep connections alive for 5 seconds

# Worker configuration
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = 'gthread'  # Threaded workers for I/O-bound workloads
threads = 4               # 4 threads per worker
max_requests = 1000       # Restart worker after 1000 requests (prevent memory leaks)
max_requests_jitter = 50  # Randomize restarts to avoid all workers restarting at once
```

Step 2: Use async workers for I/O bound applications

```python
# gunicorn.conf.py for async workloads
worker_class = 'uvicorn.workers.UvicornWorker'  # For async/Starlette/FastAPI
timeout = 120

# Or use gevent for traditional WSGI with async I/O
# pip install gevent
# worker_class = 'gevent'
# worker_connections = 1000
```
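Switching the worker class only helps if handler code actually awaits: a blocking call inside an async endpoint still stalls the event loop for every request on that worker. A minimal, framework-free sketch (function names are illustrative) of offloading the blocking call:

```python
import asyncio
import time

def slow_query():
    """Stand-in for a blocking database or API call."""
    time.sleep(0.1)
    return "rows"

async def handler():
    # Run the blocking call in a thread so the event loop stays free;
    # other requests on this worker can be served while it waits.
    return await asyncio.to_thread(slow_query)

result = asyncio.run(handler())  # -> "rows"
```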

Step 3: Add request-level timeout handling

```python
from functools import wraps
import signal

class RequestTimeout(Exception):
    pass

def timeout_handler(signum, frame):
    raise RequestTimeout("Request exceeded time limit")

def request_timeout(seconds=60):
    """Decorator that limits individual request execution time.

    Note: SIGALRM is delivered only to the main thread, so this works
    with sync workers but not with threaded (gthread) workers.
    """
    def decorator(f):
        @wraps(f)
        def wrapper(*args, **kwargs):
            # Set the signal handler and a time limit
            old_handler = signal.signal(signal.SIGALRM, timeout_handler)
            signal.alarm(seconds)
            try:
                return f(*args, **kwargs)
            except RequestTimeout:
                return {"error": "Request timeout"}, 504
            finally:
                signal.alarm(0)  # Cancel the alarm
                signal.signal(signal.SIGALRM, old_handler)
        return wrapper
    return decorator

# Usage
@app.route('/export')
@request_timeout(seconds=120)
def export_data():
    return generate_large_report()
```

Prevention

  • Set timeout based on your slowest legitimate request, measured in production
  • Use async or threaded workers for I/O-bound applications
  • Implement request-level timeouts that return proper error responses instead of SIGKILL
  • Add max_requests to periodically recycle workers and prevent memory leaks
  • Monitor Gunicorn worker age and request duration with Prometheus metrics
  • Use a reverse proxy (nginx) with its own timeout as a second line of defense
  • Add structured logging to track which endpoints are approaching timeout thresholds
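The last two bullets can be combined into a small helper. The sketch below is hypothetical (names and thresholds are mine, not a Gunicorn API): it warns about any request that consumes most of the worker's timeout budget, called from an after-request hook with a duration measured via `time.monotonic()`.

```python
import logging

logger = logging.getLogger("slow_requests")

def check_duration(endpoint, duration, timeout=120, warn_fraction=0.8):
    """Warn when a request uses most of the worker timeout budget.

    `timeout` should match Gunicorn's `timeout` setting. Returns True
    when a warning was emitted, so callers can also bump a metric.
    """
    if duration > timeout * warn_fraction:
        logger.warning("slow request: endpoint=%s duration=%.1fs timeout=%ds",
                       endpoint, duration, timeout)
        return True
    return False
```

Endpoints that trip this warning repeatedly are the ones to move to a background job queue before they start tripping the worker timeout itself.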