Introduction
An HTTP 503 Service Unavailable error from a reverse proxy (Nginx, HAProxy, Apache) indicates that the proxy could not get a response from the upstream application server (Gunicorn, uWSGI, Node.js, Tomcat) within the configured timeout period. The application server may be overloaded, stuck in a deadlock, performing a long-running operation, or completely unresponsive. Unlike a 502 Bad Gateway, where the upstream refuses the connection or returns an invalid response, a 503 caused by an upstream timeout means the proxy reached the backend but the backend did not respond in time.
Symptoms
- Browser shows `503 Service Unavailable`
- Nginx error log shows `upstream timed out (110: Connection timed out)`
- HAProxy logs show the termination state `sC` (server-side timeout while waiting for the connection to the backend to complete)
- Application server process is running but not responding to requests
- Site works for some endpoints but times out on specific slow endpoints
Common Causes
- Application server worker threads all blocked on slow database queries
- Deadlock in application code causing requests to hang indefinitely
- Reverse proxy timeout too short for legitimately slow operations
- Garbage collection pause in JVM/Go runtime freezing all request processing
- Application server process count insufficient for current traffic
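The last cause lends itself to a back-of-envelope check: with a fixed pool of synchronous workers, a queued request waits roughly one full handling time per "round" of the pool ahead of it. A rough sketch (the worker count, handling time, queue depth, and timeout below are illustrative, not measured values):

```python
# Rough estimate of how long a newly queued request waits when all
# sync workers are busy: with W workers each holding a request for
# T seconds, a request behind `queue_depth` others waits about
# ceil(queue_depth / W) * T seconds. If that exceeds the proxy
# timeout, the proxy gives up and returns an error to the client.
import math

def estimated_wait(workers: int, handle_secs: float, queue_depth: int) -> float:
    """Approximate queueing delay before a worker frees up."""
    return math.ceil(queue_depth / workers) * handle_secs

# Example: 4 workers, 5 s per request, 20 requests already queued,
# against a 30 s proxy timeout.
wait = estimated_wait(workers=4, handle_secs=5.0, queue_depth=20)
print(wait, wait > 30)  # 25.0 False -> just under a 30 s timeout
```

This ignores request arrival during the wait, so real queues behave worse; it is only meant to show how quickly a small pool saturates.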
Step-by-Step Fix
1. Check the reverse proxy error logs:

   ```bash
   # Nginx
   sudo tail -50 /var/log/nginx/error.log | grep "upstream timed out"

   # HAProxy (termination states are case-sensitive)
   sudo tail -50 /var/log/haproxy.log | grep "sC"
   ```

2. Check if the application server is responsive:

   ```bash
   # Direct request to the application server (bypassing the proxy)
   curl -v http://localhost:8080/health
   # If this also times out, the application server is the problem
   ```

3. Check application server worker status:

   ```bash
   # Gunicorn
   sudo systemctl status gunicorn
   ps aux | grep gunicorn

   # Node.js (PM2)
   pm2 status
   pm2 monit

   # Check for stuck threads
   ps -T -p $(pgrep -f "gunicorn|node|java") -o pid,tid,stat,wchan
   ```

4. Restart the application server:

   ```bash
   sudo systemctl restart gunicorn
   # Or for PM2:
   pm2 reload all
   # Or for a service managed directly by systemd:
   sudo systemctl restart myapp
   ```

5. Increase the upstream timeout if the application legitimately needs more time:

   ```nginx
   # Nginx
   proxy_read_timeout 120s;
   proxy_connect_timeout 30s;
   proxy_send_timeout 30s;
   ```

6. Check for slow database queries causing the timeout:

   ```bash
   # PostgreSQL
   sudo -u postgres psql -c "SELECT pid, now() - pg_stat_activity.query_start AS duration, query FROM pg_stat_activity WHERE state != 'idle' ORDER BY duration DESC LIMIT 10;"

   # MySQL
   mysql -e "SHOW FULL PROCESSLIST;" | sort -k6 -n -r | head -10
   ```
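When the error log from step 1 is large, it helps to tally the timeouts per upstream to see which backend is responsible. A small sketch in Python (the sample log lines are made up but follow the usual shape of Nginx proxy errors, which include an `upstream: "..."` field; exact layout can vary by version and configuration):

```python
# Tally "upstream timed out" errors per upstream address from Nginx
# error-log lines, so the worst backend stands out.
import re
from collections import Counter

PATTERN = re.compile(r'upstream timed out .*?upstream: "(?P<upstream>[^"]+)"')

def count_timeouts(lines):
    counts = Counter()
    for line in lines:
        m = PATTERN.search(line)
        if m:
            counts[m.group("upstream")] += 1
    return counts

# Illustrative sample lines (not real log output):
sample = [
    '2024/01/15 10:00:01 [error] 123#0: *45 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 10.0.0.5, server: example.com, request: "GET /report HTTP/1.1", upstream: "http://127.0.0.1:8080/report"',
    '2024/01/15 10:00:09 [error] 123#0: *46 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 10.0.0.6, server: example.com, request: "GET /report HTTP/1.1", upstream: "http://127.0.0.1:8080/report"',
]
print(count_timeouts(sample))  # Counter({'http://127.0.0.1:8080/report': 2})
```

In practice the lines would come from reading the error log file rather than a hard-coded list.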
Prevention
- Implement health check endpoints and monitor application server responsiveness
- Set proxy timeouts based on the 99th percentile of response times, not the average
- Configure circuit breakers in the application to fail fast instead of hanging
- Use connection pooling to prevent database connection exhaustion
- Monitor upstream response times and alert when approaching proxy timeout limits
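The timeout-sizing advice above can be made concrete with a nearest-rank percentile over observed response times. A minimal sketch (the sample latencies are invented for illustration):

```python
# Nearest-rank percentile of observed response times (seconds),
# for choosing a proxy_read_timeout that covers the slow tail
# rather than just the average request.
import math

def percentile(values, pct):
    """Nearest-rank percentile; pct in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]

# 100 samples: most requests finish in ~100 ms, a few are slow.
samples = [0.1] * 95 + [0.8, 1.2, 2.5, 4.0, 9.0]
p99 = percentile(samples, 99)
print(p99)  # 4.0 -> size the proxy timeout above this, with headroom
```

Note how the mean of these samples is far below the p99; a timeout sized from the average would cut off legitimate slow requests.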