Introduction

A 504 Gateway Timeout error from Nginx indicates the load balancer successfully connected to a backend server but did not receive a response within the configured timeout period. This differs from a 502 error (invalid response) or 503 error (service unavailable). The timeout can occur at different stages: connection establishment, header reading, or body reading.

Symptoms

Error messages in Nginx error log:

```bash
upstream timed out (110: Connection timed out) while reading response header from upstream
upstream timed out (110: Connection timed out) while connecting to upstream
upstream timed out (110: Connection timed out) while reading upstream
recv() failed (104: Connection reset by peer) while reading response header
```
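Each timeout message names the stage at which the timer fired, and each stage is bounded by a different proxy timeout directive. A minimal triage sketch (the `classify` function is illustrative; pipe in real lines with `grep 'timed out' /var/log/nginx/error.log`):

```shell
# Classify a timeout log line by the stage it failed in -- each stage
# is governed by a different Nginx proxy timeout directive.
classify() {
    case "$1" in
        *"while connecting to upstream"*)  echo "connect (proxy_connect_timeout)" ;;
        *"while reading response header"*) echo "header (proxy_read_timeout)" ;;
        *"while reading upstream"*)        echo "body (proxy_read_timeout)" ;;
        *)                                 echo "other" ;;
    esac
}

classify "upstream timed out (110: Connection timed out) while connecting to upstream"
classify "upstream timed out (110: Connection timed out) while reading response header from upstream"
```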

Client-facing symptoms:

  • HTTP 504 Gateway Timeout
  • Requests taking exactly the timeout duration before failing
  • Intermittent timeouts for slow endpoints
  • Timeouts during high load or for specific operations

Common Causes

  1. Backend processing too slow - Application takes longer than the timeout to respond
  2. Timeout values too aggressive - Configured timeout shorter than needed
  3. Backend resource exhaustion - Server overloaded, slow to respond
  4. Network latency - High latency between Nginx and backend
  5. Backend hanging - Application deadlocked or stuck
  6. Large file uploads/downloads - Body timeout exceeded for large transfers
  7. Keepalive connections timing out - Idle connections in pool expired

Step-by-Step Fix

Step 1: Identify Timeout Location

```bash
# Check Nginx error logs
tail -100 /var/log/nginx/error.log | grep -i timeout

# Look for specific timeout patterns
grep "timed out" /var/log/nginx/error.log | tail -50

# Check for upstream errors
grep "upstream" /var/log/nginx/error.log | tail -50
```
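When timeouts cluster on one backend, the error log's `upstream: "..."` field tells you which. A rough tally, sketched here with inlined sample lines (in practice, replace the here-doc with `grep 'timed out' /var/log/nginx/error.log`):

```shell
# Count timeouts per upstream address; the worst offender sorts first.
tally=$(awk 'match($0, /upstream: "[^"]+"/) {
    # strip the leading  upstream: "  (11 chars) and the closing quote
    print substr($0, RSTART + 11, RLENGTH - 12)
}' <<'EOF' | sort | uniq -c | sort -rn
2026/01/01 10:00:01 [error] 7#7: *1 upstream timed out, upstream: "http://10.0.0.1:8080/api/slow"
2026/01/01 10:00:02 [error] 7#7: *2 upstream timed out, upstream: "http://10.0.0.1:8080/api/slow"
2026/01/01 10:00:03 [error] 7#7: *3 upstream timed out, upstream: "http://10.0.0.2:8080/api/report"
EOF
)
printf '%s\n' "$tally"
```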

Step 2: Check Current Timeout Configuration

```bash
# Find timeout settings
nginx -T 2>/dev/null | grep -E "timeout|proxy_read|proxy_send|proxy_connect"

# Or check specific config file
grep -r "timeout" /etc/nginx/
```
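To see just the effective timeout values without the rest of the dump, a small awk filter works. Sketched here against an inlined sample config rather than a live `nginx -T` dump:

```shell
# Extract only the proxy timeout directives from a config dump.
# Against a live instance, feed it:  nginx -T 2>/dev/null
timeouts=$(awk '$1 ~ /^proxy_(connect|read|send)_timeout$/ {
    gsub(/;/, "", $2)           # drop the trailing semicolon
    print $1 "=" $2
}' <<'EOF'
http {
    proxy_connect_timeout 60s;
    proxy_read_timeout 300s;
    proxy_send_timeout 300s;
}
EOF
)
printf '%s\n' "$timeouts"
```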

Step 3: Measure Backend Response Times

```bash
# Test backend directly
curl -w "Connect: %{time_connect}s\nTTFB: %{time_starttransfer}s\nTotal: %{time_total}s\n" \
  http://backend-server:8080/api/slow-endpoint

# Check backend health and load
ssh backend-server 'top -bn1 | head -20'
ssh backend-server 'free -m'
ssh backend-server 'iostat -x 1 5'
```
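Once you have a set of `time_total` samples, pick a timeout with headroom over the worst observed case rather than matching it exactly. A sketch using inlined sample timings and a 2x headroom factor (both assumptions; tune to your traffic):

```shell
# Suggest a proxy_read_timeout from response-time samples (seconds,
# one per line): take the worst sample, double it, round up.
# Collect real samples with e.g.:
#   for i in $(seq 20); do curl -o /dev/null -s -w '%{time_total}\n' "$URL"; done
suggested=$(sort -n <<'EOF' | tail -1 | awk '{ printf "%d", $1 * 2 + 0.999 }'
1.8
12.4
3.1
7.9
EOF
)
echo "proxy_read_timeout ${suggested}s;"
```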

Step 4: Adjust Timeout Configuration

```nginx
http {
    # Connection establishment timeout
    proxy_connect_timeout 60s;

    # Time to wait for response headers
    proxy_read_timeout 60s;

    # Time to wait when sending request to backend
    proxy_send_timeout 60s;

    # For slow backends, increase these values instead (a directive
    # may appear only once per context, so replace, don't repeat):
    # proxy_connect_timeout 120s;
    # proxy_read_timeout 300s;
    # proxy_send_timeout 300s;

    upstream backend_servers {
        server 10.0.0.1:8080;
        server 10.0.0.2:8080;

        # Keepalive connections
        keepalive 32;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend_servers;
            proxy_http_version 1.1;
            proxy_set_header Connection "";

            # Override for specific slow endpoints
            proxy_read_timeout 600s;
        }

        # Fast timeouts for health checks
        location /health {
            proxy_pass http://backend_servers/health;
            proxy_connect_timeout 5s;
            proxy_read_timeout 5s;
        }

        # Very long timeout for file uploads
        location /upload {
            proxy_pass http://backend_servers/upload;
            client_max_body_size 100M;
            proxy_read_timeout 1800s;
            proxy_send_timeout 1800s;
        }
    }
}
```

Step 5: Configure Keepalive Properly

```nginx
upstream backend_servers {
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;

    # Number of idle keepalive connections cached per worker
    keepalive 32;

    # Timeout for idle keepalive connections
    keepalive_timeout 60s;

    # Maximum requests per keepalive connection
    keepalive_requests 1000;
}

server {
    location / {
        proxy_pass http://backend_servers;
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # Don't time out keepalive connections prematurely
        proxy_read_timeout 60s;
    }
}
```
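To confirm the pool is actually being reused, count established connections from the load balancer to each backend: a small, stable count under load indicates keepalive reuse, while a constantly churning set of ports suggests connections are being opened per request. Sketched here against inlined sample `ss` output:

```shell
# Count established LB -> backend connections per upstream address.
# On the load balancer, generate real input with:
#   ss -tn state established '( dport = :8080 )'
conns=$(awk 'NR > 1 { print $NF }' <<'EOF' | sort | uniq -c
Recv-Q Send-Q Local Address:Port  Peer Address:Port
0      0      10.0.0.100:51234    10.0.0.1:8080
0      0      10.0.0.100:51236    10.0.0.1:8080
0      0      10.0.0.100:51240    10.0.0.2:8080
EOF
)
printf '%s\n' "$conns"
```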

Step 6: Add Retry Logic

```nginx
upstream backend_servers {
    server 10.0.0.1:8080 max_fails=3 fail_timeout=30s;
    server 10.0.0.2:8080 max_fails=3 fail_timeout=30s;

    keepalive 32;
}

server {
    location / {
        proxy_pass http://backend_servers;

        # Retry on timeout errors
        proxy_next_upstream error timeout http_502 http_503 http_504;
        proxy_next_upstream_tries 3;
        proxy_next_upstream_timeout 60s;
    }
}
```
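One caveat with retries: by default Nginx will not retry requests with non-idempotent methods (POST, LOCK, PATCH) once they have been sent to a server, since replaying them can duplicate side effects. If a route is known to be safe to replay, this can be relaxed explicitly with the `non_idempotent` flag (nginx 1.9.13+); the location name below is illustrative:

```nginx
location /replay-safe-api {
    proxy_pass http://backend_servers;

    # Allow retrying POST/PATCH too -- only safe if the backend
    # deduplicates requests or the operations are idempotent
    proxy_next_upstream error timeout non_idempotent;
    proxy_next_upstream_tries 2;
}
```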

Step 7: Test and Verify

```bash
# Test configuration
nginx -t

# Reload Nginx
systemctl reload nginx

# Monitor upstream status
tail -f /var/log/nginx/error.log | grep -i upstream

# Test slow endpoint
time curl http://load-balancer/api/slow-endpoint
```

Advanced Diagnosis

Enable Upstream Status Monitoring

Open-source Nginx exposes only basic counters via `stub_status`; per-upstream status (connections, fails, response codes per server) requires the NGINX Plus `api` module or a third-party module such as nginx-module-vts.

```nginx
# Add status endpoint
server {
    listen 8080;

    location /nginx_status {
        stub_status on;
        allow 127.0.0.1;
        deny all;
    }
}
```

Log Slow Requests

```nginx
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for" '
                'rt=$request_time uct="$upstream_connect_time" '
                'uht="$upstream_header_time" urt="$upstream_response_time"';

access_log /var/log/nginx/access.log main;

# Log requests taking longer than 5 seconds
# (regexes containing braces must be quoted in nginx config)
map $request_time $loggable {
    default 0;
    ~^[5-9] 1;
    "~^[0-9]{2,}" 1;
}

access_log /var/log/nginx/slow.log main if=$loggable;
```
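With the timing fields in place, the worst upstream response time can be pulled straight out of the log. Sketched here with inlined sample lines; in practice, run the awk against `/var/log/nginx/access.log` or `slow.log`:

```shell
# Find the slowest upstream_response_time (the urt="..." field) in
# logs written with the timing log_format above.
slowest=$(awk 'match($0, /urt="[^"]*"/) {
    print substr($0, RSTART + 5, RLENGTH - 6)   # value between the quotes
}' <<'EOF' | sort -rn | head -1
10.0.0.5 - - [01/Jan/2026:10:00:00 +0000] "GET /api/slow HTTP/1.1" 200 512 "-" "curl" "-" rt=7.204 uct="0.001" uht="7.203" urt="7.203"
10.0.0.6 - - [01/Jan/2026:10:00:01 +0000] "GET /api/fast HTTP/1.1" 200 128 "-" "curl" "-" rt=0.032 uct="0.001" uht="0.031" urt="0.031"
EOF
)
echo "slowest upstream_response_time: ${slowest}s"
```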

Check Backend Thread/Worker Status

```bash
# For Java backends: dump threads and look for blocked/waiting ones
jstack $(pgrep java) | grep -E -A5 "BLOCKED|WAITING"

# For Node.js or Python backends, query whatever debug/health endpoint
# your framework exposes (the paths below are app-specific examples):
curl http://backend:8080/debug/threads
curl http://backend:8080/debug/worker-status
```

Common Pitfalls

  • Different timeout per location - Forgetting to set timeout for specific slow endpoints
  • Keepalive timeout > backend timeout - Connections expire before reuse
  • Only proxy_read_timeout set - Missing proxy_connect_timeout and proxy_send_timeout
  • Timeout after large upload - Not setting client_max_body_size with timeout
  • No upstream retry - Single failure causes client error
  • Logging only access logs - Missing detailed upstream timing in logs

Best Practices

```nginx
http {
    # Default timeouts
    proxy_connect_timeout 60s;
    proxy_read_timeout 60s;
    proxy_send_timeout 60s;

    # Keepalive configuration (client-side)
    keepalive_timeout 65s;
    keepalive_requests 1000;

    # Buffer configuration for slow responses
    proxy_buffer_size 128k;
    proxy_buffers 8 128k;
    proxy_busy_buffers_size 256k;

    upstream backend_servers {
        zone backend 64k;
        least_conn;

        server 10.0.0.1:8080 max_fails=3 fail_timeout=30s;
        server 10.0.0.2:8080 max_fails=3 fail_timeout=30s;
        server 10.0.0.3:8080 backup;

        keepalive 32;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend_servers;
            proxy_http_version 1.1;
            proxy_set_header Connection "";

            # Retry configuration
            proxy_next_upstream error timeout http_502 http_503 http_504;
            proxy_next_upstream_tries 3;
        }
    }
}
```

Related Issues

  • HAProxy Maxconn Reached
  • Nginx Upstream Not Balancing
  • AWS ALB Target Unhealthy