Introduction

A 504 Gateway Timeout error from Nginx indicates the load balancer successfully connected to a backend server but did not receive a response within the configured timeout period. This differs from a 502 error (invalid response) or 503 error (service unavailable). The timeout can occur at different stages: connection establishment, header reading, or body reading.

Symptoms

Error messages in Nginx error log:

```bash
upstream timed out (110: Connection timed out) while reading response header from upstream
upstream timed out (110: Connection timed out) while connecting to upstream
upstream timed out (110: Connection timed out) while reading upstream
recv() failed (104: Connection reset by peer) while reading response header
```
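Each timeout message names the stage at which the timer fired, and each stage is bounded by a different proxy timeout directive. A minimal triage sketch (the `classify` function is illustrative; pipe in real lines with `grep 'timed out' /var/log/nginx/error.log`):

```shell
# Classify a timeout log line by the stage it failed in -- each stage
# is governed by a different Nginx proxy timeout directive.
classify() {
    case "$1" in
        *"while connecting to upstream"*)  echo "connect (proxy_connect_timeout)" ;;
        *"while reading response header"*) echo "header (proxy_read_timeout)" ;;
        *"while reading upstream"*)        echo "body (proxy_read_timeout)" ;;
        *)                                 echo "other" ;;
    esac
}

classify "upstream timed out (110: Connection timed out) while connecting to upstream"
classify "upstream timed out (110: Connection timed out) while reading response header from upstream"
```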

Client-facing symptoms:

  • HTTP 504 Gateway Timeout
  • Requests taking exactly the timeout duration before failing
  • Intermittent timeouts for slow endpoints
  • Timeouts during high load or for specific operations

Common Causes

  1. Backend processing too slow - Application takes longer than the timeout to respond
  2. Timeout values too aggressive - Configured timeout shorter than needed
  3. Backend resource exhaustion - Server overloaded, slow to respond
  4. Network latency - High latency between Nginx and backend
  5. Backend hanging - Application deadlocked or stuck
  6. Large file uploads/downloads - Body timeout exceeded for large transfers
  7. Keepalive connections timing out - Idle connections in pool expired

Step-by-Step Fix

Step 1: Identify Timeout Location

```bash
# Check Nginx error logs
tail -100 /var/log/nginx/error.log | grep -i timeout

# Look for specific timeout patterns
grep "timed out" /var/log/nginx/error.log | tail -50

# Check for upstream errors
grep "upstream" /var/log/nginx/error.log | tail -50
```
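When timeouts cluster on one backend, the error log's `upstream: "..."` field tells you which. A rough tally, sketched here with inlined sample lines (in practice, replace the here-doc with `grep 'timed out' /var/log/nginx/error.log`):

```shell
# Count timeouts per upstream address; the worst offender sorts first.
tally=$(awk 'match($0, /upstream: "[^"]+"/) {
    # strip the leading  upstream: "  (11 chars) and the closing quote
    print substr($0, RSTART + 11, RLENGTH - 12)
}' <<'EOF' | sort | uniq -c | sort -rn
2026/01/01 10:00:01 [error] 7#7: *1 upstream timed out, upstream: "http://10.0.0.1:8080/api/slow"
2026/01/01 10:00:02 [error] 7#7: *2 upstream timed out, upstream: "http://10.0.0.1:8080/api/slow"
2026/01/01 10:00:03 [error] 7#7: *3 upstream timed out, upstream: "http://10.0.0.2:8080/api/report"
EOF
)
printf '%s\n' "$tally"
```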

Step 2: Check Current Timeout Configuration

```bash
# Find timeout settings
nginx -T 2>/dev/null | grep -E "timeout|proxy_read|proxy_send|proxy_connect"

# Or check specific config file
grep -r "timeout" /etc/nginx/
```
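To see just the effective timeout values without the rest of the dump, a small awk filter works. Sketched here against an inlined sample config rather than a live `nginx -T` dump:

```shell
# Extract only the proxy timeout directives from a config dump.
# Against a live instance, feed it:  nginx -T 2>/dev/null
timeouts=$(awk '$1 ~ /^proxy_(connect|read|send)_timeout$/ {
    gsub(/;/, "", $2)           # drop the trailing semicolon
    print $1 "=" $2
}' <<'EOF'
http {
    proxy_connect_timeout 60s;
    proxy_read_timeout 300s;
    proxy_send_timeout 300s;
}
EOF
)
printf '%s\n' "$timeouts"
```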

Step 3: Measure Backend Response Times

```bash
# Test backend directly
curl -w "Connect: %{time_connect}s\nTTFB: %{time_starttransfer}s\nTotal: %{time_total}s\n" \
  http://backend-server:8080/api/slow-endpoint

# Check backend health and load
ssh backend-server 'top -bn1 | head -20'
ssh backend-server 'free -m'
ssh backend-server 'iostat -x 1 5'
```
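Once you have a set of `time_total` samples, pick a timeout with headroom over the worst observed case rather than matching it exactly. A sketch using inlined sample timings and a 2x headroom factor (both assumptions; tune to your traffic):

```shell
# Suggest a proxy_read_timeout from response-time samples (seconds,
# one per line): take the worst sample, double it, round up.
# Collect real samples with e.g.:
#   for i in $(seq 20); do curl -o /dev/null -s -w '%{time_total}\n' "$URL"; done
suggested=$(sort -n <<'EOF' | tail -1 | awk '{ printf "%d", $1 * 2 + 0.999 }'
1.8
12.4
3.1
7.9
EOF
)
echo "proxy_read_timeout ${suggested}s;"
```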

Step 4: Adjust Timeout Configuration

```nginx
http {
    # Connection establishment timeout
    proxy_connect_timeout 60s;

    # Time to wait for response headers
    proxy_read_timeout 60s;

    # Time to wait when sending request to backend
    proxy_send_timeout 60s;

    # For slow backends, increase these values instead (a directive
    # may appear only once per context, so replace, don't repeat):
    # proxy_connect_timeout 120s;
    # proxy_read_timeout 300s;
    # proxy_send_timeout 300s;

    upstream backend_servers {
        server 10.0.0.1:8080;
        server 10.0.0.2:8080;

        # Keepalive connections
        keepalive 32;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend_servers;
            proxy_http_version 1.1;
            proxy_set_header Connection "";

            # Override for specific slow endpoints
            proxy_read_timeout 600s;
        }

        # Fast timeouts for health checks
        location /health {
            proxy_pass http://backend_servers/health;
            proxy_connect_timeout 5s;
            proxy_read_timeout 5s;
        }

        # Very long timeout for file uploads
        location /upload {
            proxy_pass http://backend_servers/upload;
            client_max_body_size 100M;
            proxy_read_timeout 1800s;
            proxy_send_timeout 1800s;
        }
    }
}
```

Step 5: Configure Keepalive Properly

```nginx
upstream backend_servers {
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;

    # Number of idle keepalive connections cached per worker
    keepalive 32;

    # Timeout for idle keepalive connections
    keepalive_timeout 60s;

    # Maximum requests per keepalive connection
    keepalive_requests 1000;
}

server {
    location / {
        proxy_pass http://backend_servers;
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # Don't time out keepalive connections prematurely
        proxy_read_timeout 60s;
    }
}
```
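To confirm the pool is actually being reused, count established connections from the load balancer to each backend: a small, stable count under load indicates keepalive reuse, while a constantly churning set of ports suggests connections are being opened per request. Sketched here against inlined sample `ss` output:

```shell
# Count established LB -> backend connections per upstream address.
# On the load balancer, generate real input with:
#   ss -tn state established '( dport = :8080 )'
conns=$(awk 'NR > 1 { print $NF }' <<'EOF' | sort | uniq -c
Recv-Q Send-Q Local Address:Port  Peer Address:Port
0      0      10.0.0.100:51234    10.0.0.1:8080
0      0      10.0.0.100:51236    10.0.0.1:8080
0      0      10.0.0.100:51240    10.0.0.2:8080
EOF
)
printf '%s\n' "$conns"
```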

Step 6: Add Retry Logic

```nginx
upstream backend_servers {
    server 10.0.0.1:8080 max_fails=3 fail_timeout=30s;
    server 10.0.0.2:8080 max_fails=3 fail_timeout=30s;

    keepalive 32;
}

server {
    location / {
        proxy_pass http://backend_servers;

        # Retry on timeout errors
        proxy_next_upstream error timeout http_502 http_503 http_504;
        proxy_next_upstream_tries 3;
        proxy_next_upstream_timeout 60s;
    }
}
```
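One caveat with retries: by default Nginx will not retry requests with non-idempotent methods (POST, LOCK, PATCH) once they have been sent to a server, since replaying them can duplicate side effects. If a route is known to be safe to replay, this can be relaxed explicitly with the `non_idempotent` flag (nginx 1.9.13+); the location name below is illustrative:

```nginx
location /replay-safe-api {
    proxy_pass http://backend_servers;

    # Allow retrying POST/PATCH too -- only safe if the backend
    # deduplicates requests or the operations are idempotent
    proxy_next_upstream error timeout non_idempotent;
    proxy_next_upstream_tries 2;
}
```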

Step 7: Test and Verify

```bash
# Test configuration
nginx -t

# Reload Nginx
systemctl reload nginx

# Monitor upstream status
tail -f /var/log/nginx/error.log | grep -i upstream

# Test slow endpoint
time curl http://load-balancer/api/slow-endpoint
```

Advanced Diagnosis

Enable Upstream Status Monitoring

Open-source Nginx exposes only basic counters via `stub_status`; per-upstream status (connections, fails, response codes per server) requires the NGINX Plus `api` module or a third-party module such as nginx-module-vts.

```nginx
# Add status endpoint
server {
    listen 8080;

    location /nginx_status {
        stub_status on;
        allow 127.0.0.1;
        deny all;
    }
}
```

Log Slow Requests

```nginx
log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                '$status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for" '
                'rt=$request_time uct="$upstream_connect_time" '
                'uht="$upstream_header_time" urt="$upstream_response_time"';

access_log /var/log/nginx/access.log main;

# Log requests taking longer than 5 seconds
# (regexes containing braces must be quoted in nginx config)
map $request_time $loggable {
    default 0;
    ~^[5-9] 1;
    "~^[0-9]{2,}" 1;
}

access_log /var/log/nginx/slow.log main if=$loggable;
```
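With the timing fields in place, the worst upstream response time can be pulled straight out of the log. Sketched here with inlined sample lines; in practice, run the awk against `/var/log/nginx/access.log` or `slow.log`:

```shell
# Find the slowest upstream_response_time (the urt="..." field) in
# logs written with the timing log_format above.
slowest=$(awk 'match($0, /urt="[^"]*"/) {
    print substr($0, RSTART + 5, RLENGTH - 6)   # value between the quotes
}' <<'EOF' | sort -rn | head -1
10.0.0.5 - - [01/Jan/2026:10:00:00 +0000] "GET /api/slow HTTP/1.1" 200 512 "-" "curl" "-" rt=7.204 uct="0.001" uht="7.203" urt="7.203"
10.0.0.6 - - [01/Jan/2026:10:00:01 +0000] "GET /api/fast HTTP/1.1" 200 128 "-" "curl" "-" rt=0.032 uct="0.001" uht="0.031" urt="0.031"
EOF
)
echo "slowest upstream_response_time: ${slowest}s"
```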

Check Backend Thread/Worker Status

```bash
# For Java backends: dump threads and look for blocked/waiting ones
jstack $(pgrep java) | grep -E -A5 "BLOCKED|WAITING"

# For Node.js or Python backends, query whatever debug/health endpoint
# your framework exposes (the paths below are app-specific examples):
curl http://backend:8080/debug/threads
curl http://backend:8080/debug/worker-status
```

Common Pitfalls

  • Different timeout per location - Forgetting to set timeout for specific slow endpoints
  • Keepalive timeout > backend timeout - Connections expire before reuse
  • Only proxy_read_timeout set - Missing proxy_connect_timeout and proxy_send_timeout
  • Timeout after large upload - Not setting client_max_body_size with timeout
  • No upstream retry - Single failure causes client error
  • Logging only access logs - Missing detailed upstream timing in logs

Best Practices

```nginx
http {
    # Default timeouts
    proxy_connect_timeout 60s;
    proxy_read_timeout 60s;
    proxy_send_timeout 60s;

    # Keepalive configuration (client-side)
    keepalive_timeout 65s;
    keepalive_requests 1000;

    # Buffer configuration for slow responses
    proxy_buffer_size 128k;
    proxy_buffers 8 128k;
    proxy_busy_buffers_size 256k;

    upstream backend_servers {
        zone backend 64k;
        least_conn;

        server 10.0.0.1:8080 max_fails=3 fail_timeout=30s;
        server 10.0.0.2:8080 max_fails=3 fail_timeout=30s;
        server 10.0.0.3:8080 backup;

        keepalive 32;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend_servers;
            proxy_http_version 1.1;
            proxy_set_header Connection "";

            # Retry configuration
            proxy_next_upstream error timeout http_502 http_503 http_504;
            proxy_next_upstream_tries 3;
        }
    }
}
```

Related Issues

  • HAProxy Maxconn Reached
  • Nginx Upstream Not Balancing
  • AWS ALB Target Unhealthy