Your application works fine most of the time, but occasionally Nginx returns 502 errors with "connection reset by peer" in the logs. The frustrating part is the backend seems healthy—it's running and responding to health checks. But something is causing connections to drop unexpectedly.
This error is harder to diagnose than a simple "connection refused" because it means the connection was established but then abruptly terminated. Let's trace through the possible causes.
Understanding Connection Reset by Peer
The error appears in /var/log/nginx/error.log:
```
2026/04/04 14:00:00 [error] 1234#1234: *5678 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: 192.168.1.100, server: api.example.com, request: "GET /users HTTP/1.1", upstream: "http://127.0.0.1:3000/users"
```

Or during connection:

```
2026/04/04 14:00:00 [error] 1234#1234: *5678 connect() failed (104: Connection reset by peer) while connecting to upstream
```

This means:

1. Nginx successfully initiated a connection to the backend
2. The backend accepted the connection
3. The connection was abruptly closed (RST packet sent)
Step 1: Check Backend Process Health
First, verify your backend isn't crashing or restarting:
```bash
# Check if backend process exists and is stable
ps aux | grep -E "(node|python|gunicorn|php-fpm|uwsgi)"

# Monitor process restarts
watch -n 1 'ps aux | grep -E "node|python" | grep -v grep'

# Check system logs for OOM killer
dmesg | grep -i "killed process"
journalctl -xe | grep -i "out of memory"
```
For Node.js applications:
```bash
# Check for uncaught exceptions
pm2 logs --err
# or for systemd services
journalctl -u your-app -n 100 --no-pager
```

For Python applications:
```bash
# Check Gunicorn logs
journalctl -u gunicorn -n 100

# Check for worker timeouts
grep -E "timeout|killed|worker" /var/log/gunicorn/*.log
```
Step 2: Analyze Backend Crash Logs
The connection reset usually happens because the backend crashed mid-request:
```bash
# Application-specific log locations

# Node.js with PM2
pm2 logs

# Python Gunicorn
journalctl -u gunicorn --since "10 minutes ago"

# PHP-FPM
tail -f /var/log/php-fpm/error.log

# Java applications
tail -f /var/log/tomcat/catalina.out
# or
journalctl -u spring-boot-app
```
Look for:

- Stack traces
- Memory errors
- Timeout errors
- Worker process deaths
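Crash logs only help if the crash is actually written somewhere before the process dies. For Node.js backends, here's a hedged sketch of last-gasp handlers — the event names are real Node.js process events, but the log format is just an example:

```javascript
// Ensure a crashing Node.js backend leaves a stack trace in the logs
// before the process exits (and the in-flight connections get RST).
function formatCrashLine(kind, err) {
  return `[${new Date().toISOString()}] ${kind}: ${err && err.stack ? err.stack : err}`;
}

process.on('uncaughtException', (err) => {
  // console.error writes synchronously to stderr, so this survives
  // the process dying immediately afterwards.
  console.error(formatCrashLine('uncaughtException', err));
  process.exit(1); // state is undefined; let the supervisor restart us
});

process.on('unhandledRejection', (reason) => {
  console.error(formatCrashLine('unhandledRejection', reason));
});
```

With these in place, `pm2 logs --err` or `journalctl` will show *why* the backend dropped the connection instead of just a silent restart.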
Step 3: Fix Keepalive Mismatches
One of the most common causes is a keepalive timeout mismatch. Nginx keeps connections open for reuse, but the backend closes them:
Check Nginx upstream keepalive:
```nginx
upstream backend {
    server 127.0.0.1:3000;
    keepalive 64;  # Keep 64 connections open
}
```

Check your backend's keepalive timeout:
For Node.js:
```javascript
// Default server
const server = app.listen(3000);
server.keepAliveTimeout = 65000; // milliseconds
server.headersTimeout = 66000;   // slightly higher than keepAliveTimeout
```
For Gunicorn:
```bash
gunicorn --keep-alive 65 --timeout 120 app:app
```
The fix: set the backend's keepalive timeout higher than Nginx's, so Nginx is always the side that closes an idle connection:
```nginx
# Nginx config
upstream backend {
    server 127.0.0.1:3000;
    keepalive 64;
    keepalive_timeout 60s;  # Nginx timeout
}

server {
    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # Add these
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;
    }
}
```
```javascript
// Node.js backend - set higher than Nginx
server.keepAliveTimeout = 65000; // 65 seconds
```

Step 4: Check for Request/Response Buffer Overflows
Large requests or responses can cause resets if buffers are too small:
Error indicating buffer issues:
```
upstream sent too big header while reading response header from upstream
```
Fix buffer sizes:
```nginx
location / {
    proxy_pass http://backend;

    # Increase buffer sizes
    proxy_buffer_size 128k;
    proxy_buffers 4 256k;
    proxy_busy_buffers_size 256k;

    # For large headers from upstream
    proxy_buffering on;
    proxy_max_temp_file_size 0;
}
```
For FastCGI (PHP):
```nginx
location ~ \.php$ {
    fastcgi_pass unix:/run/php/php8.2-fpm.sock;

    # Increase FastCGI buffers
    fastcgi_buffer_size 128k;
    fastcgi_buffers 4 256k;
    fastcgi_busy_buffers_size 256k;
}
```
Step 5: Investigate Network Issues
Connection resets can come from network middleboxes:
```bash
# Check for packet loss
ping -c 100 backend-server

# Check MTU issues (can cause resets on large packets)
ping -M do -s 1472 backend-server  # Test MTU

# Capture traffic during error
tcpdump -i any port 3000 -w /tmp/backend.pcap

# Analyze captured traffic for RST packets (shown as "Flags [R]")
tcpdump -r /tmp/backend.pcap -n | grep 'Flags \[R'
```
For Docker/container networking:
```bash
# Check if containers are on the same network
docker network inspect bridge

# Try using host networking
# docker run --network host ...

# Check container DNS
docker exec nginx-container nslookup backend-service
```
Step 6: Handle Backend Overload
When backends are overloaded, they may accept connections but fail to process them:
```bash
# Check backend resource usage (pgrep -d, joins PIDs with commas for top)
top -p "$(pgrep -d, -f 'node|python')"

# Check connection queue
ss -tlnp | grep 3000

# Check backlog
cat /proc/sys/net/core/somaxconn
```
If the listen queue is full, connections get reset:
```nginx
# Increase Nginx's listen queue
server {
    listen 80 backlog=65535;
}
```

```bash
# System-wide listen queue
sysctl -w net.core.somaxconn=65535
```

For Node.js:
```javascript
server.listen(3000, () => {
  console.log('Server running');
}).on('error', (err) => {
  console.error('Server error:', err);
});

// Set max connections
server.maxConnections = 10000;
```
Step 7: Fix Protocol Mismatches
HTTP/1.0 vs HTTP/1.1 issues can cause resets:
```nginx
# Always use HTTP/1.1 for keepalive
location / {
    proxy_pass http://backend;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
}
```

For WebSocket upgrades:
```nginx
location /ws {
    proxy_pass http://backend;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_read_timeout 86400;  # Long timeout for WebSocket
}
```

For HTTP/2 backends (rare; usually the backend is HTTP/1.1):
```nginx
location / {
    proxy_pass http://backend;
    proxy_http_version 1.1;  # Backend typically uses 1.1
}
```

Step 8: Debug with Connection Logging
Add detailed logging to understand the reset:
```nginx
log_format upstream_debug '$remote_addr - $status - $upstream_addr '
                          '$upstream_status $upstream_response_time '
                          '$upstream_connect_time $request_time';

server {
    access_log /var/log/nginx/upstream.log upstream_debug;

    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # Add debugging headers
        add_header X-Upstream-Addr $upstream_addr always;
        add_header X-Upstream-Status $upstream_status always;
    }
}
```
Analyze patterns:
```bash
# Find requests that had upstream issues
grep -E "499|502|504" /var/log/nginx/upstream.log

# Group by upstream status (field 6 in the upstream_debug format)
awk '{print $6}' /var/log/nginx/upstream.log | sort | uniq -c | sort -rn
```
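For more than a one-off look, the same tally can be scripted. Here's a small Node.js sketch (illustrative, assuming the upstream_debug format above) that counts upstream status codes from log lines:

```javascript
// Tally upstream status codes from the upstream_debug format
// ('$remote_addr - $status - $upstream_addr $upstream_status ...').
// Sketch only -- adjust the field index if your log_format differs.
function countUpstreamStatuses(lines) {
  const counts = {};
  for (const line of lines) {
    const fields = line.trim().split(/\s+/);
    const upstreamStatus = fields[5]; // field 6: $upstream_status
    if (!upstreamStatus) continue;
    counts[upstreamStatus] = (counts[upstreamStatus] || 0) + 1;
  }
  return counts;
}

const sample = [
  '192.168.1.100 - 502 - 127.0.0.1:3000 502 0.001 0.000 0.001',
  '192.168.1.101 - 200 - 127.0.0.1:3000 200 0.120 0.000 0.121',
  '192.168.1.102 - 502 - 127.0.0.1:3000 502 0.002 0.001 0.003',
];
console.log(countUpstreamStatuses(sample)); // { '200': 1, '502': 2 }
```

A spike of `502` here with a healthy `$status` elsewhere confirms the resets originate upstream of Nginx, not at the client.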
Step 9: Check SELinux/AppArmor
On systems with mandatory access control, connections may be reset:
```bash
# Check for SELinux denials
ausearch -m avc -ts recent | grep nginx

# Allow network connections
setsebool -P httpd_can_network_connect 1

# Check AppArmor
aa-status
```
Step 10: Implement Retry Logic
For transient resets, implement retry logic:
```nginx
upstream backend {
    server 127.0.0.1:3000 max_fails=3 fail_timeout=30s;
    server 127.0.0.1:3001 backup;
}

server {
    location / {
        proxy_pass http://backend;
        proxy_next_upstream error timeout http_502 http_503 http_504;
        proxy_next_upstream_tries 3;
        proxy_connect_timeout 5s;
    }
}
```
This configuration:

- Marks a server as failed after 3 errors in 30 seconds
- Tries the next upstream on error
- Limits to 3 retry attempts
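Nginx-level retries cover the proxy side; clients calling your API can also retry transient resets. A hedged Node.js sketch of exponential backoff (the retryable error codes and delays here are illustrative choices, not a standard):

```javascript
// Client-side companion to Nginx's proxy_next_upstream: retry transient
// connection errors with exponential backoff.
const RETRYABLE = new Set(['ECONNRESET', 'ECONNREFUSED', 'ETIMEDOUT']);

async function withRetries(fn, tries = 3, baseDelayMs = 100) {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      // Give up on the last attempt or on non-transient errors.
      if (attempt >= tries || !RETRYABLE.has(err.code)) throw err;
      // Exponential backoff: 100ms, 200ms, 400ms, ...
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** (attempt - 1)));
    }
  }
}

// Example: fails twice with ECONNRESET, succeeds on the third attempt.
async function flakyCall(attempt) {
  if (attempt < 3) {
    const err = new Error('socket hang up');
    err.code = 'ECONNRESET';
    throw err;
  }
  return `ok on attempt ${attempt}`;
}

withRetries(flakyCall).then(console.log); // "ok on attempt 3"
```

Only retry idempotent operations this way — a reset mid-POST may have already been processed by the backend.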
Complete Troubleshooting Configuration
A robust configuration that handles resets gracefully:
```nginx
# Define in the http block (log_format is not valid inside server)
log_format main '$remote_addr - $status $upstream_status '
                '$upstream_response_time $request_time '
                '$upstream_connect_time';

upstream backend {
    server 127.0.0.1:3000 max_fails=3 fail_timeout=30s;
    keepalive 64;
}

server {
    listen 80;

    # Logging for debugging
    access_log /var/log/nginx/access.log main;
    error_log /var/log/nginx/error.log warn;

    location / {
        proxy_pass http://backend;

        # HTTP/1.1 with keepalive
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # Timeouts
        proxy_connect_timeout 60s;
        proxy_send_timeout 60s;
        proxy_read_timeout 60s;

        # Buffers
        proxy_buffer_size 128k;
        proxy_buffers 4 256k;

        # Retry logic
        proxy_next_upstream error timeout http_502 http_503 http_504;
        proxy_next_upstream_tries 2;

        # Headers
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```
Quick Diagnosis Flowchart
```
        Connection Reset by Peer
                  |
                  v
      Is backend process running?
           |              |
           No            Yes
           |              |
        Start it      Check logs
           |              |
           v              v
       Is backend     Any crashes
       crashing?      or OOM?
           |              |
          Yes            Yes
           |              |
       Fix crash      Add memory
       issue          or fix code
           |              |
           v              v
    Check keepalive   Check network
    timeout match     (MTU, firewalls)
           |              |
           v              v
    Adjust timeouts   Fix network
                      issues
```

Connection reset by peer is almost always a backend issue: the backend is crashing, timing out, or has a misconfigured keepalive. Focus your investigation on the backend first, then adjust the Nginx configuration for resilience.