Introduction

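Nginx resolves upstream hostnames when the configuration is loaded and keeps using those addresses. If a backend's IP address changes and no resolver directive is configured, or cached DNS answers are used after the record's TTL has expired, requests go to stale addresses and fail. This guide provides step-by-step diagnosis and resolution with specific commands and configuration examples.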

Symptoms

Typical symptoms and error messages when this issue occurs:

bash
Load balancer error: backend unavailable
Check health check configuration
Verify backend server status

Observable indicators:

  • Load balancer returns 5xx errors to clients
  • Backend servers marked as unhealthy
  • Traffic not reaching expected backends

Common Causes

Nginx load balancer issues are typically caused by:

  1. Upstream server configuration errors
  2. Health check not enabled or misconfigured
  3. Proxy timeout values too low
  4. Missing or incorrect resolver for dynamic backends
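
The last cause is the one most specific to dynamic backends: when proxy_pass names a hostname literally, Nginx resolves it once at startup or reload and keeps those addresses. A minimal sketch of a common open source workaround follows, assuming a hypothetical hostname (backend.example.internal) and a hypothetical DNS server (192.0.2.53). Putting the hostname in a variable makes Nginx look it up at request time through the resolver.

```nginx
server {
    listen 80;

    # Hypothetical DNS server. valid=30s caches answers for 30 seconds
    # regardless of the TTL in the DNS response.
    resolver 192.0.2.53 valid=30s;
    resolver_timeout 5s;

    location / {
        # Hypothetical hostname. Using a variable in proxy_pass forces
        # runtime resolution instead of a one-time lookup at startup.
        set $backend_host backend.example.internal;
        proxy_pass http://$backend_host:8080;
    }
}
```

Because the request no longer goes through an upstream block, settings such as keepalive, max_fails, and fail_timeout do not apply with this approach.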

Step-by-Step Fix

Step 1: Check Current State

bash
nginx -t && nginx -T | grep upstream

Step 2: Identify Root Cause

bash
curl -s localhost/nginx_status
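
The command above only returns data if a stub_status location exists at that path. A minimal sketch, assuming the /nginx_status path used in this guide and an Nginx build that includes ngx_http_stub_status_module; the access rules are an assumption and should be adapted to your environment.

```nginx
server {
    listen 80;

    location = /nginx_status {
        stub_status;        # basic connection and request counters
        allow 127.0.0.1;    # assumption: only local requests may read it
        deny all;
    }
}
```

Note that stub_status reports aggregate connection counters only. It does not list individual upstream servers or their health, so pair it with the error log when looking for the root cause.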

Step 3: Apply Primary Fix

```
# Nginx upstream with passive health checks (max_fails / fail_timeout).
# Active health checks require NGINX Plus or a third-party module.
upstream backend {
    zone backend 64k;
    server 10.0.0.1:8080 max_fails=3 fail_timeout=30s;
    server 10.0.0.2:8080 max_fails=3 fail_timeout=30s;

    keepalive 32;
}

server {
    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;          # required for upstream keepalive
        proxy_set_header Connection "";  # clear "close" so connections are reused
        proxy_connect_timeout 10s;
        proxy_read_timeout 60s;
    }
}
```

Apply this configuration, validate it with nginx -t, and reload Nginx (for example, nginx -s reload).

Step 4: Apply Alternative Fix (If Needed)

nginx
# Alternative fix: adjust timeouts (in the http, server, or location block)
proxy_connect_timeout 10s;
proxy_read_timeout 60s;
proxy_send_timeout 60s;

Step 5: Verify the Fix

After applying the fix, verify with:

bash
curl -s http://localhost/nginx_status && tail -n 10 /var/log/nginx/access.log

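The status endpoint should respond with connection counters, and the most recent access log entries should show successful responses (2xx or 3xx) instead of 502, 503, or 504 errors.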
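
To confirm which backend actually served each request, the access log can record upstream details. A sketch, assuming a hypothetical format name (upstream_log); the variables are standard Nginx variables, and log_format belongs in the http context.

```nginx
http {
    # Hypothetical format name. Records the backend address chosen for the
    # request, the status it returned, and how long it took to respond.
    log_format upstream_log '$remote_addr [$time_local] "$request" $status '
                            'upstream=$upstream_addr '
                            'upstream_status=$upstream_status '
                            'upstream_time=$upstream_response_time';

    access_log /var/log/nginx/access.log upstream_log;
}
```

With this in place, a stale DNS entry tends to show up as requests still going to an old address in the upstream= field. Changing the format of an existing log file can break tools that parse it, so a separate log file may be the safer choice.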

Common Pitfalls

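  • Forgetting proxy_http_version 1.1 and an empty Connection header for upstream keepalive
  • Expecting active health checks in the open source version, which only performs passive checks through max_fails and fail_timeout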
  • DNS cache causing stale upstream IPs
  • Weight ignored with the ip_hash method on old releases (weights with ip_hash are not supported before versions 1.3.1 and 1.2.2)

Best Practices

  • Use keepalive for connection pooling
  • Configure appropriate proxy timeouts
  • Enable health checks with zones
  • Use resolver for dynamic upstream DNS
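
The zone and resolver practices above can be combined so that the upstream block itself tracks DNS changes. A sketch under stated assumptions: the hostname (backend.example.internal) and DNS server (192.0.2.53) are hypothetical, and the resolve parameter is assumed to be available, which to the best of my knowledge means NGINX Plus or open source Nginx 1.27.3 and later.

```nginx
http {
    # Hypothetical DNS server address
    resolver 192.0.2.53 valid=30s;

    upstream backend {
        zone backend 64k;   # a shared memory zone is required for "resolve"

        # Hypothetical hostname. "resolve" makes Nginx monitor the DNS
        # record and update the server list without a reload.
        server backend.example.internal:8080 resolve;

        keepalive 32;
    }
}
```

Unlike resolving through a variable in proxy_pass, this keeps the request inside the upstream group, so keepalive and passive failure tracking continue to apply.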