## Introduction

WebSocket connections that are unexpectedly dropped by load balancers cause real-time application failures. This is often caused by idle timeout settings that do not account for long-lived WebSocket connections.

## Symptoms

- WebSocket connections drop after a fixed interval (e.g., 60 seconds)
- Real-time notifications are not received
- Chat messages are not delivered
- Error: "WebSocket connection closed: 1006"
- Works on a direct connection but fails through the load balancer
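To tell a fixed-interval drop from random network failures, it can help to record how long each connection sat idle before it closed and check whether those durations cluster around one value. The sketch below assumes you already collect those idle durations in milliseconds; the function name `summarizeIdleDurations` and the 2-second tolerance are illustrative choices, not part of any library.

```javascript
// Hypothetical helper: decide whether idle-before-close durations cluster
// around one value, which would point at an idle timeout.
function summarizeIdleDurations(durationsMs, toleranceMs = 2000) {
  if (durationsMs.length === 0) {
    return { medianMs: null, clustered: false };
  }
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...durationsMs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const medianMs = sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
  // "Clustered" means every sample lies within toleranceMs of the median.
  const clustered = sorted.every((ms) => Math.abs(ms - medianMs) <= toleranceMs);
  return { medianMs, clustered };
}
```

If the result is clustered and the median sits close to the load balancer's configured idle timeout, the timeout is the likely culprit.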

## Common Causes

- Load balancer idle timeout shorter than WebSocket ping interval
- HTTP upgrade not properly handled
- Load balancer not supporting WebSocket protocol
- Backend not sending WebSocket ping frames
- Connection draining closing active WebSocket connections
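One way to check the HTTP upgrade cause is to log, on the backend, whether the handshake headers survive the trip through the load balancer. The WebSocket handshake requires an `Upgrade: websocket` header and a `Connection` header that includes the `Upgrade` token. The helper below is a minimal sketch; the name `isWebSocketUpgrade` is hypothetical, and it assumes headers arrive as a plain object with lowercase keys, as Node's `http` module exposes them in `req.headers`.

```javascript
// Returns true when the headers carry the WebSocket upgrade handshake.
// Assumes `headers` is a plain object with lowercase keys.
function isWebSocketUpgrade(headers) {
  const upgrade = String(headers['upgrade'] || '').toLowerCase();
  // Connection may list several tokens, e.g. "keep-alive, Upgrade".
  const connectionTokens = String(headers['connection'] || '')
    .toLowerCase()
    .split(',')
    .map((token) => token.trim());
  return upgrade === 'websocket' && connectionTokens.includes('upgrade');
}
```

If this returns `false` for requests that the client sent as WebSocket handshakes, a proxy or load balancer in between is probably stripping or rewriting the headers.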

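## Step-by-Step Fix

1. **Increase load balancer idle timeout:**

   ```bash
   # AWS ALB: raise the idle timeout (max 4000 seconds)
   aws elbv2 modify-load-balancer-attributes \
     --load-balancer-arn <load-balancer-arn> \
     --attributes Key=idle_timeout.timeout_seconds,Value=3600

   # Target group: disable stickiness and give draining connections 300 seconds
   aws elbv2 modify-target-group-attributes \
     --target-group-arn <arn> \
     --attributes Key=stickiness.enabled,Value=false \
                  Key=deregistration_delay.timeout_seconds,Value=300

   # For Classic ELB, increase idle timeout (max 3600 seconds)
   aws elb modify-load-balancer-attributes --load-balancer-name my-lb \
     --load-balancer-attributes '{"ConnectionSettings":{"IdleTimeout":3600}}'
   ```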

2. **Configure WebSocket ping from client:**

   ```javascript
   const ws = new WebSocket('wss://api.example.com/ws');

   // Send an application-level ping so the connection never looks idle
   const pingInterval = setInterval(() => {
     if (ws.readyState === WebSocket.OPEN) {
       ws.send(JSON.stringify({ type: 'ping' }));
     }
   }, 30000); // Every 30 seconds

   // Stop pinging once the socket is closed
   ws.addEventListener('close', () => clearInterval(pingInterval));
   ```
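The client snippet sends `{ type: 'ping' }` as an ordinary message, so the backend has to recognize it instead of treating it as application data. A minimal sketch of that server-side handling follows. The function name `replyToKeepAlive` and the `{ type: 'pong' }` reply shape are assumptions for illustration; the source does not define a reply format.

```javascript
// Hypothetical server-side handler for the client's application-level ping.
// Returns the reply to send back, or null when the message is not a ping.
function replyToKeepAlive(rawMessage) {
  let message;
  try {
    message = JSON.parse(rawMessage);
  } catch {
    return null; // Not JSON, so not a keep-alive message
  }
  if (message && message.type === 'ping') {
    // Assumed reply shape; replying also generates server-to-client traffic
    return JSON.stringify({ type: 'pong' });
  }
  return null;
}
```

In a message handler, a non-null result would be sent back on the same socket, and a null result would be passed on to the normal application logic.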

## Prevention

- Set idle timeout longer than WebSocket ping interval
- Implement application-level ping/pong
- Monitor WebSocket connection duration
- Use dedicated WebSocket endpoint through load balancer
- Test WebSocket through load balancer in staging
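The first rule above can be checked automatically, for example in a staging test or at application startup. The sketch below is one possible check, not an established API: the name `checkKeepAlive` is hypothetical, and the default requirement that the ping interval be at most half the idle timeout is an assumed safety margin so that a single delayed ping does not let the connection go idle.

```javascript
// Hypothetical configuration check: the ping interval must leave headroom
// below the load balancer idle timeout. The 0.5 ratio is an assumption.
function checkKeepAlive(pingIntervalMs, idleTimeoutMs, maxRatio = 0.5) {
  if (pingIntervalMs <= 0 || idleTimeoutMs <= 0) {
    return { ok: false, reason: 'intervals must be positive' };
  }
  if (pingIntervalMs >= idleTimeoutMs) {
    return { ok: false, reason: 'ping interval is not shorter than the idle timeout' };
  }
  if (pingIntervalMs > idleTimeoutMs * maxRatio) {
    return { ok: false, reason: 'ping interval leaves too little headroom' };
  }
  return { ok: true, reason: 'ok' };
}
```

With the 30-second ping from the client example, `checkKeepAlive(30000, 60000)` passes under the assumed ratio, while `checkKeepAlive(60000, 60000)` fails because the ping would arrive no earlier than the timeout.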