# HAProxy Session Stuck: Diagnose and Fix Connection Issues

Stuck sessions in HAProxy can lead to resource exhaustion, unresponsive backends, and poor user experience. Sessions that don't properly terminate consume connection slots and can eventually prevent new connections from being established.

Understanding Stuck Sessions

A stuck session typically manifests as: - Connections remaining in ESTABLISHED state indefinitely - Backend servers accumulating idle connections - HAProxy showing high session counts that don't decrease - Clients experiencing timeouts while HAProxy shows active sessions

Diagnosing Stuck Sessions

Check Current Sessions

Use the HAProxy socket to inspect active sessions:

bash
echo "show sess" | socat stdio /var/run/haproxy.sock

Sample output showing stuck sessions:

bash
0x7f8c9c001230: proto=htx fe=frontend_main be=web_servers srv=web1
  age=3600s calls=2 rq[f=800000h,i=0,an=00h,rx=0s,wx=0s]
  rp[f=800000h,i=0,an=00h,rx=0s,wx=0s]
  af=0.0.0.0:80 cf=/assets/stylesheet.css
  af=192.168.1.100:54321

The age=3600s indicates a 1-hour old session - likely stuck.

Check Server State

View server statistics:

bash
echo "show stat" | socat stdio /var/run/haproxy.sock | grep web1

Look at these columns: - scur - current sessions - smax - max sessions observed - qlen - queue length

If scur is high and not decreasing, you have stuck sessions.

Examine Network Connections

From the HAProxy server:

```bash # Count connections to each backend ss -tn state established 'sport = :80' | grep 192.168.1.10 | wc -l

# View connection details ss -tnp 'sport = :80' | grep haproxy ```

From backend servers:

bash
# Check connections from HAProxy
ss -tn 'sport = :80 and src = 192.168.1.5'

Common Causes and Solutions

Cause 1: Missing or Incorrect Timeouts

The most common cause is improperly configured timeouts that allow connections to linger.

Problem configuration:

haproxy
defaults
    timeout client 1h      # Too long!
    timeout server 1h      # Too long!
    timeout connect 5s

Corrected configuration:

haproxy
defaults
    timeout connect 5s
    timeout client 30s
    timeout server 30s
    timeout http-request 10s
    timeout http-keep-alive 10s
    timeout queue 30s

Timeout recommendations: - timeout connect - How long to wait for backend connection (5-10s) - timeout client - Client inactivity timeout (30-60s for web apps) - timeout server - Server response timeout (30-60s for web apps) - timeout http-request - Time to receive full request (5-30s) - timeout http-keep-alive - Keep-alive connection timeout (1-10s)

Cause 2: Keep-Alive Connections Not Closing

HTTP keep-alive connections can remain open indefinitely.

Add keep-alive management:

haproxy
defaults
    mode http
    option http-server-close
    option http-keep-alive
    timeout http-keep-alive 10s

The option http-server-close closes the connection to the server after each request while allowing client keep-alive.

For complete control:

haproxy
defaults
    mode http
    option httpclose  # Force close after each request

Cause 3: WebSocket Connections Without Timeouts

WebSocket connections need special timeout handling.

Problem:

haproxy
backend websocket_servers
    server ws1 192.168.1.10:8080 check

WebSocket connections stay open and use default timeouts.

Solution:

haproxy
backend websocket_servers
    timeout server 1h
    timeout tunnel 1h
    server ws1 192.168.1.10:8080 check

Or use ACLs to set different timeouts:

```haproxy frontend http_front bind *:80

# Detect WebSocket upgrade acl is_websocket hdr(Upgrade) -i websocket

# Use appropriate backend use_backend websocket_servers if is_websocket default_backend web_servers

backend websocket_servers timeout server 1h timeout tunnel 1h server ws1 192.168.1.10:8080 check

backend web_servers timeout server 30s server web1 192.168.1.10:80 check ```

Cause 4: Sticky Sessions Without Expiry

Session persistence can cause sessions to stick to unhealthy servers.

Problem:

haproxy
backend web_servers
    balance roundrobin
    cookie SERVERID insert indirect
    server web1 192.168.1.10:80 check cookie s1
    server web2 192.168.1.11:80 check cookie s2

Sessions persist indefinitely.

Solution - Add cookie maxidle and maxlife:

haproxy
backend web_servers
    balance roundrobin
    cookie SERVERID insert indirect maxidle 30m maxlife 8h
    server web1 192.168.1.10:80 check cookie s1
    server web2 192.168.1.11:80 check cookie s2
  • maxidle 30m - Cookie expires after 30 minutes of inactivity
  • maxlife 8h - Cookie expires after 8 hours total

Cause 5: Client Not Properly Closing Connections

Some clients don't send FIN packets when closing.

Enable TCP keepalive:

haproxy
defaults
    option clitcpka
    timeout client-fin 1s
  • option clitcpka - Enable TCP keepalive on client side
  • timeout client-fin - Time to wait for client FIN after server closes

Clearing Stuck Sessions

Kill Specific Sessions

Using the HAProxy socket:

```bash # Show all sessions echo "show sess" | socat stdio /var/run/haproxy.sock

# Kill a specific session echo "shutdown session 0x7f8c9c001230" | socat stdio /var/run/haproxy.sock ```

Kill All Sessions on a Server

```bash # Put server in maintenance (drains sessions) echo "set server web_servers/web1 state maint" | socat stdio /var/run/haproxy.sock

# Wait for sessions to drain sleep 30

# Bring server back echo "set server web_servers/web1 state ready" | socat stdio /var/run/haproxy.sock ```

Clear Sessions by Backend

bash
# Shutdown sessions for a specific backend
echo "shutdown sessions server web_servers/web1" | socat stdio /var/run/haproxy.sock

Connection Draining Best Practices

Graceful Server Shutdown

When removing a server from rotation:

```bash # 1. Put server in draining mode echo "set server web_servers/web1 state drain" | socat stdio /var/run/haproxy.sock

# 2. Wait for existing sessions to complete while [ $(echo "show stat" | socat stdio /var/run/haproxy.sock | grep web1 | awk -F',' '{print $5}') -gt 0 ]; do echo "Waiting for sessions to drain..." sleep 5 done

# 3. Put in maintenance echo "set server web_servers/web1 state maint" | socat stdio /var/run/haproxy.sock ```

Configure Graceful Shutdown in HAProxy

```haproxy backend web_servers # Enable graceful shutdown option http-server-close

# Don't accept new connections when shutting down server web1 192.168.1.10:80 check on-error mark-down ```

Monitoring and Alerting

Track Stuck Sessions

Create a monitoring script:

```bash #!/bin/bash # /usr/local/bin/check-stuck-sessions.sh

STUCK_THRESHOLD=300 # 5 minutes

# Get sessions older than threshold STUCK=$(echo "show sess" | socat stdio /var/run/haproxy.sock | \ grep -oE "age=[0-9]+s" | \ awk -F'[=s]' '$2 > 300 {count++} END {print count+0}')

if [ "$STUCK" -gt 10 ]; then echo "ALERT: $STUCK stuck sessions detected (older than ${STUCK_THRESHOLD}s)" exit 2 fi

echo "OK: No stuck sessions detected" exit 0 ```

Track Session Age Distribution

bash
#!/bin/bash
# Show session age distribution
echo "show sess" | socat stdio /var/run/haproxy.sock | \
    grep -oE "age=[0-9]+s" | \
    awk -F'[=s]' '
{
    if ($2 < 60) bucket["0-1min"]++
    else if ($2 < 300) bucket["1-5min"]++
    else if ($2 < 600) bucket["5-10min"]++
    else bucket[">10min"]++
}
END {
    for (b in bucket) print b": "bucket[b]
}'

Verification

After applying fixes, verify the changes:

```bash # Watch session counts in real-time watch -n 1 'echo "show stat" | socat stdio /var/run/haproxy.sock | grep -E "pxname|web_servers"'

# Monitor connection states watch -n 1 'ss -tn state established "sport = :80" | wc -l'

# Check for sessions older than 5 minutes echo "show sess" | socat stdio /var/run/haproxy.sock | grep -E "age=[3-9][0-9]{2}s|age=[0-9]{4,}s" ```

Sessions should now properly close after the configured timeouts, and scur should decrease as clients complete their requests.