Introduction
HAProxy health checks are critical for ensuring traffic only flows to healthy backend servers. When health checks fail incorrectly, healthy servers are removed from the pool causing unnecessary capacity reduction or complete outages. Health check failures can result from endpoint misconfiguration, timing issues, response mismatches, or network problems.
Symptoms
Error messages in HAProxy logs:
Health check for server web_servers/web1 failed
Check: tcp://10.0.0.1:8080 result: Layer7 wrong status
Check: tcp://10.0.0.1:8080 result: Layer4 timeout
http-check expect status 200 but got 404Observable indicators:
- Servers marked DOWN despite serving traffic correctly
- Intermittent flapping between UP and DOWN states
- Specific check_status values: L4TOUT, L7TOUT, L7STS, L7OK
- Manual curl to backend succeeds but health check fails
Common Causes
- 1.Wrong health check endpoint - Checking
/healthbut app uses/healthz - 2.Incorrect expected response - Expecting 200 but app returns 201
- 3.Host header missing - Application requires specific Host header
- 4.Timeout too short - Health check endpoint slower than configured timeout
- 5.Check interval too aggressive - Backend overwhelmed by health checks
- 6.SSL/TLS mismatch - HTTP check on HTTPS endpoint or vice versa
- 7.Port mismatch - Checking wrong port for health endpoint
Step-by-Step Fix
Step 1: Identify Health Check Status
```bash # Check server status echo "show stat" | socat stdio /var/run/haproxy.sock | grep -E "pxname|web"
# Look for check_status and last_chk echo "show stat" | socat stdio /var/run/haproxy.sock
# Check detailed server info echo "show servers state" | socat stdio /var/run/haproxy.sock ```
Step 2: Manually Test Health Endpoint
```bash # Test exactly what HAProxy is checking curl -v http://10.0.0.1:8080/health
# Test with Host header if required curl -v -H "Host: example.com" http://10.0.0.1:8080/health
# Test with HTTPS if needed curl -vk https://10.0.0.1:8443/health
# Check response timing curl -w "Time: %{time_total}s\n" http://10.0.0.1:8080/health ```
Step 3: Review Health Check Configuration
```haproxy backend web_servers balance roundrobin
# Option 1: Simple TCP check server web1 10.0.0.1:8080 check
# Option 2: HTTP GET check option httpchk GET /health http-check expect status 200 server web1 10.0.0.1:8080 check
# Option 3: Full HTTP check with headers option httpchk GET /health HTTP/1.1\r\nHost:\ example.com\r\nConnection:\ close http-check expect status 200 server web1 10.0.0.1:8080 check ```
Step 4: Fix Common Configuration Issues
Fix: Wrong Endpoint
``haproxy
backend web_servers
# Change from /health to correct endpoint
option httpchk GET /healthz HTTP/1.1\r\nHost:\ example.com
http-check expect status 200
Fix: Multiple Acceptable Status Codes
``haproxy
backend web_servers
option httpchk GET /health
# Accept 200-299 status codes
http-check expect status 200-399
# Or accept specific codes
http-check expect status 200 201 202
Fix: Response Body Check
``haproxy
backend web_servers
option httpchk GET /health
# Check for specific string in response body
http-check expect string "healthy"
# Or use regex
http-check expect rstatus "^[23][0-9]{2}$"
Fix: Timeout Settings
``haprorix
backend web_servers
option httpchk GET /health
http-check expect status 200
# Increase check timeout for slow endpoints
server web1 10.0.0.1:8080 check inter 10s fall 3 rise 2 timeout check 5s
Step 5: Fix SSL Health Checks
```haproxy backend web_servers balance roundrobin
# SSL health check with SNI server web1 10.0.0.1:8443 check ssl verify required ca-file /etc/ssl/certs/ca.crt sni str(backend.example.com)
# Or use HTTP check on SSL port option httpchk GET /health server web1 10.0.0.1:8443 check-ssl ssl verify none ```
Step 6: Verify the Fix
```bash # Reload configuration haproxy -c -f /etc/haproxy/haproxy.cfg systemctl reload haproxy
# Watch health check status watch -n 1 'echo "show stat" | socat stdio /var/run/haproxy.sock | grep web'
# Monitor logs tail -f /var/log/haproxy.log | grep -i health ```
Advanced Diagnosis
Debug Health Check Process
```bash # Run HAProxy in debug mode haproxy -d -f /etc/haproxy/haproxy.cfg
# Check agent health if configured echo "show sess" | socat stdio /var/run/haproxy.sock ```
External Health Check Script
backend web_servers
# Use external check script
option external-check
external-check path "/usr/local/bin:/bin:/sbin"
external-check command /usr/local/bin/health-check.sh
server web1 10.0.0.1:8080 check#!/bin/bash
# /usr/local/bin/health-check.sh
# $1 = address, $2 = port
curl -sf http://$1:$2/health && exit 0 || exit 1MySQL Health Check Example
backend mysql_servers
mode tcp
option mysql-check user haproxy_check
server mysql1 10.0.0.1:3306 checkRedis Health Check Example
backend redis_servers
mode tcp
option redis-check
server redis1 10.0.0.1:6379 checkCommon Pitfalls
- Redirect loops - Health endpoint returns 301/302 instead of 200
- Authentication required - Endpoint needs auth but check doesn't provide it
- IPv6 vs IPv4 - Health check using wrong IP family
- Wrong check port -
check-portdirective overriding server port - Maintenance mode - Server administratively disabled
- Agent check conflicts - Agent and HTTP check giving different results
Best Practices
```haproxy defaults # Global health check defaults timeout check 5s default-server inter 5s fall 3 rise 2
backend web_servers balance roundrobin
# Dedicated health endpoint option httpchk GET /health HTTP/1.1\r\nHost:\ example.com\r\nConnection:\ close http-check expect status 200-399
# Appropriate intervals based on app response time default-server inter 10s fall 3 rise 2
# Multiple servers for redundancy server web1 10.0.0.1:8080 check server web2 10.0.0.2:8080 check
# Enable observe for passive health checks server web1 10.0.0.1:8080 check observe layer7 error-limit 5 on-error mark-down ```
Related Issues
- HAProxy Backend Down
- HAProxy Connection Refused
- HAProxy SSL Handshake Failed
- AWS ALB Health Check Failing