Introduction

An Application Load Balancer (ALB) returns 503 Service Unavailable when the target group has no healthy targets to route to. Usually every backend instance has failed its health checks, the targets have been deregistered, or the ALB cannot reach the targets because of a networking problem.

Symptoms

- HTTP 503 response from the ALB DNS name
- ALB access logs show elb_status_code: 503
- Target group shows all targets as "unhealthy" or "unused"
- CloudWatch metric HealthyHostCount drops to 0
- UnHealthyHostCount metric spikes
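To confirm the access-log symptom, the entries where the load balancer itself answered 503 can be counted. The sketch below assumes the standard space-separated ALB access log layout, where field 9 is elb_status_code and field 10 is target_status_code. The function name `count_alb_503` and the log file name in the usage note are made up for illustration.

```bash
# count_alb_503: reads ALB access log lines on stdin and prints how many
# have elb_status_code (field 9) equal to 503.
# A "-" in field 10 (target_status_code) on those lines means no target
# ever sent a response, which is what a target-group-wide outage looks like.
count_alb_503() {
  awk '$9 == 503 { n++ } END { print n + 0 }'
}
```

For example, `zcat my-alb-access-log.gz | count_alb_503` would print the number of 503 responses in one (hypothetical) log file.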

Common Causes

- Health check path returns non-200 status code
- Security group on targets blocks ALB health check traffic
- Target application crashed or stopped listening
- Health check timeout too short for slow-starting applications
- NACL rules blocking traffic between ALB subnet and target subnet
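For the first cause, it helps to compare the status code the target actually returns against the matcher configured on the target group, since the matcher may accept more than a plain 200. The following is a minimal sketch, assuming the matcher is written as a single code, a comma-separated list, or a range (for example "200", "200,302", or "200-299"). The helper name `status_matches` is hypothetical and not part of any AWS tool.

```bash
# status_matches CODE MATCHER
# Returns 0 when CODE satisfies MATCHER, 1 otherwise.
status_matches() {
  code=$1
  # Split the matcher on commas, then restore the caller's IFS.
  old_ifs=$IFS; IFS=','
  set -- $2
  IFS=$old_ifs
  for part in "$@"; do
    case $part in
      *-*)
        # Range form such as 200-299: compare numerically against both ends.
        lo=${part%-*}; hi=${part#*-}
        if [ "$code" -ge "$lo" ] && [ "$code" -le "$hi" ]; then return 0; fi ;;
      *)
        # Single value such as 200: compare as a string.
        [ "$code" = "$part" ] && return 0 ;;
    esac
  done
  return 1
}
```

One possible use is to feed it the code reported by curl: `status_matches "$(curl -s -o /dev/null -w '%{http_code}' http://<target-private-ip>:<port>/health)" "200"`.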

Step-by-Step Fix

1. **Check target health status**:
   ```bash
   aws elbv2 describe-target-health --target-group-arn arn:aws:elasticloadbalancing:...
   ```
   Look for: "unhealthy", "draining", "unused", or "initial".

2. **Test health check endpoint directly**:
   ```bash
   curl -v http://<target-private-ip>:<port>/health
   ```
   Check if the response matches the expected health check matcher.

3. **Verify security groups allow health check traffic**:
   ```bash
   aws ec2 describe-security-groups --group-ids <target-sg-id>
   ```
   The target SG must allow inbound TCP on the health check port from the ALB SG.

4. **Adjust health check parameters**:
   ```bash
   aws elbv2 modify-target-group \
     --target-group-arn <arn> \
     --health-check-interval-seconds 30 \
     --health-check-timeout-seconds 10 \
     --healthy-threshold-count 3 \
     --health-check-path /healthz
   ```

5. **Check deregistration delay**:
   ```bash
   aws elbv2 modify-target-group-attributes \
     --target-group-arn <arn> \
     --attributes Key=deregistration_delay.timeout_seconds,Value=300
   ```
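When a target group has many targets, the raw describe-target-health JSON is hard to scan. The sketch below narrows it to the targets that are not healthy, together with the reason code the load balancer reports. The function names `target_health` and `list_not_healthy` are hypothetical; the `--query` and `--output text` options are standard AWS CLI features, and the sketch assumes text output is tab-separated with one target per line.

```bash
# target_health TG_ARN
# Prints one line per target: id, state, reason (tab-separated).
# Targets without a reason show "None" in text output.
target_health() {
  aws elbv2 describe-target-health \
    --target-group-arn "$1" \
    --query 'TargetHealthDescriptions[].[Target.Id,TargetHealth.State,TargetHealth.Reason]' \
    --output text
}

# list_not_healthy: reads the lines above on stdin and keeps only targets
# whose state is anything other than "healthy".
list_not_healthy() {
  awk -F '\t' '$2 != "healthy" { printf "%s %s %s\n", $1, $2, $3 }'
}
```

Running `target_health <arn> | list_not_healthy` would then list only the problem targets. Reason codes such as Target.Timeout or Target.ResponseCodeMismatch usually point back to the health check settings, while Target.NotInUse points to how the target group is attached.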

Prevention

- Set health check interval to 10-30 seconds with threshold of 3
- Use dedicated /healthz endpoint that checks all dependencies
- Configure deregistration delay of 60-300 seconds
- Monitor HealthyHostCount with CloudWatch alarms
- Implement blue-green deployments with separate target groups
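The HealthyHostCount alarm from the list above could be created along these lines. This is a sketch under assumptions: the alarm name, the 60-second period, the two evaluation periods, and the SNS topic passed as the third argument are illustrative choices, not values from this guide. The two small helpers derive the CloudWatch dimension values from the trailing part of each ARN.

```bash
# tg_dimension TG_ARN  -> targetgroup/<name>/<id>
tg_dimension() { printf '%s\n' "${1##*:}"; }

# lb_dimension LB_ARN  -> app/<name>/<id>
lb_dimension() { printf '%s\n' "${1##*:loadbalancer/}"; }

# create_healthy_host_alarm TG_ARN LB_ARN SNS_TOPIC_ARN
# Alarms when the minimum HealthyHostCount stays below 1 for two
# consecutive 60-second periods. The SNS topic is assumed to exist already.
create_healthy_host_alarm() {
  aws cloudwatch put-metric-alarm \
    --alarm-name "alb-no-healthy-hosts" \
    --namespace AWS/ApplicationELB \
    --metric-name HealthyHostCount \
    --dimensions "Name=TargetGroup,Value=$(tg_dimension "$1")" \
                 "Name=LoadBalancer,Value=$(lb_dimension "$2")" \
    --statistic Minimum \
    --period 60 \
    --evaluation-periods 2 \
    --threshold 1 \
    --comparison-operator LessThanThreshold \
    --alarm-actions "$3"
}
```

Using the Minimum statistic means a brief dip to zero healthy hosts inside a period still counts toward the alarm; a higher threshold may suit target groups that need more than one healthy host to carry the load.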