Introduction
A 503 Service Unavailable response means the server is reachable but is not ready to handle the request right now. Sometimes that is planned maintenance. Often it is unplanned overload, upstream failure, or exhausted worker capacity.
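One signal that can help separate a planned 503 from an unplanned one is the standard Retry-After response header, which a server may send with a 503 to say when to try again. The sketch below parses that header into a number of seconds. The function name `retry_after_seconds` is an assumption made for illustration. A missing header proves nothing, and an overloaded server can send one too, so treat the result as a hint rather than a verdict.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None):
    """Parse a Retry-After header value into seconds from `now`.

    The header may carry either a number of seconds or an HTTP date.
    Returns None when the header is absent or cannot be parsed.
    """
    if value is None:
        return None
    value = value.strip()
    # Form 1: delay in seconds, e.g. "120"
    if value.isascii() and value.isdigit():
        return int(value)
    # Form 2: HTTP date, e.g. "Mon, 01 Jan 2024 12:05:00 GMT"
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Assumption: treat a date without a zone as UTC
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means "retry now", so clamp at zero
    return max(0, int((when - now).total_seconds()))
```

A 503 that carries a sensible Retry-After value during a known deploy window is more likely to be deliberate. One without it, appearing outside any window, deserves the overload and upstream checks first.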
Symptoms
- Visitors see 503 Service Unavailable
- The issue affects many routes at once rather than one broken page
- Health checks or uptime monitors show intermittent recovery and relapse
- The error started during maintenance, traffic spikes, or service restarts
- Logs reference unavailable upstreams, maintenance mode, or overloaded workers
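To check whether the error really spans many routes rather than one page, a short probe can request a handful of paths and summarize the results. This is a minimal sketch using only the Python standard library. The function names, the example routes, and the "more than half" threshold are assumptions for illustration, not values from any particular monitoring tool.

```python
from urllib import error, request


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    try:
        with request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except error.HTTPError as exc:
        # urllib raises on 4xx/5xx; the status code is still available
        return exc.code
    except (error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar
        return None


def summarize_503s(statuses):
    """Given {route: status}, return (routes returning 503, widespread flag).

    Assumption: "widespread" means more than half of the probed routes.
    """
    affected = sorted(route for route, status in statuses.items() if status == 503)
    widespread = len(statuses) > 1 and len(affected) > len(statuses) / 2
    return affected, widespread
```

A possible way to use it, with a hypothetical base URL and route list:

```python
base = "https://example.com"  # hypothetical host
routes = ["/", "/login", "/api/health"]  # hypothetical paths
statuses = {route: fetch_status(base + route) for route in routes}
print(summarize_503s(statuses))
```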
Common Causes
- The application is in maintenance mode or in the middle of a deploy
- Worker processes, thread pools, or connection pools are exhausted
- The web tier cannot reach required upstream services
- Autoscaling or container scheduling left too few healthy instances
- Rate limiting or WAF rules mistakenly return 503 responses to legitimate traffic
Step-by-Step Fix
- Check whether the 503 is intentional by looking for maintenance pages, deploy windows, or explicit unavailable responses configured at the edge.
- Review application and infrastructure health to see whether workers, pods, or instances are running and marked healthy.
- Inspect CPU, memory, connection pools, and queue depth for overload signals that prevent the app from accepting more requests.
- Verify dependencies such as databases, caches, or internal APIs are reachable and not causing the app to fail readiness checks.
- If the issue follows a deploy, compare the new release with the last healthy version and inspect startup or readiness logs.
- Remove stale maintenance flags, restore healthy instances, or scale capacity only after identifying the actual bottleneck.
- Re-test the main user flows and monitor whether 503 rates drop under normal traffic.
- Confirm load balancers and health checks now see the service as ready before declaring the incident resolved.
- Keep alerting on readiness failures and worker exhaustion so future 503s are visible early.
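Because 503 incidents often recover and then relapse, a single successful request is weak evidence that the service is back. The sketch below requires a run of consecutive successful readiness probes before reporting the service as stable. The readiness URL, the thresholds, and all function names are assumptions. Adjust them to whatever endpoint and tolerance your load balancer or orchestrator actually uses.

```python
import time
from urllib import error, request


def probe_ready(url, timeout=3.0):
    """Return True if the readiness URL answers with a 2xx status."""
    try:
        with request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (error.URLError, OSError):
        # Includes HTTPError (such as 503), refused connections, and timeouts
        return False


def is_stably_ready(results, required_successes=5):
    """True only if the most recent `required_successes` probes all passed."""
    recent = results[-required_successes:]
    return len(recent) == required_successes and all(recent)


def wait_until_stable(url, required_successes=5, interval=2.0, max_probes=60,
                      probe=probe_ready, sleep=time.sleep):
    """Probe repeatedly; return True once the service looks stably ready.

    Gives up and returns False after `max_probes` attempts.
    `probe` and `sleep` are injectable so the loop can be tested offline.
    """
    results = []
    for _ in range(max_probes):
        results.append(probe(url))
        if is_stably_ready(results, required_successes):
            return True
        sleep(interval)
    return False
```

Requiring consecutive successes rather than a success ratio is a deliberate choice here: any failure resets the streak, which matches the recover-and-relapse pattern described in the symptoms.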