## Introduction

When Prometheus targets show as "DOWN", metrics are not being collected, creating blind spots in monitoring. This can mask real issues and cause alerting gaps.

## Symptoms

- Prometheus UI shows target status "DOWN" with a "Last Error" message
- Error: "context deadline exceeded" (scrape timeout)
- Error: "connection refused"
- Error: "server returned HTTP status 401 Unauthorized"
- Metrics gaps in Grafana dashboards

## Common Causes

- Target application crashed or restarted
- Network policy blocking traffic from Prometheus to the target
- TLS certificate expired on the target's metrics endpoint
- Scrape timeout too short for a slow metrics endpoint
- Service discovery returning stale targets
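
As a rough triage aid, the error strings listed under Symptoms can be matched to the causes above. The sketch below is only an illustration: `likely_cause` is a hypothetical helper, not part of Prometheus, and the `x509`, `no route to host`, and `i/o timeout` patterns are assumptions about typical error text that do not appear in the lists above. A single error string can have more than one cause (a blocked network path may also surface as a timeout), so treat the output as a starting point.

```bash
#!/usr/bin/env bash
# Hypothetical helper: suggest a likely cause for a "Last Error" string
# copied from the Prometheus Targets page. Patterns are checked in order.
likely_cause() {
  case "$1" in
    *"context deadline exceeded"*)
      echo "scrape timed out: slow metrics endpoint or scrape_timeout too short" ;;
    *"connection refused"*)
      echo "nothing listening: target crashed, restarting, or stale in service discovery" ;;
    *"401"*)
      echo "authentication failed: check credentials in the scrape config" ;;
    *"x509"*|*"certificate"*)   # assumed wording for TLS failures
      echo "TLS problem: check certificate expiry on the metrics endpoint" ;;
    *"no route to host"*|*"i/o timeout"*)   # assumed wording for network failures
      echo "network path blocked: check network policies and firewalls" ;;
    *)
      echo "unrecognized error: test the endpoint manually" ;;
  esac
}

# Usage: likely_cause 'server returned HTTP status 401 Unauthorized'
```

Keeping the patterns in one `case` statement makes it easy to add the error strings that are specific to your environment.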

## Step-by-Step Fix

1. **Check target status in Prometheus UI**: Navigate to Status > Targets and look at the "Last Error" column.

2. **Test the metrics endpoint manually**:

   ```bash
   curl -v http://<target-ip>:<port>/metrics
   # For TLS targets (-k skips certificate verification)
   curl -vk https://<target-ip>:<port>/metrics
   ```

3. **Check the Prometheus scrape configuration**:

   ```yaml
   scrape_configs:
     - job_name: 'my-app'
       scrape_interval: 15s
       scrape_timeout: 10s   # must not exceed scrape_interval
       static_configs:
         - targets: ['my-app:8080']
   ```

4. **Check service discovery**: Navigate to Status > Service Discovery in the Prometheus UI to see the discovered targets.
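
The target-status check and the manual endpoint test can also be scripted, which helps when many targets are down at once. The following is a minimal sketch, not a definitive tool. It assumes Prometheus is reachable at `PROM_URL` (defaulting to `http://localhost:9090`, an assumption) and that `curl`, `jq`, and `awk` are installed. The function names are hypothetical. It reads the `/api/v1/targets` endpoint of the Prometheus HTTP API and uses curl's `%{time_total}` write-out variable to time a manual scrape.

```bash
#!/usr/bin/env bash
# Sketch only: PROM_URL and all function names are assumptions for illustration.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# Read the JSON body of GET /api/v1/targets on stdin and print one
# tab-separated line (job, scrape URL, last error) per target that is not "up".
filter_down_targets() {
  jq -r '.data.activeTargets[]
         | select(.health != "up")
         | [.labels.job, .scrapeUrl, .lastError]
         | @tsv'
}

# Fetch the active targets from Prometheus and show only the unhealthy ones.
list_down_targets() {
  curl -sS "${PROM_URL}/api/v1/targets?state=active" | filter_down_targets
}

# Succeed (exit 0) if the duration in $1 is greater than the timeout in $2.
# Both values are in seconds; awk handles the decimal comparison.
exceeds_timeout() {
  awk -v t="$1" -v limit="$2" 'BEGIN { exit !(t > limit) }'
}

# Time one manual scrape of a metrics URL and compare it with scrape_timeout
# (second argument, default 10 seconds to match the example configuration).
check_scrape_time() {
  local url="$1" timeout="${2:-10}" elapsed
  elapsed="$(curl -sS -o /dev/null -w '%{time_total}' --max-time 60 "$url")" || return 2
  if exceeds_timeout "$elapsed" "$timeout"; then
    echo "SLOW: ${url} took ${elapsed}s (scrape_timeout ${timeout}s)"
  else
    echo "OK: ${url} took ${elapsed}s (scrape_timeout ${timeout}s)"
  fi
}

# Usage:
#   list_down_targets
#   check_scrape_time "http://my-app:8080/metrics" 10
```

A `SLOW` result suggests raising `scrape_timeout` (and `scrape_interval` if needed) or making the endpoint cheaper to serve. A curl failure points back to the connection, TLS, or authentication causes listed earlier.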