Introduction

When Alertmanager fails to deliver notifications, or routes them to the wrong receiver, incidents go undetected. This is one of the most dangerous monitoring failures because it creates a false sense of security: the alert fires in Prometheus, but nobody is told.

Symptoms

- Alert firing in Prometheus but no notification received
- Alertmanager UI shows alerts as "Silenced" or "Inhibited"
- Notification routed to wrong receiver (wrong Slack channel)
- PagerDuty not creating incidents for critical alerts
- Alertmanager logs show "notification unsuccessful"

Common Causes

- Silence accidentally created and not expired
- Inhibition rules suppressing alerts
- Route configuration sending alerts to wrong receiver
- Notification provider credentials expired
- Alertmanager unable to connect to notification endpoint
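For the routing cause in particular, `amtool config routes test` can resolve a set of labels against a configuration file without sending anything. The sketch below wraps it in a hypothetical helper, `check_route`, which is not part of amtool. It assumes amtool prints the matched receiver name on a single line (several receivers may appear comma-separated when `continue: true` is in play), and the file name and receiver names are placeholders.

```bash
#!/usr/bin/env bash
# Hypothetical helper: verify that a label set routes to the expected receiver.
# Usage: check_route <config-file> <expected-receiver> <label=value>...
check_route() {
  local config="$1" expected="$2"
  shift 2
  local actual
  # Assumption: amtool prints the receiver(s) the labels resolve to on one line.
  actual="$(amtool config routes test --config.file="$config" "$@")" || return 2
  if [ "$actual" = "$expected" ]; then
    echo "OK: $* -> $actual"
  else
    echo "MISMATCH: $* -> $actual (expected $expected)"
    return 1
  fi
}
```

For example, `check_route alertmanager.yml pagerduty severity=critical` would return non-zero if critical alerts resolve to some other receiver. amtool also offers a `--verify.receivers` flag on the same subcommand, which may be enough on its own if the readable message is not needed.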

Step-by-Step Fix

1. **Check active silences**:
   ```bash
   amtool silence query
   # Expire a silence
   amtool silence expire <silence-id>
   ```

2. **Check inhibition rules**: Review `alertmanager.yml` for `inhibit_rules` entries that may be suppressing alerts.
3. **Test notification receiver**:
   ```bash
   # Send test alert
   curl -X POST http://alertmanager:9093/api/v2/alerts \
     -H 'Content-Type: application/json' \
     -d '[{"labels":{"alertname":"TestAlert","severity":"critical"},"annotations":{"summary":"Test"}}]'
   ```
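If the test alert is sent often, the curl call can be wrapped so that the label values are parameters and a non-2xx response is treated as a failure. This is a minimal sketch: the function names `build_test_alert` and `send_test_alert` and the `AM_URL` variable are hypothetical, and the default URL is simply the host used in the step above.

```bash
#!/usr/bin/env bash
# Assumed default; override with AM_URL=... to match your environment.
AM_URL="${AM_URL:-http://alertmanager:9093}"

# Build a one-alert JSON payload for the v2 alerts endpoint.
# Arguments: alertname, severity (values must not contain quotes or backslashes).
build_test_alert() {
  printf '[{"labels":{"alertname":"%s","severity":"%s"},"annotations":{"summary":"Test"}}]' "$1" "$2"
}

# POST the payload; --fail makes curl exit non-zero on an HTTP error status.
send_test_alert() {
  build_test_alert "$1" "$2" |
    curl --fail --silent --show-error -X POST "$AM_URL/api/v2/alerts" \
      -H 'Content-Type: application/json' \
      --data-binary @-
}
```

Running `send_test_alert TestAlert critical` posts the same payload as the manual command. A successful POST only means Alertmanager accepted the alert; the notification itself may arrive after the route's `group_wait` delay, so confirm delivery in the receiver (Slack, PagerDuty) rather than from the exit code alone.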

Prevention

- Require approval for creating silences in production
- Set expiration on all silences (never use permanent)
- Test notification routing regularly with test alerts
- Monitor Alertmanager notification success rate
- Use multiple notification channels for critical alerts
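One way to watch the notification success rate is to read Alertmanager's own counters, `alertmanager_notifications_total` and `alertmanager_notifications_failed_total`, from its `/metrics` endpoint. The sketch below is a quick spot check, not a replacement for a proper alerting rule. The function name `notification_failure_ratio` is hypothetical, and it assumes both counters carry an `integration` label.

```bash
#!/usr/bin/env bash
# Read Prometheus text-format metrics on stdin and print, per integration,
# how many notification attempts failed out of the total.
notification_failure_ratio() {
  awk '
    # Extract the value of the integration="..." label from a metric line.
    function integration(line) {
      if (match(line, /integration="[^"]*"/))
        return substr(line, RSTART + 13, RLENGTH - 14)
      return "unknown"
    }
    # Sum across any extra labels (for example a failure reason).
    /^alertmanager_notifications_total[{]/        { total[integration($0)]  += $NF }
    /^alertmanager_notifications_failed_total[{]/ { failed[integration($0)] += $NF }
    END {
      for (k in total) {
        ratio = (total[k] > 0) ? failed[k] / total[k] : 0
        printf "%s failed=%d total=%d ratio=%.2f\n", k, failed[k], total[k], ratio
      }
    }
  ' | sort
}
```

A typical invocation would be `curl -s http://alertmanager:9093/metrics | notification_failure_ratio`. The counters are cumulative since the process started, so this shows a lifetime ratio; for ongoing monitoring, a Prometheus rule over the rate of the same two metrics is likely the better fit.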