## Introduction

False positive alerts from synthetic monitoring erode trust in the alerting system and cause alert fatigue. When checks fail because of transient issues rather than real problems, engineers start ignoring alerts.
## Symptoms

- Alerts firing and resolving within minutes
- Alert correlates with known network blips
- Same check fails from one location but passes from others
- HTTP check fails with timeout but service is healthy
- Alert rate too high for the actual incident rate
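One rough way to quantify the first symptom is to measure how many alerts resolve within a few minutes of firing. The sketch below assumes the alert history can be exported as `(fired_at, resolved_at)` timestamp pairs; the function name `flapping_ratio`, the 5-minute threshold, and the sample data are all hypothetical and not tied to any particular monitoring product.

```python
from datetime import datetime, timedelta


def flapping_ratio(alerts, max_duration=timedelta(minutes=5)):
    """Return the fraction of alerts that resolved within max_duration of firing."""
    if not alerts:
        return 0.0
    short_lived = sum(
        1 for fired_at, resolved_at in alerts
        if resolved_at - fired_at <= max_duration
    )
    return short_lived / len(alerts)


# Hypothetical alert history: (fired_at, resolved_at) pairs.
history = [
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 2)),   # 2 minutes
    (datetime(2024, 1, 1, 12, 0), datetime(2024, 1, 1, 12, 3)),   # 3 minutes
    (datetime(2024, 1, 1, 15, 0), datetime(2024, 1, 1, 15, 45)),  # 45 minutes
    (datetime(2024, 1, 1, 20, 0), datetime(2024, 1, 1, 20, 1)),   # 1 minute
]

print(flapping_ratio(history))  # 0.75
```

A ratio this high suggests that most alerts are short-lived and worth investigating as possible false positives, though the right threshold depends on the service.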
## Common Causes

- Check timeout too aggressive
- Single location check hitting network issues
- DNS resolution intermittent
- SSL certificate check with wrong validation
- Check not accounting for CDN cache miss latency
## Step-by-Step Fix

1. **Adjust check timeout and retries**: Increase the timeout from 10s to 30s and add a retry count.
2. **Use multiple check locations**: Configure checks from at least 3 geographic locations. Alert only when 2+ locations fail.
3. **Add alert deduplication and cooldown**: Configure an alert cooldown period (e.g., 15 minutes) before re-firing.
4. **Review check configuration**:
```yaml
# Example synthetic check config
http:
  url: https://api.example.com/health
  method: GET
  timeout: 30s
  retries: 2
  locations:
    - us-east-1
    - eu-west-1
    - ap-southeast-1
  assertion:
    - type: statusCode
      value: 200
    - type: responseTime
      value: 5000 # 5 seconds
```
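
The retry, multi-location, and cooldown steps can also be sketched as plain logic, independent of any monitoring product. The following is a minimal illustration, not a real tool's API: the names `run_with_retries`, `quorum_failed`, and `CooldownGate` are hypothetical, and it assumes each location reports a simple pass/fail result.

```python
import time


def run_with_retries(probe, retries=2):
    """Call probe() up to 1 + retries times; return True on the first success."""
    for _ in range(1 + retries):
        if probe():
            return True
    return False


def quorum_failed(results, min_failures=2):
    """results maps location -> bool (True means the check passed).

    Returns True only when at least min_failures locations failed.
    """
    failures = sum(1 for passed in results.values() if not passed)
    return failures >= min_failures


class CooldownGate:
    """Suppress re-firing until cooldown_seconds have elapsed since the last alert."""

    def __init__(self, cooldown_seconds=15 * 60, clock=time.monotonic):
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock  # injectable so the gate can be tested without waiting
        self._last_fired = None

    def should_fire(self):
        now = self.clock()
        if (self._last_fired is not None
                and now - self._last_fired < self.cooldown_seconds):
            return False
        self._last_fired = now
        return True


# Hypothetical results: one location fails, two pass.
results = {"us-east-1": False, "eu-west-1": True, "ap-southeast-1": True}
gate = CooldownGate()
if quorum_failed(results) and gate.should_fire():
    print("ALERT")
else:
    print("no alert")  # printed here: only one location failed
```

Requiring agreement between locations filters out single-region network issues, while the cooldown gate keeps a persisting failure from re-alerting every check interval.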