## Introduction

CloudWatch alarms that do not trigger despite threshold breaches create dangerous monitoring gaps. Typical causes are missing metrics, incorrect dimensions, or alarm configuration errors.
## Symptoms

- Alarm state shows "INSUFFICIENT_DATA" persistently
- Metric value exceeds threshold but alarm stays "OK"
- Alarm never transitions to "ALARM" state
- SNS notification not sent for known issues
- Custom metrics not appearing in CloudWatch
## Common Causes

- Metric not being published to CloudWatch
- Wrong namespace or dimensions in alarm configuration
- Evaluation period too long for transient issues
- Metric resolution (1-min vs 5-min) mismatch
- Cross-region alarm referencing wrong region metrics
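
To make the "evaluation period too long" cause concrete, here is a minimal sketch of the arithmetic: an alarm only reaches "ALARM" once enough consecutive datapoints breach, so the breach has to last roughly the period multiplied by the number of datapoints required. The function name `min_breach_seconds` is a hypothetical helper, not part of the AWS CLI, and it assumes the simple case where every evaluated datapoint must breach.

```bash
#!/usr/bin/env bash
# Hypothetical helper: approximate how long a breach must persist (in seconds)
# before an alarm can move to ALARM.
#   $1 = alarm Period in seconds
#   $2 = number of breaching datapoints required (EvaluationPeriods in the simple case)
min_breach_seconds() {
  local period_seconds=$1
  local datapoints_required=$2
  echo $(( period_seconds * datapoints_required ))
}
```

For example, `min_breach_seconds 300 3` prints `900`: with a 5-minute period and three evaluation periods, a spike shorter than about 15 minutes is unlikely to trigger the alarm.
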
## Step-by-Step Fix

1. **Check if metric exists:**

   ```bash
   aws cloudwatch list-metrics --namespace AWS/EC2 --metric-name CPUUtilization \
     --dimensions Name=InstanceId,Value=i-1234567890
   ```

2. **Check alarm configuration:**

   ```bash
   aws cloudwatch describe-alarms --alarm-names MyAlarm
   # Check: Namespace, Dimensions, Threshold, Period, EvaluationPeriods
   ```

3. **Verify metric data points exist:**

   ```bash
   aws cloudwatch get-metric-statistics \
     --namespace AWS/EC2 --metric-name CPUUtilization \
     --dimensions Name=InstanceId,Value=i-1234567890 \
     --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
     --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
     --period 300 --statistics Average
   ```
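
The checks in the steps above can be wrapped in a few small functions so they are easy to rerun. This is a sketch under stated assumptions, not a definitive script: the function names `alarm_summary`, `count_datapoints`, and `diagnose_datapoints` are hypothetical, the metric is fixed to `AWS/EC2` `CPUUtilization` as in the examples, and `date -d` assumes GNU `date` (as the commands above already do). The `--query` expressions use the AWS CLI's built-in JMESPath filtering.

```bash
#!/usr/bin/env bash
# Hypothetical helpers built on the AWS CLI commands used in the steps above.

# Print only the alarm fields that must match the metric being published.
alarm_summary() {
  aws cloudwatch describe-alarms --alarm-names "$1" \
    --query 'MetricAlarms[0].{Namespace:Namespace,Metric:MetricName,Dimensions:Dimensions,Period:Period,EvaluationPeriods:EvaluationPeriods,Threshold:Threshold,State:StateValue}' \
    --output json
}

# Count the CPUUtilization datapoints stored for one instance over the last hour.
count_datapoints() {
  local instance_id=$1
  aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 --metric-name CPUUtilization \
    --dimensions "Name=InstanceId,Value=${instance_id}" \
    --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
    --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    --period 300 --statistics Average \
    --query 'length(Datapoints)' --output text
}

# Interpret the count: with zero datapoints the alarm has nothing to evaluate.
# Returns 0 if data exists, 1 if none was found, 2 if the CLI call failed.
diagnose_datapoints() {
  local count
  count=$(count_datapoints "$1") || return 2
  if [ "$count" -eq 0 ]; then
    echo "no datapoints: check namespace, dimensions, and region"
    return 1
  fi
  echo "${count} datapoints found: compare their values to the alarm threshold"
}
```

With credentials configured, `alarm_summary MyAlarm` followed by `diagnose_datapoints i-1234567890` gives a quick side-by-side view: if the second command reports no datapoints, the alarm's namespace, dimensions, or region most likely do not match what is actually being published.
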