Fix AWS ECR Image Pull Failures Caused by Lifecycle Policies

Introduction

When an ECR lifecycle policy deletes images that are still referenced by ECS task definitions, Kubernetes deployments, or Docker Compose files, subsequent image pulls fail with ImagePullBackOff or ManifestUnknown errors. This commonly happens when rules that expire untagged images, or that keep only the newest N images ("image count more than N"), are too aggressive.

Symptoms

ECS task fails with: CannotPullContainerError: failed to resolve reference "account.dkr.ecr.region.amazonaws.com/repo:tag": manifest unknown
Kubernetes pod shows ErrImagePull with manifest unknown or image not found
ECR describe-images returns no results for the expected tag
CloudTrail shows BatchDeleteImage events (manual or API deletions) or PolicyExecutionEvent events (lifecycle policy expirations) around the time of the failure
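
To confirm the missing-tag symptom, you can ask ECR directly whether the tag still resolves. The sketch below wraps `aws ecr describe-images` in a hypothetical helper, `image_exists`. The repository and tag names are placeholders, and it assumes the AWS CLI is already configured for the right account and region.

```bash
# Hypothetical helper: report whether a tag still exists in an ECR repository.
# Usage: image_exists <repository-name> <tag>
image_exists() {
  # describe-images exits non-zero (ImageNotFoundException) when the tag is gone.
  # Any other failure, such as missing credentials, is also reported as "missing",
  # so rerun the command without the redirection if the result is surprising.
  if aws ecr describe-images \
      --repository-name "$1" \
      --image-ids imageTag="$2" \
      --query 'imageDetails[0].imageDigest' \
      --output text >/dev/null 2>&1; then
    echo "present"
  else
    echo "missing"
  fi
}
```

For example, `image_exists my-repo prod-1.4.2` prints `missing` when the tag can no longer be resolved.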

Common Causes

Count-based lifecycle rule (e.g., "image count more than 10") that is not limited to specific tags, so it expires tagged images that are still deployed
"Untagged" image rule removes images that lost their tag during redeployment
Multiple lifecycle rules with conflicting priorities
CI/CD pipeline reuses tags (e.g., "latest"), causing older images to become untagged
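
To make the count-based cause concrete, here is a sketch of the kind of policy that tends to trigger the problem. The file name `risky-policy.json` and the threshold are illustrative assumptions, not taken from a real repository. Because the rule uses "tagStatus": "any", it keeps only the 10 most recently pushed images regardless of their tags.

```bash
# Write an example of an overly aggressive policy to a local file for inspection.
# Nothing is sent to AWS here; the file name is a placeholder.
cat > risky-policy.json <<'EOF'
{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Keep only the 10 newest images",
      "selection": {
        "tagStatus": "any",
        "countType": "imageCountMoreThan",
        "countNumber": 10
      },
      "action": { "type": "expire" }
    }
  ]
}
EOF
```

If a busy pipeline pushes more than 10 images between production releases, a rule like this would expire the image that production still runs.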

Step-by-Step Fix

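1. Identify deleted images in CloudTrail:
   ```bash
   # BatchDeleteImage records manual or API deletions. Lifecycle policy expirations
   # are logged as PolicyExecutionEvent, so repeat the lookup with that value too.
   # date -d requires GNU date.
   aws cloudtrail lookup-events \
     --lookup-attributes AttributeKey=EventName,AttributeValue=BatchDeleteImage \
     --start-time "$(date -d '24 hours ago' +%s)" \
     --query 'Events[*].{Time:EventTime,User:Username}'
   ```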
2. Check the current lifecycle policy:
   ```bash
   aws ecr get-lifecycle-policy --repository-name my-repo
   ```
   Look for rules with "selection": {"tagStatus": "untagged"}, or with "countType": "imageCountMoreThan" that are not restricted to specific tag prefixes.
3. Update the lifecycle policy so that count-based rules apply only to specific tag prefixes. A tagPrefixList with several prefixes matches only images that carry all of them, so prod and staging each get their own rule:
   ```bash
   aws ecr put-lifecycle-policy \
     --repository-name my-repo \
     --lifecycle-policy-text '{
       "rules": [
         {
           "rulePriority": 1,
           "description": "Keep last 20 prod images",
           "selection": {
             "tagStatus": "tagged",
             "tagPrefixList": ["prod"],
             "countType": "imageCountMoreThan",
             "countNumber": 20
           },
           "action": {"type": "expire"}
         },
         {
           "rulePriority": 2,
           "description": "Keep last 20 staging images",
           "selection": {
             "tagStatus": "tagged",
             "tagPrefixList": ["staging"],
             "countType": "imageCountMoreThan",
             "countNumber": 20
           },
           "action": {"type": "expire"}
         },
         {
           "rulePriority": 3,
           "description": "Delete untagged images older than 7 days",
           "selection": {
             "tagStatus": "untagged",
             "countType": "sinceImagePushed",
             "countUnit": "days",
             "countNumber": 7
           },
           "action": {"type": "expire"}
         }
       ]
     }'
   ```
4. Rebuild and push the missing image:
   ```bash
   docker build -t account.dkr.ecr.region.amazonaws.com/repo:tag .
   aws ecr get-login-password | docker login --username AWS --password-stdin account.dkr.ecr.region.amazonaws.com
   docker push account.dkr.ecr.region.amazonaws.com/repo:tag
   ```
5. Restart affected services:
   ```bash
   aws ecs update-service --cluster my-cluster --service my-service --force-new-deployment
   ```
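
Before a changed policy goes live, it can help to dry-run it. ECR offers a lifecycle policy preview that lists the images a policy would expire without deleting anything. The sketch below wraps the preview calls in a hypothetical helper, `preview_policy`. It assumes the policy JSON has been saved to a local file and that the AWS CLI is configured. The repository and file names are placeholders.

```bash
# Hypothetical helper: dry-run a lifecycle policy and list what it would expire.
# Usage: preview_policy <repository-name> <policy-file>
preview_policy() {
  # Start the preview; this call does not delete any images.
  aws ecr start-lifecycle-policy-preview \
    --repository-name "$1" \
    --lifecycle-policy-text "file://$2" >/dev/null || return 1

  # The preview runs asynchronously, so block until it reports completion.
  aws ecr wait lifecycle-policy-preview-complete \
    --repository-name "$1" || return 1

  # Show the tags, digest, and push time of every image the policy would expire.
  aws ecr get-lifecycle-policy-preview \
    --repository-name "$1" \
    --query 'previewResults[].{Tags:imageTags,Digest:imageDigest,Pushed:imagePushedAt}' \
    --output table
}
```

Running `preview_policy my-repo policy.json` and checking that no deployed tag appears in the table is a cheap safeguard before calling `put-lifecycle-policy`.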

Prevention

Always use immutable tags (include commit SHA or build number)
Add tagPrefixList to protect production and staging image tags
Set minimum image count thresholds above your rollback window needs
Monitor lifecycle policy changes, for example by alerting on PutLifecyclePolicy events in CloudTrail
Use imageDigest in task definitions instead of mutable tags
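
Several of these practices can be scripted in a CI/CD pipeline. The sketch below is one possible shape, not a prescribed setup. The helper names (`make_tag`, `enable_immutable_tags`, `image_digest`) are hypothetical, and the "prefix-SHA" tag format is an assumption chosen so that tagPrefixList rules for "prod" and "staging" still match.

```bash
# Build a unique tag from an environment prefix and a commit SHA, e.g. prod-3f9c2ab.
# Usage: make_tag <prefix> <commit-sha>
make_tag() {
  printf '%s-%s\n' "$1" "$2"
}

# Make the repository reject pushes that would overwrite an existing tag.
# Usage: enable_immutable_tags <repository-name>
enable_immutable_tags() {
  aws ecr put-image-tag-mutability \
    --repository-name "$1" \
    --image-tag-mutability IMMUTABLE
}

# Print the digest behind a tag so a task definition can reference repo@sha256:...
# Usage: image_digest <repository-name> <tag>
image_digest() {
  aws ecr describe-images \
    --repository-name "$1" \
    --image-ids imageTag="$2" \
    --query 'imageDetails[0].imageDigest' \
    --output text
}
```

In a pipeline this might be used as `tag=$(make_tag prod "$(git rev-parse --short HEAD)")`, followed by the build and push steps with that tag.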
