Introduction

ECS Exec depends on AWS Systems Manager. A task can be healthy and serving traffic while aws ecs execute-command still fails because exec is disabled, the task lacks SSM connectivity, or the required IAM permissions are missing.

Symptoms

  • aws ecs execute-command returns TargetNotConnectedException, or an InvalidParameterException reporting that execute command was not enabled or the exec agent is not running
  • The ECS service is healthy, but shell access into the task never starts
  • Only some tasks in the same service support exec while others do not
  • The issue begins after a new task definition revision or subnet change

Common Causes

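  • ECS Exec was not enabled on the service, or on the standalone task when it was launched with run-task (it is a launch-time flag, not a task definition setting)
  • The operator lacks ecs:ExecuteCommand, or the task role lacks the required ssmmessages permissions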
  • Private subnets cannot reach SSM endpoints through NAT or VPC endpoints
  • Tasks were started before ECS Exec was enabled on the service, so they never received the exec agent

Step-by-Step Fix

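  1. Confirm exec is enabled for the service and task
     Check the service flag and the task's own flag first so you do not debug IAM for a feature that was never enabled. The service flag only applies to tasks launched after it was set, so a task can report false while the service reports true. On an exec-enabled task, a managed agent named ExecuteCommandAgent with lastStatus RUNNING indicates the agent has connected.
bash
# Service-level flag, then the task's own flag and its exec agent status
aws ecs describe-services --cluster my-cluster --services my-service --query 'services[0].enableExecuteCommand'
aws ecs describe-tasks --cluster my-cluster --tasks <task-id> --query 'tasks[0].enableExecuteCommand'
aws ecs describe-tasks --cluster my-cluster --tasks <task-id> --query 'tasks[0].containers[].managedAgents'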
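When only some tasks in a service accept exec, it can help to compare the flag across every running task at once. The following is a minimal sketch, not a definitive tool: the function name list_exec_flags is hypothetical, and it assumes the service has few enough running tasks to fit in a single describe-tasks call (at most 100).

```shell
# Sketch: print each running task's ARN next to its enableExecuteCommand flag.
# list_exec_flags is a hypothetical helper name, not part of the AWS CLI.
list_exec_flags() {
  local cluster="$1"
  local service="$2"
  local task_arns

  # list-tasks returns the ARNs of the service's tasks as whitespace-separated text.
  task_arns=$(aws ecs list-tasks --cluster "$cluster" --service-name "$service" \
    --query 'taskArns' --output text)

  # An empty result (or the literal "None") means there is nothing to inspect.
  if [ -z "$task_arns" ] || [ "$task_arns" = "None" ]; then
    echo "no running tasks found for $service" >&2
    return 1
  fi

  # Word splitting is intentional here: each ARN becomes its own argument.
  # shellcheck disable=SC2086
  aws ecs describe-tasks --cluster "$cluster" --tasks $task_arns \
    --query 'tasks[].[taskArn, enableExecuteCommand]' --output text
}
```

Calling list_exec_flags my-cluster my-service prints one line per task. Tasks that show the flag as false were launched before exec was enabled and will not accept sessions until they are replaced.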
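  2. Review operator and task IAM permissions
     The caller needs ecs:ExecuteCommand on the task, and the task role (not the task execution role) needs the ssmmessages permissions that the exec agent uses to open its channels.
bash
# Identify the caller, then inspect the task role named in the task definition's taskRoleArn
aws sts get-caller-identity
aws iam list-attached-role-policies --role-name <task-role-name>
aws iam list-role-policies --role-name <task-role-name>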
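If the task role turns out to be missing the exec permissions, the four ssmmessages actions below are the ones the exec agent needs. This is a sketch under stated assumptions: the function names and the inline policy name ecs-exec-ssmmessages are hypothetical, and your organization may prefer a managed policy or a narrower Resource.

```shell
# Sketch: print a minimal task-role policy for ECS Exec.
# Function names and the policy name are illustrative assumptions.
ecs_exec_task_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ssmmessages:CreateControlChannel",
        "ssmmessages:CreateDataChannel",
        "ssmmessages:OpenControlChannel",
        "ssmmessages:OpenDataChannel"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}

# Attach the policy inline to the task role passed as the first argument.
attach_ecs_exec_policy() {
  local role_name="$1"
  aws iam put-role-policy \
    --role-name "$role_name" \
    --policy-name ecs-exec-ssmmessages \
    --policy-document "$(ecs_exec_task_policy)"
}
```

Run ecs_exec_task_policy on its own to review the document before calling attach_ecs_exec_policy with the task role name. The operator side is separate: the identity running aws ecs execute-command needs ecs:ExecuteCommand.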
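  3. Validate network reachability to SSM endpoints
     Tasks in private subnets need either outbound internet access through NAT or an interface VPC endpoint for ssmmessages (com.amazonaws.<region>.ssmmessages), the service the exec agent connects to. Endpoints for ssm and ec2messages are often created alongside it for SSM Agent on EC2 hosts, but the exec channel itself runs over ssmmessages. The endpoint's security group must allow inbound HTTPS (443) from the tasks.
bash
aws ec2 describe-vpc-endpoints --filters Name=vpc-id,Values=<vpc-id> --query 'VpcEndpoints[].[ServiceName,State]'
aws ec2 describe-route-tables --route-table-ids <rtb-id> --query 'RouteTables[].Routes'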
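For VPCs without NAT, the endpoint check can be narrowed to the ssmmessages service that the exec session depends on. The sketch below uses a hypothetical function name, has_ssmmessages_endpoint, and assumes the standard com.amazonaws.<region>.ssmmessages service name. It only confirms that an available endpoint exists; it does not verify security groups or private DNS.

```shell
# Sketch: succeed if the VPC has an available interface endpoint for ssmmessages.
# has_ssmmessages_endpoint is a hypothetical helper name.
has_ssmmessages_endpoint() {
  local vpc_id="$1"
  local region="$2"
  local endpoint_id

  # Filter server-side by VPC, service name, and state; keep only the first ID.
  endpoint_id=$(aws ec2 describe-vpc-endpoints \
    --filters "Name=vpc-id,Values=$vpc_id" \
              "Name=service-name,Values=com.amazonaws.$region.ssmmessages" \
              "Name=vpc-endpoint-state,Values=available" \
    --query 'VpcEndpoints[0].VpcEndpointId' --output text)

  # With --output text, a missing match is rendered as the literal "None".
  [ -n "$endpoint_id" ] && [ "$endpoint_id" != "None" ]
}
```

A typical call would be has_ssmmessages_endpoint <vpc-id> <region> || echo "no ssmmessages endpoint". A failure here is only a problem when the task subnets also lack a NAT route.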
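  4. Force a new deployment after fixing the root cause
     Tasks launched before exec was enabled never receive the exec agent and must be replaced. Tasks whose agent failed to connect may also need replacement even after IAM or networking is corrected. Passing --enable-execute-command is harmless if the flag is already set. When you retry, use the ID of a task from the new deployment.
bash
aws ecs update-service --cluster my-cluster --service my-service --enable-execute-command --force-new-deployment
aws ecs execute-command --cluster my-cluster --task <new-task-id> --container app --interactive --command "/bin/sh"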
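Because the old task ID stops being valid once the deployment replaces it, the retry can be scripted to wait for the rollout and pick up a current task. This is a minimal sketch with a hypothetical function name, exec_into_new_task. It assumes that once the service is stable, any running task it lists belongs to the new deployment.

```shell
# Sketch: wait for the deployment to settle, then open a shell in a current task.
# exec_into_new_task is a hypothetical helper name.
exec_into_new_task() {
  local cluster="$1"
  local service="$2"
  local container="$3"
  local task_arn

  # Block until the service reaches a steady state (the waiter polls for you).
  aws ecs wait services-stable --cluster "$cluster" --services "$service" || return 1

  # Take the first running task of the service.
  task_arn=$(aws ecs list-tasks --cluster "$cluster" --service-name "$service" \
    --desired-status RUNNING --query 'taskArns[0]' --output text)
  if [ -z "$task_arn" ] || [ "$task_arn" = "None" ]; then
    echo "no running task found for $service" >&2
    return 1
  fi

  aws ecs execute-command --cluster "$cluster" --task "$task_arn" \
    --container "$container" --interactive --command "/bin/sh"
}
```

For example, exec_into_new_task my-cluster my-service app. If the session still fails at this point, the remaining suspects are the IAM and network checks from the earlier steps rather than stale tasks.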

Prevention

  • Enable ECS Exec deliberately in service definitions
  • Standardize SSM endpoint requirements for private ECS subnets
  • Test exec on every new service before relying on it during incidents
  • Review both operator IAM and task IAM during rollout checks