What's Actually Happening

An EC2 instance becomes unreachable over SSH when network configuration blocks port 22, the instance is stopped or failing its status checks, or the SSH daemon is not running. Depending on the cause, connection attempts time out, are refused, or are rejected with a permission error.

The Error You'll See

SSH timeout:

```bash
$ ssh -i key.pem ec2-user@ec2-1-2-3-4.compute-1.amazonaws.com
ssh: connect to host ec2-1-2-3-4.compute-1.amazonaws.com port 22: Connection timed out
```

Connection refused:

```bash
$ ssh -i key.pem ec2-user@1.2.3.4
ssh: connect to host 1.2.3.4 port 22: Connection refused
```

Permission denied:

```bash
$ ssh -i key.pem ec2-user@1.2.3.4
Permission denied (publickey).
```

Why This Happens

  1. Security group blocks port 22 - No inbound rule for SSH
  2. Network ACL denies - Stateless ACL blocking traffic
  3. Instance not running - Instance stopped or terminated
  4. Wrong key pair - Using incorrect private key
  5. SSH service stopped - sshd not running on instance
  6. CPU/IO exhaustion - Instance too busy to respond
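As a rough guide, each of the three error messages points at a different group of causes from the list above. The sketch below encodes that mapping; `classify_ssh_error` is a hypothetical helper name and the categories are a first guess rather than a definitive diagnosis (an overloaded instance can also produce a timeout, for example).

```bash
# classify_ssh_error: print the most likely cause category for an SSH error line.
# Hypothetical helper; the mapping is a heuristic, not a definitive diagnosis.
classify_ssh_error() {
  case "$1" in
    *"Connection timed out"*)
      # Packets are being dropped before they reach sshd.
      echo "network: security group, network ACL, route table, or missing public IP" ;;
    *"Connection refused"*)
      # The host answered, but nothing accepted the connection.
      echo "host: instance reachable, but sshd is not listening on the port" ;;
    *"Permission denied"*)
      # sshd answered and rejected the credentials.
      echo "auth: wrong private key or wrong login user" ;;
    *)
      echo "unknown: rerun ssh with -v for details" ;;
  esac
}

classify_ssh_error "ssh: connect to host 1.2.3.4 port 22: Connection refused"
```

A timeout sends you to Steps 2, 3, and 8; a refusal to the sshd checks; a permission error to Step 4.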

Step 1: Check Instance Status

```bash
# Check instance state
aws ec2 describe-instances --instance-ids i-1234567890abcdef0 \
  --query 'Reservations[0].Instances[0].State.Name'
# Should return: "running"

# Check instance status checks (only running instances are listed by default;
# add --include-all-instances to see others)
aws ec2 describe-instance-status --instance-ids i-1234567890abcdef0
# Look for:
#   SystemStatus.Status: ok      (reachability: passed)
#   InstanceStatus.Status: ok    (reachability: passed)

# Get instance details
aws ec2 describe-instances --instance-ids i-1234567890abcdef0 --output table

# Check instance console output
aws ec2 get-console-output --instance-id i-1234567890abcdef0 --output text

# If the instance is stopped, start it
aws ec2 start-instances --instance-ids i-1234567890abcdef0
```
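The state name returned by the first command determines what to do next. A minimal sketch of that decision follows, assuming the six standard EC2 state names; `next_step_for_state` is a hypothetical helper and the suggested actions are only a starting point.

```bash
# next_step_for_state: map an EC2 instance state name to a suggested next action.
# Hypothetical helper for illustration.
next_step_for_state() {
  case "$1" in
    running)                  echo "check status checks, then move on to networking" ;;
    pending|stopping)         echo "wait and re-check" ;;
    stopped)                  echo "start the instance" ;;
    shutting-down|terminated) echo "instance is gone; launch a replacement" ;;
    *)                        echo "unrecognized state: $1" ;;
  esac
}

# Example wiring (needs AWS credentials, so it is left commented out):
# state=$(aws ec2 describe-instances --instance-ids i-1234567890abcdef0 \
#   --query 'Reservations[0].Instances[0].State.Name' --output text)
# next_step_for_state "$state"

next_step_for_state stopped   # prints: start the instance
```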

Step 2: Check Security Group Rules

```bash
# Get instance security groups
aws ec2 describe-instances --instance-ids i-1234567890abcdef0 \
  --query 'Reservations[0].Instances[0].SecurityGroups[*].GroupId' --output text

# Check security group inbound rules that cover port 22
# (JMESPath number literals need backticks; an "all traffic" rule with
# IpProtocol "-1" has no FromPort and will not match this filter)
aws ec2 describe-security-groups --group-ids sg-12345678 \
  --query 'SecurityGroups[0].IpPermissions[?FromPort<=`22` && ToPort>=`22`]'

# Should show a rule allowing port 22, for example:
# {
#   "FromPort": 22,
#   "IpProtocol": "tcp",
#   "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
#   "ToPort": 22
# }

# Add SSH rule if missing
# (0.0.0.0/0 exposes port 22 to the whole internet; prefer the next command)
aws ec2 authorize-security-group-ingress \
  --group-id sg-12345678 \
  --protocol tcp \
  --port 22 \
  --cidr 0.0.0.0/0

# Or restrict to your IP
aws ec2 authorize-security-group-ingress \
  --group-id sg-12345678 \
  --protocol tcp \
  --port 22 \
  --cidr "$(curl -s https://checkip.amazonaws.com)/32"
```
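If the `curl` lookup fails, the command substitution is empty and the CLI receives `/32` as the CIDR. One way to guard against that is to validate the address before using it. The sketch below assumes an IPv4 address; `to_cidr32` is a hypothetical helper name.

```bash
# to_cidr32: print "<ip>/32" if the argument is a dotted-quad IPv4 address,
# otherwise print nothing and return 1. Hypothetical helper for illustration.
to_cidr32() {
  local ip=$1 octet
  # Four dot-separated groups of 1-3 digits.
  printf '%s\n' "$ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || return 1
  # Each group must be in the range 0..255.
  for octet in $(printf '%s' "$ip" | tr '.' ' '); do
    [ "$octet" -le 255 ] || return 1
  done
  printf '%s/32\n' "$ip"
}

# Example wiring (needs network access and AWS credentials, so commented out):
# my_cidr=$(to_cidr32 "$(curl -s https://checkip.amazonaws.com)") \
#   || { echo "could not determine public IP" >&2; exit 1; }
# aws ec2 authorize-security-group-ingress --group-id sg-12345678 \
#   --protocol tcp --port 22 --cidr "$my_cidr"

to_cidr32 203.0.113.7   # prints: 203.0.113.7/32
```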

Step 3: Check Network ACLs

```bash
# Get the instance's subnet
aws ec2 describe-instances --instance-ids i-1234567890abcdef0 \
  --query 'Reservations[0].Instances[0].SubnetId' --output text

# Get ACL for subnet
aws ec2 describe-network-acls \
  --filters Name=association.subnet-id,Values=subnet-12345678

# Check ACL rules
# ACLs are stateless, so both directions need an ALLOW rule:
#   inbound on port 22, outbound on the client's ephemeral ports (1024-65535)
# Rules are evaluated in ascending rule-number order; the number you pick must
# be unused in that direction and lower than any DENY that would match first

# Add inbound rule if missing
aws ec2 create-network-acl-entry \
  --network-acl-id acl-12345678 \
  --ingress \
  --rule-number 100 \
  --protocol tcp \
  --port-range From=22,To=22 \
  --cidr-block 0.0.0.0/0 \
  --rule-action allow

# Add outbound rule for response
aws ec2 create-network-acl-entry \
  --network-acl-id acl-12345678 \
  --egress \
  --rule-number 100 \
  --protocol tcp \
  --port-range From=1024,To=65535 \
  --cidr-block 0.0.0.0/0 \
  --rule-action allow
```

Step 4: Verify Key Pair

```bash
# Check key pair name on instance
aws ec2 describe-instances --instance-ids i-1234567890abcdef0 \
  --query 'Reservations[0].Instances[0].KeyName'

# Must match the .pem file you're using
# If the key is wrong, you need to:
#   1. Stop instance
#   2. Detach root volume
#   3. Attach to another instance
#   4. Update authorized_keys
#   5. Reattach and start

# Verify key file permissions
ls -la key.pem
# Should be: -r-------- (400); ssh rejects keys readable by group or others

# Fix permissions
chmod 400 key.pem

# Test SSH with verbose output
ssh -v -i key.pem ec2-user@1.2.3.4
# Permission denied can also mean the wrong login user:
# ec2-user on Amazon Linux, ubuntu on Ubuntu
```
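The permission check can also be scripted. The sketch below treats any mode with no group or other bits set (such as 400 or 600) as acceptable, which matches the kind of check `ssh` applies to private keys. `key_perms_ok` is a hypothetical helper name, and the `stat` fallback assumes either GNU or BSD/macOS `stat`.

```bash
# key_perms_ok FILE: succeed if FILE is accessible only by its owner.
# Returns 0 for modes like 400 or 600, 1 for anything more open,
# and 2 if the file cannot be inspected. Hypothetical helper for illustration.
key_perms_ok() {
  local mode
  # GNU stat takes -c, BSD/macOS stat takes -f; try one, then the other.
  mode=$(stat -c '%a' "$1" 2>/dev/null || stat -f '%Lp' "$1" 2>/dev/null) || return 2
  case "$mode" in
    [0-7]00) return 0 ;;   # no group or other permission bits
    *)       return 1 ;;
  esac
}

# Example usage (commented out because key.pem may not exist here):
# key_perms_ok key.pem || chmod 400 key.pem
```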

Step 5: Use EC2 Instance Connect

```bash
# If the key pair is the problem, use EC2 Instance Connect (if enabled)

# Push a temporary SSH public key to the instance
aws ec2-instance-connect send-ssh-public-key \
  --instance-id i-1234567890abcdef0 \
  --instance-os-user ec2-user \
  --ssh-public-key file://my-key.pub

# Then SSH within 60 seconds, before the pushed key expires
ssh -o "IdentitiesOnly=yes" -i my-key ec2-user@1.2.3.4

# Requires EC2 Instance Connect to be installed on the instance
# Recent Amazon Linux 2 and Ubuntu AMIs ship with it preinstalled
# Port 22 must still be reachable from your machine for this to work
```

Step 6: Use Systems Manager Session Manager

```bash
# Alternative to SSH - uses SSM agent

# Check that the instance is registered with Systems Manager
aws ssm describe-instance-information \
  --filters Key=InstanceIds,Values=i-1234567890abcdef0

# If available, connect via Session Manager
# (requires the Session Manager plugin on your machine)
aws ssm start-session --target i-1234567890abcdef0

# Or via AWS Console:
# EC2 > Instances > Select instance > Connect > Session Manager

# Tunnel SSH over Session Manager: AWS-StartSSHSession is meant to run as
# the ssh ProxyCommand, with the instance ID in place of the hostname
ssh -i key.pem \
  -o ProxyCommand="aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters portNumber=%p" \
  ec2-user@i-1234567890abcdef0
```

Step 7: Check Instance CPU and Memory

```bash
# Check CloudWatch CPU metrics for the instance (date -d is GNU date syntax)
aws cloudwatch get-metric-statistics \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value=i-1234567890abcdef0 \
  --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --period 300 \
  --statistics Average
# Memory is not in the AWS/EC2 namespace; it needs the CloudWatch agent

# If CPU is at 100%, the instance may be unresponsive
# Consider rebooting

# Reboot instance
aws ec2 reboot-instances --instance-ids i-1234567890abcdef0

# For a hard reset, stop, wait until stopped, then start
aws ec2 stop-instances --instance-ids i-1234567890abcdef0
aws ec2 wait instance-stopped --instance-ids i-1234567890abcdef0
aws ec2 start-instances --instance-ids i-1234567890abcdef0
# Note: Public IP may change after stop/start
```
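Before rebooting, it can help to confirm that the CPU was pegged for the whole window rather than for a single datapoint. The sketch below reads the averages on stdin and succeeds only if every value is at or above a threshold. `all_at_or_above` is a hypothetical helper name, and the 95% threshold in the usage comment is an arbitrary choice.

```bash
# all_at_or_above THRESHOLD: read whitespace-separated numbers on stdin and
# succeed only if at least one value was read and every value is >= THRESHOLD.
# Hypothetical helper for illustration.
all_at_or_above() {
  awk -v t="$1" '
    { for (i = 1; i <= NF; i++) { n++; if ($i + 0 < t + 0) low = 1 } }
    END { exit ((n > 0 && !low) ? 0 : 1) }
  '
}

# Example wiring (needs AWS credentials and GNU date, so commented out):
# aws cloudwatch get-metric-statistics \
#   --namespace AWS/EC2 --metric-name CPUUtilization \
#   --dimensions Name=InstanceId,Value=i-1234567890abcdef0 \
#   --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
#   --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
#   --period 300 --statistics Average \
#   --query 'Datapoints[].Average' --output text \
#   | all_at_or_above 95 && echo "CPU was pegged for the whole hour"
```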

Step 8: Check Route Tables

```bash
# Get subnet route table
# (empty output means the subnet has no explicit association and uses the
# VPC's main route table)
aws ec2 describe-route-tables \
  --filters Name=association.subnet-id,Values=subnet-12345678

# Check for route to internet gateway
# 0.0.0.0/0 -> igw-xxxxxx

# If missing, add route
aws ec2 create-route \
  --route-table-id rtb-12345678 \
  --destination-cidr-block 0.0.0.0/0 \
  --gateway-id igw-12345678

# Or, for a private subnet, route outbound traffic through a NAT gateway
# (a NAT gateway does not accept inbound SSH from the internet; reach such
# instances through a bastion host, VPN, or Session Manager)
aws ec2 create-route \
  --route-table-id rtb-12345678 \
  --destination-cidr-block 0.0.0.0/0 \
  --nat-gateway-id nat-12345678

# Verify instance has public IP
aws ec2 describe-instances --instance-ids i-1234567890abcdef0 \
  --query 'Reservations[0].Instances[0].PublicIpAddress'
```
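The check for a default route to an internet gateway can be automated by reducing the route table to destination/target pairs. The sketch below assumes tab-separated `destination<TAB>gateway-id` rows on stdin, which is what the `--query` in the usage comment is expected to produce with `--output text`. `has_igw_default_route` is a hypothetical helper name.

```bash
# has_igw_default_route: read "destination gateway-id" rows on stdin and
# succeed if 0.0.0.0/0 points at an internet gateway (igw-...).
# Hypothetical helper for illustration.
has_igw_default_route() {
  awk '$1 == "0.0.0.0/0" && $2 ~ /^igw-/ { found = 1 }
       END { exit (found ? 0 : 1) }'
}

# Example wiring (needs AWS credentials, so commented out):
# aws ec2 describe-route-tables \
#   --filters Name=association.subnet-id,Values=subnet-12345678 \
#   --query 'RouteTables[].Routes[].[DestinationCidrBlock,GatewayId]' \
#   --output text | has_igw_default_route \
#   || echo "no default route to an internet gateway"
```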

Step 9: Connect via Recovery Instance

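```bash
# If the instance is completely unresponsive:

# 1. Stop the instance and wait until it is stopped
aws ec2 stop-instances --instance-ids i-1234567890abcdef0
aws ec2 wait instance-stopped --instance-ids i-1234567890abcdef0

# 2. Detach root volume and wait until it is free
aws ec2 detach-volume --volume-id vol-12345678
aws ec2 wait volume-available --volume-ids vol-12345678

# 3. Create a rescue instance or use an existing one (same Availability Zone)
# 4. Attach volume to rescue instance
aws ec2 attach-volume \
  --volume-id vol-12345678 \
  --instance-id i-rescue \
  --device /dev/sdf

# 5. Mount and fix SSH issues
# On rescue instance (the device may appear under a different name,
# for example /dev/nvme1n1p1; check with lsblk):
sudo mkdir /mnt/recovery
sudo mount /dev/xvdf1 /mnt/recovery
sudo cat /mnt/recovery/home/ec2-user/.ssh/authorized_keys   # Inspect current keys
sudo nano /mnt/recovery/home/ec2-user/.ssh/authorized_keys  # Add your public key
sudo nano /mnt/recovery/etc/ssh/sshd_config                 # Check SSH config
sudo umount /mnt/recovery

# 6. Detach and reattach to original instance
# (use the instance's original root device name; /dev/xvda is shown here)
aws ec2 detach-volume --volume-id vol-12345678
aws ec2 wait volume-available --volume-ids vol-12345678
aws ec2 attach-volume --volume-id vol-12345678 --instance-id i-1234567890abcdef0 --device /dev/xvda

# 7. Start original instance
aws ec2 start-instances --instance-ids i-1234567890abcdef0
```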
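Adding the key in step 5 by hand in an editor is error-prone, so the append can be scripted on the rescue instance. The sketch below appends a public key only if an identical line is not already present. `add_authorized_key` is a hypothetical helper name, and it assumes the `authorized_keys` file already exists with the correct owner, since a file newly created by root would likely be rejected by sshd.

```bash
# add_authorized_key AUTH_KEYS_FILE PUBKEY_FILE
# Append the public key to AUTH_KEYS_FILE unless an identical line is present.
# Hypothetical helper; assumes AUTH_KEYS_FILE exists and has the right owner.
add_authorized_key() {
  local auth_file=$1 pub_file=$2 key
  key=$(cat "$pub_file") || return 1
  [ -n "$key" ] || return 1
  # -x matches whole lines, -F treats the key as a fixed string
  # (base64 keys contain regex metacharacters such as + and /).
  if ! grep -qxF -- "$key" "$auth_file" 2>/dev/null; then
    # Make sure the existing file ends with a newline before appending.
    if [ -s "$auth_file" ] && [ -n "$(tail -c 1 "$auth_file")" ]; then
      printf '\n' >> "$auth_file"
    fi
    printf '%s\n' "$key" >> "$auth_file"
  fi
  chmod 600 "$auth_file"
}

# Example usage on the rescue instance (run as root; commented out here):
# add_authorized_key /mnt/recovery/home/ec2-user/.ssh/authorized_keys my-key.pub
```

Running it twice with the same key leaves a single copy of that key in the file.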

Step 10: Enable Detailed Monitoring

```bash
# Enable detailed monitoring (1-minute intervals)
aws ec2 monitor-instances --instance-ids i-1234567890abcdef0

# Create CloudWatch alarm for status check failures
aws cloudwatch put-metric-alarm \
  --alarm-name "ec2-status-check-i-1234567890abcdef0" \
  --alarm-description "EC2 status check failed" \
  --metric-name StatusCheckFailed \
  --namespace AWS/EC2 \
  --statistic Maximum \
  --period 60 \
  --threshold 1 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --dimensions Name=InstanceId,Value=i-1234567890abcdef0 \
  --evaluation-periods 2 \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:my-alerts

# This alerts when a status check fails for two consecutive 1-minute periods
```

EC2 SSH Troubleshooting Checklist

| Check | Command | Expected |
|---|---|---|
| Instance running | `describe-instances` | State: running |
| Status checks | `describe-instance-status` | passed/passed |
| Security group | `describe-security-groups` | Port 22 allowed |
| Network ACL | `describe-network-acls` | Allow inbound/outbound |
| Key pair | `describe-instances` | Matches your key |

Verify the Fix

```bash
# After fixing configuration issues

# 1. Test SSH connection
ssh -i key.pem ec2-user@1.2.3.4
# Should connect successfully

# 2. Check security group
aws ec2 describe-security-groups --group-ids sg-xxxxx \
  --query 'SecurityGroups[0].IpPermissions[?FromPort<=`22` && ToPort>=`22`]'
# Should show SSH rule

# 3. Test connectivity
nc -zv 1.2.3.4 22
# Should succeed

# 4. Check instance console
aws ec2 get-console-output --instance-id i-xxx --output text
# Should show normal boot messages

# 5. Verify via Session Manager (backup method)
aws ssm start-session --target i-xxx
# Should connect

# 6. Check SSH service running (the unit is named ssh on Ubuntu)
ssh -i key.pem ec2-user@1.2.3.4 "sudo systemctl status sshd"
# Should show active
```
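After a reboot or a stop/start, sshd can take a little while to come back, so a single connectivity test may fail even though the fix worked. The sketch below is a generic retry wrapper. `retry` is a hypothetical helper name, and the attempt count and delay in the usage comment are arbitrary.

```bash
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds or ATTEMPTS runs have failed.
# Hypothetical helper for illustration.
retry() {
  local attempts=$1 delay=$2 i=1
  shift 2
  while :; do
    "$@" && return 0
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Example usage (needs network access to the instance, so commented out):
# retry 30 10 nc -z -w 5 1.2.3.4 22 && echo "port 22 is reachable"
```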

  • [Fix AWS EC2 Instance Connection Refused](/articles/fix-aws-ec2-instance-connection-refused)
  • [Fix AWS Security Group Not Working](/articles/fix-aws-security-group-not-working)
  • [Fix AWS EC2 Key Pair Permission Denied](/articles/fix-aws-ec2-key-pair-permission-denied)