Fix Ansible Unreachable Host Error - Complete Troubleshooting

Your playbook runs fine for several tasks, then suddenly a host becomes unreachable. The initial connection worked, but subsequent tasks fail. This intermittent unreachability is frustrating because it can happen mid-deployment, leaving your systems in an inconsistent state.

The Error

bash

fatal: [webserver]: UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: ssh: connect to host 192.168.1.50 port 22: Connection timed out",
    "unreachable": true
}

Or partway through a playbook run:

bash

TASK [Deploy application] *******************************************************
fatal: [webserver]: UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: Shared connection to 192.168.1.50 closed.",
    "unreachable": true
}
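The two messages above point at different causes, and the text after "Failed to connect to the host via ssh:" is usually the quickest clue. As a rough sketch, the mapping can be written as a small shell helper. `classify_unreachable` and its category labels are hypothetical names made up for this illustration, and the mapping is a heuristic, not something Ansible provides.

```bash
# Heuristic only: map common OpenSSH error text to a likely cause.
# classify_unreachable is a hypothetical helper, not part of Ansible.
classify_unreachable() {
  case "$1" in
    *"Connection timed out"*)          echo "network-or-firewall" ;;
    *"Connection refused"*)            echo "sshd-not-listening" ;;
    *"Shared connection to"*"closed"*) echo "connection-dropped" ;;
    *"Permission denied"*)             echo "authentication" ;;
    *)                                 echo "unknown" ;;
  esac
}

# The two messages from the examples above
classify_unreachable "ssh: connect to host 192.168.1.50 port 22: Connection timed out"
classify_unreachable "Shared connection to 192.168.1.50 closed."
```

A timeout means the TCP connection was never established, so start with the network and firewall sections. A closed shared connection means an established session ended, so look at keepalives, reboots, and resource exhaustion first.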

Quick Diagnosis

Check the host status immediately:

```bash
# Basic connectivity
ping -c 3 192.168.1.50

# SSH connection test
ssh -o ConnectTimeout=10 user@192.168.1.50 "echo alive"

# Check if SSH is listening (options placed before the host for portability)
nc -zv -w 5 192.168.1.50 22
```

For Ansible-specific testing:

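```bash
# Test with ping module
ansible webserver -m ping -u deploy

# Check port availability from the control node.
# wait_for runs on the targeted host, so target localhost here:
# running it on webserver would itself need a working SSH connection.
ansible localhost -m wait_for -a "host=192.168.1.50 port=22 timeout=30"
```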

Common Causes and Fixes

Network Connectivity Issues

The host might have intermittent network problems.

Add retry logic to your playbook:

```yaml
- hosts: webservers
  tasks:
    - name: Deploy application
      block:
        - name: Copy files
          copy:
            src: ./app/
            dest: /opt/app/
      # rescue runs when a task in the block fails; a host that Ansible
      # has already marked unreachable does not trigger it.
      rescue:
        - name: Wait for host to recover
          wait_for_connection:
            delay: 10
            timeout: 300

        - name: Retry copy
          copy:
            src: ./app/
            dest: /opt/app/
```

Or use the until loop:

yaml

- name: Ensure service is running
  service:
    name: myapp
    state: started
  register: result
  until: result is success
  retries: 3
  delay: 10
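The same retry-with-delay idea is useful from the control node when you want to confirm that a flaky host has come back before rerunning a playbook. The sketch below is a shell analogue, not a reproduction of Ansible's exact retry counting. `retry_cmd` is a hypothetical helper name, and the commented usage line assumes the `webserver` host and `deploy` user used elsewhere in this guide.

```bash
# retry_cmd MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS runs have failed.
# Hypothetical helper for illustration; not part of Ansible.
retry_cmd() {
  local max_attempts=$1 delay=$2
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example usage (assumed inventory name and user):
# retry_cmd 3 10 ansible webserver -m ping -u deploy
```

Keeping the delay at several seconds or more gives a host with brief network interruptions time to recover between attempts.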

SSH Connection Drops Mid-Play

Long-running tasks or network issues can drop SSH connections.

Enable SSH keepalives in ansible.cfg:

ini

[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o ServerAliveInterval=30 -o ServerAliveCountMax=3
pipelining = True

Or in your SSH config (~/.ssh/config):

bash

Host *
    ServerAliveInterval 30
    ServerAliveCountMax 3
    TCPKeepAlive yes
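With these settings the client sends a keepalive probe after 30 seconds of silence and gives up after 3 unanswered probes, so an unresponsive server is dropped after roughly 30 × 3 = 90 seconds. The helper below only does that arithmetic. `keepalive_window` is a hypothetical name, and the commented `ssh -G` line assumes OpenSSH 6.8 or later.

```bash
# Approximate seconds an unresponsive server is tolerated before ssh disconnects.
# keepalive_window is a hypothetical helper for illustration.
keepalive_window() {
  local interval=$1 count_max=$2
  echo $((interval * count_max))
}

# Values from the configuration above
keepalive_window 30 3

# To see the values ssh will actually apply to a host:
# ssh -G 192.168.1.50 | grep -i '^serveralive'
```

If long-running tasks sit behind a firewall or NAT device that drops idle connections, the interval should be shorter than that device's idle timeout.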

Target Host Reboot or Restart

If a task triggers a reboot, subsequent tasks fail.

Handle reboots properly:

yaml

- name: Reboot server
  reboot:
    msg: "Rebooting for kernel update"
    connect_timeout: 5
    reboot_timeout: 300
    pre_reboot_delay: 0
    post_reboot_delay: 30

Or manually with wait_for_connection:

```yaml
- name: Restart server
  shell: sleep 2 && shutdown -r now "Ansible reboot"
  async: 1
  poll: 0
  become: yes

- name: Wait for server to restart
  wait_for_connection:
    delay: 30
    timeout: 300
```

The async: 1 and poll: 0 pattern tells Ansible to fire-and-forget the reboot command, preventing it from hanging on the closed connection.

Firewall Changes Blocking SSH

A task that modifies firewall rules can block SSH and lock Ansible out of the host.

Use check mode first:

yaml

- name: Check firewall rules
  iptables:
    chain: INPUT
    protocol: tcp
    destination_port: 22
    jump: ACCEPT
  check_mode: yes
  diff: yes

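Add a rescue to restore access. Because the rescue task itself uses SSH, it only helps while the control node can still open a session to the host: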

yaml

- name: Configure firewall
  block:
    - name: Add firewall rule
      iptables:
        chain: INPUT
        protocol: tcp
        destination_port: 80
        jump: ACCEPT
  rescue:
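    - name: Emergency SSH restore
      local_action:
        module: shell
        # -I INPUT 1 inserts at the top of the chain; -A would append the
        # rule after any existing DROP rule, where it has no effect.
        cmd: ssh {{ ansible_user }}@{{ ansible_host }} "iptables -I INPUT 1 -p tcp --dport 22 -j ACCEPT"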

Target Resource Exhaustion

The host runs out of memory or file descriptors, causing SSH to fail.

Check target health first:

```yaml
- name: Check available memory
  shell: free -m | grep Mem | awk '{print $7}'
  register: free_memory

- name: Fail if low memory
  fail:
    msg: "Less than 100MB free memory"
  when: free_memory.stdout | int < 100

- name: Check disk space
  shell: df / | tail -1 | awk '{print $5}' | tr -d '%'
  register: disk_usage

- name: Fail if disk full
  fail:
    msg: "Disk usage above 90%"
  when: disk_usage.stdout | int > 90
```

SSH Session Limits

Too many parallel connections can overwhelm the target.

Limit parallelism:

```bash
# Run with fewer forks
ansible-playbook site.yml --forks 5

# Or in ansible.cfg
[defaults]
forks = 5
```

Use serial execution for rolling updates:

yaml

- hosts: webservers
  serial: 2  # Process 2 hosts at a time
  tasks:
    - name: Deploy app
      # ...

Target Python Missing or Broken

Ansible requires Python on the target. If Python is missing or broken:

bash

# Test Python on target
ansible webserver -m raw -a "python3 --version"

Bootstrap Python first:

```yaml
- hosts: all
  gather_facts: false
  tasks:
    - name: Install Python
      raw: test -e /usr/bin/python3 || (apt update && apt install -y python3)
      changed_when: false

    - name: Gather facts
      setup:
```

Handling Unreachable Hosts in Playbooks

Ignore unreachable hosts and continue:

yaml

- hosts: webservers
  ignore_unreachable: yes
  tasks:
    - name: Deploy app
      # ...

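Report unreachable hosts at the end of the play:

```yaml
- hosts: webservers
  tasks:
    - name: Deploy app
      # ...

    # ansible_play_hosts_all and ansible_play_hosts describe the current play,
    # so the report must run inside the webservers play, not in a separate
    # localhost play where both lists would contain only localhost.
    - name: Report unreachable
      debug:
        msg: "Unreachable hosts: {{ ansible_play_hosts_all | difference(ansible_play_hosts) }}"
      run_once: true
      delegate_to: localhost
```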

Set max failure percentage:

yaml

- hosts: webservers
  max_fail_percentage: 30  # Stop if more than 30% fail
  tasks:
    - name: Deploy app
      # ...
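The threshold has to be exceeded, not merely reached: with 10 hosts and a limit of 30, three failures let the play continue and the fourth stops it. The check can be sketched in integer arithmetic as follows. `exceeds_max_fail` is a hypothetical helper written for this illustration, not an Ansible command.

```bash
# exceeds_max_fail FAILED TOTAL MAX_FAIL_PERCENTAGE
# Succeeds (exit 0) when the failed share is strictly greater than the limit.
# Hypothetical helper for illustration; not part of Ansible.
exceeds_max_fail() {
  local failed=$1 total=$2 limit=$3
  # Compare failed/total > limit/100 without floating point
  [ $((failed * 100)) -gt $((limit * total)) ]
}

for failed in 3 4; do
  if exceeds_max_fail "$failed" 10 30; then
    echo "$failed of 10 failed: play aborts"
  else
    echo "$failed of 10 failed: play continues"
  fi
done
```

When the play also uses serial, the percentage is evaluated per batch, so small batches reach the limit after very few failures.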

Verification

After making changes:

```bash
# Run with verbose output to see connection details
ansible-playbook site.yml -vvv --limit webserver

# Check connection persistence
ansible webserver -m ping -u deploy -f 1

# Run simple connectivity test
ansible webserver -m wait_for_connection -a "timeout=60"
```
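A single successful ping says little about an intermittent problem. One way to check persistence is to repeat the test and count how often it fails. The sketch below assumes nothing beyond a POSIX-style shell. `count_failures` is a hypothetical helper name, and the commented usage line reuses the `webserver` host and `deploy` user from this guide.

```bash
# count_failures RUNS COMMAND [ARGS...]
# Runs COMMAND the given number of times and prints how many runs failed.
# Hypothetical helper for illustration; not part of Ansible.
count_failures() {
  local runs=$1
  shift
  local failures=0 i=0
  while [ "$i" -lt "$runs" ]; do
    "$@" >/dev/null 2>&1 || failures=$((failures + 1))
    i=$((i + 1))
  done
  echo "$failures"
}

# Example usage (assumed inventory name and user):
# count_failures 20 ansible webserver -m ping -u deploy -f 1
```

Any non-zero count after the fix suggests the connection is still unstable and the keepalive, network, or resource checks above are worth repeating.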

Prevention

1. Enable keepalives: Configure SSH keepalives in ansible.cfg
2. Use serial execution: Process hosts in batches
3. Add retry logic: Use until, retries, and rescue
4. Handle reboots: Use wait_for_connection after reboots
5. Monitor resources: Check disk and memory before deployment
