What's Actually Happening

You've configured SSH connection multiplexing (ControlMaster) to speed up connections, but new connections fail with socket-related errors. The control socket might be stale, have wrong permissions, or be in an inaccessible location. These errors prevent both new connections and the cleanup of old ones.

The Error You'll See

When attempting to connect:

bash
Control socket connect(/home/user/.ssh/sockets/user@host-22): Connection refused

Or:

bash
ControlPath /home/user/.ssh/sockets/user@host-22 is not a socket

Or permission errors:

bash
Control socket connect(/home/user/.ssh/sockets/user@host-22): Permission denied

Or with verbose output:

bash
ssh -v user@server

Shows:

bash
debug1: Control socket "/home/user/.ssh/sockets/user@host-22" does not exist
debug1: Creating new control socket
debug1: control_persist_detach: forked into background
debug1: channel 0: new session
unix_listener: cannot bind to path /home/user/.ssh/sockets/user@host-22: No such file or directory

Why This Happens

ControlMaster creates a Unix socket file that subsequent connections use. Problems occur when:

  1. The socket file exists but the master process has died (stale socket)
  2. The socket directory doesn't exist or has wrong permissions
  3. The socket file has wrong ownership (created by a different user)
  4. The disk is full or the filesystem doesn't support sockets
  5. The previous connection wasn't cleaned up properly
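
The causes above can be told apart with a few file tests. The following is a minimal sketch, not part of OpenSSH: `classify_control_path` and its output labels are hypothetical names chosen for illustration, and the function only inspects the filesystem (it cannot tell whether an existing socket is stale).

```bash
#!/usr/bin/env bash
# Hypothetical helper: report which failure condition applies to a control path.
classify_control_path() {
    local path=$1
    local dir
    dir=$(dirname "$path")

    if [ ! -d "$dir" ]; then
        echo "missing-directory"       # socket directory does not exist
    elif [ ! -w "$dir" ] || [ ! -x "$dir" ]; then
        echo "directory-not-writable"  # wrong permissions on the directory
    elif [ ! -e "$path" ]; then
        echo "no-socket"               # nothing there yet; ssh will create it
    elif [ ! -S "$path" ]; then
        echo "not-a-socket"            # matches the "is not a socket" error
    elif [ ! -O "$path" ]; then
        echo "wrong-owner"             # socket created by a different user
    else
        echo "socket-present"          # may still be stale; check with ssh -O check
    fi
}
```

For example, `classify_control_path ~/.ssh/sockets/user@server-22` prints one label describing the current state of that path.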

Step 1: Identify Control Socket Path

Check your SSH configuration for the socket path:

bash
grep -E "ControlPath|ControlMaster" ~/.ssh/config

Common configurations:

bash
Host *
    ControlMaster auto
    ControlPath ~/.ssh/sockets/%r@%h-%p
    ControlPersist 600

In ControlPath, %r expands to the remote username, %h to the remote hostname, and %p to the remote port.
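
To see which socket file a given connection will use, the substitution can be mimicked in the shell. This is only an illustration of the three tokens described here: `expand_control_path` is a hypothetical helper, it ignores every other token ssh supports, and ssh performs the real expansion itself.

```bash
#!/usr/bin/env bash
# Hypothetical helper: expand %r, %h and %p the way the text describes.
# Arguments: template, remote user, remote host, remote port.
expand_control_path() {
    local template=$1 user=$2 host=$3 port=$4
    local out=$template
    out=${out//%r/$user}   # %r -> remote username
    out=${out//%h/$host}   # %h -> remote hostname
    out=${out//%p/$port}   # %p -> remote port
    printf '%s\n' "$out"
}

expand_control_path '/home/user/.ssh/sockets/%r@%h-%p' user server 22
# prints /home/user/.ssh/sockets/user@server-22
```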

Step 2: Check Socket Directory

Verify the socket directory exists:

bash
ls -la ~/.ssh/sockets/

If it doesn't exist:

bash
mkdir -p ~/.ssh/sockets
chmod 700 ~/.ssh/sockets

The directory must not be writable by other users. Mode 700 is the safest choice; 755 also works because only the owner can write to it.
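
A quick way to confirm that neither group nor others can write to the directory is to ask `find` for those permission bits. This is a sketch under the assumption that a POSIX `find` is available; `socket_dir_is_private` is a hypothetical name, not an ssh feature.

```bash
#!/usr/bin/env bash
# Hypothetical helper. Returns 0 if the directory is not group- or
# world-writable, 1 if it is, and 2 if it does not exist.
socket_dir_is_private() {
    local dir=$1
    [ -d "$dir" ] || return 2
    # -prune keeps find from descending; the directory itself is printed
    # only when the group-write or other-write bit is set.
    if [ -n "$(find "$dir" -prune \( -perm -020 -o -perm -002 \) -print)" ]; then
        return 1
    fi
    return 0
}
```

Usage: `socket_dir_is_private ~/.ssh/sockets || chmod 700 ~/.ssh/sockets`.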

Step 3: Check for Stale Sockets

List existing socket files:

bash
ls -la ~/.ssh/sockets/

You'll see something like:

bash
srw-------  1 user user 0 Apr 3 10:00 user@server-22

The 's' at the start indicates a socket file.

Check if the master process is still running:

bash
ssh -O check user@server

If it returns:

bash
Control socket connect(/home/user/.ssh/sockets/user@server-22): Connection refused

The socket is stale. Remove it:

bash
rm ~/.ssh/sockets/user@server-22

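Or remove every socket at once. This does not distinguish stale sockets from live ones: a master that is still running keeps its existing sessions, but new sessions can no longer attach to it.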

bash
find ~/.ssh/sockets -type s -exec rm {} \;

Step 4: Fix Socket Permissions

If socket files have wrong permissions:

bash
ls -la ~/.ssh/sockets/

Should show:

bash
srw-------  1 user user 0 Apr 3 10:00 user@server-22

If owned by root or another user:

bash
sudo rm ~/.ssh/sockets/user@server-22

Fix directory ownership:

bash
# sudo is needed when the files currently belong to another user
sudo chown -R "$(id -un):$(id -gn)" ~/.ssh/sockets
chmod 700 ~/.ssh/sockets

Step 5: Check Running Master Processes

Find any SSH master processes:

bash
ps aux | grep ssh | grep -v grep

Look for:

bash
user  12345  0.0  0.0  12345  6789 ?  Ss  10:00  0:00 ssh: /home/user/.ssh/sockets/user@server-22 [mux]

The [mux] indicates a ControlMaster process.
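
To pull just the PID and socket path out of that listing, the `ps` output can be filtered with `awk`. This sketch assumes the process title has the form shown in the sample line above, which may differ between platforms and OpenSSH builds, and that the socket path contains no spaces; `list_mux_masters` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `ps aux`-style lines on stdin and print
# "PID SOCKET" for every line whose last field is "[mux]".
list_mux_masters() {
    awk '$NF == "[mux]" { print $2, $(NF - 1) }'
}

# Usage: ps aux | list_mux_masters
```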

Ask a lingering master process to exit cleanly:

bash
ssh -O exit user@server

If that fails, kill the process:

bash
kill 12345

Then remove the stale socket.

Step 6: Check Disk Space and Filesystem

Check free space on the filesystem that holds the socket directory:

bash
df -h ~/.ssh/sockets/

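If the disk is full, free space elsewhere on that filesystem. Socket files are zero bytes, so deleting them reclaims almost nothing, but clearing the directory still removes leftover entries: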

bash
rm -f ~/.ssh/sockets/*

Some filesystems (like NTFS or FAT) don't support Unix sockets. Check your mount:

bash
mount | grep -F -- "$(df ~/.ssh | tail -1 | awk '{print $1}')"

If you see ntfs or vfat, create the socket directory elsewhere:

bash
mkdir -p /tmp/ssh-sockets-$USER
chmod 700 /tmp/ssh-sockets-$USER

Update your SSH config so the sockets are created inside that directory (%u expands to your local username):

bash
ControlPath /tmp/ssh-sockets-%u/%r@%h-%p

Step 7: Reset All ControlMaster Connections

To completely reset your multiplexing setup:

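```bash
# Stop all master connections. -S names the socket directly, so the host
# argument ("placeholder") is only a dummy that ssh requires on the command line.
for socket in ~/.ssh/sockets/*; do
    [ -S "$socket" ] && ssh -S "$socket" -O exit placeholder
done 2>/dev/null

# Remove all socket files
rm -f ~/.ssh/sockets/*

# Kill any remaining master processes (use carefully); the pattern matches
# the "ssh: <socket path> [mux]" process title shown in Step 5
pkill -u "$USER" -f 'ssh: .*\[mux\]'
```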

Step 8: Update SSH Config for Reliability

Use a robust configuration:

bash
Host *
    ControlMaster auto
    ControlPath ~/.ssh/sockets/%r@%h-%p
    ControlPersist 600

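ControlPersist 600 keeps the master running in the background for 600 seconds (10 minutes) after the last session closes. Connections made within that window reuse it, and when the timer expires the master exits and removes its own socket, which avoids most stale sockets.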

For even better reliability:

bash
Host *
    ControlMaster auto
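    ControlPath ~/.ssh/sockets/%C
    ControlPersist yes

%C expands to a hash of the local hostname, remote hostname, port, and remote user, so each destination gets a unique, fixed-length socket name and a long hostname cannot push the path past the Unix socket path length limit. ControlPersist yes keeps the master running indefinitely, so stop it with ssh -O exit when you no longer need it.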
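
One practical reason to keep the ControlPath short is that Unix domain socket paths have a small fixed maximum length (108 bytes on Linux, 104 on macOS and the BSDs). The sketch below checks an already expanded path against that limit. `control_path_fits` is a hypothetical helper, and the 17-character headroom is an assumption meant to cover the temporary suffix ssh appends while creating the socket; adjust both numbers for your platform.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if an expanded control path is short enough.
# Arguments: path, optional limit (default 104), optional headroom (default 17).
# Note: ${#path} counts characters, which equals bytes only for ASCII paths.
control_path_fits() {
    local path=$1 limit=${2:-104} headroom=${3:-17}
    [ $(( ${#path} + headroom )) -lt "$limit" ]
}

# Usage:
#   control_path_fits "$HOME/.ssh/sockets/user@server-22" || echo "path too long"
```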

Step 9: Test Multiplexing

Test that multiplexing works:

```bash
# First connection creates master
time ssh user@server exit

# Second connection uses existing master
time ssh user@server exit
```

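The second connection should be noticeably faster because it skips the TCP handshake, key exchange, and authentication. On a low-latency link it often completes in about a tenth of a second.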

Check the socket exists:

bash
ls -la ~/.ssh/sockets/

Verify the Fix

ControlMaster is working correctly when:

  1. ssh -O check user@server returns "Master running"
  2. ls ~/.ssh/sockets/ shows socket files with correct permissions
  3. New connections complete almost instantly
  4. No "Connection refused" or "not a socket" errors appear

Create a test script:

bash
#!/bin/bash
echo "Testing ControlMaster..."
ssh -O check user@server && echo "Master is running" || echo "No master found"
ssh user@server 'echo "Connection successful"'
ssh -O exit user@server && echo "Master stopped"

If you frequently have stale socket issues, add cleanup to your shell profile:

bash
# Add to ~/.bashrc or ~/.zshrc
ssh-cleanup() {
    # Deletes sockets more than a day old, whether or not a master still owns them
    find ~/.ssh/sockets -type s -mtime +1 -delete 2>/dev/null
}
# Run on login
ssh-cleanup
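
An age-based cleanup cannot tell a live master from a dead one. As an alternative, the sketch below asks each socket whether a master still answers and removes only those that do not. It assumes `ssh -S <socket> -O check` exits non-zero for a stale socket, as the "Connection refused" output in Step 3 suggests; `ssh_cleanup_stale` and the `placeholder` host argument are hypothetical names.

```bash
#!/usr/bin/env bash
# Hypothetical helper: remove only the sockets whose master no longer responds.
# Argument: socket directory (defaults to ~/.ssh/sockets).
ssh_cleanup_stale() {
    local dir=${1:-$HOME/.ssh/sockets} sock
    for sock in "$dir"/*; do
        [ -S "$sock" ] || continue   # skip non-sockets and an unmatched glob
        # -S names the socket directly; "placeholder" is a dummy host argument
        if ! ssh -S "$sock" -O check placeholder 2>/dev/null; then
            rm -f -- "$sock"
            printf 'removed %s\n' "$sock"
        fi
    done
}
```

Calling `ssh_cleanup_stale` from the shell profile in place of the age-based function leaves active masters untouched.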