## Introduction

After a network partition is resolved, MySQL replicas may remain stuck in the `Connecting` state instead of automatically reconnecting to the primary. Possible causes include stale connections, expired authentication, or the primary having purged the binary log that contains the replica's requested position.

## Symptoms

- `SHOW REPLICA STATUS\G` shows `Replica_IO_Running: Connecting` (the field is named `Slave_IO_Running` in the output of the older `SHOW SLAVE STATUS`)
- `Last_IO_Error` shows `error reconnecting to master`
- The replica has been stuck for hours after network recovery
- The primary shows no connection attempts from the replica
- `SHOW PROCESSLIST` on the primary shows no replication connections
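The first symptom can be checked from a script rather than by eye. The following is a minimal sketch, not a definitive implementation: the function name `replica_io_state` is hypothetical, and it assumes the status text arrives on stdin in the vertical `\G` layout, with either the current or the legacy field name.

```bash
# Sketch: extract the I/O thread state from `SHOW REPLICA STATUS\G` output.
# Reads the status text on stdin and prints the value of Replica_IO_Running
# (or the legacy Slave_IO_Running). Exits non-zero if neither field is found.
replica_io_state() {
    awk -F': *' '
        $1 ~ /^[ \t]*(Replica|Slave)_IO_Running$/ { print $2; found = 1; exit }
        END { if (!found) exit 1 }
    '
}
```

Assuming credentials come from an option file, it could be used as `mysql -e 'SHOW REPLICA STATUS\G' | replica_io_state`, which prints `Connecting` while the replica is stuck.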

## Common Causes

- Replica DNS resolution still failing after network recovery
- Primary has rotated binlog past the replica's requested position
- Replication user password expired or account locked
- Firewall rules still blocking the replica after network recovery
- TCP connection in half-closed state preventing reconnection

## Step-by-Step Fix

1. **Check replica status details:**

```sql
SHOW REPLICA STATUS\G
-- Key fields (SHOW SLAVE STATUS uses Slave_/Master_ in place of Replica_/Source_):
-- Replica_IO_Running: Connecting
-- Last_IO_Error: error reconnecting
-- Source_Log_File: binlog.000100
-- Read_Source_Log_Pos: 12345
```

2. **Test network connectivity from replica to primary:**

```bash
# Test DNS resolution
nslookup primary.example.com

# Test TCP connectivity
nc -zv primary.example.com 3306

# Test MySQL authentication (-p prompts for the password, keeping it out of shell history)
mysql -h primary.example.com -u repl_user -p -e "SELECT 1"
```

3. **Restart the replica I/O thread:**

```sql
STOP REPLICA IO_THREAD;
START REPLICA IO_THREAD;

-- Check status after 10 seconds
SHOW REPLICA STATUS\G
-- Look for: Replica_IO_Running: Yes (Slave_IO_Running in SHOW SLAVE STATUS)
```
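Rather than checking once after a fixed 10 seconds, the status can be polled until the I/O thread reports `Yes`. This is a sketch under stated assumptions: `wait_for_io_running` is a hypothetical helper, and the status command is passed in as arguments so that connection options stay outside the function.

```bash
# Sketch: poll until the I/O thread reports "Yes" or the attempts run out.
# Usage: wait_for_io_running ATTEMPTS DELAY_SECONDS STATUS_COMMAND [ARGS...]
# STATUS_COMMAND is any command that prints `SHOW REPLICA STATUS\G` output.
wait_for_io_running() {
    local attempts="$1"
    local delay="$2"
    shift 2
    local i=0
    while [ "$i" -lt "$attempts" ]; do
        # Accept both the current and the legacy field name.
        if "$@" 2>/dev/null |
            grep -Eq '^[[:space:]]*(Replica|Slave)_IO_Running: Yes[[:space:]]*$'; then
            return 0
        fi
        i=$((i + 1))
        # Sleep only if another attempt follows.
        [ "$i" -lt "$attempts" ] && sleep "$delay"
    done
    return 1
}
```

For example, `wait_for_io_running 6 5 mysql -e 'SHOW REPLICA STATUS\G'` would check up to six times, five seconds apart, and return non-zero if the thread never reaches `Yes`, assuming the client picks up credentials from an option file.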

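4. **If the binlog position is too old, re-sync the replica:**

```sql
-- On the primary: find the earliest available binlog
SHOW BINARY LOGS;

-- On the replica: reset and restart from the earliest available binlog.
-- Caution: transactions in binlogs that were already purged are skipped,
-- so the replica can diverge from the primary. Rebuilding the replica from
-- a fresh backup is the safer option when events have been missed.
STOP REPLICA;
RESET REPLICA;
CHANGE REPLICATION SOURCE TO
  SOURCE_HOST = 'primary.example.com',
  SOURCE_USER = 'repl_user',
  SOURCE_PASSWORD = 'repl_password',
  SOURCE_LOG_FILE = 'binlog.000100',  -- replace with the first file listed by SHOW BINARY LOGS
  SOURCE_LOG_POS = 4;

START REPLICA;
```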
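Before resetting anything, it helps to confirm that the file the replica is requesting has really been purged. The sketch below uses two hypothetical helpers, `binlog_available` and `earliest_binlog`, and assumes the binlog list arrives on stdin with one log per line and the file name in the first column, as the `mysql` client prints it in batch mode without headers (`-N -B`).

```bash
# Sketch: decide whether the binlog file the replica requests still exists
# on the primary.
# Usage: binlog_available REQUESTED_FILE   (SHOW BINARY LOGS output on stdin)
# Returns 0 if the file is listed, 1 otherwise.
binlog_available() {
    local requested="$1"
    awk -v want="$requested" '$1 == want { found = 1 } END { exit !found }'
}

# Print the name of the oldest binlog still on the primary (first line of
# the list, since SHOW BINARY LOGS lists files from oldest to newest).
earliest_binlog() {
    awk 'NR == 1 { print $1 }'
}
```

Run against the primary, `mysql -h primary.example.com -N -B -e 'SHOW BINARY LOGS' | binlog_available binlog.000100` succeeds only if `binlog.000100` is still present; if it fails, the replica's position has been purged and this step applies.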

5. **Check whether the replication account is locked or expired:**

```sql
-- On the primary
SELECT user, host, account_locked, password_expired
FROM mysql.user
WHERE user = 'repl_user';

-- Unlock if needed
ALTER USER 'repl_user'@'%' ACCOUNT UNLOCK;

-- Reset the password if it has expired, then set the same value as
-- SOURCE_PASSWORD on the replica with CHANGE REPLICATION SOURCE TO
ALTER USER 'repl_user'@'%' IDENTIFIED BY 'new_repl_password';
```

## Prevention

- Implement heartbeat monitoring between primary and replica
- Set up alerting on the `Replica_IO_Running: Connecting` state (`Slave_IO_Running` in older output)
- Use stable hostnames with low DNS TTL for replication endpoints
- Configure `SOURCE_CONNECT_RETRY` and `SOURCE_RETRY_COUNT` on the replica through `CHANGE REPLICATION SOURCE TO` (`MASTER_CONNECT_RETRY` and `MASTER_RETRY_COUNT` in the older `CHANGE MASTER TO` syntax)
- Test network recovery scenarios regularly
- Use persistent replication connections with TCP keepalive
- Monitor network latency between primary and replica continuously
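The retry settings in the list above could be applied with a statement like the one this sketch generates. The helper name `retry_settings_sql` is hypothetical and the values passed to it are illustrative rather than recommendations. The `SOURCE_*` option names assume a MySQL version that supports `CHANGE REPLICATION SOURCE TO`. Only the I/O thread is stopped, on the assumption that connection options can be changed while the SQL thread keeps running.

```bash
# Sketch: print the SQL that sets reconnect behaviour on a replica.
# Usage: retry_settings_sql RETRY_SECONDS RETRY_COUNT HEARTBEAT_SECONDS
#   RETRY_SECONDS      seconds between reconnection attempts
#   RETRY_COUNT        number of attempts before the replica gives up
#   HEARTBEAT_SECONDS  interval at which the primary sends heartbeats
retry_settings_sql() {
    local retry_seconds="$1"
    local retry_count="$2"
    local heartbeat_seconds="$3"
    cat <<SQL
STOP REPLICA IO_THREAD;
CHANGE REPLICATION SOURCE TO
  SOURCE_CONNECT_RETRY = ${retry_seconds},
  SOURCE_RETRY_COUNT = ${retry_count},
  SOURCE_HEARTBEAT_PERIOD = ${heartbeat_seconds};
START REPLICA IO_THREAD;
SQL
}
```

The output can be reviewed first and then piped to the client on the replica, for example `retry_settings_sql 10 86400 5 | mysql`.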