Introduction

MySQL GTID-based replication uses global transaction identifiers to track which transactions have been executed on each server. After a failover, replicas may have executed different sets of transactions, causing GTID set mismatches that prevent replication from resuming correctly.

Symptoms

- `Last_IO_Error: The slave IO thread connects to master, but the master has purged binary logs containing GTIDs that the slave requires`
- `Last_SQL_Error: Error executing row event: 'Duplicate entry'`
- `SHOW REPLICA STATUS\G` shows `Last_Error` with GTID-related messages
- `Retrieved_Gtid_Set` and `Executed_Gtid_Set` are out of sync
- Replication stops with `ER_SLAVE_HAS_MORE_GTIDS_THAN_MASTER`
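To check for these symptoms from a script, the error fields can be pulled out of the `SHOW REPLICA STATUS\G` output. The following is a minimal sketch that assumes the output is piped in from the `mysql` client; `replica_status_field` is a hypothetical helper name, not a MySQL tool.

```bash
# replica_status_field FIELD
# Hypothetical helper: reads `SHOW REPLICA STATUS\G` output on stdin and
# prints the value of the first line whose name is exactly FIELD.
# Intended for single-line fields such as Last_IO_Errno or Last_SQL_Error;
# multi-line values (large GTID sets) would only yield their first line.
replica_status_field() {
  local field="$1"
  awk -v f="$field" '
    {
      line = $0
      sub(/^[ \t]+/, "", line)          # \G output right-aligns field names
      prefix = f ":"
      if (index(line, prefix) == 1) {   # exact field name at start of line
        value = substr(line, length(prefix) + 1)
        sub(/^[ \t]+/, "", value)
        print value
        exit
      }
    }'
}
```

For example, `mysql -e 'SHOW REPLICA STATUS\G' | replica_status_field Last_IO_Errno` prints the numeric I/O error code, which is easier to alert on than the full message text.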

Common Causes

- Primary failover with replicas at different replication positions
- Binlog purge on new primary before all replicas caught up
- Manual GTID manipulation (`SET GLOBAL gtid_purged`) done incorrectly
- Parallel replication causing out-of-order execution on replica
- Network partition during failover causing some replicas to miss transactions

Step-by-Step Fix

1. **Compare GTID sets on primary and replica**:
   ```sql
   -- On the new primary
   SELECT @@global.gtid_executed;

   -- On the replica
   SHOW REPLICA STATUS\G
   -- Look for:
   --   Retrieved_Gtid_Set
   --   Executed_Gtid_Set
   ```

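2. **Check for missing GTIDs on the replica**:
   ```sql
   -- On the replica, compare what has been applied with what has been received
   SELECT
     @@global.gtid_executed AS replica_executed,
     RECEIVED_TRANSACTION_SET AS replica_received
   FROM performance_schema.replication_connection_status;

   -- Use GTID_SUBTRACT to list the GTIDs in the first set that are missing from the second
   SELECT GTID_SUBTRACT(
     '3E11FA47-71CA-11E1-9E33-C80AA9429562:1-50',
     '3E11FA47-71CA-11E1-9E33-C80AA9429562:1-45'
   ) AS missing_gtids;  -- transactions 46-50 for that UUID

   -- GTID_SUBSET(set1, set2) returns 1 only if every GTID in set1 is also in set2
   SELECT GTID_SUBSET(
     '3E11FA47-71CA-11E1-9E33-C80AA9429562:1-50',
     '3E11FA47-71CA-11E1-9E33-C80AA9429562:1-45'
   ) AS is_subset;  -- 0: the first set is NOT a subset of the second
   ```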

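3. **Re-sync the replica using mysqldump from the new primary**:
   ```bash
   # On the new primary
   mysqldump --all-databases --single-transaction --triggers --routines --events \
     --set-gtid-purged=ON > /tmp/full_backup.sql

   # On the replica, after copying the dump file over.
   # The dump sets gtid_purged, which generally fails if it overlaps the replica's
   # existing GTID state, so stop replication and clear that state first
   # (STOP REPLICA, then RESET MASTER; RESET BINARY LOGS AND GTIDS on newer releases).
   mysql < /tmp/full_backup.sql
   ```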

4. **Skip conflicting transactions if only a few are mismatched**:
   ```sql
   STOP REPLICA;

   -- Skip the specific GTID by committing an empty transaction under it
   SET GTID_NEXT = '3E11FA47-71CA-11E1-9E33-C80AA9429562:46';
   BEGIN;
   COMMIT;
   SET GTID_NEXT = 'AUTOMATIC';

   START REPLICA;
   ```

5. **Reconfigure replication from the correct position**:
   ```sql
   STOP REPLICA;
   RESET REPLICA ALL;

   CHANGE REPLICATION SOURCE TO
     SOURCE_HOST = 'new-primary.example.com',
     SOURCE_USER = 'repl_user',
     SOURCE_PASSWORD = 'repl_password',
     SOURCE_AUTO_POSITION = 1;

   START REPLICA;
   ```
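
Skipping transactions one at a time becomes tedious when several consecutive GTIDs are affected. As a rough sketch, a small shell helper can print the empty-transaction statements for a range so they can be reviewed before being run on the replica. The function name `empty_txn_sql` and its arguments are hypothetical, and it assumes plain GTIDs of the form `uuid:number`.

```bash
# empty_txn_sql UUID FIRST [LAST]
# Hypothetical helper: prints one empty transaction per GTID in FIRST..LAST
# (LAST defaults to FIRST). It only generates SQL text; nothing is sent
# to a server.
empty_txn_sql() {
  local uuid="$1" n="$2" last="${3:-$2}"
  while [ "$n" -le "$last" ]; do
    printf "SET GTID_NEXT = '%s:%d'; BEGIN; COMMIT;\n" "$uuid" "$n"
    n=$((n + 1))
  done
  # Always hand GTID assignment back to the server afterwards.
  printf "SET GTID_NEXT = 'AUTOMATIC';\n"
}
```

For example, `empty_txn_sql 3E11FA47-71CA-11E1-9E33-C80AA9429562 46 48` prints three empty transactions followed by the switch back to `AUTOMATIC`. The output belongs between `STOP REPLICA` and `START REPLICA`, and only once it is clear that those transactions can be skipped without losing data.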

Prevention

- Use `super_read_only = ON` on all replicas to prevent accidental writes
- Monitor replication lag and GTID positions continuously
- During failover, ensure the most up-to-date replica becomes the new primary
- Set `binlog_expire_logs_seconds` long enough to cover failover recovery
- Test failover procedures regularly to identify GTID alignment issues
- Use `gtid_mode = ON` and `enforce_gtid_consistency = ON` on all servers
- Implement automated GTID comparison checks in monitoring
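
An automated GTID comparison check could, for instance, alert on how many transactions a replica is missing. The sketch below counts the transactions in a GTID set string, such as the result of `GTID_SUBTRACT(primary_executed, replica_executed)`. `gtid_count` is a hypothetical name, and the function assumes the plain `uuid:interval[:interval]` format with raw (unescaped) input.

```bash
# gtid_count GTID_SET
# Hypothetical helper: prints the number of transactions in a GTID set such as
# "uuid:1-5:11-18,uuid2:3". An empty set counts as 0.
gtid_count() {
  # GTID sets may be wrapped across lines, so drop all whitespace first.
  printf '%s' "$1" | tr -d ' \n\t' | awk -F, '
    {
      total = 0
      for (i = 1; i <= NF; i++) {
        n = split($i, parts, ":")
        # parts[1] is the UUID; the remaining parts are intervals.
        for (j = 2; j <= n; j++) {
          if (parts[j] ~ /^[0-9]+-[0-9]+$/) {
            split(parts[j], r, "-")
            total += r[2] - r[1] + 1      # inclusive range
          } else if (parts[j] ~ /^[0-9]+$/) {
            total += 1                    # single transaction
          }
        }
      }
      print total
    }
    END { if (NR == 0) print 0 }'
}
```

A monitoring job might compare the count against a threshold: a handful of missing transactions is normal lag on a busy replica, while a count that keeps growing, or any non-zero count in the opposite direction (transactions on the replica that the primary lacks), is worth an alert.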