虽然学习并实验过 mysql GTID 主从复制,但是,实际运行起来,还是会出现各种问题,有些是学习时遇到过的,有些是没有遇到过的,比方说这次,之前的错误处理记录中就没有,只好网上各种搜索。

主服务器版本

$ mysql -V
mysql  Ver 8.0.33-0ubuntu0.20.04.2 for Linux on x86_64 ((Ubuntu))

从服务器版本

mysql> select version();
+-----------+
| version() |
+-----------+
| 8.0.34    |
+-----------+
1 row in set (0.00 sec)

本次完全依照以下文章解决

MySQL主从复制中的问题(Coordinator stopped beacause there were errors in the workers…)

错误 1032

mysql> show replica status\G;

...
                Relay_Log_Pos: 367020
        Relay_Source_Log_File: binlog.000001
           Replica_IO_Running: Yes
          Replica_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1032
                   Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'be41e172-f145-11ec-9b3c-00155d5aa636:30' at source log binlog.000001, end_log_pos 367097. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
                 Skip_Counter: 0
                 ...

定位错误 1032

mysql> select * from performance_schema.replication_applier_status_by_worker\G

mysql> select * from performance_schema.replication_applier_status_by_worker\G

...

                                          SERVICE_STATE: OFF
                                      LAST_ERROR_NUMBER: 1032
                                     LAST_ERROR_MESSAGE: Worker 1 failed executing transaction 'be41e172-f145-11ec-9b3c-00155d5aa636:30' at source log binlog.000001, end_log_pos 367097; Could not execute Delete_rows event on table XXX.sys_locker_sms; Can't find record in 'sys_locker_sms', Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event's source log binlog.000001, end_log_pos 367097
                                   LAST_ERROR_TIMESTAMP: 2023-07-26 21:13:37.673641
...

重点:

Could not execute Delete_rows event on table XXX.sys_locker_sms; Can’t find record in ‘sys_locker_sms’, Error_code: 1032;

在 master 上删除一条记录,在 replica 上找不到

尝试解决 1032

  1. 按照参考文章中的第一个办法

set global sql_slave_skip_counter=1;

mysql> set global sql_slave_skip_counter=1;
Query OK, 0 rows affected, 2 warnings (0.00 sec)

跳过一条记录,sql 没有报错,但是,也没有解决以上问题!

  1. 按照参考文章中的第二个办法, OK!

     mysql> STOP REPLICA;
     Query OK, 0 rows affected (0.02 sec)
     
     mysql> SET @@SESSION.GTID_NEXT='be41e172-f145-11ec-9b3c-00155d5aa636:30';
     Query OK, 0 rows affected (0.00 sec)
     
     mysql> BEGIN;
     Query OK, 0 rows affected (0.00 sec)
     
     mysql> COMMIT;
     Query OK, 0 rows affected (0.03 sec)
     
     mysql> SET SESSION GTID_NEXT=AUTOMATIC;
     Query OK, 0 rows affected (0.00 sec)
     
     mysql> START REPLICA;
     Query OK, 0 rows affected (0.29 sec)
     
     mysql> show replica status \G;
     *************************** 1. row ***************************
                  Replica_IO_State: Waiting for source to send event
                       Source_Host: 192.168.0.201
                       Source_User: root
                       Source_Port: 3306
                     Connect_Retry: 60
                   Source_Log_File: binlog.000002
               Read_Source_Log_Pos: 8682708
                    Relay_Log_File: ubuntu2004-195-relay-bin.000002
                     Relay_Log_Pos: 2015574
             Relay_Source_Log_File: binlog.000001
                Replica_IO_Running: Yes
               Replica_SQL_Running: Yes
               ...
    

大概理解一下 : 用一条空事务代替这一条 delete 操作,跳过了这个错误地点

1). SET @@SESSION.GTID_NEXT 到错误提示中报告的 GTID    :'be41e172-f145-11ec-9b3c-00155d5aa636:30'  

2) . BEGIN; 
	COMMIT;
	提交一条空事务
	
3). SET SESSION GTID_NEXT=AUTOMATIC;
	恢复GTID , 继续

错误 1062

mysql> show replica status \G;

*************************** 1. row ***************************
             Replica_IO_State: Waiting for source to send event

	...
                   Last_Errno: 1062
                   Last_Error: Coordinator stopped because there were error(s) in the worker(s). The most recent failure being: Worker 1 failed executing transaction 'be41e172-f145-11ec-9b3c-00155d5aa636:247' at source log binlog.000002, end_log_pos 50007. See error log and/or performance_schema.replication_applier_status_by_worker table for more details about this failure or others, if any.
                 Skip_Counter: 0
...

定位错误 1062

mysql> select * from performance_schema.replication_applier_status_by_worker\G

*************************** 1. row ***************************
...

                                      LAST_ERROR_NUMBER: 1062
                                     LAST_ERROR_MESSAGE: Worker 1 failed executing transaction 'be41e172-f145-11ec-9b3c-00155d5aa636:247' at source log binlog.000002, end_log_pos 50007; Could not execute Write_rows event on table lyd-test.zt_debuglog; Duplicate entry '18666' for key 'zt_debuglog.PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's source log binlog.000002, end_log_pos 50007
                                   LAST_ERROR_TIMESTAMP: 2023-07-27 08:43:30.727645
...

重点 :
Duplicate entry ‘18666’ for key ‘zt_debuglog.PRIMARY’,

key 重复了?

在从服务器上已经有该key,现在又插入了同一条记录,冲突了!

尝试解决 1062

按照以上 1032 的解决办法,直接继续!

mysql> STOP REPLICA;
Query OK, 0 rows affected (0.06 sec)

mysql> SET @@SESSION.GTID_NEXT='be41e172-f145-11ec-9b3c-00155d5aa636:247';
Query OK, 0 rows affected (0.00 sec)

mysql> BEGIN; COMMIT;
Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.06 sec)

mysql> SET SESSION GTID_NEXT=AUTOMATIC;
Query OK, 0 rows affected (0.01 sec)

mysql> START REPLICA;
Query OK, 0 rows affected (0.19 sec)

接下来,居然连续出现好多次 Duplicate,照方抓药!

后记

既然直接空事务跳过去,那么还不如直接让 replica 忽略这个错误

$ sudo vim /etc/mysql/mysql.conf.d/mysqld.cnf

...
# 跳过指定error no类型的错误
# slave-skip-errors=1062,1053,1146
replica-skip-errors=1032,1062,1053,1146
...

$ sudo service mysql restart

再次查看就 OK 了!

Logo

腾讯云面向开发者汇聚海量精品云计算使用和开发经验,营造开放的云计算技术生态圈。

更多推荐