时间:2021-07-01 10:21:17 帮助过:20人阅读
从官方解释我们知道, sql_slave_skip_counter以event为单位skip,直到skip完第N个event所在的event group才停止。对于事务表,一个event group对应一个事务;对于非事务表,一个event group对应一条SQL语句。一个event group包含多个events。
这里我只针对显式事务模拟insert遇到Duplicate entry(1062错误),知道了问题本质,delete/update中的1032错误类似去分析
主库创建一个事务表和一个非事务表,然后从库往各表写入id=1的记录
# 主库创建测试表 mydba@192.168.85.132,3306 [replcrash]> create table repl_innodb(id int primary key,name1 char(10),name2 char(10)) engine=innodb; mydba@192.168.85.132,3306 [replcrash]> create table repl_myisam(id int primary key,name1 char(10),name2 char(10)) engine=myisam; # 从库往测试表中添加数据,不记入binlog mydba@192.168.85.133,3306 [replcrash]> set sql_log_bin=0; mydba@192.168.85.133,3306 [replcrash]> insert into repl_innodb(id,name1,name2) values(1,‘s1062-1‘,‘s1062-1‘); mydba@192.168.85.133,3306 [replcrash]> insert into repl_myisam(id,name1,name2) values(1,‘s1062-1‘,‘s1062-1‘); mydba@192.168.85.133,3306 [replcrash]> set sql_log_bin=1;View Code
主库往事务表中添加数据
# 主库往事务表中添加数据 mydba@192.168.85.132,3306 [replcrash]> begin; mydba@192.168.85.132,3306 [replcrash]> insert into repl_innodb(id,name1,name2) values(1,‘m1062-1‘,‘m1062-1‘); mydba@192.168.85.132,3306 [replcrash]> insert into repl_innodb(id,name1,name2) values(2,‘m1062-2‘,‘m1062-2‘); mydba@192.168.85.132,3306 [replcrash]> commit; mydba@192.168.85.132,3306 [replcrash]> select * from repl_innodb; +----+---------+---------+ | id | name1 | name2 | +----+---------+---------+ | 1 | m1062-1 | m1062-1 | | 2 | m1062-2 | m1062-2 | +----+---------+---------+View Code
很明显,从库先写入数据占用id=1,主库再写入数据,复制将主库id=1的写入记录传递到从库,造成从库key冲突(1062错误)
我们尝试使用sql_slave_skip_counter跳过错误(实际遇到1062写入key冲突,我们应该根据 Duplicate entry 删除从库对应记录)
# 从库跳过“1个”错误,并启动sql_thread mydba@192.168.85.133,3306 [replcrash]> set global sql_slave_skip_counter=1; mydba@192.168.85.133,3306 [replcrash]> start slave sql_thread; mydba@192.168.85.133,3306 [replcrash]> select * from repl_innodb; +----+---------+---------+ | id | name1 | name2 | +----+---------+---------+ | 1 | s1062-1 | s1062-1 | +----+---------+---------+View Code
从库不仅跳过了id=1的记录,还跳过了id=2的记录
分析:主库上的begin..commit之间对事务表的操作记录为一个事务,对应一个event group。id=1应用于从库遇到Duplicate entry错误,我们使用sql_slave_skip_counter跳过这个event之后,还在此group内,需要继续跳过此group中的后续events。因此在从库不会有id=2的记录~
[root@ZST1 logs]# mysqlbinlog -v --base64-output=decode-rows mysql-bin.000125 --start-position=1869 /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/; /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/; DELIMITER /*!*/; # at 1869 #171201 10:15:11 server id 1323306 end_log_pos 1934 CRC32 0x3a86cd44 Anonymous_GTID last_committed=5 sequence_number=6 rbr_only=yes /*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/; SET @@SESSION.GTID_NEXT= ‘ANONYMOUS‘/*!*/; # at 1934 #171201 10:14:43 server id 1323306 end_log_pos 2011 CRC32 0x83c239df Query thread_id=4 exec_time=0 error_code=0 SET TIMESTAMP=1512094483/*!*/; SET @@session.pseudo_thread_id=4/*!*/; SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/; SET @@session.sql_mode=1436549152/*!*/; SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/; /*!\C utf8 *//*!*/; SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=33/*!*/; SET @@session.lc_time_names=0/*!*/; SET @@session.collation_database=DEFAULT/*!*/; ==================== repl_innodb表写入id=1、2的记录,在一个事务中Start ==================== BEGIN /*!*/; # at 2011 #171201 10:14:43 server id 1323306 end_log_pos 2076 CRC32 0x0f3612fe Table_map: `replcrash`.`repl_innodb` mapped to number 263 # at 2076 #171201 10:14:43 server id 1323306 end_log_pos 2132 CRC32 0x01de5dbd Write_rows: table id 263 flags: STMT_END_F ### INSERT INTO `replcrash`.`repl_innodb` ### SET ### @1=1 ### @2=‘m1062-1‘ ### @3=‘m1062-1‘ # at 2132 #171201 10:14:50 server id 1323306 end_log_pos 2197 CRC32 0xf838b054 Table_map: `replcrash`.`repl_innodb` mapped to number 263 # at 2197 #171201 10:14:50 server id 1323306 end_log_pos 2253 CRC32 0xbd9ae02a Write_rows: table id 263 flags: STMT_END_F ### INSERT INTO `replcrash`.`repl_innodb` ### SET ### @1=2 ### @2=‘m1062-2‘ ### @3=‘m1062-2‘ # at 2253 #171201 10:15:11 server id 1323306 end_log_pos 2284 CRC32 0x0292df6a Xid = 60 COMMIT/*!*/; ==================== repl_innodb表写入id=1、2的记录,在一个事务中End ==================== SET @@SESSION.GTID_NEXT= ‘AUTOMATIC‘ /* added by mysqlbinlog */ /*!*/; DELIMITER ; # End of log file /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/; /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/; [root@ZST1 logs]#View Binlog
主库往非事务表中添加数据
# 主库往非事务表中添加数据 mydba@192.168.85.132,3306 [replcrash]> begin; mydba@192.168.85.132,3306 [replcrash]> insert into repl_myisam(id,name1,name2) values(1,‘m1062-1‘,‘m1062-1‘); mydba@192.168.85.132,3306 [replcrash]> insert into repl_myisam(id,name1,name2) values(2,‘m1062-2‘,‘m1062-2‘); mydba@192.168.85.132,3306 [replcrash]> commit; mydba@192.168.85.132,3306 [replcrash]> select * from repl_myisam; +----+---------+---------+ | id | name1 | name2 | +----+---------+---------+ | 1 | m1062-1 | m1062-1 | | 2 | m1062-2 | m1062-2 | +----+---------+---------+View Code
同理,从库先写入数据占用id=1,主库再写入数据,复制将主库id=1的写入记录传递到从库,造成从库key冲突(1062错误)
我们尝试使用sql_slave_skip_counter跳过错误(实际遇到1062写入key冲突,我们应该根据 Duplicate entry 删除从库对应记录)
# 从库跳过“1个”错误,并启动sql_thread mydba@192.168.85.133,3306 [replcrash]> set global sql_slave_skip_counter=1; mydba@192.168.85.133,3306 [replcrash]> start slave sql_thread; mydba@192.168.85.133,3306 [replcrash]> select * from repl_myisam; +----+---------+---------+ | id | name1 | name2 | +----+---------+---------+ | 1 | s1062-1 | s1062-1 | | 2 | m1062-2 | m1062-2 | +----+---------+---------+View Code
从库跳过了id=1的记录,但复制了id=2的记录
分析:主库上的begin..commit之间对非事务表的操作记录为多个事务,每一条SQL语句对应一个event group。id=1应用于从库遇到Duplicate entry错误,我们使用sql_slave_skip_counter跳过这个event之后,已经到了此group的末尾。sql_thread直接从下一个event group开始,这里就是repl_myisam.id=2的那条语句。因此在从库会有id=2的记录~
实际它在执行第一条insert语句后,从库就报1062错误;前面的transactional tables需要在事务commit后从库才报错
[root@ZST1 logs]# mysqlbinlog -v --base64-output=decode-rows mysql-bin.000125 --start-position=2284 /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=1*/; /*!50003 SET @OLD_COMPLETION_TYPE=@@COMPLETION_TYPE,COMPLETION_TYPE=0*/; DELIMITER /*!*/; # at 2284 #171201 10:30:31 server id 1323306 end_log_pos 2349 CRC32 0x5d208979 Anonymous_GTID last_committed=6 sequence_number=7 rbr_only=yes /*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/; SET @@SESSION.GTID_NEXT= ‘ANONYMOUS‘/*!*/; # at 2349 #171201 10:30:31 server id 1323306 end_log_pos 2426 CRC32 0xe4ce4da8 Query thread_id=4 exec_time=0 error_code=0 SET TIMESTAMP=1512095431/*!*/; SET @@session.pseudo_thread_id=4/*!*/; SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/; SET @@session.sql_mode=1436549152/*!*/; SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/; /*!\C utf8 *//*!*/; SET @@session.character_set_client=33,@@session.collation_connection=33,@@session.collation_server=33/*!*/; SET @@session.lc_time_names=0/*!*/; SET @@session.collation_database=DEFAULT/*!*/; ==================== repl_myisam表写入id=1的记录Start ==================== BEGIN /*!*/; # at 2426 #171201 10:30:31 server id 1323306 end_log_pos 2491 CRC32 0x76a45e15 Table_map: `replcrash`.`repl_myisam` mapped to number 261 # at 2491 #171201 10:30:31 server id 1323306 end_log_pos 2547 CRC32 0xd187097a Write_rows: table id 261 flags: STMT_END_F ### INSERT INTO `replcrash`.`repl_myisam` ### SET ### @1=1 ### @2=‘m1062-1‘ ### @3=‘m1062-1‘ # at 2547 #171201 10:30:31 server id 1323306 end_log_pos 2625 CRC32 0xc8210551 Query thread_id=4 exec_time=0 error_code=0 SET TIMESTAMP=1512095431/*!*/; COMMIT /*!*/; # at 2625 ==================== repl_myisam表写入id=1的记录End ==================== #171201 10:30:44 server id 1323306 end_log_pos 2690 CRC32 0x22b268fd Anonymous_GTID last_committed=7 sequence_number=8 rbr_only=yes /*!50718 SET TRANSACTION ISOLATION LEVEL READ COMMITTED*//*!*/; SET @@SESSION.GTID_NEXT= ‘ANONYMOUS‘/*!*/; # at 2690 #171201 10:30:44 server id 1323306 end_log_pos 2767 CRC32 0x43061ce5 Query thread_id=4 exec_time=0 error_code=0 SET TIMESTAMP=1512095444/*!*/; ==================== repl_myisam表写入id=2的记录Start ==================== BEGIN /*!*/; # at 2767 #171201 10:30:44 server id 1323306 end_log_pos 2832 CRC32 0xe1c084b9 Table_map: `replcrash`.`repl_myisam` mapped to number 261 # at 2832 #171201 10:30:44 server id 1323306 end_log_pos 2888 CRC32 0x56bacb73 Write_rows: table id 261 flags: STMT_END_F ### INSERT INTO `replcrash`.`repl_myisam` ### SET ### @1=2 ### @2=‘m1062-2‘ ### @3=‘m1062-2‘ # at 2888 #171201 10:30:44 server id 1323306 end_log_pos 2966 CRC32 0x6527c3b6 Query thread_id=4 exec_time=0 error_code=0 SET TIMESTAMP=1512095444/*!*/; COMMIT /*!*/; ==================== repl_myisam表写入id=2的记录End ==================== SET @@SESSION.GTID_NEXT= ‘AUTOMATIC‘ /* added by mysqlbinlog */ /*!*/; DELIMITER ; # End of log file /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/; /*!50530 SET @@SESSION.PSEUDO_SLAVE_MODE=0*/; [root@ZST1 logs]#View Binlog
为了方便,我将表中数据置为初始状态,主库两表为空,从库两表各有id=1的记录
主库往事务表和非事务表中添加数据
# 主库往事务表、非事务表中添加数据 mydba@192.168.85.132,3306 [replcrash]> begin; mydba@192.168.85.132,3306 [replcrash]> insert into repl_innodb(id,name1,name2) values(1,‘m1062-1‘,‘m1062-1‘); mydba@192.168.85.132,3306 [replcrash]> insert into repl_innodb(id,name1,name2) values(2,‘m1062-2‘,‘m1062-2‘); mydba@192.168.85.132,3306 [replcrash]> insert into repl_myisam(id,name1,name2) values(1,‘m1062-1‘,‘m1062-1‘); mydba@192.168.85.132,3306 [replcrash]> insert into repl_myisam(id,name1,name2) values(2,‘m1062-2‘,‘m1062-2‘); mydba@192.168.85.132,3306 [replcrash]> commit; mydba@192.168.85.132,3306 [replcrash]> select * from repl_innodb; +----+---------+---------+ | id | name1 | name2 | +----+---------+---------+ | 1 | m1062-1 | m1062-1 | | 2 | m1062-2 | m1062-2 | +----+---------+---------+ mydba@192.168.85.132,3306 [replcrash]> select * from repl_myisam; +----+---------+---------+ | id | name1 | name2 | +----+---------+---------+ | 1 | m1062-1 | m1062-1 | | 2 | m1062-2 | m1062-2 | +----+---------+---------+View Code
根据前面的分析,我们知道从库上的repl_innodb、repl_myisam表都存在key冲突(1062错误)
我们尝试使用sql_slave_skip_counter跳过错误(实际遇到1062写入key冲突,我们应该根据 Duplicate entry 删除从库对应记录)
# 从库跳过“1个”错误,并启动sql_thread mydba@192.168.85.133,3306 [replcrash]> set global sql_slave_skip_counter=1; mydba@192.168.85.133,3306 [replcrash]> start slave sql_thread; mydba@192.168.85.133,3306 [replcrash]> select * from repl_innodb; +----+---------+---------+ | id | name1 | name2 | +----+---------+---------+ | 1 | s1062-1 | s1062-1 | +----+---------+---------+ mydba@192.168.85.133,3306 [replcrash]> select * from