当复制中断的时候,我们常用的方法是跳过错误,比如SET GLOBAL SQL_SLAVE_SKIP_COUNTER =1或者直接slave-skip-errors=1062,这样确实解决了问题,恢复了复制。但是久而久之主从数据相差就很大了。对于复制正常以后,我们还需要使用pt-table-checksum数据效验,以及pt-table-sync同步数据,今天线上一个从库中断,查看了错误,发现是主键冲突,至于为何会发现这些情况,请阅读MySQL Replication需要注意的问题
Last_Error: Error ‘Duplicate entry ‘192442‘ for key ‘PRIMARY‘‘ on query. Default database: ‘xxxxxxxxxxx‘. Query: ‘INSERT INTO xxxxxxxxxxxxx(playerId, `type`, `count`) VALUES( NAME_CONST(‘pPlayerId‘,629014986), NAME_CONST(‘pType‘,8), 0)‘ Skip_Counter: 0 Exec_Master_Log_Pos: 1499475 Relay_Log_Space: 4829077512 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULL Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 1062 Last_SQL_Error: Error ‘Duplicate entry ‘192442‘ for key ‘PRIMARY‘‘ on query. Default database: ‘xxxxxxxxxxxxxx‘. Query: ‘INSERT INTO xxxxxxxxxxx(playerId, `type`, `count`) VALUES( NAME_CONST(‘pPlayerId‘,629014986), NAME_CONST(‘pType‘,8), 0)‘ 1 row in set (0.00 sec)
于是根据提示,以及查看表结构,发现果然是主键冲突了,但是表的记录却不一致,所以往往我们跳过错误就会导致主从数据不一致的问题。
主库上的记录:
mysql> select * from xxxx.xxxx where id=192442; +--------+-----------+------+-------+-------+---------------+ | id | playerId | type | count | total | lastResetTime | +--------+-----------+------+-------+-------+---------------+ | 192442 | 629014986 | 8 | 0 | 0 | 0 | +--------+-----------+------+-------+-------+---------------+ 1 row in set (0.00 sec) mysql>
从库上的记录:
mysql> select * from xxxx.xxxx where id=192442; +--------+-----------+------+-------+-------+---------------+ | id | playerId | type | count | total | lastResetTime | +--------+-----------+------+-------+-------+---------------+ | 192442 | 629015414 | 8 | 0 | 0 | 0 | +--------+-----------+------+-------+-------+---------------+ 1 row in set (0.00 sec) mysql>
有时候我会手动删除从库上提示的相应记录,但是往往是比较麻烦的。因为可能不止一条记录重复,可能N条,所以我就简单的写了一个脚本,删除主键冲突的记录,这个脚本只适合主键冲突的情况。
#!/bin/bash #Delete duplicate records primary key conflict #Write by yayun 2014-05-17 mysql=/usr/local/mysql-5.1.66/bin/mysql sock=/data/mysql-slave-3311/mysql.sock passwd=123456 while true do SQL_THREAD=`$mysql -uroot -p$passwd -S $sock -e ‘show slave status\G‘ | egrep ‘Slave_SQL_Running‘ | awk ‘{print $2}‘` LAST_ERROR=`$mysql -uroot -p$passwd -S $sock -e ‘show slave status\G‘ | egrep Last_Errno | awk ‘{print $2}‘` duplicate=`$mysql -uroot -p$passwd -S $sock -e ‘show slave status\G‘ | grep Last_Error | awk ‘/Duplicate entry/{print $5}‘ | awk -F "‘" ‘{print $2}‘` DATABASE=`$mysql -uroot -p$passwd -S $sock -e ‘show slave status\G‘ | grep Last_Error | awk ‘{print $13}‘ | awk -F "‘" ‘{print $2}‘` TABLE=`$mysql -uroot -p$passwd -S $sock -e ‘show slave status\G‘ | grep Last_Error | awk -F ":" ‘{print $4}‘ | awk -F "(" ‘{print $1}‘ | awk ‘{print $NF}‘` $mysql -uroot -p$passwd -S $sock -e ‘show slave status\G‘ | grep HA_ERR_FOUND_DUPP_KEY if [ $? -eq 1 ] then if [ "$SQL_THREAD" == No ] && [ "$LAST_ERROR" == 1062 ] then FILED=`$mysql -uroot -p$passwd -S $sock -Nse "desc $DATABASE.$TABLE" | grep PRI | awk ‘{print $1}‘` $mysql -uroot -p$passwd -S $sock -e "delete from $DATABASE.$TABLE where $FILED=$duplicate" $mysql -uroot -p$passwd -S $sock -e "start slave sql_thread" else echo "====================== ok ========================" $mysql -uroot -p$passwd -S $sock -e ‘show slave status\G‘ | egrep ‘Slave_.*_Running‘ echo "====================== ok ========================" break fi fi done
脚本随便写的。大家可以再自己相应的扩展一下。^_^
原文:http://www.cnblogs.com/gomysql/p/3734087.html