mysql - SQL slave thread is not applying changes
I have master/slave replication with 5 slaves and 1 master. The master is mysql 5.1.37 and the slaves are 5.5.8.
Two days ago one of the slave stopped working. In "show slave status" I see that both IO thread and SQL thread are running. The IO thread is generating relay-log files but the SQL thread is not applying the changes... The "seconds behind master" is showing "0" even though I know it's far behind (Checking the binlog using mysqlbinlog).
All the other slaves are working properly.
Don't know what to look for (no errors in mysql log file and no errors in system logs...)
any advice? See below slave status output
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: master-db
Master_User: replication
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.020839
Read_Master_Log_Pos: 56173153
Relay_Log_File: research-relay-bin.000002
Relay_Log_Pos: 252
Relay_Master_Log_File: mysql-bin.020828
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB: db1,db2,db3,db4
Replicate_Ignore_DB: db5,db6
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 975734937
Relay_Log_Space: 10714389571
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 300
There are some quirks to look out for in MySQL Replication
1) Use of replicate-do-db and replicate-ignore-db at the same time
There is flowchart on the page to show the order of processing. Personally, I do not use replicate-do-db and replicate-ignore-db at the same time. I use one or the other. If other slaves do not have this same issue, then rule this out.
2) Doing LOAD DATA INFILE
The way MySQL Replication handles it is appalling. Whenever a LOAD DATA INFILE is performed in a Master, the entire input file is deposited in the Master's binary logs. The slave gathers in the input file in its relay logs. The slave rematerializes the data file in the /tmp folder and then executes the LOAD DATA INFILE on the slave. This is not counted as repliction lag during this process. As a MySQL DBA, I know this works, but that's tacky !!!
3) Slave IO Thread Communication Breakdown
Sometimes, due to firewall changes, network routing, or some other networking anomaly, the Slave IO thread may simply stop getting entries to populate its relay logs. You may also want to check that the Slave IO thread is visible in the processlist of the Master. To make sure your slave's IO thread is alive, simply do the following on all slaves:
SHOW SLAVE STATUS\G
Watch for the Relay_Log_Space. It should be growing. If it stops growing, MySQL may simply freeze without an error for another crazy reason, which leads to suggestion #4.
4) Slave is Out of Diskspace
I wrote a post about how MySQL freezes when performing a MyISAM operation. MySQL uses MyISAM tables as temp tables. Check yout default tmp table directory (tmpdir variable in MySQL)
I hope these suggestions help !!!