Wednesday, September 12, 2012

How to lag a slave behind to avoid a disaster

How to lag a slave behind to avoid a disaster:
MySQL Replication is useful and easy to setup. It is used for very different purposes. For example:

  • split read and writes

  • run data mining or reporting processes on them

  • disaster recovery


Is important to mention that a replication server is not a backup by itself. A mistake on the master, for example a DROP DATABASE or an UPDATE without a WHERE clause will be replicated nearly instantly to all slave servers and just having a slave is not going to be helpful here. How can we avoid that kind of mistakes? Having a slave server lagging behind.


pt-slave-delay


pt-slave-delay is another tool that comes with Percona Toolkit and it is very easy to use. It works stopping and starting the SQL Thread in order to archive the lag that you have specified in command line. Lets see an example:
root@debian:~# date
Mon Sep 10 12:40:28 CEST 2012

root@debian:~# pt-slave-delay --delay=10m u=root,p=msandbox,h=127.0.0.1,P=21983
2012-09-10T12:40:31 Reconnected to slave
2012-09-10T12:40:31 slave running 83 seconds behind
2012-09-10T12:40:31 STOP SLAVE until 2012-09-10T12:49:08 at master position mysql-bin.000001/83915317
2012-09-10T12:41:31 Reconnected to slave
2012-09-10T12:41:31 slave stopped at master position mysql-bin.000001/100072820
[...]
2012-09-10T12:49:31 START SLAVE until master 2012-09-10T12:39:08 mysql-bin.000001/83915317
2012-09-10T12:50:31 Reconnected to slave
2012-09-10T12:50:31 slave running 672 seconds behind
2012-09-10T12:50:31 slave running 672 seconds behind at master position mysql-bin.000001/233713063

slave1 [localhost] {msandbox} ((none)) > show slave status\G
*************************** 1. row ***************************
[...]
Slave_IO_Running: Yes
Slave_SQL_Running: No
[...]

With “–delay=10m” I’m asking the tool to lag the slave 10 minutes behind. The rest of options are just to specify how to connect to the slave. Piece of cake :) If you run the tool in order to lag a slave, for example 2 hours behind, you could have enough time to stop the replication and skip an offending query executed by mistake on the master.


Conclusion


We can improve the security and availability of our infrastructure using delayed replication slave. Is useful not only for disaster recovery but also to test how our application deal with lagged replication.
With MySQL 5.6 this technique will be included natively:
http://dev.mysql.com/doc/refman/5.6/en/replication-delayed.html



CHANGE MASTER TO MASTER_DELAY = N;



That command will do the magic, until we have 5.6 as GA pt-slave-delay can help us to get the same functionality.

DIGITAL JUICE

No comments:

Post a Comment

Thank's!