Point in time backup and recovery is a crucial component of any MySQL environment. This article describes how to implement basic point in time recovery and covers a few mechanisms to accomplish this goal.
As all of you know, point in time backup and recovery is a crucial component
of any MySQL environment. From LVM (Logical Volume Manager) snapshots to
mysqldump, there are many ways to accomplish point in time recovery. Some
methods are more reliable and some are easier to work with than others. In any
case, all of them need to leave you with correct data at the end of a recovery.
In this article, I will describe how to implement basic point in time recovery
and describe a few mechanisms to accomplish this goal: one older method and one
newer.
LVM is a great tool and it is easy to set up and work with. LVM has
many benefits, for example:
- Resize volume groups online by absorbing new
physical volumes (PVs)
- Resize logical volumes (LVs) online by concatenating
extents onto LVs or truncating extents from them
- Stripe whole or parts of LVs across multiple PVs
- Mirror whole or parts of LVs
- Move online LVs between PVs
- Split or merge volume groups
For backup purposes, LVM comes with the ability to create read-only and
read-write snapshots of logical volumes. Percona has a
great article to get you started: "Using LVM for MySQL Backup and Replication
Setup." Below are the very basic steps to
accomplish an LVM snapshot for a MySQL server.
1. You need to freeze your database modifications (FLUSH TABLES WITH READ
LOCK)
2. Create LVM snapshot
3. If you are replicating you should save information about your replica
(i.e. SHOW SLAVE STATUS) while the lock is still held, so that the coordinates
match the snapshot
4. Release the lock (UNLOCK TABLES)
5. Mount snapshot and save your datadir files
6. Unmount and remove snapshot
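The one catch with these steps is that the global read lock only lives as long as the client session that took it, so the snapshot has to be created while that session is still open. Below is a minimal sketch of one way to do that, assuming the mysql command-line client's "\!" (system) command is available on your platform. The function name and the volume group, logical volume, snapshot name and size in the usage comment are hypothetical; this is not the script from the Percona article or from mylvmbackup. Note that it records the replica status before releasing the lock.

```shell
#!/bin/sh
# Sketch only. "vg0", "mysql", "mysql_snap" and "5G" in the usage comment are
# hypothetical; substitute your own volume group, logical volume, snapshot
# name and snapshot size.

# Print the statements for ONE mysql client session. The read lock is dropped
# when the session ends, so lvcreate is run from inside the session through
# the client's "\!" (system) command.
snapshot_session_sql() {
    vg=$1
    lv=$2
    snap=$3
    size=$4
    printf 'FLUSH TABLES WITH READ LOCK;\n'
    printf '\\! lvcreate --snapshot --size %s --name %s /dev/%s/%s\n' \
        "$size" "$snap" "$vg" "$lv"
    # Record replication coordinates while writes are still blocked.
    printf 'SHOW SLAVE STATUS\\G\n'
    printf 'UNLOCK TABLES;\n'
}

# Usage (not run here):
#   snapshot_session_sql vg0 mysql mysql_snap 5G | mysql --defaults-file=/etc/my.cnf
#   mount /dev/vg0/mysql_snap /mnt/mysql_snap    # then copy the datadir files
#   umount /mnt/mysql_snap && lvremove -f /dev/vg0/mysql_snap
```

Printing the statements instead of running them directly makes it easy to eyeball exactly what will be sent to the server before you pipe it into the client.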
If you are inclined to use LVM’s snapshot capabilities, you should look at
mylvmbackup. In the past, LVM
snapshots were arguably the best way to run a point in time backup, although
today there are other, more flexible ways to accomplish the same goal. So, now
that we have the gist of what LVM snapshots are all about, I would like to add
that LVM can become cumbersome and slow to work with, especially on extremely
large systems. Not to mention that on high-write systems FLUSH TABLES WITH
READ LOCK will probably take a long time and interrupt service to your
customers in one way or another.
If you have read any of my previous articles you probably know where this
article is going. Yep, you guessed it: Xtrabackup.
Xtrabackup is really a set of tools made up of xtrabackup, innobackupex and tar4ibd.
- Xtrabackup – is a compiled C binary, which copies only
InnoDB and XtraDB data.
- Innobackupex – is a wrapper script that provides
functionality to backup a whole MySQL database instance with MyISAM,
InnoDB and XtraDB tables.
- Tar4ibd – tars InnoDB data safely
For the example below, I will be using innobackupex on a slave MySQL
instance to get a point in time snapshot. Keep in mind that this is just one of
the many configurations you could use for your installation. If you have not
already, please check out my previous article, "Working
with MySQL Multi-master Replication – Keeping a True Hot Standby," for
the type of setup this backup system is implemented on.
For this setup, I am using MySQL 5.5.X rc with semi-synchronous
replication. It is important to understand what semi-synchronous
replication is before we get into this backup method. In MySQL 5.5, there is an
interface to semi-synchronous replication in addition to the built-in
asynchronous replication. Semi-synchronous replication is installed as a
plug-in and can be used as an alternative to asynchronous replication. This
type of replication works as follows:
1. A slave server, upon connecting to a master server, will inform the
master that it has semi-synchronous replication activated.
2. When semi-synchronous replication is enabled on both the master and at
least one slave, the thread that performs a transaction commit on the master
blocks after the commit and waits until a semi-synchronous slave acknowledges
that it has received all events for the transaction, or until a timeout
occurs.
3. The slave acknowledges receipt of a transaction’s events only after the
events have been written to its relay log and flushed to disk.
4. This step is a bit scary: if a timeout occurs without any slave having
acknowledged the transaction, the master server reverts to traditional,
asynchronous replication. This typically occurs when a slave gets too far
behind, that is, when it takes longer than rpl_semi_sync_master_timeout to
acknowledge. When a semi-synchronous slave catches up, the master will
return to semi-synchronous replication.
5. The semi-synchronous plug-in must be installed and active on both the
master and at least one slave for this type of replication to work.
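To make the list above a little more concrete, here is a hedged sketch of the statements involved in turning semi-synchronous replication on in MySQL 5.5. The plugin and variable names are the ones from the MySQL 5.5 manual; the ".so" file names assume a Linux build, and the shell function names are my own invention for this example. The last function prints the status check you can use to see whether the master has fallen back to asynchronous replication (ON means semi-synchronous, OFF means asynchronous).

```shell
#!/bin/sh
# Sketch only: these functions just print SQL; pipe the output into the
# mysql client on the relevant server. Function names are hypothetical.

# Statements for the master. $1 is the acknowledgment timeout in
# milliseconds (the server default is 10000).
master_semisync_sql() {
    timeout_ms=$1
    printf "INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';\n"
    printf 'SET GLOBAL rpl_semi_sync_master_enabled = 1;\n'
    printf 'SET GLOBAL rpl_semi_sync_master_timeout = %s;\n' "$timeout_ms"
}

# Statements for each semi-synchronous slave. The I/O thread is restarted
# so the slave reconnects and registers itself as semi-synchronous.
slave_semisync_sql() {
    printf "INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';\n"
    printf 'SET GLOBAL rpl_semi_sync_slave_enabled = 1;\n'
    printf 'STOP SLAVE IO_THREAD;\n'
    printf 'START SLAVE IO_THREAD;\n'
}

# Status check for the master, useful for spotting the fallback described
# in step 4.
semisync_status_sql() {
    printf "SHOW STATUS LIKE 'Rpl_semi_sync_master_status';\n"
}

# Usage (not run here):
#   master_semisync_sql 10000 | mysql --defaults-file=/etc/my.cnf
```

Settings made with SET GLOBAL do not survive a restart, so you would also want the matching entries in my.cnf once you are happy with the values.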
Needless to say, this is both a cool and scary (specifically around the
possible flapping between semi-synchronous and asynchronous replication)
feature of MySQL 5.5. Semi-synchronous replication is a great idea, but I
wonder at what performance or integrity cost. This is not the right
article to determine either performance or integrity implications, so we’ll
assume, for now, that semi-synchronous replication is good enough based on what
is stated in the MySQL documentation:
"Compared to asynchronous replication, semisynchronous replication
provides improved data integrity. When a commit returns successfully, it is
known that the data exists in at least two places (on the master and at least
one slave). If the master commits but a crash occurs while the master is
waiting for acknowledgment from a slave, it is possible that the transaction
may not have reached any slave." (Oracle)
"Semisynchronous replication does have some performance impact because
commits are slower due to the need to wait for slaves. This is the tradeoff for
increased data integrity. The amount of slowdown is at least the TCP/IP roundtrip
time to send the commit to the slave and wait for the acknowledgment of receipt
by the slave. This means that semisynchronous replication works best for close
servers communicating over fast networks, and worst for distant servers
communicating over slow networks." (Oracle)
The reason for the use of semi-synchronous replication in this example stems
from years of hearing people state that MySQL replication cannot be trusted in
any way. To that I often replied, "Then why even have a slave, and why are
you running your backups on it, and why are you letting your customers read
from it?" I digress, so we’ll stop the argument there.
Basically, we are running backups on our slave servers with Xtrabackup and
want the best possible integrity we can muster short of DRBD! That said, let’s
move to the final part of this article, which explains how you can use
Xtrabackup on your slave server to achieve point in time backups.
As with any MySQL installation, especially when customers pay for service, it
is a good idea to have consistent backups and, in this case, point in time
backups. My actual point in time backup script will not be posted here but the
general process will be. As you may have guessed, my point in time backups are
run on slave servers and I explicitly specify the --slave-info flag in
innobackupex. The --slave-info flag is defined as follows, taken from the
innobackupex documentation:
"This option is useful when backing up a replication slave server. It
prints the binary log position and name of the binary log file of the master
server. It also writes this information to the ‘ibbackup_slave_info’ file as a
‘CHANGE MASTER’ command. A new slave for this master can be set up by starting
a slave server on this backup and issuing a ‘CHANGE MASTER’ command with the
binary log position saved in the ‘ibbackup_slave_info’ file."
shell> innobackupex --defaults-file=/etc/my.cnf --user=someuser --password=some_password --slave-info /path/to/backup
You have to make sure that the backup completed successfully, so check that
your output file ends with a line like the following:
101101 20:55:36 innobackupex-x.x.x: completed OK!
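Since I would rather not eyeball log files by hand, here is a small sketch of how that check could be scripted. It assumes you redirected all innobackupex output (stdout and stderr) into a log file; the function name and the log path in the usage comment are hypothetical, and this is not my production script.

```shell
#!/bin/sh
# Sketch only. Succeeds (exit status 0) when the last line of the given
# log file contains innobackupex's "completed OK!" marker.
backup_completed_ok() {
    tail -n 1 "$1" | grep -q 'completed OK!'
}

# Usage (not run here; the log path is hypothetical):
#   innobackupex --defaults-file=/etc/my.cnf --user=someuser \
#       --password=some_password --slave-info /path/to/backup \
#       > /var/log/innobackupex.log 2>&1
#   backup_completed_ok /var/log/innobackupex.log || echo "backup FAILED" >&2
```

Checking only the last line is deliberate: a run that fails partway through will not end with that marker.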
In the output log, I look for the following lines and log them to a database
just in case the log file was erased.
innobackupex-x.x.x: Backup created in directory '/path/to/backup'
innobackupex-x.x.x: MySQL binlog position: filename 'binary-logs.000001', position 107
innobackupex-x.x.x: MySQL slave binlog position: master host '0.0.0.0', filename 'binary-logs.000001', position 61560909
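As a rough illustration of how the master coordinates could be pulled out of that output before logging them somewhere safe, here is a sketch that assumes the log text looks exactly like the sample above. The function name is hypothetical, and the pattern would need adjusting if your innobackupex version words the message differently.

```shell
#!/bin/sh
# Sketch only. Prints "<master binlog file> <position>" taken from the
# "MySQL slave binlog position" message in an innobackupex log file.
master_coordinates() {
    sed -n "s/.*MySQL slave binlog position: master host '[^']*', filename '\([^']*\)', position \([0-9][0-9]*\).*/\1 \2/p" "$1" | tail -n 1
}

# Usage (not run here; the log path is hypothetical):
#   master_coordinates /var/log/innobackupex.log
```

These are the same two values that a CHANGE MASTER command needs as MASTER_LOG_FILE and MASTER_LOG_POS when you build a new slave from the backup.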
Then I run the following inside of the MySQL server:
mysql> FLUSH BINARY LOGS;
Now you need to back up all of the old binary logs, preferably to a location
off the MySQL cluster, like on a filer or any other JBOD server. You’ll
probably want to make sure that your backup and binary logs are in the same
place on the JBOD!
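One way to work out which binary logs count as "old" is to read the binary log index file and skip its last entry, which after FLUSH BINARY LOGS is the log the server is currently writing to. The sketch below assumes the index file lists one log per line with the newest last; the function name, index path and destination in the usage comment are hypothetical.

```shell
#!/bin/sh
# Sketch only. Prints every entry of a binary log index file except the
# last one, i.e. the logs that are closed and safe to copy.
closed_binlogs() {
    sed '$d' "$1"
}

# Usage (not run here; paths are hypothetical and entries in the index may
# be relative to the datadir, so run this from there):
#   cd /var/lib/mysql &&
#   closed_binlogs binary-logs.index | while read -r f; do
#       cp -p "$f" /mnt/jbod/backups/binlogs/
#   done
```

Leaving the active log alone avoids copying a file that is still being appended to; it gets picked up on the next run, after the next flush.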
As always, mileage may vary depending on what you are doing, your system
capabilities and your service level agreements. Make sure you test the recovery
process at least once a quarter after implementation!