Inside MySQL Binary Logs
September 16, 2010
The MySQL Binary Log files are important to the proper functioning of the database and facilitate replication, and recovery. This article discusses some of the inner workings of these important files, and explains how ordering of transactions also plays into how they work.
The MySQL binary logs are an important part of your database server, so it's a good idea to get acquainted with them. Essentially, these files are logs of transactions that occur in your database. They can be used to restore the database to a point in time, and are also used by replication, as we'll see later.
What are Binary Logs?
The best way to understand binary logs is by looking at their contents. Let's follow an example using mysqlbinlog. This tool will read the binary log and return it's contents as readable text.
$ mysqlbinlog /var/lib/mysql/db1-bin.000004 # at 35804 #100802 20:47:31 server id 120 end_log_pos 35953 Query thread_id=6 exec_time=0 error_code=0 SET TIMESTAMP=1280782051/*!*/; ALTER TABLE event MODIFY name char(64) CHARACTER SET utf8 NOT NULL default '' /*!*/; # at 35953 #100802 20:47:31 server id 120 end_log_pos 36089 Query thread_id=6 exec_time=0 error_code=0 SET TIMESTAMP=1280782051/*!*/; ALTER TABLE event MODIFY COLUMN originator INT UNSIGNED NOT NULL /*!*/;
So now the binary logs become more visible. We can see that MySQL is dumping specific details about queries that have completed in the database. The "at" line indicates the starting position in the binary log file, end_log_pos is the ending position and so on. You can also see a timestamp, which will be used by replication, as we'll see later. Also, notice the server_id. This is the ID set in the my.cnf file, and should be unique among your MySQL servers. It's basically how MySQL replication avoids an infinite loop in various replication topologies.
How do they support replication?
The binary log is written to as transactions commit. The slave server monitors the master server for activity in its binary log file using its IO thread. If it finds a new transaction, it copies those lines into its own relay log file, which is similar to a binary log file in contents. Its SQL thread then applies those transactions in the order they occurred to the slave side.
How do they support point-in-time recovery?
The first thing you'll want to do is restore your database from your last backup, say last night, or last Sunday. You can use a dump file and import it, or perhaps a cold backup or volume snapshot. Whatever method you use, be sure you've restored to that point in time and make sure the database is up and running, and no new users can connect.
Binary logs support point-in-time recovery through creative use of the mysqlbinlog command. Using the command on a file by itself will simple dump out its contents. However, it also has other options. For instance, you can specify a beginning date and an ending date. For example:
$ mysqlbinlog --start-datetime='2010-08-01 11:00:00' --stop-datetime='2010-08-01 14:00:00' db1-bin.000004
So with this command we'll display all the transactions in the file on August 1st, between 11am and 2pm. We can start to see how point-in-time recovery can be achieved here. There are just a couple more pieces to fall into place. Next, we'll want to add the --to-last-log option, so the mysqlbinlog command will continue reading, even past the end of the file we pass on the command line.
The last piece to put into place is to use the Unix pipe command. This dumps the standard output - what would normally be output to the screen - into a pipe. At the other end of the pipe is another Unix program, in this case the MySQL client. So essentially, we'll send those commands directly into the MySQL server to be executed, by using the MySQL client program as follows:
$ mysqlbinlog --to-last-log --start-datetime='2010-08-01 11:00:00' --stop-datetime='2010-08-01 14:00:00' db1-bin.000004 | mysql -u root -ppass
That will start applying transactions to the running MySQL server starting from 11am August 1st and through to 2pm August 1st, reading all the binary logs it requires to get all transactions in that window.
Isolation Levels and Non-Transactional Tables
MySQL supports a number of different isolation levels for your transactions.
Repeatable Read is the default and means that inside of a transaction reads will be repeatable. Remember there are many other sessions changing rows of data in the database at the same time. If those changes can be seen by your transaction, at different states before a commit, that can cause your application problems.
Read Committed means that your transaction is in isolation, however if another transaction in another session commits, your transaction will then be able to see those changes immediately. So if you did a SELECT on rows affected by that other transaction earlier in your own, they may now be different when you SELECT again, hence you don't have repeatable reads anymore.
Read Uncommitted means that your transaction will have dirty reads. It will see work-in-progress from other transactions. This is usually a bad thing for concurrent database applications.
As it turns out only repeatable reads are supported for statement based and mixed replication. In earlier versions before MySQL 5.1, this would break replication, but no error was returned. As of newer versions of 5.1, you now get an error back to your application if you are not in Repeatable Read isolation level. This makes sense if you think about it, as replication correctness and consistency relies on the order of commits. If MySQL can't write the statements to the binary log in an absolutely consistent way, then replication slaves may apply them in a different order, and hence have databases in a different, potentially inconsistent state.
Binary logs are a core component of the MySQL database, providing point-in-time recovery, and facilitating replication as well. They also provide a window into the inner workings of the database, giving us some insight into how transactions work, and why they are important to replication.