Advanced MySQL Replication - Improving Performance
February 12, 2009
MySQL Replication is some sophisticated and flexible technology. As we discussed in our recent article "Fixing MySQL Replication", it can be made quite reliable and robust if the right tools are used to keep it running smoothly.
But what of sites who are experiencing enormous load on the primary server, which is overloading the slave server. Are there ways to speed up performance, so the slave can keep up? As it turns out there are quite a number of ways to do this. We'll discuss a number of them.
First, a quick discussion of the architecture. As you recall, there are two threads on the slave that are performing work. One is the IO thread, which copies transactions from the master servers binary log to the slaves relay log. The next is the SQL thread. This is the important one. It reads the relay logs and applies the SQL queries against the slave database in a serial fashion. Once again that means all writes are serialized on the slave database. This can become a performance bottleneck in high transaction environments.
Use Multiple Slaves
One method would be to break up data into multiple slaves. This could be done database by database (schema by schema) creating one slave database. Each slave would be pointing to the same master, and using replicate_do_db and replicate_ignore_db statements to specify which data to capture for it's slave. However it may be that your applications are not broken up cleanly that way, or that load favors certain tables, rather than certain schemas. So for your site there may be reason to do it on a table-by-table basis. You would then use replicate_do_table and replicate_ignore_table. You'll also need to keep in mind that more slaves mean more threads on the master database, which could add load there. If you run into that problem, consider creating a distribution master, and then slaving off of that. As a further consideration, realize that if you are running backups off of slave databases, your backup scripts will obviously be more complicated in this arrangement, so document well.
Priming the Cache
A second method would be to do what's called "priming the cache". Recall that each of the SQL statements, which executed on the master, is re-executed on the slave database. All the work involved in reading blocks of data, sorting, and caching query execution details has to happen again on the slave. However, what if we could do this all BEFORE the slave SQL thread gets to that query. That's what the maatkit mk-slave-prefetch tool was built to do. Some sites have had great success with this method, so it is worth investigating. As with anything, test, tweak, and evaluate before rolling out in production.
Move Writes Outside of Replication
A third method would be to move writes outside of replication. What does that mean? Suppose you have tables that load apache logs into your primary database. You can then use replicate_ignore_tables option to skip these heavy archive tables, and then just run a separate data load process on the slave to get that data into your slave database. Since this can also be done in parallel, it may provide surprising speedups.
Tune Disk IO Subsystem
A fourth method could be to look at the underlying disk subsystem. Are you using RAID? If not, consider it. Can you get faster disks, or add more spindles to your RAID array? If so, these options can speed up overall IO throughput to the volume where your datafiles are sitting.
Consider the MyISAM Storage Engine
A fifth option would be to look at using the MyISAM storage engine on the slave side for those high write-heavy tables. Recall that MyISAM is not transactional. There is a huge amount of code involved in providing the row-based locking, and ACID compliance you find in InnoDB and related transactional storage engines. MyISAM is blazingly fast because it doesn't have to check anything. It simply writes. Therefore, it is very very fast.
You might ask, but what about my transactional integrity. Remember though that all transactions are serialized on the slave, so there is no worry of other sessions reading or writing the same data. You only need to worry about the SQL thread.
Give Up Some Safety
The sixth option we'll mention involves giving up some safety and recoverability on the slave. Since your slave database is presumably a copy of data held elsewhere, recommending less safe options can be seen in that context. Obviously, be sure all your data has been sufficiently backed up in various places.
Firstly, you can disable the binary log on the slave. This will reduce the amount of data that needs to be written while the slave is executing queries from the SQL thread.
Secondly you can configure InnoDB to flush its changes less frequently using innodb_flush_log_at_trx_commit=2. In addition, you can set innodb_locks_unsafe_for_binlog=1. For MyISAM there is a setting delay_key_write=ALL which may help.
Since these options make your database less recoverable, you want to be VERY SURE to disable them if this slave becomes the master at some point.
If the above options don't work for you, consider looking at MySQL 5.1. The new version of MySQL includes a feature called row-based replication. In contrast with the existing statement-based replication, row-based can often avoid re-executing statements on the slave by passing along the change vector of actual data that changed on the primary. Time will tell if the overall performance is noticeably faster, but it's worth investigating.
When looking at ways to speed up the slave, keep in mind that lag on the slave server is normal. MySQL's out-of-the-box slave technology is not meant to be synchronous. If your application has that requirement, we recommend looking at the Google patches to provide semi-synchronous replication. We'll discuss that in more detail in next month's article as we continue our investigation of advanced MySQL replication techniques.
» See All Articles by Columnist Sean Hull