dcsimg

MySQL Replication Pitfalls

November 13, 2008

Introduction

MySQL has some powerful replication technology. It can be configured to replicate everything, or subsets of schemas or even individual tables. Those slave databases can themselves be masters as well, feeding still more databases a stream of transactions from the master.

Despite the technology's obvious power, it is not bulletproof. There are times when you find replication can break, throwing up errors in the slave error-log, and stopping the receiving of transactions in their tracks.

In this article we're going to show you some of the issues and problems with MySQL replication, and give you examples to show you why those are issues.

Mixed Transactional & Non-Transactional Tables

As you probably already know, MySQL supports a number of different storage engines. This powerful architecture allows MySQL to take advantage of the transactional power of InnoDB tables when you need that, and the lightening speed of MyISAM tables, when you don't need row-level locking and ACID compliance.

When it comes to replication, however, it can pose challenges. The binary log won't receive transactions that either fail, or otherwise rollback via a user request. That's good, so they won't be sent down the pike to the slave database; but what if you have a mixed query with MyISAM and InnoDB tables? Well it turns out that the transaction in that session will replicate fine. However if there are additional concurrent sessions updating that MyISAM table before that transaction completes, then those statements will get logged out of order.

There are also situations where you can get duplicates occurring in similar situations. The slave may just stop, putting those messages in the error log. But lets hope you don't have to rebuild from the master if this happens.

The long and short of this is that the MySQL docs specifically caution against mixing MyISAM and InnoDB tables in transactions. Your best bet is to convert your MyISAM tables to InnoDB. I would recommend this in general, except for the particular tables you really need that speed boost from. So I would default to the InnoDB engine in general.

Here's a way to find out how many tables you have of different types:

mysql> select count(*), engine from information_schema.tables  group by engine;
+----------+--------+
| count(*) | engine |
+----------+--------+
|        1 | InnoDB | 
|       13 | MEMORY | 
|        4 | MyISAM | 
+----------+--------+
3 rows in set (0.02 sec)

Convert a table to InnoDB this way:

mysql> ALTER TABLE mytable ENGINE = InnoDB;

Non-Deterministic Behavior

It turns out that transactions with mixed tables are not the only problems. There are also a number of functions which pose problems for MySQL statement-based replication. They are:

CURRENT_USER()
LOAD_FILE()
UUID()
UUID_SHORT()
USER()
FOUND_ROWS()
SYSDATE()
VERSION()

Non-deterministic is a technical way of saying something won't happen the same way all of the time. In this case, these functions won't behave the same way on the slave (or at least we can't guaranteed they will. In the case of sysdate, it is not affected by the SET TIMESTAMP that is placed in the binary log. There is a flag to fix this (--sysdate-is-now).

These functions may also cause problems.

GET_LOCK()
RELEASE_LOCK()
IS_FREE_LOCK()
IS_USED_LOCK()

It turns out there are even more nefarious ways things can get out of whack. If you are using ORDER BY with LIMIT clause in an UPDATE, and there are ties, the rows can be returned in different orders by the slaves. Woops. Does it mean you can't use ORDER BY? No it just means that you need to include additional columns to guarantee there won't be any ties in the ordering.

So if you want your replication to run right, audit your code for non-deterministic behavior. I also recommend digging into the MySQL docs on this question, in the section "Replication Features and Issues"

http://dev.mysql.com/doc/refman/5.0/en/replication-features.html

Other Problems

Statement-based replication may have other problems. For instance, if you have stored procedures, functions or triggers you may have problems with your replication setup.

There are also a number of issues related to using AUTO_INCREMENT. Specifically if you add an AUTO_INCREMENT to a column with ALTER TABLE, you can run into problems.

There are also some problems with replication and session variables. The example the documentation uses if you do:

mysql> SET MAX_JOIN_SIZE=1000
mysql> INSERT INTO mytable VALUES (@@MAX_JOIN_SIZE)

It won't replicate properly.

There are also issues if the master is being shutdown while replication is happening, or if the master instance crashes.

You can also run into issues with replication and floating-point values. The conversion from decimal to binary is approximate, and the architecture on the slave server could affect these calculations as well.

Again, review the document "Replication Features and Issues" and keep an eye on new releases to know specifically what issues are relevant to your installation.

Conclusion

Replication in MySQL promises a lot. It's fairly trivial to get up and running, almost too easy. However, there are many headaches you'll likely encounter along the way. From non-deterministic behavior, to system crashes on the master, there is a lot to watch out for.

In our next article, we'll discuss what to do to make your replication setup work reliably. In addition, we’ll help you keep your data in sync, and show you how to verify that it is the same on both master and slave.

» See All Articles by Columnist Sean Hull








The Network for Technology Professionals

Search:

About Internet.com

Legal Notices, Licensing, Permissions, Privacy Policy.
Advertise | Newsletters | E-mail Offers