MySQL Replication Pitfalls

Introduction

MySQL
has some powerful replication technology. It can be configured to replicate
everything, or subsets of schemas or even individual tables. Those slave
databases can themselves be masters as well, feeding still more databases a stream of
transactions from the master.

Despite
the technology’s obvious power, it is not bulletproof. There are times when
you find replication can break, throwing up errors in the slave error-log, and
stopping the receiving of transactions in their tracks.

In
this article we’re going to show you some of the issues and problems with MySQL
replication, and give you examples to show you why those are issues.

Mixed Transactional & Non-Transactional Tables

As
you probably already know, MySQL supports a number of different storage
engines. This powerful architecture allows MySQL to take advantage of the
transactional power of InnoDB tables when you need that, and the lightening
speed of MyISAM tables, when you don’t need row-level locking and ACID
compliance.

When
it comes to replication, however, it can pose challenges. The binary log won’t
receive transactions that either fail, or otherwise rollback via a user
request. That’s good, so they won’t be sent down the pike to the slave
database; but what if you have a mixed query with MyISAM and InnoDB tables?
Well it turns out that the transaction in that session will replicate fine.
However if there are additional concurrent sessions updating that MyISAM table
before that transaction completes, then those statements will get logged out of
order.

There
are also situations where you can get duplicates occurring in similar
situations. The slave may just stop, putting those messages in the error log.
But lets hope you don’t have to rebuild from the master if this happens.

The
long and short of this is that the MySQL docs specifically caution against
mixing MyISAM and InnoDB tables in transactions. Your best bet is to convert
your MyISAM tables to InnoDB. I would recommend this in general, except for
the particular tables you really need that speed boost from. So I would
default to the InnoDB engine in general.

Here’s
a way to find out how many tables you have of different types:


mysql> select count(*), engine from information_schema.tables group by engine;
+———-+——–+
| count(*) | engine |
+———-+——–+
| 1 | InnoDB |
| 13 | MEMORY |
| 4 | MyISAM |
+———-+——–+
3 rows in set (0.02 sec)

Convert
a table to InnoDB this way:

mysql> ALTER TABLE mytable ENGINE = InnoDB;

Non-Deterministic Behavior

It
turns out that transactions with mixed tables are not the only problems. There
are also a number of functions which pose problems for MySQL statement-based
replication. They are:


CURRENT_USER()
LOAD_FILE()
UUID()
UUID_SHORT()
USER()
FOUND_ROWS()
SYSDATE()
VERSION()

Non-deterministic
is a technical way of saying something won’t happen the same way all of the
time. In this case, these functions won’t behave the same way on the slave (or
at least we can’t guaranteed they will. In the case of sysdate, it is not
affected by the SET TIMESTAMP that is placed in the binary log. There is a
flag to fix this (–sysdate-is-now).

These
functions may also cause problems.


GET_LOCK()
RELEASE_LOCK()
IS_FREE_LOCK()
IS_USED_LOCK()

It
turns out there are even more nefarious ways things can get out of whack. If
you are using ORDER BY with LIMIT clause in an UPDATE, and there are ties, the
rows can be returned in different orders by the slaves. Woops. Does it mean
you can’t use ORDER BY? No it just means that you need to include additional
columns to guarantee there won’t be any ties in the ordering.

So
if you want your replication to run right, audit your code for
non-deterministic behavior. I also recommend digging into the MySQL docs on
this question, in the section "Replication Features and Issues"

http://dev.mysql.com/doc/refman/5.0/en/replication-features.html

Other Problems

Statement-based
replication may have other problems. For instance, if you have stored
procedures, functions or triggers you may have problems with your replication
setup.

There
are also a number of issues related to using AUTO_INCREMENT. Specifically if
you add an AUTO_INCREMENT to a column with ALTER TABLE, you can run into
problems.

There
are also some problems with replication and session variables. The example the
documentation uses if you do:


mysql> SET MAX_JOIN_SIZE=1000
mysql> INSERT INTO mytable VALUES (@@MAX_JOIN_SIZE)

It
won’t replicate properly.

There
are also issues if the master is being shutdown while replication is happening,
or if the master instance crashes.

You
can also run into issues with replication and floating-point values. The
conversion from decimal to binary is approximate, and the architecture on the
slave server could affect these calculations as well.

Again,
review the document "Replication Features and Issues" and keep an eye
on new releases to know specifically what issues are relevant to your
installation.

Conclusion

Replication
in MySQL promises a lot. It’s fairly trivial to get up and running, almost too
easy. However, there are many headaches you’ll likely encounter along the
way. From non-deterministic behavior, to system crashes on the master, there
is a lot to watch out for.

In
our next article, we’ll discuss what to do to make your replication setup work
reliably. In addition, we’ll help you keep your data in sync, and show you how
to verify that it is the same on both master and slave.

»
See All Articles by Columnist Sean Hull

Sean Hull
Sean Hull
Sean Hull is a business and technology consultant, author and speaker. He is the founder and senior consultant at Heavyweight Internet Group in New York and has worked as a database consultant and expert for ten years. He is the author of "Oracle and Open Source" on O'Reilly & Associates and his articles have appeared in Oracle's own OTN website, DBA Zine, ChangeThis.com, and many others. Using techniques, technologies, and perspectives discussed in the book, articles, and seminars, he has taken his career into exciting areas. He has served as Technology Consultant, Project Manager, CEO, and mentor with a background in programming, systems integration & administration, project development and management. He can be reached at [email protected] or visit http://www.iheavy.com for more info about consulting services.

Latest Articles