SHARE

Survey of MySQL Storage Engines

Written By

Mar 11, 2010

4 minute read

MySQL has an interesting architecture that sets it apart from some other enterprise database systems. It allows you to plug in different modules to handle storage. What that means to end users is that it is quite flexible, offering an interesting array of different storage engines with different features, strengths, and tradeoffs.

Introduction
MySQL’s Storage Engine Architecture
MyISAM
Archive
InnoDB
Merge
Memory
CSV
NDB Cluster
Conclusions

Introduction

MySQL
has an interesting architecture that sets it apart from some other enterprise
database systems. It allows you to plug in different modules to handle
storage. What that means to end users is that it is quite flexible,
offering an interesting array of different storage engines with different
features, strengths, and tradeoffs.

MySQL’s Storage Engine Architecture

The
various storage engines as the name implies manage how data is stored and
retrieved on disk, and the features available therein. One important
thing to keep in mind however is that MySQL doesn’t manage transactions at the
server level, but at the storage engine level. This is great if you have
a mix of InnoDB tables in a query, or a mix of MyISAM ones, but if you have
both in the same transaction, not only might performance be impacted, but also
results may not be reliable. Although applications can use a mix of
different types of tables using different underlying engines, single queries
and transactions should not in general do this.

Do
you want to see what storage engines you have available? Just query your
MySQL variables as follows:

mysql&g; show variables like 'have%';
+-----------------------+----------+
| Variable_name         | Value    |
+-----------------------+----------+
| have_archive          | NO       | 
| have_bdb              | YES      | 
| have_blackhole_engine | NO       | 
| have_compress         | YES      | 
| have_crypt            | YES      | 
| have_csv              | NO       | 
| have_dynamic_loading  | YES      | 
| have_example_engine   | NO       | 
| have_federated_engine | NO       | 
| have_geometry         | YES      | 
| have_innodb           | YES      | 
| have_isam             | NO       | 
| have_merge_engine     | YES      | 
| have_ndbcluster       | NO       | 
| have_openssl          | DISABLED | 
| have_ssl              | DISABLED | 
| have_query_cache      | YES      | 
| have_raid             | NO       | 
| have_rtree_keys       | YES      | 
| have_symlink          | YES      | 
+-----------------------+----------+

Note
these are features denoted by "have_*" so some of these listed are
not storage engines.

MyISAM

This
storage engine was one of the first. Locking is done at the table level
so that can be a bottleneck very quickly for highly concurrent
applications. It does not support transactions; however, it does support
fulltext indexes, which can certainly come in handy for some
applications. With MyISAM, each table has three files, one for the table data,
one for index data, and one for the definition or frm file. MySQL can
allocate a key cache for MyISAM index data, however caching of the table data
is left to the filesystem. So be sure to leave enough memory to the OS if
you are using large MyISAM tables. MyISAM supports some table repair
after a crash, but does suffer from corruption more often than other table
types, such as InnoDB.

InnoDB

Perhaps
the best-known storage engine, InnoDB, supports row-level locking and
multi-version concurrency control. This means in high concurrency
applications such as web-facing ones, InnoDB is ideal. InnoDB stores data
in tablespaces although the underlying storage is managed internally. It
reads and writes data in chunks called blocks, and caches in a configurable
buffer cache both data and index blocks. InnoDB also supports crash
recovery. If you pull the plug on the box, and then reboot, when MySQL
starts up, it will rollback all standing and open transactions that were
interrupted by the power failure.

Merge

The
Merge storage engine operates on underlying MyISAM tables similar to how a VIEW
would. In other words, it’s not storing data itself per se, but providing
a container of sorts for them. As it turns out this code is very closely
related to how partitions are handled and created. Merge tables help you
manage large volumes of data more easily. They can be useful for
logging in applications as you can easily remove old data by dropping the
underlying tables. You can think of them as UNION ALL Views. One
thing to keep in mind about MERGE tables is that the create statement does not
check the underlying tables for compatibility so you may not find out they’re
incompatible until you use them.

Memory

These
tables as you might expect are stored in-memory. The structure will
persist past a database restart, but the data will not. They’re good for
lookup tables, or caching results during aggregation. Although access is
fast, concurrency may be limited because they do table-level locking.

CSV

This
storage engine is a little less known and kind of an interesting one. It
allows you to treat filesystem files in csv format as tables! Want to
select certain rows, do some manipulations on those, and insert them into a
table? This is a great way to do it. This is very similar to what
Oracle calls "External Tables".

NDB Cluster

Another
well-known storage engine. It allows one to cluster tables. That
means you have multiple masters, which all can do inserts, updates and deletes
on the same table. NDB has row-level locking, but not full multi-version
concurrency control. Due to some architectural limitations of how the
MySQL server executes joins, they perform rather poorly with NDB storage
engine. For that reason, it is best used for single table primary key
lookups. So, you would not likely port your entire web-facing database to
NDB Cluster for example.

Conclusions

The
MySQL architecture offers a rich selection of storage engines with different
characteristics, strengths, and weaknesses. We’ve covered some of the
most common ones here. Next month we’ll look at some of the newer and
also some more exotic ones such as Blackhole, Maria, PBXT, and more.

»

See All Articles by Columnist
Sean Hull

Sean Hull

Sean Hull is a business and technology consultant, author and speaker. He is the founder and senior consultant at Heavyweight Internet Group in New York and has worked as a database consultant and expert for ten years. He is the author of "Oracle and Open Source" on O'Reilly & Associates and his articles have appeared in Oracle's own OTN website, DBA Zine, ChangeThis.com, and many others. Using techniques, technologies, and perspectives discussed in the book, articles, and seminars, he has taken his career into exciting areas. He has served as Technology Consultant, Project Manager, CEO, and mentor with a background in programming, systems integration & administration, project development and management. He can be reached at shull@iheavy.com or visit http://www.iheavy.com for more info about consulting services.