MySQL Clustering in a Sandbox

Introduction

MySQL’s
unique architecture allows for plugin storage engines.  So for example,
there is the MyISAM storage engine, the original one, and there is the ARCHIVE
storage engine, which has characteristics that make it suitable for storing
archived data.  And of course we know the InnoDB storage engine, with its
transactional capabilities.  So it makes sense then that MySQL’s
clustering solution involves a storage engine as well, namely the NDB (Network DataBase)
storage engine.

At a
high level, MySQL’s NDB storage engine is an in-memory database, which can
manage data as big as your main memory of all of your nodes.  If you have
two nodes, each with 5G of memory devoted to MySQL, you can handle a 10G database –
data and index – for example.

Basics of MySQL NDB Cluster

In a
MySQL cluster, you have data nodes, management nodes, and SQL nodes.  The
SQL nodes you’re already familiar with.  Those are your mysqld processes
of non-clustered MySQL.   The management node or nodes monitor and control
the data or cluster nodes.  So what do these mysterious cluster nodes
do?  Well they keep track of this in-memory database.  They each
maintain a fragment of the entire database, so as we mentioned earlier, the
total size of your NDB cluster can be as big as the memory of all the data
nodes combined (minus what they are using for the operating system and other
tasks).

Setup Your Sandbox

For the
first time user of MySQL’s NDB cluster, you’ll probably want to get it up and
running on a single server, such as your desktop, or another server you have
available.  Once you understand how all the pieces work together, you can
then look at getting the hardware to handle a real cluster, with multiple
servers to handle a real application load.

1.
Download the sandbox from severalnines.com by visiting:

http://www.severalnines.com/sandbox/#packaged

and grab
mysqlcluster-63-small.tar.gz

2. Unpackage
the sandbox scripts:

$ tar xvzf mysqlcluster-63-small.tar.gz

Also
note, if you’re on Mac OS X, you’ll want to get the macos.sh script and run it:


$ cd mysqlcluster-63-small/scripts
$ sh macos.sh

3. Now
grab a copy of the cluster software from mysql.com here:

http://dev.mysql.com/downloads/cluster/7.0.html

4. Run
the sandbox script to unpack everything (assumes cluster package is in scripts
directory):


$ cd mysqlcluster-63-small/scripts
$ ./dist-tgz.sh mysql-cluster-gpl-7.0.5-osx10.5-x86.tar.gz

5. Now
run the bootstrap script:

$ ./bootstrap.sh

6.
Finally, start up the cluster:

$ ./start-cluster-initial.sh

It may take
a few minutes, but this whole process should start up your management node, two
data nodes, and then finally two sql nodes on 3306 and 3307.

Demonstrate Your Cluster

Now
what?  Well let’s create a table and add some data:


$ mysql –socket=/tmp/mysql.sock.3306
mysql> use test;
mysql> create table cluster_test (c1 varchar(64)) engine=NDB;
Query OK, 0 rows affected (0.84 sec)
mysql> insert into cluster_test values (‘hi there, I am a clustered table’);
Query OK, 1 row affected (0.00 sec)

Now on
the other node, verify that your data is there:


$ mysql –socket=/tmp/mysql.sock.3307
mysql> use test;
mysql> select * from cluster_test;
+———————————-+
| c1 |
+———————————-+
| hi there, I am a clustered table |
+———————————-+
1 row in set (0.00 sec)

Explore Your MySQL Cluster

Obviously,
this is a very simplistic setup, but all the moving parts are there, and that’s
the important thing.  For instance, take a look in the
mysqlcluster-63-small/datadir directory:


levithia:datadir shull$ ls
mysql_1 ndb_1_out.log ndb_3.pid
mysql_2 ndb_2.pid ndb_3_fs
ndb_1.pid ndb_2_fs ndb_3_out.log
ndb_1_cluster.log ndb_2_out.log

You see
various logfiles which are worth a look, and then you see your mysql_*
directories, which are your conventional mysqld datadir directories where you’ll
find InnoDB and MyISAM datafiles, etc.  

Also look
at your mysqlcluster-63-small/config directory, and you’ll find my.cnf.1 and
my.cnf.2.  These files will contain some new parameters for NDB such as:


ndbcluster
ndb-connectstring=localhost:1186

The
first one tells mysqld to load the hooks for using the NDB storage engine, and
the second tells the mysql data node how to reach the cluster, namely the
hostname and port.

Also,
fire up a "top" session, or use the ps command, and take a look at
what is happening. 

The
"scripts" directory also contains a status script, which you might take
a look at.  You should see something like this:


levithia:scripts root# ./status.sh
Connected to Management Server at: localhost:1186
Node 2: starting (Last completed phase 3) (mysql-5.1.32 ndb-7.0.5)
Node 3: starting (Last completed phase 3) (mysql-5.1.32 ndb-7.0.5)

I would
definitely recommend digging through these various scripts, and look at what
they are doing for you.  The more you understand what these scripts are
doing, the quicker you’ll be ready to setup your own cluster manually, and
manage it as well.

Further Reading

A Googly
MySQL Cluster – Tech Talk with Stewart Smith from MySQL AB

http://video.google.com/videoplay?docid=-4567104036778249401

Johan Andersson’s
Clustering Blog:

http://johanandersson.blogspot.com/

Several
Nines Sandbox:

http://www.severalnines.com/sandbox/

Conclusion

MySQL’s
clustering storage engine is a unique take on database clustering, unlike
Oracle’s or Microsoft SQL Servers solutions.  As with those, it has its
strengths and weaknesses, so you’ll of course need to test, test, and test some
more before you can decide if the technology is right for your application, and
organization.  New features that are in development include the ability to
add nodes while the cluster is up and running, storing data and index
information on disk, and much more.   

»
See All Articles by Columnist Sean Hull

Sean Hull
Sean Hull
Sean Hull is a business and technology consultant, author and speaker. He is the founder and senior consultant at Heavyweight Internet Group in New York and has worked as a database consultant and expert for ten years. He is the author of "Oracle and Open Source" on O'Reilly & Associates and his articles have appeared in Oracle's own OTN website, DBA Zine, ChangeThis.com, and many others. Using techniques, technologies, and perspectives discussed in the book, articles, and seminars, he has taken his career into exciting areas. He has served as Technology Consultant, Project Manager, CEO, and mentor with a background in programming, systems integration & administration, project development and management. He can be reached at [email protected] or visit http://www.iheavy.com for more info about consulting services.

Latest Articles