dcsimg

MySQL Clustering in a Sandbox

May 8, 2009

Introduction

MySQL's unique architecture allows for plugin storage engines.  So for example, there is the MyISAM storage engine, the original one, and there is the ARCHIVE storage engine, which has characteristics that make it suitable for storing archived data.  And of course we know the InnoDB storage engine, with its transactional capabilities.  So it makes sense then that MySQL's clustering solution involves a storage engine as well, namely the NDB (Network DataBase) storage engine.

At a high level, MySQL's NDB storage engine is an in-memory database, which can manage data as big as your main memory of all of your nodes.  If you have two nodes, each with 5G of memory devoted to MySQL, you can handle a 10G database - data and index - for example.

Basics of MySQL NDB Cluster

In a MySQL cluster, you have data nodes, management nodes, and SQL nodes.  The SQL nodes you're already familiar with.  Those are your mysqld processes of non-clustered MySQL.   The management node or nodes monitor and control the data or cluster nodes.  So what do these mysterious cluster nodes do?  Well they keep track of this in-memory database.  They each maintain a fragment of the entire database, so as we mentioned earlier, the total size of your NDB cluster can be as big as the memory of all the data nodes combined (minus what they are using for the operating system and other tasks).

Setup Your Sandbox

For the first time user of MySQL's NDB cluster, you'll probably want to get it up and running on a single server, such as your desktop, or another server you have available.  Once you understand how all the pieces work together, you can then look at getting the hardware to handle a real cluster, with multiple servers to handle a real application load.

1. Download the sandbox from severalnines.com by visiting:

http://www.severalnines.com/sandbox/#packaged

and grab mysqlcluster-63-small.tar.gz

2. Unpackage the sandbox scripts:

$ tar xvzf mysqlcluster-63-small.tar.gz

Also note, if you're on Mac OS X, you'll want to get the macos.sh script and run it:

$ cd mysqlcluster-63-small/scripts
$ sh macos.sh

3. Now grab a copy of the cluster software from mysql.com here:
http://dev.mysql.com/downloads/cluster/7.0.html

4. Run the sandbox script to unpack everything (assumes cluster package is in scripts directory):

$ cd mysqlcluster-63-small/scripts
$ ./dist-tgz.sh mysql-cluster-gpl-7.0.5-osx10.5-x86.tar.gz

5. Now run the bootstrap script:

$ ./bootstrap.sh

6. Finally, start up the cluster:

$ ./start-cluster-initial.sh

It may take a few minutes, but this whole process should start up your management node, two data nodes, and then finally two sql nodes on 3306 and 3307.

Demonstrate Your Cluster

Now what?  Well let's create a table and add some data:

$ mysql --socket=/tmp/mysql.sock.3306
mysql> use test;
mysql> create table cluster_test (c1 varchar(64)) engine=NDB;
Query OK, 0 rows affected (0.84 sec)
mysql> insert into cluster_test values ('hi there, I am a clustered table');
Query OK, 1 row affected (0.00 sec)

Now on the other node, verify that your data is there:

$ mysql --socket=/tmp/mysql.sock.3307
mysql> use test;
mysql> select * from cluster_test;
+----------------------------------+
| c1                               |
+----------------------------------+
| hi there, I am a clustered table | 
+----------------------------------+
1 row in set (0.00 sec)

Explore Your MySQL Cluster

Obviously, this is a very simplistic setup, but all the moving parts are there, and that's the important thing.  For instance, take a look in the mysqlcluster-63-small/datadir directory:

levithia:datadir shull$ ls
mysql_1		ndb_1_out.log	ndb_3.pid
mysql_2		ndb_2.pid		ndb_3_fs
ndb_1.pid		ndb_2_fs		ndb_3_out.log
ndb_1_cluster.log	ndb_2_out.log

You see various logfiles which are worth a look, and then you see your mysql_* directories, which are your conventional mysqld datadir directories where you'll find InnoDB and MyISAM datafiles, etc.  

Also look at your mysqlcluster-63-small/config directory, and you'll find my.cnf.1 and my.cnf.2.  These files will contain some new parameters for NDB such as:

ndbcluster
ndb-connectstring=localhost:1186

The first one tells mysqld to load the hooks for using the NDB storage engine, and the second tells the mysql data node how to reach the cluster, namely the hostname and port.

Also, fire up a "top" session, or use the ps command, and take a look at what is happening. 

The "scripts" directory also contains a status script, which you might take a look at.  You should see something like this:

levithia:scripts root# ./status.sh 
Connected to Management Server at: localhost:1186
Node 2: starting (Last completed phase 3) (mysql-5.1.32 ndb-7.0.5)
Node 3: starting (Last completed phase 3) (mysql-5.1.32 ndb-7.0.5)

I would definitely recommend digging through these various scripts, and look at what they are doing for you.  The more you understand what these scripts are doing, the quicker you'll be ready to setup your own cluster manually, and manage it as well.

Further Reading

A Googly MySQL Cluster - Tech Talk with Stewart Smith from MySQL AB

http://video.google.com/videoplay?docid=-4567104036778249401

Johan Andersson's Clustering Blog:

http://johanandersson.blogspot.com/

Several Nines Sandbox:

http://www.severalnines.com/sandbox/

Conclusion

MySQL's clustering storage engine is a unique take on database clustering, unlike Oracle's or Microsoft SQL Servers solutions.  As with those, it has its strengths and weaknesses, so you'll of course need to test, test, and test some more before you can decide if the technology is right for your application, and organization.  New features that are in development include the ability to add nodes while the cluster is up and running, storing data and index information on disk, and much more.   

» See All Articles by Columnist Sean Hull








The Network for Technology Professionals

Search:

About Internet.com

Legal Notices, Licensing, Permissions, Privacy Policy.
Advertise | Newsletters | E-mail Offers