For the past few years, data centers have been gradually replacing their hard drives with faster flash storage. Meanwhile, database storage engines such as MyISAM and InnoDB continue to use forty-year-old indexing algorithms for managing read and write operations. Tokutek has developed a new kind of data structure called a Fractal Tree, which is purported to achieve insertion rates five to twenty times faster in large databases (100 million rows and above). Moreover, these benefits have been shown to persist right on up to fifty billion rows!
But what’s in it for us smaller users? Are databases of “only” a few million rows too small to benefit from TokuDB’s Fractal Tree indexing? I could find plenty of good studies showing how TokuDB benefited data centers that require high insertion workloads and fast response times, and that must simultaneously store and query large volumes of rapidly arriving data. In fact, Profile Technology is using TokuDB for its Advanced Search of Facebook. Other sizeable clients include Evidenzia (peer-to-peer monitoring), Limelight Networks (agile cloud storage), Intent Media (online advertising), Jawa (mobile gaming), and many others. There was far less information on smaller players, which led me to set out and find out for myself.
What I found was very encouraging…very encouraging indeed.
How TokuDB Works
While traditional B-Tree indexing works quite well with newly written data, which is laid out sequentially on the disk, write times grow considerably as the data becomes scattered across different disk sectors over time. That’s where TokuDB’s Fractal Tree indexing shines. It uses a new kind of indexing that outperforms both B-Trees and Log-Structured Merge (LSM) trees by buffering and batching changes before writing them to disk, which in turn improves how fast data can be read.
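To make the buffering idea concrete, here is a toy sketch in Python. It is emphatically not Tokutek’s implementation, just an illustration of the principle: writes land in an in-memory buffer at the root and are pushed down the tree in batches, so many small random writes become a few large sequential ones.

```python
# Toy sketch of the buffering idea behind Fractal Tree indexes.
# NOT Tokutek's actual implementation; all names here are illustrative.

class Node:
    def __init__(self, capacity=4, children=None, pivot=None):
        self.capacity = capacity
        self.buffer = []           # pending (key, value) insert messages
        self.rows = []             # materialized, sorted rows (leaves only)
        self.children = children   # [left, right] or None for a leaf
        self.pivot = pivot         # keys < pivot go left, others go right

    def insert(self, key, value):
        # The write lands in the buffer; no leaf is touched yet.
        self.buffer.append((key, value))
        if len(self.buffer) >= self.capacity:
            self.flush()

    def flush(self):
        if self.children is None:
            # Leaf: apply the whole batch in one go, as one large write.
            self.rows = sorted(self.rows + self.buffer)
        else:
            # Internal node: push each buffered message one level down.
            for key, value in self.buffer:
                target = self.children[0] if key < self.pivot else self.children[1]
                target.insert(key, value)
        self.buffer = []

# Usage: the fourth insert fills the root's buffer and triggers a flush,
# which delivers the messages to the children as a batch.
left, right = Node(), Node()
root = Node(children=[left, right], pivot=50)
for k in (70, 10, 90, 30):
    root.insert(k, "row-%d" % k)
print(left.buffer)   # [(10, 'row-10'), (30, 'row-30')]
print(right.buffer)  # [(70, 'row-70'), (90, 'row-90')]
```

The key observation is that each insert is cheap (an append to a buffer), while the expensive movement of data down to the leaves happens only occasionally and in bulk.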
Platforms and Licensing
The latest version (6.5) is optimized for flash and is available for Linux only. As a Windows user, I was initially surprised at its exclusion, until I came across a comment from one of my readers who asked “Why would you run MySQL on Windows?” Clearly, Linux is by far the more popular MySQL platform.
For now, TokuDB is meant to fill the gap between the open source MySQL server and the über-expensive enterprise offerings of the big players like Oracle, IBM, and Microsoft. TokuDB is licensed at a modest $2,500 / 100GB / year based on usage capacity rounded to the nearest 100GB for production deployments larger than 50GB.
You can download the patched MySQL source and Fractal Tree Index library and compile MySQL yourself or you can download the fully functional binaries. I used the latter, which was as easy to set up as a standard MySQL server.
On disk, a table using the TokuDB storage engine differs from both InnoDB and MyISAM. In addition to having a .frm file, as all MySQL tables do, there is a directory containing a main.tokudb file with the data, a status.tokudb file, and a key-KEYNAME.tokudb file for each index.
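That layout can be illustrated by recreating it by hand; the file and index names below are hypothetical, simply following the pattern just described:

```shell
# Recreate the described on-disk layout for a table named `user`
# with one secondary index (names are illustrative, not from a real server).
mkdir -p demo_datadir/mydb/user
touch demo_datadir/mydb/user.frm                       # table definition
touch demo_datadir/mydb/user/main.tokudb               # row data
touch demo_datadir/mydb/user/status.tokudb             # table metadata
touch demo_datadir/mydb/user/key-idx_last_name.tokudb  # one file per index
ls demo_datadir/mydb/user
```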
The Test Environment
Being a Windows user did not prevent me from testing the Linux-based installation, thanks to virtual machine software. I went with Oracle VirtualBox because it accepts Open Virtual Appliance (.ova) files. An .ova file is a package containing the files that describe a virtual machine: an .OVF descriptor file, optional manifest (.MF) and certificate files, and other related files, saved in a single archive using .TAR packaging. When I told the folks at Tokutek about my article, they were good enough to send me a pristine testing environment, which included the Ubuntu OS with a basically configured, empty TokuDB database.
The TokuDB server also includes the usual MySQL storage engines, so InnoDB and MyISAM tables are available as well. By default, TokuDB grabs fifty percent of available RAM on start-up for caching, so I had to make sure not to set innodb_buffer_pool_size to a value that would cause the server to overallocate memory and swap – hence degrading performance. I was also constrained by the 10 GB of virtual disk allocated to the virtual host, so I couldn’t go overboard on table size. Finally, keep in mind that the virtualization software sits between the database and the physical disk, so I/O performance is partly determined by the virtualization software and the host OS.
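For example, on a small VM one might cap both caches explicitly in my.cnf; the values below are purely illustrative for a machine with a few GB of RAM:

```ini
# my.cnf (illustrative values for a small virtual machine)
[mysqld]
# TokuDB otherwise claims 50% of available RAM at start-up
tokudb_cache_size       = 1G
# keep the combined caches well under physical RAM to avoid swapping
innodb_buffer_pool_size = 512M
```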
I used MySQL Workbench to communicate with the database.
The following results should not be taken as gospel, since my tests were far from rigorous and lacked the controls that formal QA testing would require. They will, however, provide some idea of what an average user can expect to achieve “out of the box”.
I added one table to each of my test databases called user, with the following structure:
Table user
==========
user_id     int(11)      PK (autonumber)
first_name  varchar(45)
last_name   varchar(45)
sex         char(1)
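Expressed as DDL, the table might be created like this. This is a sketch: any column attributes beyond those listed above are assumptions, and the ENGINE clause would be TokuDB, InnoDB, or MyISAM depending on which test database the table belongs to.

```sql
CREATE TABLE user (
  user_id    INT(11)     NOT NULL AUTO_INCREMENT,
  first_name VARCHAR(45),
  last_name  VARCHAR(45),
  sex        CHAR(1),
  PRIMARY KEY (user_id)
) ENGINE=TokuDB;
```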
Test #1: Import one million records into the user table.
To insert the records, I created a stored procedure that set each field to random strings. These would also be used for pattern matching later:
DELIMITER $$

CREATE PROCEDURE `varchar_select`(IN p_schema_name varchar(50),
                                  IN p_table_name  varchar(50))
READS SQL DATA
BEGIN
  DECLARE p_sql     VARCHAR(255) DEFAULT "SELECT ";
  DECLARE num_rows  INT DEFAULT 0;
  DECLARE i         INT DEFAULT 0;
  DECLARE col_name  VARCHAR(50);
  DECLARE col_type  VARCHAR(50);
  DECLARE col_names CURSOR FOR
    SELECT column_name, column_type
    FROM   INFORMATION_SCHEMA.COLUMNS
    WHERE  TABLE_SCHEMA = p_schema_name
    AND    table_name   = p_table_name
    ORDER BY ordinal_position;

  -- be sure to open the cursor before calling FOUND_ROWS()!
  OPEN col_names;
  SELECT FOUND_ROWS() INTO num_rows;

  SET i = 1;
  the_loop: LOOP
    IF i > num_rows THEN
      CLOSE col_names;
      LEAVE the_loop;
    END IF;

    FETCH col_names INTO col_name, col_type;
    SET p_sql = CONCAT(p_sql, col_name, ',');
    SET i = i + 1;
  END LOOP the_loop;

  -- remove the trailing comma from the field list
  SET p_sql = TRIM(TRAILING ',' FROM p_sql);

  -- append the FROM clause, then create and execute the prepared statement
  SET p_sql = CONCAT(p_sql, ' FROM ', p_schema_name, '.', p_table_name);
  SET @sqlv = p_sql;
  PREPARE stmt1 FROM @sqlv;
  EXECUTE stmt1;
  DEALLOCATE PREPARE stmt1;
END$$

DELIMITER ;
The above version of the proc iterates over the column names and concatenates them to the SELECT string that will be executed as a prepared statement. Now, we’ll add the code to enclose the VARCHAR values within quotes.
The Updated Call to Concat()
The col_type value includes the length of varchar fields, so it may come back as varchar(50); a LIKE expression is therefore employed for type matching.

The second CONCAT parameter is a string that calls CONCAT() from within the prepared statement. Hence, it produces a string such as

CONCAT('"', col_name, '"') AS col_name,
SET p_sql = IF(col_type LIKE 'varchar%',
               CONCAT(p_sql, 'CONCAT(\'"\', ', col_name, ', \'"\') AS ', col_name, ','),
               CONCAT(p_sql, col_name, ',')
           );
Calling the varchar_select Proc
Calling our proc for the actor table does exactly what we hoped it would; the name fields are enclosed within quotes, while the other fields are not:
mysql> call varchar_select('sakila', 'actor');

actor_id  first_name  last_name       last_update
-------------------------------------------------------
1         "PENELOPE"  "GUINESS"       2006-02-15 04:34:33
2         "NICK"      "WAHLBERG"      2006-02-15 04:34:33
3         "ED"        "CHASE"         2006-02-15 04:34:33
4         "JENNIFER"  "DAVIS"         2006-02-15 04:34:33
5         "JOHNNY"    "LOLLOBRIGIDA"  2006-02-15 04:34:33
The full function and proc code is available on GitHub.
While we have achieved something fairly generic today, there are still some limitations to our proc. For one thing, it always returns all of the table columns. Moreover, there is no support for joins at all. In the next installment, we’ll take a parameterized query approach to handle more complex queries.