The MySQL table types
We all know accessing data in RAM is a lot quicker than accessing data on disk. So, with this in mind, wouldn’t it make sense for MySQL to store as much data as possible in memory? The answer is of course yes, and MySQL can in a number of ways. Query caches, holding indexes in memory, and most extreme of all, holding an entire table in memory. The latter is the topic of this article. Tables in memory are called HEAP tables. The MySQL table types include:
- MyISAM tables, the default table type, which employs table locking, making it ideal for high volume select, low volume insert/update environments, such as most websites.
- InnoDB tables, which supports transactions and referential integrity
- ISAM tables, the old MySQL table type, replaced by MyISAM in most instances now.
- BDB, which are also transaction safe.
- MERGE tables, a table type for a collection of MyISAM table types, allowing them to be used as one.
Creating a HEAP table
Creating a HEAP table is simply a matter of specifying the table type as HEAP, for example as follows:
mysql> CREATE TABLE heapofdust (id INT, fname VARCHAR (40)) TYPE=HEAP; Query OK, 0 rows affected (0.08 sec)
Note that HEAP tables cannot contain fields of type TEXT or BLOB. If you try to create a HEAP table containing one of these types, you will get an error, as follows:
mysql> CREATE TABLE heapofleaves (id INT, fname VARCHAR (40), description TEXT) TYPE=HEAP; ERROR 1163: The used table type doesn't support BLOB/TEXT columns
Syntax-wise, you can now INSERT and SELECT data as you would for an ordinary MyISAM table, just faster. In early 2002, John Lim wrote some benchmarks to measure the difference HEAP tables make. He found that HEAP tables were faster than MyISAM tables, but not by much. It is likely that with MySQL 4’s table cache, the differences will be even less. These were his results:
Table type | 100 000 inserts (seconds) | Inserts per s | 50 000 selects (s) | Selects/s |
---|---|---|---|---|
MySQL 3.23 Heap (Mem used: 10Mb) | 23.82 | 4,198 | 21.96 | 2,276 |
MySQL 3.23 MyISAM | 33.42 | 2,992 | 25.91 | 1,930 |
You can find the full details of his benchmarks (which also compare other DBMS’) at the PHP Everywhere site.
Of course, benchmarks mean little if they aren not relevant to your setup. Systems with slow disks may benefit more from a move to HEAP. Most important to consider is whether you have enough memory to hold the entire table in memory. If at any point you run out, the table will automatically be converted to MyISAM, although this may come at an inconvenient time, and you will want to avoid it. There are a number of ways to manage memory usage and HEAP tables. First, the MySQL variable, max_heap_table_size sets the maximum size a HEAP table can be (before converting it to MyISAM). This variable is set in the my.cnf (or my.ini) file. You can read more about setting the MySQL variables in the article Optimizing MySQL: Hardware and the Mysqld Variables.
The alternative (you can use both methods together) is to specify the MAX_ROWS attribute when creating the table.
mysql> CREATE TABLE heapofleaves (id INT, fname VARCHAR (40)) TYPE=HEAP MAX_ROWS=1;
HEAP tables and indexes
Heap tables have a number of limitations when it comes to the use of indexes. Let’s take a look at this behavior, and compare it with an identical MyISAM table. First, we will create two identical tables, one HEAP, the other MyISAM:
mysql> CREATE TABLE myisam1 (id INT, name VARCHAR(40) NOT NULL, INDEX(name)) TYPE MYISAM; mysql> CREATE TABLE heap1 (id INT, name VARCHAR(40) NOT NULL, INDEX(name)) TYPE=HEAP; mysql> INSERT INTO myisam1 VALUES(1,'ian'),(2,'Mzwake'),(3,'Helena'); mysql> INSERT INTO heap1 VALUES(1,'ian'),(2,'Mzwake'),(3,'Helena');
Now we will perform a simple query, returning a specific record. We will use EXPLAIN to see how the indexes are used (see here for more details on using EXPLAIN, and the index types).
mysql> EXPLAIN SELECT * FROM heap1 WHERE name='ian'; +-------+------+---------------+------+---------+-------+------+------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +-------+------+---------------+------+---------+-------+------+------------+ | heap1 | ref | name | name | 40 | const | 10 | where used | +-------+------+---------------+------+---------+-------+------+------------+ mysql> EXPLAIN SELECT * FROM myisam1 WHERE name='ian'; +---------+------+---------------+------+---------+-------+------+------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +---------+------+---------------+------+---------+-------+------+------------+ | myisam1 | ref | name | name | 40 | const | 1 | where used | +---------+------+---------------+------+---------+-------+------+------------+
As expected, the use of indexes is identical and efficient. However, when we alter the query to look for a range (all names greater than a specific name), note the differences between the two table types.
mysql> EXPLAIN SELECT * FROM myisam1 WHERE name>'ian'; +---------+-------+---------------+------+---------+------+------+------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +---------+-------+---------------+------+---------+------+------+------------+ | myisam1 | range | name | name | 40 | NULL | 2 | where used | +---------+-------+---------------+------+---------+------+------+------------+ mysql> EXPLAIN SELECT * FROM heap1 WHERE name>'ian'; +-------+------+---------------+------+---------+------+------+------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +-------+------+---------------+------+---------+------+------+------------+ | heap1 | ALL | name | NULL | NULL | NULL | 3 | where used | +-------+------+---------------+------+---------+------+------+------------+
The MyISAM table uses a ‘range’ join, while the HEAP table does an ‘ALL’ join, the worst type. HEAP tables only apply an index with the ‘=’ and ‘<=>’ operators. If your application relies heavily on range indexes, HEAP tables may even be slower than MyISAM tables. Let’s look at another important difference:
mysql> EXPLAIN SELECT * FROM myisam1 WHERE name LIKE 'ia%'; +---------+-------+---------------+------+---------+------+------+------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +---------+-------+---------------+------+---------+------+------+------------+ | myisam1 | range | name | name | 40 | NULL | 1 | where used | +---------+-------+---------------+------+---------+------+------+------------+ mysql> EXPLAIN SELECT * FROM heap1 WHERE name LIKE 'ia%'; +-------+------+---------------+------+---------+------+------+------------+ | table | type | possible_keys | key | key_len | ref | rows | Extra | +-------+------+---------------+------+---------+------+------+------------+ | heap1 | ALL | name | NULL | NULL | NULL | 3 | where used | +-------+------+---------------+------+---------+------+------+------------+
HEAP tables can only use whole keys to return results, while MyISAM can use prefixes. In summary. here are the important differences:
- HEAP indexes cannot be built on BLOB or TEXT fields
- HEAP tables cannot use partial keys (prefixes)
- HEAP tables do not support AUTO_INCREMENT fields
- HEAP indexes can only use the ‘=’ and ‘<=>’ operators
- HEAP indexes cannot be used to return an ‘ORDER BY’ result.
- HEAP indexes cannot provide information on how many records are between two results, which would assist the optimizer in its choice of index.
The last three reasons are all related. HEAP tables use hashed indexes, which allow them to pinpoint specific matches extremely quickly, but do not return any kind of range data. An index matches a record, but there is no ordering to allow it to return subsequent records.
HEAP table usage
HEAP tables are automatically used by MySQL when creating temporary tables (unless they contain BLOB or TEXT fields). If the temporary table becomes too big (as determined by max_heap_table_size and tmp_table_size), the temporary table is automatically converted to MyISAM.
HEAP tables can also be used where the table size can fit into the available RAM, and where UPDATEs are infrequent. Since HEAP tables are stored in RAM, if the machine crashes or loses power, all data is lost. For this reason they’re not ideal for holding important changes, but are better suited to providing a subset of another table (which can then record the changes), making it available for fast access. The ideal scenario is one where updates occur to a MyISAM table once a day (or the changes only need to be accessible once a day) allowing a HEAP table to generated from this table for the day’s usage, providing speedy, disk-free access. For example, once a day the table heap2 is created from the table myisam1 (which can contain TEXT/BLOB fields):
mysql> CREATE TABLE heap2 TYPE=HEAP SELECT name FROM myisam1; Query OK, 3 rows affected (0.02 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql> SELECT * FROM heap2; +--------+ | name | +--------+ | Helena | | ian | | Mzwake | +--------+ 3 rows in set (0.00 sec) mysql> ALTER TABLE heap2 ADD INDEX(name); Query OK, 3 rows affected (0.00 sec) Records: 3 Duplicates: 0 Warnings: 0 mysql> DESC heap2; +-------+----------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+----------+------+-----+---------+-------+ | name | char(40) | | MUL | | | +-------+----------+------+-----+---------+-------+
The table heap2 now has an index (which can be different from that on the original MyiSAM table), and exists in memory, for lightning fast access. In the meantime, the MyISAM table can be continually updated, and not impact on the access speed of the HEAP table at all. These changes will become available to users for fast querying the next day. One small example, but hopefully it gives you some inspiration for using HEAP tables in your own environments. Good luck!