Table types in MySQL: Part 2 – InnoDB tables

Last month we looked at
the HEAP
table type, a table type which runs entirely in memory. This month we look at
setting up the InnoDB table type, the type of most interest to serious users.
The standard MyISAM table type is ideal for website use, where there are many
reads in comparison to writes, and no transactions. Where these conditions do
not apply (and besides websites, they do not apply often in the database
world), the InnoDB table is likely to be the table type of choice. This article
is aimed at users who are familiar with MySQL, but have only used the default MyISAM
table type.

InnoDB Features

  • ACID-compliant
    transactions.

  • Full
    referential integrity

  • Row-level
    locking

  • Tables are
    stored in a tablespace (unlike MyISAM tables where each table is a file)

What is an ACID-compliant transaction?

Database requests
(selects, updates or inserts) happen linearly, one after the other. When one
user has a related set of requests, or there are many users working at the same
time, it is important that the results remain consistent. A transaction
is a set of related SQL statements making up one logical whole, for example, a
sales transaction could consist of updates to customer, sales and product
tables. A transaction is said to be committed when it is completed. To
maintain data integrity, there are four conditions which must apply to
transactions, encapsulated by the acronym ACID. These are Atomicity, Consistency,
Isolation and Durability. Very briefly, atomicity means that the
whole transaction must complete, and if this is not possible, none of the
individual statements must be carried out. Consistency refers to the state of
the data, and the rules to ensure this state is maintained. For example, each
invoice may have to relate to a customer. During a transaction, these rules can
be broken (an invoice could be inserted before the customer), but once the
transaction is complete, the consistency must be restored. Isolation means that
data changed during one transaction cannot be used by another until the first
transaction is complete. Two transactions trying to reduce a bank balance
cannot both work off the same balance (if you had two transactions removing
$100 from a $1000 account, the second must obviously work off a $900 balance,
once the first is complete.) Each transaction works in isolation. Durability
means that once a transaction has been committed, the data remains consistent. Therefore,
if the database crashes in the middle of a transaction, a restore will return
the data to the situation it was in as of the most recently committed
transaction.

Creating InnoDB tables

Being a more complex
table type, InnoDB tables require slightly more administration savvy than the
default MyISAM tables. InnoDB tables are created in a tablespace, which
is not the same as the MyISAM table, that being simply a file on the filesystem.
Assuming you are running a stable version of MySQL 4 (4.0.15 is the latest at
the time of writing), MySQL automatically creates a file called ibdata1
in the data directory (datadir, usually C:MYSQLdata on Windows
systems, or /usr/local/mysql/data or /usr/local/var on Unix-based
systems). This file begins at 10MB, but is auto-extending, meaning it grows
larger (in 8MB blocks) as your data expands. Early versions of MySQL 4 set the
table to be 64MB, but not auto-extending, meaning when that was full, you would
be unable to add more data).

With all this happening
automatically, all that is left to create an InnoDB table is simply the CREATE
statement. Here is an example:

CREATE TABLE innodb_table1(field1 INT, field2 CHAR(10), INDEX (field1)) 
TYPE=INNODB; 

The syntax is identical
to what you are probably used to, but with the type specified at the end.

InnoDB configuration options

There are a number of
important configuration variables to consider. All of these are set in the MySQL
configuration file (usually my.cnf or my.ini, and located in the
data directory). The most important of these is innodb_data_file_path,
which specifies the space available to the tables (including both data and
indexes). An example:

innodb_data_file_path = ibdata1:10M:autoextend 

This is the default
setting, indicating that there is an initial file, ibdata1 of 10MB,
which is auto-extending. You can set data files to appear on different disks as
well. For example:

innodb_data_file_path = 
/disk1/ibdata1:500M;/disk2/ibdata2:300MB;/disk3/ibdata3:100MB:autoextend 

You will usually want to
make sure that the final file is auto-extending, just to cover you if you run
out of ‘logical’ space, but of course, nothing covers you if the physical disk
runs out of space. If you want to use absolute file paths in this parameter,
you need to set the value of innodb_data_home_dir to an empty string. To
make any changes to your configuration file take effect, you will need to
restart the server.

Here is a list of the
other InnoDB configuration variables, with a brief description of their
function:

  • innodb_data_home_dir:
    Where InnoDB data files are stored. If left blank, this will be the same as datadir,
    and if set to an empty string, you can use absolute file paths in innodb_data_file_path.

  • innodb_mirrored_log_groups:
    Should be set to 1 (the number of identical copies of log groups kept for the
    database.)

  • innodb_log_group_home_dir:
    Where InnoDB log files are kept (If this is not set, datadir is used.

  • innodb_log_files_in_group:
    Number of log files in the log group (logs are written in rotation). Defaults
    to 2, which is usually sufficient.

  • innodb_log_file_size:
    Size in megabytes of each log file. Defaults to 5MB. The larger it is, the
    slower a recovery after a crash will be, but marginal disk I/O will be saved.

  • innodb_log_buffer_size:
    Size of the buffer used to write log files. The larger it is, the less need to
    write to disk in the middle of a transaction. 8MB is suggested.

  • innodb_flush_log_at_trx_commit:
    0, 1 or 2. Should almost always be set to 1 for safety reasons, which writes
    the log to disk and flushes the disk when a transaction is committed. 0 does
    this about once a second, while 2 writes the log to disk immediately, but
    flushes the disk about once a second.

  • innodb_log_arch_dir:
    If log archiving was used, this would be the directory where these files are
    stored. Should currently be the same as innodb_log_group_home_dir.

  • innodb_log_archive:
    Set to 0, as MySQL recovers using its own log files.

  • innodb_buffer_pool_size:
    Size in bytes of the memory buffer used to cache table data and indexes. The
    larger the better if you can spare it (up to 80% on dedicated database
    servers).

  • innodb_buffer_pool_awe_mem_mb:
    Used from MySQL 4.1, and on Windows only, this variable sets the size of the
    buffer pool placed in Address Windowing Extension (AWE) memory (maximum 64000).

  • innodb_additional_mem_pool_size:
    Size in bytes of a memory pool that stores information about the internal data
    structures. 2MB is suggested initially, but if you see warning messages in the MySQL
    error log about allocating memory from the operating system, you should
    increase this.

  • innodb_file_io_threads:
    Number of file I/O threads in InnoDB. 4 is suggested on Unix systems, often
    higher for Windows systems.

  • innodb_lock_wait_timeout:
    Time in seconds an InnoDB transaction waits before rolling back (used only in
    case of external deadlocks, such as from LOCK TABLES statements)

  • innodb_flush_method:
    Flushing method (default fdatasync,
    which is usually faster, although on some systems O_DSYNC can be
    faster).

  • innodb_force_recovery:
    Only set this when you want to dump data from a corrupt database! See the
    manual for more if you need to use this option.

If something goes wrong

When you start the
server, MySQL creates the tablespace, data files and log files. If something
goes wrong, the problem is usually one of two things: permissions, or a syntax
error in the configuration file. Make sure all directories you are trying to
store data and logs in already exist, and that the server is assigned
permission to write in them. Delete all files created in a failed startup
before trying again.

In future articles we
will look in more detail at some of the options available when using InnoDB
tables, but now that you know how to create them, the best way to find out more
is to begin experimenting – on a development server of course! Good luck.

»


See All Articles by Columnist
Ian Gilfillan

Ian Gilfillan
Ian Gilfillan
Ian Gilfillan lives in Cape Town, South Africa. He is the author of the book 'Mastering MySQL 4', published by Sybex, and has been working with MySQL since 1997. These days he develops mainly in PHP and MySQL, although confesses to starting out with BASIC and COBOL, way back when, and still has a soft spot for Perl. He developed South Africa's first online grocery store, and has developed and taught internet development and other technical courses for various institutions. He has majors in Programming and Information Systems, as well as English and Philosophy. For 5 years he was Lead Developer and IT Manager for Independent Online, South Africa's premier news portal. However, he has now 'retired' from fulltime work, and is hoping that his next book will be more in the style of William Blake and Allen Ginsberg.

Latest Articles