Last month we looked at
the HEAP
table type, a table type which runs entirely in memory. This month we look at
setting up the InnoDB table type, the type of most interest to serious users.
The standard MyISAM table type is ideal for website use, where there are many
reads in comparison to writes, and no transactions. Where these conditions do
not apply (and besides websites, they do not apply often in the database
world), the InnoDB table is likely to be the table type of choice. This article
is aimed at users who are familiar with MySQL, but have only used the default MyISAM
table type.
InnoDB Features
-
ACID-compliant
transactions. -
Full
referential integrity -
Row-level
locking -
Tables are
stored in a tablespace (unlike MyISAM tables where each table is a file)
What is an ACID-compliant transaction?
Database requests
(selects, updates or inserts) happen linearly, one after the other. When one
user has a related set of requests, or there are many users working at the same
time, it is important that the results remain consistent. A transaction
is a set of related SQL statements making up one logical whole, for example, a
sales transaction could consist of updates to customer, sales and product
tables. A transaction is said to be committed when it is completed. To
maintain data integrity, there are four conditions which must apply to
transactions, encapsulated by the acronym ACID. These are Atomicity, Consistency,
Isolation and Durability. Very briefly, atomicity means that the
whole transaction must complete, and if this is not possible, none of the
individual statements must be carried out. Consistency refers to the state of
the data, and the rules to ensure this state is maintained. For example, each
invoice may have to relate to a customer. During a transaction, these rules can
be broken (an invoice could be inserted before the customer), but once the
transaction is complete, the consistency must be restored. Isolation means that
data changed during one transaction cannot be used by another until the first
transaction is complete. Two transactions trying to reduce a bank balance
cannot both work off the same balance (if you had two transactions removing
$100 from a $1000 account, the second must obviously work off a $900 balance,
once the first is complete.) Each transaction works in isolation. Durability
means that once a transaction has been committed, the data remains consistent. Therefore,
if the database crashes in the middle of a transaction, a restore will return
the data to the situation it was in as of the most recently committed
transaction.
Creating InnoDB tables
Being a more complex
table type, InnoDB tables require slightly more administration savvy than the
default MyISAM tables. InnoDB tables are created in a tablespace, which
is not the same as the MyISAM table, that being simply a file on the filesystem.
Assuming you are running a stable version of MySQL 4 (4.0.15 is the latest at
the time of writing), MySQL automatically creates a file called ibdata1
in the data directory (datadir, usually C:MYSQLdata on Windows
systems, or /usr/local/mysql/data or /usr/local/var on Unix-based
systems). This file begins at 10MB, but is auto-extending, meaning it grows
larger (in 8MB blocks) as your data expands. Early versions of MySQL 4 set the
table to be 64MB, but not auto-extending, meaning when that was full, you would
be unable to add more data).
With all this happening
automatically, all that is left to create an InnoDB table is simply the CREATE
statement. Here is an example:
CREATE TABLE innodb_table1(field1 INT, field2 CHAR(10), INDEX (field1)) TYPE=INNODB;
The syntax is identical
to what you are probably used to, but with the type specified at the end.
InnoDB configuration options
There are a number of
important configuration variables to consider. All of these are set in the MySQL
configuration file (usually my.cnf or my.ini, and located in the
data directory). The most important of these is innodb_data_file_path,
which specifies the space available to the tables (including both data and
indexes). An example:
innodb_data_file_path = ibdata1:10M:autoextend
This is the default
setting, indicating that there is an initial file, ibdata1 of 10MB,
which is auto-extending. You can set data files to appear on different disks as
well. For example:
innodb_data_file_path = /disk1/ibdata1:500M;/disk2/ibdata2:300MB;/disk3/ibdata3:100MB:autoextend
You will usually want to
make sure that the final file is auto-extending, just to cover you if you run
out of ‘logical’ space, but of course, nothing covers you if the physical disk
runs out of space. If you want to use absolute file paths in this parameter,
you need to set the value of innodb_data_home_dir to an empty string. To
make any changes to your configuration file take effect, you will need to
restart the server.
Here is a list of the
other InnoDB configuration variables, with a brief description of their
function:
-
innodb_data_home_dir:
Where InnoDB data files are stored. If left blank, this will be the same as datadir,
and if set to an empty string, you can use absolute file paths in innodb_data_file_path. -
innodb_mirrored_log_groups:
Should be set to 1 (the number of identical copies of log groups kept for the
database.) -
innodb_log_group_home_dir:
Where InnoDB log files are kept (If this is not set, datadir is used. -
innodb_log_files_in_group:
Number of log files in the log group (logs are written in rotation). Defaults
to 2, which is usually sufficient. -
innodb_log_file_size:
Size in megabytes of each log file. Defaults to 5MB. The larger it is, the
slower a recovery after a crash will be, but marginal disk I/O will be saved. -
innodb_log_buffer_size:
Size of the buffer used to write log files. The larger it is, the less need to
write to disk in the middle of a transaction. 8MB is suggested. -
innodb_flush_log_at_trx_commit:
0, 1 or 2. Should almost always be set to 1 for safety reasons, which writes
the log to disk and flushes the disk when a transaction is committed. 0 does
this about once a second, while 2 writes the log to disk immediately, but
flushes the disk about once a second. -
innodb_log_arch_dir:
If log archiving was used, this would be the directory where these files are
stored. Should currently be the same as innodb_log_group_home_dir. -
innodb_log_archive:
Set to 0, as MySQL recovers using its own log files. -
innodb_buffer_pool_size:
Size in bytes of the memory buffer used to cache table data and indexes. The
larger the better if you can spare it (up to 80% on dedicated database
servers). -
innodb_buffer_pool_awe_mem_mb:
Used from MySQL 4.1, and on Windows only, this variable sets the size of the
buffer pool placed in Address Windowing Extension (AWE) memory (maximum 64000). -
innodb_additional_mem_pool_size:
Size in bytes of a memory pool that stores information about the internal data
structures. 2MB is suggested initially, but if you see warning messages in the MySQL
error log about allocating memory from the operating system, you should
increase this. -
innodb_file_io_threads:
Number of file I/O threads in InnoDB. 4 is suggested on Unix systems, often
higher for Windows systems. -
innodb_lock_wait_timeout:
Time in seconds an InnoDB transaction waits before rolling back (used only in
case of external deadlocks, such as from LOCK TABLES statements) -
innodb_flush_method:
Flushing method (default fdatasync,
which is usually faster, although on some systems O_DSYNC can be
faster). -
innodb_force_recovery:
Only set this when you want to dump data from a corrupt database! See the
manual for more if you need to use this option.
If something goes wrong
When you start the
server, MySQL creates the tablespace, data files and log files. If something
goes wrong, the problem is usually one of two things: permissions, or a syntax
error in the configuration file. Make sure all directories you are trying to
store data and logs in already exist, and that the server is assigned
permission to write in them. Delete all files created in a failed startup
before trying again.
In future articles we
will look in more detail at some of the options available when using InnoDB
tables, but now that you know how to create them, the best way to find out more
is to begin experimenting – on a development server of course! Good luck.