Maximizing Query Performance through Column Indexing in MySQL

Indexes are a
feature that you can enable on your MySQL tables to increase performance, but
they do have some downsides. When you create a new index, MySQL builds a
separate block of information that needs to be updated every time there are
changes made to the table. This means that if you are constantly updating,
inserting and removing entries in your table this could have a negative impact
on performance. Read on as we review some of the best practices for achieving
the right balance between query and update speed through column indexing in
MySQL.

How Column Indexing Works

Indexes are a
type of lookup table that is used to improve the performance of SELECT queries. Without an index, MySQL has to start with
the first record and then read the whole table to find the relevant rows.
Conversely, if the table has an index for the columns in question, MySQL can
quickly find the data without having to look at all the rows in a table.

Indexes
can consist of one or more fields up to a total of 16 columns. For certain data
types, you can even index a prefix of the column. One way to think about
multiple-column indexes is like a sorted array containing values that are
created by concatenating the values of the indexed columns. For this reason,
indexes on multiple fields are also known as “concatenated indexes”.

Here’s an
example showing how the database would use indexing in performing a query. Given
the following SELECT
statement:


SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;

The simplest case
scenario is that a multiple-column index exists on col1 and col2; in that case,
the appropriate rows can be fetched directly.

Another
possibility is that separate single-column indexes exist on col1 and col2;
here, the query optimizer tries to find the most restrictive index by deciding
which index will find fewer rows and using that index to fetch them.

When a table has a
multiple-column index, any leftmost prefix of the index can be used by the
optimizer to find rows. For example, if you have a three-column index on (col1,
col2, col3), you have indexed search capabilities on (col1), (col1, col2), and
(col1, col2, col3). Hence, it can pay off to create your
multiple-column indexes starting with the most restrictive to the least
restrictive fields.

By "pay
off", I mean this: if a table has 1,000 rows, this is at least 100 times
faster than reading sequentially!

Creating Single and Multiple Column Indexes

You
can specify a column to index by using the INDEX statement at the time that you
create a new MySQL table. Alternatively, you can also add a single index to an
existing table by using the CREATE INDEX clause.

In
the following example, we are defining a new table called “blog”, which will
contain two fields: blogID and post_text. To add an index to the blogID field,
we add the INDEX (ID field) clause:


CREATE TABLE blog (
blogID INT,
post_text TEXT,
INDEX (blogID)
)

Had we not
defined the index at creation time, we could always add one later, using the
following syntax:


CREATE INDEX id_index ON blog(blogID);

Notice that the
“Create Index” clause requires an index name as well. The “ALTER TABLE”
command also supports the adding of an index:


ALTER TABLE blog ADD INDEX(blogID);

We can create
multiple-column indexes, or concatenated indexes, the same way:


CREATE TABLE blog (
blogID INT,
post_text TEXT,
auth_id INT,
INDEX (blogID, auth_id)
)

To add the
indexes on a pre-existing table, use the following syntax:


CREATE INDEX id_index ON blog(blogID, auth_id);

OR


ALTER TABLE blog ADD INDEX(blogID, auth_id);

Indexing Partial Fields (Partial Index)

Indexes
can be created that use only the leading part of column values, using “col_name(length)” syntax to specify an index prefix length.
Prefixes can be specified for CHAR, VARCHAR, BINARY, BLOB,
TEXT, and VARBINARY columns:


CREATE INDEX index_name ON table_name (field_name(prefix_length));

The
statement shown here creates an index of the post table using the first 10
characters of the post_text column:


CREATE INDEX post_index ON post (post_text(10));

Potential
advantages to using partial indexes is that if the values in the indexed field
tend to differ in the first 10 characters, this index should not be much slower
than an index created from the entire post_text column. At the same time, using
column prefixes for indexes can make the index file much smaller, which could
save a lot of disk space and may also speed up table inserts.

Partial
indexes are associated with the Like statement, as in:


SELECT id, first_name, last_name, user_name
FROM t1
WHERE last_name like ‘Rob G%’;

Note that
changing the expression to somethng like “%Rob G%” will lose the benefit of the
index because it searches the middle of the field.

You can change
an existing index on a table using the ALTER TABLE syntax, which we saw above:


ALTER TABLE t1
DROP INDEX last_name,
ADD INDEX (last_name(10));

Some Indexing Tips

  • Create
    an index on the field that has the biggest variety of values first. This will
    result in the “most bang for your buck”, so to speak.
  • Keep
    indexes small. It’s better to have an index on just zip code or postal code,
    rather than postal code & country. The smaller the index, the better the
    response time.
  • For
    high frequency functions (thousands of times per day) it can be wise to have a
    very large index, so the system does not even need the table for the read
    function.
  • For
    small tables an index may be disadvantageous. The same can also be said for any
    function where the system would be better off by scanning the whole table.
  • · Remember that an index slows down additions, modifications and
    deletes because the indexes need to be updated whenever the table is. So, the
    best practice is to add an index for values that are often used for a search,
    but that do not change much. As such, an index on a bank account number is
    better than one on the balance.

Other Index Types

MySQL supports a
few other index types, depending on the database engine used. One example is a FULLTEXT index,
which is used for keyword searches in stored documents. Only the MyISAM storage
engine supports FULLTEXT indexes and only for CHAR, VARCHAR, and TEXT columns. Indexing always takes
place over the entire column and column prefix indexing is not supported. You
can also create special indexes on spatial data types, called R-Trees. Again,
only MyISAM supports R-tree indexes on spatial types. Other storage engines use
B-trees for indexing spatial types. The MEMORY storage engine uses HASH indexes
by default, but also supports B-Tree indexes.

Having
seen how indexes can negatively impact table inserts and updates, we now know
that one has to take great care in deciding which fields to index. Moreover,
choosing between single and mutiple-field indexes can be a daunting task. In
the latter, field ordering also plays a critical role. Hopefully, this article
provided you with some helpful guidelines in deciding on which fields to index
and when to do so.

»


See All Articles by Columnist
Rob Gravelle

Robert Gravelle
Robert Gravelle
Rob Gravelle resides in Ottawa, Canada, and has been an IT guru for over 20 years. In that time, Rob has built systems for intelligence-related organizations such as Canada Border Services and various commercial businesses. In his spare time, Rob has become an accomplished music artist with several CDs and digital releases to his credit.

Latest Articles