Identify Slow Queries using MySQL’s Analysis Tools


Using MySQL analysis tools, you can identify and optimize slow queries in order to eliminate the bottlenecks that they cause. Rob Gravelle examines the Slow Query Log and Explain command, which can both help identify slow queries.

When you
consider the mind-numbing amount of data that is being accessed every day by
people around the world you realize that the work that relational databases do
is truly miraculous! However, unlike real miracles, those provided by databases
like MySQL don’t just happen. They need a little help from their designers,
developers, and administrators. That’s where the MySQL analysis tools come
in. Using these tools, you can identify and optimize slow queries in order to
eliminate the bottlenecks that they cause. Today we will be examining the Slow
Query Log and Explain command, which can both help identify slow queries.

Enabling the Slow Query Log

The slow
query log contains all SQL statements that took more than a certain number of
seconds to execute.

Some
things to keep in mind: First, MySQL writes a statement to the slow query log
after it has been executed and all locks have been released, so the log order
might be different from execution order. Second, note that the time to acquire
the initial table locks is not counted as execution time. Third, queries
handled by the query cache are not added to the slow query log, nor are queries
that would not benefit from the presence of an index because the table has zero
rows or one row.

To
enable the slow query log, start the mysqld server program with the options
log-slow-queries[=file_name]
and long_query_time options. The long_query_time
option specifies the time limit of slow queries in seconds. The minimum long_query_time
value is 1 while the default is 10. Likewise, if no file_name is given for –log-slow-queries, the
default name is host_name-slow.log.
The server creates the file in the data directory unless an absolute path name
is given to specify a different one.

Here is
the syntax for enabling
the Slow Query Log in the MySQL server config file in the mysqld section:

[mysqld]
log-slow-queries[=<path/filename>]
long_query_time=<time-in-seconds>

A
practical example:

[mysqld]
long_query_time         = 15
log-slow-queries        = /var/log/mysql/mysql-slow.log

Use the
following to set up slow query log via command parameters:

root> ./bin/mysqld_safe --log_slow_queries=/mysql/logs/slow_query.log --long_query_time=20

To
simplify examining the queries in a long slow query log, you can use mysqldumpslow
command on some versions of MySQL.

Queries
that do not use indexes can also be logged in the slow query log by including
the — log-queries-not-using-indexes option.

The EXPLAIN Statement

The EXPLAIN
statement provides an analysis of a specified SELECT statement. To use the
EXPLAIN statement, simply include the EXPLAIN keyword, followed by the table or
SELECT statement, as shown in the following syntax:

EXPLAIN tbl_name
 
Or:
 
EXPLAIN SELECT select_options

In the Use
Derived Tables in Your MySQL Queries to Improve Performance
article we create
a query that tallied the number of bonuses for each
employee. Here is what the EXPLAIN statement produces for it:

EXPLAIN
SELECT e.name, 
       e.salary, 
       COUNT(b.bonus_id) AS 'Total Bonuses'
FROM employees e 
     LEFT OUTER JOIN 
(SELECT emp_id, bonus_id  FROM bonuses WHERE YEAR(award_date) = 2009) AS b
 ON e.id = b.emp_id
GROUP BY e.id;

The
results:

id

select_type

table

type

possible_keys

key

key_len

ref

rows

Extra

1

PRIMARY

e

ALL

(NULL)

(NULL)

(NULL)

(NULL)

8

Using temporary;
Using filesort

1

PRIMARY

<derived2>

ALL

(NULL)

(NULL)

(NULL)

(NULL)

10

2

DERIVED

bonuses

ALL

(NULL)

(NULL)

(NULL)

(NULL)

12

Using where

Each
output row from EXPLAIN contains ten columns providing information about one
table. Here’s a column by column breakdown of EXPLAIN’s output:

  1. id: The SELECT identifier.
    This is the sequential number of the SELECT within the query.
  2. select_type: The type of SELECT, which
    can be any of the values listed below:
  • SIMPLE: Simple SELECT (not using UNION or subqueries).
  • PRIMARY: Outermost SELECT.
  • UNION: Second or later SELECT statement in a UNION.
  • DEPENDENT UNION: Second or later
    SELECT statement in a UNION, dependent on outer query.
  • UNION RESULT: Result of a UNION.
  • SUBQUERY: First SELECT in
    subquery.
  • DEPENDENT SUBQUERY: First SELECT
    in subquery, dependent on outer query.
  • DERIVED: Derived table SELECT
    (subquery in FROM clause).
  • UNCACHEABLE SUBQUERY: A subquery
    for which the result cannot be cached and must be re-evaluated for each row of
    the outer query.
  1. table: The table to which the
    row of output refers.
  2. type: The join type. This is
    important information because some join types are preferable to others. Here
    is a list of the different join types, ordered from most to least
    desirable:
  • system: A system table containing
    only one row. This is a special case of the const join type (see next
    item).
  • const: Result of comparing all
    parts of a PRIMARY KEY or UNIQUE index to constant values. The table has
    at most one matching row, which is read at the start of the query. Because
    there is only one row, values from the column in this row can be regarded
    as constants by the rest of the optimizer. const tables are
    very fast because they are read only once.
  • eq_ref: Used for indexed columns
    that are compared using the equals (=) operator. The comparison value can
    be a constant or an expression that uses columns from tables that are
    read before this table. One row is read from this table for each
    combination of rows from the previous tables. Other than the system
    and const types, this is the best possible join type. It is used
    when all parts of an index are used by the join and the index is a
    PRIMARY KEY or UNIQUE index.
  • ref: Used if the join uses
    only a leftmost prefix of the key or if the key is not a PRIMARY KEY or UNIQUE index (in other words, if the join cannot select a single row based on the key
    value). All rows with matching index values are read from this table for
    each combination of rows from the previous tables. If the key that is
    used matches only a few rows, this is a good join type. ref can be
    used for indexed columns that are compared using the = or <=>
    operator.
  • fulltext: The join is performed
    using a FULLTEXT index.
  • ref_or_null: This join type is like ref,
    but with the addition that MySQL does an extra search for rows that
    contain NULL values. This join type optimization is used most often in resolving
    subqueries.
  • index_merge: This join type indicates
    that the Index Merge optimization is used. In this case, the key
    column in the output row contains a list of indexes used, and key_len
    contains a list of the longest key parts for the indexes used.
  • unique_subquery: An index lookup function
    that replaces the subquery completely for better efficiency.
  • index_subquery: This join type is similar
    to unique_subquery. It replaces IN subqueries, but it works for
    non-unique indexes in subqueries.
  • range: Only rows that are in a
    given range are retrieved, using an index to select the rows. The key
    column in the output row indicates which index is used. The key_len
    contains the longest key part that was used. The ref column
    is NULL for this type. Range can be used when a key column is
    compared to a constant using any of the =, <>, >, >=, <,
    <=, IS NULL, <=>, BETWEEN, or IN operators.
  • index: This join type is the
    same as ALL (see next item), except that only the index
    tree is scanned. This usually is faster than ALL because the index
    file is presumably smaller than the data file. MySQL can use this join
    type when the query uses only columns that are part of a single index.
  • ALL: A full table scan is done for each
    combination of rows from the previous tables. This is almost certainly very
    bad in all cases. You can – and should – avoid ALL by adding
    indexes that allow row retrieval from the table based on constant values
    or column values from earlier tables.
  1. possible_keys: Indicates which indexes
    MySQL can choose from to find the rows in this table. Note that this
    column is totally independent of the order of the tables as displayed in
    the output from EXPLAIN, so some of the keys in possible_keys might
    be unusable with the generated table order.

    A NULL value indicates that there are no relevant indexes. In this case,
    you may be able to improve the performance of your query by examining the
    WHERE clause to check whether it refers to some column or columns that
    would be suitable for indexing. If so, create an appropriate index and rerun
    the EXPLAIN command.

  2. key: The key column
    indicates the key (index) that MySQL actually decided to use. If MySQL
    decides to use one of the possible_keys indexes to look up rows,
    that index is listed as the key value. It is also possible that key will
    name an index that is not present in the possible_keys value. This
    can happen if none of the possible_keys indexes are suitable for
    looking up rows, but all the columns selected by the query are columns of
    some other index. That is, the named index covers the selected columns, so
    although it is not used to determine which rows to retrieve, an index scan
    is more efficient than a data row scan.

    You can use FORCE INDEX, USE INDEX, or IGNORE INDEX in your query to force
    MySQL to use or ignore an index listed in the possible_keys column.

  3. key_len: Indicates the length of
    the key that MySQL decided to use. The shorter the better. The length is
    NULL if the key column says NULL. The value of key_len helps you determine
    how many parts of a multiple-part key MySQL actually uses.
  4. ref: The ref column
    shows which columns or constants are compared to the index
    named in the key column to select rows from the table.
  5. rows: Indicates the number of
    rows MySQL believes it must examine to execute the query. Large numbers
    here will almost certainly result in slow execution!
  6. Extra: This column contains
    additional information about how MySQL resolves the query. If you want to
    make your queries as fast as possible, you should look out for Extra
    values of “Using filesort” and “Using temporary”.

Conclusion

MySQL’s Slow
Query Log and Explain command are indispensable tools in spotting potential
bottlenecks in query execution. Of course, that’s only half the battle. The
next step is to optimize slow SELECT queries and target tables to bring
performance up to satisfactory levels. That will be addressed more fully in a future
article.

»


See All Articles by Columnist

Rob Gravelle

Robert Gravelle
Robert Gravelle
Rob Gravelle resides in Ottawa, Canada, and has been an IT guru for over 20 years. In that time, Rob has built systems for intelligence-related organizations such as Canada Border Services and various commercial businesses. In his spare time, Rob has become an accomplished music artist with several CDs and digital releases to his credit.

Latest Articles