Identify Slow Queries using MySQL's Analysis Tools

January 28, 2011

Using MySQL analysis tools, you can identify and optimize slow queries in order to eliminate the bottlenecks that they cause. Rob Gravelle examines the Slow Query Log and Explain command, which can both help identify slow queries.

When you consider the mind-numbing amount of data that is being accessed every day by people around the world you realize that the work that relational databases do is truly miraculous! However, unlike real miracles, those provided by databases like MySQL don't just happen. They need a little help from their designers, developers, and administrators. That's where the MySQL analysis tools come in. Using these tools, you can identify and optimize slow queries in order to eliminate the bottlenecks that they cause. Today we will be examining the Slow Query Log and Explain command, which can both help identify slow queries.

Enabling the Slow Query Log

The slow query log contains all SQL statements that took more than a certain number of seconds to execute.

Some things to keep in mind: First, MySQL writes a statement to the slow query log after it has been executed and all locks have been released, so the log order might be different from execution order. Second, note that the time to acquire the initial table locks is not counted as execution time. Third, queries handled by the query cache are not added to the slow query log, nor are queries that would not benefit from the presence of an index because the table has zero rows or one row.

To enable the slow query log, start the mysqld server program with the options - log-slow-queries[=file_name] and long_query_time options. The long_query_time option specifies the time limit of slow queries in seconds. The minimum long_query_time value is 1 while the default is 10. Likewise, if no file_name is given for --log-slow-queries, the default name is host_name-slow.log. The server creates the file in the data directory unless an absolute path name is given to specify a different one.

Here is the syntax for enabling the Slow Query Log in the MySQL server config file in the mysqld section:

[mysqld]
log-slow-queries[=<path/filename>]
long_query_time=<time-in-seconds>

A practical example:

[mysqld]
long_query_time         = 15
log-slow-queries        = /var/log/mysql/mysql-slow.log

Use the following to set up slow query log via command parameters:

root> ./bin/mysqld_safe --log_slow_queries=/mysql/logs/slow_query.log --long_query_time=20

To simplify examining the queries in a long slow query log, you can use mysqldumpslow command on some versions of MySQL.

Queries that do not use indexes can also be logged in the slow query log by including the -- log-queries-not-using-indexes option.

The EXPLAIN Statement

The EXPLAIN statement provides an analysis of a specified SELECT statement. To use the EXPLAIN statement, simply include the EXPLAIN keyword, followed by the table or SELECT statement, as shown in the following syntax:

EXPLAIN tbl_name
 
Or:
 
EXPLAIN SELECT select_options

In the Use Derived Tables in Your MySQL Queries to Improve Performance article we create a query that tallied the number of bonuses for each employee. Here is what the EXPLAIN statement produces for it:

EXPLAIN
SELECT e.name, 
       e.salary, 
       COUNT(b.bonus_id) AS 'Total Bonuses'
FROM employees e 
     LEFT OUTER JOIN 
(SELECT emp_id, bonus_id  FROM bonuses WHERE YEAR(award_date) = 2009) AS b
 ON e.id = b.emp_id
GROUP BY e.id;

The results:

id

select_type

table

type

possible_keys

key

key_len

ref

rows

Extra

1

PRIMARY

e

ALL

(NULL)

(NULL)

(NULL)

(NULL)

8

Using temporary; Using filesort

1

PRIMARY

<derived2>

ALL

(NULL)

(NULL)

(NULL)

(NULL)

10

2

DERIVED

bonuses

ALL

(NULL)

(NULL)

(NULL)

(NULL)

12

Using where

Each output row from EXPLAIN contains ten columns providing information about one table. Here’s a column by column breakdown of EXPLAIN’s output:

  1. id: The SELECT identifier. This is the sequential number of the SELECT within the query.
  2. select_type: The type of SELECT, which can be any of the values listed below:
  • SIMPLE: Simple SELECT (not using UNION or subqueries).
  • PRIMARY: Outermost SELECT.
  • UNION: Second or later SELECT statement in a UNION.
  • DEPENDENT UNION: Second or later SELECT statement in a UNION, dependent on outer query.
  • UNION RESULT: Result of a UNION.
  • SUBQUERY: First SELECT in subquery.
  • DEPENDENT SUBQUERY: First SELECT in subquery, dependent on outer query.
  • DERIVED: Derived table SELECT (subquery in FROM clause).
  • UNCACHEABLE SUBQUERY: A subquery for which the result cannot be cached and must be re-evaluated for each row of the outer query.
  1. table: The table to which the row of output refers.
  2. type: The join type. This is important information because some join types are preferable to others. Here is a list of the different join types, ordered from most to least desirable:
  • system: A system table containing only one row. This is a special case of the const join type (see next item).
  • const: Result of comparing all parts of a PRIMARY KEY or UNIQUE index to constant values. The table has at most one matching row, which is read at the start of the query. Because there is only one row, values from the column in this row can be regarded as constants by the rest of the optimizer. const tables are very fast because they are read only once.
  • eq_ref: Used for indexed columns that are compared using the equals (=) operator. The comparison value can be a constant or an expression that uses columns from tables that are read before this table. One row is read from this table for each combination of rows from the previous tables. Other than the system and const types, this is the best possible join type. It is used when all parts of an index are used by the join and the index is a PRIMARY KEY or UNIQUE index.
  • ref: Used if the join uses only a leftmost prefix of the key or if the key is not a PRIMARY KEY or UNIQUE index (in other words, if the join cannot select a single row based on the key value). All rows with matching index values are read from this table for each combination of rows from the previous tables. If the key that is used matches only a few rows, this is a good join type. ref can be used for indexed columns that are compared using the = or <=> operator.
  • fulltext: The join is performed using a FULLTEXT index.
  • ref_or_null: This join type is like ref, but with the addition that MySQL does an extra search for rows that contain NULL values. This join type optimization is used most often in resolving subqueries.
  • index_merge: This join type indicates that the Index Merge optimization is used. In this case, the key column in the output row contains a list of indexes used, and key_len contains a list of the longest key parts for the indexes used.
  • unique_subquery: An index lookup function that replaces the subquery completely for better efficiency.
  • index_subquery: This join type is similar to unique_subquery. It replaces IN subqueries, but it works for non-unique indexes in subqueries.
  • range: Only rows that are in a given range are retrieved, using an index to select the rows. The key column in the output row indicates which index is used. The key_len contains the longest key part that was used. The ref column is NULL for this type. Range can be used when a key column is compared to a constant using any of the =, <>, >, >=, <, <=, IS NULL, <=>, BETWEEN, or IN operators.
  • index: This join type is the same as ALL (see next item), except that only the index tree is scanned. This usually is faster than ALL because the index file is presumably smaller than the data file. MySQL can use this join type when the query uses only columns that are part of a single index.
  • ALL: A full table scan is done for each combination of rows from the previous tables. This is almost certainly very bad in all cases. You can – and should - avoid ALL by adding indexes that allow row retrieval from the table based on constant values or column values from earlier tables.
  1. possible_keys: Indicates which indexes MySQL can choose from to find the rows in this table. Note that this column is totally independent of the order of the tables as displayed in the output from EXPLAIN, so some of the keys in possible_keys might be unusable with the generated table order.

    A NULL value indicates that there are no relevant indexes. In this case, you may be able to improve the performance of your query by examining the WHERE clause to check whether it refers to some column or columns that would be suitable for indexing. If so, create an appropriate index and rerun the EXPLAIN command.

  2. key: The key column indicates the key (index) that MySQL actually decided to use. If MySQL decides to use one of the possible_keys indexes to look up rows, that index is listed as the key value. It is also possible that key will name an index that is not present in the possible_keys value. This can happen if none of the possible_keys indexes are suitable for looking up rows, but all the columns selected by the query are columns of some other index. That is, the named index covers the selected columns, so although it is not used to determine which rows to retrieve, an index scan is more efficient than a data row scan.

    You can use FORCE INDEX, USE INDEX, or IGNORE INDEX in your query to force MySQL to use or ignore an index listed in the possible_keys column.

  3. key_len: Indicates the length of the key that MySQL decided to use. The shorter the better. The length is NULL if the key column says NULL. The value of key_len helps you determine how many parts of a multiple-part key MySQL actually uses.
  4. ref: The ref column shows which columns or constants are compared to the index
    named in the key column to select rows from the table.
  5. rows: Indicates the number of rows MySQL believes it must examine to execute the query. Large numbers here will almost certainly result in slow execution!
  6. Extra: This column contains additional information about how MySQL resolves the query. If you want to make your queries as fast as possible, you should look out for Extra values of “Using filesort” and “Using temporary”.

Conclusion

MySQL’s Slow Query Log and Explain command are indispensable tools in spotting potential bottlenecks in query execution. Of course, that’s only half the battle. The next step is to optimize slow SELECT queries and target tables to bring performance up to satisfactory levels. That will be addressed more fully in a future article.

» See All Articles by Columnist Rob Gravelle








The Network for Technology Professionals

Search:

About Internet.com

Legal Notices, Licensing, Permissions, Privacy Policy.
Advertise | Newsletters | E-mail Offers