Using MySQL’s Analysis Tools to Improve Query Performance

MySQL’s Explain Extended
Explained

 

One nice feature
added to the EXPLAIN statement in MySQL 4.1 is the EXTENDED keyword which
provides you with some helpful additional information on query optimization. It
should be used together with SHOW
WARNINGS to get information about how query looks after transformation as well
as what other notes the optimizer may wish to tell us. This article will demonstrate how to use
Explain Extended with Show Warnings to fine tune your Select statements as well
as how to rev up performance even more using optimizer hints and SHOW STATUS.

What the Extended Keyword Adds

 

While
it may look like a regular Explain statement, behind the scenes, MySQL engineers
the SQL statement into its optimized form.
Using Show Warnings afterwards prints out the optimized Select
statement. Examining it can give us a
better idea of how MySQL is interpretting our SQL.
Here’s an example:

 

The
following query could be used to find how
many employees show up in the bonuses table. That is to say, how many employees
have received bonuses:

 

SELECT COUNT(*)

FROM employees

WHERE id IN (SELECT
emp_id FROM bonuses);

 

Adding the EXPLAIN EXTENDED prefix to the above statement
will execute the statement behind the scenes so that the compiler optimizations
can be analyzed:

 

EXPLAIN EXTENDED

SELECT COUNT(*)

FROM employees

WHERE id IN (SELECT
emp_id FROM bonuses);

 

We can study the following table to glean valuable insight
into query execution:

 

id

select_type

table

type

possible_keys

key

key_len

ref

rows

filtered

Extra

1

PRIMARY

emp

index

(NULL)

PRIMARY

4

(NULL)

8

100

Using where; Using index

2

DEPENDENT SUBQUERY

bonuses

index_subquery

emp_id

emp_id

4

func

1

100

Using index

 

This
table is very much like the one produced by the regular EXPLAIN except for the
added filtered column in the second
last position.
The
filtered column, added
in MySQL 5.1.12, indicates an estimated percentage of table rows that will be
filtered by the table condition. Hence, the rows column shows the estimated
number of rows examined and rows × filtered / 100 calculates the number
of rows that will be joined with previous tables.

 

Applying
EXPLAIN EXTENDED to our query gives us the opportunity
to run the
Show
Warnings statement afterwards to see final optimized query:

 

 

SHOW WARNINGS;

 

select count(0) AS `COUNT(*)`

from `operations`.`employees` `emp`

where
<in_optimizer>(`operations`.`emp`.`id`,

<exists>(<index_lookup>(<cache>(`operations`.`emp`.`id`)


in bonuses on emp_id)))

 

It shows that the query engine is using <in_optimizer>
and converting the Where clause into an <exists> check using an
<index_lookup >. There is some caching being done as well.

 

The EXISTS clause
provides a simple way to find intersection between tables. However, the way that EXISTS executes tends
to take more time than other methods because
each
row from the outer query result is compared against the inner query, meaning
that the inner query is executed as many times as there are rows in the outer
query result.

 

A
reasonable goal then to improve performance would be to get rid of the
EXISTS. While the EXPLAIN EXTENDED does
not tell us how to write our query, a good place to start is to remove the nested
subquery in the WHERE clause:

 

EXPLAIN EXTENDED

SELECT COUNT(DISTINCT emp.id)

FROM employees emp

INNER JOIN bonuses b on emp.id = b.emp_id;

 

After
executing the revised statement, SHOW
WARNINGS produces a very different and simpler output:

 

SHOW WARNINGS;

 

select count(distinct `operations`.`emp`.`id`)
AS `COUNT(DISTINCT emp.id)`

from `operations`.`employees` `emp`

join `operations`.`bonuses` `b`

where (`operations`.`b`.`emp_id` = `operations`.`emp`.`id`)

 

 

Using Optimizer Hints

Optimizer hints are instructions that you can embed in your
SQL that instruct or suggest to the optimizer how
you would like the SQL to be executed.
They should be used sparingly and with caution! Hints limit the choices
the optimizer has available, so if data in the tables change or new indexes are
created, MySQL may be unable to adapt because of your hints. The best time to
use them is in situations where you discover that the optimizer has made a poor
choice.

 

The following able lists
the commonly used optimizer hints:

 

MySQL optimizer hints

Hint

Placement

Description

STRAIGHT_JOIN

After the SELECT clause

Forces the optimizer to join the
tables in the order in which they appear in the
FROM clause. Use this if you want to force tables to be
joined in a particular order.

USE INDEX(index [,index…])

After a table name in the FROM clause

Instructs MySQL to only consider
using the indexes listed. MySQL may choose to use none of the indexes if it
calculates that using them would not be faster than scanning the entire
table.

FORCE INDEX(index [,index…])

After a table name in the FROM clause

Instructs MySQL to use one of
the indexes listed. This differs from
USE INDEX in that MySQL is instructed not to perform a table scan
of the data unless it is impossible to use any of the indexes listed.

IGNORE INDEX(index [,index…])

After a table name in the FROM clause

Instructs MySQL not to consider
any of the listed indexes when working out the execution plan.

 

Measuring Query Execution with SHOW
STATUS

Whenever we execute
a SQL statement, the elapsed time taken to execute the statement is reported:

 

mysql>SELECT * FROM ORDERS;

Query OK, 0 rows affected (8.25 sec)

 

While the elapsed
time is a good general measurement of SQL performance, it can be unreliable.
For instance, it can vary depending on how many other jobs are running on the server.
Cached data on the operating system can also affect execution time due to its
effect on the number of required physical I/O reads. For these reasons, it is a good idea to
obtain performance metrics using the
SHOW
STATUS statement.

 

The level of detail
will vary depending on the storage engine. InnoDB currently offers the most
comprehensive selection of statistics, with
over 130 variables.
Displaying all of them would create a list which scrolls for many pages, so
it’s best to filter the list by adding a LIKE clause:

 

mysql> SHOW
STATUS LIKE ‘threads%’;

 

Variable_name

Value

Threads_cached

0

Threads_connected

1

Threads_created

1

Threads_running

1

 

Each of the
statistics has a special meaning in MySQL. Here are some of the more important ones:

  • Innodb_buffer_pool_read_requests: Number of
    requests from the InnoDB buffer pool
  • Innodb_data_reads: Number of blocks from disk that InnoDB had
    to read to execute the query.
  • Innodb_rows_read: Number of rows read by InnoDB to satisfy
    the query.
  • Last_query_cost: Optimizer’s "cost" estimate for
    the last SQL executed.
  • Sort_rows: Number of rows that had to be sorted.
  • Sort_merge_passes: Number of disk sort "merge runs"
    that had to be performed.

 

 

Innodb_buffer_pool_read_requests is probably the most useful of the lot for
determining the outcome of our SQL tuning efforts. When combined with
Explain Extended, Show Warnings, and optimizer
hints, you should be able to see dramatic performance improvements in many
Select queries.

 

 

 

See all articles by Rob Gravelle

Robert Gravelle
Robert Gravelle
Rob Gravelle resides in Ottawa, Canada, and has been an IT guru for over 20 years. In that time, Rob has built systems for intelligence-related organizations such as Canada Border Services and various commercial businesses. In his spare time, Rob has become an accomplished music artist with several CDs and digital releases to his credit.

Latest Articles