Using MySQL's Analysis Tools to Improve Query Performance

February 18, 2011

MySQL’s Explain Extended Explained

 

One nice feature added to the EXPLAIN statement in MySQL 4.1 is the EXTENDED keyword which provides you with some helpful additional information on query optimization. It should be used together with SHOW WARNINGS to get information about how query looks after transformation as well as what other notes the optimizer may wish to tell us. This article will demonstrate how to use Explain Extended with Show Warnings to fine tune your Select statements as well as how to rev up performance even more using optimizer hints and SHOW STATUS.

What the Extended Keyword Adds

 

While it may look like a regular Explain statement, behind the scenes, MySQL engineers the SQL statement into its optimized form. Using Show Warnings afterwards prints out the optimized Select statement. Examining it can give us a better idea of how MySQL is interpretting our SQL. Here’s an example:

 

The following query could be used to find how many employees show up in the bonuses table. That is to say, how many employees have received bonuses:

 

SELECT COUNT(*)

FROM employees

WHERE id IN (SELECT emp_id FROM bonuses);

 

Adding the EXPLAIN EXTENDED prefix to the above statement will execute the statement behind the scenes so that the compiler optimizations can be analyzed:

 

EXPLAIN EXTENDED

SELECT COUNT(*)

FROM employees

WHERE id IN (SELECT emp_id FROM bonuses);

 

We can study the following table to glean valuable insight into query execution:

 

id

select_type

table

type

possible_keys

key

key_len

ref

rows

filtered

Extra

1

PRIMARY

emp

index

(NULL)

PRIMARY

4

(NULL)

8

100

Using where; Using index

2

DEPENDENT SUBQUERY

bonuses

index_subquery

emp_id

emp_id

4

func

1

100

Using index

 

This table is very much like the one produced by the regular EXPLAIN except for the added filtered column in the second last position. The filtered column, added in MySQL 5.1.12, indicates an estimated percentage of table rows that will be filtered by the table condition. Hence, the rows column shows the estimated number of rows examined and rows × filtered / 100 calculates the number of rows that will be joined with previous tables.

 

Applying EXPLAIN EXTENDED to our query gives us the opportunity to run the Show Warnings statement afterwards to see final optimized query:

 

 

SHOW WARNINGS;

 

select count(0) AS `COUNT(*)`

from `operations`.`employees` `emp`

where <in_optimizer>(`operations`.`emp`.`id`,

<exists>(<index_lookup>(<cache>(`operations`.`emp`.`id`)

in bonuses on emp_id)))

 

It shows that the query engine is using <in_optimizer> and converting the Where clause into an <exists> check using an <index_lookup >. There is some caching being done as well.

 

The EXISTS clause provides a simple way to find intersection between tables. However, the way that EXISTS executes tends to take more time than other methods because each row from the outer query result is compared against the inner query, meaning that the inner query is executed as many times as there are rows in the outer query result.

 

A reasonable goal then to improve performance would be to get rid of the EXISTS. While the EXPLAIN EXTENDED does not tell us how to write our query, a good place to start is to remove the nested subquery in the WHERE clause:

 

EXPLAIN EXTENDED

SELECT COUNT(DISTINCT emp.id)

FROM employees emp

INNER JOIN bonuses b on emp.id = b.emp_id;

 

After executing the revised statement, SHOW WARNINGS produces a very different and simpler output:

 

SHOW WARNINGS;

 

select count(distinct `operations`.`emp`.`id`) AS `COUNT(DISTINCT emp.id)`

from `operations`.`employees` `emp`

join `operations`.`bonuses` `b`

where (`operations`.`b`.`emp_id` = `operations`.`emp`.`id`)

 

 

Using Optimizer Hints

Optimizer hints are instructions that you can embed in your SQL that instruct or suggest to the optimizer how you would like the SQL to be executed. They should be used sparingly and with caution! Hints limit the choices the optimizer has available, so if data in the tables change or new indexes are created, MySQL may be unable to adapt because of your hints. The best time to use them is in situations where you discover that the optimizer has made a poor choice.

 

The following able lists the commonly used optimizer hints:

 

MySQL optimizer hints

Hint

Placement

Description

STRAIGHT_JOIN

After the SELECT clause

Forces the optimizer to join the tables in the order in which they appear in the FROM clause. Use this if you want to force tables to be joined in a particular order.

USE INDEX(index [,index...])

After a table name in the FROM clause

Instructs MySQL to only consider using the indexes listed. MySQL may choose to use none of the indexes if it calculates that using them would not be faster than scanning the entire table.

FORCE INDEX(index [,index...])

After a table name in the FROM clause

Instructs MySQL to use one of the indexes listed. This differs from USE INDEX in that MySQL is instructed not to perform a table scan of the data unless it is impossible to use any of the indexes listed.

IGNORE INDEX(index [,index...])

After a table name in the FROM clause

Instructs MySQL not to consider any of the listed indexes when working out the execution plan.

 

Measuring Query Execution with SHOW STATUS

Whenever we execute a SQL statement, the elapsed time taken to execute the statement is reported:

 

mysql>SELECT * FROM ORDERS;

Query OK, 0 rows affected (8.25 sec)

 

While the elapsed time is a good general measurement of SQL performance, it can be unreliable. For instance, it can vary depending on how many other jobs are running on the server. Cached data on the operating system can also affect execution time due to its effect on the number of required physical I/O reads. For these reasons, it is a good idea to obtain performance metrics using the SHOW STATUS statement.

 

The level of detail will vary depending on the storage engine. InnoDB currently offers the most comprehensive selection of statistics, with over 130 variables. Displaying all of them would create a list which scrolls for many pages, so it’s best to filter the list by adding a LIKE clause:

 

mysql> SHOW STATUS LIKE 'threads%';

 

Variable_name

Value

Threads_cached

0

Threads_connected

1

Threads_created

1

Threads_running

1

 

Each of the statistics has a special meaning in MySQL. Here are some of the more important ones:

  • Innodb_buffer_pool_read_requests: Number of requests from the InnoDB buffer pool
  • Innodb_data_reads: Number of blocks from disk that InnoDB had to read to execute the query.
  • Innodb_rows_read: Number of rows read by InnoDB to satisfy the query.
  • Last_query_cost: Optimizer's "cost" estimate for the last SQL executed.
  • Sort_rows: Number of rows that had to be sorted.
  • Sort_merge_passes: Number of disk sort "merge runs" that had to be performed.

 

 

Innodb_buffer_pool_read_requests is probably the most useful of the lot for determining the outcome of our SQL tuning efforts. When combined with Explain Extended, Show Warnings, and optimizer hints, you should be able to see dramatic performance improvements in many Select queries.

 

 

 

See all articles by Rob Gravelle








The Network for Technology Professionals

Search:

About Internet.com

Legal Notices, Licensing, Permissions, Privacy Policy.
Advertise | Newsletters | E-mail Offers