Optimizing the MySQL Query Cache

Introduction

MySQL’s
query cache is an impressive piece of engineering if sometimes
misunderstood.  Keeping it optimized and used efficiently can make a big
difference in the overall throughput of your application, so it’s worth taking
a look under the hood, understanding it, and then keeping it tuned optimally.

How It Works

A lot of
database engines can cache queries, usually this allows them to avoid a hard
parse of the query each time.  The subsequent calls to the queries, even
using different values for bind variables, will use the same cached
query.  The data blocks are cached separately.  MySQL does of course
have various buffer caches for caching blocks, specifically when you’re using InnoDB,
however they are self-contained, and completely separate from the query cache.
 

MySQL’s
query cache works a bit differently.  It caches the parsed query and the
complete result set.  If you have lots of small queries that return small
result sets such as with web-based applications, it can be a real godsend to
have the query cache working for you.

So the
main things to keep in mind when tuning the query cache are (a) the total size
of the query cache (b) the average size of query result sets, and (c) the size
of queries above which you don’t want to keep cached.  Each of these three
things can be measured, and once set – assuming your application footprint
doesn’t change dramatically – should remain fairly static.

Tuning the Query Cache

First,
be sure the query cache is turned on.  The query_cache_type variable should
be set to ON, and the query_cache_size should be non-zero.  Use this
command:


mysql> show variables like ‘query%’;
+——————————+———–+
| Variable_name | Value |
+——————————+———–+
| query_alloc_block_size | 8192 |
| query_cache_limit | 1048576 |
| query_cache_min_res_unit | 1024 |
| query_cache_size | 256000000 |
| query_cache_type | ON |
| query_cache_wlock_invalidate | OFF |
| query_prealloc_size | 8192 |
+——————————+———–+
7 rows in set (0.00 sec)

Recall
that we described above the maximum size query that the query cache will allow
to be cached.  The variable query_cache_limit controls this.  If
there are very large reports with large result sets, you probably don’t want
them to clog up your query cache.  So set that accordingly.

The
biggest problem most untuned query caches will have is either that they (a)
aren’t big enough, or (b) have fragmentation.  

Setting
the query cache size can be done by looking at the query cache hit ratio. 
Whenever MySQL performs a SELECT operation, it either increments com_select or
the qcache_hits status variables.  com_selects thus show us the cache
misses.  


mysql> show status like ‘qcache_hits’;
+—————+————+
| Variable_name | Value |
+—————+————+
| Qcache_hits | 1446235295 |
+—————+————+
1 row in set (0.05 sec)

mysql> show status like ‘com_select’;
+—————+——-+
| Variable_name | Value |
+—————+——-+
| Com_select | 1 |
+—————+——-+
1 row in set (0.04 sec)

So get
the hit ratio by this formula:  qcache_hits / (qcache_hits + com_select)
which gives us .9999 or 99.99%.  So we know we’re big enough.

Fragmentation
is a little more subtle.  Sometimes our query cache is quite large; however,
we’re not getting a high hit ratio.  Check qcache_free_memory status
variable, is it also high?  If so, it could mean fragmentation. 
Imagine, each query that is cached must be a minimum of query_cache_min_res_unit. 
If this is larger than the query’s result set, the rest is wasted memory that can’t
be used by other cached queries.  So you want the query_cache_min_res_unit
to be roughly the average query size, maybe a little larger than average to accommodate
most of the queries, but not too large to be wasting memory for most query
result sets.  

Determine
the average size of your applications cached queries by taking the query cache
used memory and divide by the number of queries in the cache.  The former
is determined by query_cache_size system variable, minus the qcache_free_memory
status variable.  The latter can be found from the qcache_queries_in_cache
variable.  

Since
all of these variables can be changed dynamically, I recommend you make changes
on your running system incrementally, and then rerunning the hit ratios to see
if you’re making a difference.  If you’re unsure about what you’re doing,
you should definitely experiment on development first.  You could do so by
running a MySQL query benchmarking tool, which calls lots of different queries,
and then make changes to the query cache as you monitor the query cache at the
same time.

Exceptions

The
query cache cannot be used in all cases.  If there are queries larger than
query_cache_limit, they will not be cached.  In addition, if you’re using
MySQL 4.0 and earlier, the query cache would not work with transactions, that
is with queries hitting InnoDB tables.  In 4.1 and up, the InnoDB storage
engine indicates to the MySQL kernel whether or not the query cache can be
used.  Essentially, if queries are updating those tables, they’ll hold
row-level locks and invalidate the query cache.  Once a commit is
performed, subsequently the query cache can be used again on those
tables.  This is all handled transparently though, so you don’t really
have to worry about it too much.

Conclusions

MySQL’s
query cache is a powerful technology, and one that when managed well can
improve throughput of your database dramatically.  Once your application
is built, you can take a look at how it uses the database, and tune the query
cache accordingly.  Make it large enough, avoid fragmentation, and keep
out the really really big queries, and you should be able to keep your cache
hit ratios high, and enjoy great performance.

»
See All Articles by Columnist Sean Hull

Sean Hull
Sean Hull
Sean Hull is a business and technology consultant, author and speaker. He is the founder and senior consultant at Heavyweight Internet Group in New York and has worked as a database consultant and expert for ten years. He is the author of "Oracle and Open Source" on O'Reilly & Associates and his articles have appeared in Oracle's own OTN website, DBA Zine, ChangeThis.com, and many others. Using techniques, technologies, and perspectives discussed in the book, articles, and seminars, he has taken his career into exciting areas. He has served as Technology Consultant, Project Manager, CEO, and mentor with a background in programming, systems integration & administration, project development and management. He can be reached at [email protected] or visit http://www.iheavy.com for more info about consulting services.

Latest Articles