MySQL’s Query Cache

A typical scenario

Boss: Our new website is crawling! How
can it be, we have four state-of-the-art web servers – what’s the problem?
You: Well, the web servers are fine – it’s the database server that’s
struggling.
Boss: What? You told me this MySQL thing was fast, that we didn’t need
Oracle, and now you say it can’t cope! How can this be?
You: Well, the web servers are behaving so well that they’re pushing
through lots of queries, and the database can’t manage to process all of them
at the same time. It’s only one database, and lots of web servers…
Boss: It’s too late to buy Oracle now – what are we going to do!?
Big Boss to Boss(in the boss’s mind): This project has been a disaster
from the beginning – now you want me to delay it while we install a new
database, and spend a whole lot more! Do you think we’re made of money!? I’m
calling in someone who knows what they’re doing – you’re history buddy.
Colleague (about to take your job): Wait, I think I can solve the
problem!

So, what does your
colleague know that you don’t? How can he save the day and let the boss get all
the credit? Our scenario is too imprecise to generalize, and there are many
possible solutions. You can read about optimizing
queries and indexes
, optimizing by improving
the hardware, and tweaking the MySQL variables
, using the slow
query log
, and of course, there are other methods such as replication. However,
MySQL 4 provides one feature that can prove very handy – a query cache. In a
situation where the database has to repeatedly run the same queries on the same
data set, returning the same results each time, MySQL can cache the result set,
avoiding the overhead of running through the data over and over. Usually, you
would want to implement some sort of caching on the web server, but there are
times when this is not possible, and then it is the query cache you will look
to for help.

Setting up the query cache

To make sure MySQL uses
the query cache, there are a few variables you need to set in the configuration
file (usually my.cnf or my.ini). First, is the query_cache_type. There
are three possible settings: 0 (for off, do not use), 1 (for on, cache queries)
and 2 (on demand, discussed more below). To ensure it is always on, place:

query-cache-type = 1

in the
configuration file. If you started the server having only made this change, you
would see the following cache variables set:


mysql> SHOW VARIABLES LIKE ‘%query_cache%’;
+——————-+———+
| Variable_name | Value |
+——————-+———+
| have_query_cache | YES |
| query_cache_limit | 1048576 |
| query_cache_size | 0 |
| query_cache_type | ON |
+——————-+———+
4 rows in set (0.06 sec)

Note that
these are results from MySQL 4.0.x – you’ll see more in versions 4.1.x and
beyond. The query_cache_type will be set to ON or OFF as appropriate. However,
there is one more to set, and that is the query_cache_size. If set to 0
(the default), the cache will be disabled. This variable determines the memory,
in bytes, used for the query cache. For our purposes, we will set it to 20 MB:

query-cache-size = 20M

The amount is shown in
bytes:


mysql> SHOW VARIABLES LIKE ‘%query_cache%’;
+——————-+———-+
| Variable_name | Value |
+——————-+———-+
| have_query_cache | YES |
| query_cache_limit | 1048576 |
| query_cache_size | 20971520 |
| query_cache_type | ON |
+——————-+———-+
4 rows in set (0.06 sec)

The Query cache in action (almost)

For this tutorial, I used
a dump from Wikipedia, the open content
encyclopedia (you can find the dumps here.
I am using a fairly slow machine, with nothing else happening on it, to minimize
interference in the results. Let’s run the same query twice, and see how much
improvement we see the second time:


SELECT * FROM cur;

14144 rows in set (2.96 sec)
Now we run the same query again:
SELECT * FROM cur; 14144 rows in set (3.02 sec)

Now we
run the same query again:

SELECT * FROM cur; 14144 rows in set (3.02 sec)

What is
happening? We would expect the second query to take noticeably less time. Let’s
examine some of the status variables to get a better picture.


mysql> SHOW STATUS LIKE ‘%qcache%’;
+————————-+———-+
| Variable_name | Value |
+————————-+———-+
| Qcache_queries_in_cache | 0 |
| Qcache_inserts | 2 |
| Qcache_hits | 0 |
| Qcache_lowmem_prunes | 0 |
| Qcache_not_cached | 2 |
| Qcache_free_memory | 20962720 |
| Qcache_free_blocks | 1 |
| Qcache_total_blocks | 1 |
+————————-+———-+
8 rows in set (0.00 sec)

The two queries we ran
are both recorded (by Qcache_inserts), but neither of them have been
cached. (You may get different results if other queries have been running.) The
problem is that the result set is too big. I used the Wikipedia Esperanto dump
(4MB compressed – the English dump is 135MB, and even though my English is
better than my Esperanto, bandwidth is expensive in South Africa!), but it is immaterial, as even
that is more than the query cache can handle by default. There are two limits
in play here – the limit for each individual query is determined by the value
of query_cache_limit, which is 1MB by default. Moreover, the limit of
the cache in total is determined by query_cache_size, which we have seen
already. The former limit applies here. If a result set is greater than 1M, it
is not cached.

The Query cache in action (really)

Let’s try a smaller
query:


SELECT cur_is_new FROM cur WHERE cur_user_text > ‘Y’

2336 rows in set (0.38 sec)

Let’s see if this one was
cached:


mysql> SHOW STATUS LIKE ‘%qcache%’;
+————————-+———-+
| Variable_name | Value |
+————————-+———-+
| Qcache_queries_in_cache | 1 |
| Qcache_inserts | 3 |
| Qcache_hits | 0 |
| Qcache_lowmem_prunes | 0 |
| Qcache_not_cached | 2 |
| Qcache_free_memory | 20947592 |
| Qcache_free_blocks | 1 |
| Qcache_total_blocks | 4 |
+————————-+———-+
8 rows in set (0.00 sec)

There is now a query in
the cache. If it took 0.38 seconds to run the first time, let’s see if we
notice an improvement the second time:


SELECT cur_is_new FROM cur WHERE cur_user_text > ‘Y’

2336 rows in set (0.11 sec)

Much better! And, looking
at the status again:


mysql> SHOW STATUS LIKE ‘%qcache%’;
+————————-+———-+
| Variable_name | Value |
+————————-+———-+
| Qcache_queries_in_cache | 1 |
| Qcache_inserts | 3 |
| Qcache_hits | 1 |
| Qcache_lowmem_prunes | 0 |
| Qcache_not_cached | 2 |
| Qcache_free_memory | 20947592 |
| Qcache_free_blocks | 1 |
| Qcache_total_blocks | 4 |
+————————-+———-+
8 rows in set (0.06 sec)

The cache has been hit
once. The status variables above should be fairly self-explanatory. Available
memory for the cache has gone from 20962720 to 20947592 bytes. The most useful
variable for future tuning is Qcache_lowmem_prunes. Each time a cached
query is removed from the query cache, (because MySQL needs to make space for
another), this value will be incremented. If it increases quickly, and you
still have memory to spare, you can up the query_cache_size, while if it
never increases, you can reduce the cache size.

Let’s run the query
again, with a slight difference, as follows:


SELECT cur_is_new from cur where cur_user_text > ‘Y’

2336 rows in set (0.33 sec)

That took longer than we
would have expected. Let’s look at the status variables to see what’s up:


mysql> SHOW STATUS LIKE ‘%qcache%’;
+————————-+———-+
| Variable_name | Value |
+————————-+———-+
| Qcache_queries_in_cache | 2 |
| Qcache_inserts | 4 |
| Qcache_hits | 1 |
| Qcache_lowmem_prunes | 0 |
| Qcache_not_cached | 2 |
| Qcache_free_memory | 20932976 |
| Qcache_free_blocks | 1 |
| Qcache_total_blocks | 6 |
+————————-+———-+

The query has not made
use of the cache – in fact, MySQL has inserted another query in the cache! The problem
here is that MySQL’s query cache is case-sensitive (in fact it is byte
sensitive). The query must be identical in every way – no extra spaces, no
changes in case. Therefore, the above query is treated as a different query.
This fact alone should be enough for you to adopt a certain convention, and
ensure all application developers make use of it. I use caps for MySQL
keywords, and lower case for table and field names.

Ian Gilfillan
Ian Gilfillan
Ian Gilfillan lives in Cape Town, South Africa. He is the author of the book 'Mastering MySQL 4', published by Sybex, and has been working with MySQL since 1997. These days he develops mainly in PHP and MySQL, although confesses to starting out with BASIC and COBOL, way back when, and still has a soft spot for Perl. He developed South Africa's first online grocery store, and has developed and taught internet development and other technical courses for various institutions. He has majors in Programming and Information Systems, as well as English and Philosophy. For 5 years he was Lead Developer and IT Manager for Independent Online, South Africa's premier news portal. However, he has now 'retired' from fulltime work, and is hoping that his next book will be more in the style of William Blake and Allen Ginsberg.

Latest Articles