One of the primary errors when the system gets busy is Too
many connections. It's generally no problem to increase the
number. You can see how close you're running to max by comparing
max_connections with the extended-status variable
threads_connected. If you regularly come close, the
time for some improvements is drawing near.
One improvement you can make on a code level is to use persistent
connections rather than ordinary connections (e.g. using
pconnect() rather than connect() in PHP), In this case, you may
have a scenario with something like:
max_connections - 200
threads_connected - 155
threads_running - 1
Don't be worried that MySQL is wasting resources here. The 154
connections that aren't doing anything are persistent, meaning
that you may have Web server instances running that are
connected, but only one thread is actually running. The time it
takes to process is usually very small. Personally, I've found
persistent connections to be better, although there is no
consensus on whether connections should be persistent or not in
MySQL. Even though the connection overhead in MySQL is much lower
than for other databases (with something like Oracle you
have to use persistent connections), I've always found it
to be better to keep the connection open for reuse rather than
close and open it all the time. But test this for yourself, as
some reports have indicated differently. (Be sure to test
properly under load — some 'proofs' out there claiming non-
persistent connections are better are scarily inaccurate.)
Another delay applies to systems that use INSERT DELAYED
statements. There is a parameter called
delayed_queue_size, defaulted to 1000. This means
that after 1000 INSERT DELAYED statements are queued, the client
is no longer released immediately (remember that this is the
difference between INSERT DELAYED and INSERT LOW PRIORITY, see
the article on Optimizing MySQL Queries and Indexes if you're not
sure). If you have a busy system, and find clients are often being
delayed as they hit this limit, you may want to increase this
figure.
back_log is another value that may be worth
increasing. If your system gets a lot of connection requests in a
very short space of time, MySQL will start refusing connections
after the value of this variable has been stacked. Increasing
this gives your server more scope to handle these, but it
obviously doesn't help if you just get too many connections. This
just helps cope with those short bursts!
In a system such as the above, it is often an indicator of
something going wrong when the threads_connected
value starts to climb, usually meaning queries are getting
locked, and you're approaching gridlock. Use a
show processes
to spot the guilty queries. By watching the queries that appear
in this list, and seeing which are locking or taking too long to
resolve, you can often identify a badly optimized query, and make
improvements to it before the situation gets out of hand.
Another good indicator of a problem is a
slow_queries value that is continually increasing.
In a good system, you should have as few of these as possible.
perhaps you do the odd complex join for an archive search. But if
your routine queries are slow, you have a problem.
And lastly, as far the mysqld variables covered in this
article, if you do lots of sorting (for example, ORDER BY clauses
in your SELECT's), you'll probably need to tweak the
sort_buffer. Each thread that does a sort allocates
a buffer of this size. You may want to go beyond 8MB if you're
doing lots of this kind of query.
Disks
As always, there's more to say and more to find out — use
the generous documentation supplied by MySQL to read more about
all the variables, and search their archives for discussions.
But, before I leave you in the pit lane, there's one more topic I
want to cover.
Besides RAM and processor speed, disk seek time is
probably the primary hardware bottleneck. Increasing the speed of
your disks alone can have a major impact on your database speed,
but there are a few more tricks you can play.
On an operating system level, you can use striping (placing
blocks on multiple disks), mount with the noatime flag (which
means that you don't bother to record the last access time for a
file. On your busy database system, the last access time was
probably the last second for the files that count!). But these
and other operating system tweaks will have to wait for another
day (I'm just mentioning these to give a direction for the bright
sparks who want to go further than this article).
The best way to speed up your available disks is to balance the
load across multiple devices. Besides using striping, symbolic
links provide the easiest way to achieve this. Let's assume you
have two databases both accessed equally. By default these will
appear on one device, so your disk is constantly moving around
seeking records in both databases. It would be far better to place
each of these databases on a separate device, thus reducing the
amount of data on each individual disk (a disk is now called into
action half the time it was before, and can find the records more
quickly, as it has to move across less data than previously). The
way to do this is quite simple:
ln -s source_directory target_directory
The source directory would be on the new device, while the
target_directory would be the name of the directory
containing the database you want to move (in MySQL databases are
stored as directories, with the tables as files therein). The
link then still works smoothly from its current location, but the
data is stored in a new place.
Good luck with your optimization efforts. These are exciting
times to be using MySQL, with MySQL 4 offering huge improvements
in functionality and ease of use.
resource list: