The Trouble with SQL Tuning

Database administrators tend to think of SQL tuning as necessary, important, even critical. The reality is that the savings realized by tuning SQL, usually manifested as reduced CPU time and shorter elapsed times, may not offset the costs incurred in time and resources.

Ask yourself this question:  How long have DBAs been doing SQL tuning?  My guess is 25 years or more. If so, then why are we still doing it?!  Do you, as a DBA, want to spend your entire career tuning SQL statements?  Don’t you want something more than that?

The Need to Tune

Database administrators (DBAs) are usually responsible for application performance. There are several categories of performance tuning; here are the most common.

SQL tuning. The DBA captures SQL statements and executes the EXPLAIN facility. This presents access path information in a format that the DBA can analyze to see if better alternatives are available.
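In DB2, EXPLAIN populates a plan table that the DBA then queries. Here is a minimal sketch; the table and column names in the statement being explained are hypothetical, the QUERYNO value is an arbitrary tag, and the PLAN_TABLE columns follow DB2 for z/OS conventions:

```sql
-- Capture the access path for one statement; QUERYNO 100 is an arbitrary tag
EXPLAIN PLAN SET QUERYNO = 100 FOR
  SELECT order_id, order_date
  FROM   orders
  WHERE  customer_id = ?;

-- Review the optimizer's choices for that statement
SELECT queryno, qblockno, planno, accesstype, matchcols, accessname, prefetch
FROM   plan_table
WHERE  queryno = 100
ORDER BY qblockno, planno;
```

An ACCESSTYPE of 'R' (table space scan) against a large table is the classic red flag the DBA looks for in this output.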

Resource tuning. The DBA and systems programming staff measure resource consumption such as CPU, I/O, and memory. They then analyze how DB2, application programs, and even single SQL statements use these resources, in an attempt to minimize resource usage while maximizing throughput.

DB2 subsystem tuning. Here, the DBA and systems programmers analyze how the DB2 software itself uses resources. They review configuration parameters and measure and track the internal statistics that DB2 generates.

Of these, SQL tuning is the simplest and most common. So common, in fact, that DBAs sometimes consider it their most important job.  As we will see, this perception can lead to a false sense of accomplishment.

What Does SQL Tuning Accomplish?

The most common performance issues related to SQL are those where a statement executes for an inordinate amount of time, consumes excessive resources, or exceeds some capacity limit.  Here are some examples:

  • A relatively simple SQL statement executed against a small table runs for over an hour;
  • An SQL statement executes for two minutes, and during that time consumes two minutes of CPU time (basically monopolizing the CPU);
  • An SQL statement fails, and the error message is “CPU limit exceeded”;
  • One or more SQL statements for a mission-critical on-line application execute slowly, resulting in excessive transaction times for customers.

A Level-set on Resource Limits

One common misperception among DBAs is that the only answer to poorly performing SQL statements is “tune or scale up”.  That is, you must either tune the SQL or add hardware resources (CPUs, memory, etc.).

However, reaching a resource capacity limit is a symptom, not the underlying problem.  Tuning and scaling up are just two potential remedies for what may be multiple root causes.

For example, I have too often seen a DBA tune SQL to reduce CPU usage.  The query then runs quickly!  Regrettably, this simply gives other ‘latent’ work the opportunity to consume the newly freed CPU.  The same is true for other constraints.  This kind of thinking merely reacts to symptoms.

The DBA asks, “Then how do I know what to tune?” This is the wrong question. Better questions are: “How do we measure optimization?”  “Who is responsible for enterprise IT capacity planning?”  “Do we have best practices in place for processes such as new application/SQL migration, disaster recovery planning, and so forth?”

Note that these questions are strategic, rather than tactical. They address the issue of the DBA’s priorities and the cost of their time.

But what about SQL Tuning?

DBAs need to remove themselves from the rut of “what do I tune?” and spend some quality time on metrics, configurations, and costs.  Develop these measurements and do a root cause analysis. Know your system!  In my experience, this will take time and effort.  DBAs are smart folks, but we are rarely subject matter experts on operating system and DBMS system performance tuning. 

So what about that query you tuned that used to take 10 minutes of CPU time and now takes only seconds? Let’s consider some real-world examples.

Suppose that the DB2 configuration parameter SRTPOOL (sort pool size) is set far too low. One likely result is that SQL statements requiring a sort (say, due to an ORDER BY) will fail due to insufficient sort pool space.  So the DBA “tunes” the queries by telling the developers that they should remove ORDER BY from their queries and sort the returned result sets themselves (!).  Was this tuning?

Another DBA finds a query that runs for two hours and consumes 15 minutes of CPU time. The DBA then tunes the query, and it now runs in 10 minutes of elapsed time using 10 minutes of CPU time.  A hero? Regrettably, no. The query that used to consume 12.5% of a CPU (15 minutes over 2 hours) now consumes 100% of one. Whenever this query ran, CPU usage spiked, causing havoc.  Was this tuning?

Here is another example. A DBA determines that adding an index on a particular table column would speed up a poorly-performing SQL statement. So they add the index.
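As a sketch, with hypothetical table and column names, the “fix” amounts to one DDL statement:

```sql
-- The statement filtered on a column that had no supporting index
CREATE INDEX ix_claims_status
  ON claims (status_code);
```

After creating it, the DBA would typically re-run RUNSTATS and EXPLAIN to confirm that the optimizer actually chose the new index.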

Question:  Why did that index not already exist? 

Was the database design process faulty?  If so, maybe the DBAs should spend more time on data modeling standards.

Was the business logic for the new application so vague and imprecise that the SQL wasn’t known at an early stage? This points out issues in the application design process.

Since SQL performance analysis is a well-known process, why did the developers not do their own SQL tuning?  Do they not have the tools or training?

The point is that the DBA must investigate the root causes of poor performance, not simply focus on single SQL statements.  Otherwise, they will be doing reactive SQL tuning for the rest of their career.  Is this a good way for the DBA to spend their time? Wouldn’t it be better to fix the application design or database design process so that 80% of those indexes are there from the beginning?

There are ten or so different categories of “fixes” that the DBA can suggest as part of simple SQL tuning. They include:

  • Adding an index;
  • Bringing RunStats up-to-date;
  • Ensuring data is loaded and stored in clustering sequence;
  • Enforcing good SQL coding practices;
  • Developers doing EXPLAINs.
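For instance, bringing statistics up to date is a utility job rather than a SQL change. A sketch of the control statement, with hypothetical database and table space names:

```sql
-- DB2 RUNSTATS utility control statement (SALESDB.ORDERSTS is hypothetical)
RUNSTATS TABLESPACE SALESDB.ORDERSTS
  TABLE(ALL) INDEX(ALL)
  SHRLEVEL REFERENCE
```

Note that static SQL packages must be rebound afterward before the refreshed statistics can influence their access paths.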

Each of these fixes can be traced back to one or more root causes.

But what about Reducing CPU?

If much of your work is starved for CPU, that is a symptom.  What is the root cause?  Here is a list of potential causes of SQL statements that use excessive CPU time.

  • Poorly-written SQL;
  • Poorly-written applications;
  • Transactional applications with poor commit strategies;
  • Unneeded or over-used batch unload / load / backup processes;
  • Too frequent RunStats;
  • Poorly-chosen buffer pool sizes;
  • Poor assignment of page sets to buffer pools;
  • Insufficient memory allocated to Sort Pool / RID Pool;
  • Poor memory management resulting in paging to auxiliary storage;
  • Poorly configured DB2 subsystem;
  • DB2 is configured for maximum I/O throughput, not minimum CPU utilization;
  • DB2 is configured for high-performance WRITE access;
  • Execution of frequent backups;
  • Execution of LOAD … LOG YES processing.

The DBA must first gather relevant data before they can correctly conclude that tuning SQL statements will solve a CPU capacity problem.  There is even more work required to prove that adding CPU capacity to a DB2 subsystem will increase throughput.

I’m not suggesting that SQL tuning is always a waste of time (although if you are not measuring the resources spent and saved in the process, how do you know that it was worth it?).  Announcing, “It runs faster, and with less CPU used!” certainly sounds good.  However, as a manager, I preferred that my DBAs spent their quality time on high-priority issues such as disaster recovery preparedness, data availability, and security.

But Developers Write Poorly-Performing SQL!

This is an unfortunate but very common truth. Best practices and SQL implementation standards are the DBA’s friends in this environment. Perhaps you can convince developers that “Expert SQL developers code like this!”, and give them examples of good-performing SQL, or even bad examples.
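A classic before-and-after pair to show developers (table and column names are hypothetical): applying a function to an indexed column prevents a matching index scan, while an equivalent range predicate on the bare column allows one.

```sql
-- Poor: the function on hire_date makes the predicate non-indexable
SELECT emp_id, last_name
FROM   employees
WHERE  YEAR(hire_date) = 2023;

-- Better: an equivalent range predicate the optimizer can match to an index
SELECT emp_id, last_name
FROM   employees
WHERE  hire_date BETWEEN '2023-01-01' AND '2023-12-31';
```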

One other issue that is often overlooked: Does your IT enterprise charge back resource costs to users?  That is, do they pay for the CPU and the disk storage they use? Are there service level agreements for transactions and applications, with penalties for exceeding them? If none of these mechanisms is in place, then application developers have no incentive to write well-performing code.

So Should I do SQL Tuning?

To get out of the rut of reacting to poorly-tuned SQL, do root cause analysis.  Why does bad SQL keep arriving?  Are developers coding poorly?  If so, perhaps the solution is training, or having them do their own tuning.  Are tools automatically generating poorly-performing SQL?  The answer may be a combination of tool upgrades, index management, RunStats management, or even DB2 subsystem tuning.

Last, DB2 subsystem, data sharing, and operating system configurations all too often affect application and SQL performance. Wouldn’t it be annoying to spend hours, days, or weeks tuning SQL queries, only to discover that system parameter values were skewing the results?  Do you really look forward to re-re-re-tuning every bad SQL statement whenever a configuration change happens?

Summary

The main point is this: If DBAs spend their time tuning SQL rather than fixing the issues that produced it, how will the support staff and the IT enterprise grow? How can you ever find any time to be proactive?  Enterprises are depending more and more upon a professional staff of subject matter experts.

If you spend your career doing SQL tuning you may become an expert on that…which means that tasks such as subsystem configuration and tuning, implementing new features (XML), coordinating disaster recovery, and so forth, will be given to others. In addition, you run the risk of being replaced by a newly-hired (and cheaper) SQL tuner.



Lockwood Lyon
Lockwood Lyon is a systems and database performance specialist. He has more than 20 years of experience in IT as a database administrator, systems analyst, manager, and consultant. Most recently, he has spent time on DB2 subsystem installation and performance tuning. He is also the author of The MIS Manager's Guide to Performance Appraisal (McGraw-Hill, 1993).
