SQL Server Performance Tuning : Pt. 2

Optimization Techniques

In Part I of my SQL Server Performance Tuning and Optimization article,
I identified ways of reducing the process of finding the location of SQL performance issues to a science. Unfortunately, fixing performance issues, the focus of part II, can be something of a black art. While there are some quick guidelines I have to offer, sometimes the answers and directions consist of trial and error and judgement calls based on experience or understanding of the engine which are beyond the scope of this article.

The whole odyssey reminds me of physics problems where you have to solve several equations to get the information you need in order to answer your original question. Where most people go wrong is forgetting to answer the original question, thinking they have an answer already. Our question was “How can we make this process faster”. We have now found where the problem is, but we don’t yet have a solution.

The good news is there are often quick ways of improving performance. And
there are techniques you can use that will have a high probability of being
successful, though they may be more cumbersome.

Though my answers usually involve indexing, I have not been impressed by the
Index Tuning Wizard.

Depending on how I see the problem I usually go through the following
strategies of things to look for and try. In each case the same basic idea
applies. Understand the SQL that is running slowly. Try to think of what is
making it slow. Change the SQL or indexes or tables. Then rerun the statement to
see if your change(s) worked. Sometimes you can spend hours trying to get from 2
seconds to 500 ms; other times 1 second will be reduced to 100 ms almost
effortlessly. Be prepared for more of the former than the latter.

Errors, Omissions, Pleonasms, and Cursors

This is the first thing to look for, not because it is the most common, but
because it should become apparent as you examine and come to understand the
problem statements.

Oftentimes, a query is just wrong. Even wrong queries can return the right
results, or appear to do so. A select without a where clause is a common
mistake. For example, setting a variable from a table without a where clause:

SELECT @val = col from aTable

This will select all the rows from the table and the variable will end up
with the value in the last row. On a fast machine with a medium dataset size,
this could be missed–the statement will be a lot slower than it should be,
but may not be slow enough to stop a developer or tester.

Sometimes an unnecessary redundancy (pleonasm) is used (as opposed to those
necessary redundancies;):

From

Select aCol 
  from aTable 
  where anID in 
    (select anID from aTable 
       where anotherCol in (someVals))

To

Select aCol 
  from aTable 
  where anotherCol in (somVals)

Cursors are sometimes necessary, but many times they are used when when not really needed. Performing a quick operation several thousand times in a
cursor can be orders of magnitude slower than performing the same operation for
all the rows once.

Searching for coding errors and irregularities often takes a good knowledge
of SQL because they are often caused by inexperience. It’s often hardest to see
our own mistakes, so this is an area where another team member can and should be asked for help.

SARG’s and Indexing

Indexing is really the heart and soul of optimization. Most optimization
issues will bring up the need for an index or two, even if some of the other
techniques are employed. Indexes are usually the fastest way to access data, and
seeing how indexes are used by the Optimizer is vital to fixing performance
problems.

SARG stands for search arguments. These are restrictions in where and join
clauses that can be applied to indexes to select rows more quickly:

Select au_lname 
  from authors 
  where au_lname like 'w%'

In this statement, the column au_lname in the where clause is the SARG. If
there were a million rows in the table, an index on au_lname would be a huge
performance improvement. With the index, the above select could jump right to
the J’s and pull out the W%’s right away. Without the index there would be a
table scan. Not only would every row have to be checked, but the amount of data
pages checked would be much larger (resulting in more time for searching)
because the table has more columns and data than the index. The table could
consist of 10,000 pages where the index might be only 1,000 pages and the number
of pages accessed to find the data in the index might be only a dozen. So this
index would reduce a 10,000 page search to 12, for an immediate performance gain
of 100 times.

Remember, SARG’s must match the first column in an index. The index (state,
au_lname
) would not help the above query and would not be used. It would be like
looking for first names in the phone book; it’s just not ordered for that.

Sometimes there are conditional criteria which are understood, but not used
in the query; maybe they are used later in code logic. A common example is when
adding a date restriction to a query can improve performance. Oftentimes adding
search arguments can speed up a query because it will use different or
additional indexes.

Covered Index

Whenever considering indexes, consider a covered index for even better
performance when retrieving large rowsets. A covered index is nothing more than
an index that includes all columns referenced in a query. So a covered index is
only a covered index in relation to particular queries. Repeating the above
example:

Create index lname on authors(lname)
Select au_lname 
  from authors 
  where au_lname like 'w%"

The index on au_lname is a covering index for this query. What this means is
that all the information the query needs is in the index data pages and the
table data pages will not need to be accessed to execute the query. On the other
hand if the query were:

Select au_lname, state 
  from authors 
  where au_lname like 'w%'

and the index still only covered au_lname, SQL Server would search through
the index, and then for each entry it found it would have to load the data page
and pull the single row off the 8 K page to get the state name. If 255 rows in
the index qualified (which would all be on one index page), there would be 255 page accesses of the table data.

Depending on what percentage of rows qualified, the Optimizer might decide to
do a table scan instead of using the index. The table might only be 100 pages
long, but using the index, in the above example, several of the pages would be
hit multiple times, so a straight table scan would be better. Alternatively, if
the table data were 1,000 pages (8 MB) the Optimizer would probably see that the
index use would be faster. And if the table were 10,000 pages, you could rest
assured that the index would be chosen.

So even if the index is used, one index page will yield 255 additional page
accesses to retrieve the data. In this case, a covering index would reduce the
page accesses back down to one or two. If the index was on (au_lname, state) the
data pages wouldn’t need to be accessed and the simple index lookup would give
all results from the query.

If access were not a big consideration and insert/update and storage space
issues were not problems, I would add an additional index (state, au_lname) for the following select:

Select au_lname 
  from authors 
  where state = 'NY'

It has the same keys as the above index, but with state at the beginning, the
search will be much faster. Alternatively, the previous index (au_lname, state)
is still a covering index, and even though it is not in a useful column order,
it still provides a performance improvement over a table scan. A full index scan
would require fewer pages than a table scan.

Latest Articles