Getting the Best from the Index Tuning Wizard
July 27, 2000
The Index Wizard is a handy tool shipped as part of the SQL 7 Profiler that can analyze a set of SQL queries and suggest index changes that could improve their performance. Here I will take a look at how to get the best results out of it.
How The Wizard Works
You need to point the Wizard at a copy of your database, "feed" the Index Wizard with a "Workload" file (which is a set of representative SQL commands that are used against the database you wish to tune) and just let it go. The Wizard examines the queries in the workload file and tries to determine if indexing changes would improve their performance.
Building A Workload File
If you are developing a new database, then what these "representative" commands are may well be guesswork, and if SQL queries are generated on the fly by web clients for instance, there may be infinite variations of SQL queries that can be run--in this case you have to make an educated guess. With other applications all your queries may be controlled through a small set of stored procedures, you just need to fabricate calls to these SPs with some "representative" parameters.
With live systems the problem of getting a workload file pretty much goes away, SQL profilers tracing function will create one for you--just point Profiler at the live database and let it record some actual user queries. Don't forget to add the database ID in the filters section (simply run "SELECT DB_ID()" in the relevant database to find out it's ID) to exclude queries for other databases--you can also cut out all the "connection" info in order to keep you trace file small.
Running The Wizard
As a matter of preference I always run profiler on a remote machine and store the output in a .trc file on my remote machine, minimizing load on the SQL Server. This is particularly important in live environments as you want to add as little overhead on a live server as possible.
Once I have my workload file I need to run the Index Tuning wizard against the database. Again, to minimize impact on live servers, I always do this on a development machine with an up to date copy of any live database I have run my workload against. Index Wizard works on a "Logical IO" basis, which is not really machine dependant, so it does not matter if the development machine's hardware spec is different to the live one.
The Wizard will recommend a number of indices to you, many of which will be in place already, and can generate a script to create new indexes for you, which should be saved to disk for later us. If you require, the Wizard can generate any new indexes for you on the spot, or schedule the job for a quieter time.
Dealing With The Output
Amongst other things, the Wizard will show you the "Top 100" queries that if it thinks will benefit most from implementing its recommendations, and let's you save them to a SQL file. I suggest you do save the queries, as they can come in handy later on.
It will also ask you if you want to put the new indices in place straight away--I usually say "Yes" to this, because I am usually running the Wizard on a development database. Once the new indexes are created, I can now compare "Head to Head" the copy database (with the new indexes) against the live database (or another copy of it)
This is where the "most improved" queries that we saved earlier come in handy--I compare the query plan and for the queries on each copy of the database, or run the queries in each database to get a feel for actual time improvements (Needless to say, don't run any updating queries against your live database)
Once I have convinced myself that the Wizard's recommendations are worthwhile I can then plan out their implementation in the Live environment. I don't always assume that they are worthwhile--I will tell you why later.
Hints And Tips
It's tempting to get as large as possible a workload file to make your testing more "representative", but I suggest you do not do this for the following reasons:
If you really want to process against larger data sets, I suggest you split them up into smaller sections and aggregate the results.
Remember that an additional index can speed up many operations, but may also slow down any insert, update and delete queries (because the new indes also needs to be kept up to date) You have to consider this before adding an index. If the wizard says that it's suggested index will speed up retrieval by a huge amount, but you mostly insert to this table, it might not be such a good idea to add the index after all.
Remember also that the usage profile of your database may change throughout the day--maybe you add data to the tables throughout the working day but produce batch reports during the evenings. More indexes will favour batch reporting at the expense of data collection, but I would say that it's often more important to reduce transaction times during the data collection stage. During the day your users are sitting waiting for things to happen, but you have the whole night to produce those reports.
The wizard defaults to considering indices that can have up to 16 columns. It's generally recommended to keep indexes as narrow as possible though, and I usually reduce this setting under the "Advanced" tab.
If the wizard suggests an index that will shave a tiny fraction of a query that already seems to go blindingly fast, don't discount it simply because you cannot actually see the improvement--when you have multiple concurrent users these tiny savings really can add up.
One Final Thought
If adding an index is the cure, is the absence of an index the real problem? It could be the query itself that needs tuning, or it could be that your database's structure is not as efficient as it could be. I have often found that the Wizard's results have revealed a more fundamental flaw than a missing index, and you should always consider this possibility.