Using Index Intersection


First introduced in SQL Server version 7, Index Intersection gives you new options for creating indexes on tables to maximize performance

To start with I am going to pick the authors table from the Pubs database and explain the indexes that exist on that table, then look at how we would use indexes on that table for queries prior to Index Intersection, then we will look at the options that Index Intersection gives us.

A cautionary note

In a way, the authors table is both a good and a bad example for this article. It is great for demonstration purposes because the Pubs database predates SQL Server 7 and so we can see how indexes were chosen without the benefit of Index Intersection. It is also well known and available to everyone. On the other hand, the performance gains from using Index Intersection (if any) for a table of this size and design are fairly negligible.

Without Index Intersection

Imagine you have a table with two columns that you search on regularly as a pair–for example:

use pubs
select *
from   authors
where  au_fname = 'Akiko' AND
       au_lname = 'Yokomoto'

The pubs database has a non-clustered compound index (i.e. an index with more than one column) on this table which suits this query perfectly, because the index is defined on the columns au_lname and au_fname in that order. SQL Server will use this index to return results for this query.

The ordering of the indexes columns is important because to use a Compound Index, the leftmost column of the index must be considered in the Where clause (or the Join clause of a multi-table query) Because of this, SQL Server will handle the following two queries in different ways.

select *
  from   authors
  where  au_lname = 'Yokomoto'
select *
  from   authors
  where  au_fname = 'Akiko'

The first query from this pair will use the same index to search the table as the first example, because au_lname is the first column defined in the index. However, SQL Server cannot use the same index for the second query, because au_fname is not leftmost in the index definition, and so the optimizer will pick another execution plan, or do a full table scan (au_fname is normally not indexed)

With index intersection

Index Intersection is a technique built into the SQL Server engine to enable it to use more than one index on a table to satisfy a given query. To demonstrate, we need to alter the pubs table a little, so now would be a great time to back up the database.

Backup secured? OK–let’s proceed…

First we are going to create a cut-down version of the authors table using this script:

create table authors_names (
   id       int identity,
   au_lname varchar(40),
   au_fname varchar(40),
   filler   char(7000)
insert authors_names (au_lname, au_fname, filler)
select l.au_lname,
from authors l CROSS JOIN authors f
alter table authors_names add constraint PK_authors_names PRIMARY KEY CLUSTERED (id)
create nonclustered index i__au_fname on authors_names (au_fname)
create nonclustered index i__au_lname on authors_names (au_lname)

Note that I have taken a couple of short cuts to generate suitable test data. Apart from ignoring unnecessary columns for the test, I have used a Cross Join to increase the amount of available test data (see this article for further information). I have also added a “filler” column to use up extra space, as SQL Server will not use indexes on very small tables.

Now if you run the three queries we used earlier against the authors_names table using the “Display Estimated Execution Plan” option in Query Analyser, you should see that the all three queries use one or more indexes (remember the query against au_fname did a table scan previously).

In addition you should see that the query that specifies both first and last names in the Where clause uses both the index on au_fname and the index on au_lname. This is the advantage that Index Intersection brings–It allows SQL Server to scan more than one relevant index to get the data you need, minimizing the amount of data returns and maximizing performance. Prior to SQL 7 this functionality was not available in SQL Server.

Points To Note

I chose to use non-clustered indexes in this example for a good reason. The optimizer will generally prefer to use a clustered index rather than an Index Intersection, so the demonstration would be less likely to work on a setup where clustered indexes are used in the query.

Whether or not a combination of single-column indexes is better than a compound index depends on the data and the queries you run against it. You should test out both cases in your database designs to see which is more efficient for your circumstances.

As ever with indexed tables, it’s important to maintain the index structure and statistics regularly (or use auto-stats) to help SQL Server choose efficient indexes for each query.

Neil Boyle
Neil Boyle
Neil Boyle left school at the age of sixteen thinking that computers were things that only existed in Star Trek. After failed careers as a Diesel Mechanic, Industrial Cleaner, Barman and Bulldozer Driver he went back to college to complete his education. Since graduating from North Staffs Poly he has worked up through the ranks from Trainee COBOL Programmer to SQL Server Consultant, a role in which he has specialised for the past seven years.

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles