Indexes have been a method to improve the performance of SQL
queries since the creation of relational databases in the 1970’s. You are
probably familiar with the ways that indexes can speed query operations at the
expense of additional overhead during data inserts, updates, and deletes. Up
until recently, indexes were always created on a single table and they’ve
undergone only limited changes over the years. That was true until a few years
ago when Oracle introduced "materialized views." These views enabled
Oracle to claim a big boost in performance over other relational database
products. That is, if you used just the right benchmark tests. SQL Server 2000
matched that improvement with indexed views.
The usability of indexed views is limited. While all editions of SQL Server can create and consume an indexed view, it is only the Enterprise Edition and Developer Edition that will make use of them without the addition of the query hint NOEXPAND. As you’ll see later, that condition limits their flexibility a great deal. When a database might have to run on many different SQL Servers that may not all be Enterprise Edition, I question whether you really want to use indexed views. Near the end of the article, I’ll cover what you have to do to make use of them on non-Enterprise Editions of SQL Server. Having made you aware of this limitation let me also let you know that indexed views can have a major positive impact on performance and can really be worth the trouble.
What is an Indexed View?
An indexed view is a view that has a unique clustered index
created on it. Normally views do not exist on disk as rows. That changes for
indexed views, which exist in the database as rows that realize the view. There
can also be non-clustered indexes on the view so long as it has the unique
clustered index.
Since the indexed view exists on disk there are two types of
overhead added to the database:
- The disk space taken up by the view
- The cost of maintaining the view on disk as the base tables are
modified.
Both of these costs can be substantial and limit the
usefulness of indexed views to situations where there is a high ratio of data
retrieval operations to data modifications. They are best for decision support
and analytic applications where the frequency of data modification is low.
Adding indexed views to an OLTP type application may slow it down more than it
helps.
Why use them?
When the circumstances are right, there is one powerful
reason for using them: performance of data retrieval operations. Because the
view exists on disk, there can be substantial savings in response time to
queries that involve the view.
In addition, the view can be used by the optimizer in
response to some or all of a query that does not directly involve the view that
is indexed. The mere existence if the indexed view can change the performance
behavior of queries on other views and on queries that involve similar joins or
grouping operations.
The performance gain during the query comes from one of two
factors:
- Storing a join.
-
Performing grouping and aggregations in advance of running the
query.
It is even possible to combine aggregation and join
operations in a single indexed view.
This article uses an example from the Northwind database:
the [Order Details Extended] view. Here is its definition:
create view “Order Details Extended” AS
SELECT OD.OrderID, OD.ProductID, Products.ProductName,
OD.UnitPrice, OD.Quantity, OD.Discount,
(CONVERT(money,(OD.UnitPrice*Quantity*(1-Discount)/100))*100)
AS ExtendedPrice
FROM Products INNER JOIN “Order Details” OD
ON Products.ProductID = “Order Details”.ProductID
I have modified the syntax of the CREATE VIEW statement
slightly from the definition that you will find in the Northwind database by
adding an alias (OD) for the [Order Details] table and cleaning up the
formatting. The definition of the view is fundamentally unchanged.
Look at a simple query on [Order Details Extended]. I have
selected an OrderID for a moderate size Order to narrow the resultset to only
six rows. The DBCC DROPCLEANBUFFERS command is used so that the physical reads
aren’t short-circuited by the data cache. Set STATISTICS IO ON is used so we
can see the quantity of IO that occurs. Here is the query and its results:
DBCC DROPCLEANBUFFERS
GO
SET STATISTICS IO ON
SELECT ProductName, UnitPrice as UnitPri, Quantity as Qty
, Discount as Disc, ExtendedPrice
FROM [Order Details Extended]
WHERE OrderID = 10657
GO
(Results)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
ProductName UnitPri Qty Disc ExtendedPrice
——————————- ——– — —- —————
Genen Shouyu 15.5000 50 0.0 775.0000
Jack’s New England Clam Chowder 9.6500 24 0.0 231.6000
Spegesild 12.0000 45 0.0 540.0000
Zaanse koeken 9.5000 10 0.0 95.0000
Gnocchi di nonna Alice 38.0000 45 0.0 1710.0000
Camembert Pierrot 34.0000 30 0.0 1020.0000
(6 row(s) affected)Table ‘Products’. Scan count 6, logical reads 12, physical reads 2, read-ahead reads 0.
Table ‘Order Details’. Scan count 1, logical reads 2, physical reads 2, read-ahead reads 0.
There are 12 logical reads on Products, 2 on [Order
Details], and a total of four physical reads. It’s the physical reads that
count the most because they use the resource, the disks, that has the highest
effect on response time. However, logical reads also have their costs.
Figure 1 has the query plan that is generated before
any indexes are added to views. Near the end of the article, we will see how
the plan for this query is changed by the existence of an indexed view, even
though the index is on another view.
Figure 1 Query Plan for SELECT on [Order Details Extended]
As you can see, the query plan confirms the information from
the statistics, all the work of the query is reading the [Order Details] and
Products tables using their primary keys and joining the result. For such a
simple query with a small result set the query is handled very efficiently by
SQL Server but the physical IO of this query can still be cut in half.
Any attempt to add an index on [Order Details Extended] will
fail. There are many restrictions on the views that can be indexed, on the
columns that an indexed view can contain, and on the columns in an indexed view
that may be indexed. The next section shows what the restrictions are and how
a view can be constructed to live within them.