Fragmentation relates to index entries that are out of sequence, which for queries that access data sequentially, leads to additional work to retrieve the index’s data. This excerpt from the book SQL Server DMVs in Action discusses a script for identifying the most fragmented indexes. Fragmentation affects query performance and should be removed.
SQL Server DMVs in Action Ian W. Stirk |
This article is taken from the book SQL Server DMVs
in Action. The author discusses a script for identifying the most fragmented
indexes. Fragmentation affects query performance and should be removed.
Fragmentation relates to index entries that are out of sequence,
which for queries that access data sequentially, typically index scans, leads
to additional work to retrieve the index’s data. This additional work can
reflect itself in longer running queries, with potentially more blocking and
client timeouts. Where possible, we should remove this fragmentation so we
don’t perform any unnecessary work.
The crux of this script is the DMV sys.dm_db_index_physical_stats.
This DMV accepts various parameters, allowing fragmentation to be reported at
various levels of granularity, such as for a database, table, index, and so on.
Under the hood, this DMV calls database console commands (DBCC), which can take
a long time to execute. Consider this and its impact on resources when running
this script. On my 1.5 terabyte database containing 255 tables, this script
took more than an hour to execute.
The script we use to identify the most fragmented indexes is
shown in listing 1.
Listing 1 The most fragmented indexes
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED SELECT #A DB_NAME() AS DatbaseName , SCHEMA_NAME(o.Schema_ID) AS SchemaName , OBJECT_NAME(s.[object_id]) AS TableName , i.name AS IndexName , ROUND(avg_fragmentation_in_percent,2) AS [Fragmentation %] INTO #TempFragmentation FROM sys.dm_db_index_physical_stats(db_id(),null, null, null, null) s INNER JOIN sys.indexes i ON s.[object_id] = i.[object_id] AND s.index_id = i.index_id INNER JOIN sys.objects o ON i.object_id = O.object_id WHERE 1=2 EXEC sp_MSForEachDB 'USE [?]; #B INSERT INTO #TempFragmentation SELECT TOP 20 DB_NAME() AS DatbaseName , SCHEMA_NAME(o.Schema_ID) AS SchemaName , OBJECT_NAME(s.[object_id]) AS TableName , i.name AS IndexName , ROUND(avg_fragmentation_in_percent,2) AS [Fragmentation %] FROM sys.dm_db_index_physical_stats(db_id(),null, null, null, null) s INNER JOIN sys.indexes i ON s.[object_id] = i.[object_id] AND s.index_id = i.index_id INNER JOIN sys.objects o ON i.object_id = O.object_id WHERE s.database_id = DB_ID() AND i.name IS NOT NULL AND OBJECTPROPERTY(s.[object_id], ''IsMsShipped'') = 0 ORDER BY [Fragmentation %] DESC' #C SELECT top 20 * FROM #TempFragmentation ORDER BY [Fragmentation %] DESC DROP TABLE #TempFragmentation #A Creating a temporary table to hold the results #B Looping around all databases #C Identifying most fragmented indexes
In the listing, you can see there is a single DMV and two system
tables involved in identifying the most fragmented indexes, a brief description
of each is shown in table 1.
Table 1 DMVs/system tables for identifying most
fragmented indexes
DMV / table |
Description |
sys.dm_db_index_physical_stats |
Contains size and fragmentation information for the data |
sys.indexes |
Contains details for each index such as name and type |
sys.objects |
Contains details for each object such as schema name |
The joining of the DMV and system tables provides enough
information to identify the most fragmented indexes across all databases on the
server. The DMV and system table are joined on their common key columns; that
is, object_id and index_id. The system tables sys.objects and sys.indexes are
joined on the object_id key. The system table sys.objects is used to provide
information about the schema to which the index relates.
We use a common pattern to create the temporary table to hold the
transient results. This is a common pattern for looping over all the databases
on the server.
The first part of the script creates an empty temporary table
(called #TempFragmentation) with the required structure of column names and
datatypes. We use the Microsoft-supplied stored procedure, sp_MSForEachDB, to
execute a query on each database on the server. The query we execute selects 20
most fragmented indexes on each database. We put the results of each execution
into the temporary table. Finally, we select the top 20 most fragmented indexes
across all databases on the server.
The query we execute on each database identifies the top 20 most
fragmented indexes using the column avg_fragmentation_in_percent and sorting by
this column in a descending order.
Note that we ignore any indexes whose name column is set to NULL.
That’s because they are not indexes; they are heaps. Similarly, we are only
interested in our own user-created indexes, so we filter out any indexes that
relate to tables created by the SQL Server installation process (the column
IsMsShipped has a value of 1).
An example of the type of output for this query is given in
figure 1.
Figure 1 Output showing the most fragmented indexes
Having a low level of fragmentation is especially important for
those queries that involve ranges, which retrieve data between two points.
Fragmentation results in additional work. Where possible, fragmentation should
be removed.
Typically, Microsoft recommends that indexes with fragmentation
percentage in excess of 30% be rebuilt. Similarly, indexes with fragmentation
percentage between 10% and 30% should be reorganized.
It is possible to rebuild/reorganize indexes individually from
within SQL Server Management Studio (SSMS) by right-clicking the relevant index
and selecting Rebuild. While this is OK for selected indexes, for a more
encompassing approach, a script should be created and the output should run
automatically. This script should be run at regular intervals as part of the
regular database housekeeping jobs. These operations can be performed when the
database is online; however, that may impact the performance, so please test a
small change before it is applied more aggressively.
It is possible to concentrate our efforts on a given database,
table or index. This is achieved by supplying relevant parameters to the sys.dm_db_index_physical_stats
DMV.
Summary
In this article I discussed a script that uses DMVs to identify
indexes that are fragmented. Indexes are a vital element in determining how a
table’s data is accessed and affect query performance.
SQL Server DMVs in Action Ian W. Stirk |