dcsimg

Using Dynamic Management Objects to Monitor I/Os Against an Instance of SQL Server

July 30, 2010

To ensure that queries run as efficiently as possible, the database administrator needs to monitor performance to find inefficient queries. This article focuses on exploring how a DBA can monitor the I/Os against an instance of SQL Server, using Dynamic Management Objects, or more commonly know as Dynamic Management Views (DMVs) and Functions (DMFs).

When you are designing an application around a SQL Server database, or any database platform for that matter, how well your queries perform is important. With SQL Server, you may have a single application using a single database, or multiple applications running against multiple databases on the same instance. In order to make queries that run against a specific instance of SQL Server run as efficient as possible, the database administrator needs to monitor performance to find inefficient queries. There are many different things a DBA can monitor, CPU, I/O, memory, etc. In this article I will focus on exploring how a DBA can monitor the I/Os against an instance of SQL Server, using Dynamic Management Objects, or more commonly know as Dynamic Management Views (DMVs) and Functions (DMFs).

Using Dynamic Management Objects for Your Analysis

When you are looking at a database instance and your goal is to identify what is consuming I/O, where do you start? There are a number of different ways to go about this, depending on your situation. If you have a single database running on the instance, then all you will need to do is peek into that single database and identify queries that are issuing the bulk of the I/Os. However, if you have multiple databases running on an instance, you need to determine how much I/O each database is generating.

Depending on whether you have a single database instance or a multiple database instance will determine which DMVs and DMFs to use. I will explore which DMVs to use in a minute. However, first you need to understand how SQL Server collects information for Dynamic Management Objects and the limitations they have.

SQL Server starts collecting information that can be exposed via DMVs and DMFs as soon as SQL Server starts running. The information collected is available as long as SQL Server is running. Once SQL Server is stopped all the information collected is lost. Information collected may also be cleared, or reset depending on what happens within SQL Server. For example, statistics are collected on indexes as they are used and updated, but if an index is dropped and recreated then the statistics collected are also dropped along with the index. Therefore, when you are using DMVs and DMFs to analyze your I/O you need to keep in mind how long SQL Server has been running, and whether or not the statistics could have been reset due to various activities.

Determining I/O Consumption by Database

Before you go looking at a specific database to find those I/O intensive T-SQL statements, you must first identify which database, or databases on your SQL Server instance are using most of the I/O. To do this you can use the "sys.dm_io_virtual_file_stats" DMF. This DMF returns the I/O statistics associated with the DATA and LOG files for all databases on an instance. This DMF replaces the "fn_virtualfilestats" function.

This DMF can return I/O for all databases, or just a specific database depending on what parameters are pass to the function. Here is the syntax for calling this function:

sys.dm_io_virtual_file_stats (
{ database_id | NULL }
, { file_id | NULL }
)

If a database_id is passed to this function then this function only returns information for the database where the database_id matches the ID that is passed. You can also pass just file_id if you only want to return the I/O information for a specific file_id across all databases. Alternatively, you can pass both a database_id and a file_id if you only want this function to return I/O information for a specific database_id and file_id. If both the database_id and File_id parameters are null then this DMF returns I/O information for all databases and files on an instance.

The following table identifies the output returned from this DMF.

Column name

Data type

Description

database_id

smallint

ID of database.

file_id

smallint

ID of file.

sample_ms

int

Number of milliseconds since the computer was started. This column can be used to compare different outputs from this function.

num_of_reads

bigint

Number of reads issued on the file.

num_of_bytes_read

bigint

Total number of bytes read on this file.

io_stall_read_ms

bigint

Total time, in milliseconds, that the users waited for reads issued on the file.

num_of_writes

bigint

Number of writes made on this file.

num_of_bytes_written

bigint

Total number of bytes written to the file.

io_stall_write_ms

bigint

Total time, in milliseconds, that users waited for writes to be completed on the file.

io_stall

bigint

Total time, in milliseconds, that users waited for I/O to be completed on the file.

size_on_disk_bytes

bigint

Number of bytes used on the disk for this file. For sparse files, this number is the actual number of bytes on the disk that are used for database snapshots.

file_handle

varbinary

Windows file handle for this file.

Now that we understand how to call the sys.dm_io_virtual_file_stats and the columns returned, let me provided you with a couple of sample queries. These sample queries can help you identify different I/O aspects related to your instance.

First, let me identify which databases are contributing the bulk of the I/O for an instance. In order to do this, I will be summing up and grouping the number of read I/Os and write I/Os by database and then ordering the output where the database with the highest number of I/Os is displayed first. Below is a query that will show the top five databases from an I/O perspective:

SELECT TOP 5 DB_NAME(database_id) AS [Database Name]
,SUM(num_of_reads + num_of_writes) AS [Total I/Os]
FROM sys.dm_io_virtual_file_stats (NULL,NULL)
GROUP BY database_id
ORDER BY SUM(num_of_reads + num_of_writes) DESC

When I run this code on my laptop, which has a very lightly used SQL Server instance, I get the following output.

Database Name Total I/Os
------------------ --------------------
Examples 678
msdb 221
ReportServer 171
Demo 170
master 132

From this output, you can see that the "Examples" database has done the most I/Os since my SQL Server started, followed by the "msdb" database. By using this single query, you could quickly identify which database on your instance has used the most I/O. This information can then be used to perform additional I/O analysis.

Another thing you can identify with this DMF is how long SQL Server has been up and running. On each record returned from querying this DMF is a column named "sample_ms". That column identifies the number milliseconds since SQL Server started. By using this column value, you can calculate the number of days, hours, minutes, seconds, and milliseconds that SQL Server has been up and running, as the following query demonstrates.

SELECT sample_ms / (1000*60*60*24) AS DAYS
, sample_ms / (1000*60*60) - sample_ms / (1000*60*60*24) * 24 AS HOURS
, sample_ms / (1000*60) - sample_ms / (1000*60*60) * 60 AS Minutes
, sample_ms / (1000) - sample_ms / (1000*60) * 60 AS Seconds
, sample_ms - sample_ms / (1000) * 1000 AS MilliSeconds
FROM sys.dm_io_virtual_file_stats(1,1);

If SQL Server has been up for a length of time it is not possible to identify when the I/Os occurred against each of these virtual files using this DMF. If you are concerned, about which database is currently performing all the I/O you need to compare the virtual I/O information between two different samples. Here is a script that will identify the top five databases, based on the amount of I/Os between two different samples that are 10 seconds apart.

DECLARE @Sample TABLE (
database_id SMALLINT,
file_id SMALLINT,
sample_ms INT,
num_of_reads BIGINT,
num_of_bytes_read BIGINT,
io_stall_read_ms BIGINT,
num_of_writes BIGINT,
num_of_bytes_written BIGINT,
io_stall_write_ms BIGINT,
io_stall BIGINT,
size_on_disk_bytes BIGINT,
file_handle VARBINARY(500));
 
INSERT INTO @Sample SELECT * FROM sys.dm_io_virtual_file_stats(NULL,NULL);
 
WAITFOR DELAY '00:00:10';
 
SELECT TOP 5 DB_NAME(I.database_id) AS 'Database Name'
, SUM(I.num_of_reads - S.num_of_reads) +
SUM(I.num_of_writes - S.num_of_writes) AS 'Number of I/Os'
FROM sys.dm_io_virtual_file_stats(NULL,NULL) I
JOIN @Sample S
ON I.database_id = S.database_id
AND I.file_id = S.file_id
GROUP BY I.database_id
ORDER BY SUM(I.num_of_reads - S.num_of_reads) +
SUM(I.num_of_writes - S.num_of_writes) DESC;

Here I have captured the first sample of I/O statistics by inserting the records returned from sys.dm_io_virtual_file_stats DMF into a table variable named @Sample. I then use the WAITFOR statement to wait 10 seconds and then I compare the information from the sys.dm_io_virtual_file_stats with what was saved in the table variable. In this comparison, I calculate the number of I/Os that have been accumulated between both samples, by subtracting the count of I/Os from the first sample, from the I/O count of the second sample, by database. I then display the top five databases based on the number of I/Os submitted during this 10 second window.

As you can see, there are a number of different ways you can use the sys.dm_io_virtual_file_stats DMF, to calculate which database is causing the bulk of the database I/Os on your instance.

Which T-SQL Statements Are Issuing those I/Os?

Now that you know which databases are issuing the bulk of the I/O operations you are probably asking yourself which T-SQL statements within that database is issuing all the I/Os. When you are doing performance analysis, you want to find those statements that are issuing the majority of I/Os. You might find all it takes is to fine-tune a few queries to reduce your database I/Os considerably.

To find the most I/O intensive commands for a specific database you need to use a couple different DMVs and DMFs. The "sys.dm_exec_query_stats" DMV has the I/O statistics by SQL statement , but not the text of the statement. To identify text of the SQL statements, you need to join the output of the query stats DMV with the "sys.dm_exec_sql_text" DMF. In addition, to make matters even a little more complicated, you need to review the plan attribute information from the "sys.dm_exec_plan_attributes" DMF to identify database for ADHOC queries. Below is a T-SQL statement that identifies the 100 most expensive T-SQL statements from an I/O perspective that were executed from the "master" database:

SELECT TOP 100
COALESCE(DB_NAME(st.dbid),
DB_NAME(CAST(pa.value AS INT))+'*') AS [Database Name]
-- find the offset of the actual statement being executed
,SUBSTRING(text,
CASE WHEN qs.statement_start_offset = 0
OR qs.statement_start_offset IS NULL
THEN 1
ELSE statement_start_offset/2 + 1 END,
CASE WHEN qs.statement_end_offset = 0
OR qs.statement_end_offset = -1
OR qs.statement_end_offset IS NULL
THEN LEN(st.text)
ELSE qs.statement_end_offset/2 END -
CASE WHEN qs.statement_start_offset = 0
OR qs.statement_start_offset IS NULL
THEN 1
ELSE qs.statement_start_offset/2 END + 1
) AS [Statement]
, qs.creation_time AS [Plan Creation Date]
, qs.execution_count [Execution Count]
,(qs.total_logical_reads + qs.total_logical_writes
+ qs.total_physical_reads )/qs.execution_count [Average IOs]
,qs.total_logical_reads + qs.total_logical_writes
+ qs.total_physical_reads [Total IOs]
,qs.total_logical_reads/qs.execution_count [Avg Logical Reads]
,qs.total_logical_reads [Total Logical Reads]
,qs.total_logical_writes/qs.execution_count [Avg Logical Writes]
,qs.total_logical_writes [Total Logical Writes]
,qs.total_physical_reads/qs.execution_count [Avg Physical Reads]
,qs.total_physical_reads [Total Physical Reads]
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.plan_handle) st
OUTER APPLY sys.dm_exec_plan_attributes(qs.plan_handle) pa
WHERE attribute = 'dbid'
AND COALESCE(DB_NAME(st.dbid),
DB_NAME(CAST(pa.value AS INT))) = 'master'
ORDER BY (total_logical_reads + total_logical_writes
+ total_physical_reads) / execution_count DESC

Let me go through this query to help explain what it is doing.

First, you can see I am displaying a number of different statistics associated with the "Statement" that was executed. I am displaying totals and average I/Os for the physical, and logical reads as well as the write I/Os. All of the statistics come out of the "sys.dm_exec_query_stats" DMV.

Next, I performed a CROSS APPLY with the sys.dm_exec_sql_text based on the plan_handle to identify statements associated with the compiled query plan. Since some query plans can contain multiple statements, I need to use the "statement_start_offset" and "statement_end_offset" columns to identify which statement the statistics are associated with in the query text.

Lastly, I perform an OUTER APPLY with the sys.dm_exec_plan_attributes DMF to identify the Database ID for ADHOC plans. I need to do this because the sys.dm_exec_query_text DMF does not contain the Database ID for ADHOC query plans. Just to let you know, the "pa.value" identified in the above query is the Database ID in which the query is submitted. Therefore, it is a possibility that the "Database Name" generated from this query is not really the database in which the T-SQL text was run. This could be true if three part naming is used when identifying database objects in an execution plan. You can identify these ADHOC queries and their derived database names by the "*" that follows the database name in the output from this query.

Once I get all the data together from the different DMVs and DMFs, I then calculate I/Os by command summarizing the reads and writes and then divide the summed amount by the execution count to calculate the average number of I/Os per statement execution. Finally, I output the statistics order by the statement that performs the most I/Os on average.

When reviewing output from this query it is important to look at the number of times a query has executed (Execution Count) verses average I/Os per query. This will allow you to focus in on those queries that are executed frequently. I say this because, if you have a query that is executed hundreds of thousands of times and you can cut down the average I/Os per execution significantly then you drastically reduced the total I/Os since, the T-SQL statement is executed so many times. That is not to say you should totally ignore the T-SQL statements that are executed just a few times, especially if they are executing millions of I/Os per execution. It really comes down to balancing your time, and the effort related to resolving the I/O issues. Sometimes all it takes is adding a missing index, and other times it is a total rework/redesign of your databases.

Keep in mind that these statistics are collected only since SQL Server started up, or a query plan was cached. Therefore, when you are reviewing the output of the above query you might consider when the plan was created (Plan Creation Date) so you know how long SQL Server has been keeping statistics for each statement.

Tune-up Your Server by Reducing I/O

I/O is the resource that T-SQL commands wait on the longest. The more I/Os a query does the longer it will take to run the query. By looking at your instance and identifying the databases that are using the bulk of the I/Os will help you identify where to focus your tuning effort. By drilling into the high I/O databases and finding the specific T-SQL statements that consume large amounts of I/Os, you can quickly identify those problem queries that you should look at from a tuning perspective. By reducing I/O on your big I/O consuming queries and databases, you can help tune-up your server from an I/O perspective.

Additional Resources

Technet: Dynamic Management Views and Functions (Transact-SQL)
MSDN Blogs - Best Practices for Dynamic Management Views - George Heynen

» See All Articles by Columnist Gregory A. Larsen








The Network for Technology Professionals

Search:

About Internet.com

Legal Notices, Licensing, Permissions, Privacy Policy.
Advertise | Newsletters | E-mail Offers