In my prior T-SQL best practices articles I’ve talked about
different coding styles that led to slow and fast query executions, and ways to
minimize cached plans. This article will be an extension on those concepts
discuss in previous articles. I this article I will talk about parameterized
queries. I will discuss how the SQL Server optimizer tries to parameterize a
query if it can, as well as how you can build your own parameterized query.
What is a Parameterize Query?
A simple way to think of a parameterized query is that it is
just a T-SQL query that accepts parameters that control what is returned from a
query. By using different parameters, a parameterized query returns different
results. To get a parameterized query you need to write your code a specific
way or it needs to meet a specific set of criteria.
There are two different ways a parameterized query can be
created. The first method is let the query optimizer automatically
parameterize your query. Another method is to programmatically code a
parameterized query by writing your T-SQL code a specific way and passing it to
the sp_executesql system stored procedure. I will be discussing this method
later on in this article.
The key thing about parameterized queries is the query
optimizer will be creating a cached plan that can be re-used. By using
parameterized queries automatically or programmatically SQL Server is able to
optimize the processing of similar T-SQL statements. This optimization
eliminates the need to burn valuable resources to creating a cache plan for
each execution of these similar T-SQL statements. Also by creating a re-usable
plan, SQL Server is also reducing the memory usage required to store similar
execution plans in the procedure cache.
Now let’s look at different ways to get SQL Server to create
parameterized queries.
How Parameterized Queries are Created Automatically?
The people at Microsoft who have written the query optimizer
code have gone to great lengths to optimize the way SQL Server processes your
T-SQL command. I guess that’s how the query optimizer got its name. One of
these methods to minimize resources and maximize performance that query
optimizer performs is to look at a T-SQL statement and determine if they can be
parameterized. To look at how this works let’s take a look at the following
T-SQL statements:
SELECT *
FROM AdventureWorks.Sales.SalesOrderHeader
WHERE SalesOrderID = 56000;
GO
Here you can see that this command has a couple of traits.
One it is simple, and the other is it contains a single value specified in the
WHERE predicate for the SalesOrderID value. The query optimizer is able to
identify the simplicity of this query and the fact that there is a single
parameter (“56000”) for the SalesOrderID. Because of this, the query
optimizer is able to parameterize this query automatically.
If you use the following SELECT statement to look into a
clean buffer pool that contains only the cached plan for the above statement
you can see that the query optimizer has rewritten the T-SQL query as a
parameterized T-SQL statement:
SELECT stats.execution_count AS cnt,
p.size_in_bytes AS [size],
[sql].[text] AS [plan_text]
FROM sys.dm_exec_cached_plans p
OUTER APPLY sys.dm_exec_sql_text (p.plan_handle) sql
JOIN sys.dm_exec_query_stats stats
ON stats.plan_handle = p.plan_handle;
GO
When I run this command on a SQL Server 2008 instance I get
the following output, (note output has been reformatted to make it easier to
read):
cnt size plan_text
— ——- ————————————————————–
1 49152 (@1 int)SELECT * FROM [AdventureWorks].[Sales].[SalesOrderHeader]
WHERE [SalesOrderID]=@1
If you look at the plan_text column in the above
output you can see it doesn’t look like the original T-SQL text. As stated
earlier the query optimizer re-wrote this query into a parameterized T-SQL
statement. Here you can see it now has a variable (@1) with a data type (int)
defined in the front of the SELECT statement. Also at the end of the plan_text
the value “56000” has been replaced with the @1 variable. Now that this
T-SQL statement has been re-written and stored as a cached plan it will be
available for re-used if a future T-SQL command stays mainly the same with only
changes to the valuing being supplied for the SalesOrderID column. Let’s see
this in action.
If I run the following commands on my machine:
DBCC FREEPROCCACHE;
GO
SELECT *
FROM AdventureWorks.Sales.SalesOrderHeader
WHERE SalesOrderID = 56000;
GO
SELECT *
FROM AdventureWorks.Sales.SalesOrderHeader
WHERE SalesOrderID = 56001;
GOSELECT stats.execution_count AS cnt,
p.size_in_bytes AS [size],
[sql].[text] AS [plan_text]
FROM sys.dm_exec_cached_plans p
OUTER APPLY sys.dm_exec_sql_text (p.plan_handle) sql
JOIN sys.dm_exec_query_stats stats
ON stats.plan_handle = p.plan_handle;
GO
I get the following output from the final SELECT
statement, (note output has been reformatted to make it easier to read):
cnt size plan_text
— ——– ————————————————————–
2 49152 (@1 int)SELECT * FROM AdventureWorks].[Sales].[SalesOrderHeader]
WHERE [SalesOrderID]=@1
Here, I first freed the procedure cache, then I
executed two different, but similar non-parameterize queries to see if the
query optimizer would create two different cached plans or a single cached plan
that would be used for both queries. Here you can see that the query optimizer
was in fact smart enough to paramerterze the first query and cache the plan.
Then when the second query was sent to SQL Server that was similar but with a
different SalesOrderID value the optimizer was able to identify that a plan was
already cached and then re-used it to process the second query. You can tell
this because the “cnt” column now says the plan was used twice.
The database configuration PARAMETERIZATION option
can affect how T-SQL statements are automatically parameterized. There are two
different settings for this option, SIMPLE and FORCED. When the
PARAMETERIZATION setting is set to SIMPLE, only simple T-SQL statements are
parameterized. To demonistrate this, take the following command:
SELECT SUM(LineTotal) AS LineTotal
FROM AdventureWorks.Sales.SalesOrderHeader H
JOIN AdventureWorks.Sales.SalesOrderDetail D ON D.SalesOrderID = H.SalesOrderID
WHERE H.SalesOrderID = 56000
This query is similar to my prior examples, except
here I added an additional JOIN criteria. When the AdventureWorks database
PARAMETERIZATION option is set to SIMPLE this query is not automatically
parameterized. The SIMPLE PARAMETERIZATION setting tells the query optimizer
to only parameterize simple queries. But when the PARAMETERIZATION option is
set to FORCED this query is automatically parameterized.
When you set the database option to use FORCE
PARAMETERIZATION the query optimizer tries to parameterize all queries, not
just simple queries. You might think this would be good. But in some cases
the query optimizer will pick query plans that are suboptimal when the database
PARAMETERIZATION setting is FORCED. When the database PARAMETER setting is
FORCED it changes the literal constants in a query. This may lead to indexes
and index views from not being selected to partipate in the execution plan when
computed columns are involved in query, leading to an inefficient plan. The
FORCED PARAMETERIZATION option might be a good solution to improve performance
on databases that have a high volumn of similar queries with slightly different
parameters being passed to those queries. An online sales application, where
customers are performing lots of similar seaches of your products, with
different product values might be a good type of application that would benefit
from FORCED PARAMETERIZATION.
Not all query clauses will be parameterized. For example
TOP, TABLESAMPLE, HAVING, GROUP BY, ORDER BY,
OUTPUT…INTO, or FOR XML clauses of a query are not parameterized. For a
complete list of clauses that are not parameterized refer to the “Forced
Parameterized” topic in the Books Online documentation.
Using sp_execute_sql to Parameterize your T-SQL
You don’t have to rely on
the database PARAMETERIZATION options to get the query optimizer to
parameterize a query. You can parameterize your own queries. You do this by
re-writing your T-SQL statements and executing the rewritten statements using
the “sp_executesql” system stored procedure. As already seen, the above
SELECT statement that included a “JOIN” clause was not automatically parameterized
when the database PARAMETERIZATION setting was SIMPLE. Let me rewrite that
query so the query optimizer will create a re-usable parameterized query
execution plan.
To demonstrate let’s look
at two similar T-SQL statements that do not automatically get parameterized and
create two different cached execution plans. Then I will re-write the two
queries so they both use the same cached parameterized execution plan.
Let’s take a look at this
code:
DBCC FREEPROCCACHE
GO
SELECT SUM(LineTotal) AS LineTotal
FROM AdventureWorks.Sales.SalesOrderHeader H
JOIN AdventureWorks.Sales.SalesOrderDetail D ON D.SalesOrderID = H.SalesOrderID
WHERE H.SalesOrderID = 56000
GO
SELECT SUM(LineTotal) AS LineTotal
FROM AdventureWorks.Sales.SalesOrderHeader H
JOIN AdventureWorks.Sales.SalesOrderDetail D ON D.SalesOrderID = H.SalesOrderID
WHERE H.SalesOrderID = 56001
GO
SELECT stats.execution_count AS cnt,
p.size_in_bytes AS [size],
LEFT([sql].[text], 200) AS [plan_text]
FROM sys.dm_exec_cached_plans p
OUTER APPLY sys.dm_exec_sql_text (p.plan_handle) sql
JOIN sys.dm_exec_query_stats stats ON stats.plan_handle = p.plan_handle;
GO
Here I’m freeing the
procedures cache and then running the two different non-simple T-SQL statements
that contain a JOIN. Then I’m reviewing the cached plans. Here is the output
from the SELECT statement that uses the DMV’s (note output has been
reformatted to make it easier to read):
cnt size plan_text
— ———– ——————————————————————————-
1 49152 SELECT SUM(LineTotal) AS LineTotal
FROM AdventureWorks.Sales.SalesOrderHeader H
JOIN AdventureWorks.Sales.SalesOrderDetail D
ON D.SalesOrderID = H.SalesOrderID
WHERE H.SalesOrderID = 56001
1 49152 SELECT SUM(LineTotal) AS LineTotal
FROM AdventureWorks.Sales.SalesOrderHeader H
JOIN AdventureWorks.Sales.SalesOrderDetail D
ON D.SalesOrderID = H.SalesOrderID
WHERE H.SalesOrderID = 56000
As you can see from this
output these two SELECT statements are not parameterize by the query optimizer.
The optimizer creates two different cached execution plans, where each one is
executed only once. We can help the optimizer out by using sp_executesql
system stored procedure to create a parameterized execution plan for these two
different SELECT statements.
Here is the code above,
re-written to use the sp_executesql system stored procedure:
DBCC FREEPROCCACHE;
GOEXEC sp_executesql N’SELECT SUM(LineTotal) AS LineTotal
FROM AdventureWorks.Sales.SalesOrderHeader H
JOIN AdventureWorks.Sales.SalesOrderDetail D ON D.SalesOrderID = H.SalesOrderID
WHERE H.SalesOrderID = @SalesOrderID’, N’@SalesOrderID INT’, @SalesOrderID = 56000;
GOEXEC sp_executesql N’SELECT SUM(LineTotal) AS LineTotal
FROM AdventureWorks.Sales.SalesOrderHeader H
JOIN AdventureWorks.Sales.SalesOrderDetail D ON D.SalesOrderID = H.SalesOrderID
WHERE H.SalesOrderID = @SalesOrderID’, N’@SalesOrderID INT’, @SalesOrderID = 56001;
GOSELECT stats.execution_count AS exec_count,
p.size_in_bytes AS [size],
[sql].[text] AS [plan_text]
FROM sys.dm_exec_cached_plans p
OUTER APPLY sys.dm_exec_sql_text (p.plan_handle) sql
JOIN sys.dm_exec_query_stats stats ON stats.plan_handle = p.plan_handle;
GO
As you can see I’ve rewritten
the two SELECT statements to be executed by using the “EXEC sp_executesql”
statement. For each one of those EXEC statements I passed three different
parameters. The first parameter is the basic SELECT statement, but I’ve
replaced the value of the SalesOrderID with a variable (@SalesOrderID). In the
second parameter, I identified the data type for the @SalesOrderID, which is an
integer in this case. Then in the last parameter, I passed the value of the
SalesOrderID. This parameter will control the results that my SELECT produces
based on the SalesOrderID value. The first two parameters of each execution of
sp_executesql are exactly the same. However, the third parameter is different,
because each one has a different SalesOrderID value.
Now when I run the code
above I get the following output from the DMV SELECT statement (note
output has been reformatted to make it easier to read):
cnt size plan_text
— ———– —————————————————————————————–
2 49152 (@SalesOrderID INT)SELECT SUM(LineTotal) AS LineTotal
FROM AdventureWorks.Sales.SalesOrderHeader H
JOIN AdventureWorks.Sales.SalesOrderDetail D ON D.SalesOrderID = H.SalesOrderID
WHERE H.SalesOrderID = @SalesOrderID
From this output, you can
see that I have a single parameterized cached plan that was executed twice,
once for each EXEC statement.
Saving Resources and Optimizing Performance with Parameterized Queries
Each T-SQL statement
needs to be evaluated and an execution plan needs to be built before the
statement can be executed. Creating an execution plan takes valuable CPU
resources. Once an execution plan is created, it uses memory space to store it
in the procedure cache. One way to reduce CPU and memory usage is to take
advantage of parameterized queries. Even though a database can be set up to
FORCE parameterization on all queries, this is not always the best option. By
understand which of your T-SQL statements can be parameterized and then using
the sp_executesql stored procedure, you can help SQL Server save resources and
optimize the performance of your queries.