dcsimg

Tips for Using Common Table Expressions

October 29, 2010

Greg Larsen explores using multiple Common Table Expressions (CTEs), the MAXRECUSION option, how to use a CTE within a cursor, and what you cannot do within a CTE.

I introduced you to Common Table Expression (CTE) in my last article. In that article, I covered the basic syntax of a CTE and provided a couple of examples of using CTEs. This article will be an extension of that article. This article explores using multiple CTEs, the MAXRECUSION option, how to use a CTE within a cursor, and what you cannot do within a CTE.

Using Multiple CTEs

Multiple CTEs can be used within the scope of a single SELECT, UPDATE, DELETE, INSERT or CREATE VIEW statement. There are a number of different flavors of how multiple CTEs can be used. In this section, I will describe two different ways that you can use multiple CTEs to support querying SQL Server.

For my first example, I will be creating two different CTEs that are used to join some data in a SELECT statement. To do this I will be defining my two CTEs using a WITH statement and then referencing each of the defined CTEs within my SELECT statement. Being able to define two different CTEs with separate logic, and then joining the CTE output in a single SELECT statement makes it easier to write and debug your CTE code, one CTE at a time. Additionally being able to do this makes the code much easier to read, and maintain.

WITH
---------------
-- First CTE --
---------------
SalesDollarPerYear_CTE (SalesPersonID, FullName,Title, SalesTerritory, Year2002, Year2003, Year2004 )
AS (
SELECT
pvt.[SalesPersonID]
,pvt.[FullName]
,pvt.[Title]
,pvt.[SalesTerritory]
,pvt.[2002]
,pvt.[2003]
,pvt.[2004]
FROM (SELECT
soh.[SalesPersonID]
,c.[FirstName] + ' ' + COALESCE(c.[MiddleName], '') + ' ' + c.[LastName] AS [FullName]
,e.[Title]
,st.[Name] AS [SalesTerritory]
,soh.[SubTotal]
,YEAR(DATEADD(m, 6, soh.[OrderDate])) AS [FiscalYear]
FROM [Sales].[SalesPerson] sp
INNER JOIN [Sales].[SalesOrderHeader] soh
ON sp.[SalesPersonID] = soh.[SalesPersonID]
INNER JOIN [Sales].[SalesTerritory] st
ON sp.[TerritoryID] = st.[TerritoryID]
INNER JOIN [HumanResources].[Employee] e
ON soh.[SalesPersonID] = e.[EmployeeID]
INNER JOIN [Person].[Contact] c
ON e.[ContactID] = c.ContactID
) AS soh
PIVOT
(
SUM([SubTotal])
FOR [FiscalYear]
IN ([2002], [2003], [2004])
) AS pvt),
---------------
-- Second CTE --
---------------
SalesCountPerYear_CTE (SalesPersonID, NumberOfOrders, MaxDate)
AS
(
SELECT SalesPersonID, COUNT(*), MAX(OrderDate)
FROM Sales.SalesOrderHeader
GROUP BY SalesPersonID
)
-------------------
-- Use Both CTEs --
-------------------
SELECT * FROM SalesDollarPerYear_CTE Y
JOIN SalesCountPerYear_CTE S
ON Y.SalesPersonID = S.SalesPersonID

In this example the first CTE SalesDollarsPerYear_CTE is defined using a WITH statement like normal. This CTE returns rolled up Sales.SalesOrder history information by sales person. The rolled up information summarizes the order SubTotal amounts by fiscal year. The second CTE is defined right after the first CTE, using the same WITH clause, by placing a comma after the first CTE. Once both CTEs are defined, an easy to read SELECT statement references each CTE. In this case, I joined the output of each CTE based on SalesPersonID.

Not only can you define multiple CTEs and reference them in a single SELECT statement, but you can also have a CTE that references another CTE. In order to do this all you need to do is define the referenced CTE prior to using it. Here is an example where my first CTE is referenced inside the second CTE definition.

USE AdventureWorks;
GO
WITH
---------------
-- First CTE --
---------------
ProductSales_CTE (ProductName, TotalSold)
AS
(
SELECT P.Name, COUNT(*)
FROM Sales.SalesOrderDetail SOD
RIGHT OUTER JOIN Production.Product P
ON SOD.ProductID = P.ProductID
GROUP BY P.Name
),
---------------------------------------------
-- Second CTE that reference the first CTE --
---------------------------------------------
WorstBestProduct_CTE (ProductName, TotalSold)
AS
(
SELECT TOP 1 ProductName, TotalSold
FROM ProductSales_CTE
ORDER BY TotalSold DESC
UNION
SELECT TOP 1 ProductName, TotalSold
FROM ProductSales_CTE
ORDER BY TotalSold ASC
)
--------------------------
-- Using the Second CTE --
--------------------------
SELECT ProductName, TotalSold
FROM WorstBestProduct_CTE;

Here you can see that I first defined a CTE named "ProductSales_CTE". This CTE calculates the number of items sold for each Production.Product . The second CTE "WorstBestProduct_CTE" references the ProductSales_CTE. In this second CTE, I return the Productname that sold the fewest number of items, as well as the ProductName that sold the most number of items. This was done by referencing the "ProductSales_CTE" twice in my second CTE. The two different SELECT statements sort the ProductSales_CTE output on TotalSales, first in descending order, than in ascending order, and select the first record returned. After both CTEs were defined, the final SELECT statement only needs to reference the "WorstBestProduct_CTE" to execute both CTEs.

MAXRECUSION

When you are writing recursive CTEs, there is a possibility that the code you write could cause an infinite loop. An infinite loop would occur if the recursive part of your CTE always returned at least one row. Here is an example of a recursive CTE that causes an infinite loop.

USE AdventureWorks;
GO
WITH MyCTE(Number) AS
(
SELECT 1 AS Number
UNION ALL
SELECT R.Number + 1
FROM MyCTE AS R
)
SELECT Number
FROM MyCTE;

This is a very simple recursive CTE. This CTE code demonstrates what happens when an infinite loop is coded. This CTE just generates a list of numbers starting from 1 and increments them by 1 for each new number returned from the CTE. Because the recursive part of this CTE does not indicate a stopping point the MAXRECUSION setting will control when this recursion query will stop.

MAXRECUSION can be used to control the number times the recursive part of the query (the SELECT statement that references the CTE) is executed. The default value for MAXRECUSION is 100. If you execute the above code, you will find that after 100 executions of the second SELECT statement in "MyCTE" the following error will occur:

Msg 530, Level 16, State 1, Line 1

The statement terminated. The maximum recursion 100 has been exhausted before statement completion.

After the error occurs, if you look at the output produced, you will see that 101 different numbers were created.

If the logic of your CTE and the data that your CTE works against requires more iteration than the default value for MAXRECUSION, you can override the default number by using the MAXRECURSION query hint. Here is an example where I have modified the above TSQL batch to perform 200 iteration of the recursive SELECT statement in "MyCTE".

WITH MyCTE(Number) AS
(
SELECT 1 AS Number
UNION ALL
SELECT R.Number + 1
FROM MyCTE AS R
)
SELECT Number
FROM MyCTE 
OPTION(MAXRECURSION 200);
GO

Note that since my CTE is still an infinite loop CTE, that when 200 iteration are reached, an error similar error to above is still thrown.

The maximum number you can set when using the MAXRECUSIVE query hint is 32,767. That does not mean if your data and CTE needs 40,000 iterations to complete that you cannot build a CTE. If you need your CTE to perform more than 32,767 recursive iterations, you can also set the MAXRECUSIVE value to 0 (zero). This tells the optimizer to let this query run through as many iterations as needed to complete your CTE. Here is a modified example of my infinite loop CTE where I set the MAXRECURSIVE value to 0.

WITH MyCTE(Number) AS
(
SELECT 1 AS Number
UNION ALL
SELECT R.Number + 1
FROM MyCTE AS R
WHERE R.Number < 100000
)
SELECT Number
FROM MyCTE 
OPTION(MAXRECURSION 0);
GO

As you can see I have now change my infinite loop CTE so the recursive SELECT statement stops after returning 100000 different numbers. Note, that if your recursive part of your query never returns an empty set, then your CTE query will become an infinite loop. Therefore, be careful when you set the MAXRECURSIVE value to 0.

Using a CTE Within a Cursor

A CTE can also be referenced within a cursor definition. Here is a simple CTE that is referenced in a cursor that retrieves employee records one at a time.

USE [AdventureWorks]
GO
DECLARE Emp_Cursor CURSOR FOR
----------------------
-- CTE defined here --
----------------------
WITH Employee_CTE (EmployeeID, FirstName, MiddleName, LastName, Title, Phone,
EmailAddress, EmployeeAddress, EmployeeCity)
AS (
SELECT
e.[EmployeeID]
,c.[FirstName]
,c.[MiddleName]
,c.[LastName]
,e.[Title] AS [JobTitle]
,c.[Phone]
,c.[EmailAddress]
,a.[AddressLine1]
,a.[City]
FROM [HumanResources].[Employee] e
INNER JOIN [Person].[Contact] c
ON c.[ContactID] = e.[ContactID]
INNER JOIN [HumanResources].[EmployeeAddress] ea
ON e.[EmployeeID] = ea.[EmployeeID]
INNER JOIN [Person].[Address] a
ON ea.[AddressID] = a.[AddressID])
SELECT LastName + ' ,' + FirstName AS EmpName,
EmailAddress
FROM Employee_CTE;
DECLARE @EmpName varchar(100);
DECLARE @Email varchar(100);
DECLARE @I INT;
SET @I = 0;
OPEN Emp_Cursor;
FETCH NEXT FROM Emp_Cursor INTO @EmpName, @Email;
WHILE @@FETCH_STATUS = 0
BEGIN;
SET @I = @I + 1;
PRINT 'Employer #' + CAST(@I AS CHAR(3));
PRINT 'Employname = ' + @EmpName + CHAR(13) +
'Email = ' + @Email + CHAR(13) + CHAR(13);
FETCH NEXT FROM Emp_Cursor INTO @EmpName, @Email;
END;
CLOSE Emp_Cursor;
DEALLOCATE Emp_Cursor;

In this example, I created a CTE named "Employee_CTE". In order to use that CTE in a cursor, I had to define it right inside the "DECLARE Emp_Cursor CURSOR" statement. Some people make the mistake of trying to define their cursor right before the "DECLARE …CURSOR" statement. When you do this, you will get an error. Once I have defined my CTE inside my cursor definition, the rest of the cursor processing is similar to processing any other cursor.

What Cannot Be Used Within a CTE

Not all T-SQL statements and clauses can be used within a CTE. In some cases certain statements/clauses just cannot be used in a recursive CTEs.

Below, is a list of those statements and/or clauses that cannot be used in ANY CTE.

  • COMPUTE or COMPUTE BY
  • ORDER BY (except when a TOP clause is specified)
  • INTO
  • OPTION clause with query hints
  • FOR XML
  • FOR BROWSE

Here are the statements and/or clauses that cannot be used in a recursive CTE:

  • SELECT DISTINCT
  • GROUP BY
  • HAVING
  • Scalar aggregation (meaning you can't use min or max)
  • TOP
  • LEFT, RIGHT, OUTER JOIN (INNER JOIN is allowed)
  • Subqueries
  • A hint applied to a recursive reference to a CTE inside a CTE_query_definition.

Simplify Your T-SQL using CTEs

CTEs are very much like creating a view or temporary table, but they can be called recursively. Being able to define multiple CTEs that join two complicated T-SQL statements makes your T-SQL code easier to write, debug and maintain. The power of multiple CTEs and recursion allows you to write your code in a series of multiple building blocks of simple code. By using a CTE, you can simplify your complicated T-SQL logic.

» See All Articles by Columnist Gregory A. Larsen








The Network for Technology Professionals

Search:

About Internet.com

Legal Notices, Licensing, Permissions, Privacy Policy.
Advertise | Newsletters | E-mail Offers