Lead and Lag Functions in SQL Server 2012

Introduction

Have you ever been in a situation where you needed to write a query that needed to do comparisons or access data from the subsequent rows along with the data from the current row? This article discusses different ways to write these types of queries and more specifically examines LEAD and LAG analytics functions, which were introduced with SQL Server 2012, and helps you understand how leveraging these functions can aid you in such situations.

Accessing Prior or Subsequent Rows

SQL Server 2012 introduced LAG and LEAD functions for accessing prior or subsequent rows along with the current row but before we go into the details of these functions, let me explain how you can write these queries in earlier versions of SQL Server.

Let’s first create a table and load some sample data with the script below. This table contains customer information along with when a specific plan for the customer was started, assuming when a new plan starts, the older one gets ended automatically.

DECLARE   @CustomerPlan TABLE
  (
       CustomerCode VARCHAR(10),
         PlanCode   VARCHAR(10),
       StartDate  DATE
    )
INSERT INTO @CustomerPlan VALUES ('C00001', 'P00001', '1-Sep-2014')
INSERT INTO @CustomerPlan VALUES ('C00001', 'P00002', '1-Oct-2014')
INSERT INTO @CustomerPlan VALUES ('C00001', 'P00003', '10-Oct-2014')
INSERT INTO @CustomerPlan VALUES ('C00001', 'P00004', '25-Oct-2014')
INSERT INTO @CustomerPlan VALUES ('C00002', 'P00001', '1-Oct-2014')
INSERT INTO @CustomerPlan VALUES ('C00002', 'P00002', '1-Nov-2014')
 
SELECT * FROM   @CustomerPlan;

Query Results

You can use Common Table Expression (CTE) along with the ROW_NUMBER ranking function to access subsequent rows in the same result set. For example, for a given customer I want to know the expiration date for the current plan based on the activation date of the next plan. Basically, when a new plan is started the previous plan is automatically ended and hence the end date for a previous plan is the start date minus one day of the next plan:

WITH   CTE as 
(
        SELECT RN = ROW_NUMBER() OVER (PARTITION BY   CustomerCode ORDER BY   StartDate ASC),   * 
        FROM @CustomerPlan
)
SELECT
        [Current   Row].*, 
        ISNULL(DATEADD(DAY, -1, [Next Row].StartDate), '31-Dec-2099') AS EndDate
FROM   CTE [Current Row]
        LEFT JOIN CTE [Next   Row] ON [Current Row].CustomerCode   = [Next Row].CustomerCode   AND  [Next Row].RN   = [Current Row].RN   + 1
ORDER BY [Current Row].CustomerCode, [Current Row].RN;

Query Results

In the image above, you can see the plan P00002 of customer C00001 starts on 1st October 2014 and hence the end date for plan P00001 is the 30th September 2014 (1st October 2014 minus one day), likewise P00003 starts on 10th October and hence the end date for P00002 ends on 9th October 2014.

LEAD Function in SQL Server 2012

Though it was very much possible to write queries as shown above using CTE and ranking function to access prior or subsequent rows in earlier versions of SQL Server, SQL Server 2012 has simplified it further.

SQL Server 2012 introduced LEAD analytic function to access the subsequent row (or columns from the subsequent row) without using self-join or CTE or ranking function.

There are three main parameters for the LEAD function along with the OVER clause, which works in the same way as it works with ranking functions:

  • With the first parameter, you can specify a scalar expression or column name whose value from the subsequent row is to be returned.
  • With the second parameter, you can specify an offset to access not only the next immediate row but any row after the current row. For example, its default value of 1 accesses the next immediate row whereas a value of 3 accesses the third row from the current row.
  • With the third parameter, you can specify the default value to be returned in case the returned value is NULL.

The script below is a re-write of the above script (which uses CTE and ranking function) using the LEAD function, which gives the same result as above, but as you can see this query is more simplified and does not uses self-join.

SELECT *, 
DATEADD(DAY, -1, LEAD(StartDate, 1,'01-Jan-2100') 
        OVER (PARTITION BY   CustomerCode ORDER BY   StartDate ASC))   AS EndDate
FROM   @CustomerPlan

Query Results

LAG Function in SQL Server 2012

Like the LEAD function, LAG is another analytic function introduced in SQL Server 2012 that has the same three parameters as the LEAD function along with the OVER clause, which works in same way as it works with ranking functions. But unlike the LEAD function, which allows accessing subsequent rows, the LAG function allows accessing previous rows from the same result set, again without using self-join. With the second offset parameter, you can specify the physical offset that comes before the current row.

If you look at the structure of the table we created above, each row contains a customer’s current plan. With the help of the LAG function, we will find out the previous (plan from the last row) as shown in the script below:

SELECT   CustomerCode, PlanCode AS CurrentPlanCode,
LAG(PlanCode, 1, 'NA') 
        OVER (PARTITION BY   CustomerCode ORDER BY   StartDate ASC)   AS LastPlan
FROM   @CustomerPlan;

If you notice in the figure below, the second row contains the P00002 as the current plan and P00001 as the last plan (coming the immediate previous row); likewise the third row contains P00003 as the current plan and P00002 as the last plan (again coming from the immediate previous row).

Query Results

LEAD and LAG functions are flexible to let you specify the number of rows to move backward or forward from the current row with the offset parameter. For example, as you can see in the script below I have specified 2 as offset with the LAG function and hence for the third row where the current plan is P00003, the last plan is P00001 (as it goes two rows back to pick up the value).

SELECT   CustomerCode, PlanCode AS CurrentPlanCode,
LAG(PlanCode, 2, 'NA') 
        OVER (PARTITION BY   CustomerCode ORDER BY   StartDate ASC)   AS LastPlan
FROM   @CustomerPlan;

Query Results

Conclusion

In this article I discussed how you can use CTE and ranking function to access or query data from previous or subsequent rows. Then I also talked about how to leverage LEAD and LAG analytics functions, introduced with SQL Server 2012, to achieve the same without writing  self-join query using CTE and ranking function.

Resources

LEAD (Transact-SQL)

LAG (Transact-SQL)

See all articles by Arshad Ali

Arshad Ali
Arshad Ali
Arshad Ali works with Microsoft India R&D Pvt Ltd. He has 8+ years of experience, mostly on Microsoft Technologies. Most recently, as a SQL Developer and BI Developer he has been working on a Data Warehousing project. Arshad is an MCSD, MCITP: Business Intelligence, MCITP: Database Developer 2008 and MCITP: Database Administrator 2008 certified and has presented at several technical events including SQL-School. On an educational front, he has an MCA (Master in Computer Applications) and MBA-IT. Disclaimer : I work for Microsoft and help people and businesses make better use of technology to realize their full potential. The opinions mentioned herein are solely mine and do not reflect those of my current employer or previous employers.

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles