Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» Slideshows
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

MS SQL

Posted August 1, 2016

Temporal Data Part 3 – Reporting Out Current and Historical Information

By Greg Larsen

As you start creating temporal tables to track history information you might find it necessary to join the history records with the current records into a single set of records.  With the introduction of temporal table support in SQL Server 2016 Microsoft also added some additional functionality that makes it easy for you to join the current and history records of a system-versioned table. In this article I will show you some of the different ways to do analysis of your system-versioned records over time.

Business Case

Management is always looking for ways to compare past historical information with current information to produce trend reports.  With these trend reports management is armed with information to track business metrics over time.   Along with their direction to track data changes over time they now have the following reporting requirements to display historical trend information over time.  Management has asked the SQL Team to demonstrate how new temporal data support functionality makes it easy to track historical information.

Sample Data Salary History Information

Prior to showing you how to write a single query that meets the “Business Situation” documented above, I first need to create a dbo.EmpSalary table record and then modify it a few times.  To perform that automation I will run the following code:

USE TemporalDemo;
GO
CREATE TABLE dbo.EmpSalary  
(    
     EmpID int NOT NULL PRIMARY KEY CLUSTERED  
   , SalaryAmt decimal (10,2) NULL  
   , SalaryBeginDT datetime2 GENERATED ALWAYS AS ROW START NOT NULL  
   , SalaryEndDT datetime2 GENERATED ALWAYS AS ROW END NOT NULL  
   , PERIOD FOR SYSTEM_TIME (SalaryBeginDT, SalaryEndDT)     
)    
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.EmpSalaryHistory));
 
INSERT INTO EmpSalary (EmpID, SalaryAmt)
VALUES (1,35899.00);
WAITFOR DELAY '00:05:00';
UPDATE EmpSalary 
   SET SalaryAmt=45999.00
WHERE EmpID = 1;
WAITFOR DELAY '00:05:00';
UPDATE  EmpSalary 
   SET SalaryAmt=52499.00
WHERE EmpID = 1;
WAITFOR DELAY '00:05:00';
UPDATE EmpSalary 
   SET SalaryAmt=65000.00
WHERE EmpID = 1;
 

In this code as you can see I created my dbo.EmpSalary and dbo.EmpSalaryHistory tables.  Once the table was created I then inserted an employee salary record, and then updated the salary record three different times.   I used the WAITFOR DELAY command to make it 5 minutes between each of the above statements that entered data into my table. This was used to simulate the salary being changed over time. 

Showing All Records Current and History

In order to track the salary history information of a specific employee over time I will need to read both the primary dbo.EmpSalary table, as well as the dbo.EmpSalarytable records.  By combining data from both tables I will be able to show the duration of time a given employee had a particular salary.  

To build a report that shows how the employee’s salary changed over time I will run the following code:

SELECT    
     EmpId
   , SalaryAmt
   , IIF (YEAR(SalaryEndDT) = 9999, 
             '999999',DATEDIFF(mi,SalaryBeginDT, SalaryEndDT)) AS Duration
   , SalaryBeginDT   
   , SalaryEndDT 
   , IIF (YEAR(SalaryEndDT) = 9999, 'Current', 'History') AS RecType 
FROM EmpSalary FOR SYSTEM_TIME ALL   
ORDER BY EmpID, SalaryBeginDT Desc  

When I run the code above I get the following results:

EmpId SalaryAmt Duration SalaryBeginDT               SalaryEndDT                 RecType
----- --------- -------- --------------------------- --------------------------- --------
1     65000.00  999999   2016-07-12 13:24:43.0419579 9999-12-31 23:59:59.9999999 Current
1     52499.00  5        2016-07-12 13:19:43.0395947 2016-07-12 13:24:43.0419579 History
1     45999.00  5        2016-07-12 13:14:43.0364613 2016-07-12 13:19:43.0395947 History
1     35899.00  5        2016-07-12 13:09:43.0261472 2016-07-12 13:14:43.0364613 History

If you look at the results of this query you can see that 3 of the 4 records have a RecType of “History”.  This column identifies that these three records came for the history table associated with my EmpSalary table.  Additionally for each record I calculated a Duration column that shows how long each record was active.  In my case because I added a new salary every five minutes all the history records have a duration of 5 minutes, whereas the current records has a duration of 999999.

Finding Specific Records Active for a Given Timeframe

Management also want to run some point in time reports.  They wanted to look at a given date and time and then determine which record is active for the specific data/time.  Or they wanted to look at a date/time range and then identify all records that were active during that time period. 

With temporal data you can use the “AS OF” clause as I have done below to find out which salary records was active for a specific date/time:

  SELECT    
     EmpId
   , SalaryAmt
   , IIF (YEAR(SalaryEndDT) = 9999, '999999',DATEDIFF(mi,SalaryBeginDT, SalaryEndDT)) AS Duration
   , SalaryBeginDT   
   , SalaryEndDT 
   , IIF (YEAR(SalaryEndDT) = 9999, 'Current', 'History') AS RecType 
FROM EmpSalary FOR SYSTEM_TIME AS OF '2016-07-12 13:19';

When I run this code I get the following results:

EmpId SalaryAmt Duration SalaryBeginDT               SalaryEndDT                 RecType
----- --------- -------- --------------------------- --------------------------- -------
1     45999.00  5        2016-07-12 13:14:43.0364613 2016-07-12 13:19:43.0395947 History

Here you can see the SalaryBeginDT is less than or equal to the “AS OF” date value (‘2016-07-12 13:19’), and the SalaryEndDT is greater or equal to the “AS OF” date/time.   By using the “AS OF” constraint it was easy to find the employee salary record that was active on a particular date/time.  

Let me show you an edge case of using the “AS OF” data to determine what happens when you specify the “2016-07-12 13:19:43.0395947”.  This date as you can see is the SalaryEndDt of the above record. Here is the code to retrieve the record that SQL Server thinks is active for “2016-07-12 13:19:43.0395947”:

SELECT    
     EmpId
   , SalaryAmt
   , IIF (YEAR(SalaryEndDT) = 9999, '999999',DATEDIFF(mi,SalaryBeginDT, SalaryEndDT)) AS Duration
   , SalaryBeginDT   
   , SalaryEndDT 
   , IIF (YEAR(SalaryEndDT) = 9999, 'Current', 'History') AS RecType 
FROM EmpSalary FOR SYSTEM_TIME AS OF '2016-07-12 13:19:43.0395947';

When I run this code I get the following output:

EmpId SalaryAmt Duration SalaryBeginDT               SalaryEndDT                 RecType
----- --------- -------- --------------------------- --------------------------- -------
1     52499.00  5        2016-07-12 13:19:43.0395947 2016-07-12 13:24:43.0419579 History

Here you can see I got a totally different record than I had in the last query.  The record returned from this query has a begin date equal to the “AS OF” date of the query.  This shows that SQL Server doesn’t include records that have an end date/time equal to the “AS OF” date.

Finding all Records Valid for a Date Range

If you need to find all the historical records that were active for a given start and end date/time you can use the “FROM...TO”, “BETWEEN ...AND” or the “CONTAINED IN” clause.  Below are three similar examples where each one uses the same date/time, with a different data range clause:

-- Example #1 using FROM...TO
SELECT    
     EmpId
   , SalaryAmt
   , IIF (YEAR(SalaryEndDT) = 9999, '999999',DATEDIFF(mi,SalaryBeginDT, SalaryEndDT)) AS Duration
   , SalaryBeginDT   
   , SalaryEndDT 
   , IIF (YEAR(SalaryEndDT) = 9999, 'Current', 'History') AS RecType 
FROM EmpSalary FOR SYSTEM_TIME FROM '2016-07-12 13:19:42' TO '2016-07-12 13:24:43.0419579'
ORDER BY SalaryEndDT DESC;
-- Example #2 Using CONTAINED IN
SELECT    
     EmpId
   , SalaryAmt
   , IIF (YEAR(SalaryEndDT) = 9999, '999999',DATEDIFF(mi,SalaryBeginDT, SalaryEndDT)) AS Duration
   , SalaryBeginDT   
   , SalaryEndDT 
   , IIF (YEAR(SalaryEndDT) = 9999, 'Current', 'History') AS RecType 
FROM EmpSalary FOR SYSTEM_TIME CONTAINED IN( '2016-07-12 13:19:42', '2016-07-12 13:24:43.0419579')
ORDER BY SalaryEndDT DESC;
 -- Example #3 Using BETWEEN ... AND
 SELECT    
     EmpId
   , SalaryAmt
   , IIF (YEAR(SalaryEndDT) = 9999, '999999',DATEDIFF(mi,SalaryBeginDT, SalaryEndDT)) AS Duration
   , SalaryBeginDT   
   , SalaryEndDT 
   , IIF (YEAR(SalaryEndDT) = 9999, 'Current', 'History') AS RecType 
FROM EmpSalary FOR SYSTEM_TIME BETWEEN '2016-07-12 13:19:42' AND '2016-07-12 13:24:43.0419579'
ORDER BY SalaryEndDT DESC;
 

When I run this code I get the following output, where there is a different record set for each SELECT statement.

 
 
 
EmpId SalaryAmt Duration SalaryBeginDT               SalaryEndDT                 RecType
----- --------- -------- --------------------------- --------------------------- -------
1     52499.00  5        2016-07-12 13:19:43.0395947 2016-07-12 13:24:43.0419579 History
1     45999.00  5        2016-07-12 13:14:43.0364613 2016-07-12 13:19:43.0395947 History
 
(2 row(s) affected)
 
EmpId SalaryAmt Duration SalaryBeginDT               SalaryEndDT                 RecType
----- --------- -------- --------------------------- --------------------------- -------
1     52499.00  5        2016-07-12 13:19:43.0395947 2016-07-12 13:24:43.0419579 History
 
(1 row(s) affected)
EmpId SalaryAmt Duration SalaryBeginDT               SalaryEndDT                 RecType
----- --------- -------- --------------------------- --------------------------- -------
1     65000.00  999999   2016-07-12 13:24:43.0419579 9999-12-31 23:59:59.9999999 Current
1     52499.00  5        2016-07-12 13:19:43.0395947 2016-07-12 13:24:43.0419579 History
1     45999.00  5        2016-07-12 13:14:43.0364613 2016-07-12 13:19:43.0395947 History
 
(3 row(s) affected)
 

If you review my code above you can see that I issued three different SELECT statements where each statement used a different expression in the FOR SYSTEM_TIME clause.  The first statement used the “FROM ... TO” expression, the second statement used the CONTAINED IN expression, and the last statement used the BETWEEN expression.  Note that I used the same from and to dates and times for each of these different SELECT statements, yet each one produced a different result set. 

The FROM ... TO expression in the first statement returns rows that where active during some period between the begin date and end date of the expression.  In this example my date range is runs with a begin date of '2016-07-12 13:19:42' to an end date of '2016-07-12 13:24:43.0419579'.  If you look at result set 1 you can see two rows were returned.  Where the first row had a SaleryBeginDT value of “2016-07-12 13:19:43.0395947” and SalaryEndDT of “2016-07-12 13:24:43.0419579which you can see is within the range of the FROM and TO dates.  In fact the end data of that record has the same end date as the “FROM...TO’ value.  Note that the “FROM...TO” date range only covers the last part of the date range of second rows. Therefore this row was active during the date/time range, but only partially.

In the second example only one row was returned.   This is because of the CONTAINED IN clause.  With the CONTAINED IN clause the records returned are those that are contained with the date range specified.  Meaning begin date and end date of the current or history records needs to be within the date range specified in the CONTAINED IN clause.  There is only a single record in my salary tables that meet this criteria.

The third example the SELECT statement in the code above used the “BETWEEN...AND” expression to identify the range value for searching.  In my example the following “BETWEEN...AND” clauses used these two dates/times for the date range: “2016-07-12 13:19:42” and “2016-07-12 13:24:43.0419579”.  As you can see this query returned three rows.  This query additionally returned the current employee salary record.  The “BETWEEN...AND” clause selects records that begin on the end date range value.

By using any one of these three clauses you should be able to write some T-SQL that will bring back any of the current or history rows that meet your date criteria, with a single SELECT statement.

Summary

As you can see it is quite easy to write a single select statement to bring back different sets of current and historical rows.  By using the “FOR SYSTEM_DATE ALL” criteria you can select all current and history records with a single SELECT statement.  If you want to bring back just a subset of current or historical rows you could use the “FROM...TO”, “BETWEEN ... AND” or the “CONTAINED IN” clauses to return just the rows you need.  Next time you want to query both current and history tables without building a union query consider using one of the SYSTEM_DATE selection criteria’s.

See all articles by Greg Larsen



MS SQL Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 




Latest Forum Threads
MS SQL Forum
Topic By Replies Updated
SQL 2005: SSIS: Error using SQL Server credentials poverty 3 August 17th, 07:43 AM
Need help changing table contents nkawtg 1 August 17th, 03:02 AM
SQL Server Memory confifuration bhosalenarayan 2 August 14th, 05:33 AM
SQL Server Primary Key and a Unique Key katty.jonh 2 July 25th, 10:36 AM