Real-time Operational Analytics in SQL Server 2016 – Part 2

Introduction

SQL Server 2016 introduces the Real-time Operational Analytics feature, which allows you to run both transactional/operational and analytics workloads on the same system – without having the need for another analytics system for reporting.

Analytics Architecture Multiple Sources

--Load data to these tables
INSERT INTO   SalesOrderDetailWithRowStoreClusteredIndexAndRowStoreNonClusteredIndex (SalesOrderID,   CarrierTrackingNumber,   
OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, LineTotal, rowguid, ModifiedDate)
SELECT SalesOrderID, CarrierTrackingNumber,   OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, 
LineTotal, rowguid, ModifiedDate FROM [AdventureWorks2014].Sales.[SalesOrderDetail]
GO 85
INSERT INTO SalesOrderDetailWithRowStoreClusteredIndexAndNonClusteredColumnStoreIndex (SalesOrderID,   CarrierTrackingNumber,   
OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, LineTotal, rowguid, ModifiedDate)
SELECT SalesOrderID, CarrierTrackingNumber,   OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, 
LineTotal, rowguid, ModifiedDate FROM [AdventureWorks2014].Sales.[SalesOrderDetail]
GO 85

Now let me create a clustered index (row-store) and one non-clustered index (row-store) on the first table.

--Creating Indexes - Row Store index clustered index and Row   Store non-clustered index
CREATE CLUSTERED INDEX   [CIRS_SalesOrderDetailWithRowStoreClusteredIndexAndRowStoreNonClusteredIndex_SalesOrderID_SalesOrderDetailID]
ON   SalesOrderDetailWithRowStoreClusteredIndexAndRowStoreNonClusteredIndex
(SalesOrderID, SalesOrderDetailID)
GO
CREATE NONCLUSTERED INDEX   [NCRS_SalesOrderDetailWithRowStoreClusteredIndexAndRowStoreNonClusteredIndex_All]
ON SalesOrderDetailWithRowStoreClusteredIndexAndRowStoreNonClusteredIndex
(SalesOrderID, OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, LineTotal)
GO

Next, I create a clustered index (row-store) and a non-clustered columnstore index on the second table.

--Creating Indexes - Row Store clustered index and non-clustered   Column Store index
CREATE CLUSTERED INDEX   [CIRS_SalesOrderDetailWithRowStoreClusteredIndexAndNonClusteredColumnStoreIndex_SalesOrderID_SalesOrderDetailID]
ON SalesOrderDetailWithRowStoreClusteredIndexAndNonClusteredColumnStoreIndex
(SalesOrderID, SalesOrderDetailID)
 GO
CREATE NONCLUSTERED COLUMNSTORE INDEX   [NCCS_SalesOrderDetailWithRowStoreClusteredIndexAndNonClusteredColumnStoreIndex_All]
ON   SalesOrderDetailWithRowStoreClusteredIndexAndNonClusteredColumnStoreIndex
(SalesOrderID, OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, LineTotal)
GO
--Creating Non-Clustered Columnstore Index (NCCI) is just a DDL   operation and similar to any other 
--btree index that you would create on a rowstore table. No   changes to the application needed 
--because the NCCI is updateable in SQL Server 2016 so your transaction   workload will continue to run.

I have loaded same number of rows (~10 million) in each of these tables, and this is how you can verify it:

SELECT Count(*) FROM   SalesOrderDetailWithRowStoreClusteredIndexAndRowStoreNonClusteredIndex
SELECT Count(*) FROM   SalesOrderDetailWithRowStoreClusteredIndexAndNonClusteredColumnStoreIndex
GO

Verifying the Number of Rows
Verifying the Number of Rows

Now we can run an aggregation query on the columns included in the index as shown below:

SELECT ProductID, SUM(OrderQty) AS 'QuatityOrdered', SUM(LineTotal) AS 'ProductWiseSale' FROM 
SalesOrderDetailWithRowStoreClusteredIndexAndRowStoreNonClusteredIndex
GROUP BY ProductID
ORDER BY ProductID
 
SELECT ProductID, SUM(OrderQty) AS 'QuatityOrdered', SUM(LineTotal) AS 'ProductWiseSale' FROM 
SalesOrderDetailWithRowStoreClusteredIndexAndNonClusteredColumnStoreIndex
GROUP BY ProductID
ORDER BY ProductID

If we look at the execution plan of the above two queries, the first query (which involves a table with clustered index (row-store) and non-clustered index (row-store)) has 96% relative cost in the batch with respect to the second query (which involves a table with clustered index (row-store) and a non-clustered columnstore index) which has 4% relative cost in the batch. You can also notice, the second query runs significantly faster because it makes use of non-clustered columnstore index (which is update-able now in SQL Server 2016 and can be created on any table without making it read-only like previous versions of SQL Server) whereas first query makes use of a non-clustered index and runs slower (not that optimal for analytical queries, as you can guess):

Execution Plan
Execution Plan

When there is a non-clustered columnstore index, query optimizer picks it up for analytics queries when it finds it suitable and provides significant speed up over traditional B-Tree indexes. Now if we look at the IO (Input\Output) statistics of the above two queries this is how it looks:

(266 row(s) affected)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderDetailWithRowStoreClusteredIndexAndRowStoreNonClusteredIndex'. Scan count 5, logical reads 64233, physical reads 2, read-ahead reads 62935, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
 
(266 row(s) affected)
Table 'SalesOrderDetailWithRowStoreClusteredIndexAndNonClusteredColumnStoreIndex'. Scan count 8, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 11116, lob physical reads 38, lob read-ahead reads 41988.
Table 'SalesOrderDetailWithRowStoreClusteredIndexAndNonClusteredColumnStoreIndex'. Segment reads 11, segment skipped 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

If we look at the CPU time statistics of the above two queries this is how it looks:

(266 row(s) affected)
 
 SQL Server Execution Times:
   CPU time = 4688 ms,  elapsed time = 3743 ms.
 
(266 row(s) affected)
 
 SQL Server Execution Times:
   CPU time = 156 ms,  elapsed time = 210 ms.

The conclusion here is the second query, which uses non-clustered columnstore index runs several times faster than the first query, which uses only row-store (B-Tree) indexes, making it possible to run analytical queries side by side without having much impact on the operational system.

For columnstore index (whether clustered or non-clustered), data gets stored in row groups and then to column segments; you can use the queries below to find out information about them along with the number of rows\values stored and disk size:

SELECT object_name(object_id) as ObjectName, * FROM sys.column_store_row_groups
ORDER BY object_id, row_group_id
 
SELECT object_name(p.object_id) AS ObjectName, C.column_id, C.segment_id , C.partition_id, sum(C.on_disk_size) AS on_disk_size, SUM(C.row_count) AS row_count
FROM sys.column_store_segments   C
INNER JOIN sys.partitions p
ON C.partition_id=p.partition_id
GROUP BY P.object_id, C.partition_id, C.column_id, C.segment_id 
ORDER BY P.object_id, C.partition_id, C.column_id, C.segment_id

Conclusion

SQL Server 2016 introduces Real-time Operational Analytics, the ability to run both operational\transactional and analytics workloads on the same database tables at the same time.

In this article we looked into different design considerations for leveraging the Real-time Operational Analytics feature, looked into different scenarios where it fits or where it does not fit, and then finally we got started with it for disk based tables.

In the next article of the series, we will learn about creating a filtered non-clustered columnstore index to minimize the overhead of maintaining it and will learn about making use of this feature on memory-optimized tables.

Resources

Real-time Operational Analytics in SQL Server 2016 – Part 1

Get started with Columnstore for real time operational analytics

Understanding New Column Store Index of SQL Server 2012

New Enhanced Column Store Index in SQL Server 2014

What’s New in SQL Server 2016

See all articles by Arshad Ali

Arshad Ali
Arshad Ali
Arshad Ali works with Microsoft India R&D Pvt Ltd. He has 8+ years of experience, mostly on Microsoft Technologies. Most recently, as a SQL Developer and BI Developer he has been working on a Data Warehousing project. Arshad is an MCSD, MCITP: Business Intelligence, MCITP: Database Developer 2008 and MCITP: Database Administrator 2008 certified and has presented at several technical events including SQL-School. On an educational front, he has an MCA (Master in Computer Applications) and MBA-IT. Disclaimer : I work for Microsoft and help people and businesses make better use of technology to realize their full potential. The opinions mentioned herein are solely mine and do not reflect those of my current employer or previous employers.

Latest Articles