Introduction
SQL Server 2016 introduces the Real-time Operational Analytics feature, which allows you to run both transactional/operational and analytics workloads on the same system – without having the need for another analytics system for reporting.
--Load data to these tables INSERT INTO SalesOrderDetailWithRowStoreClusteredIndexAndRowStoreNonClusteredIndex (SalesOrderID, CarrierTrackingNumber, OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, LineTotal, rowguid, ModifiedDate) SELECT SalesOrderID, CarrierTrackingNumber, OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, LineTotal, rowguid, ModifiedDate FROM [AdventureWorks2014].Sales.[SalesOrderDetail] GO 85 INSERT INTO SalesOrderDetailWithRowStoreClusteredIndexAndNonClusteredColumnStoreIndex (SalesOrderID, CarrierTrackingNumber, OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, LineTotal, rowguid, ModifiedDate) SELECT SalesOrderID, CarrierTrackingNumber, OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, LineTotal, rowguid, ModifiedDate FROM [AdventureWorks2014].Sales.[SalesOrderDetail] GO 85
Now let me create a clustered index (row-store) and one non-clustered index (row-store) on the first table.
--Creating Indexes - Row Store index clustered index and Row Store non-clustered index CREATE CLUSTERED INDEX [CIRS_SalesOrderDetailWithRowStoreClusteredIndexAndRowStoreNonClusteredIndex_SalesOrderID_SalesOrderDetailID] ON SalesOrderDetailWithRowStoreClusteredIndexAndRowStoreNonClusteredIndex (SalesOrderID, SalesOrderDetailID) GO CREATE NONCLUSTERED INDEX [NCRS_SalesOrderDetailWithRowStoreClusteredIndexAndRowStoreNonClusteredIndex_All] ON SalesOrderDetailWithRowStoreClusteredIndexAndRowStoreNonClusteredIndex (SalesOrderID, OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, LineTotal) GO
Next, I create a clustered index (row-store) and a non-clustered columnstore index on the second table.
--Creating Indexes - Row Store clustered index and non-clustered Column Store index CREATE CLUSTERED INDEX [CIRS_SalesOrderDetailWithRowStoreClusteredIndexAndNonClusteredColumnStoreIndex_SalesOrderID_SalesOrderDetailID] ON SalesOrderDetailWithRowStoreClusteredIndexAndNonClusteredColumnStoreIndex (SalesOrderID, SalesOrderDetailID) GO CREATE NONCLUSTERED COLUMNSTORE INDEX [NCCS_SalesOrderDetailWithRowStoreClusteredIndexAndNonClusteredColumnStoreIndex_All] ON SalesOrderDetailWithRowStoreClusteredIndexAndNonClusteredColumnStoreIndex (SalesOrderID, OrderQty, ProductID, SpecialOfferID, UnitPrice, UnitPriceDiscount, LineTotal) GO --Creating Non-Clustered Columnstore Index (NCCI) is just a DDL operation and similar to any other --btree index that you would create on a rowstore table. No changes to the application needed --because the NCCI is updateable in SQL Server 2016 so your transaction workload will continue to run.
I have loaded same number of rows (~10 million) in each of these tables, and this is how you can verify it:
SELECT Count(*) FROM SalesOrderDetailWithRowStoreClusteredIndexAndRowStoreNonClusteredIndex SELECT Count(*) FROM SalesOrderDetailWithRowStoreClusteredIndexAndNonClusteredColumnStoreIndex GO
Verifying the Number of Rows
Now we can run an aggregation query on the columns included in the index as shown below:
SELECT ProductID, SUM(OrderQty) AS 'QuatityOrdered', SUM(LineTotal) AS 'ProductWiseSale' FROM SalesOrderDetailWithRowStoreClusteredIndexAndRowStoreNonClusteredIndex GROUP BY ProductID ORDER BY ProductID SELECT ProductID, SUM(OrderQty) AS 'QuatityOrdered', SUM(LineTotal) AS 'ProductWiseSale' FROM SalesOrderDetailWithRowStoreClusteredIndexAndNonClusteredColumnStoreIndex GROUP BY ProductID ORDER BY ProductID
If we look at the execution plan of the above two queries, the first query (which involves a table with clustered index (row-store) and non-clustered index (row-store)) has 96% relative cost in the batch with respect to the second query (which involves a table with clustered index (row-store) and a non-clustered columnstore index) which has 4% relative cost in the batch. You can also notice, the second query runs significantly faster because it makes use of non-clustered columnstore index (which is update-able now in SQL Server 2016 and can be created on any table without making it read-only like previous versions of SQL Server) whereas first query makes use of a non-clustered index and runs slower (not that optimal for analytical queries, as you can guess):
Execution Plan
When there is a non-clustered columnstore index, query optimizer picks it up for analytics queries when it finds it suitable and provides significant speed up over traditional B-Tree indexes. Now if we look at the IO (Input\Output) statistics of the above two queries this is how it looks:
(266 row(s) affected) Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'SalesOrderDetailWithRowStoreClusteredIndexAndRowStoreNonClusteredIndex'. Scan count 5, logical reads 64233, physical reads 2, read-ahead reads 62935, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. (266 row(s) affected) Table 'SalesOrderDetailWithRowStoreClusteredIndexAndNonClusteredColumnStoreIndex'. Scan count 8, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 11116, lob physical reads 38, lob read-ahead reads 41988. Table 'SalesOrderDetailWithRowStoreClusteredIndexAndNonClusteredColumnStoreIndex'. Segment reads 11, segment skipped 0. Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
If we look at the CPU time statistics of the above two queries this is how it looks:
(266 row(s) affected) SQL Server Execution Times: CPU time = 4688 ms, elapsed time = 3743 ms. (266 row(s) affected) SQL Server Execution Times: CPU time = 156 ms, elapsed time = 210 ms.
The conclusion here is the second query, which uses non-clustered columnstore index runs several times faster than the first query, which uses only row-store (B-Tree) indexes, making it possible to run analytical queries side by side without having much impact on the operational system.
For columnstore index (whether clustered or non-clustered), data gets stored in row groups and then to column segments; you can use the queries below to find out information about them along with the number of rows\values stored and disk size:
SELECT object_name(object_id) as ObjectName, * FROM sys.column_store_row_groups ORDER BY object_id, row_group_id SELECT object_name(p.object_id) AS ObjectName, C.column_id, C.segment_id , C.partition_id, sum(C.on_disk_size) AS on_disk_size, SUM(C.row_count) AS row_count FROM sys.column_store_segments C INNER JOIN sys.partitions p ON C.partition_id=p.partition_id GROUP BY P.object_id, C.partition_id, C.column_id, C.segment_id ORDER BY P.object_id, C.partition_id, C.column_id, C.segment_id
Conclusion
SQL Server 2016 introduces Real-time Operational Analytics, the ability to run both operational\transactional and analytics workloads on the same database tables at the same time.
In this article we looked into different design considerations for leveraging the Real-time Operational Analytics feature, looked into different scenarios where it fits or where it does not fit, and then finally we got started with it for disk based tables.
In the next article of the series, we will learn about creating a filtered non-clustered columnstore index to minimize the overhead of maintaining it and will learn about making use of this feature on memory-optimized tables.
Resources
Real-time Operational Analytics in SQL Server 2016 – Part 1
Get started with Columnstore for real time operational analytics
Understanding New Column Store Index of SQL Server 2012