Column Store Index, which improves performance of data warehouse queries several folds, was first introduced in SQL Server 2012. Unlike regular indexes or heaps, which store data in B-Tree structure (in row-wise fashion), the column store index stores data in columnar fashion and uses compression aggressively to reduce the disk I/O needed to serve the query request. Apart from the benefits it provides, it has several limitations in SQL Server 2012. SQL Server 2014 enhances the columnstore index and overcomes several of the earlier limitations. In this article, I am going to talk about the new enhanced columnstore index feature in SQL Server 2014.
What’s Improved in SQL Server 2014 with Respect to Columnstore Index of SQL Server 2012
Columnstore index has been designed to substantially accelerate common data warehouse queries, which require scanning, aggregation and filtering of large amounts of data or joining multiple tables like a star schema (primarily performs bulk loads and read-only queries). Although it has several limitation in SQL Server 2012, SQL Server 2014 overcomes them:
- We can create only one non-clustered column store index on a table although we can include all the columns of the table in the single index.
- SQL Server 2014 enhances it to add support for creating Clustered Columnstore Index.
- When we create a column store index and it makes the table read only,
- With SQL Server 2014, you can create a columnstore index without having much impact on write-ability on the table. This means you can issue some INSERT, UPDATE, DELETE statements with a table with clustered columnstore index. No more tedious workaround is required for writing data to a table with columnstore index in this release like the previous release.
For columnstore index, ALTER INDEX … REBUILD has a new COLUMNSTORE_ARCHIVE data compression option that further compresses the specified partitions of a columnstore index, resulting in even less disk space being used. You can use this option for archival, or for other situations that require a smaller data storage size and can afford more time for storage and retrieval.
Getting Started with Column Store Index in SQL Server 2014
Here is the basic syntax for creating clustered columnstore index as well as for creating non-clustered columnstore index. For exhaustive detail and explanation, click here.
--Create a clustered columnstore index. CREATE CLUSTERED COLUMNSTORE INDEX index_name ON table_name WITH ( DROP_EXISTING OR MAX_DOP) ON partition_scheme_name | filegroup_name | default --Create a non-clustered columnstore index. CREATE NONCLUSTERED COLUMNSTORE INDEX index_name ON table_name COLUMN columns_name WITH ( DROP_EXISTING OR MAX_DOP) ON partition_scheme_name | filegroup_name | default
Normally, when we create a clustered index on a row store we need to specify columns going to be part of the clustered index whereas clustered columnstore index includes all columns in the table itself and is the method for storing the entire table. This is the reason, if you notice in the above syntax, it does not specify columns with clustered columnstore index.
Points to Remember
- In SQL Server 2014, you can still create a non-clustered columnstore index as you can in SQL Server 2012 though the non-clustered columnstore index is used for read-only queries, and is not updateable. Only clustered columnstore index is updateable.
- You can create clustered columnstore index in the Enterprise, Developer, and Evaluation editions and once created a table cannot have any type of non-clustered index, unique constraints, primary key constraints, or foreign key constraints.
- If your table has a non-clustered columnstore index you can create unique constraints, primary key constraints, or foreign key constraints, though the constraints cannot be included in the non-clustered columnstore index.
- To change definition of non-clustered columnstore index, you must drop and re-create the non-clustered columnstore index instead; you cannot use ALTER INDEX statement. Though you can use ALTER INDEX to disable and rebuild a columnstore index.
- When creating non-clustered columnstore index you cannot include sparse columns and cannot use INCLUDE or ASC or DESC clause.
- When you create clustered columnstore index, the index itself contains the data whereas in the case of non-clustered columnstore index, additional storage is required to store a copy of the columns in the non-clustered columnstore index.
- Clustered columnstore index cannot be combined with any other indexes (it’s the only index) whereas in the case of non-clustered columnstore index you can create other indexes on the table as well.
In this article, I talked about the basics of clustered columnstore index, its benefits, comparison with non-clustered columnstore index and how to get started with creating it. In my next article, I am going to talk in more detail about different concepts, key terms, and internal workings of columnstore index and provide you with some examples and some performance tests.