A Look at SQL Server 2008 Change Tracking

April 9, 2010

Before SQL Server 2008, you had to build a custom solution if you wanted to keep track of the changes to the data in your tables. SQL Server 2008 has a new offering called Change Tracking that keeps track of each DML event type and the keys of the row that was affected.

Introduction

Before SQL Server 2008, if you wanted to keep track of the changes to the data in your tables, you had to build a custom solution. This usually involved triggers on the tables being tracked, a table or tables to write the keys and a timestamp to custom code to query the changes, and jobs to cleanup all that extra data. Change Tracking in SQL Server 2008 makes all this customization a thing of the past.

An Overview

Would you like to see what rows have changed in your tables without examining every insert, update, and delete? SQL Server 2008 has a new offering called Change Tracking that, when turned on for a table, will keep track of each DML event type and the keys of the row that was affected. This means that at anytime, you can query to find out which rows have had an insert, update, and delete against them. You can retrieve the keys from Change Tracking and get the latest data from that table. You can even have it tell you which columns where involved in the change.

Note that if you need to know all the changes that have happened to a row and not just the latest, SQL server 2008 has another new feature called Change Data Tracking that can give you this information. You can find more about Change Data Tracking at the MSDN article, "Basics of Change Data Capture". If you aren’t sure which one you need, there is a comparison of the two technologies available at MSDN.

Enabling Change Tracking

To enable Change Tracking on a table, you must first enable it for the database. ALTER DATABASE is used to accomplish this as in the following example. Note that AUTO_CLEANUP and CHANGE_RETENTION are optional and only applicable when enabling change tracking. The former specifies whether the change tracking data should be cleaned up and the latter specifies the length of time the data is kept prior to cleanup as expressed in minutes, hours, or days. The default is 2 days.

ALTER DATABASE ExampleDB
SET CHANGE_TRACKING = ON
(AUTO_CLEANUP = ON , CHANGE_RETENTION = 5 DAYS)

To disable change tracking, set it to OFF. All tables in the database must have change tracking disabled prior to disabling it for the database.

ALTER DATABASE ExampleDB
SET CHANGE_TRACKING = OFF

Similarly, to enable change tracking for a given table, you would execute something similar to this.

ALTER TABLE Orders
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = ON)

The table must have a primary key defined to enable change tracking. The primary key is stored with the change tracking information to allow you to join back to the table to retrieve the current column values. More on this in a bit.

The WITH (TRACK_COLUMNS_UPDATED = ON) is an option that tells SQL Server to record information about which columns were altered during the change. This could allow you to pull the modified columns only instead of the entire row when working with the changed rows. Again, you won’t be able to see the previous values of those columns, just which ones changed.

To disable change tracking on the table…you guessed it:

ALTER TABLE Orders
DISABLE CHANGE_TRACKING

Querying Change Tracking

Now that we have a table being tracked, let’s do something useful with that information. Based on what we set up above, the database will keep a log of all the modifications on our Orders table for 5 days. For review, the following statements set up change tracking on the database and on the Orders table we create here.

ALTER DATABASE ExampleDB
SET CHANGE_TRACKING = ON
(AUTO_CLEANUP = ON , CHANGE_RETENTION = 5 DAYS)
 
 
CREATE TABLE Orders (OrderID     BIGINT not null Primary key,
                     CustID      BIGINT not null,
                     OrderStatus VARCHAR(20),
                     OrderDTM    DATETIME,
                     UpdateDTM   DATETIME)
 
ALTER TABLE Orders
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = ON)

Let’s insert an order and see what we get from change tracking.

INSERT INTO Orders
VALUES (1, 5, 'Received', GETDATE(), GETDATE())

The table function that we can use to see what changes were tracked is CHANGETABLE. Here we pass it the table we want changes for and NULL for the last version since we do not have one.

SELECT * 
FROM CHANGETABLE(CHANGES Orders, NULL) CT
 
 
SYS_CHANGE_VERSION   SYS_CHANGE_CREATION_VERSION SYS_CHANGE_OPERATION SYS_CHANGE_COLUMNS         SYS_CHANGE_CONTEXT OrderID
-------------------- --------------------------- -------------------- -------------------------- ------------------ -----------
1                    1                           I                    NULL                       NULL               1
 
(1 row(s) affected)

The results show us that the insert (SYS_CHANGE_OPERATION = ‘I’) was captured and the primary key associated with the change version is OrderID 1.

Now let’s try an update to that row. We’ll change the order status and the update date and time. This time we should be able to see a value in SYS_CHANGE_COLUMNS.

UPDATE Orders
SET OrderStatus = 'Packed',
    UpdateDTM   = GETDATE()
WHERE ORDERID   = 1

Instead of NULL , this time we know that we can use 1 for the last version to pass into the CHANGETABLE function.

SELECT * 
FROM CHANGETABLE(CHANGES Orders, 1) CT
 
SYS_CHANGE_VERSION   SYS_CHANGE_CREATION_VERSION SYS_CHANGE_OPERATION SYS_CHANGE_COLUMNS         SYS_CHANGE_CONTEXT OrderID
-------------------- --------------------------- -------------------- -------------------------- ------------------ -----------
2                    NULL                        U                    0x000000000300000005000000 NULL               1
 
(1 row(s) affected)

This shows change version 2 was an update operation. We can see that something was indeed recorded in SYS_CHANGE_COLUMNS , but it doesn’t look usable. There is another change tracking function that can help us make sense of this value. CHANGE_TRACKING_IS_COLUMN_IN_MASK takes the column id of the change tracked column you are interested in and the CHANGETABLE SYS_CHANGE_COLUMNS column.

In the example below, CHANGE_TRACKING_IS_COLUMN_IN_MASK is used to find all the Orders rows that have had an update since change version 1 where the columns involved in the change include the OrderStatus column.

SELECT O.OrderID, OrderStatus, UpdateDTM 
  FROM Orders O
  JOIN CHANGETABLE(CHANGES Orders, 1) CT
    ON O.OrderID = CT.OrderID
   AND CT.SYS_CHANGE_OPERATION = 'U'
 WHERE CHANGE_TRACKING_IS_COLUMN_IN_MASK( COLUMNPROPERTY( OBJECT_ID( 'Orders' ),'OrderStatus', 'COLUMNID' ), CT.SYS_CHANGE_COLUMNS ) = 1
 
OrderID              OrderStatus          UpdateDTM
-------------------- -------------------- -----------------------
1                    Packed               2010-04-05 18:47:31.387
 
(1 row(s) affected)

These are just some examples of how you can use change tracking. There are more change tracking functions available and more information that can be gleened from the CHANGETABLE.

Change Tracking and Surrogate Keys

One observation I feel compelled to share with you regarding change tracking is using change tracking with surrogate keys. Let’s say you have a centralized order database that is synchronized with your branch stores. The customer table contains a surrogate key of CustomerID. In addition, each branch assigns its branch number to the customer row when it’s responsible for a new customer entry. Now let’s say the company manages customer data across business units. Therefore, changes to customer data can occur from other systems including a centralized customer service department. When changes to a branch’s customer data occur, you want to synchronize those changes down to the branch. The problem here is that change tracking records only the primary key with the change information. So, if you want to limit the rows you synchronize, you have to retrieve all the changes from change tracking for the customer table (since your last synch version). You can then examine each of those rows for the correct branch number by joining back to the Customer table and limiting the results to those for the specific branch. The surrogate key, in this case, caused us to have a much larger working table than would otherwise be necessary if branch number had been a part of the primary key. This is not meant to steer your table design in any particular direction, just to provide food for thought when you may have a large set of changes that could be segregated down to a subset needed for synchronization.

Conclusion

If you need a lightweight solution for discovering what operations have occurred on your data rows, change tracking can save you significant effort. This only scratches the surface of what you can do with change tracking. SQL Server Books Online provides additional guidance on getting the most out of this new addition to SQL Server.

For More Information

MSDN: Change Tracking

MSDN: Change Tracking Functions

» See All Articles by Columnist Deanna Dicken








The Network for Technology Professionals

Search:

About Internet.com

Legal Notices, Licensing, Permissions, Privacy Policy.
Advertise | Newsletters | E-mail Offers