Introducing Change Tracking in SQL Server 2008

Some applications are designed to pull data from a central
database into a local cache. For example, on-the-road sales personnel only
connect occasionally to the central database to get the latest inventory data. Or
heavy processing of business logic is offloaded from the production database
server, processed locally and sent back to the production database. These
applications need a way to track data changes in order to know what data has
been changed. Before SQL Server 2008, application developers had to implement
custom tracking solutions using triggers or timestamp columns, and creat
additional tables in order to track data changes. As we all know, triggers are
expensive. Each table involved in the DML operation is checked recursively via
an internal function for the presence of a trigger. In addition, because
triggers are executed as a part of the transactions that cause them to be
invoked, the transactions take longer to commit and introduce complicated
blocking issues. SQL Server 2008 provides a new feature, Change Tracking.
Change tracking is great for building one-way or two-way synchronization
applications and is designed to work with Sync Services for ADO.NET. Application
developers can use Change Tracking to synchronize any data between SQL Server
databases, or even between SQL Server and non-SQL Server databases. Compared
with replication, Change Tracking is more for developers than DBAs as it
provides developers with a flexible foundation to build synchronization
applications with .NET, but lacks stored procedure support or built-in
monitoring tools like Replication Monitor.

Enabling Change Tracking on an existing table does NOT
require any changes on the table schema. The only requirement is that the
table must already have a primary key. Change tracking information is recorded
synchronously on transaction commit time so it presents an accurate sequence of
DML operations. Although Change Tracking is executed synchronously with
transactions, its performance overhead is very light compared to triggers.
Besides, it captures only the values of the primary key columns of the changed
rows and records the values in the change tracking table. These values can be
joined with the base table to get the changed data. There is less storage
overhead compared with Change Data Capture (for more information, see MAK’s
CDC series
) because non-key columns are not captured in the change tracking
table. However, because only net changes, not intermediate changes, to data
rows can be queried, Change Tracking is not suitable for auditing.

In this article, we will demonstrate how to set up Change
Tracking on the table HumanResources.Department in the database AdventureWorks2008
and how to query the changed data. If you haven’t installed the AdventureWorks2008
sample database, you can download it from http://msftdbprodsamples.codeplex.com/Release/ProjectReleases.aspx?ReleaseId=18407

Before you enable change tracking on a table, you need to
enable change tracking for the database. Execute the following command to
enable Change Tracking on the AdventureWorks2008 database.


ALTER DATABASE AdventureWorks2008
SET CHANGE_TRACKING = ON
(AUTO_CLEANUP=ON, CHANGE_RETENTION=8 hours)
GO

When this command runs, it also enables automatic cleanup of
change tracking information as the AUTO_CLEANUP option is turned on. The
retention period is set to at least 8 hours. An internal task runs every 30
minutes to remove old transactions. SQL Server will also begin to populate an
internal system table sys.syscommittab with transaction information. This
table will have one row for every transaction that causes data modifications on
tracked tables in the database. Although this table cannot be viewed in non-DAC
connections, the information in the table is exposed by a DMV, sys.dm_tran_commit_table.
Here are the columns in the DMV.

  • commit_ts: A monotonically increasing number that serves as a database-specific timestamp for each committed transaction.
  • xdes_id: A database-specific internal ID for the transaction.
  • commit_lbn: The number of the log block that contains the commit log record for the transaction.
  • commit_csn: The instance-specific commit sequence number for the transaction.
  • commit_time: The time when the transaction was committed.

If you never enabled change tracking on the database, then
the DMV is empty.

select * from sys.dm_tran_commit_table

To enable Change Tracking on the HumanResources.Department
table, run the following command.


ALTER TABLE HumanResources.Department
ENABLE CHANGE_TRACKING
GO

This command creates an internal table that is used to
record changes made to the HumanResources.Department table. The table name is change_tracking_[tableObjectID].
Since it is an internal table, it can only be viewed in DAC connection. In our
example, the object ID of the HumanResources.Department table is 757577737,
therefore, the table name is “change_tracking_757577737”. The table is empty as
shown below because we haven’t made any modifications to the HumanResources.Department
table.

Here are the columns of the table.

  • sys_change_xdes_id: Transaction ID of the transaction that
    modified the row.
  • sys_change_xdes_id_seq: Sequence identifier for the operation
    within the transaction.
  • sys_change_operation: Type of operation that affected the row:
    insert, update, or delete.
  • sys_change_columns: List of which columns were modified (used for
    updates, only if column tracking is enabled).
  • sys_change_context: Application-specific context information
    provided during the DML operation using the WITH CHANGE_ TRACKING_CONTEXT
    option.
  • k_[name]_[ord]: Primary key column(s) from the target table. [name]
    is the name of the primary key column, [ord] is the ordinal position in the
    key, and [type] is the data type of the column.

Because the HumanResources.Department table only has one
column, DepartmentID, in the primary key, there is only one primary key column
k_DepartmentID_00000001 in the change tracking table.

The sys_change_columns column can be used to track which
columns were modified for each update operation. (Insert and delete statements
always change all of the columns so this column is always NULL for inserts and
deletes). By enabling column tracking on the table, you can get only the data
from the columns that have been updated. This will limit the amount of data
returned and transferred through the network. This will also make consolidation
of incremental data changes more efficient as big size columns, such as varbinary(max)
and xml, are only returned when they have been updated. To enable column
tracking, set the TRACK_COLUMNS_UPDATED option.


ALTER TABLE HumanResources.Department
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = ON)

You will see an example of using column tracking in the next
article.

You can also use the sys_change_context column to track the
context in which changes to the table were made. The context is provided by
the client that issues DML statements. It can be a constant, such as an
application ID. A sample update statement is as follows.


DECLARE @originator_id varbinary(128)
SET @originator_id = CAST(‘MyApplication’ AS varbinary(128))
WITH CHANGE_TRACKING_CONTEXT (@originator_id)
UPDATE HumanResources.Department
SET GroupName=’Accounting’
WHERE DepartmentID=10

If you want to disable Change Tracking on a database, you
will need to first disable all change-tracked tables in the database. You can
look for the list of such tables from the sys.change_tracking_tables catalog
view. Here is a simple SQL statement to generate all the ALTER TABLE
statements.


SELECT ‘ALTER TABLE ‘ + object_name(object_id) + ‘ DISABLE CHANGE_TRACKING;’
FROM sys.change_tracking_tables

After you disabled Change Tracking for the tables, run the
following command to turn off Change Tracking in the database.

ALTER DATABASE AdventureWorks2008 SET CHANGE_TRACKING = OFF

Summary

This article introduced Change Tracking, and illustrated how to
enable Change Tracking on database, table and turn on various options, for
example, AUTO_CLEANUP, TRACK_COLUMNS_UPDATED. In the next article, we will
show you how to obtain data changes from change-tracked tables.

»


See All Articles by Columnist
Yan Pan

Yan Pan
Yan Pan (MCITP SQL Server 2008, MCITP SQL Server 2005, MCDBA SQL Server 2000, OCA Oracle 10g) is a Senior DBA for a leading global financial services firm, where her daily duties include administering hundreds of SQL Server and Oracle servers of every possible version, working with business units on software development, troubleshooting database issues, and tuning database performance. She has written a Wrox book titled “Microsoft SQL Server 2008 Administration with Windows PowerShell” with MAK who is also a columnist for DatabaseJournal.com. You can check out the book at many book stores, such as Barnes & Noble, Borders. Previously, Yan worked as a SQL Server DBA and a .NET developer at Data Based Ads, Inc., and developed two .NET Web applications. Before that, she worked at AT&T Research Labs. She designed OLAP cubes with SQL Server Analysis Services and developed PivotTable reports for regional managers. She has master’s degrees in Computer Science and Physics.

Latest Articles