Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

MS SQL

Posted Jan 23, 2009

SQL Server 2008's Change Data Capture - Tracking the Moving Parts

By Rob Garrison

In this first CDC article, we will track the changes that take place as SQL Server 2008’s Change Data Capture feature is started on a user table. Next month, we’ll look at the performance of CDC.

My focus here is not just showing how CDC works on the surface (there are many articles that cover that), but rather to see what we can find out about the various moving parts within SQL Server that actually make CDC function. Understanding the internals will help in an overall understanding of CDC.

Introduction: SQL Server 2008 Change Data Capture

SQL Server 2008's CDC functionality reads the transaction log to record changes in system tables associated with each table for which CDC is enabled. It writes those files to system tables in the same database, and those system tables are accessible through direct queries or system functions.

Let’s walk through some example code and see what changes in each step.

Tracking the Parts Using Example Code

Initially, we start with a brand new database. To follow the changes, open Object Explorer and expand these folders:

  • Databases
  • SQLServer Agent | Jobs
  • (Once the database is created) Security | Roles | Database Roles
  • (Once the database is created) Programmability | Stored Procedures | System Stored Procedures
  • (Once the database is created) Programmability | Functions | Table-valued Functions

Create Database

  CREATE DATABASE TestCdc

This gives us the new database, but there are no new jobs. Expand the Tables folder and the System Tables folder. Empty so far.

Enable Change Data Capture on Database TestCdc

  USE TestCdc
  -- Turn on Change Data Capture at database level
  EXEC sys.sp_cdc_enable_db

This step takes a few seconds. Now refresh the System Tables folder. There are now six system tables:

Notice that five of these are part of the “cdc” schema.

There are also four new rather odd looking “SQL inline table-valued function” entries in the sys.all_objects table:

  • fn_cdc_get_all_changes_◦...◦
  • fn_cdc_get_all_changes_...
  • fn_cdc_get_net_changes_◦...◦
  • fn_cdc_get_net_changes_...

(I used a non-proportional font with spaces marked with ‘’ to make it clear that there are embedded and trailing spaces in these names.)

Here is the query that will return those functions:

  SELECT [name] FROM sys.all_objects WHERE [name] LIKE '%...%'

There are multiple new stored procedures:

  • sp_batchinsert_lsn_time_mapping
  • sp_ins_dummy_lsn_time_mapping
  • sp_ins_instance_enabling_lsn_time_mapping
  • sp_ins_lsn_time_mapping

There are still no new jobs or roles.

Create a New Schema and User Table

  CREATE SCHEMA MySchema
  CREATE TABLE MySchema.Person (
      PersonID    int         IDENTITY  PRIMARY KEY,
      FirstName   varchar(32) NOT NULL,
      LastName    varchar(32) NOT NULL,
      UpdateCt    int         NOT NULL  DEFAULT 0
  )

This creates a new user table, but the table is not yet enabled for CDC.

This step does not affect the list of functions/procedures/roles we’re tracking.

Enable New User Table for CDC

  EXEC sys.sp_cdc_enable_table
      @source_schema        = N'MySchema',
      @source_name          = N'Person',
      @role_name            = N'ChangeDataAccessRole',
      @supports_net_changes = 1

This enables the new user table for CDC. Refresh the folders listed above. We now have

  • one new system table (cdc.MySchema_Person_CT)
  • two new jobs (cdc.TestCdc_capture and cdc.TestCdc_cleanup)
  • one new database role (ChangeDataAccessRole)
  • two new table-valued functions
  • three new stored procedures:
    • sp_batchinsert_389576426
    • sp_insdel_389576426
    • sp_upd_389576426

You can actually get the text for the stored procedures (sp_helptext 'cdc.sp_batchinsert_389576426' [your number will likely be different]). Interesting reading. It is clear that this stored procedure is generated and specific to this user table.

If you enable a second table for CDC, you will get a second set of sp_batchinsert_n, sp_insdel_n, and sp_upd_n procedures.

System Table cdc.MySchema_Person_CT

This table has the following columns:

  • __$start_lsn
  • __$end_lsn
  • __$seqval
  • __$operation
  • __$update_mask
  • PersonID
  • FirstName
  • LastName
  • UpdateCt

Details regarding this table are covered well in other articles including Books Online.

Job cdc.TestCdc_capture

The capture job is very interesting as you dig into it. From sysjobs, we see that this job is owned by sa, is in the category “REPL-LogReader” and has a description of “CDC Log Scan Job”.

From sysjobsteps, we see that it has two steps:

Step:

1

Name:

“Starting Change Data Capture Collection Agent”

Command:

RAISERROR(22801, 10, -1)

Raiserror Output:

“Starting the Change Data Capture Collection Agent job. To report on the progress of the operation, query the sys.dm_cdc_log_scan_sessions dynamic management view.”

Step:

2

Name:

“Change Data Capture Collection Agent”

Command:

sys.sp_MScdc_capture_job

There are two logical next steps after reading those details, but we’ll leave the second until we’ve inserted some data.

  sp_helptext N'sys.sp_MScdc_capture_job'
  SELECT * FROM sys.dm_cdc_log_scan_sessions

Procedures sys.sp_MScdc_capture_job and sys.sp_cdc_scan

sys.sp_MScdc_capture_job is the procedure that kicks off the whole CDC process. It first does some security checks and then calls sys.sp_cdc_scan, which does the real work. sys.sp_cdc_scan “executes the change data capture log scan operation.”

By default, sys.sp_cdc_scan loops continuously with a five-second delay between loops. Inside each normal loop, it calls sp_replcmds. (It is interesting to note that this is a replication procedure. Books Online says that calling this procedure with the default (1) will return “the next transaction waiting for distribution.” The default @maxtrans in this procedure is 500.)

Last, it reads from sys.dm_cdc_log_scan_sessions to prepare for a call to sp_sqlagent_log_jobhistory to record job history information.

Database Role ChangeDataAccessRole

If you open the Properties window for the new ChangeDataAccessRole role, you can see that the role is owned by “cdc” and has SELECT permissions on the two table-valued functions listed above and nothing else.

Conclusion

Digging into these details gives us insight into how SQL Server 2008’s CDC is implemented. So far, we haven’t even inserted a single record into the new table. We’ll pick that up next time as well as show performance comparison results.

Thoughts or comments? Drop a note in the forum.

Reference

Download the .sql files for this article.

» See All Articles by Columnist Rob Garrison

SqlCredit - Developing a Complete SQL Server OLTP Database Project



MS SQL Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 




Latest Forum Threads
MS SQL Forum
Topic By Replies Updated
SQL 2005: SSIS: Error using SQL Server credentials poverty 3 August 17th, 07:43 AM
Need help changing table contents nkawtg 1 August 17th, 03:02 AM
SQL Server Memory confifuration bhosalenarayan 2 August 14th, 05:33 AM
SQL Server Primary Key and a Unique Key katty.jonh 2 July 25th, 10:36 AM


















Thanks for your registration, follow us on our social networks to keep up-to-date