Listing 1: Create CDC_DEMO database
By reviewing my code you can see I created a database named CDC_DEMO and two tables named Boat and BoatListing.
Enabling Change Data Capture
In order to use change data capture you first need to enable this feature. Change data capture is enabled at the database level. The script in Listing 2 can be used to enable change data tracking on my CDC_DEMO database.
Listing 2: Enable change data capture on a database
As you can see it doesn't take much to enable change data capture on a database. All you need to do is execute the sp_cdc_enable_db system stored procedure. If you want to enable change data capture on more than one database you will need to execute the sp_cdc_enable_db system stored procedures in every database you want to capture data.
Turning on Change Data Capture for a Table
Just because I enabled change data capture doesn't mean I am capturing changes to all tables in my CDC_DEMO database. Before I can start tracking changes, I need to enable change data capture on the tables I want to track. To enable change data tracking for my Boat table I will run the code in Listing 3.
Listing 3: Enable change data capture on a table
If you review the code in Listing 3 you can see I executed the sp_cdc_enable_table system stored procedure to enable change data capture on my Boat table. In my script in Listing 1 I told SQL Server to capture all columns in the change data capture table, and to create that table on the PRIMARY file group.
This stored procedure accepts other parameters. One of those parameters worth mentioning is @captured_column_list. If you use this parameter as well when you call the sp_cdc_enable_table sp, you can identify the specific columns you want to capture. Please refer to Books online for a complete list of parameters that can be passed to the sp_cdc_enable_table sp. The code in Listing 4 uses the @captured_column_list parameter.
Listing 4: Enabled change data capture to only capture specific set of columns
The only columns I am requesting to capture in Listing 4 are the Id, AskingPrice and LastUpdateUser columns of the BoatListing table.
After running the code in Listing 3 or 4 if I was to review my CDC_Demo database and the SQL Agent jobs, I would find some objects that were created as part of enabling change data capture. Those new objects are those used to manage, track and maintain the change data capture information. In my CDC_Demo database if I expand my System Tables view, the System Stored Procedures items, or the Table-valued Functions, I would find a number of objects that have a schema of "cdc". The "cdc" schema table objects contain the actual change data capture information. The "cdc" stored procedures and functions are the code behind capturing and managing change data capture information. When I look at SQL Agent jobs I find a number of jobs that start with "CDC.CDC_DEMO…". These SQL Agent jobs are those that run on my system to capture and maintain change data capture information.
Reading the Change Data Capture Information
Capturing data that has changed in my database is great. But once I capture it I might want to read it. In order to demo how to read change data capture information I first need to make some changes to my Boat and BoatListing tables so information will be captured in the change data capture table. I will make some changes to these tables by using the code in Listing 5.
Listing 5: Updating my demo table
To read the change data capture information I could read the tables in the "cdc" schema directly, although Microsoft recommends against this. Instead I will use a system function that was created when I enable change data capture. To demo how this works I will display all my changes to my Boat table by running the code in Listing 6.
Listing 6: Display Change Data Capture Information For my Boat table
When I run this code two rows will be returned. One row for each record inserted. If you review this code you see I called a function named "cdc.fn_cdc_get_all_changes_dbo_Boat" . This function returns the change data capture information for my Boat table. In order to identify the change data capture records I want this function to return, I pass it a from and to log sequence number (LSN). These log sequence numbers identify the time frame for which I want to return change data capture information. In order to identify the appropriate LSNs I use the "sys.fn_cdc_get_min_lsn" and "sys.fn_cdc_get_max_lsn" functions. The "sys.fn_cdc_get_min_lsn" is passed the schema and table name of the table I'm tracking changes in the format "<schema name>_<table name>", and returns the lowest LSN value contained in the change data tracking table. The "sys.fn_cdc_get_max_lsn" function returns the largest LSN values in the cdc.lsn_time_mapping table. These from and to LSN values are then used by "cdc.fn_cdc_get_all_changes_dbo_Boat" function to return INSERT, UPDATE and DELETE statements processed against my boat table during between those two LSN numbers.
My example code I've shown you does not create change data capture records. If your change data capture tables contains tons of changed records you might not want to display all changed records for a table. If you want to select your audit records based on a specific date, and time you could modify the code above to use the "cdc.fn_cdc_map_time_to_lsn" . This function is passed a date and time, plus a relative operator to identify a specific LSN value. For more information regarding how to use this function refer to Books Online.
SQL Server Agent Jobs that support Change Data Capture
As I mentioned earlier there are two different SQL Agent jobs that are set up when you enable change data capture. One job is responsible for capturing the changed data, while the other job cleans up or should I say purges old change data capture information.
The SQL Agent job that is used to capture the changed data is named "cdc.<database>_capture", where "<database> is the name of the database that has change data capture enabled. If SQL Server Agent is not up, or this job is not enabled, changed data capture data will not find its way to the change data capture tables. By default this job is set up to start when SQL Server Agent starts and once started it runs continuously.
When you enable change data capture on a database you also get a cleanup SQL Server Agent job automatically set up. This job has the following naming convention: "cdc.<database>_cleanup", where <database> is the name of the database that has change data capture enabled. This job by default runs at 2 AM and deletes captured data that is older than 3 days.
Viewing and Changing the Default Data Capture Configuration
You have no control over the default configuration settings used by the SQL Agent jobs that are created when you enable change data capture. But you can view and change the default settings for these jobs once they are created. To display the setting values for each change data capture job you use the sys.sp_cdc_help_jobs system stored procedure. If you want to change the settings of a change data capture job you use the sys.sp_cdc_change_job system stored procedure.
When I run the sys.sp_cdc_help_jobs on my CDC_DEMO database I get the output in Table 1.
Table 1: My default settings for change data capture jobs.
If you review the output you can see the different default configuration settings for each of my different change data capture jobs. Table 2, which is from Books Online shows the definitions for each of these columns. You can change these column values using the sys.sp_cdc_change_job system stored procedure.
Table 2: List of Change Data Capture configuration values
To see how to change the default values for my change data capture cleanup job review the code in Listing 7.
Listing 7: Display Change Data Capture Information For my Boat table
In Listing 7 I changed the retention period of my change data capture information. I changed that retention period to 86400, which means the cleanup job will only delete change data capture information after 60 days. To find out more about how to use the sys.sp_cdc_change_job system stored procedure, please refer to Books Online.
Building an Audit Mechanism Using Change Data Capture
Change data capture is a feature that you can use to capture changes to any table in a database. It just requires that you have a version of SQL Server that supports change data capture, and that you enable the database and tables for which you want to capture information. If you have some need to capture information about UPDATE, INSERT, and DELETE statements you might want to consider whether or not change date capture will meet your needs.
One additional note that is worth adding. If you would like to capture the person that is submitting the UPDATE, INSERT, or DELETE statement then you will need to make sure your application that submits the UPDATE, INSERT, or DELETE statement populates a column in the table being tracked with the person's credentials doing the updating. Change data capture only captures changed to the columns in a table. Therefore without the table containing a column that contains the person's credentials there is no way to tell who updated the table.