SQL Server 2005 Integration Services – Checkpoints – Part 26

In our previous
article
of this series, we discussed transactional support in SQL Server
2005 Integration Services. Such support is critical in order to ensure that
arbitrarily designated Control Flow tasks are executed as a single logical
unit, which either completes successfully or is rolled back in its entirety.
Demand for this functionality exists in a variety of scenarios, whenever
coherence among individual, inherently independent processes needs to be
enforced. However, transactions fail to provide sufficient resolution in
situations where it would be preferable to avoid rerunning certain processes,
due to their significant computational cost (or which, by virtue of changes
they introduce, should be performed only once). As a matter of fact, including
them in a transaction might potentially cause this cost to increase if another
task that is part of the same transaction fails (since this would trigger
potentially equally expensive rollback). Excluding them from the scope of
transaction does not prevent them from being executed again if the package
needs to be relaunched. Fortunately, there is a new mechanism known as
checkpoints, which helps to address this very issue.

While SQL Server has been relying from its inception on the concept of
checkpoints for transactional support and recovery, its Integration Services
implementation has been introduced as part of SQL Server 2005 release. In its
new incarnation, checkpoints can be applied to Control Flow tasks (which
implies that they can not be associated with individual components of the Data
Flow) and functions by recording information,  at the time of failure,
about the restart point (in addition to execution results of all completed
containers) in a text file, whose location and name you arbitrarily assign. By
referencing recorded content, the SSIS run-time engine can avoid rerunning all
tasks that have already completed. (The checkpoint file also contains package
configuration, as well as the most recent values of system and user-defined
variables, in order to ensure that the package uses the same settings during
subsequent reruns).

While checkpoints are applied to individual Control Flow tasks, some of
their properties are configured on the package level. To view them, right click
on the empty area of the Control Flow tab of the SQL Server 2005 Business
Intelligence Development Studio and select Properties from the context
sensitive menu. In the Properties window, scroll to the Checkpoints section.
From here, you have an option to control the following settings:

  • SaveCheckpoints – must be set to True (the default is False) in
    order to enable checkpoint support,

  • CheckpointFileName – designates the location and name of the
    checkpoint log file. This must be configured in order for checkpoint
    functionality to work as expected.

  • CheckpointUsage – determines package behavior following the
    restart after a failure and takes on one of three possible values – Never
    (precludes use of checkpoints, regardless of configuration of the other two
    properties), Always (enforces checkpoint use providing that value of SaveCheckpoints
    property is True and the file that CheckpointFileName points to actually
    exists. If the latter condition is not satisfied, package execution fails), and
    IfExists (just as the Always setting, mandates checkpoint use if SaveCheckpoints
    has been enabled and CheckpointFileName property has a valid value. Unlike
    Always, following package failure, it will simply launch it from the beginning,
    in cases where the checkpoint file does not exist).

In addition, you also need to set the FailPackageOnFailure property (in the
Execution section of the Properties window) to True for each task and container
to which you want to apply checkpoint functionality. This way, their failure
results in package termination and its subsequent launch starts by initiating
the same checkpoint-enabled component that did not complete successfully
(rather than running from the beginning of the Control Flow). Note that since
containers are viewed in this case as individual units of work, their entire
content (including all child containers and tasks within their scope, even if
they already have been executed) will be processed again.

Now that we have presented the general concept and configuration settings of
transactions and checkpoints in the context of SQL Server 2005 Integration
Services, let’s take a look at a simple example demonstrating their use. Launch
Business Intelligence Development Studio and create a new project using the
Integration Services Project template. Our sample package will consist of three
Execute SQL tasks connected via Precedence Constraint to ensure their
sequential execution. To implement it, drag the icon representing SQL Execute
Task to the Control Flow area of the SSIS designer. Display its Editor window
by right clicking on its outline and selecting the Edit option from the
context-sensitive menu. Verify that the ConnectionType in the SQL Statement
area of the General section is set to OLE DB, and configure a new connection to
the AdventureWorks database on your instance of SQL Server 2005. Choose Direct
input in the SQLSourceType entry, and type the following in the SQLStatement
text box:

UPDATE HumanResources.Employee
SET SalariedFlag = 1
WHERE (EmployeeID = 1)

Next, drag another Execute SQL Task icon from the Toolbox
into the Control Flow area, directly below the first one and connect the two
via Precedence Constraint with its default settings. From its Editor dialog
box, set ConnectionType and Connection entries identically to the first task
and enter the following statement (which will fail due to the datatype
mismatch) as the value of SQLStatement property with SQLSourceType set to Direct
input:

UPDATE  HumanResources.Employee
SET  VacationHours = 'five jillion'
WHERE  (EmployeeID = 1)

Repeat the same procedure for the third Execute SQL Task, this time
assigning the following value to the SQLStatement property (which copies to the
row of data for Employee with ID of 2 number of VacationHours assigned to the
employee with ID of 1):

UPDATE HumanResources.Employee
SET VacationHours = 
 (SELECT VacationHours
 FROM HumanResources.Employee
 WHERE (EmployeeID = 1))
WHERE (EmployeeID = 2)

Now, let’s check how the various transaction settings might affect the
outcome of execution of this package. Start by assigning Required value to the TransactionOption
on the package level (to display the appropriate Properties window, right-click
on the Control Flow area of the SSIS designer outside of any specific
component, such as a task or container, and select Properties entry from the
context sensitive menu) and selecting the Supported value for the same property
for each of the tasks. Launching the package will result in the failure of the
second task (as expected) triggering rollback of the changes introduced by the
first task (since both are part of the same transaction initiated on the
package level) – so the original value of SalariedFlag does not change. Note
that the result would be different if you left the default value of the package
TransactionOption (NotSupported) and set each of the Execute SQL Tasks with
Required value, since this would result in three separate transactions (so
failure of the second would not trigger rollback of the first).

Revert to the original settings, by assigning NotSupported value to the TransactionOption
package property and Supported to all three Execute SQL Tasks. If you execute
the package at this point, it will complete the first task successfully and
fail on the second. Unlike previously though, the change to the SalariedFlag
entry is permanent. Despite this, relaunching the package will still result in
exactly the same sequence of actions, including execution of the first task. To
eliminate this unnecessary step, assign a valid path and file name to the CheckpointFileName
package property, set Checkpoint usage to IfExists, and SaveCheckpoints to
True. Next, set the FailPackageOnFailure for the second task to True. Once
configuration changes are made, start execution again. The initial run will be
identical to the ones we have experienced previously. Successful completion of
the first task is followed by the failure of the second. However, you should
also notice that an XML formatted checkpoint file, whose name and location you
specified as the value of the CheckpointFileName property, has been created and
populated with relevant information. If you run the package again, the first
task will be bypassed and flow of execution will start from the second one.

»


See All Articles by Columnist
Marcin Policht

Marcin Policht
Marcin Policht
Being a long time reader, I'm proud to join the crowd of technology gurus gathered here. I have a fair share of Microsoft exams behind me, a couple of acronyms after my name - MCSE, MCSD, and MCT, decent familiarity with majority of MS BackOffice products (SMS, SQL, Exchange, IIS), programming and scripting languages (VB, C++, VBScript with wsh) and several years of practical experience with Windows environment administration and engineering. My focus these days is on SQL Server and Windows 2000, and I'll attempt to share the most interesting experiences with these products.

Latest Articles