In the concluding article of this series, we will discuss transaction support in
SQL Server 2000 Data Transformation Services. Transactions within DTS packages
serve the same purpose as in other SQL Server based operations – they provide
the set of features described collectively using the term ACID, which is an
acronym derived from four transactional characteristics – atomicity,
consistency, isolation and durability. Atomicity means that a particular unit
of work is either fully completed or not performed at all, consistency refers
to the state of the underlying data, isolation specifies that each unit of work
executes independently of others, and durability reflects permanent status of
data changes.
By default, each task performs its operations independently of others within
the same DTS package (from the transactional point of view). If an error is
encountered, execution of the current task is abandoned. While you can use
settings in the Error handling section on the Logging tab of the Package
properties dialog box ("Fail package on the first error") to specify
whether the entire package will fail at this point or control execution of
subsequent tasks with workflow elements (which determine whether the next task
will execute on success, failure, or completion), neither one of these two
methods has any influence on operations completed prior to the failure. In
order to accomplish this, you need to configure all related tasks to
participate in the same transaction. This way, you can ensure that either all
of them will complete successfully or will roll back to the original state if
any one fails.
At the same time, however, you need to be aware that a package can execute
only a single transaction at a time and that not every package is capable of
properly handling transactions. The limitations are imposed by the types of
connections and tasks employed within a package:
-
connections rely
on their data provider for transactional support. This support is inherent to
the Microsoft OLE DB Provider for SQL Server, ODBC data source (as long as the
corresponding ODBC driver has implemented and set the SQL_ATT_ENLIST_IN_DTC
connection attribute), and Microsoft Data Link (as long as the corresponding
OLE DB provider has implemented the ITtransactionJoin interface). You can find
more detailed information on the subject in the ODBC and OLD DB documentation. -
tasks can be
grouped in three categories based on their level of transactional support. The
first one consists of tasks that are not transaction-aware, by the virtue of
the way they operate, such as FTP, Dynamic Properties, Send Mail, or Copy SQL
Server Object tasks. Tasks in the second category (ActiveX Script and Execute
Process) can create and manage transactions internally (as part of a script or
a process), but cannot participate in DTS transactions involving other tasks.
Finally, the third one includes tasks that fully support transactions – Bulk
Insert, Data Driven Query, Transform Data, Execute Package, Execute SQL and
Message Queue.
Since DTS relies on the Microsoft Distributed Transaction Coordinator (MS
DTC) when processing its transactions, you should ensure that the MS DTC
service is running before you launch a package. In addition, you need to
consider several configuration options that affect transactional behavior of a
package and its tasks (tasks represent units of execution and are frequently
referred to as steps, when discussing their role in DTS-based transactions):
-
"Use
transactions" checkbox on the Advanced tab of the Package Properties
dialog box is required to turn on transaction functionality within the package.
By default, this setting is enabled. -
"Commit on
successful package completion" checkbox on the Advanced tab of the Package
Properties dialog box will commit any transactions that are still open when the
package completes. This does not affect in any way transactions that have
already committed (as part of tasks level settings). -
"Transaction
isolation level" drop-down list on the Advanced tab of the Package
Properties dialog box determines (as you can expect), the transaction isolation
level (which directly affects the locking mechanism) and can take one of the
following values:-
"Read
Committed" – default setting for transaction isolation, which indicates
that changes caused by other transactions executing in parallel with the
current one are not visible until these transactions commit. -
"Read
Uncommitted" – allows viewing uncommitted changes caused by other
transactions executing in parallel with the current one -
"Repeatable
Read" – ensures that data being read by the current transaction stays
consistent until this transaction commits -
"Serializable"
– most restrictive isolation level ensuring complete independence of the
current transaction (highest locking level) -
"Chaos"
– like "Read Uncommitted," permits viewing uncommitted changes, but
without holding update locks until the end of transaction. Due to its lack of
support on the SQL Server platform and inability to roll back, it is rarely
used.
The less restrictive options (read
uncommitted and chaos) improve concurrency (the ability to execute multiple
transactions simultaneously) but increase the possibility of inconsistent reads
(such as non-repeatable reads, dirty reads, and phantom reads), while the more
restrictive options (read committed, repeatable read, or serializable) have the
opposite effect. -
"Read
-
"Join
transaction if present" checkbox on the Options tab of the Workflow
Properties dialog box (accessible from a task’s context sensitive menu by
selecting Workflow -> Workflow Properties menu option) causes a task to be
included in a transaction, providing one already is in progress and has not
committed yet. Otherwise, execution of the task results in a new transaction. -
"Commit
transaction on successful completion of this step" checkbox on the Options
tab of the Workflow Properties dialog box triggers transaction commit when the
task completes successfully (assuming that "Use transactions"
checkbox on the package level is marked on). -
"Rollback
transaction on failure" checkbox on the Options tab of the Workflow
Properties dialog box triggers rollback of the transaction whenever the task
fails (again, assuming that "Use transactions" checkbox on the
package level is marked on).
Now let’s look at a very simple example illustrating the use of transactions
in DTS packages. We will create two tasks, configure them as part of the same
transaction and examine the results. As our data stores, we will use a text
file containing a single entry:
4;Speedy Gonzalez;(504)555-1234
and the Shippers table in the Northwind database. Start by
creating the text file (e.g. C:TempShippers.txt) and copying the line listed
above into it. Next, launch DTS Designer and insert a Microsoft OLE DB Provider
for SQL Server Connection pointing to the Northwind database into a new
package. From the Task menu (or the toolbar on the left hand side) select Bulk
Insert Task and configure its properties by typing a description (e.g. BULK
INSERT Shippers), selecting the existing Northwind connection from the Existing
connection listbox, pointing to Shippers as the destination table, and
specifying C:TempShippers.txt as your source data file). Ensure that a
semicolon is used as column delimiter. Close the Bulk Insert Task Properties
dialog box, right click on the task icon, and select Workflow -> Workflow
Properties context sensitive menu option. In the Workflow properties dialog
box, switch to the Options tab and select the "Join transaction if
present" check box. This will trigger the creation of a new transaction
when the task is launched. Close the dialog box. Create the second task by
selecting Execute SQL Task from the Task menu. In the Task Properties dialog
box, type in a Description (e.g. "INSERT ShipperID 1"), select Northwind
in the Existing connection listbox, and enter the following SQL Statement:
SET IDENTITY_INSERT Northwind.dbo.Shippers ON
GO
INSERT INTO Shippers (ShipperID, CompanyName, Phone)
VALUES (1, ‘Road Runner’, ‘(503) 555-4567’)
GO
SET IDENTITY_INSERT Northwind.dbo.Shippers OFF
GO
This statement will attempt to insert a duplicate row into the Shippers
table, which, of course, will result in failure. Since we are planning to run
both tasks as part of the same transaction, the bulk insert changes will be
rolled back and the content of the Shippers table will remain the same as prior
to package execution. In order to complete our configuration, bring up the
Workflow Properties dialog box for the Execute SQL task, set the "Join
transaction if present" and "Rollback transaction on failure"
checkboxes, and confirm the configuration by clicking on the OK command button.
From the DTS Designer interface, insert the workflow "On Success"
element between the Bulk Insert and Execute SQL tasks. Save and execute the
package. In the Executing DTS Package dialog box, you should be able to see the
progress of the execution, indicating success of the first task and failure of
the second. Finally, check the content of Shippers table and verify that it has
not changed.
You can experiment with additional configuration options (e.g. by setting
"Commit transaction on successful completion of this step" for the
first task) to further explore the transactional capabilities built into DTS.
You can also refer to Books On Line for discussion on more elaborate package
designs, such as the checkpointing package transactions, inherited transactions
or parallel execution.
Even though this article concludes our introductory overview of SQL Server
2000 Data Transformation Services, we will revisit this subject when discussing
DTS Security in our upcoming SQL Server 2000 Security series.