SQL Server 2005 Integration Services - Transaction Support - Part 25
May 22, 2006
Transactional functionality is essential in database operations, in order to satisfy atomicity, consistency, isolation and durability requirements (which are frequently referred to collectively using the acronym ACID). These terms describe features inherent to data processing, which ensure that each modification has a clearly defined outcome, resulting in either success or failure (atomicity), thus preventing potential corruption of data (consistency), executes independently from other changes (isolation), and, once completed, leaves underlying data in the same state until another transaction takes place (durability). Complying with these rules is especially important when dealing with multiple, interdependent changes. The scope of these changes can vary, ranging from a few database objects within the same database or server instance, to any number of distributed systems, as is frequently the case in online transaction processing scenarios. Incorporating several distinct activities into the same transaction guarantees that if any of them fails, all of the remaining ones are rolled back, returning the system to its original state (in other words, all of them have to either succeed or fail - no other outcome is possible). One common example of applying this principle involves asynchronous environments where tasks are initiated via message queuing. By including task processing into the scope of transactions, you can ensure that messages will remain in the queue until their content is properly handled.
The same concepts are relevant within the context of SQL Server 2005 Integration Services. Transactions are typically employed in order to complete a sequence of tasks as a single unit of work that either succeeds or fails as a whole. Some of the actions which are executed as part of a task might not be transactional by nature (and therefore their rollback is simply not possible), but adding them into the scope of a transaction will result in having their outcome affecting other tasks, with full transactional capabilities.
Even though transactions were supported in Data Transformation Services (predecessor to SQL Server 2005 Integration Services), their functionality was somehow limited (for a more detailed review of their features, refer to one of our earlier DatabaseJournal articles). Among the most relevant improvements introduced in SSIS is the more granular scope of transactions, which can now be assigned to arbitrarily designated Control Flow tasks and containers. While the option to configure transactional support on the package level still exists (as was the case with Data Transformation Services), the main difference is the ability to have concurrent transactions associated with "sibling" containers, executing in parallel (DTS packages were restricted to one active transaction only). Transactional behavior is controlled by assigning one of three possible values to the TransactionOption property of tasks, containers, or a package (this is done from Transactions section of their respective Properties window within the Business Intelligence Development Studio):
In addition, the Transactions section in the Properties window of a task, container and package includes the IsolationLevel property. As the name indicates, its role is to set the transaction isolation level, which determines the locking mechanism used by SELECT statements for the duration of task or container execution (in other words, a protection mechanism used to isolate transaction activities from other processes accessing the same data), and can have one of the following values:
The less restrictive options (read uncommitted and chaos) improve concurrency (the ability to execute multiple transactions simultaneously) but increase the possibility of inconsistent reads (such as non-repeatable reads, dirty reads, and phantom reads), while the more restrictive options (read committed, repeatable read, or serializable) have the opposite effect. The snapshot isolation features superior concurrency without negative impact on inconsistent reads; however, its benefits need to be considered in the context of its implications on performance and tempdb size increase.
As with SQL Server 2000 Data Transformation Services, SSIS transactional features we described require that Distributed Transaction Coordinator service is running on the system where the packages are executed. This service is responsible for coordinating transactions that span multiple resource managers, such as databases and message queues. Since by default, the service is configured with manual startup type, you should consider changing it to automatic (and make sure it is started prior to launching your package).
Note that, in general, you should limit the use of transactions to an absolute minimum, since they affect concurrency through a /resource locking mechanism (even Snapshot isolation benefits only activities that query the data being modified - simultaneous update attempts still result in blocking). Strive to include only tasks that require it and fully support rollback, in a transaction. Keep in mind that certain T-SQL statements are not allowed in transactions (for their listing, refer to SQL Server 2005 Books Online). Be aware of blocking issues that might force you to divide longer transactions into smaller execution units. In our next article, we will provide examples demonstrating the use of transactions in SSIS packages and introduce the concept of checkpoints, which constitute an alternative method of controlling package execution.