dcsimg

Transactions 101

May 30, 2001

Introduction

SQL Server uses many mechanisms to ensure the integrity of its databases. One of these mechanisms is the concept of a transaction.

A transaction is the smallest unit of work SQL Server will do. One transaction can involve a single operation on one row of one table, or it can involve hundreds of operations on multiple tables in multiple databases on multiple servers. The developer usually controls when the transaction begins and ends. The server makes sure that either the entire transaction runs to completion or instead rolls it back to the point where it originally began. There's no middle ground.

All T-SQL statements must run within a transaction; there are no provisions for doing otherwise.

A unit of work is a transaction if it meets the ACID test. ACID is an acronym for atomic, consistent, isolated, and durable.

Atomic

The entire transaction has to run or the server will restore the data to the point before the transaction started. For instance, if an update statement modifies 100 rows and one fails because of a check constraint, every row will be restored to the point where it was before the update started.

It's important that the developer both understands how the server works and does his or her part. Potential problem areas are:

  1. The developer explicitly begins a transaction but, due to a typing or logic error, never ends it. The server will wait forever for the transaction to complete, up to the point the server is restarted. Most likely, the user will tire and break the connection or the DBA will break it for them. Either way, the server will roll back the transaction and no work will be done.
  2. The connection is set for implicit transactions, the developer issues a statement that begins a transaction, and neither commits nor rolls the transaction back. The outcome will be the same as for explicit transactions, though perhaps harder to debug.
  3. The developer mixes T-SQL and database API calls. If the developer uses ADO's BeginTrans to begin a transaction, they shouldn't use a T-SQL COMMIT TRANSACTION to commit it. BOL says this may produce undefined results.

Consistent

Consistency means the data never appears to other transactions to be in a transitory state. They either see the data as it was before the transaction began or after it was committed. Consistency involves a tradeoff of concurrency--the ability of multiple users to share data without impacting one another. In practice, maintaining perfect consistency imposes an unacceptable hardship on other users, and SQL Server won't enforce it without instructions from the developer. Read-only databases combine performance and consistency, and as such are a potential workaround.

Isolated

SQL Server's design permits multiple users to work with the same data simultaneously. A lack of controls to isolate transactions from one another introduces three potential problems: dirty reads, nonrepeatable reads, and phantoms.

Dirty reads are reads of another transaction's uncommitted data modifications. If the other transaction is rolled back, the first transaction has effectively read data that never existed. SQL Server won't permit this without instructions from the developer.

Nonrepeatable reads are instances where a transaction reads rows, another transaction modifies or deletes those rows and commits its changes, and the first transaction re-reads the rows. SQL Server will permit this if not told otherwise. Affected applications should be designed to handle nonrepeatable reads. Timestamp, datetime, and smalldatetime data types can be used, the latter two in conjunction with triggers.

Phantoms are instances where a transaction reads rows satisfying a search condition, another transaction inserts and commits rows that satisfy the search condition, and the first transaction rereads using the same search condition and gets a different set of rows. SQL Server permits this if not instructed otherwise.

SQL Server follows the ANSI specifications for isolation and allows four distinct levels. They are implemented using locks of increasing scope and duration.

Transaction Isolation Level Allow Dirty reads Allow Nonrepeatable reads Allow Phantoms
READ UNCOMMITTED Yes Yes Yes
READ COMMITTED No Yes Yes
REPEATABLE READ No No 7Yes/6No
SERIALIZABLE No No No

The levels of interest are READ COMMITTED, which is SQL Server's default level, and SERIALIZABLE, which can turn a multi-user system into a single-user system for the duration of the transaction.

There are three ways to control the isolation level: T-SQL, database API calls, and locking hints.

/* Either of the following is set prior to beginning
** the transaction and remains in effect until reset or the
** connection is closed.
*/

-- T-SQL
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE

-- ADO
conC.IsolationLevel = adXactSerializable

/*
** Setting a locking hint gives you more
** control than a connection-level setting.
** It also overrides a connection-level
** setting. See Locking Hints in BOL for
** details.
*/

select *
from swynk_table with (HOLDLOCK)

Durable

Durability guarantees that a committed transaction will be permanently written to the database, even if the system fails during the process. It also guarantees that an uncommitted transaction will be completely rolled back. This is the theory. In practice, there are a few things to note, and some gotchas.

SQL Server allows you to restore to a specific point in time, assuming you have backups. You can restore the database to the state it was in before you inadvertently committed the deletion of a million rows. There are several products for SQL 6.5, and at least one for 7.0 and 2000, that go beyond SQL Server's recovery capability. Nothing is permanent.

Your hardware--particularly the disk subsystem--and file system settings can break durability. Concerning hardware, keep in mind the old motorcycle helmet advertisement--"If you have a $20 head, buy a $20 helmet." The system, especially the disk controller(s), should be outfitted for a database server. Write-caching controllers are problematic because they lie to SQL Server that data has been written to the disks before it physically has. If the system fails, you're relying on the controller to pick up exactly where it left off. See Q234656.

Use NTFS for the file system but don't enable NTFS compression. Compression isn't supported, kills performance, and may corrupt your data. See Q231347.








The Network for Technology Professionals

Search:

About Internet.com

Legal Notices, Licensing, Permissions, Privacy Policy.
Advertise | Newsletters | E-mail Offers