This presentation started with a discussion of why one needs locks.
I will not summarize this in detail, [If anyone is interested in an article about transactions and locking in the abstract, I might consider writing one.] but will simply state that there are four problems that need to be prevented:
- Dirty read: one transaction sees another’s uncommitted changes
- Lost update: one transaction overlays another’s update
- Non-repeatable reads: one transaction sees a different value when it revisits a record
- Phantoms: insertions or deletions cause records to appear or disappear; prevention requires locking of a predicate (WHERE clause).
ANSI has defined four "isolation levels" which prevent some or all of these problems (or phenomena):
| Isolation Level  | Phenomena allowed                         |
|------------------|-------------------------------------------|
| Read Uncommitted | Dirty read, non-repeatable read, phantoms |
| Read Committed   | Non-repeatable read, phantoms             |
| Repeatable Read  | Phantoms                                  |
| Serializable     | None                                      |
In SQL Server 6.5, repeatable read and serializable were treated as if they were the same, although (according to the presenter) 6.5 “faked out” serializable. In 7.0, SQL Server correctly implements all ANSI-defined isolation levels.
It is important to note that true isolation is expensive in terms of concurrency. ANSI has specified the four levels so that an application can choose between the amount of concurrency and the amount of correctness. None of the ANSI levels will permit a lost update.
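As a sketch of how an application chooses its trade-off (the table and column names here are illustrative, not from the presentation):

```sql
-- The isolation level is set per connection; READ COMMITTED is the default.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN
SELECT balance FROM accounts WHERE id = 42
-- Under REPEATABLE READ the S locks are held until commit, so this re-read
-- sees the same value; newly inserted rows (phantoms) are still possible.
SELECT balance FROM accounts WHERE id = 42
COMMIT TRAN
```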
In the following discussion, "X" stands for an exclusive, or write, lock. "S" stands for a share, or read, lock. In general, many people can have S locks on the same resource. Only one person can have an X lock on a resource, and no one can have an S lock on a resource where there is an X lock. No one can get an X lock when there is an S lock on a resource (with the possible exception of the holder of an S lock who wants to upgrade the lock to an exclusive lock). This is the way locks work in all sorts of systems; it is not just in SQL Server.
SQL Server 7 uses what is called "Multi-Granular Locking," which has the goal of providing the concurrency of row-level locking with the performance of higher-level locks. It can use row-level locking for OLTP inserts and page locks (or table locks) for DSS retrieval. Because there can be a mix of lock types in the same transaction, problems would be possible. For example, if Transaction 1 were to lock a table in exclusive mode, and Transaction 2 were to lock a single row in the table in exclusive mode, both would think they had the ability to update the same row, and the result would be a lost update.

What SQL Server 7 does to avoid this is to place intent locks at higher levels of granularity when a row is locked. For example, if it places an X lock on a row of a table, it must place an intent exclusive (IX) lock on the page that contains the row and on the table that contains the page. These locks tell the rest of the world "something beneath me is locked." IX locks are compatible with S locks, so readers are not locked out as they would be with an X lock. For example, consider the following diagram:
(new graphic coming…)
Because the IX and IS locks are compatible, both transactions can proceed. There are 6 standard multi-granular lock modes:
| Mode | Description                                            |
|------|--------------------------------------------------------|
| S    | Share (read) lock                                      |
| X    | Exclusive (write) lock                                 |
| U    | Update; used to evaluate prior to writing              |
| IS   | Intent share; there is a share lock at a lower level   |
| IX   | Intent exclusive; there is an X lock at a lower level  |
| SIX  | Share with intent exclusive                            |
[I am not completely sure of the function of the Share with Intent Exclusive lock yet.]
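One way to see the intent locks in action (a sketch; the pubs sample table is used purely for illustration) is to update a single row inside a transaction and then look at the locks the connection holds:

```sql
BEGIN TRAN
UPDATE authors SET phone = '408 496-7223' WHERE au_id = '172-32-1176'
-- Expect an X lock on the row, an IX lock on the page containing it,
-- and an IX lock on the authors table.
EXEC sp_lock @@SPID
ROLLBACK TRAN
```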
The Update lock has been implemented so that it avoids the conversion deadlock frequently experienced when attempting to convert an S lock to an X lock in SQL Server 6.5.
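A sketch of the pattern the U lock supports (hypothetical table): read with an update lock up front instead of taking an S lock and converting it later. Two transactions can share S locks on the same row and then deadlock when each tries to convert to X; only one transaction at a time can hold a U lock, so the second simply waits.

```sql
BEGIN TRAN
-- The U lock blocks other would-be writers but not plain readers.
SELECT balance FROM accounts (UPDLOCK) WHERE id = 42
UPDATE accounts SET balance = balance - 100 WHERE id = 42
COMMIT TRAN
```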
Locks in SQL Server 7 can be applied to
- Tables
- Pages
- Rows
- Key Ranges
The internal format of a lock is:

| Resource type | Database ID | Resource-specific data |

Samples:

| Resource  | Type | DB | Resource-specific data                           |
|-----------|------|----|--------------------------------------------------|
| Table     | 5    | 12 | 325658 (objectid)                                |
| Page      | 6    | 12 | 2:328 (file#:page#)                              |
| Row       | 9    | 12 | 2:328:11 (file#:page#:slot on page)              |
| Key Range | 7    | 12 | 325658:2:2341186 (objectid:indexid:6-byte hash)  |
It is important to note that SQL Server 7 uses a fixed row identifier (RID) which does not change unless the row is deleted and reinserted somewhere else. A RID is an 8-byte combination of file#, page# and slot#. RIDs are reused, and they are used as a lock resource. In a clustered index, rows are identified by their unique clustering key, and that is used as the lock resource instead. [Note that if you have a non-unique clustered index, SQL Server adds a "uniquifier".] If there is a non-clustered (secondary) index on a table with a clustered index, the clustering key is used in the non-clustered index rather than the RID; if there is no clustered index, the RID is used. Whichever identifies the row is what will be used as the lock resource.
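To illustrate (a hypothetical table): with a clustered index, the clustering key rather than the RID identifies rows everywhere, including in non-clustered indexes, and so serves as the row-level lock resource.

```sql
CREATE TABLE employees (
    emp_id int         NOT NULL,
    name   varchar(40) NOT NULL,
    salary money       NOT NULL
)
-- Rows are now identified by emp_id rather than by a RID...
CREATE UNIQUE CLUSTERED INDEX ci_emp ON employees (emp_id)
-- ...and this non-clustered index carries emp_id as its row locator.
CREATE NONCLUSTERED INDEX nc_name ON employees (name)
```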
Key-range locks are necessary to guarantee serializable operations and require additional work. All of the components of the key (user-defined, clustering, uniquifier) are hashed to generate a 6-byte field. Collisions are rare but possible; if one occurs, locks will be escalated to exclusive in order to guarantee serializability. Key-range locking guarantees that if the WHERE clause specifies "salary between $30000 and $50000", subsequent retrievals (within a transaction) will always return the exact same set of rows (no phantoms). There are some composite lock modes that include both a range and a row component for key-range locks.
| Range | Row | Mode | Description             |
|-------|-----|------|-------------------------|
| —     | S   | S    | Share row lock          |
| IS    | S   | IS-S | Serializable range scan |
The same resources are locked for a range lock as for row locks, but the lock manager compatibility tables are different. The range part prevents inserts into or deletions from the range; the row part prevents modification of the rows.
[One question that was not answered to my satisfaction is whether or not there is full serializability on non-indexed data.]
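The salary example above might look like this as a sketch (assuming an index on salary in a hypothetical employees table):

```sql
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
SELECT * FROM employees WHERE salary BETWEEN $30000 AND $50000
-- Key-range locks on the salary index keep the range stable: other
-- connections cannot insert, delete, or move rows into or out of it,
-- so this second scan returns exactly the same set of rows.
SELECT * FROM employees WHERE salary BETWEEN $30000 AND $50000
COMMIT TRAN
```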
Scan locking varies depending on whether or not repeatable reads are required. In the default (read committed) level of isolation, share locks are released once the page is no longer needed, just as they are in SQL Server 6.5; a rescan of the same data may find new rows, modified rows or deleted rows. With repeatable reads, share locks are retained until the transaction reaches its commit point; a rescan may see new rows, but rows cannot be deleted or modified. When a serializable level of isolation is required, locks are retained on scanned rows and scanned ranges, so no changes, deletions or insertions can occur.
SQL Server 7.0 includes dynamic locking. Row-level locking is not always the right choice: the smaller the lock granularity, the higher the overhead. For example, placing row-level locks while scanning a million-row table would mean one million calls to the lock manager. So sometimes SQL Server will determine that page- or table-level locking is the optimal way to scan. The locking strategy is chosen at run time (as part of query plan development) based on input from the query processor. The query processor will default to row-level locking for individual row modifications and inserts. When scanning large volumes of data, it will tend to take page or table locks. This is done automatically, but you will have control over the default locking granularity for a table.
It can also use different granularities for different parts of a query. For example, while scanning an index (contiguous compact range of pages) it can use page locks. Then when it has the row-locator, it can use row locks.
Special access methods for index modifications (invisible to the user) have been put into place so that page splits are not expensive in terms of the locks they hold. This has been done while ensuring that there is still consistency. They are using a "system transaction" to ensure this, which is not the same as the transactions that users define.
There are new Locking Hints:
Granularity Hints: ROWLOCK, PAGLOCK, TABLOCK
Isolation level hints: READCOMMITTED, REPEATABLEREAD, SERIALIZABLE, READUNCOMMITTED (HOLDLOCK and NOLOCK are supported for compatibility)
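A sketch of the hint syntax (the table name is made up):

```sql
-- Take one table lock instead of many row locks for a large scan:
SELECT COUNT(*) FROM big_table (TABLOCK)
-- Force page-level granularity while keeping the default isolation level:
SELECT * FROM big_table (PAGLOCK, READCOMMITTED)
```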
There is also a hint named READPAST, which they describe as useful for implementing work queues. This will "skip" rows that are currently locked. For example, if you have n processes picking up work from an input queue, each can get their own item using the following statement:
SELECT TOP 1 * FROM work_queue (ROWLOCK, READPAST, UPDLOCK) ORDER BY entry
The worker can then delete the row and commit, or roll back and let the next process try to get the entry.
Probably the best thing of all is that there is now a user configurable lock timeout. You can specify a timeout in milliseconds or choose not to wait at all.
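The timeout is per connection; a sketch of the settings:

```sql
SET LOCK_TIMEOUT 5000   -- wait at most 5000 ms for any lock
SET LOCK_TIMEOUT 0      -- do not wait at all; fail immediately on a conflict
SET LOCK_TIMEOUT -1     -- the default: wait indefinitely
SELECT @@LOCK_TIMEOUT   -- returns the current setting
```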
There are enhanced versions of sp_who and sp_lock, and a command called sp_blockinglocks which shows only blocking locks. There is still a trace flag that you can use to study deadlocks (-T1204). [They showed it on the command line, but I assume it can be issued interactively as well.] SQL Profiler understands locking and will provide information about lock acquisition, release, waits, etc., and can be used to capture deadlock events. There is a new table called syslockinfo which has much more information than the old syslocks. [Interestingly, this presenter encouraged querying this table, as opposed to the other presenter, who discouraged it! Oh well, I’m sure it’s also partly a matter of the presenter’s mindset.]