Understanding Transaction Logs
In my last article I gave a brief
description of what the transaction log is. However, we did
not go over what a transaction is. The simplest definition
of a transaction is a "logical unit of work." In other
words, it is a grouping of SQL statements that perform one
logical task. The classic example that is used most of the
time to describe transactions is that of a bank transfer.
Say you have two bank accounts (Account A and Account B) and
you want to transfer funds from one to the other. If you
execute the two SQL statements individually (one to remove
the funds from Account A and one to add funds to Account
B), it is possible to remove funds from Account A
successfully but have the second statement fail and not add
the funds to Account B. In the real world this would make
for some very unhappy customers. To overcome this problem
we can group statements into transactions. When statements
are grouped into transactions, either they all complete
successfully and are "committed," or if any one of the
statements fails they are all "rolled back" and the data is
left in the condition it was in before the transaction
began. Until we get into data integrity, this definition
will do.
The transaction log stores all the
changes as they are made to the database. Once all the
statements in the transaction have made their modifications
to the database, the transaction is said to be committed
once the commit is recorded in the transaction log. Once the
transaction is committed, why does it need to stay in the
log? To understand, you need to know what is going on under
the hood of SQL Server.
As data is requested from a database, the information is
stored in memory. Because accessing memory is so much
faster than accessing the disk subsystem, SQL Server can
improve performance dramatically.
First request for data from a database (request is made,
pages are loaded into memory, data is sent to the client):
Request
for data that is already in memory (pages are cashed in
memory, so they are used):
When you make a modification to the data in a database, SQL Server
makes the changes to the data stored in memory (not on the
disk) and an entry is made in the transaction log. Pages in memory that
contain data that has been changed are called "dirty
pages."
In order
to have pages written back to disk and release memory, SQL
Server uses a thread called the lazywriter. When the
lazywriter writes pages to
disk, it uses the least recently used (LRU) page list to
determine what pages should be written to disk. Pages that
have not been used recently are at the beginning of the LRU
list, however recently modified pages are at the end of the
LRU list. Because of this, very heavily accessed pages may
not be written to disk for a large amount of time (or
never, in theory).
Page 4: Transaction Logs (Continued)
»
See All Articles by Columnist Michael Aubert