Checkpoint process: from SQL Server 6.5 to 7.0

A checkpoint occurs each time the system needs to write
chunks of data from the data cache to the disk. You can read in documentation that 7.0 checkpoints are less intrusive than the 6.5 one. How come?

A frequency question

The frequency of this process is handled by the value of the recovery interval option. In 6.5, a default value of 5 meant that in case of a failure, the system would take no more than five minutes per database to recover, so the system was writing modified pages in memory (called dirty pages) every 30,000 transactions. In 7.0, a default value of 0 means that SQL Server handles the checkpoint frequency so that the system will recover in less than 1 minute, in case of a failure.
If course you could think, that is equivalent to set the value (in 6.5) of recovery interval to 1. Not exactly, because the checkpoint mechanism has changed a little bit.

A process question

Each time a query needs to access a data page, it reads it from disk if it’s not in memory. If the query modifies data in that page, it becomes dirty and it is desynchronized from the same page on disk. It is generally written back to disk during the checkpoint process, is then clean and remains clean in memory.
In 6.5 every dirty page was written to disk during a checkpoint. So if a lot of pages had been modified since the last checkpoint, the actual checkpoint could last quite a long time.
In 7.0, a dirty page is written to the disk if it was dirty during the previous checkpoint. That means there are less pages written during each 7.0 checkpoint than during 6.5 checkpoints. Let’s take an example: if page number 25 becomes dirty, it will not be written to the disk by the next checkpoint, but by the checkpoint after the next one.
The positive effect of that strategy is the length of the checkpoint: shorter than in 6.5. The side effect of it is the recovery time that could be longer. But in fact, choosing to do more frequent but shorter checkpoint is a better strategy that doing less frequent but longer checkpoints.

Wrap-up

In 6.5 the checkpoint process was writing every dirty page to the disk. In 7.0, it writes only the pages that were already dirty at the last checkpoint; those that are dirty but were clean at the last checkpoint will be written by the next checkpoint.
Do not modify the recovery interval value unless you benchmark your modification and conclude a fixed value is better. But remember is always have a side effect on recovery time.
Last but not least, the max async io option has a direct effect on checkpoint performance. With the right value, you minimize the checkpoint time.

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles