Rapid Database Recovery with Differential Backups
June 23, 2000
Differential backups are a new feature in SQL 7 that can speed up your recovery times when restoring from database and transaction log backups.
What is a differential backup?
A differential backup is a record of all the pages that have changed since the last full database dump. They differ from Transaction Log backups in that they always go back as far as the last full dump, so if you take two differential backups one after the other, the second contains everything in the first, plus all subsequent changes.
If you did two successive transaction log backups, the second backup would only contain changes to the database made after you took the first transaction log backup.
Creating more efficient backup strategies
The simplest way (and prior to SQL 7 the only way) to ensure you were able to recover your database to a specific point in time would be to take regular transaction log dumps, and to restore the database dump and all subsequent transaction logs up to the desired recovery point.
In this example we take a full backup at midnight, and a transaction log backup every hour. To recover the database to it's status at 9:30 we have to first restore the full database dump, then all the transaction log dumps from 1 am to 10 am, specifying "stop at 9:30" for the last (10 am) restore.
This technique works just fine, but we have just had to restore a total of 11 backup files. We can do better than this if we use differential backups. In this next example we add in a differential backup at 8 am. Now to restore our database to it's state of play as at 9:30, we only have to restore 4 files, the full database dump, the differential, and then the transaction log backups from 9 am and 10 am.
We can extend this into the day by making further differential dumps--you only need to restore the original database dump, the most recent prior differential, and the subsequent transaction log dumps, to get back to any point in time.
Let's take the example above and extend it later into the day. Here we take a second differential backup at 5 pm, and we also continue to take out hourly transaction log dumps. To restore the database to 5:30, all we need to restore is the original database dump, the second differential dump (remember, it includes the contents of the previous differential) and the 6 pm transaction log dump with a "stop at" parameter of 5:30.
In the example above we have only had to restore from 3 files. Without differential dumps we would have had to restore every transaction log from 1 am to 6 pm - a total of 19 files when you include the full database dump.
Timing is everything
I chose 8 am and 5 pm for the differential backups in the above example because it represents more-or-less the boundaries of the normal working day. Obviously timings for backups should be dictated by your own environment.
Remember--you still need all your transaction log dumps
Differential backups do not give you the ability to stop at a specific time, so in the above examples if we wanted to restore the database to 4 am, you need to restore the database dump and the subsequent transaction logs up until the 4 am one. The 5 am differential backup contains the data we need, but because it's internal structure is different to that of a transaction log backup, it does not know how to stop at a given point in time.