Have you ever wondered, especially in the case of a data warehousing scenario, why the transaction log file grows bigger and bigger and sometimes even much bigger than your actual database’s data files? What caused it to happen? How do you control it? How does the recovery model of a database control the growing size of the transaction log? These are some of the questions I am going to explain to you in this article.
Transaction logging is the internal mechanism of SQL Server that keeps logging all transactions and the database modifications that are made by each transaction as a string of log records in a serial sequence as they are created. This is important to bring the database back to a consistent state if there is a system failure. But what about the growth of the transaction log, how is its size controlled? You can learn more about Transaction Log and its architecture here.
Recovery Model is one of the mechanisms which controls and manages the growth of the transaction log file. Recovery Model controls how transactions are logged, whether there is automatic log truncation, whether the transaction log requires and/or allows backing up the transaction log, and what kind of restore operations are available.
Every database in SQL Server has a property called Recovery Model, which could have either of Simple, Bulk-logged and Full value based on your different needs for performance, storage space, and protection against data loss. You need to evaluate the trade-off between performance of your bulk operations (index creation or bulk loads), storage space needed for storage of the transaction log, the possibility of data loss (loss of committed transactions) and simplicity in backup and restore operations. A database can be switched to another recovery model at any time in order to meet the changing business needs but before doing that, please, evaluate its impact.
Depending on your need you might need to use more than one Recovery Model. For example consider you have a mission-critical OnLine Transaction Processing (OLTP) database, so in order to be able to restore to any point in time (so that no committed transactions are lost), you should use Full recovery model but when you are performing some bulk operations (Index Creation, SELECT INTO, INSERT SELECT, BCP, BULK INSERT) you can switch to Bulk-logged recovery model for minimal logging and after completion switch back to Full recovery model. This reduces the chances of filling up the available transaction log space during bulk operations.
Full Recovery Model
As its name implies, Full recovery model logs every transaction and maintains it there until a transaction log backup is taken (or full backup is taken, which includes everything). With this recovery model, you can devise a disaster recovery plan that includes a combination of full backup (or/and differential backup) and transaction log backups. To control the size of the transaction log, you need to take a transaction log backup so that it gets truncated.
With Full recovery model, you can recover to an arbitrary point in time (for example, prior to application or user error) from transaction log backups and hence no work is lost due to lost or damaged data files.
Advantage – Full recovery model provides complete protection against data loss. In the unfortunate case of disaster or application\user error, you can restore to the point-in-time by using the available transaction log backups (assuming your transaction backups are complete up to that point in time).
Disadvantage – With full recovery model, you need to setup a regular transaction log backup to ensure the growth of transaction log files are under control otherwise it will keep on growing until your next full backup. Also, if the transaction log is damaged, changes since the most recent transaction log backup must be redone.
When to use it – Full recovery model is recommended for OLTP databases, where you have mostly short lived transactions and you don’t want to lose data for a committed transaction. There are certain other features – like AlwaysOn, Database mirroring, Log shipping, Transaction replication, Change data capture – in SQL Server that require Full recovery model when you are using them.
Bulk-logged Recovery Model
Bulk-logged recovery model is similar to Full recovery model with the exception that bulk data modification operations (Index Creation, SELECT INTO, INSERT SELECT, BCP, BULK INSERT) are minimally logged in this case and hence it reduces the performance impact but at the same time, you might not be able to do point-in-time restore. As a recommended practice, Bulk-logged recovery model is used with full recovery model, i.e. you should generally have Full recovery model for normal operations and switch to Bulk-logged recovery model temporarily when you are starting occasional bulk operations. Finally at completion of bulk operation, reverse back to Full recovery model. It’s also recommended to take a transaction log backup after switching back to Full recovery model if point-in-time recovery is important.
Like Full recovery model, the transaction log file will keep on growing and hence you need to take transaction log backups frequently. If there are no bulk operations, Bulk-logged is the same as Full recovery model and you can recover to the point-in-time as the transaction log contains full sequential records of all the changes made to the database.
Advantage – Allows better performance for bulk data operations by only doing minimal logging for these transactions and not letting the transaction log grow significantly because of these bulk data operations (lowering log space consumption for bulk operations).
Disadvantage – There is a possibility of data loss if the log is damaged or bulk-logged operations occurred since the most recent log backup and hence changes since that last backup must be redone.
When to use it – Its recommended to switch to Bulk-logged recovery model before starting any occasional bulk operations and then reverse back to Full recovery model after completion of bulk operations. This way you can still restore to point-in-time (as long as your last transaction log backup does not include a bulk operation) and can have bulk operations logged minimally.
Note – Minimal logging means logging only information needed to recover the transaction without supporting point-in-time recovery. With minimal logging, the transaction log keeps track of pages that were changed by bulk operations based on Bulk Changed Map (MCP) page instead of logging each individual change (older value or new value). This way the transaction log remains smaller but when you take a backup of the transaction log, it includes all the changed pages and hence even though the transaction log remains smaller, the transaction log backup might be much bigger than this.
Simple Recovery Model
Simple recovery model is the simplest of all. It maintains only a minimum amount of information in the SQL Server transaction log file. SQL Server, on its own, truncates the transaction log files (excluding logs from any open transactions) and removes the information related to transactions which have reached transaction checkpoints (data has been written to the data file) so that the space can be reused, leaving no transaction log entries for disaster recovery purposes. Having said that, with Simple recovery model, the data is recoverable only to the most recent full database or differential backups (no transaction log backups are supported). Under the Simple recovery model, transaction log truncation happens after a checkpoint or as soon as you change the recovery model of your database to Simple recovery model.
Managing databases with Simple recovery model is much easier but it comes at the expense of higher data loss exposure if a data file is damaged. You can only restore from the latest full\differential backups; this means you will automatically lose any data modifications made between the time of the latest full/differential backup and the time of the failure. Hence, if you use Simple recovery model for your database, you should keep the backup interval long enough to keep the backup overhead from affecting production work and at the same time short enough to prevent a significant amount of data loss.
Advantage – Manageability with Simple recovery model is much easier–no need to take transactional backups. It reclaims transaction log spaces from check-pointed transactions to ensure the growth of transaction log files are under control. Bulk-logged operations perform much better because of minimal logging and as minimal transaction log space is used.
Disadvantage – You will lose any data modifications made between the time of the latest full/differential backup and the time of the failure when you want to restore.
When to use it – In a data warehousing scenario, where you mostly have bulk operations while data loading and in case of failure, the data can be regenerated from the data source. You can also prefer using Simple recovery model in your development or test environment to ensure the growth of the transaction log files are controlled.
Changing Recovery Model with T-SQL
You can use ALTER DATABASE command with the SET RECOVERY option to change the recovery model of a database. For example, the query below changes the recovery model of the AdventureWorks database to Full recovery model.
ALTER DATABASE AdventureWorks SET RECOVERY FULL ;
You can query the sys.databases catalog view to verify the recovery model of the database as shown below:
SELECT name, recovery_model, recovery_model_desc FROM sys.databases WHERE name = 'AdventureWorks' ;
You can use the command below to change the recovery model to Bulk-logged or Simple, just replace your database name in place of AdventureWorks:
--Changing recovery model to Bulk-logged ALTER DATABASE AdventureWorks SET RECOVERY BULK_LOGGED ; --Changing recovery model to Simple ALTER DATABASE AdventureWorks SET RECOVERY SIMPLE ;
When you create a new database, it inherits the recovery model from the model database, which is by default Full recovery model. To change the default recovery model, you can use the ALTER DATABASE statement, as mentioned above, to change the recovery model of the model database.
Please note, if you intend to maintain a sequence of transaction log backups, you cannot switch to or from Simple recovery model.
Changing Recovery Model with SQL Server Management Studio (SSMS)
You can also change the recovery model for a database in SQL Server Management Studio. In the Object Explorer, right click on your database under the Databases node and then click on Properties. In the Database Properties dialog box, click on the Options tab and then change the recovery model as shown below:
In this article we discussed Recovery Model, which is one of the mechanisms that controls and manages the growth of the transaction log file. I discussed different types of Recovery Model, with its advantages, disadvantages and when to choose them over others. Then I talked about changing recovery model for a database either using T-SQL or SQL Server Management Studio.