SqlCredit, Part 18: Exploring the Performance of SQL 2005's OUTPUT Clause
August 21, 2008
Download the files for this article.
This is part 18 of a series. If you have not read part 17, you can find it here. That article is a good introduction to SQL Server 2005s OUTPUT clause.
What Were Testing
Part 17 introduced SQL Server 2005s OUTPUT clause. Here, we will look at multiple ways to do the same thing. The scenario is that you have a pre-loaded staging table, and you need to load a destination table and a change-tracking history table. If there are n records in the staging table, you will load n records to the destination table and n records to the history table.
We will test multiple ways to accomplish this, two of which use the OUTPUT clause.
For this test, weve made two main changes to the basic design from Part 17: added more columns and changed all tables to regular tables (not variables). There are now 15 bit columns. The only table that remains a table variable is the OUTPUT table variable (@insertedSet). Weve also created two versions (Base and Trig) of the destination table so that the trigger added to DestinationTableTrig will not affect DestinationTableBase. The history tables also have two versions (Base and Trig) to allow a foreign key from the history table to the destination table.
CREATE TABLE dbo.StagingTable ( ColID int NOT NULL IDENTITY, X16384 bit NOT NULL, ... X00001 bit NOT NULL CONSTRAINT PK_StagingTable PRIMARY KEY CLUSTERED (ColID) ) CREATE DestinationTable[Base|Trig] TABLE ( ColID int NOT NULL IDENTITY, X16384 bit NOT NULL, ... X00001 bit NOT NULL, ComputedNum AS ( (X16384 * 16384) + (X00001 * 1) ), CreateDate datetime NOT NULL DEFAULT GETDATE(), ColGUID uniqueidentifier NOT NULL DEFAULT NEWSEQUENTIALID(), CONSTRAINT PK_DestinationTable[Base|Trig] PRIMARY KEY CLUSTERED (ColID) ) CREATE TABLE dbo.HistoryTable[Base|Trig] ( HistoryID int NOT NULL IDENTITY(-1, -1), SourceID int NOT NULL, ComputedNum int NOT NULL, SourceCreateDate datetime NOT NULL, SourceGUID uniqueidentifier NOT NULL, HistoryCreateDate datetime NOT NULL DEFAULT GETDATE() CONSTRAINT PK_HistoryTable[Base|Trig] PRIMARY KEY CLUSTERED (SourceCreateDate, HistoryID) ) GO ALTER TABLE ... ADD CONSTRAINT ... FOREIGN KEY(SourceID) REFERENCES dbo.DestinationTable[Base|Trig] (ColID) GO
Set-up: Populating the Staging Table
Before starting any of the tests, we create the database, create the tables, create the trigger, create the stored procedures, and populate the staging table with n records.
Test 1a: Set-Based INSERTs Using OUTPUT (no TRAN)
For the first test, we use code based on Part 17s OUTPUT code. Here is the basic code:
INSERT INTO dbo.DestinationTableBase ... OUTPUT ... INTO <tableVariable> SELECT ... FROM dbo.StagingTable INSERT INTO dbo.HistoryTableBase ... SELECT ... FROM <tableVariable>
This inserts the whole set of records from the staging table within an implicit transaction and then writes the history table records from the OUTPUT table variable.
Test 1b: Set-Based INSERTs Using OUTPUT (TRAN)
Test 1a but with a BEGIN TRAN/COMMIT wrapping the two INSERT statements.
Test 2: Set-Based INSERTs Using an INSERT Trigger
For the second test, we add an INSERT/UPDATE trigger to the destination table (DestinationTableTrig). This simple trigger inserts into the history table based on the data inserted into the destination table. The guts of the trigger look like this:
INSERT INTO HistoryTableTrig ... SELECT ... FROM Inserted
The INSERT code is then quite simple.
INSERT INTO dbo.DestinationTableTrig ... SELECT ... FROM dbo.StagingTable
Test 3: Iterative INSERTs Using Straight SQL
For the third test, were using very 100-level SQL. A real beginner would probably bring up BOL and struggle through running this in a cursor, but using a counter and a WHILE loop is so much less painful. Heres the basic plan:
WHILE @ct < <StagingTable's recordCount> BEGIN INSERT INTO dbo.DestinationTableBase ... SELECT ... FROM dbo.StagingTable WHERE ColID = @ct SELECT @thisID = SCOPE_IDENTITY() INSERT INTO dbo.HistoryTableBase ... SELECT ... FROM dbo.DestinationTableBase WHERE ColID = @thisID END
Test 4: Iterative INSERTs Using an INSERT Trigger
The fourth test uses the trigger but iterates through the table one record at a time. The INSERT code is again fairly simple.
WHILE @ct < <StagingTable's recordCount> BEGIN INSERT INTO dbo.DestinationTableTrig ... SELECT ... FROM dbo.StagingTable WHERE ColID = @ct END
Test 5a: Set-Based INSERTs Using OUTPUT in Stored Procedure (no TRAN)
The fifth test uses code similar to test 1, but inside of a stored procedure.
Test 5b: Set-Based INSERTs Using OUTPUT in Stored Procedure (TRAN)
Test 5a but with a BEGIN TRAN/COMMIT wrapping the two INSERT statements inside the stored procedure.
The tests were run with differing record counts in the staging table. The time to run each test was recorded.
* N: Multiple statements without an explicit transaction. Y: Multiple statements with an explicit transaction. I: Single statement with an implicit transaction.
Note that I ran the whole test script twice each time and only recorded the second set of timings.
Clear the Table Variable?
One unexpected bit I learned during this test is illustrated with this simple test.
DECLARE @t1 TABLE (ColID tinyint IDENTITY, Col1 varchar(11)) DECLARE @t1inserted TABLE (ColID tinyint, Col1 varchar(11)) -- Insert 'Hello World' once INSERT INTO @t1 OUTPUT Inserted.ColID, Inserted.Col1 INTO @t1inserted VALUES ('Hello World') -- Clear the "base" table DELETE FROM @t1 -- Insert 'Hello World' once INSERT INTO @t1 OUTPUT Inserted.ColID, Inserted.Col1 INTO @t1inserted VALUES ('Hello World') SELECT * FROM @t1 SELECT * FROM @t1inserted
Contents of @t1:
ColID Col1 ----- ----------- 2 Hello World
Contents of @t1inserted (the table variable):
ColID Col1 ----- ----------- 1 Hello World 2 Hello World
Notice that the table variable retains its records from one call to the next. It makes perfect sense once you think about it, and it is the correct behavior, but you have to be aware of it when using the OUTPUT clause. You may need to clear the table variable at times to make sure it is empty before doing the next INSERT INTO ... OUTPUT ... INTO <tableVariable> call. I had to do that between calls in the test script.
The code for this simple test is in the attached HelloWorld_InsertMultiple.sql file.
The OUTPUT clause is very useful for pulling automatic values out of an INSERT statement (or UPDATE). What I was looking at here was how it performs compared to other options.
To no ones surprise, the set-based solutions are far superior to the record-at-a-time solutions. As more and more records are involved, the record-at-a-time solutions get even worse.
It is important to see that wrapping the creation of the base records and the history records is faster using an explicit transaction (whether coded directly or wrapped in a stored procedure). It is also important to note that none of these solutions is as fast as a set-based trigger.
Many SQL Server developers believe that triggers are generally evil, and I tend to fall into that camp. There are two places where I think triggers are appropriate. One is the creation of history records, which is the exact case we have here. The other is for directly populating tables that are used to populate a data warehouse.
So, after building and optimizing the OUTPUT clause, the best-performing option for this particular scenario is a trigger. Whether you think its evil or not, it does perform well.
I may have missed some optimization here. Do you see something that would make the OUTPUT clause work as fast as the trigger? If so, please drop a comment in the forum
Download the files for this article.SqlCredit - Developing a Complete SQL Server OLTP Database Project
Performance Testing SQL 2008's Transparent Data Encryption
SQL Server 2008's Change Data Capture - Tracking the Moving Parts
Performance Testing - SQL Server 2008 versus SQL Server 2005
Exploring SQL Server's Index INCLUDEs
Exploring SQL 2005's Ranking Functions - NTILE() and ROW_NUMBER()
SqlCredit - Part 19: Exploring SQL 2005's Ranking Functions - RANK() and DENSE_RANK()
SqlCredit, Part 18: Exploring the Performance of SQL 2005's OUTPUT Clause
SqlCredit - Part 17: Exploring SQL 2005's OUTPUT Clause
SqlCredit - Part 16: The Cost of Bloat
SqlCredit - Part 15: The Cost of Distribution
SqlCredit - Part 14: The Cost of Translation
SqlCredit - Part 13: More on Indexed Persisted Computed Columns
SqlCredit - Part 12: Exploring Indexed Persisted Computed Columns
SqlCredit - Part 11: Change Tracking Using History Records
SqlCredit - Part 10: MAC Performance and Updating SqlCredit
SqlCredit - Part 9: Message Authentication Codes
SqlCredit Part 8: Comparing Encrypt/DecryptByCert and Encrypt/DecryptByKey
SqlCredit Part 7: Performance Impact of EncryptByCert and DecryptByCert
SqlCredit Part 6: Exploring EncryptByCert and DecryptByCert
SqlCredit - Part 5: Adding Card, Vendor, and Purchase, Plus Much Refactoring
SqlCredit - Part 4: Schema and Procedure Security
SqlCredit - Part 3: Schema/Proc Updates and Automated Unit Testing
SqlCredit - Part 2: Creating the Database, Tables, CRUD Procedures
SqlCredit - Developing a Complete SQL Server OLTP Database Project