Download the files for this article.
This is part 18 of a series. If you have not read part 17, you can find it here. That article is a good introduction to SQL Server 2005’s OUTPUT clause.
What We’re Testing
Part 17 introduced SQL Server 2005’s OUTPUT clause. Here, we will look at multiple ways to do the same thing. The scenario is that you have a pre-loaded staging table, and you need to load a destination table and a change-tracking “history” table. If there are n records in the staging table, you will load n records to the destination table and n records to the history table.
We will test multiple ways to accomplish this, two of which use the OUTPUT clause.
The Tables
For this test, we’ve made two main changes to the basic design from Part 17: added more columns and changed all tables to regular tables (not variables). There are now 15 bit columns. The only table that remains a table variable is the OUTPUT table variable (@insertedSet). We’ve also created two versions (“Base” and “Trig”) of the destination table so that the trigger added to DestinationTableTrig will not affect DestinationTableBase. The history tables also have two versions (“Base” and “Trig”) to allow a foreign key from the history table to the destination table.
CREATE TABLE dbo.StagingTable ( ColID int NOT NULL IDENTITY, X16384 bit NOT NULL, ... X00001 bit NOT NULL CONSTRAINT PK_StagingTable PRIMARY KEY CLUSTERED (ColID) ) CREATE DestinationTable[Base|Trig] TABLE ( ColID int NOT NULL IDENTITY, X16384 bit NOT NULL, ... X00001 bit NOT NULL, ComputedNum AS ( (X16384 * 16384) + (X00001 * 1) ), CreateDate datetime NOT NULL DEFAULT GETDATE(), ColGUID uniqueidentifier NOT NULL DEFAULT NEWSEQUENTIALID(), CONSTRAINT PK_DestinationTable[Base|Trig] PRIMARY KEY CLUSTERED (ColID) ) CREATE TABLE dbo.HistoryTable[Base|Trig] ( HistoryID int NOT NULL IDENTITY(-1, -1), SourceID int NOT NULL, ComputedNum int NOT NULL, SourceCreateDate datetime NOT NULL, SourceGUID uniqueidentifier NOT NULL, HistoryCreateDate datetime NOT NULL DEFAULT GETDATE() CONSTRAINT PK_HistoryTable[Base|Trig] PRIMARY KEY CLUSTERED (SourceCreateDate, HistoryID) ) GO ALTER TABLE ... ADD CONSTRAINT ... FOREIGN KEY(SourceID) REFERENCES dbo.DestinationTable[Base|Trig] (ColID) GO
The Tests
Set-up: Populating the Staging Table
Before starting any of the tests, we create the database, create the tables, create the trigger, create the stored procedures, and populate the staging table with n records.
Test 1a: Set-Based INSERTs Using OUTPUT (no TRAN)
For the first test, we use code based on Part 17’s OUTPUT code. Here is the basic code:
INSERT INTO dbo.DestinationTableBase ...
OUTPUT ... INTO <tableVariable>
SELECT ...
FROM dbo.StagingTable
INSERT INTO dbo.HistoryTableBase ...
SELECT ...
FROM <tableVariable>
This inserts the whole set of records from the staging table within an implicit transaction and then writes the history table records from the OUTPUT table variable.
Test 1b: Set-Based INSERTs Using OUTPUT (TRAN)
Test 1a but with a BEGIN TRAN/COMMIT wrapping the two INSERT statements.
Test 2: Set-Based INSERTs Using an INSERT Trigger
For the second test, we add an INSERT/UPDATE trigger to the destination table (DestinationTableTrig). This simple trigger inserts into the history table based on the data inserted into the destination table. The guts of the trigger look like this:
INSERT INTO HistoryTableTrig ... SELECT ... FROM Inserted
The INSERT code is then quite simple.
INSERT INTO dbo.DestinationTableTrig ... SELECT ... FROM dbo.StagingTable
Test 3: Iterative INSERTs Using Straight SQL
For the third test, we’re using very 100-level SQL. A real beginner would probably bring up BOL and struggle through running this in a cursor, but using a counter and a WHILE loop is so much less painful. Here’s the basic plan:
WHILE @ct < <StagingTable's recordCount> BEGIN INSERT INTO dbo.DestinationTableBase ... SELECT ... FROM dbo.StagingTable WHERE ColID = @ct SELECT @thisID = SCOPE_IDENTITY() INSERT INTO dbo.HistoryTableBase ... SELECT ... FROM dbo.DestinationTableBase WHERE ColID = @thisID END
Test 4: Iterative INSERTs Using an INSERT Trigger
The fourth test uses the trigger but iterates through the table one record at a time. The INSERT code is again fairly simple.
WHILE @ct < <StagingTable's recordCount> BEGIN INSERT INTO dbo.DestinationTableTrig ... SELECT ... FROM dbo.StagingTable WHERE ColID = @ct END
Test 5a: Set-Based INSERTs Using OUTPUT in Stored Procedure (no TRAN)
The fifth test uses code similar to test 1, but inside of a stored procedure.
Test 5b: Set-Based INSERTs Using OUTPUT in Stored Procedure (TRAN)
Test 5a but with a BEGIN TRAN/COMMIT wrapping the two INSERT statements inside the stored procedure.
The Results
The tests were run with differing record counts in the staging table. The time to run each test was recorded.
|
|
Records: |
Records: |
Records: |
Test |
Tran |
Elapsed ms |
Elapsed ms |
Elapsed ms |
1a (Set-Based INSERTs, OUTPUT) |
N |
16 |
820 |
35,953 |
1b |
Y |
16 |
793 |
36,306 |
2 (Set-Based INSERTs, Trigger) |
I* |
30 |
593 |
19,923 |
3 (Iterative INSERTs, Straight SQL) |
N |
190 |
12,426 |
376,663 |
4 (Iterative INSERTs, Trigger) |
I |
140 |
8,016 |
244,296 |
5a (Set-Based INSERTs, OUTPUT, in SP) |
N |
16 |
1,056 |
29,370 |
5b |
Y |
13 |
730 |
27,830 |
* N: Multiple statements without an explicit transaction. Y: Multiple statements with an explicit transaction. I: Single statement with an implicit transaction.
Note that I ran the whole test script twice each time and only recorded the second set of timings.
Clear the Table Variable?
One unexpected bit I learned during this test is illustrated with this simple test.
DECLARE @t1 TABLE (ColID tinyint IDENTITY, Col1 varchar(11)) DECLARE @t1inserted TABLE (ColID tinyint, Col1 varchar(11)) -- Insert 'Hello World' once INSERT INTO @t1 OUTPUT Inserted.ColID, Inserted.Col1 INTO @t1inserted VALUES ('Hello World') -- Clear the "base" table DELETE FROM @t1 -- Insert 'Hello World' once INSERT INTO @t1 OUTPUT Inserted.ColID, Inserted.Col1 INTO @t1inserted VALUES ('Hello World') SELECT * FROM @t1 SELECT * FROM @t1inserted
Contents of @t1:
ColID Col1 ----- ----------- 2 Hello World
Contents of @t1inserted (the table variable):
ColID Col1 ----- ----------- 1 Hello World 2 Hello World
Notice that the table variable retains its records from one call to the next. It makes perfect sense once you think about it, and it is the correct behavior, but you have to be aware of it when using the OUTPUT clause. You may need to clear the table variable at times to make sure it is empty before doing the next INSERT INTO … OUTPUT … INTO <tableVariable> call. I had to do that between calls in the test script.
The code for this simple test is in the attached HelloWorld_InsertMultiple.sql file.
Conclusion
The OUTPUT clause is very useful for pulling “automatic” values out of an INSERT statement (or UPDATE). What I was looking at here was how it performs compared to other options.
To no one’s surprise, the set-based solutions are far superior to the record-at-a-time solutions. As more and more records are involved, the record-at-a-time solutions get even worse.
It is important to see that wrapping the creation of the base records and the history records is faster using an explicit transaction (whether coded directly or wrapped in a stored procedure). It is also important to note that none of these solutions is as fast as a set-based trigger.
Many SQL Server developers believe that triggers are generally evil, and I tend to fall into that camp. There are two places where I think triggers are appropriate. One is the creation of history records, which is the exact case we have here. The other is for directly populating tables that are used to populate a data warehouse.
So, after building and optimizing the OUTPUT clause, the best-performing option for this particular scenario is a trigger. Whether you think it’s evil or not, it does perform well.
I may have missed some optimization here. Do you see something that would make the OUTPUT clause work as fast as the trigger? If so, please drop a comment in the forum
Download the files for this article.