MERGE Your DML Statements in SQL Server 2008

Introduction

MERGE is a new statement introduced in the SQL:2003 standard
for performing multiple DML (Data Manipulation Language) statements against a
target table at once. In this article we’ll look into ways to take advantage
of this powerful addition to SQL Server 2008.

What is MERGE

In SQL Server 2005 and prior, if you needed to insert,
update, and delete from a table based on changes to another table, you had to
write all three statements separately. With SQL Server 2008, all the changes
to the target table can be combined into one MERGE statement. The advantage
being that the tables being used for the MERGE are processed one time instead
of one per INSERT, UPDATE, or DELETE.

MERGE takes a source query to use as a basis for determining
what differences exist between it and the target table. Match conditions are
established to direct when and what data are affected in the target table by
the existence of a match or the lack of a match.

Below is the syntax for MERGE from SQL Server 2008 Books
Online. I believe the syntax is best understood by looking at examples.

[ WITH <common_table_expression> [,...n] ]
MERGE
     [ TOP ( expression ) [ PERCENT ] ] 
     [ INTO ] <target_table> [ WITH ( <merge_hint> ) ] [ [ AS ] table_alias ]
     USING <table_source> 
      ON <merge_search_condition>
     [ WHEN MATCHED [ AND <clause_search_condition> ]
         THEN <merge_matched> ] [...n ]
     [ WHEN NOT MATCHED [ BY TARGET ] [ AND <clause_search_condition> ]
         THEN <merge_not_matched> ]
      [ WHEN NOT MATCHED BY SOURCE [ AND <clause_search_condition> ]
         THEN <merge_matched> ] [...n ]
     [ <output_clause> ]
     [ OPTION ( <query_hint> [ ,...n ] ) ]    
;

MERGE Examples

There are many potential uses for the MERGE statement such
as with integration of systems, data warehousing, and even to facilitate
movement of reference data changes from development through testing and into
production. Here we’ll examine the first one, integration between systems.

Assume you work for the e-commerce division of a large pet
store corporation. That company has many brick-and-mortar pet stores as well
as your online storefront for pet supplies. Your job is to make sure that certain
data is properly maintained between your storefront and the pet stores in order
to ensure the customer has a consistent experience across the corporation.
The most important piece of information to be concerned with is the customer
data. If John Smith walks into the pet shop in Chesapeake and purchases a
puppy, you want to know near real time. So when he returns home and realizes
he needs squeaky toys to substitute for his now maimed tennis shoes, he can
come straight to your online pet supply store and get a warm fuzzy when he
realizes the site already knows him and has recommended products he might find
useful for improving his experience with his new billy goat..er…puppy.

When John Smith purchased that puppy, the Chesapeake store
collected data about him and sent it off to a service which in turn calls your
stored procedure to update the online store’s database. Your stored procedure
used to look like the following:

CREATE PROCEDURE SaveCustomer (
      @custID     INTEGER,
      @custName   VARCHAR(100),
      @custAddr1  VARCHAR(100),
      @custAddr2  VARCHAR(100),
      @custCity   VARCHAR(50),
      @custState  CHAR(2),
      @custZip    VARCHAR(9),
      @custPhone  VARCHAR(10),
      @custEmail  VARCHAR(100))  AS
BEGIN
 
IF EXISTS (SELECT 1 FROM customer WHERE custID = @custID)
      UPDATE customer
         SET custName  = @custName,
             custAddr1 = @custAddr1,
             custAddr2 = @custAddr2,
             custCity  = @custCity,
             custState = @custState,
             custZip   = @custZip,
             custPhone = @custPhone,
             custEmail = @custEmail
      WHERE custID = @custID
ELSE
      INSERT INTO customer (custID, custName, custAddr1, custAddr2, custCity, 
                            custState, custZip, custPhone, custEmail)
      VALUES (@custID, @custName, @custAddr1, @custAddr2, @custCity, @custState, 
              @custZip, @custPhone, @custEmail)
END

Now that you’ve discovered MERGE, you re-write your stored
procedure to look like this.

ALTER PROCEDURE SaveCustomer (
      @custID     INTEGER,
      @custName   VARCHAR(100),
      @custAddr1  VARCHAR(100),
      @custAddr2  VARCHAR(100),
      @custCity   VARCHAR(50),
      @custState  CHAR(2),
      @custZip    VARCHAR(9),
      @custPhone  VARCHAR(10),
      @custEmail  VARCHAR(100))  AS
BEGIN
 
MERGE dbo.customer AS cust
    USING (SELECT @custID, @custName, @custAddr1, @custAddr2, 
    @custCity, @custState, @custZip, @custPhone, @custEmail) 
    AS src (custID, custName, custAddr1, custAddr2, 
    custCity, custState, custZip, custPhone, custEmail)
    ON cust.custID = src.custId
    WHEN MATCHED THEN 
        UPDATE SET custName       = src.custName,
                   cust.custAddr1 = src.custAddr1,
                   cust.custAddr2 = src.custAddr2,
                   cust.custCity  = src.custCity,
                   cust.custState = src.custState,
                   cust.custZip   = src.custZip,
                   cust.custPhone = src.custPhone,
                   cust.custEmail = src.custEmail
      WHEN NOT MATCHED THEN   
          INSERT (custID, custName, custAddr1, custAddr2, custCity, 
                  custState, custZip, custPhone, custEmail)
          VALUES (src.custID, src.custName, src.custAddr1, src.custAddr2, 
                  src.custCity, src.custState, src.custZip, src.custPhone, src.custEmail)
END

Let’s break this down. The first thing after the keyword
MERGE is the target table. The USING clause specifies the source result set.
In this case, the source comes from the arguments passed into the stored
procedure. The source result set is aliased as src and the columns are given
aliases as well. ON specifies the condition to be used to determine a match
between the source and target tables. WHEN MATCHED THEN tells the engine what
action should be performed on the target table should the ON clause result in a
match between the tables. In our case, if the customer is found by custID, we
want to uspdate that row in our customer table with the values passed in. If a
matching row is not found, we drop into the action found under WHEN NOT
MATCHED. This time, the customer is not found by custID, so we want to insert
him into our customer table. Simple as that.

Another example of a use for MERGE, as I mentioned above is
the movement of reference data changes through your development, test, and
production environments. Over the course of application maintenance, there is
often need to change or add to the reference data supplied to the application.
This reference data may be the products you offer, the countries you do
business in, part numbers, etc.

You write the script up that includes the data changes you
are making for the next release and it’s packaged up and sent to testing.
Something else breaks in testing that is part of the larger build. Now your
script has to be smart enough to know if it’s already been run in the test
database. Let’s take a look at this. Here’s your original script for a new
product entry. You’ve been bitten before by your script being run more than
once in the same database, so you’ve wrapped it in an IF condition.

IF NOT EXISTS (SELECT 1 FROM Products WHERE ProductID = 201005)
INSERT INTO Products (ProductID, ProductName, ManufName, ActiveFlag)
VALUES (201005, 'Big Bounce Squeaky Toy', 'Fetch Boy, LLC', 'Y')
GO
IF NOT EXISTS (SELECT 1 FROM Products WHERE ProductID = 201006)
INSERT INTO Products (ProductID, ProductName, ManufName, ActiveFlag)
VALUES (201006, 'Fuzzy Bear', 'ChewThis, Inc', 'Y')
GO

This same script would look like this using MERGE.

MERGE Products AS p
USING (VALUES (201005, 'Big Bounce Squeaky Toy', 'Fetch Boy, LLC', 'Y'), 
              (201006, 'Fuzzy Bear', 'ChewThis, Inc', 'Y') )
      AS src (srcProductID, srcProductName, srcManufName, srcActiveFlag)
      ON p.ProductID = src. ProductID
WHEN NOT MATCHED THEN
      INSERT (ProductID, ProductName, ManufName, ActiveFlag)
      VALUES (srcProductID, srcProductName, srcManufName, srcActiveFlag)

Personally, I find this syntax easier to write when there
are many new data values to add. You can put all the new data together in the
USING clause instead of in multiple VALUES clauses seperated by INSERTS as in
the previous script.

Conclusion

SQL Server 2008 offers a new DML statement that allows you
to combine INSERT, UPDATE, and DELETE operations into one. By processing the
source and target tables on time using MERGE you can realize performance gains
in addition to script simplification.

For More Information

MSDN – MERGE

TechNet – Optimizing
MERGE Statement Performance

»


See All Articles by Columnist

Deanna Dicken

Deanna Dicken
Deanna Dicken
Deanna Dicken is an architect with approximately 20 years of IT experience. In that time she has worked on many large-scale, mission-critical applications and has been involved at all phases of the lifecycle. She has also contributed to three SQL Server MCSE books, co-authored Learn SQL in a Weekend, and tech edited many other titles. She lives on the outskirts of Indianapolis, Indiana with her husband, Curtis, and children, Kylee and Zach.

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles