MERGE is a new statement introduced in the SQL:2003 standard
for performing multiple DML (Data Manipulation Language) statements against a
target table at once. In this article we’ll look into ways to take advantage
of this powerful addition to SQL Server 2008.
What is MERGE
In SQL Server 2005 and prior, if you needed to insert,
update, and delete from a table based on changes to another table, you had to
write all three statements separately. With SQL Server 2008, all the changes
to the target table can be combined into one MERGE statement. The advantage
being that the tables being used for the MERGE are processed one time instead
of one per INSERT, UPDATE, or DELETE.
MERGE takes a source query to use as a basis for determining
what differences exist between it and the target table. Match conditions are
established to direct when and what data are affected in the target table by
the existence of a match or the lack of a match.
Below is the syntax for MERGE from SQL Server 2008 Books
Online. I believe the syntax is best understood by looking at examples.
[ WITH <common_table_expression> [,...n] ] MERGE [ TOP ( expression ) [ PERCENT ] ] [ INTO ] <target_table> [ WITH ( <merge_hint> ) ] [ [ AS ] table_alias ] USING <table_source> ON <merge_search_condition> [ WHEN MATCHED [ AND <clause_search_condition> ] THEN <merge_matched> ] [...n ] [ WHEN NOT MATCHED [ BY TARGET ] [ AND <clause_search_condition> ] THEN <merge_not_matched> ] [ WHEN NOT MATCHED BY SOURCE [ AND <clause_search_condition> ] THEN <merge_matched> ] [...n ] [ <output_clause> ] [ OPTION ( <query_hint> [ ,...n ] ) ] ;
There are many potential uses for the MERGE statement such
as with integration of systems, data warehousing, and even to facilitate
movement of reference data changes from development through testing and into
production. Here we’ll examine the first one, integration between systems.
Assume you work for the e-commerce division of a large pet
store corporation. That company has many brick-and-mortar pet stores as well
as your online storefront for pet supplies. Your job is to make sure that certain
data is properly maintained between your storefront and the pet stores in order
to ensure the customer has a consistent experience across the corporation.
The most important piece of information to be concerned with is the customer
data. If John Smith walks into the pet shop in Chesapeake and purchases a
puppy, you want to know near real time. So when he returns home and realizes
he needs squeaky toys to substitute for his now maimed tennis shoes, he can
come straight to your online pet supply store and get a warm fuzzy when he
realizes the site already knows him and has recommended products he might find
useful for improving his experience with his new billy goat..er…puppy.
When John Smith purchased that puppy, the Chesapeake store
collected data about him and sent it off to a service which in turn calls your
stored procedure to update the online store’s database. Your stored procedure
used to look like the following:
CREATE PROCEDURE SaveCustomer ( @custID INTEGER, @custName VARCHAR(100), @custAddr1 VARCHAR(100), @custAddr2 VARCHAR(100), @custCity VARCHAR(50), @custState CHAR(2), @custZip VARCHAR(9), @custPhone VARCHAR(10), @custEmail VARCHAR(100)) AS BEGIN IF EXISTS (SELECT 1 FROM customer WHERE custID = @custID) UPDATE customer SET custName = @custName, custAddr1 = @custAddr1, custAddr2 = @custAddr2, custCity = @custCity, custState = @custState, custZip = @custZip, custPhone = @custPhone, custEmail = @custEmail WHERE custID = @custID ELSE INSERT INTO customer (custID, custName, custAddr1, custAddr2, custCity, custState, custZip, custPhone, custEmail) VALUES (@custID, @custName, @custAddr1, @custAddr2, @custCity, @custState, @custZip, @custPhone, @custEmail) END
Now that you’ve discovered MERGE, you re-write your stored
procedure to look like this.
ALTER PROCEDURE SaveCustomer ( @custID INTEGER, @custName VARCHAR(100), @custAddr1 VARCHAR(100), @custAddr2 VARCHAR(100), @custCity VARCHAR(50), @custState CHAR(2), @custZip VARCHAR(9), @custPhone VARCHAR(10), @custEmail VARCHAR(100)) AS BEGIN MERGE dbo.customer AS cust USING (SELECT @custID, @custName, @custAddr1, @custAddr2, @custCity, @custState, @custZip, @custPhone, @custEmail) AS src (custID, custName, custAddr1, custAddr2, custCity, custState, custZip, custPhone, custEmail) ON cust.custID = src.custId WHEN MATCHED THEN UPDATE SET custName = src.custName, cust.custAddr1 = src.custAddr1, cust.custAddr2 = src.custAddr2, cust.custCity = src.custCity, cust.custState = src.custState, cust.custZip = src.custZip, cust.custPhone = src.custPhone, cust.custEmail = src.custEmail WHEN NOT MATCHED THEN INSERT (custID, custName, custAddr1, custAddr2, custCity, custState, custZip, custPhone, custEmail) VALUES (src.custID, src.custName, src.custAddr1, src.custAddr2, src.custCity, src.custState, src.custZip, src.custPhone, src.custEmail) END
Let’s break this down. The first thing after the keyword
MERGE is the target table. The USING clause specifies the source result set.
In this case, the source comes from the arguments passed into the stored
procedure. The source result set is aliased as src and the columns are given
aliases as well. ON specifies the condition to be used to determine a match
between the source and target tables. WHEN MATCHED THEN tells the engine what
action should be performed on the target table should the ON clause result in a
match between the tables. In our case, if the customer is found by custID, we
want to uspdate that row in our customer table with the values passed in. If a
matching row is not found, we drop into the action found under WHEN NOT
MATCHED. This time, the customer is not found by custID, so we want to insert
him into our customer table. Simple as that.
Another example of a use for MERGE, as I mentioned above is
the movement of reference data changes through your development, test, and
production environments. Over the course of application maintenance, there is
often need to change or add to the reference data supplied to the application.
This reference data may be the products you offer, the countries you do
business in, part numbers, etc.
You write the script up that includes the data changes you
are making for the next release and it’s packaged up and sent to testing.
Something else breaks in testing that is part of the larger build. Now your
script has to be smart enough to know if it’s already been run in the test
database. Let’s take a look at this. Here’s your original script for a new
product entry. You’ve been bitten before by your script being run more than
once in the same database, so you’ve wrapped it in an IF condition.
IF NOT EXISTS (SELECT 1 FROM Products WHERE ProductID = 201005) INSERT INTO Products (ProductID, ProductName, ManufName, ActiveFlag) VALUES (201005, 'Big Bounce Squeaky Toy', 'Fetch Boy, LLC', 'Y') GO IF NOT EXISTS (SELECT 1 FROM Products WHERE ProductID = 201006) INSERT INTO Products (ProductID, ProductName, ManufName, ActiveFlag) VALUES (201006, 'Fuzzy Bear', 'ChewThis, Inc', 'Y') GO
This same script would look like this using MERGE.
MERGE Products AS p USING (VALUES (201005, 'Big Bounce Squeaky Toy', 'Fetch Boy, LLC', 'Y'), (201006, 'Fuzzy Bear', 'ChewThis, Inc', 'Y') ) AS src (srcProductID, srcProductName, srcManufName, srcActiveFlag) ON p.ProductID = src. ProductID WHEN NOT MATCHED THEN INSERT (ProductID, ProductName, ManufName, ActiveFlag) VALUES (srcProductID, srcProductName, srcManufName, srcActiveFlag)
Personally, I find this syntax easier to write when there
are many new data values to add. You can put all the new data together in the
USING clause instead of in multiple VALUES clauses seperated by INSERTS as in
the previous script.
SQL Server 2008 offers a new DML statement that allows you
to combine INSERT, UPDATE, and DELETE operations into one. By processing the
source and target tables on time using MERGE you can realize performance gains
in addition to script simplification.
For More Information
MSDN – MERGE
TechNet – Optimizing
MERGE Statement Performance