SQL Server 2008 MERGE Statement

What do you do when your application logic requires that you
INSERT a record if the record doesn’t exist, or UPDATE the record if it does
exist? If you are using SQL Server 2005 or below you would have to write a series
of T-SQL statements that check to see if the record exists, and then write an
UPDATE or INSERT statement depending on your existence checks. With SQL Server
2008, Microsoft has simplified the amount of T-SQL code it requires to perform
INSERT/UPDATE logic, by implementing the MERGE statement. This article will
explore how to use the MERGE statement.

What can you do with the MERGE Statement

You can do more with the MERGE statement then just inserting
and updating records as described above. You can also use it to delete
records. Another thing you can do with the MERGE statement is to perform a
specific number of merge actions based on a TOP clause. The MERGE statement
can also generate output that will identify which records where inserted,
updated, or deleted. The best way to describe how the new MERGE statement works
is to go through some examples, and examine how they work.

Basic INSERT or UPDATE Example

For the first example, I will perform updates or inserts to
a target table based on a source table. The source table will contain the
records you either want to insert or update, and the target table will be the
table where the inserts or updates will actually be performed. This kind of
operation is sometimes referred to as an “upsert”.

For this example I will be performing inserts and updates to
a “Customers” table. My “Customer” table will be considered the target table.
The records I want to insert or update will be contained in a table named
“NewCustomers”. This table is known as the source table.

Here is the code to create and populate the target and
source tables:


SET NOCOUNT ON;
— Create Target Table
CREATE TABLE Customers (LastName VARCHAR(50), FirstName VARCHAR(50));
INSERT INTO Customers VALUES (‘Doe’, ‘Jane’);
— Create Source Table
CREATE TABLE NewCustomers(LastName VARCHAR(50), FirstName VARCHAR(50));
INSERT INTO NewCustomers VALUES (‘Doe’, ‘John’);
INSERT INTO NewCustomers VALUES (‘Smith’, ‘Doris’);

I’ve created a Customers table with a single record, and a
NewCustomers table with two different records. The records in the NewCustomers
table will be used to update and insert records in to the Customers table. The
first record in the NewCustomers table is going to be used to update the single
existing Customers record, where as the second record in the NewCustomers table
is going to be inserted into the Customers table. To do the insert and update,
I will use the following MERGE statement:


— Update and Insert into Customers
MERGE Customers AS C
USING NewCustomers AS NC
ON C.LastName = NC. LastName
WHEN MATCHED THEN
UPDATE SET C.FirstName = NC.FirstName
WHEN NOT MATCHED THEN
INSERT (LastName, FirstName) VALUES (NC.LastName,NC.FirstName);

Here the target table is identified by naming it immediately
following MERGE statement, in my example that would be “Customers”. Next, the
MERGE statement uses the “USING” clause to identify the source of the updates
and inserts, which in this case is the NewCustomers table. In order to
identify which records are updates and which ones are inserts I need to
identify a key column to match on between the Customers and NewCustomers
table. To accomplish this matching of records between the two tables the MERGE
statement uses the “ON” clause just as you would do when you are joining two
tables. Here I am joining the source and target tables based on the LastName
column in source and target tables. The different “WHEN” criteria of the MERGE
statement identify the conditions for when an update or insert is to be
performed. When a record is found in both source (NewCustomers) and target
(Customers) tables then an UPDATE statement is performed to update on the FirstName
column. This condition meets the “WHEN MATCHED” criteria above. When a join does
not find a record in the target (Customers) table for a record in the source (newCustomer),
then this is the “WHEN NOT MATCHED” criteria is executed. When this occurs an
INSERT statement is performed to insert the LastName and FirstName value into
the Customers table. After I have run my MERGE statement, my updated
Customers table contains the following records:


LastName FirstName
—————————————— ————————————————–
Doe John
Smith Doris

By looking at the output, you can see the MERGE statement
changed the FirstName column of the “Doe” record from “Jane” to “John”. The
MERGE did an UPDATE to accomplish this by using the “MATCHED” criteria. The
second record was created when the “NOT MATCH” criteria inserted a record.

Basic Insert, Update and Delete Example using MERGE

The MERGE statement can be
used to maintain a target table based on information in a source table. By
maintaining a target table, I mean the MERGE statement can perform the
following actions:

  • Insert a new record from the
    source if the record is not in the target.
  • Update a target record if the
    record exists in the source table.
  • Delete a record in the target
    table if the record is not found in the source.

To
show how this works let me go through an example. My example will use the MERGE
statement to maintain a student roster for the students enrolled in the current
year. For this example, my target table will be “Students”. Here is the code
to create and populate my “Students” table:


CREATE TABLE Students (LastName VARCHAR(50),
FirstName VARCHAR(50),
Address VARCHAR(100),
Age INT);
INSERT INTO Students
VALUES (‘Fritz’, ‘David’, ‘181 Kline Street’, 14)
,(‘Reese’, ‘Paul’ , ‘4429 South Union’, 14)
,(‘Brown’, ‘Jake’ , ‘5401 Washington Ave’,14);

My student table needs to be
updated every year with a list of all the new students as well as the returning
students. To do this I need a source table that will contain those records
that need to be inserted, or updated. Here is the code to create and populate
my source table:


CREATE TABLE NewYearRoster(LastName VARCHAR(50),
FirstName VARCHAR(50),
Address VARCHAR(100),
Age INT);
INSERT INTO NewYearRoster
VALUES (‘Fritz’, ‘David’, ‘181 Kline Street’, 15)
,(‘Reese’, ‘Paul’, ‘1950 Grandview Place’, 15)
,(‘Adam’, ‘Wilbur’, ‘4231 W. 93rd’, 15);

My merge example will be
using the source table (NewYearRoster) to maintain the target table (Students)
by inserting new students, updating students information if they were enrolled
last year (meaning already in Student table), and delete student records if
they are not enrolled for the new year (meaning not in NewYearRoster table).
To accomplish this here is my MERGE statement:


MERGE Students AS T
USING NewYearRoster AS S
ON S.LastName = T. LastName and
S.FirstName = T.FirstName
WHEN MATCHED THEN
UPDATE SET T.Address = S.Address,
T.Age = S.Age
WHEN NOT MATCHED THEN
INSERT (LastName,
FirstName,
Address,
Age)
VALUES (S.LastName,
S.FirstName,
S.Address,
S.Age)
WHEN NOT MATCHED BY SOURCE THEN
DELETE;

After I run the above command,
my Student table contains the following records:


LastName FirstName Address Age
————– —————- ———————————————— ———–
Fritz David 181 Kline Street 15
Reese Paul 1950 Grandview Place 15
Adam Wilbur 4231 W. 93rd 15

If you review this with my
original Students table, you will find that the above MERGE statement did the
following:

  • David Fritz’s record got the Age
    columns updated (also the address was updated but it was the same in both the
    source and target tables)
  • Paul Reese’s record got the
    Address and Age columns updated
  • Jake Brown’s student record was
    deleted
  • Adam Wilbur’s record was inserted

Adding Additional MERGE Search Conditions

You can add additional
conditions to the MATCH or NOT MATCH clauses to make your MERGE search
conditions more complicated. Below I have modified my above simple insert,
update, or delete example to only update existing records if the address in the
NewYearRoster is different from the current Students address record:


MERGE Students AS T
USING NewYearRoster AS S
ON S.LastName = T. LastName and
S.FirstName = T.FirstName
WHEN MATCHED and T.Address <> S.Address THEN
UPDATE SET T.Address = S.Address,
T.Age = S.Age
WHEN NOT MATCHED THEN
INSERT (LastName,
FirstName,
Address,
Age)
VALUES (S.LastName,
S.FirstName,
S.Address,
S.Age)
WHEN NOT MATCHED BY SOURCE THEN
DELETE;

I added my additional MERGE
search condition to the “WHEN MATCHED” clause, so now it reads “WHEN MATCHED
and T.Address <> S.Address THEN UPDATE…”. By doing this the David
Fritz‘s Students record does not get updated because this record has the same
address in both the target and source tables.

Output Clause

If you want to find out what
records are being inserted, updated, or deleted when you run a MERGE statement
you can use the OUTPUT clause. This clause will output information you specify
for every record inserted, updated, and/or deleted. I’ve modified my basic
insert, update, and delete example above to output the type of action taken,
plus the First and Last name. Here is that modified MERGE statement.


MERGE Students AS T
USING NewYearRoster AS S
ON S.LastName = T. LastName and
S.FirstName = T.FirstName
WHEN MATCHED THEN
UPDATE SET T.Address = S.Address,
T.Age = S.Age
WHEN NOT MATCHED THEN
INSERT (LastName,
FirstName,
Address,
Age)
VALUES (S.LastName,
S.FirstName,
S.Address,
S.Age)
WHEN NOT MATCHED BY SOURCE THEN
DELETE
OUTPUT $action, Inserted.LastName, Inserted.FirstName, Deleted.LastName, Deleted.FirstName;

When I run this statement, I
get the following output:


$action LastName FirstName LastName FirstName
———- ———— ————- ————- ——————-
INSERT Adam Wilbur NULL NULL
UPDATE Fritz David Fritz David
UPDATE Reese Paul Reese Paul
DELETE NULL NULL Brown Jake

Now you can tell this
statement performed 4 different actions, 1 insert, 2 updates and 1 delete
statement.

Using Top Clause in MERGE Statement

You can add a TOP clause to
the MERGE statement. When you do this, the MERGE statement will limit the
number of actions performed to the number specified in the TOP clause. Below I
have modified my basic insert, update, and delete example to contain a TOP
clause to request that only 2 actions be taken:


MERGE TOP (2) Students AS T
USING NewYearRoster AS S
ON S.LastName = T. LastName and
S.FirstName = T.FirstName

WHEN NOT MATCHED THEN
INSERT (LastName,
FirstName,
Address,
Age)
VALUES (S.LastName,
S.FirstName,
S.Address,
S.Age)
WHEN NOT MATCHED BY SOURCE THEN
DELETE
WHEN MATCHED THEN
UPDATE SET T.Address = S.Address,
T.Age = S.Age
OUTPUT $action, Inserted.LastName, Inserted.FirstName, Deleted.LastName, Deleted.FirstName;

Here is the output produced
by the OUTPUT clause when I run this MERGE statement:


$action LastName FirstName LastName FirstName
———- ———— ————- ————- ——————-
UPDATE Fritz David Fritz David
UPDATE Reese Paul Reese Paul

Here the MERGE statement
only performed 2 UPDATE statements. If you look at my MERGE statement I
arranged the WHEN clause to have the UPDATE statement as the last set of
actions to be taken. However, you can see that my MERGE statement did not
perform the actions in the order they where specified in the MERGE statement.
If you play with this statement you will find the updates, are performed first,
then inserts, and lastly delete statements will be performed.

Benefits of Using MERGE

The MERGE statement
simplifies the amount of work it takes to implement insert and update logic.
Now with a single statement you can easily implement insert, update, and delete
logic to handle criteria for maintaining a table. The MERGE statement handles
all the joining of the source and target. This minimizes the amount of code
you need to write to handle merge logic. The MERGE statement is just one of
the many enhancements that Microsoft has included in SQL Server 2008.

»


See All Articles by Columnist
Gregory A. Larsen

Gregory Larsen
Gregory Larsen
Gregory A. Larsen is a DBA at Washington State Department of Health (DOH). Greg is responsible for maintaining SQL Server and other database management software. Greg works with customers and developers to design and implement database changes, and solve database/application related problems. Greg builds homegrown solutions to simplify and streamline common database management tasks, such as capacity management.

Latest Articles