Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» Slideshows
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

MySQL

Posted May 11, 2015

MySQL Union and Union All Operators Primer

By Rob Gravelle

Unless you’re writing application code that can deal with multiple result sets, it’s usually preferable to merge several similar queries together so that their results may be viewed as one data set.  MySQL supports a way to accomplish this task using pure SQL thanks to the UNION and UNION ALL operators.  This tutorial offers a brief overview on how to use both and provides a few examples to get you started.

What the Union and Union All Operators Do

Both the UNION and UNION ALL operators achieve the common goal of combining the results of two or more SELECT statements.  As such, each query within the UNION must select the same number of fields with compatible data types - i.e. that may be converted using implicit conversion.  The one, albeit substantial, difference between the two is that UNION removes duplicate rows between the various SELECT statements whereas UNION ALL returns all rows from the query even if an identical row exists in more than one of the SELECT statements.

The UNION and UNION ALL operators are simple enough to use; just insert them between each SELECT statement that you want to combine:

SELECT expression1, expression2, ... expression_n
 FROM tables
 WHERE conditions
 UNION (ALL)
 SELECT expression1, expression2, ... expression_n
 FROM tables
 WHERE conditions
 [UNION (ALL)
 SELECT expression1, expression2, ... expression_n
 FROM tables
 WHERE conditions]
 etc...;

Selecting from Unrelated Tables

Using SELECT DISTINCT or careful table joining is generally more efficient than UNIONs, but when tables are unrelated, that is to say that there are no Primary and Foreign Keys linking them, UNION can be employed to generate a master list that would otherwise require multiple queries.

For example, say you wanted to produce a list of all of the people associated with your business and had three separate tables for supplier, salespeople, and customer details.  Rather than work with three individual result sets, you could combine them using UNION:

SELECT id, name, email
 FROM suppliers
 UNION
 SELECT rep_id, name, email 
 FROM salespeople
 UNION
 SELECT id, name, email_address
 FROM customers;

Here is what the first several results might look like.  Notice the duplication of Tim Akeman, who exists in more than one table:

ID

name

email

13

Jack N

jack@acme.com

17

Tim Akeman

akeman_tim@acme.com

18

Reb Beech

thebeech@rogers.com

21

Nancy Willis

nwillis98@gmail.com

22

R Cook

cook.r@yahoo.com

17

Tim Akeman

akeman_tim@acme.com

26

N King

thekingisin@bell.ca

A useful bit of information to know about the UNION and UNION ALL operators is that if the field names do not match across each SELECT statement, the query will display the names from the first query in the results.

The above query doesn’t tell us anything about what type of contact each person is, but we can include that ourselves by hardcoding it in each query:

SELECT id, name, email, ‘supplier’ as contact_type
 FROM suppliers
 UNION
 SELECT rep_id, name, email, ‘salesperson’ as contact_type 
 FROM salespeople
 UNION
 SELECT id, name, email_address, ‘customer’ as contact_type
 FROM customers;

Now we can see that Tim Akeman is both a supplier and customer:

ID

post_name

post_title

contact_type

13

Jack N

jack@acme.com

supplier

17

Tim Akeman

akeman_tim@acme.com

supplier

18

Reb Beech

thebeech@rogers.com

salesperson

21

Nancy Willis

nwillis98@gmail.com

salesperson

22

R Cook

cook.r@yahoo.com

customer

17

Tim Akeman

akeman_tim@acme.com

customer

26

N King

thekingisin@bell.ca

customer

Using Group By and Order By Clauses

In working with data over time, you’ll find that creative solutions are often required.  One such situation arose recently when I was asked to fetch a list of all the names contained in my client’s restaurants database.  What first looked like a simple SELECT statement turned into a formidable task!  The crux of the challenge was that several restaurants were actually chains, so they appeared numerous times in the main table.  Using a SELECT DISTINCT was a no-go because the results contained the unique IDs. 

The answer was a combination of the GROUP BY clause and UNION operator.

Restaurant chains could be identified by their 'restaurant_base_name' key in the postmeta table.  The chain query included the GROUP BY clause so that only one of each was returned by the query.  Omitting restaurants that did not possess a 'restaurant_base_name' from the other query yielded all of the stand-alone restaurants.

Finally, both queries were joined using UNION ALL and the full results sorted using the ORDER BY statement.

-- non-chain restaurants
 SELECT ID, 
        post_name, 
        post_title 
 FROM   posts
 WHERE  post_type='restaurant' 
 AND    post_status='publish' 
 AND    ID not in (SELECT post_id 
                   FROM   postmeta
                   WHERE  meta_key='restaurant_base_name') 
 UNION ALL 
 -- chain restaurants
 SELECT p.id,
        p.post_name, 
        pm.meta_value as post_title 
 FROM   posts as p,
        postmeta as pm
 WHERE  p.id = pm.post_id
 AND    meta_key = 'restaurant_base_name'
 AND    post_status = 'publish'
 GROUP BY post_title 
 ORDER BY post_title -- sorts all the results

In the following result set, you can see a few chains mixed in, including, Carluccio’s, McDonald’s, Nando’s, and Pret A Manger:

ID

post_name

post_title Ascending

2525

aqua-kyoto

Aqua Kyoto

7479

babylon-restaurant

Babylon Restaurant

1747

hippodrome-circle-restaurant

Birmingham Hippodrome - Circle Restaurant

1415

carluccios-aberdeen

Carluccio's

6441

eds-easy-diner-soho

Ed's Easy Diner

3657

mcdonald

McDonald's

6570

nandos-ealingcommon

Nando's

2066

pret-bvictoriastreet

Pret A Manger

2547

rosa-lewis-bar-at-the-cavendish

Rosa Lewis Bar at the Cavendish

1704

the-akeman

The Akeman

1797

the-alford-arms

The Alford Arms

2594

2594-2

The Bay Fish & Chip Shop

1708

the-beech-house

The Beech House - Beaconsfield

1710

the-blue-boar

The Blue Boar

1714

the-crown-thistle

The Crown & Thistle

1716

the-kings-arms

The King's Arms

1718

the-old-post-office

The Old Post Office

Customizing the Overall Sorting

The overall sorting produced by the last ORDER BY statement is placing a lot of restaurants that should be near the top of the results at the bottom because of the leading “The” in their names.

We can change how the rows are sorted because the ORDER BY clause can accept expressions and function calls as well as field names.  In the code below, the IF() function tests for a leading “The “ and removes it using SUBSTRING() if necessary. 

The main result fields may be modified in the same way.  Here, the CONCAT() function reinserts the “The” enclosed in parentheses to show that it is not included in the sorting.  The expression is assigned the same name as the original field so that it may be referenced by the ORDER BY clause:

-- non-chain restaurants
 SELECT ID, 
        post_name, 
        IF(SUBSTRING(post_title, 1, 4) = "The ", 
             CONCAT('(The) ', SUBSTRING(post_title, 5)), 
             post_title ) AS post_title
 FROM   posts
 WHERE  post_type='restaurant' 
 AND    post_status='publish' 
 AND    ID not in (SELECT post_id 
                   FROM   postmeta
                   WHERE  meta_key='restaurant_base_name') 
 UNION ALL 
 SELECT p.id,
        p.post_name, 
        pm.meta_value as post_title 
 FROM   posts as p,
        postmeta as pm
 WHERE  p.id = pm.post_id
 AND    meta_key = 'restaurant_base_name'
 AND    post_status = 'publish'
 -- sorts all the results
 ORDER BY IF(SUBSTRING(post_title, 1, 6) = "(The) ", 
             SUBSTRING(post_title, 7), 
             post_title)

Much better!

ID

post_name

post_title

1704

the-akeman

(The) Akeman

1797

the-alford-arms

(The) Alford Arms

2525

aqua-kyoto

Aqua Kyoto

7479

babylon-restaurant

Babylon Restaurant

2594

2594-2

(The) Bay Fish & Chip Shop

1708

the-beech-house

(The) Beech House - Beaconsfield

1747

hippodrome-circle-restaurant

Birmingham Hippodrome - Circle Restaurant

1710

the-blue-boar

(The) Blue Boar

1415

carluccios-aberdeen

Carluccio's

1714

the-crown-thistle

(The) Crown & Thistle

6441

eds-easy-diner-soho

Ed's Easy Diner

1716

the-kings-arms

(The) King's Arms

3657

mcdonald

McDonald's

6570

nandos-ealingcommon

Nando's

1718

the-old-post-office

(The) Old Post Office

2066

pret-bvictoriastreet

Pret A Manger

1720

the-red-lion

(The) Red Lion

2547

rosa-lewis-bar-at-the-cavendish

Rosa Lewis Bar at the Cavendish

Conclusion

When choosing between the UNION and UNION ALL operators, keep in mind that the distinct sorting operation that is performed by the UNION operator can be time consuming. For that reason, consider the UNION ALL operator your first choice unless you are getting duplicates back.  In the previous example, we were able to engineer our query so that every row was unique, so that using UNION would be overkill.

See all articles by Rob Gravelle



MySQL Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 




Latest Forum Threads
MySQL Forum
Topic By Replies Updated
MySQL in high availability and transction secure banking application klamor 1 August 28th, 10:24 AM
MySQL rollback UAL225 1 August 28th, 10:15 AM
Browsing a DB file that uses MySql finleytech 1 July 26th, 10:51 AM
php cookie won't pass variable to next page ITdevGirl 0 June 22nd, 12:13 PM