More Efficient and Readable Select Queries with MySQL Subqueries

Did you know that MySQL has supported subqueries since version 4.1? In
fact, it even boasts a few MySQL-specific features! Whether you’re well
acquainted with subqueries or just a novice in this area, you’re bound to benefit
from today’s article, which will cover how to use subqueries in MySQL to get the
most out of your SELECT queries while making your SQL more readable.

The Subquery Explained

A subquery is one that is nested within another query. In fact it is
possible to nest subqueries within other subqueries. The main advantages of
subqueries are:

  • They allow queries that are structured so that it
    is possible to isolate each part of a statement.
  • They provide alternative ways to perform operations
    that would otherwise require complex joins and unions.
  • They are, in many people’s opinion, more readable
    than complex joins or unions. According to the MySQL docs, “it was the
    innovation of subqueries that gave people the original idea of calling the
    early SQL ‘Structured Query Language.’”

MySQL subqueries can return a single value, single row, single column,
or a table, which is a resultset containing one or more rows of one or more
columns. We’ll be taking a look at each of these possibilities a little later
on.

Basic Syntax

Here is an example to illustrate the basic syntax of a subquery:

SELECT * FROM t1 WHERE column1 = (SELECT column1 FROM t2);

The above example is made up of two parts: “SELECT * FROM t1 WHERE column1 =“ is refered to as the outer query (or outer statement). The
part between the parentheses, “(SELECT
column1 FROM t2)”, is the subquery. A subquery must always
appear within parentheses. Also note that subqueries can be
found on both in the column list and to the right side of the WHERE clause
after a comparison or logical operator such as = (equal), <> (not
equal), <= (less than or equal), >= (greater than or equal), < (less
than), > (greater than), BETWEEN (value lies between two values), NOT
(logical NOT), AND (logical AND), OR (logical OR).

Restrictions

There are few restrictions on the type of statements in which
subqueries can be used. The subquery’s outer statement must be of the type SELECT,
INSERT, UPDATE, DELETE, SET or DO statement, and you cannot modify a table and
select from it in the subquery. Other than that, a subquery can contain any of
the keywords or clauses that an ordinary SELECT can contain, including
DISTINCT, GROUP BY, ORDER BY, LIMIT, joins, index hints, UNION constructs,
comments, functions, and so on.

Some Examples

Some common uses for subqueries are to retrieve the results of an
aggregate function such as SUM or COUNT or to look up a description based on a
code value. In large databases, it’s common practice to use some sort of
general component ID that links to the unique ID field of data tables. Since
there could potentially be dozens or even hundreds of tables that link to the
code table, it is advantageous to use individual subqueries in the SELECT
clause, rather than create many relationships that all link to the same table.
The following query should help to clarify this point. It selects some client
information, including the ID, citizenship, date of birth, gender, a count of
purchases. The citizenship country and gender descriptions are retrieved from
the code_values table. Notice how the count selects from the same table. This
is accomplished by creating a second alias which is linked by the client_id:


SELECT client_id,
(SELECT english_desc
FROM code_values
WHERE code_value_id = cl.citizenship_country_cd) AS citizenship,
client_dob,
(SELECT english_desc
FROM code_values
WHERE code_value_id = cl.client_gender_cd) AS gender
(SELECT count(purchase_id)
FROM clients AS cl2
WHERE cl2.client_id = cl.client_id) AS purchase_count
FROM clients AS cl;

Subqueries that return a single value as the ones above do are called
scalar subqueries. Another place to put an aggregate function
is in the WHERE clause. In this example, a subquery is used instead of a JOIN to compare total_sales to the greatest
monthly_sales:


SELECT associate_name
FROM sales_associates
WHERE total_sales = (SELECT MAX(monthly_sales) FROM sales

The key to using a subquery after a comparison operator is that you
must be certain that the subquery will always return a single value. The following
SELECT statement looks harmless enough, but it could cause an error:


SELECT associate_name
FROM sales_associates
WHERE associate_name = (SELECT name
FROM sales
WHERE seller_id = 99
AND date_of_sale = ‘2003-05-05’);

The above statement will work as long as the seller did not make more
than one sale on the given date. If more than one row is returned, you’ll get
an error such as the following:

“SQL Error: Subquery returns more than 1 row.”

A safer way to compare one value against the results of a subquery is
to use the IN operator, as it will work with any number of rows. All you have
to do is replace the equals (=) with the IN keyword:


SELECT associate_name
FROM sales_associates
WHERE associate_name IN (SELECT name
FROM sales
WHERE seller_id = 99
AND date_of_sale = ‘2003-05-05’);

One special instance of a scalar subquery is one that uses a SELECT
CASE statement, which is a control construct that executes an SQL statement
based on a given condition. It can only return a scalar value, so it’s perfect
for choosing between fields:


(SELECT english_desc
FROM vcode_value
WHERE code_value_id = (SELECT CASE WHEN sfd.recomm_or_final_decision_ind = 1
THEN sfd.recommendation_cd
ELSE sfd.final_decision_cd
END )) AS Recommendation
FROM sts_final_decision sfd
WHERE sfd.business_no <> “”;

Here’s a more complex query that contains a total of four subqueries, two
of which utilize the SELECT CASE statement and one which features the IN
operator. The final_decision_desc is presumably an input parameter of a stored
procedure. Such a proc would allow the user to search for a record by text
description, as users don’t typically know the underlying code values. The
SELECT statement that follows the IN keyword fetches the code (or codes) that
matches the supplied final_decision_desc parameter, which is then compared to
those of the final_decision table:


SELECT final_decision_desc,
(SELECT english_desc
FROM vcode_value
WHERE code_value_id = (SELECT CASE WHEN fd.recomm_or_final_decision_ind = 1
THEN fd.recommendation_cd
ELSE fd.final_decision_cd
END )) AS Recommendation
FROM final_decision cd
WHERE fd.final_decision_cd IN (SELECT ctd.code_value_id
FROM vcode_table_data ctd
WHERE (SELECT CASE WHEN language_cd = ‘E’
THEN english_desc
ELSE french_desc
END) = final_decision_desc;

The examples provided here are really just a taste of what can be done
with subqueries. For more information on subqueries, be sure to check out the
MySQL documentation on Subquery Syntax. They also have a series of articles on the subject. Here is Part
1
.

»


See All Articles by Columnist
Rob Gravelle

Robert Gravelle
Robert Gravelle
Rob Gravelle resides in Ottawa, Canada, and has been an IT guru for over 20 years. In that time, Rob has built systems for intelligence-related organizations such as Canada Border Services and various commercial businesses. In his spare time, Rob has become an accomplished music artist with several CDs and digital releases to his credit.

Latest Articles