Make Your MySQL Queries More Efficient with Temporary Tables

Knowing how to use temporary
tables in your stored procedures is a great way to save time and effort,
particularly when dealing with multiple queries that return very large and/or
similar resultsets. If you’re fetching
several resultsets that contain a lot of
common data, it can make sense to obtain
a single resultset that has all the data that’s required, store it in a temporary
table, and then select from this temporary table as needed. Similarly, when repeatedly referring to the same
subset of table data, it’s often faster and more economical to store results in
a temporary table, rather than repeating
a complex join several times. In today’s article, we’ll review just
how easily temporary tables can be utilized to make your stored procedures more
efficient.

When to use Temporary Tables

Temporary
Tables, or temp tables for short, allow you to create a short-term storage
place within the database for a set of data that you need to use several times
in a single series of operations. Temp tables come into play when it isn’t
possible to retrieve all the data that you require using one SELECT statement
or when you want to work with subsets of the same, larger resultset over
several successive operations. Temp tables are supported in MySQL 3.23 and
later.

To
create one, all you need to do is include the TEMPORARY keyword in a table
creation statement:


CREATE TEMPORARY TABLE temp_table (count int);

You can now use
the table in a query:


INSERT INTO temp_table VALUES((1),(2),(3));

SELECT * FROM temp_table;

Returns:

Count
—–
1
2
3

You don’t have
to worry about removing the table when you’re done because it will be deleted once
you log out. Attempting to refer to it in a later session will result in an
error such as:


Table ‘<database>.temp_table’ doesn’t exist.

If you
need to, you can delete temp tables using the same command as for regular ones:


DROP TABLE temp_table;

Using a Temp Table to Combine Data

One very common
use of temp tables is to combine data from different sources together to
produce a single result set. In the following example, the “temp_table_article”
database contains three tables: one that holds client data, one that
stores citizenship information, and a relationship table that
links them:


CREATE TABLE ‘temp_table_article’.’client’ (
’client_id’ INT UNSIGNED NOT NULL AUTO_INCREMENT,
’date_of_birth’ DATE,
’gender’ CHAR (1),
’logical_delete_indicator’ TINYINT (1) DEFAULT ‘0’ NOT NULL,
PRIMARY KEY(‘client_id’),
UNIQUE(‘client_id’)
);

INSERT INTO temp_table_article.client VALUES(1, ‘1976-02-12’, ‘M’, 1);
INSERT INTO temp_table_article.client VALUES(2, ‘1944-01-15’, ‘F’, 0);
INSERT INTO temp_table_article.client VALUES(3, ‘1956-06-04’, ‘M’, 1);
INSERT INTO temp_table_article.client VALUES(4, ‘1938-11-19’, ‘F’, 0);

CREATE TABLE ‘temp_table_article’.’citizenship’ (
’citizenship_id’ INT UNSIGNED AUTO_INCREMENT,
’country_code’ INT UNSIGNED,
’primary_citizenship’ TINYINT UNSIGNED DEFAULT ‘1’ NOT NULL,
’logical_delete_indicator’ TINYINT UNSIGNED DEFAULT ‘0’ NOT NULL,
PRIMARY KEY(‘citizenship_id’),
UNIQUE(‘citizenship_id’)
);

INSERT INTO temp_table_article.citizenship VALUES(1, 23, 1, 0);
INSERT INTO temp_table_article.citizenship VALUES(2, 22, 0, 1);
INSERT INTO temp_table_article.citizenship VALUES(3, 19, 1, 1);
INSERT INTO temp_table_article.citizenship VALUES(4, 20, 0, 0);
INSERT INTO temp_table_article.citizenship VALUES(5, 19, 1, 1);
INSERT INTO temp_table_article.citizenship VALUES(6, 56, 1, 0);
INSERT INTO temp_table_article.citizenship VALUES(7, 77, 1, 0);

CREATE TABLE ‘temp_table_article’.’client_citizenship_rel’ (
’relationship_id’ INT UNSIGNED NOT NULL AUTO_INCREMENT,
’client_id’ INT UNSIGNED NOT NULL,
’citizenship_id’ INT UNSIGNED NOT NULL, PRIMARY KEY(‘relationship_id’),
UNIQUE(‘relationship_id’)
);

INSERT INTO temp_table_article.client_citizenship_rel VALUES(1, 1, 1);
INSERT INTO temp_table_article.client_citizenship_rel VALUES(2, 1, 2);
INSERT INTO temp_table_article.client_citizenship_rel VALUES(3, 1, 3);
INSERT INTO temp_table_article.client_citizenship_rel VALUES(4, 2, 4);
INSERT INTO temp_table_article.client_citizenship_rel VALUES(5, 2, 5);
INSERT INTO temp_table_article.client_citizenship_rel VALUES(6, 3, 6);
INSERT INTO temp_table_article.client_citizenship_rel VALUES(7, 3, 7);
INSERT INTO temp_table_article.client_citizenship_rel VALUES(8, 2, 7);

The preceding tables’ relationship can be represented graphically as follows:

The preceding tables’ relationship can be represented graphically

I have
stated before that normalized structures such as the one above are great for
efficiently storing data, but are actually less-than-ideal for querying. We
can see for ourselves as soon as we try to obtain a list of active clients and
their related citizenships. Assuming that citizenships are not mandatory, we
would want all of the records in the client table whose
logical_delete_indicator are equal to 0. Joining the tables using common
fields, we could write the following SELECT statement:


SELECT cl.*,
cit.*
FROM temp_table_article.client AS cl,
temp_table_article.citizenship AS cit,
temp_table_article.client_citizenship_rel AS rel
WHERE cl.client_id = rel.client_id
AND cit.citizenship_id = rel.citizenship_id
AND cl.logical_delete_indicator = 0
AND cit.logical_delete_indicator = 0;

It does limit
the results to active clients, but it also misses those who do not possess any
associated citizenships. Here are the first three fields of the preceding
query’s results:


client_id date_of_birth gender …
2 1944-01-15 F
2 1944-01-15 F

We could try an
OUTER JOIN to include all of the client records, regardless of whether or not
they possess any related citizenships:


SELECT cl.*,
cit.*
FROM temp_table_article.client AS cl,
temp_table_article.citizenship AS cit
LEFT OUTER JOIN temp_table_article.client_citizenship_rel rel ON rel.client_id
WHERE cit.citizenship_id = rel.citizenship_id
AND cl.logical_delete_indicator = 0
AND cit.logical_delete_indicator = 0;

This solution is
even more inadequate than the first, as it produces a cartesian product, which
is a dataset of two relations whose join is not restricted by any criteria,
resulting in every tuple of the first relation being matched with every tuple
of the second relation:


client_id date_of_birth gender …
2 1944-01-15 F
4 1938-11-19 F
2 1944-01-15 F
4 1938-11-19 F
2 1944-01-15 F
4 1938-11-19 F
2 1944-01-15 F
4 1938-11-19 F
2 1944-01-15 F
4 1938-11-19 F

What we’d like to do
is append the extra records to the first query to include the orphan client
rows. One way to do that is by using a temporary table. Here’s the code to
store the fields that we are interested in:


CREATE TEMPORARY TABLE client_citizenship (
client_id INT NOT NULL,
date_of_birth DATE,
gender CHAR (1),
citizenship_id INT,
country_code INT ,
primary_citizenship TINYINT
);

To populate the
table, we’ll use an INSERT INTO statement. It supports values or the results
of a SELECT query, as we’ll be doing:


INSERT INTO client_citizenship
SELECT cl.client_id,
cl.date_of_birth,
cl.gender,
cit.citizenship_id,
cit.country_code,
cit.primary_citizenship
FROM temp_table_article.client AS cl,
temp_table_article.citizenship AS cit,
temp_table_article.client_citizenship_rel AS rel
WHERE cl.client_id = rel.client_id
AND cit.citizenship_id = rel.citizenship_id
AND cit.logical_delete_indicator = 0
AND cl.logical_delete_indicator = 0;

We can now
perform a “SELECT ALL” on our temp table to see how we’re coming along:


SELECT * FROM client_citizenship;

Returns:

client_id, date_of_birth, gender, citizenship_id, country_code, primary_citizenship
2, 1944-01-15, F, 4, 20, 0
2, 1944-01-15, F, 7, 77, 1

Now
all that’s missing are client records who do not possess associated
citizenships. To fetch those, we can perform a second INSERT which includes a
“NOT IN” clause. The idea is to find those client records that do not appear
in the relationship table. To do that, we can perform a subquery that
retrieves just the client_ids. Unfortunately, you cannot
refer to a temporary table more than once in the same query in MySQL, so we
can’t compare the client_ids directly to those which are already in the temp
table. For example, the following won’t work:


INSERT INTO client_citizenship
SELECT <fields>
FROM temp_table_article.client AS cl,
client_citizenship AS cc
WHERE cl.client_id <> cc.client_id
AND cl.logical_delete_indicator = 0;

It results in
the following error:


ERROR 1137: Can’t reopen table: ‘client_citizenship’

Instead of
reading from the temp table, we can retrieve the client_ids from the relationship
table:


INSERT INTO client_citizenship
SELECT cl.client_id,
cl.date_of_birth,
cl.gender,
NULL,
NULL,
NULL
FROM temp_table_article.client AS cl
WHERE cl.client_id NOT IN
(SELECT rel.client_id
FROM temp_table_article.client_citizenship_rel rel)
AND cl.logical_delete_indicator = 0;

Even though we
aren’t looking for citizenship fields, we still have to insert something in
every column of the temp table. Nulls are always a good choice, as they work
for any data type. Like last time, we want to check that the row has not been
logically deleted.

Now, we can
select our final resultset from the temp table:


SELECT * FROM client_citizenship;

Returns:

client_id, date_of_birth, gender, citizenship_id, country_code, primary_citizenship
2, 1944-01-15, F, 1, 23, 1
2, 1944-01-15, F, 7, 77, 1
4, 1938-11-19, F, NULL, NULL, NULL

In this example,
I used codes where ever possible, just like in a real world database. To pick
up the text descriptions, we would include subqueries to a code table for each
field in our final SELECT statement. We’ll be looking at subqueries next time,
so I’ll elaborate on this point then.

»


See All Articles by Columnist
Rob Gravelle

Robert Gravelle
Robert Gravelle
Rob Gravelle resides in Ottawa, Canada, and has been an IT guru for over 20 years. In that time, Rob has built systems for intelligence-related organizations such as Canada Border Services and various commercial businesses. In his spare time, Rob has become an accomplished music artist with several CDs and digital releases to his credit.

Latest Articles