Fetching Data from Multiple Tables using Joins

August 28, 2009

Normalization is almost universally applied to relational databases such as MySQL in order to optimize tables for general-purpose querying and to rid them of certain undesirable characteristics that could lead to a loss of data integrity. Doing so tends to promote better accuracy of queries, but it also leads to queries that take a little more work to develop, as the data may be spread amongst several tables. In today’s article, we’ll learn how to fetch data from multiple tables by using joins.

Two Syntaxes

There are two accepted syntax styles for writing joins: ANSI-style joins and theta-style joins. ANSI syntax uses the JOIN and ON keywords, as in the following example:

SELECT 	field1, 
	field2, 
FROM 	my_table 	t1
JOIN	my_other_table	t2
ON	t1.primary_id_field = t2.foreign_key_id_field
WHERE	t1.lastname	  = 'Smith';

The JOIN keyword is used to separate the names of the tables being joined, and the ON clause contains the relation showing which column is being used as the join key.

In theta-style syntax, the table joins are simply added to the WHERE clause:

SELECT 	field1, 
	field2, 
FROM 	my_table 	t1,
	My_other_table	t2
WHERE	t1.primary_id_field = t2.foreign_key_id_field
AND	t1.lastname	  = 'Smith';

MySQL also supports a nonstandard extension of the ANSI syntax that can be used as a sort of shorthand for when the join column has the same name in both joined tables:

SELECT 	field1, 
	field2, 
FROM 	my_table 	t1
JOIN	my_other_table	t2
USING	(t1.id_field)
WHERE	t1.lastname	  = 'Smith';

So Which Syntax is Best?

The ANSI syntax is generally preferable to theta style because it’s usually easier to read and understand, particularly when writing joins involving numerous tables. There are also some types of joins that can’t be written using theta-style notation in MySQL.

Join Types

In order to join tables together, there has to be some common data that allow those tables to be connected in some meaningful way. Although it’s possible to have more than one common column between two tables, most often, the join key will be the primary key of one table and a foreign key in the other.

To illustrate, we’ll perform queries against the following sample tables:

manufacturer

id

description

1

ACURA

2

ALFA ROMEO

3

ASUNA

4

AUDI

5

BMW

6

BUICK

7

CADILLAC

8

CHEVROLET

9

CHRYSLER

10

DAEWOO

11

DODGE

12

EAGLE

13

FORD

14

GEO

15

GMC

16

HONDA

17

HYUNDAI

18

INFINITI

19

ISUZU

20

JAGUAR

21

LADA

22

LEXUS

23

LINCOLN

24

MAZDA

25

MERCEDES-BENZ

26

MERCURY

27

NISSAN

28

OLDSMOBILE

29

PLYMOUTH

30

PONTIAC

31

PORSCHE

32

SAAB

33

SATURN

34

SUBARU

35

SUZUKI

36

TOYOTA

37

VOLKSWAGEN

38

VOLVO

39

JEEP

40

LAND ROVER

41

(NULL)

42

New Car Co

         

model

id

*manufacturer_id

description

1

ACURA

INTEGRA

2

ACURA

CL

3

ACURA

LEGEND

4

ACURA

RL

5

ACURA

NSX

6

ACURA

TL

7

ACURA

VIGOR

8

ACURA

EL

9

ASUNA

SEDAN

10

ASUNA

HATCHBACK

11

ACURA

NSX-T

12

ALFA ROMEO

SEDAN LS

13

ALFA ROMEO

SPIDER

14

AUDI

A4

15

AUDI

A6

16

AUDI

A8

17

AUDI

S4

18

AUDI

S6

19

AUDI

90 SERIES

20

AUDI

CABRIOLET

21

AUDI

100 SERIES

22

AUDI

V8 QUATTRO

23

BMW

318

24

BMW

323

25

BMW

328

26

BMW

M3

27

BMW

320

28

BMW

325

29

BMW

Z3 ROADSTER

30

BMW

ROADSTER M 3.2

31

BMW

528

32

BMW

540

33

BMW

525

34

BMW

530

35

BMW

M5

36

BMW

535

37

BMW

740

38

BMW

750

39

BMW

40

BMW

840

41

BMW

850

42

(NULL)

Custom

*Note: In a normalized database, the Manufacturer ID would be stored in the Models table. Here, I included the full description to better illustrate which manufacturers are associated with which models.

The first table contains automobile manufacturers; the second, some models that are built by the first several auto manufacturers. The common data between the two tables is the manufacturer, which is linked by manufacturer ID.

Now we’ll extract some data from the tables, using different join types in ANSI syntax.

Inner Join

An inner join is defined as a join in which unmatched rows from either table are not to be returned. In other words, the rows must match in both tables in order to be included in the result set.

SELECT 		t1.description AS 'Manufacturer', 
		t2.description AS 'Model'
FROM 		manufacturer 	t1
INNER JOIN	model		t2
ON		t1.id = t2.manufacturer_id
WHERE		t1.description = 'ACURA';

The "INNER" keyword is not required, but it is considered good practice to include it.

Typing the query above in the MySQL Command Line Client produces the following:

mysql> SELECT           t1.description AS 'Manufacturer',
    ->                  t2.description AS 'Model'
    -> FROM             manufacturer    t1
    -> INNER JOIN       model           t2
    -> ON               t1.id = t2.manufacturer_id
    -> WHERE            t1.description = 'ACURA';
+--------------+---------+
| Manufacturer | Model   |
+--------------+---------+
| ACURA        | INTEGRA |
| ACURA        | CL      |
| ACURA        | LEGEND  |
| ACURA        | RL      |
| ACURA        | NSX     |
| ACURA        | TL      |
| ACURA        | VIGOR   |
| ACURA        | EL      |
| ACURA        | NSX-T   |
+--------------+---------+
9 rows in set (0.00 sec)

Outer Join

Outer joins will return records in one table that aren’t matched in another. Outer joins can be further divided into the two types of left and right. In a left outer join, all records from the first (left-hand) table in a join that meet any conditions set in the WHERE clause are returned, whether or not there’s a match in the second (right-hand) table:

mysql> SELECT           t1.description AS 'Manufacturer',
    ->                  t2.description AS 'Model'
    -> FROM             manufacturer   t1
    -> LEFT JOIN        model          t2
    -> ON        		t1.id = t2.manufacturer_id
    -> WHERE		t1.description = 'New Car Co';

+--------------+----------------+
| Manufacturer | Model          |
+--------------+----------------+
| New Car Co   | (NULL)         |
+--------------+----------------+
19 rows in set (0.00 sec)

The ‘New Car Co’ is returned even though there are no associated models in the model table.

Similar to the left outer join, a right outer join returns all records from the second (right-hand) table in a join that meet any conditions set in the WHERE clause, whether or not there’s a match in the first (left-hand) table:

mysql> SELECT           t1.description AS 'Manufacturer',
    ->                  t2.description AS 'Model'
    -> FROM             manufacturer    t1
    -> RIGHT JOIN       model           t2
    -> ON               t1.id = t2.manufacturer_id
    -> WHERE		t2.description = 'Custom';
+--------------+--------+
| Manufacturer | Model  |
+--------------+--------+
| (NULL)       | Custom |
+--------------+--------+
1 row in set (0.00 sec)

The ‘Custom’ model is returned even though there is no associated manufacturer.

Cross-join

The cross-join, also referred to as a Cartesian product, returns all the rows in all the tables listed in the query.  Each row in the first table is paired with all the rows in the second table.  This happens when there is no relationship defined between the two tables. 

Note that, most of the time, we do not want a Cartesian join, and we end up with one because we failed to provide a filter on the join.  Result sets can get large quickly because the amount of data in the select is the number of rows in Table A multiplied by the number of rows in Table B. If you have more than two tables this multiplies at an exponential rate.

If we actually want a Cartesian join, then we should use the ANSI cross join to tell others reading the script that we actually wanted a Cartesian join. So why would we want one? One reason might be to produce all the combinations of 1, 2 and 3, which could be used as part of a password or ID generation process:

numbers

num

1

2

3

mysql> SELECT CONCAT(   CAST(t1.num AS CHAR),
    ->                  CAST(t2.num AS CHAR)) AS combinations
    -> FROM numbers t1, numbers t2;
+--------------+
| combinations |
+--------------+
| 11           |
| 21           |
| 31           |
| 12           |
| 22           |
| 32           |
| 13           |
| 23           |
| 33           |
+--------------+
9 rows in set (0.00 sec)

Now you’ve got every permutation of number combinations for two digits!

Knowing how to link tables is of great assistance in extracting data from normalized databases, but it may not always be enough. There will be times that no combination of joins will suffice to properly filter the data. In those cases, it may be necessary to use temporary tables. We’ll be looking at those in the next article.

» See All Articles by Columnist Rob Gravelle








The Network for Technology Professionals

Search:

About Internet.com

Legal Notices, Licensing, Permissions, Privacy Policy.
Advertise | Newsletters | E-mail Offers