Just SQL Part IV – Joining Tables

As your database grows so will the need to get
information from more than one table. This article shows you the different join
options and some simple examples to raise your familiarity.

Wouldn’t it be nice if every piece of data were stored in
one table? Well this is obviously never going to happen. The simple rules of
normalization dictate that there will be more than one table in your database.
As such, we have in our databases some tables that are not related and some
that are. This article is concerned with those tables in our databases that are
related and how we can join the information between them. This article will
also present the old method of writing SQL (SELECT…FROM…WHERE) and the new
ANSI syntax (SELECT…FROM…JOIN).

So what is this join? A join is nothing more than writing a
query that takes a set of rows from one or more tables and combines them
together. The tables that will participate in the join are listed in the FROM
clause of the SQL query and then, depending on the syntax we choose, rows from
each table are linked by either the WHERE clause or the JOIN clause. Of course,
any columns from the tables involved in the join may be included in the select
list for display.

For the discussion of joins, we will concentrate on the
following two very simple tables. Table_A and Table_B each have one column
named LETTER. I have given a set of values for each of these two tables and
have intentionally left empty cells to show where the values are equal and
where they are missing one value from either table. The reason will come clear
when we try to join these tables together for the examples. I have given some
sample data that contains all conditions. Where both tables have a value that
match, and where each table has a letter that does not exist in the other
table.

Table_A.letter

Table_B.letter

A

A

B

C

Equijoins or Inner Join

Also called a simple join where an equality operator is
supplied to join tables together. The following SQL joins Table_A against
Table_B where they share the same letter value.


SQL >SELECT Table_A.letter, Table_B.letter
2 FROM Table_A, Table_B
3 WHERE Table_A.letter = Table_B.letter;
LETTER LETTER
———- ———-
A A

ANSI Syntax

The ANSI syntax to product an equijoin is the INNER JOIN…ON
clause. The following SQL will produce the same results as the old method.
Notice that the ANSI syntax does not use a comma between tables in the FROM
clause but instead clearly describes the type of join to occur. Then the WHERE
clause turns into an ON clause keeping the arguments and predicate the same.

There are actually a few different ways to do this same join
depending on whether the two tables have the same column names or how you might
want to skimp on verbiage. You could code the query using just JOIN…ON
(leaving out the INNER keyword, NATURAL JOIN, or JOIN…USING (where a supplied
list of common columns are in a ‘using’ list). I personally like to stick with
the INNER JOIN…ON clause, because it is the most complete when describing
what conditions the query will be joined on. The others assume too much from
the table layouts and leave out specifics from the query that could confuse
someone latter down the line that may need to read your SQL code. Look these
variations up in the manuals if you care to use them. So, here is the explicit
SQL to do an equijoin with the ANSI syntax.


SQL >SELECT Table_A.letter, Table_B.letter
2 FROM Table_A INNER JOIN Table_B
3 ON Table_A.letter = Table_B.letter;
LETTER LETTER
———- ———-
A A

Self Joins

A self join is a special form of equijoin or INNER JOIN
where a table is joined against itself. This means that the table must exists
two times in the FROM clause of the SQL query. There are many different ways a
table can be joined to itself, basically the join methods we are showing in
this article, but since we have currently only shown the equijoin that is what
we will use in this example. Note that when joining a table to itself an alias
must be used for each of the tables in the FROM clause and then also used in
the select list and WHERE clause. This particular query does not show much
benefit but it is only because we have meaningless data. This type of join
often becomes more important when there is a self-referencing table or hidden
meaning across the columns of a table.


SQL >SELECT A1.letter, A2.letter
2 FROM Table_A A1, Table_A A2
3 WHERE A1.letter = A2.letter;
LETTER LETTER
———- ———-
A A
B B

ANSI Syntax


SQL >SELECT A1.letter, A2.letter
2 FROM Table_A A1 INNER JOIN Table_A A2
3 ON A1.letter = A2.letter;
LETTER LETTER
———- ———-
A A
B B

Left Outer Joins

Often times we need to return rows from one table even if
there are no matching rows that are produced through a join condition. For this
situation, we use outer joins.

A left outer join is where the table, on the left of a FROM
clause is required to return all of its rows regardless of having matching rows
from the table it is being joined on. So in the following SQL we want to join
Table_A to Table_B and show where they are equal on the column LETTER as well
as return rows from Table_A that do not have a match on Table_B.

The old method in Oracle was to supply the plus sign in
parentheses (+) next to all columns in the WHERE clause that may not have
values that will match against the table being joined to. Here is the old
method in Oracle to produce a left outer join.


SQL >SELECT Table_A.letter, Table_B.letter
2 FROM Table_A, Table_B
3 WHERE Table_A.letter = Table_B.letter(+);
LETTER LETTER
———- ———-
A A
B

ANSI Syntax

The ANSI syntax is much easier when performing outer joins.
All you need to do is change some simple verbiage in the FROM clause. Just
replace the INNER JOIN from the previous example to LEFT OUTER JOIN. It states
exactly what you want to perform. Join on column LETTER but return all rows
from Table_A (the left table in the FROM clause) regardless of having a good
match condition against Table_B.


SQL >SELECT Table_A.letter, Table_B.letter
2 FROM Table_A LEFT OUTER JOIN Table_B
3 ON Table_A.letter = Table_B.letter;
LETTER LETTER
———- ———-
A A
B

Right Outer Joins

A right outer join is just the opposite of a left outer
join. It states that you would like all rows from the right table in the FROM
clause to be returned regardless of having a true match defined in the WHERE
clause against the left side table in the FROM clause. Here is the old method
in Oracle for producing the right outer join.


SQL >SELECT Table_A.letter, Table_B.letter
2 FROM Table_A, Table_B
3 WHERE Table_A.letter(+) = Table_B.letter;
LETTER LETTER
———- ———-
A A
C

ANSI Syntax

Again the ANSI syntax is much easier. Now just use the RIGHT
OUTER JOIN clause.


SQL >SELECT Table_A.letter, Table_B.letter
2 FROM Table_A RIGHT OUTER JOIN Table_B
3 ON Table_A.letter = Table_B.letter;
LETTER LETTER
———- ———-
A A
C

Full Outer Joins

The full outer join is a special condition. With Oracle’s
old method of supplying (+) notation, there is no method to explicitly state in
a SQL statement that you want to return both left and right sides of a query
regardless of having a match. To accomplish this you must write a left outer
join SQL statement and UNION it with a right outer join SQL statement. Doing
that produces the following SQL and results.


SQL >SELECT Table_A.letter, Table_B.letter
2 FROM Table_A, Table_B
3 WHERE Table_A.letter = Table_B.letter(+)
4 UNION
5 SELECT Table_A.letter, Table_B.letter
6 FROM Table_A, Table_B
7 WHERE Table_A.letter(+) = Table_B.letter;
LETTER LETTER
———- ———-
A A
B
C

ANSI Syntax

Again, the ANSI syntax is much clearer and straight to the
point. Just include the FULL OUTER JOIN syntax to the FROM clause and you get
straight to what you want. No messy UNION statement.


SQL >SELECT Table_A.letter, Table_B.letter
2 FROM Table_A FULL OUTER JOIN Table_B
3 ON Table_A.letter = Table_B.letter;
LETTER LETTER
———- ———-
A A
B
C

Cartesian Products

A cartesian join is a query where there is actually no join
criteria between the table in the query. What is returned is a cartesian
product where each row from a table is matched against every row in the other
table. So in our example where Table_A has two rows and Table_B has two rows
there will be 2 X 2 or 4 rows returned.

The old method was to exclude the WHERE clause.


SQL >SELECT Table_A.letter, Table_B.letter
2 FROM Table_A, Table_B;
LETTER LETTER
———- ———-
A A
A C
B A
B C

ANSI Syntax

The ANSI syntax uses the CROSS JOIN clause without the ON
clause.


SQL >SELECT Table_A.letter, Table_B.letter
2 FROM Table_A CROSS JOIN Table_B;
LETTER LETTER
———- ———-
A A
A C
B A
B C

I hope that this article has introduced you to some joining
concepts–especially the ANSI syntax. I know I have grown accustomed to Oracle’s
old method of writing SQL for join conditions. Anytime we can switch our thinking
from one database vendor’s implementation to a standard that is used across
various vendors we will be better off. I think this is one situation where you
would want to switch. The ANSI syntax is much easier to understand and is
platform independent.

»


See All Articles by Columnist
James Koopmann

James Koopmann
James Koopmann
James Koopmann has fourteen years of database design, development and performance tuning experience. In addition, he has extensive database administration experience in Oracle and other relational databases in production environments, specializing in performance tuning of database engines and SQL based applications. Koopmann is an accomplished author with several technical papers in various Oracle related publications such as Oracle Magazine, Oracle Professional and SQL>UPDATE_RMOUG. He is a featured author and database expert for DatabaseJournal, a member of the editorial review committee for Select Journal (The Magazine for the International Oracle Users Group), an Oracle Certified Professional DBA and noted speaker at local Oracle User Groups around the country.

Latest Articles