A fundamental concept when working with relational databases
is that of Joins. Simply put, a Join allows you to combine records from two
separate tables. In Access, the most common Join is the Inner Join, which combines
records from two tables whenever there are matching values in a common field.
Another common Join is the Outer Join, which comes in two forms. A Left Outer
Join includes all of the records from the first (left) of two tables, even if there
are no matching values for records in the second (right) table, while a Right
Outer Join includes all of the records from the second (right) of two tables,
even if there are no matching values for records in the first (left) table.
Less well known, perhaps, is the Cartesian product, which produces every
possible combination of records between the two tables. In other words, if one
table contains five records and the other table contains four records, the Cartesian
product would contain twenty (5 x 4).
Creating a Cartesian Product
Queries are how Joins are used in Access, and most people use
the Query Builder to create their queries. While you’re probably familiar with
building queries in Access, just allow me to briefly discuss how to do this, so
that we’re all on the same page.
Figure 1 illustrates the details of two simple tables.
Figure 1: Details of two tables, each with two fields.
Table1 has five rows, Table2 has four rows.
If your tables have been created with Relationships (as they
should), when you drag related tables into the Query Builder, they will already
be joined by default. If no lines are present, you can add them by dragging the
field from one table and dropping on the related field in the second table.
Figure 2 shows how you would create a query on the two tables shown in Figure
1. In this case, I’m joining the two tables on Field2, and displaying all four
fields in the results.
Click for larger image
Figure 2: View in the Query Builder, with the two tables
joined on Field2
The SQL that’s generated for that case is shown in Listing 1,
while Figure 3 shows the results of running the query.
SELECT Table1.Field1,
Table1.Field2,
Table2.Field1,
Table2.Field2
FROM Table1 INNER JOIN Table2
ON Table1.Field2 = Table2.Field2;
Listing 1: The SQL for the query shown in Figure 2.
Note how the INNER JOIN operation relates the two tables.
Figure 3: The results of running the INNER JOIN query.
Since the query joins the two tables on Field2, it shows four rows: the rows in
each table with the same values for Field2.
To create a Cartesian product, you do not want any lines
connecting your tables: delete any that might be present.
Figure 4 illustrates how you can create a Cartesian product
through the Query Builder, while Listing 2 shows the SQL that’s generated for
that case. Note that even if you do not explicitly refer to one (or more) of
the tables (i.e.: you do not include any fields from the table in the list of
fields to be displayed), the query will still result in a Cartesian product.
Figure
4: Creating a Cartesian product in the Query Builder. Note that there is no
line joining the two tables
SELECT Table1.Field1,
Table1.Field2,
Table2.Field1,
Table2.Field2
FROM Table1, Table2;
Listing 2: The SQL for the Cartesian product shown in Figure 4
As is shown in Figure 5, running this query will result in a
total of twenty rows.
Figure
5: The results of running the Cartesian product query. Each row in Table1 is
joined to each row in Table2, resulting in a total of twenty rows.
The sample database that accompanies
this article also shows examples of using Left Join and Right Join on the same
two tables, so that you can see the differences between the four queries.
Using Cartesian Products
While some of the examples I’m about to present may seem a
little contrived, it’s my intent to show you that there are legitimate uses for
Cartesian products.
Example 1: Making a query read-only
By definition, Cartesian products are always read-only. That
means should you want to guarantee that the users can’t just change the AllowAdditions/AllowDeletions
or AllowEdits properties of a form and change the data, you could set the RecordSource
of the report to a query that’s a Cartesian product.
The simplest way to do this would be to have a table with a
single row in it. (What’s in that row is irrelevant). You’d simply have to add
a reference to that single row table in the SQL of the query, and you’ve made
the query read-only.
In Example 3, I’ll talk about the Anniversaries table that’s
included in the sample database that accompanies this article. For now, suffice
it to say that it’s a table that has three fields: a MonthNumber, a DayNumber
and a Description that’s intended to be used to represent events that occur
each year on a given date.
Query qryExample1a_UpdatableQuery is a typical query such as you’d
use for the RecordSource of a form to allow you to update the table. Its SQL is
shown in Listing 3.
SELECT Anniversaries.MonthNumber,
Anniversaries.DayNumber,
Anniversaries.Description
FROM Anniversaries
ORDER BY MonthNumber, DayNumber;
Listing 3: The SQL for qryExample1a_UpdatableQuery, an updatable query.
Note, though, that if I have a simple table with a single row
in it (such as is shown in Figure 6), I can use it in qryExample1b_NonupdatableQuery
(as shown in Listing 4), thus making it read-only.
Figure
6: As its name implies, TableWithOneRow has a single row of data in it. Since
we do not use that data, what’s stored in that row is irrelevant.
SELECT Anniversaries.MonthNumber,
Anniversaries.DayNumber,
Anniversaries.Description
FROM Anniversaries, TableWithOneRow
ORDER BY MonthNumber, DayNumber;
Listing 4: The SQL for qryExample1b_NonupdatableQuery, an
read-only query. Note that the only difference from Listing 3 is the inclusion
of ", TableWithOneRow" in the FROM clause.
As I’ve said, the content of TableWithOneRow is irrelevant. It’s
not uncommon to have a requirement for a table with a single row to hold
important pieces of information (client name, path to a logo file, etc.), so
you may already have such a table in your application.
Example 2: Making multiple copies of a record
In Example 1, I used a table with a single row so that I wouldn’t
duplicate the actual data of interest. Sometimes, though, you may want to
create multiple copies of the data in your tables.
A request I see in the newsgroups is along the lines of
"We’re going from a dot matrix printer that was capable of handling three
part carbon paper to a laser printer, which cannot handle multipage forms. How
can I print three copies of each report?" The easiest way to do this is to
ensure that the RecordSource of the report has three copies of each record
(sorted, of course, so that all of copy one are together, followed by all of
copy two, followed by all of copy three).
For this specific requirement, therefore, you could introduce
a table with three rows in it, but in this case, the data contained in the
three rows would matter, since you need to be able to sort the three copies of
the data appropriately. What you might do is have a field with values of, say,
"Original", "Customer Copy" and "File Copy".
Include that field (and sort on it) in the RecordSource, and your requirements
are met.
In the sample database that accompanies this article, I’ve
shown a different example of making multiple copies of a record. Access, like
other Office products, makes it very easy to print to a wide variety of
purchased labels, as well as to other formats such as business cards, binder
spine templates and so on. Sometimes you may wish to print multiple copies of a
particular label or card.
From the preceding discussion, it should be obvious that
should you want twenty copies of a given label, you could simply include a table
with twenty rows of data in the query being used as a RecordSource for the
report. However, do you really want to have to include tables with multiple
rows to handle every eventuality? Well, it turns out that you don’t really need
dummy tables with a huge number of rows: one table, with ten rows (the digits 0
through 9) can be used in a Cartesian product query to give you virtually every
possibility.
Figure
7: Table Digits simply contains the digits 0 through 9.
For instance, the sample database contains a query qryExample2a_NumericValuesFrom0To999
that makes three references to the Digits table in a Cartesian product query
illustrated in Listing 5.
SELECT H.WhatDigit*100+T.WhatDigit*10+U.WhatDigit AS NumericValue,
H.WhatDigit AS Hundreds,
T.WhatDigit AS Tens,
U.WhatDigit AS Units
FROM Digits AS H, Digits AS T, Digits AS U
ORDER BY 1;
Listing 5: The SQL for qryExample2a_NumericValuesFrom0To999,
which results in one thousand rows (the values 0 through 999). Note that it’s
not necessary to include the individual digit fields as I did.
Now, rather than having the RecordSource of the label report
simply be the typical query against my UserAddress table, I can use qryExample2b_QueryForLabelReport,
shown in Listing 6.
PARAMETERS [How many labels do you want? (max 1000)] Long;
SELECT C.NumericValue,
U.UserName,
U.UserAddressLine1,
U.UserAddressLine2,
U.UserCity,
U.UserStateOrProv,
U.UserPostCode
FROM UserAddress AS U,
qryExample2a_NumericValuesFrom0To999 AS C
WHERE C.NumericValue<[How many labels do you want? (max 1000)]))
ORDER BY 1;
Listing 6: The SQL for qryExample2b_QueryForLabelReport.
Note that it prompts the user for the number of copies required.
Note that in the query, I’m explicitly declaring that the
prompt [How many labels do you want? (max 1000)] is a Long Integer. Including
Parameters Declarations is usually a good idea (and is actually required in
certain situations, such as Crosstab queries)