Uses for Cartesian Products in MS Access
September 18, 2009
A fundamental concept when working with relational databases is that of Joins. Simply put, a Join allows you to combine records from two separate tables. In Access, the most common Join is the Inner Join, which combines records from two tables whenever there are matching values in a common field. Another common Join is the Outer Join, which comes in two forms. A Left Outer Join includes all of the records from the first (left) of two tables, even if there are no matching values for records in the second (right) table, while a Right Outer Join includes all of the records from the second (right) of two tables, even if there are no matching values for records in the first (left) table. Less well known, perhaps, is the Cartesian product, which produces every possible combination of records between the two tables. In other words, if one table contains five records and the other table contains four records, the Cartesian product would contain twenty (5 x 4).
Creating a Cartesian Product
Queries are how Joins are used in Access, and most people use the Query Builder to create their queries. While you're probably familiar with building queries in Access, just allow me to briefly discuss how to do this, so that we're all on the same page.
Figure 1 illustrates the details of two simple tables.
If your tables have been created with Relationships (as they should), when you drag related tables into the Query Builder, they will already be joined by default. If no lines are present, you can add them by dragging the field from one table and dropping on the related field in the second table. Figure 2 shows how you would create a query on the two tables shown in Figure 1. In this case, I'm joining the two tables on Field2, and displaying all four fields in the results.
Click for larger image
The SQL that's generated for that case is shown in Listing 1, while Figure 3 shows the results of running the query.
SELECT Table1.Field1, Table1.Field2, Table2.Field1, Table2.Field2 FROM Table1 INNER JOIN Table2 ON Table1.Field2 = Table2.Field2;Listing 1: The SQL for the query shown in Figure 2. Note how the INNER JOIN operation relates the two tables.
To create a Cartesian product, you do not want any lines connecting your tables: delete any that might be present.
Figure 4 illustrates how you can create a Cartesian product through the Query Builder, while Listing 2 shows the SQL that's generated for that case. Note that even if you do not explicitly refer to one (or more) of the tables (i.e.: you do not include any fields from the table in the list of fields to be displayed), the query will still result in a Cartesian product.
SELECT Table1.Field1, Table1.Field2, Table2.Field1, Table2.Field2 FROM Table1, Table2;Listing 2: The SQL for the Cartesian product shown in Figure 4
As is shown in Figure 5, running this query will result in a total of twenty rows.
The sample database that accompanies this article also shows examples of using Left Join and Right Join on the same two tables, so that you can see the differences between the four queries.
Using Cartesian Products
While some of the examples I'm about to present may seem a little contrived, it's my intent to show you that there are legitimate uses for Cartesian products.
Example 1: Making a query read-only
By definition, Cartesian products are always read-only. That means should you want to guarantee that the users can't just change the AllowAdditions/AllowDeletions or AllowEdits properties of a form and change the data, you could set the RecordSource of the report to a query that's a Cartesian product.
The simplest way to do this would be to have a table with a single row in it. (What's in that row is irrelevant). You'd simply have to add a reference to that single row table in the SQL of the query, and you've made the query read-only.
In Example 3, I'll talk about the Anniversaries table that's included in the sample database that accompanies this article. For now, suffice it to say that it's a table that has three fields: a MonthNumber, a DayNumber and a Description that's intended to be used to represent events that occur each year on a given date.
Query qryExample1a_UpdatableQuery is a typical query such as you'd use for the RecordSource of a form to allow you to update the table. Its SQL is shown in Listing 3.
SELECT Anniversaries.MonthNumber, Anniversaries.DayNumber, Anniversaries.Description FROM Anniversaries ORDER BY MonthNumber, DayNumber;Listing 3: The SQL for qryExample1a_UpdatableQuery, an updatable query.
Note, though, that if I have a simple table with a single row in it (such as is shown in Figure 6), I can use it in qryExample1b_NonupdatableQuery (as shown in Listing 4), thus making it read-only.
SELECT Anniversaries.MonthNumber, Anniversaries.DayNumber, Anniversaries.Description FROM Anniversaries, TableWithOneRow ORDER BY MonthNumber, DayNumber;Listing 4: The SQL for qryExample1b_NonupdatableQuery, an read-only query. Note that the only difference from Listing 3 is the inclusion of ", TableWithOneRow" in the FROM clause.
As I've said, the content of TableWithOneRow is irrelevant. It's not uncommon to have a requirement for a table with a single row to hold important pieces of information (client name, path to a logo file, etc.), so you may already have such a table in your application.
Example 2: Making multiple copies of a record
In Example 1, I used a table with a single row so that I wouldn't duplicate the actual data of interest. Sometimes, though, you may want to create multiple copies of the data in your tables.
A request I see in the newsgroups is along the lines of "We're going from a dot matrix printer that was capable of handling three part carbon paper to a laser printer, which cannot handle multipage forms. How can I print three copies of each report?" The easiest way to do this is to ensure that the RecordSource of the report has three copies of each record (sorted, of course, so that all of copy one are together, followed by all of copy two, followed by all of copy three).
For this specific requirement, therefore, you could introduce a table with three rows in it, but in this case, the data contained in the three rows would matter, since you need to be able to sort the three copies of the data appropriately. What you might do is have a field with values of, say, "Original", "Customer Copy" and "File Copy". Include that field (and sort on it) in the RecordSource, and your requirements are met.
In the sample database that accompanies this article, I've shown a different example of making multiple copies of a record. Access, like other Office products, makes it very easy to print to a wide variety of purchased labels, as well as to other formats such as business cards, binder spine templates and so on. Sometimes you may wish to print multiple copies of a particular label or card.
From the preceding discussion, it should be obvious that should you want twenty copies of a given label, you could simply include a table with twenty rows of data in the query being used as a RecordSource for the report. However, do you really want to have to include tables with multiple rows to handle every eventuality? Well, it turns out that you don't really need dummy tables with a huge number of rows: one table, with ten rows (the digits 0 through 9) can be used in a Cartesian product query to give you virtually every possibility.
For instance, the sample database contains a query qryExample2a_NumericValuesFrom0To999 that makes three references to the Digits table in a Cartesian product query illustrated in Listing 5.
SELECT H.WhatDigit*100+T.WhatDigit*10+U.WhatDigit AS NumericValue, H.WhatDigit AS Hundreds, T.WhatDigit AS Tens, U.WhatDigit AS Units FROM Digits AS H, Digits AS T, Digits AS U ORDER BY 1;Listing 5: The SQL for qryExample2a_NumericValuesFrom0To999, which results in one thousand rows (the values 0 through 999). Note that it's not necessary to include the individual digit fields as I did.
Now, rather than having the RecordSource of the label report simply be the typical query against my UserAddress table, I can use qryExample2b_QueryForLabelReport, shown in Listing 6.
PARAMETERS [How many labels do you want? (max 1000)] Long; SELECT C.NumericValue, U.UserName, U.UserAddressLine1, U.UserAddressLine2, U.UserCity, U.UserStateOrProv, U.UserPostCode FROM UserAddress AS U, qryExample2a_NumericValuesFrom0To999 AS C WHERE C.NumericValue<[How many labels do you want? (max 1000)])) ORDER BY 1;Listing 6: The SQL for qryExample2b_QueryForLabelReport. Note that it prompts the user for the number of copies required.
Note that in the query, I'm explicitly declaring that the prompt [How many labels do you want? (max 1000)] is a Long Integer. Including Parameters Declarations is usually a good idea (and is actually required in certain situations, such as Crosstab queries)