Example 3: Creating a calendar
I really agonized about including this particular example,
because it's going to be difficult to explain and the intent of this article is
the use of Cartesian products, not how to produce a calendar. However, this is
one of my most common uses of Cartesian products, so in the end I decided I had
no choice but to include it!
I often had the need to produce reports that look like a
calendar: one box for each day within a given period, with details of what's
happening that day inside the box. Producing the report itself isn't
particularly difficult: the easiest way is to have a report that represents the
details for a single day, and use that report seven times as a subreport (once
for each day of the week), and link the subreport to the parent report on the
specific date. The tricky part is to produce the appropriate RecordSet for the
report being used as the parent!
Before I get too carried away describing how to produce that
particular query, let's talk about the other information that's required for
the calendar report. Let's discuss a family planning calendar: the one that
hangs on the wall in the kitchen where you write all your upcoming events.
While I may be over-simplifying things, I find that the majority of events,
which need to be reported fall into one of two categories:
-
Those that are on the same day each year (such as birthdays and
anniversaries)
-
Those which vary from year to year
(Although it might not be obvious, I include statutory
holidays in the list of activities whose dates may vary from year to year.
Easter, for example, falls on a different day each year. Even holidays that
have fixed dates (like Christmas or New Year's Day) can have implications that
vary from year to year. For instance, when January 1st is a Saturday or Sunday,
I get the Monday after as the holiday off.)
Bottom line, I include two tables, with the remarkably
creative names of "Anniversaries" and "Holidays".
The Anniversaries table, which we saw in Example 1, consists of three fields: MonthNumber,
DayNumber and Description. Because the values of MonthNumber and DayNumber will
never exceed 31, I chose to make them Byte fields. Description is intended to
contain a brief description of the event suitable to appear on the calendar
("Grandma's birthday", "Mom & Dad's anniversary"), so I
set it as a 50-character Text field. Moreover, because it's possible for there
to be two or more events on the same day, I didn't set a Primary Key for the
table. The purists among you might want to add an Autonumber field that you can
use as a Primary Key.
The Holidays table only requires two fields: HolidayDate (a
Date/Time field, because it contains a complete date), and Description (again,
a 50 character Text field). Again, I saw no need for a Primary Key for this
table.
In addition to the events handled by the two tables above,
another category of events are those that occur on the same day for a finite
number of weeks during the year. These are the activities such as the class
every Tuesday night from 7:00 to9:00, the Wednesday night soccer game and the
Monday night Brownie meeting. These events are handled in a table named RecurringEvents,
which consists of four fields: DayOfWeek, StartDate, EndDate and Description.
The DayOfWeek field will indicate the day of the week on which the event
occurs: 1 for Sunday, 2 for Monday and so on up to 7 for Saturday. Because this
field is only going to hold values between 1 and 7, I made this a Byte field
also. The StartDate and EndDate Date/Time fields are the date of the first
occurrence of the event and the date of the last occurrence of the event. As
before, the Description field is a 50-character Text field.
Okay, now how do we generate a recordset that has one entry
for each day of the year? One approach, of course, is to create a table with
one row for each day. However, you can easily use a Cartesian product query to
produce this for you, using far smaller tables. Specifically, create three
tables: one named Days, one named Months, and one named Years. (Note the
plural: this is necessary since Day, Month and Year are all reserved words in
Access!)
The Days table consists of a single Byte field named DayNumber.
The table itself has 31 rows in it, containing the number 1 through 31. The
Months table consists of two fields: a Byte field named MonthNumber, and a Text
field named MonthNumber (although to be perfectly honest, there's no real need
for this text field), and contains 12 rows: one for each month. The Years table
consists of a single Byte field named YearNumber and contains as many rows as
you think is necessary, one for each year. (In the sample database that
accompanies this article, I included five rows, for the years 2008, 2009, 2010,
2011 and 2012). Figure 8 shows the details of these three tables.
Figure
8: Details of the Years, Months and Days tables.
Listing 7 shows how you'd use those three tables in a query
that returns one row for each day. Of course, since the Days table has 31 rows
in it, the Months table has 12 rows in it and the Years table has 5 rows in it,
the Cartesian product would return 1860 rows (31*12*5), whereas there are
actually only 1827 days between 1 January, 2008 and 31 December, 2012 (the
range represented by the three tables). Unfortunately, the DateSerial function
is a little too helpful: it will cheerfully accept numbers that aren't valid
combinations of month and day, which could end up causing duplicate rows. (For
example, DateSerial(2009, 11, 31) returns 1 December, 2009, the same as DateSerial(2009, 12, 1)) To prevent that, it's necessary to include a Where
clause to eliminate potential duplicates. Unlike the accommodating DateSerial
function, the IsDate function is more practical: if you pass it an invalid
date, it tells you so! To avoid any ambiguity between mm/dd/yyyy and dd/mm/yyyy
formats (is 02/10/2009 February 10 or October 2?), I use the yyyy-mm-dd format
in the call to the IsDate function.
SELECT DateSerial([YearNumber],[MonthNumber],[DayNumber]) AS WhatDate,
YearNumber,
MonthNumber,
DayNumber
FROM Days, Months, Years
WHERE IsDate([YearNumber] & "-" & [MonthNumber] & "-" & [DayNumber]
ORDER BY 1;
Listing 7: The SQL for qryExample3a_DaysOfTheYear. The
WHERE condition using the IsDate function eliminates invalid dates such as
February 30 that the DateSerial function would actually convert to duplicate
entries.
We're now ready to return to the recordset required for the
Calendar report! Figure 9 shows what this recordset needs to look like. Note
that not only do we need to show the seven days of the week on each row, but
when the month changes in the middle of the week, we need to use two separate
rows for the week: one to go on the previous month's calendar page and one to
go on the current month's calendar page.

Figure
9: The recordset required for the Calendar report.
Listing 8 shows the SQL for qryExample3b_Calendar, the query
that produces the recordset in Figure 9.
SELECT WhatDate AS SundayDate,
WhatDate + 1 AS MondayDate,
WhatDate + 2 AS TuesdayDate,
WhatDate + 3 AS WednesdayDate,
WhatDate + 4 AS ThursdayDate,
WhatDate + 5 AS FridayDate,
WhatDate + 6 AS SaturdayDate,
YearNumber AS SortYear,
MonthNumber AS SortMonth
FROM qryExample3a_DaysOfTheYear
WHERE Weekday([WhatDate])=1
UNION
SELECT WhatDate AS SundayDate,
WhatDate + 1 AS MondayDate,
WhatDate + 2 AS TuesdayDate,
WhatDate + 3 AS WednesdayDate,
WhatDate + 4 AS ThursdayDate,
WhatDate + 5 AS FridayDate,
WhatDate + 6 AS SaturdayDate,
YearNumber AS SortYear,
Month([WhatDate] +6 ) AS SortMonth
FROM qryExample3a_DaysOfTheYear
WHERE Weekday([WhatDate])=1
AND Month([WhatDate]) <> Month([WhatDate]+6)
AND Year([WhatDate] + 6) = YearNumber
UNION
SELECT Sunday(WhatDate) AS SundayDate,
Sunday(WhatDate) + 1 AS MondayDate,
Sunday(WhatDate) + 2 AS TuesdayDate,
Sunday(WhatDate) + 3 AS WednesdayDate,
Sunday(WhatDate) + 4 AS ThursdayDate,
Sunday(WhatDate) + 5 AS FridayDate,
Sunday(WhatDate) + 6 AS SaturdayDate,
Year(WhatDate) AS SortYear,
1 AS SortMonth
FROM qryExample3a_DaysOfTheYear
WHERE Month([WhatDate]) = 1
AND Day([WhatDate]) =1
Listing 8: The SQL to take the data from qryExample3a_DaysOfTheYear
and display each week as a single row.
Producing a query that returns the seven days of the week as
a single row isn't that difficult. By using a WHERE clause of Weekday([WhatDate])
= 1, you can return one row for each Sunday date. By adding the appropriate
number of days to that date, you can get the dates for Monday, Tuesday, and so
on. In order to be able to sort the data correctly, you also need to add a SortYear
and SortMonth, which will be the Year and Month of the SundayDate value. This
is what the first subquery in the Union query shown in Listing 8 does.
To handle the case where a week includes dates in two
different months, the second subquery in the Union query shown in Listing 8 calculates
SortMonth by determining the month for the Saturday of each week and only
returns those rows where Sunday and Saturday are in different months. The
query's third condition (Year([WhatDate] + 6) = YearNumber) eliminates the
second instance of the week if the year ends during the week. In other words, I
don't need a duplication of the week where Saturday is in the next year.
The third subquery in the Union query shown in Listing 8
produces the row corresponding to the first days of the year for those years
that don't start on Sunday. For example, in 2008, the first Sunday isn't until
January 6th, so the results of running the two subqueries already discussed
will not include the first five days of the year. I had to turn to some VBA
processing to handle this case.
Listing 9 shows a function that returns the Sunday before a
given date:
Function Sunday(WhatDate As Date) As Date
Sunday = WhatDate - WeekDay(WhatDate) + 1
End Function
Listing 9: For a given date, the Sunday function
returns the date of the previous Sunday.
This code takes advantage of the Weekday function that
returns 1 for Sunday, 2 for Monday and so on, until 7 for Saturday. Subtracting
the weekday number for a date from the date itself returns the date for the
preceding Saturday (e.g., because January 8, 2008 was a Tuesday, its weekday
number is 3, and subtracting that gives January 5, 2008 -- a Saturday). Adding 1
to that result gives you the date of the Sunday preceding the original date.
The third subquery uses the Sunday function to provide the
missing week (where necessary) by using the function to determine the Sunday
before January 1st of the given year, and then calculate all the other days of
that week.
Because the UNION operator automatically eliminates duplicate
rows, the query will eliminate rows where Sunday and Saturday are both in the
same month and where the week returned by the third subquery is also returned
by the first subquery.
It's almost anticlimactic discussing the last query required
for the Calendar report, the one that will be the RecordSource for the subreport
that shows the details for a given day!
SELECT DateSerial([YearNumber],[MonthNumber],[DayNumber]) AS WhatDate,
Anniversaries.Description
FROM Anniversaries, Years
UNION ALL
SELECT DISTINCT HolidayDate,
Description
FROM Holidays
UNION ALL
SELECT DISTINCT qryExample3a_DaysOfTheYear.WhatDate,
RecurringEvents.Description
FROM RecurringEvents INNER JOIN qryExample3a_DaysOfTheYear
ON Weekday(qryExample3a_DaysOfTheYear.WhatDate) = RecurringEvents.DayOfWeek
WHERE qryExample3a_DaysOfTheYear.WhatDate BETWEEN RecurringEvents.StartDate AND RecurringEvents.EndDate;
Listing 10: The SQL for query qryExample3c_ToDisplay,
which produces the details for a given day to be displayed in the subreport.
The first subquery in the Union query produces one entry for
each row in the Anniversaries table for each year in the Years table.
The second subquery adds one row for each row in the Holidays
table. Since my Holiday table actually contains holidays for a number of
different countries, I used SELECT DISTINCT to eliminate duplicates (i.e. dates
that are the same holiday in multiple countries).
Finally, the third subquery adds one row for each row in the RecurringEvents
table for each actual occurrence. It does this by joining the RecurringEvents
table to the results of qryExample3a_DaysOfTheYear using what I refer to as a
non-equijoin. In other words, it's an INNER JOIN, but the join condition
specified in the ON clause is not a simple equality statement, as it usually
is.
I know that was a long-winded explanation. Hopefully when you
look at the sample database, it'll all make sense to you!
Example 4: Creating a long running query
Believe it or not, I actually had a requirement recently to
create a query that was guaranteed to run for a long period of time!
In my current job, we're busy preparing to roll Windows 7
out. As you're all probably aware, one of the current "hot topics" is
Green Computing: reducing the power requirements, while not reducing the
computing abilities. Because of that, we're taking the Power Management
settings fairly seriously.
One concern that was raised to our team was what happens if
the user starts a long running query then goes to lunch. There was worry that
the machine would go into hibernation before the query completed, thereby
wasting all the work already done. (There was also concern that it could cause
database corruption)
To test this theory out, I constructed a Cartesian product
similar to the query qryExample2a_NumericValuesFrom0To999 we've already
discussed, only this time I allowed it to go to 9999.
SELECT U.WhatDigit+10*T.WhatDigit+100*H.WhatDigit+1000*Th.WhatDigit AS NumericValue
FROM Digits AS U, Digits AS T, Digits AS H, Digits AS Th;
Listing 11: The SQL for query qryExample4a_FirstPart,
which produces 10,000 rows.
I then created a second query, which joined the query above
to itself.
SELECT Count(*) AS TotalRows
FROM qryExample4a_FirstPart AS T1
INNER JOIN qryExample4a_FirstPart AS T2
ON T1.NumericValue = T2.NumericValue;
Listing 12: The SQL for query qryExample4b_SecondPart,
which joins qryExample4a_FirstPart to itself.
Since there are no indexes involved, joining a ten thousand
row table to itself is a fairly time-consuming exercise. (I find it takes about
11 to 12 minutes on my desktop). By setting the Hibernation time to a smaller
value (like 5 minutes), we were able to see the effects of the machine trying
to hibernate on the query.
(For those of you who care, the machine doesn't hibernate
while the query is running, but does hibernate five minutes after the query has
completed. However, we're running into some anomalies that appear to be related
to either the settings on our network switches or to the firewall product we're
using, so I can't really give you a definitive answer on what your Power
Management settings should be!)
Conclusion
I'll be the first to admit that some of the examples
presented above were a little frivolous. However, the fact of the matter is
that there can be legitimate uses for Cartesian products, and hopefully the
samples I provided will give you ideas for those times where they're
appropriate.
»
See All Articles by Columnist Doug Steele