Reporting what's not there

October 16, 2009

I tend to answer a lot of questions at the UtterAccess Discussion Forums (http://www.utteraccess.com) A recent question there got me thinking about the fact that it's easy to write queries that will show you data in the database that matches a criteria. However, if you want details of when there is no data in the database that matches the criteria, it becomes more difficult.

The specific question posed was something like:

How would i get this query to display a value of zero if there are no records for CABLE_POINT.CABLE_POINT_TYPE ="Break Out Splice Point" And CABLE_POINT.CAPACITY="12"

SELECT Count(CABLE_POINT.CABLE_POINT_TYPE) AS [No of 12f Loop Joint]

FROM CABLE_POINT

GROUP BY CABLE_POINT.PROJECT_ID, CABLE_POINT.CABLE_POINT_TYPE, CABLE_POINT.CAPACITY

HAVING (((CABLE_POINT.PROJECT_ID)=Forms!frmReqForm!txtProject_ID) And ((CABLE_POINT.CABLE_POINT_TYPE)="Break Out Splice Point") And ((CABLE_POINT.CAPACITY)="12"));

The query runs from a form where I have a combo box to select different projects according to their ID. Sometime a project might have a record of CABLE_POINT.CABLE_POINT_TYPE ="Break Out Splice Point" And CABLE_POINT.CAPACITY="12" present. Sometimes it won't. Therefore if there is no record present I want to the query to display "0"

Now, the problem with SQL Aggregate Function (such as Count, Avg, Min, Max or Sum) is that they only return values based on the number of rows selected that meet the given criteria. If no rows in the table are returned, then there's no value to be calculated. And while the DLookup Domain Aggregate function will return Null in that case (so that you can use the Nz function to return a value of zero), because nothing is returned by the query, there's nothing for you to grab to try and coerce into a value.

Query Basics

Before I go any further, I want to discuss a pet peeve of mine with Totals queries generated in Access. They're usually not generated correctly!

  SELECT fieldlist
       FROM table
	  WHERE selectcriteria
GROUP BY groupfieldlist 
[ HAVING groupcriteria]

Both WHERE and HAVING affect the details of the records displayed, but there is a fundamental difference between them. WHERE specifies which records are acted upon by the SELECT statement, whereas HAVING specifies which records are displayed. What does this mean when doing Totals queries (i.e. queries that use Aggregate functions)?

In the query posted to UtterAccess, the intent is to limit the results to only those rows that meet the three given criteria. Since we don't want to see any records which don't meet those criteria, they really should be in a WHERE clause. While the difference may be subtle, when you're dealing with large tables, if you can reduce the number of rows that need to be aggregated, you can speed the query up significantly. The HAVING clause should be limited to those cases where the display criteria depends on the results of grouping the records. In other words, whatever is specified in a HAVING clause should include an aggregate function: HAVING Sum(UnitsInStock) > 100, HAVING Count(*) <= 1 and so on.

As well, note that there's no reason to include fields in the GROUP BY clause unless those fields are included in the SELECT field list. In fact, having fields that don't appear in the list can actually cause your query to return unexpected results: additional rows will be returned to account for changes in the fields not displayed.

Finally, using Count(*) is considerably faster than Count([FieldName]). There is, of course, a difference between the two: the Count function doesn't count records that have a Null value in the field when you use a field name, whereas when you use the asterisk wildcard character, the Count function calculates the total number of records, including those that contain Null fields. Since in this particular case a desired value is being specified in the WHERE clause, the SELECT statement will never return a Null value for CABLE_POINT_TYPE (and, as I've already mentioned, the query isn't going to return a row with Null values if no match is found). Besides, we want a zero value, and if Count ignores Null values, that wouldn't give us what we wanted anyhow!

All that being said, let me now rewrite the original query more correctly, as shown in Listing 1.

SELECT COUNT(*) AS [No of 12f Loop Joint]
  FROM CABLE_POINT
 WHERE PROJECT_ID=Forms!frmReqForm!txtProject_ID
   AND CABLE_POINT_TYPE="Break Out Splice Point"
   AND CAPACITY="12"
Listing 1: SQL for a more efficient version of the original query

Now, I believe that the reason why the generated SQL is badly formed is due to how people create such queries. The usual approach is to select the table, drag the fields into the grid, change the query into a Totals query and put criteria under the appropriate fields. The problem is, when you convert a SELECT query to a Totals query, Access assumes a value of GROUP BY in the Total row under each field. Many people don't realize that you can change that GROUP BY to WHERE to get the improvement I discuss above. Figure 1 shows how to do this.

An illustration of how to generate a better Totals query in Access
Figure 1: An illustration of how to generate a better Totals query in Access.

Forcing data

For the particular problem as posed, it's relatively simple to cause a row to be generated whether or not there's data. Since we're dealing with totals, and adding zero to any number results in the same number, what happens if we force a row with a zero value into the mix, and then sum the resultant recordset?

We can create a UNION query (qryS2_UnionQuery, shown in Listing 2) that adds a row having a value of 0:

SELECT COUNT(*) AS CountRows
  FROM CABLE_POINT
 WHERE PROJECT_ID=Forms!frmGetTotals!cboProject
   AND CABLE_POINT_TYPE="Break Out Splice Point"
   AND CAPACITY="12"
UNION
SELECT DISTINCT 0
  FROM CABLE_POINT
Listing 2: SQL for qryS2_UnionQuery, a query that will ensure that the recordset returned will always have a least one row.

Figure 2 illustrates what this query will return if there is data that matches on the three criteria, while Figure 3 shows what will be returned when there is no matching data.

Resultant recordset when matching data exists in the table. Note that two rows are returned
Figure 2: Resultant recordset when matching data exists in the table. Note that two rows are returned.

Resultant recordset when matching data does not exist in the table. Note that only one row is returned.
Figure 3: Resultant recordset when matching data does not exist in the table. Note that only one row is returned.

Instead of the original query, we now can use:

SELECT Sum(CountRows) AS NumberOfJoints
FROM qryS2_UnionQuery
Listing 3: SQL for qryS2_SumBasedOnUnion, a query sums together the rows returned by qryS2_UnionQuery to return a single row.

Note that it's actually possible to do this as a single query:

SELECT Sum(SubSelect.CountRows) AS NumberOfJoints
  FROM
(
SELECT Count(*) AS CountRows
  FROM Cable_Point
 WHERE (Project_ID=Forms![frmGetTotals]![cboProject])
  AND (Cable_Point.Cable_Point_Type=Forms![frmGetTotals]![cboType])
  AND (Cable_Point.Capacity=Forms![frmGetTotals]![cboCapacity])
UNION
SELECT DISTINCT  0
  FROM Cable_Point) AS SubSelect
Listing 4: SQL for qryS2_SingleQuery, a query that combines qryS2_UnionQuery and qryS2_SumBasedOnUnion into a single query.

Incidentally, while what I've written above is syntactically correct, Access may change the bracketing on that to the following:

SELECT Sum(SubSelect.CountRows) AS NumberOfJoints
  FROM [SELECT Count(*) AS CountRows
  FROM Cable_Point
 WHERE (Project_ID=Forms![frmGetTotals]![cboProject])
   AND (Cable_Point.Cable_Point_Type=Forms![frmGetTotals]![cboType])
   AND (Cable_Point.Capacity=Forms![frmGetTotals]![cboCapacity])
UNION
SELECT DISTINCT 0
FROM Cable_Point]. AS SubSelect;
Listing 5: SQL for qryS2_SingleQuery, as Access will may render it.

Using Crosstab Queries

Crosstab queries calculate a value (sum, average, count, or other type of total) for data that is grouped by two types of information — one down the left side of the datasheet and another across the top. While a crosstab query will generate Null values if there is no value for a particular intersection of the left-side field and the top field, like regular Select queries, it will not return Null values if there are no values for the particular left side field for all top fields or vice versa. To illustrate what I mean, the sample database that accompanies this article has a simplistic example where a particular company has three different products (with the imaginative names Product A, Product B and Product C) that it sells to a total of five different customers. Figure 4 shows the Entity-Relationship Diagram for the tables required to represent the orders placed, while Figure 5 shows the limited data entered for this fictitious company as generated by the SQL presented in Listing 6.

Entity-Relationship Diagram for the Order Taking system
Figure 4: Entity-Relationship Diagram for the Order Taking system.

SELECT Customers.CustomerId, 
       OrderDetails.OrderId, 
       OrderDetails.ProductId, 
       Customers.CustomerNm,
       Products.ProductDs, 
       OrderDetails.QuantityAm, 
       Orders.OrderDt, 
       Format([OrderDt],"mmm, yyyy") AS OrderMonth
       Format([OrderDt],"yyyy-mm") AS SortMonth
  FROM Products 
INNER JOIN ((Customers 
INNER JOIN Orders 
   ON Customers.CustomerId = Orders.CustomerId) 
INNER JOIN OrderDetails 
   ON Orders.OrderId = OrderDetails.OrderId)
   ON Products.ProductId = OrderDetails.ProductId
Listing 6: SQL for qryS3_OrderDetails, a query that returns a denormalized view of all of the order details.

Resultant recordset from qryS3_OrderDetails, showing sample data
Figure 5: Resultant recordset from qryS3_OrderDetails, showing sample data.

To summarize the order details, we can create a Crosstab query based on qryS3_OrderDetails as shown in Listing 7, producing the recordset shown in Figure 6.

TRANSFORM Sum(QuantityAm) AS SumOfQuantityAm
   SELECT OrderMonth
     FROM qryS3_OrderDetails
 GROUP BY OrderMonth, SortMonth
 ORDER BY SortMonth
    PIVOT ProductDs
Listing 7: SQL for qryS3_OrderDetails_Crosstab, a query that summarizes the data returned by qryS3_OrderDetails

Resultant recordset from qryS3_OrderDetails_Crosstab, showing summarized data
Figure 6: Resultant recordset from qryS3_OrderDetails_Crosstab, showing summarized data.

Looking at Figure 6, you might notice that there are no orders listed for Product B, nor are there any orders listed for the month of March (nor any month after April...). Usually our users are going to want to see zeroes for all products not ordered in a given month.

We can create a query that returns one row for each month of the current year using SQL such as is shown in Listing 8, resulting in a recordset such as is shown in Figure 7.

SELECT DateSerial(Year(Date()),[MonthNumber],1) AS MonthStart,
DateSerial(Year(Date()),[MonthNumber]+1,0) AS MonthEnd,
Year(Date()) AS YearNumber,
Months.MonthNumber,
Format(DateSerial(Year(Date()),[MonthNumber],1),"yyyy-mm") AS SortMonth
FROM Months
ORDER BY Months.MonthNumber;
Listing 8: SQL for qryS3_MonthsThisYear, a query that returns one row for each month in the current year.

(The Month table was described in last month's article Uses for Cartesian Products in MS Access)

Resultant recordset from qryS3_MonthsThisYear.
Figure 7: Resultant recordset from qryS3_MonthsThisYear.

If we join qryS3_MonthsThisYear to the Products table, as shown in Listing 9, we can return one row for each product/month combination. (Look closely: this is another Cartesian product, as discussed last month!) Just to summarize that data, I created a crosstab query on qryS3_MonthsThisYear, shown in Figure 8.

  SELECT qryS3_MonthsThisYear.MonthStart,
         qryS3_MonthsThisYear.MonthEnd,
         Products.ProductDs,
         0 AS QuantityAm,
         qryS3_MonthsThisYear.SortMonth
    FROM Products, qryS3_MonthsThisYear
ORDER BY qryS3_MonthsThisYear.SortMonth, Products.ProductDs;
Listing 9: SQL for qryS3_AllProductMonthCombinations, a query that returns one row for each combination of months in the current year and products sold.

Resultant recordset from qryS3_AllProductMonthCombinations
Figure 8: Resultant recordset generated by a crosstab query based on query qryS3_AllProductMonthCombinations.

Now that we have a query that does have one row for all possible outcomes, it's fairly straightforward to join that table to our Order Details using a Left Join, as shown in Figure 9, which produces the SQL statement shown in Listing 10.

SQL for qryS3_AllProductMonthCombinations, a query that returns one row for each combination of months in the current year and products sold
Figure 9: Creating query qryS3_OrderDetails_AllProductMonthCombinations using a Left Join.

   SELECT Format(A.MonthStart,"mmm, yyyy") AS OrderMonth, 
          A.ProductDs, 
          CLng(Nz(B.[QuantityAm],0)) AS QuantityAm, 
          A.SortMonth
     FROM qryS3_AllProductMonthCombinations AS A 
LEFT JOIN qryS3_OrderDetails AS B 
       ON (A.SortMonth = B.SortMonth) 
      AND (A.ProductDs = B.ProductDs)
Listing 10: SQL for qryS3_OrderDetails_AllProductMonthCombinations, which will have at least one row for each combination of Month and Product.

Now, creating a crosstab based on qryS3_OrderDetails_AllProductMonthCombinations, such as shown in Listing 11, will result in having zeroes for any product not ordered in a particular month, as shown in Figure 10.

TRANSFORM Sum(QuantityAm) AS SumOfQuantityAm
SELECT OrderMonth
FROM qryS3_OrderDetails_AllProductMonthCombinations
GROUP BY OrderMonth, SortMonth
ORDER BY SortMonth
PIVOT ProductDs
Listing 11: SQL for qryS3_OrderDetails_AllProductMonthCombinations_Crosstab, which will include zeroes for all combinations not ordered.

 Resultant recordset from qryS3_OrderDetails_AllProductMonthCombinations_Crosstab
Figure 10: Resultant recordset from qryS3_OrderDetails_AllProductMonthCombinations_Crosstab

Conclusion

In the preceding discussion, I've attempted to show two different scenarios where it's necessary to create data in order to be able to report zero values in queries. Now, I'll acknowledge that the queries involved may be slower than "straight" queries against the data that does exist, but customers often have very specific ideas about what they want to see. Note that since I used Union queries and Cartesian products in order to produce the data, and Union queries and Cartesian products both result in recordsets that are read-only, these techniques are really only appropriate for Reports .

» See All Articles by Columnist Doug Steele








The Network for Technology Professionals

Search:

About Internet.com

Legal Notices, Licensing, Permissions, Privacy Policy.
Advertise | Newsletters | E-mail Offers