MDX in Analysis Services: Introducing DISTINCT COUNT - Page 2
April 26, 2004
"The Need for Distinction"
As anyone in the realm of business intelligence and general analysis has probably come to realize, we often encounter the need to quantify precisely the members of various sets of data. Those of us who have become familiar with MSAS are aware of its capabilities when it comes to categorizing and aggregating data within the hierarchical contexts of dimensions and levels. We can, for the most part, readily tap these capabilities from the user interface that MSAS provides. Through the exploitation of more advanced approaches, including the use of calculated members / measures and MDX (multidimensional expressions), we can extend our analysis even further, and leverage MSAS to reach far more specific objectives.
NOTE: For more information on calculated members, see Calculated Members: Introduction, Calculated Members: Further Considerations and Perspectives, and Calculated Members: Leveraging Member Properties among numerous general references in the Database Journal MDX Essentials series.
One of the basic requirements that come into play, at least in some form, in virtually any analysis scenario, is the need to count the members of a set targeted for analysis. An example might be the need to count the number of products we shipped from a given warehouse, or group of warehouses, to a geographical location or group of stores. This can be accomplished readily enough with the Count() function (see Basic Numeric Functions: The Count() Function for details about using the MDX Count() function).
As many of us know, Count() does a fine job of giving us a total count. This would mean that the results we might achieve in using Count() with products, in the scenarios above, would represent total number of products shipped. What we would not get, and what we might find far more useful in some situations, would be a count of the different products that were shipped. Count(), in providing a total number, would also be providing multiple counts of the same products, because products will have been shipped multiple times, in many instances. To reach our objective of counting different products, then, we would need to count each different product shipped only once. To count them multiple times not only misstates the number of different products, but it also likely renders averages, and other metrics based upon the count value, meaningless or misleading.
The word "different" here is easily supplanted by "distinct." Moreover, as many of us know, performing distinct counts has historically presented a challenge in the OLAP world. Let's discuss an example that illustrates the challenge, and then convert that challenge to an opportunity to meet a business need using the distinct count capabilities found within MSAS.
Handling Distinct Counts via the MSAS User Interface
Let's take a look at a scenario that illustrates a need for a distinct count, using a hypothetical business need to add practical value. Let's say that a group of information consumers within the FoodMart organization have approached us with a need that they wish to meet within the Warehouse cube. The consumers want to be able to report upon number of products within various metrics without having to be concerned with an issue they faced with a previous system - a scenario of "double counting" in many inventory reports that concerned product-related transactions between warehouses and stores.
We might initially attempt to meet the needs of the consumers with somewhat advanced MDX, but let's try to minimize complication, while heading off many of the issues, with a straightforward approach from within the Cube Editor component of the MSAS user interface, Analysis Manager, first. This provides all that we need, in many cases. (We will examine an MDX approach in the next section of this article).
Let's start Analysis Services and proceed with the following steps:
1. Open Analysis Manager.
2. Expand the Analysis Servers folder by clicking the "+" sign to its immediate left.
Our server(s) appear (my server, MOTHER1, is depicted in some of the illustrations).
3. Expand the desired server.
Our database(s) appear, in much the same manner as shown in Illustration 1.
4. Expand the FoodMart2000 database.
5. Expand the Cubes folder.
The sample cubes appear, as shown in Illustration 2.
NOTE: Your databases / cube tree may differ, depending upon the activities you have performed since the installation of MSAS (and the simultaneous creation of the original set of sample cubes). Should you want or need to restore the cubes to their original state, simply restore the database under consideration. For instructions, see the MSSQL Server 2000 Books Online.
6. Right-click on the Warehouse sample cube.
7. Select Edit from the context menu that appears, as shown in Illustration 3.
The Cube Editor opens. The Schema tab appears as depicted in Illustration 4.
We will be creating a measure in the Cube Editor to enable us to make our distinct Product counts. Distinct Count can only exist as a measure.
8. Right-click the Measures folder in the Tree View to the left of the Schema tab.
A single-line context menu appears, as shown in Illustration 5.
9. Select New Measure from the context menu.
The Insert Measure dialog appears.
10. Click-select product_id.
The Insert Measure dialog, selected measure circled in red, appears in Illustration 6.
11. Click OK to accept the selection.
The Insert Measure dialog closes, and we see the new measure appear (default name of Product_Id) in the Measures folder, as depicted in Illustration 7.