Distinct Counts in Analysis Services 2005

About the Series …

This article is a member of the series Introduction to MSSQL Server Analysis
Services. The series is designed to provide hands-on application of the
fundamentals of MS SQL Server Analysis Services (“Analysis Services”), with
each installment progressively presenting features and techniques designed to
meet specific real-world needs. For more information on the series, please see
my initial article, Creating Our First Cube. For the software components,
samples and tools needed to complete the hands-on portions of this article, see
Usage-Based Optimization in Analysis Services 2005, another article within this
series.

Introduction

In a couple of earlier articles of this series, Introduction to MSSQL Server
2000 Analysis Services: Distinct Count Basics: Two Perspectives and
Introduction to MSSQL Server 2000 Analysis Services: Manage Distinct Count with
a Virtual Cube, I introduced the general concept of distinct counts, discussing
why they are useful (and often required) within the design of any robust
analysis effort. In these and other articles, I described some of the
challenges that were inherent in their use in Analysis Services 2000, before
undertaking practice exercises to illustrate solutions to meet example business
requirements.

We have revisited distinct counts at points in other articles within both my
Introduction to MSSQL Server Analysis Services and MDX Essentials series,
examining specifics with regard to appropriate use, and details of optimization
within the perspective under examination in the article concerned. In this
article, we will introduce distinct counts as they are managed in Analysis
Services 2005. The redesign of the capability, based upon the hierarchy and
attribute structure that debuts in Analysis Services 2005, results in much more
impressive performance and flexibility of deployment within our integrated
business intelligence solutions, as many have come to report recently in blogs,
forums, and other media outlets.

In this article, we will gain some hands-on exposure to distinct counts in
Analysis Services 2005. Our examination of the expanded capability will include:

  • A discussion surrounding the general concepts of distinct counts, including
    why they are useful (and often required) within the design of any robust
    analysis effort.
  • An examination of some of the challenges inherent with using distinct
    counts in Analysis Services 2000, and how distinct counts have been
    redesigned in Analysis Services 2005 to overcome some of these shortcomings
    of the previous version.
  • Creation of a distinct count measure within a sample cube to demonstrate
    the ease with which we can add distinct count capabilities to the cubes in
    our individual business environments.
  • A discussion of other considerations that surround the use of distinct
    counts.

Distinct Counts in Analysis Services 2005

Anyone working within the realm of business intelligence and general analysis
realizes, in short order, that we often encounter the need to quantify
precisely the members of various sets of data. Those of us who have become
familiar with Analysis Services are aware of its capabilities when it comes to
categorizing and aggregating data within the hierarchical contexts of
dimensions and attributes. We can, for the most part, readily tap these
capabilities from the user interface that Analysis Services provides. Through
the exploitation of more advanced approaches, including the use of calculated
members / measures, and multidimensional expressions (“MDX”) in general, we can
extend our analysis even further, and leverage Analysis Services to reach far
more specific objectives.

One of the basic
requirements that comes into play, at least in some form, in many analysis
scenarios, is the need to count the members of a set targeted for
analysis. An example might be the need to count the number of products we have
shipped from a given warehouse, or group of warehouses, to a given geographical
location, or to a specific group of stores. This can be accomplished readily
enough with the Count() function, as most of us are aware.

Count() does a great job of giving us a total count. Of course, the results we
would achieve in using Count() with products, in the scenarios above, would
represent total number of products shipped. What we would not get, and what we
might find far more useful in some situations, would be a count of the
different products that were shipped. Count(), in providing a total number,
would also be providing multiple counts of the same products, because products
will have been shipped multiple times, in many instances. To reach our
objective of counting different products, then, we would need to count each
different product shipped, only once. To count them multiple times not only
misstates the number of different products, but it also likely renders
averages, and other metrics based upon the count value, meaningless or
misleading.
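
The difference between a total count and a distinct count can be sketched
outside of Analysis Services. The following is a minimal illustration in
Python, not MDX, and the shipment rows, warehouse names, and helper functions
are all hypothetical, invented only to make the arithmetic visible:

```python
# Hypothetical shipment records: (warehouse, product, units shipped).
# These rows are invented for illustration and come from no sample database.
shipments = [
    ("Seattle", "Widget", 10),
    ("Seattle", "Widget", 5),
    ("Seattle", "Gadget", 8),
    ("Portland", "Widget", 12),
    ("Portland", "Sprocket", 4),
]


def total_count(rows):
    """Count every shipment row, repeated products included."""
    return len(rows)


def distinct_product_count(rows):
    """Count each product once, however many times it was shipped."""
    return len({product for _, product, _ in rows})


total = total_count(shipments)
distinct = distinct_product_count(shipments)
units = sum(quantity for _, _, quantity in shipments)

# An average built on the wrong count answers a different question.
avg_units_per_shipment = units / total
avg_units_per_product = units / distinct

print(f"Total shipments:        {total}")
print(f"Distinct products:      {distinct}")
print(f"Avg units per shipment: {avg_units_per_shipment}")
print(f"Avg units per product:  {avg_units_per_product}")
```

With these five rows, the total count is 5 while the distinct product count is
3, and the average units figure changes from 7.8 to 13.0 depending upon which
count serves as the denominator.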

The word “different” here
is easily supplanted by “distinct.” Moreover, as many of us are aware, the
performance of distinct counts has historically presented a challenge in
the OLAP world. Let’s introduce a simple example that illustrates the
challenge, and then transform that challenge to an opportunity to meet an
illustrative business need, using the newly expanded distinct count
capabilities found within Analysis Services 2005.

William Pearson
Bill has been working with computers since before becoming a "big eight" CPA, after which he carried his growing information systems knowledge into management accounting, internal auditing, and various capacities of controllership. Bill entered the world of databases and financial systems when he became a consultant for CODA-Financials, a U.K.-based software company that hired only CPAs as application consultants to implement and maintain its integrated financial database - one of the most conceptually powerful, even in his current assessment, to have emerged. At CODA Bill deployed financial databases and business intelligence systems for many global clients. Working with SQL Server, Oracle, Sybase and Informix, and focusing on MSSQL Server, Bill created Island Technologies Inc. in 1997, and has developed a large and diverse customer base over the years since. Bill's background as a CPA, Internal Auditor and Management Accountant enables him to provide value to clients as a liaison between Accounting / Finance and Information Services. Moreover, as a Certified Information Technology Professional (CITP) - a Certified Public Accountant recognized for his or her unique ability to provide business insight by leveraging knowledge of information relationships and supporting technologies - Bill offers his clients the CPA's perspective and ability to understand the complicated business implications and risks associated with technology. From this perspective, he helps them to effectively manage information while ensuring the data's reliability, security, accessibility and relevance. Bill has implemented enterprise business intelligence systems over the years for many Fortune 500 companies, focusing his practice (since the advent of MSSQL Server 2000) upon the integrated Microsoft business intelligence solution.
He leverages his years of experience with other enterprise OLAP and reporting applications (Cognos, Business Objects, Crystal, and others) in regular conversions of these once-dominant applications to the Microsoft BI stack. Bill believes it is easier to teach technical skills to people with non-technical training than vice-versa, and he constantly seeks ways to graft new technology into the Accounting and Finance arenas. Bill was awarded Microsoft SQL Server MVP in 2009. Hobbies include advanced literature studies and occasional lectures, with recent concentration upon the works of William Faulkner, Henry James, Marcel Proust, James Joyce, Honoré de Balzac, and Charles Dickens. Other long-time interests have included the exploration of generative music sourced from database architecture.
