Attribute Discretization: Using the Automatic Method

This article continues the overview of Attribute Discretization in Analysis
Services begun in Introduction to Attribute Discretization. Both this article
and its predecessor extend the
examination of the
dimensional model that we began in Dimensional Model Components: Dimensions Parts I and II. After taking
up various additional components of the dimensional model in subsequent
articles, we performed hands-on exploration of the general characteristics and
purposes of attributes in Dimensional Attributes: Introduction and Overview Parts I through V. We then fixed our focus upon
the properties underlying attributes, extending our overview into attribute
member Keys, Names and Values within several subsequent articles.

This article continues the focus upon attribute discretization, a capability in
Analysis Services that allows us to group members of an attribute
into a number of member groups. Our aim here is to gain
some exposure to the pre-defined “Automatic” discretization method, one of
three such pre-defined methods supported by Analysis Services, through hands-on
application of the method to a representative dimension attribute within our
sample UDM.

Note: For more information about my Introduction to
MSSQL Server Analysis Services column in general, see the section
entitled “About the MSSQL Server Analysis Services Series” that follows the
conclusion of this article.

Introduction

In Introduction to Attribute Discretization, I
summarized the articles preceding it within the current subseries, surrounding a
general introduction to the dimensional model. I noted the wide acceptance
of the dimensional model as the preferred structure for presenting quantitative
and other organizational data to information consumers. The articles of the
series then undertook an examination of dimensions, the analytical
“perspectives” upon which the dimensional model relies in meeting the primary
objectives of business intelligence, including its capacity to support:

  • the presentation of relevant and accurate information representing business
    operations and events;
  • the rapid and accurate return of query results;
  • “slice and dice” query creation and modification;
  • an environment wherein information consumers can pose questions quickly and
    easily, and achieve rapid results datasets.

We extended our examination of dimensions into a couple of detailed
articles. These articles, Dimensional Model Components: Dimensions Parts I and
II, emphasized that dimensions,
which represent the perspectives of a business or other operation, and
reflect the intuitive ways that information consumers need to query and view
data, form the foundation of the dimensional model. We noted that each dimension
within our model contains one or more hierarchies. (As we learn in other
articles of this series, two types of hierarchies exist within Analysis
Services: attribute hierarchies and user – sometimes called “multi-level” – hierarchies.)

We next introduced dimension attributes within the subseries, and conducted an
extensive overview of their nature, properties, and detailed settings in
Dimensional Attributes: Introduction and Overview Parts I through V. We noted
that attributes help us to define with specificity
what dimensions cannot define by themselves. Moreover, we learned that attributes
are collected within a database dimension, where we can access them to help us to
specify the coordinates required to define cube space.

Throughout the current subseries, I have emphasized that dimensions and
dimension attributes
should support the way that management and information consumers of a given
organization describe the events and results of the business operations of the
entity. Because we maintain dimension and related attribute information within
the database underlying our Analysis Services implementation, we can support
business intelligence for our clients and employers even when these details are
not captured within the system where transaction processing takes place.
Within the analysis and reporting capabilities we supply in this manner, dimensions
and attributes are useful for aggregation, filtering, labeling, and other
purposes.

Having covered the general characteristics and purposes of attributes
in Dimensional Attributes: Introduction and Overview Parts I through V,
we fixed our focus upon the properties underlying them, based upon the
examination of representative attributes within our sample cube. We then
continued our extended examination of attributes to yet another important
component we had touched upon earlier, the attribute member Key, with which we
gained some hands-on exposure in practice sessions that followed our coverage
of the concepts. In Attribute Member Keys – Pt I: Introduction and Simple
Keys and Attribute Member Keys – Pt II: Composite Keys, we explored the
concepts of simple and composite keys, narrowing our examination in Part I
to the former, where we reviewed the Properties associated with a simple key,
based upon the examination of a representative dimension attribute within our
sample UDM. In Part II, we revisited the differences between simple and
composite keys, and explained in more detail why composite keys are sometimes
required to uniquely identify attribute members. We then reviewed the properties
associated with a composite key, based upon the examination of another representative
dimension attribute within our sample UDM.

In Attribute Member Names, we examined the attribute member Name property,
which we had briefly introduced in Dimensional Attributes: Introduction and
Overview Part V. We shed some light on
how attribute member Name might most appropriately be used without degrading
system performance or creating other unexpected or undesirable results. We then
examined the “sister” attribute member Value property (which we introduced
along with attribute member Name in Dimensional Attributes: Introduction and
Overview Part V) in Attribute Member Values in Analysis Services. As we did in
our overview of attribute member Name, we examined the details of Value. Our
concentration was, similarly, upon its appropriate use in providing support for
the selection and delivery of enterprise data in a more focused and
consumer-friendly manner, without the degraded system performance and other
unexpected or undesirable results that can accompany the uninformed use of the
property.

In Introduction to Attribute Relationships in MSSQL Server Analysis Services,
we examined yet another part of the conceptual model, Attribute
Relationships. In this introduction, we discussed several best practices, along
with design and other considerations involved in their use, with a focus upon the
general exploitation of attribute relationships in providing support, once
again, for the selection and delivery of enterprise data. In the subsequent
two related articles, Attribute Relationships: Settings and Properties and More
Exposure to Settings and Properties in Analysis Services Attribute
Relationships, we examined attribute relationships in a manner
similar to previous articles within this subseries, concentrating in detail
upon the properties that underlie them.

Finally, in Introduction to Attribute Discretization, we introduced a
capability in Analysis Services – to which we refer as attribute
discretization – that allows us to group members of an attribute into a number
of member groups. We discussed design and other considerations involved in the
discretization of attributes, and touched upon best practices surrounding
the use of this capability. Our focus was upon the general exploitation of
discretization in providing support for the selection and delivery of
enterprise data.
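
To make the idea of member groups more concrete, the following is a minimal
Python sketch of one simple way to form such groups: sorting numeric members
and splitting them into buckets of near-equal size. It is an illustration
only. The function name, the sample income values, and the bucket count are
all assumptions introduced here, and the code does not reproduce the logic
that Analysis Services applies internally.

```python
from typing import List, Tuple


def equal_frequency_groups(
    values: List[float], bucket_count: int
) -> List[Tuple[float, float, int]]:
    """Split values into up to bucket_count groups of near-equal size.

    Returns one (lowest_member, highest_member, member_count) tuple per group.
    Hypothetical helper for illustration; not an Analysis Services API.
    """
    if bucket_count < 1:
        raise ValueError("bucket_count must be at least 1")

    ordered = sorted(values)
    n = len(ordered)
    groups = []
    for i in range(bucket_count):
        # Integer arithmetic spreads any remainder across the buckets.
        start = i * n // bucket_count
        end = (i + 1) * n // bucket_count
        if start == end:
            continue  # fewer members than buckets: skip empty groups
        chunk = ordered[start:end]
        groups.append((chunk[0], chunk[-1], len(chunk)))
    return groups


# Hypothetical attribute members (for example, yearly income values).
incomes = [10000, 20000, 30000, 40000, 50000, 60000,
           70000, 80000, 90000, 100000, 110000, 120000]

for low, high, count in equal_frequency_groups(incomes, 4):
    print(f"{low} - {high}: {count} members")
```

With the twelve sample values and four buckets, each group holds three
members, so an information consumer would browse four ranges rather than
twelve individual values. This is the kind of consolidation that
discretization is intended to provide.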

In this article, we will gain some hands-on exposure to
setting up one of the multiple pre-defined discretization processes supported
within the Analysis Services UDM. We will first discuss the options that are
available, and then work with Automatic discretization
in the sample cube. (In individual articles designed specifically for the
purpose, we will examine the setup of other discretization options, in a manner
similar to previous articles within this subseries, gaining hands-on exposure to
the use of those options in individual practice scenarios.)

Our examination will include:

  • A brief review of attribute discretization in Analysis Services, potential
    benefits that accrue from discretization in our UDMs, and how the process
    can help us to meet the primary objectives of business intelligence.
  • A brief overview of the multiple pre-defined discretization processes
    supported within the Analysis Services UDM.
  • Examination, via the browser in the Dimension Designer, of the pre-existing
    members of a select attribute hierarchy, noting the absence of grouping and
    discussing shortcomings of this default arrangement.
  • Enablement of the Automatic discretization method within the dimension
    attribute Properties pane.
  • Reprocessing the cube to enact the new Automatic discretization of the
    select attribute members.
  • Another examination, via the browsers in both the Dimension Designer and
    the Cube Designer, of the members of a select attribute hierarchy, noting
    the new, more intuitive grouping established by the newly enacted Automatic
    discretization method.
  • Forward-looking references to subsequent articles within our series, where
    we will perform detailed examinations surrounding other discretization
    methods supported within the Analysis Services UDM.
William Pearson
Bill has been working with computers since before becoming a "big eight" CPA, after which he carried his growing information systems knowledge into management accounting, internal auditing, and various capacities of controllership. Bill entered the world of databases and financial systems when he became a consultant for CODA-Financials, a U.K. - based software company that hired only CPA's as application consultants to implement and maintain its integrated financial database - one of the most conceptually powerful, even in his current assessment, to have emerged. At CODA Bill deployed financial databases and business intelligence systems for many global clients. Working with SQL Server, Oracle, Sybase and Informix, and focusing on MSSQL Server, Bill created Island Technologies Inc. in 1997, and has developed a large and diverse customer base over the years since. Bill's background as a CPA, Internal Auditor and Management Accountant enable him to provide value to clients as a liaison between Accounting / Finance and Information Services. Moreover, as a Certified Information Technology Professional (CITP) - a Certified Public Accountant recognized for his or her unique ability to provide business insight by leveraging knowledge of information relationships and supporting technologies - Bill offers his clients the CPA's perspective and ability to understand the complicated business implications and risks associated with technology. From this perspective, he helps them to effectively manage information while ensuring the data's reliability, security, accessibility and relevance. Bill has implemented enterprise business intelligence systems over the years for many Fortune 500 companies, focusing his practice (since the advent of MSSQL Server 2000) upon the integrated Microsoft business intelligence solution. 
He leverages his years of experience with other enterprise OLAP and reporting applications (Cognos, Business Objects, Crystal, and others) in regular conversions of these once-dominant applications to the Microsoft BI stack. Bill believes it is easier to teach technical skills to people with non-technical training than vice-versa, and he constantly seeks ways to graft new technology into the Accounting and Finance arenas. Bill was awarded Microsoft SQL Server MVP in 2009. Hobbies include advanced literature studies and occasional lectures, with recent concentration upon the works of William Faulkner, Henry James, Marcel Proust, James Joyce, Honoré de Balzac, and Charles Dickens. Other long-time interests have included the exploration of generative music sourced from database architecture.