Attribute Discretization: Using the "Equal Areas" Method
May 14, 2009
This article continues my exploration of attribute discretization, a capability in Analysis Services that allows us to group members of an attribute into a number of member groups. Our concentration here will be to get some exposure to the pre-defined Equal Areas discretization method, one of three such pre-defined methods supported by Analysis Services, through hands-on application of the method to a representative dimension attribute within our sample UDM.
This article continues the overview of Attribute Discretization in Analysis Services begun in Introduction to Attribute Discretization, and continued in Attribute Discretization: Using the Automatic Method. Both this article and its predecessor extend the examination of the dimensional model that we began in Dimensional Model Components: Dimensions Parts I and II. After taking up various additional components of the dimensional model in subsequent articles, we performed hands-on exploration of the general characteristics and purposes of attributes in Dimensional Attributes: Introduction and Overview Parts I through V. We then fixed our focus upon the properties underlying attributes, extending our overview into attribute member Keys, Names, Values and Relationships within several subsequent articles.
Note: For more information about my Introduction to MSSQL Server Analysis Services column in general, see the section entitled About the MSSQL Server Analysis Services Series that follows the conclusion of this article.
In Introduction to Attribute Discretization and Attribute Discretization: Using the Automatic Method, I summarized the preceding articles within the current subseries, which consist of a general introduction to the dimensional model. I noted the wide acceptance of the dimensional model as the preferred structure for presenting quantitative and other organizational data to information consumers. The articles of the series then undertook an examination of dimensions, the analytical perspectives upon which the dimensional model relies in meeting the primary objectives of business intelligence, including its capacity to support:
We extended our examination of dimensions into a couple of detailed articles. These articles, Dimensional Model Components: Dimensions Parts I and II, emphasized that dimensions, which represent the perspectives of a business or other operation, and reflect the intuitive ways that information consumers need to query and view data, form the foundation of the dimensional model. We noted that each dimension within our model contains one or more hierarchies. (As we learn in other articles of this series, two types of hierarchies exist within Analysis Services: attribute hierarchies and user - sometimes called multi-level - hierarchies.)
We next introduced dimension attributes within the subseries, and conducted an extensive overview of their nature, properties, and detailed settings in Dimensional Attributes: Introduction and Overview Parts I through V. We noted that attributes help us to define with specificity what dimensions cannot define by themselves. Moreover, we learned that attributes are collected within a database dimension, where we can access them to help us to specify the coordinates required to define cube space.
Throughout the current subseries, I have emphasized that dimensions and dimension attributes should support the way that management and information consumers of a given organization describe the events and results of the business operations of the entity. Because we maintain dimension and related attribute information within the database underlying our Analysis Services implementation, we can support business intelligence for our clients and employers even when these details are not captured within the system where transaction processing takes place. Within the analysis and reporting capabilities we supply in this manner, dimensions and attributes are useful for aggregation, filtering, labeling, and other purposes.
Having covered the general characteristics and purposes of attributes in Dimensional Attributes: Introduction and Overview Parts I through V, we fixed our focus upon the properties underlying them, based upon the examination of representative attributes within our sample cube. We then continued our extended examination of attributes to yet another important component we had touched upon earlier, the attribute member Key, with which we gained some hands-on exposure in practice sessions that followed our coverage of the concepts. In Attribute Member Keys Pt I: Introduction and Simple Keys and Attribute Member Keys Pt II: Composite Keys, we explored the concepts of simple and composite keys, narrowing our examination in Part I to the former, where we reviewed the Properties associated with a simple key, based upon the examination of a representative dimension attribute within our sample UDM. In Part II, we revisited the differences between simple and composite keys, and explained in more detail why composite keys are sometimes required to uniquely identify attribute members. We then reviewed the properties associated with a composite key, based upon the examination of another representative dimension attribute within our sample UDM.
In Attribute Member Names, we examined the attribute member Name property, which we had briefly introduced in Dimensional Attributes: Introduction and Overview Part V. We shed some light on how attribute member Name might most appropriately be used without degrading system performance or creating other unexpected or undesirable results. We then examined the sister attribute member Value property (which we introduced along with attribute member Name in Dimensional Attributes: Introduction and Overview Part V) in Attribute Member Values in Analysis Services. As we did in our overview of attribute member Name, we examined the details of Value. Our concentration was, similarly, upon its appropriate use in providing support for the selection and delivery of enterprise data in a more focused and consumer-friendly manner, without the system performance degradation and other unexpected or undesirable results that can accompany the uninformed use of the property.
In Introduction to Attribute Relationships in MSSQL Server Analysis Services, we examined yet another part of the conceptual model, Attribute Relationships. In this introduction, we discussed several best practices, along with design and other considerations involved in their use, with a focus upon the general exploitation of attribute relationships in providing support, once again, for the selection and delivery of enterprise data. In the subsequent two related articles, Attribute Relationships: Settings and Properties and More Exposure to Settings and Properties in Analysis Services Attribute Relationships, we examined attribute relationships in a manner similar to previous articles within this subseries, concentrating in detail upon the properties that underlie them.
With the next article, Introduction to Attribute Discretization, we introduced a capability in Analysis Services, to which we refer as attribute discretization, that allows us to group members of an attribute into a number of member groups. We discussed design and other considerations involved in the discretization of attributes, and touched upon best practices surrounding the use of this capability.
Finally, in Attribute Discretization: Using the Automatic Method, we introduced the first of multiple pre-defined discretization methods supported within the Analysis Services UDM. We first discussed the options that are available, focusing upon the employment of the Automatic discretization method within the sample cube, to meet the business requirements of a hypothetical client. We then began our practice session with an inspection of the contiguous members of a select attribute hierarchy, noting the absence of grouping and discussing shortcomings of this default arrangement. Next, we enabled the Automatic discretization method within the dimension attribute Properties pane, and then reprocessed the sample cube with which we were working to enact the new Automatic discretization of the select attribute members. Finally, we performed further inspections of the members of the attribute hierarchy involved in the request for assistance by our hypothetical client, noting the new, more intuitive grouping established by the newly enacted Automatic discretization method.
In this article, we will gain some hands-on exposure to setting up another of the discretization methods supported by Analysis Services. We will first briefly review the options that are available (referencing their coverage in other articles, where applicable), and then work with Equal Areas discretization in the sample cube. (In individual articles designed specifically for the purpose, we will examine the setup of other discretization options, in a manner similar to previous articles within this subseries, gaining hands-on exposure to the use of those options in individual practice scenarios.)
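To make the idea behind Equal Areas concrete before we turn to the sample cube, the short Python sketch below groups a set of attribute member values so that each member group holds roughly the same number of members. It is a conceptual illustration only, and it rests on assumptions: the function names (equal_areas_groups, group_label), the sample values, and the rule for placing group boundaries are all hypothetical, and the sketch does not reproduce the algorithm that Analysis Services itself applies during processing.

```python
def equal_areas_groups(values, bucket_count):
    """Split member values into groups of near-equal population.

    Hypothetical helper for illustration; not the Analysis Services algorithm.
    """
    if bucket_count <= 0:
        raise ValueError("bucket_count must be a positive integer")
    ordered = sorted(values)
    n = len(ordered)
    groups = []
    for i in range(bucket_count):
        # Integer arithmetic spreads any remainder across the groups.
        start = i * n // bucket_count
        end = (i + 1) * n // bucket_count
        if start < end:  # skip empty groups when there are few members
            groups.append(ordered[start:end])
    return groups


def group_label(group):
    """Build a range-style label from the first and last member of a group."""
    return f"{group[0]} - {group[-1]}"


# Sample member values (assumed for illustration only).
members = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
for group in equal_areas_groups(members, 3):
    print(group_label(group), "->", len(group), "members")
```

With ten members and three requested groups, the sketch produces groups of three, three, and four members. The point to carry forward is that the boundaries follow the population of members, not evenly spaced value ranges.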
Our examination will include: