Database Journal

Featured Database Articles

MS SQL

Posted Jun 23, 2009

Attribute Discretization: Using the "Clusters" Method

By William Pearson

This article continues my exploration of attribute discretization, a capability in Analysis Services that allows us to group members of an attribute into a number of member groups. Our concentration here will be upon gaining some exposure to the pre-defined “Clusters” discretization method, one of three such pre-defined methods supported by Analysis Services, through hands-on application of the method to a representative dimension attribute within our sample UDM.

This article continues the overview of Attribute Discretization in Analysis Services begun in Introduction to Attribute Discretization, and continued in Attribute Discretization: Using the Automatic Method and Attribute Discretization: Using the “Equal Areas” Method. This article and its predecessors extend the examination of the dimensional model that we began in Dimensional Model Components: Dimensions Parts I and II. After taking up various additional components of the dimensional model in subsequent articles, we performed hands-on exploration of the general characteristics and purposes of attributes in Dimensional Attributes: Introduction and Overview Parts I through V. We then fixed our focus upon the properties underlying attributes, extending our overview into attribute member Keys, Names, Values and Relationships within several subsequent articles.

Note: For more information about my Introduction to MSSQL Server Analysis Services column in general, see the section entitled “About the MSSQL Server Analysis Services Series” that follows the conclusion of this article.

Introduction

In Introduction to Attribute Discretization, Attribute Discretization: Using the Automatic Method, and Attribute Discretization: Using the Equal Areas Method, I summarized preceding articles within the current subseries, consisting of a general introduction to the dimensional model. I noted the wide acceptance of the dimensional model as the preferred structure for presenting quantitative and other organizational data to information consumers. The articles of the series then undertook an examination of dimensions, the analytical “perspectives” upon which the dimensional model relies in meeting the primary objectives of business intelligence, including its capacity to support:

  • the presentation of relevant and accurate information representing business operations and events;
  • the rapid and accurate return of query results;
  • “slice and dice” query creation and modification;
  • an environment wherein information consumers can pose questions quickly and easily, and obtain results datasets rapidly.

We extended our examination of dimensions into a couple of detailed articles. These articles, Dimensional Model Components: Dimensions Parts I and II, emphasized that dimensions, which represent the perspectives of a business or other operation, and reflect the intuitive ways that information consumers need to query and view data, form the foundation of the dimensional model. We noted that each dimension within our model contains one or more hierarchies. (As we learn in other articles of this series, two types of hierarchies exist within Analysis Services: attribute hierarchies and user hierarchies, the latter sometimes called “multi-level” hierarchies.)

We next introduced dimension attributes within the subseries, and conducted an extensive overview of their nature, properties, and detailed settings in Dimensional Attributes: Introduction and Overview Parts I through V. We noted that attributes help us to define with specificity what dimensions cannot define by themselves. Moreover, we learned that attributes are collected within a database dimension, where we can access them to help us to specify the coordinates required to define cube space.

Throughout the current subseries, I have emphasized that dimensions and dimension attributes should support the way that management and information consumers of a given organization describe the events and results of the business operations of the entity. Because we maintain dimension and related attribute information within the database underlying our Analysis Services implementation, we can support business intelligence for our clients and employers even when these details are not captured within the system where transaction processing takes place. Within the analysis and reporting capabilities we supply in this manner, dimensions and attributes are useful for aggregation, filtering, labeling, and other purposes.

Having covered the general characteristics and purposes of attributes in Dimensional Attributes: Introduction and Overview Parts I through V, we fixed our focus upon the properties underlying them, based upon the examination of representative attributes within our sample cube. We then continued our extended examination of attributes to yet another important component we had touched upon earlier, the attribute member Key, with which we gained some hands-on exposure in practice sessions that followed our coverage of the concepts. In Attribute Member Keys – Pt I: Introduction and Simple Keys and Attribute Member Keys – Pt II: Composite Keys, we explored the concepts of simple and composite keys, narrowing our examination in Part I to the former, where we reviewed the Properties associated with a simple key, based upon the examination of a representative dimension attribute within our sample UDM. In Part II, we revisited the differences between simple and composite keys, and explained in more detail why composite keys are sometimes required to uniquely identify attribute members. We then reviewed the properties associated with a composite key, based upon the examination of another representative dimension attribute within our sample UDM.

In Attribute Member Names, we examined the attribute member Name property, which we had briefly introduced in Dimensional Attributes: Introduction and Overview Part V. We shed some light on how attribute member Name might most appropriately be used without degrading system performance or creating other unexpected or undesirable results. We then examined the “sister” attribute member Value property (which we introduced along with attribute member Name in Dimensional Attributes: Introduction and Overview Part V) in Attribute Member Values in Analysis Services. As we did in our overview of attribute member Name, we examined the details of Value. Our concentration was, similarly, upon its appropriate use in supporting the selection and delivery of enterprise data in a more focused and consumer-friendly manner, without the system performance degradation, and other unexpected or undesirable results, that can accompany the uninformed use of the property.

In Introduction to Attribute Relationships in MSSQL Server Analysis Services, we examined yet another part of the conceptual model, Attribute Relationships. In this introduction, we discussed several best practices, as well as design and other considerations, involved in their use, with a focus upon the general exploitation of attribute relationships in providing support, once again, for the selection and delivery of enterprise data. In the subsequent two related articles, Attribute Relationships: Settings and Properties and More Exposure to Settings and Properties in Analysis Services Attribute Relationships, we examined attribute relationships in a manner similar to previous articles within this subseries, concentrating in detail upon the properties that underlie them.

With the next article, Introduction to Attribute Discretization, we introduced a capability in Analysis Services, to which we refer as attribute discretization, that allows us to group members of an attribute into a number of member groups. We discussed design and other considerations involved in the discretization of attributes, and touched upon best practices surrounding the use of this capability.
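To make the idea of “grouping members into member groups” concrete, the short Python sketch below buckets a set of numeric attribute members into a fixed number of contiguous groups and labels each group by its range. It is only an illustration of the concept, run outside Analysis Services: the function name, the equal-count splitting rule, and the sample “yearly income” values are all assumptions made for the example, and none of them comes from the sample UDM.

```python
def discretize_equal_counts(members, bucket_count):
    """Split numeric members into contiguous groups of near-equal size.

    Illustrative only: this is not how Analysis Services builds its groups.
    """
    ordered = sorted(members)
    base, extra = divmod(len(ordered), bucket_count)
    groups = []
    start = 0
    for i in range(bucket_count):
        # The first `extra` groups absorb one leftover member each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # more buckets requested than members available
        chunk = ordered[start:start + size]
        # Label the group by its lowest and highest member.
        groups.append((f"{chunk[0]} - {chunk[-1]}", chunk))
        start += size
    return groups


# Hypothetical attribute members (assumed values, for illustration only).
yearly_income = [10000, 20000, 30000, 40000, 50000,
                 60000, 70000, 80000, 90000]
groups = discretize_equal_counts(yearly_income, 3)
for label, chunk in groups:
    print(label, chunk)
```

The point of the sketch is simply that information consumers browse a handful of labeled ranges instead of every individual member, which is the benefit we pursue with each of the discretization methods in this subseries.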

In Attribute Discretization: Using the Automatic Method, we introduced the first of multiple pre-defined discretization methods supported within the Analysis Services UDM. We discussed the options that are available, focusing upon the employment of the Automatic discretization method within the sample cube, to meet the business requirements of a hypothetical client. We then began our practice session with an inspection of the contiguous members of a select attribute hierarchy, noting the absence of grouping and discussing shortcomings of this default arrangement. Next, we enabled the Automatic discretization method within the dimension attribute Properties pane, and then reprocessed the sample cube with which we were working to enact the new Automatic discretization of the select attribute members. Finally, we performed further inspections of the members of the attribute hierarchy involved in the request for assistance by our hypothetical client, noting the new, more intuitive grouping established by the newly enacted Automatic discretization method.

Finally, in last month’s article, Attribute Discretization: Using the Equal Areas Method, we introduced the second of the pre-defined discretization methods supported within the Analysis Services UDM. We discussed the options that are available with this particular approach, as we did in the previous article for the Automatic method, focusing upon the employment of the Equal Areas discretization method, again within the sample cube, to meet the business requirements of a hypothetical client. We then began our practice session with an inspection, via the browser in the Dimension Designer, of the contiguous members of another select attribute hierarchy, noting the absence of grouping and discussing shortcomings of this default arrangement. Next, we enabled the Equal Areas discretization method within the dimension attribute Properties pane, and again reprocessed the sample cube with which we were working to enact the new Equal Areas discretization of the select attribute members. Finally, we performed another inspection, via the Dimension Designer and Cube Designer browsers, of the members of the attribute hierarchy involved in the request for assistance by our hypothetical client, noting the new, more intuitive grouping established by the newly enacted Equal Areas discretization method.

In this article, we will gain some hands-on exposure to setting up yet another of the discretization methods supported by Analysis Services. We will first briefly review the options that are available (referencing their coverage in other articles, where applicable), and then work with Clusters discretization in the sample cube. (In individual articles designed specifically for the purpose, we will examine the setup of other discretization options, in a manner similar to previous articles within this subseries, gaining hands-on exposure to the use of those options in individual practice scenarios.)

Our examination will include:

  • A brief review of attribute discretization in Analysis Services, potential benefits that accrue from discretization in our UDMs, and how the process can help us to meet the primary objectives of business intelligence.
  • A brief overview of the multiple pre-defined discretization processes supported within the Analysis Services UDM.
  • Examination, via the browser in the Dimension Designer, of the pre-existing members of a select attribute hierarchy, noting the absence of grouping and discussing shortcomings of this default arrangement.
  • Enablement of the Clusters discretization method within the dimension attribute Properties pane.
  • Reprocessing the cube to enact the new Clusters discretization of the select attribute members.
  • Another examination, via the browsers in both the Dimension Designer and the Cube Designer, of the members of a select attribute hierarchy, noting the new, more intuitive grouping established by the newly enacted Clusters discretization method.
  • Backward- and forward-looking references to previous and subsequent articles, respectively, within our series, wherein we perform detailed examinations surrounding other details of discretization, as supported within the Analysis Services UDM.
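
Before we enable the setting, it may help to see what a clustering-based grouping tends to produce compared with evenly sized groups. The Python sketch below runs a plain one-dimensional k-means over a handful of numeric members. It is a rough illustration under stated assumptions, not the Analysis Services implementation: the function name, the way the starting centroids are seeded, and the sample values are all hypothetical, and the groups Analysis Services actually builds for a given attribute may well differ.

```python
def kmeans_1d(values, k, iterations=100):
    """Group numeric values into k clusters with a basic 1-D k-means.

    Illustrative sketch only; expects k >= 1 and iterations >= 1.
    """
    ordered = sorted(values)
    # Assumption: seed the centroids at evenly spaced positions in the
    # sorted data so that the result is deterministic.
    if k == 1:
        centroids = [ordered[len(ordered) // 2]]
    else:
        centroids = [ordered[round(i * (len(ordered) - 1) / (k - 1))]
                     for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for value in ordered:
            nearest = min(range(k), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Update step: move each centroid to the mean of its cluster.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # assignments have stabilized
        centroids = updated
    return [c for c in clusters if c]


# Hypothetical, unevenly distributed attribute members.
members = [1, 2, 3, 4, 5, 6, 40, 41, 90]
clusters = kmeans_1d(members, 3)
for cluster in clusters:
    print(f"{cluster[0]} - {cluster[-1]}", cluster)
```

With unevenly distributed members such as these, the groups follow the natural gaps in the data rather than containing equal numbers of members, which is the general behavior we look for when we choose a clustering approach over an equal-population one.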

