Attribute Discretization: Using the Automatic Method - Page 3
April 17, 2009
Procedure: Employ the Automatic Discretization Method for Attribute Members in Analysis Services
When we browse a cube, we typically dimension the members of one attribute hierarchy by the members of another attribute hierarchy. For example, we might group customer sales by product purchased, by customer geography (state, province or country), or by customer gender. However, with certain types of attributes, it is useful to have Analysis Services automatically create groupings of attribute members, particularly when large numbers of contiguous values are involved, based upon the distribution of the members within an attribute hierarchy.
As an example, lets assume that we have been approached by representatives of our hypothetical client, the Adventure Works organization. These representatives, members of the Human Resources department of the business, tell us that one of the existing attributes of the Employee dimension, Vacation Hours, does not serve them well in browses and reports, as the many different values are simply listed, the members being derived from the unique Vacation Hours values in the VacationHours column of the DimEmployee table that underlies the Analysis Services layer of the business intelligence solution. (The developer of the cube, having returned to his home country, is no longer available to assist the HR department in modifying the Vacation Hours presentation into a more useful format.)
We listen to the description of the problem, and then suggest grouping the Vacation Hours information via the pre-defined Automatic discretization method offered within Analysis Services. We suggest this as a starter approach, so that our client colleagues can see how grouping the values in this manner might make the data more meaningful in browsing / reporting. We tell them that, once we do this, information consumers who browse the Vacation Hours attribute hierarchy will see the names / values of the groups instead of the members themselves. This limits the number of levels that are presented to users, which can be less confusing, and more useful for analysis. (We inform the client representatives, too, that other methods of discretization can be leveraged if the Automatic method proves less than ideal we are simply suggesting the Automatic method as a good starting point to illustrate an approach to meeting the end objective an approach that we can easily tweak once our colleagues understand the general concept of discretization, and can then make intelligent choices with regard to specific nuances that they might deem desirable.
Browse the Existing, Ungrouped Members within the Attribute Hierarchy
We will begin our practice with the Automatic discretization method within the Vacation Hours attribute hierarchy of the Employee dimension.
1. Within the Solution Explorer, right-click the Employee dimension (expand the Dimensions folder as necessary).
2. Click Open on the context menu that appears, as shown in Illustration 3.
The tabs of the Dimension Designer open. Lets look at the current state of the attribute under consideration, Vacation Hours.
3. Click the Browser tab.
4. Select Vacation Hours within the dropdown Hierarchy selector atop the Browse tab, as depicted in Illustration 4.
The attributes belonging to the Vacation Hours attribute hierarchy appear as partially shown in Illustration 5.
We note that many contiguous values appear. It is easy to see how grouping these values into logical buckets might make the data easier for information consumers to analyze.
Add Automatic Discretization within the Attribute Hierarchy
We will enact the Automatic discretization method from the Dimension Structure tab of the Employee dimension.
1. Click the Dimension Structure tab.
2. Within the Attributes pane, select Vacation Hours (the bottom attribute hierarchy in the pane).
The Properties window for the attribute appears (by default in the bottom right corner of the design environment), as depicted in Illustration 6.
3. In the Properties window, click the setting box to the right of the DiscretizationMethod property (which currently contains the word None), to enable the selector on the right edge of the box.
4. Select Automatic within the selector, as shown in Illustration 7.
The DiscretizationMethod property specifies the method used to group the members of the owner attribute. Once we tell Analysis Services the method of discretization, we can tell it how many buckets to create for purposes of grouping the attribute members.
5. In the setting box to the right of the DiscretizationBucketCount property (which currently contains the 0), replace the value for the DiscretizationBucketCount property with the number 10.
The DiscretizationBucketCount property specifies the number of buckets in which to discretize members of the attribute to which the property belongs. The default setting is 0. When we leave the DiscretizationBucketCount property at default, Analysis Services generates the number of groups independently, after sampling the underlying data involved.
The affected portion of the Properties window for the Vacation Hours attribute hierarchy appears as depicted (modifications surrounded by the red box) in Illustration 8.
We are now ready to process the Analysis Services database, and then to examine the results of our handiwork with the browser, as before.
Process the Analysis Services Database to Update Structure for Our Discretization Settings
We will process the Analysis Services database within which we have been working, and then browse the Vacation Hours attribute members to confirm that we have provided preliminary answers to the clients stated requirements.
1. Right-click the Analysis Services project atop the tree in the Solution Explorer.
2. Select Process ... from the context menu that appears, as shown in Illustration 9.
3. Click Yes on the dialog asking if you would like to save changes, which appears as depicted in Illustration 10.
Information updates on the server, and then the Process Database dialog appears, as shown in Illustration 11.
4. Click the Run button on the dialog.
The Process Progress viewer appears, and generates periodic status updates for various processing events. When processing is complete, we see a Process succeeded message appear in the Status bar in the lower part of the viewer, as depicted in Illustration 12.
5. Click the Close button on the Process Progress viewer to dismiss the viewer.
6. Click the Close button on the Process Database dialog to dismiss the dialog.
We are now ready to return to the Dimension Designer browser to examine the results of our handiwork.