Analysis Services Enhancements
September 10, 2000
Microsoft implemented these data mining services using the OLE DB for Data Mining specification (http://www.microsoft.com/data/oledb/dm.htm) to allow other 3rd party vendors and application developers to integrate data mining with external applications. For example, using the OLEDB for DM spec, a developer can integrate data mining into their web site in order to predict buying patterns based on demographic profiles. Although no performance numbers are currently available, Microsoft has been using this technology in their MSN and MSNBC web sites for months.
Two algorithms will be included in the final release, Microsoft Decision Trees and Microsoft Clustering. These algorithms will be able to perform classification, prediction and clustering. Microsoft has also left the door open for the addition of new algorithms in the future to support other mining tasks such as deviation analysis, estimation, and affinity grouping.
Microsoft Decision Trees are used for classification. It constructs one or more decision trees that can be used to predict attributes based on other attributes. For example, classification can be used to classify credit applications as low, medium or high risk or identifying fraudulent behavior based on inputs such as age, sex, income or any other attribute.
Microsoft Clustering is used to group records into clusters that share similar attributes. The difference between clustering and classification is that clustering groups these records together based on self-similarity rather than predefined predictive input.
Microsoft has added some powerful new features in the cube and dimension design areas. OLAP Services supports parent-child dimensions that can represent structures like an organizational chart. Another new feature that parent-child dimensions can take advantage of is custom rollup formulas. In the organizational chart example, a supervisor's salary is not the sum of all of their personnel. Using a custom rollup formula, salaries within this organizational structure can be correctly calculated. OLAP Services now supports ragged dimensions or non-standard hierarchies where standard hierarchies don't apply. For example, in the US, cities rollup into states and states roll up into country. However, in other countries, city may only rollup into country. This structure can now be represented correctly.
Several performance enhancements have been added to OLAP as well. The addition of separate storage modes for large dimensions allows OLAP Services to support 10 Million+ member dimensions. Also, automatic member grouping effectively removes the 7.0 version's limitation of 64,000 children under a single parent. OLAP Services can also take advantage of SQL Server 2000's new indexed views feature by using these instead of relational aggregate tables.
However, the biggest performance enhancements are only available using the Enterprise Edition -- Linked Cubes and Distributed Partitioned Cubes. These features allow for transparent scale-out architecture for your OLAP servers. You no longer have to add more processors or memory; you can use distributed computing to linearly scale.
OLAP Services also now supports security at the dimension or member level. This allows for a more granular design of security. For example, you can design security roles to only allow users from within a particular division to see metrics for their own division or enterprise, but not other divisions.
OLAP Services has also become much more developer and analyst friendly with the additions of the MDX Builder, Actions, and Drillthrough. MDX can now be built graphically, UI applications can now respond to user activity by performing pre-defined actions such as displaying metadata, and source relational data can be accessed by drilling to the lowest grain of the fact table. OLAP Services also now supports multi-user administration -- no more being forced to wait while another user edits a different database on the same server.
OLAP Services also now supports cube and dimension designs such as Distinct Count, Default Members, member property ordering and dimension filters. Distinct Count was difficult to implement and poor performing in 7.0, but has been included as a new aggregate function type. Default members and member property ordering options create a richer user experience by allowing the cube designer to select the member and order that the end user will use most frequently.
These additions, as well as many others, should make Analysis Services the best choice for OLAP and Data Mining within your organization.