Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Tips Database Forum Rss Feed

» Database Journal Home
» Database Articles
» Database Tutorials
MS Access
SQL Scripts & Samples
» Database Forum
» Slideshows
Free Newsletters:

News Via RSS Feed

Rss Feed

Database Journal |DBA Support |SQLCourse |SQLCourse2

Featured Database Articles


Posted Feb 7, 2005

MDX Essentials: The CROSSJOIN() Function: Breaking Bottlenecks - Page 2

By William Pearson

The CROSSJOIN() Function: Breaking Bottlenecks


As many of us know, the CROSSJOIN() function is highly useful anytime we wish to generate a cross-product of members in two different sets. The ability to specify "all possible combinations" is convenient - indeed, the most straightforward way to perform such a combination of two sets. Unfortunately, the indiscriminate use of the CROSSJOIN() function, like many other MDX functions, can slow reporting and analysis dramatically. This is often due to a failure to understand how the function performs set combinations, and how its action can lead to huge results datasets when applied to large cubes.

NOTE: For a detailed introduction to the CROSSJOIN() function, see my Database Journal article Basic Set Functions: The CrossJoin() Function.

In combining two sets, CROSSJOIN() combines every member of the first set (all from a single dimension) with every member of the second, creating a "Cartesian" effect as a result. Combining two sets, for example, with the following query will illustrate a scenario, on the scale of a decidedly small sample cube (the Warehouse sample that installs with MSAS), where we experience a taste of the consequences when the number of members in set 1, times the number of members in set 2, times the member population in set 3, results in many combinations.

    {[Measures].[Warehouse Profit]} ON COLUMNS,
   {CROSSJOIN([Warehouse].[Warehouse Name].Members,
        CROSSJOIN([Store].[Store Name].Members,
           [Product].[Product Name].Members))} ON ROWS

This query would generate 898,560 combinations (24 individual Warehouses times 24 Stores times 1,560 distinct Products). A similar scenario, with the cube sizes we see these days, could be crippling to performance. This is aggravated by the fact that sparsity is common enough in large cubes, and CROSSJOIN() results on those cubes would likely have a much higher sparsity factor. (The results dataset produced by the above query yields only a tiny fraction of combinations (tuples) with non-empty measures.)

The process of generating all possible combinations, empty or not, lies behind the performance drag, and becomes even more pronounced when we go a step further and attempt to perform query operations, as we shall see in the practice example, that must wait for the combinations to be assembled, and then be applied to the resulting dataset. The time consumed in assembling a large number of empty combinations has been wasted when they are "tossed" in a subsequent step in the march toward the ultimate results.

For purposes of our practice procedure, we will assume that we have been asked by management of a hypothetical client to investigate degradation in performance of a query. The query was originally constructed at the request of a group of information consumers in the Corporate Planning department, shortly after the implementation of MSAS at the FoodMart organization. The creator of the query, who initially wrote the MDX in a way that seemed intuitive, intended to optimize it later. Unfortunately, he was laid off in the wake of sudden moves by management to move all developer functions offshore.

Attempts to communicate with the offshore support team were abandoned when it was learned that the building housing the group had been destroyed in a natural disaster that had swept the region.  (The inability to obtain assistance with this issue paled, we are told, with the loss of
documentation, code, and other collateral for a host of enterprise projects in process at the time of the disaster.  Corporate financial information has also "gone missing" - a fact that has been seized upon by the media in short order.)

We listen closely to the requirement. We then develop a plan to examine the query under consideration, before offering options for improving its performance. As is typically the case, we decide to work with a copy of the affected cube (in this case, the Warehouse cube) to allow the original to remain isolated.

MS SQL Archives

Latest Forum Threads
MS SQL Forum
Topic By Replies Updated
SQL 2005: SSIS: Error using SQL Server credentials poverty 3 August 17th, 07:43 AM
Need help changing table contents nkawtg 1 August 17th, 03:02 AM
SQL Server Memory confifuration bhosalenarayan 2 August 14th, 05:33 AM
SQL Server Primary Key and a Unique Key katty.jonh 2 July 25th, 10:36 AM