Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

MS SQL

Posted Jan 26, 2004

MDX in Analysis Services: Optimizing MDX: More on Location, and the Importance of Arrangement - Page 5

By William Pearson

The Role of Expression Arrangement in Optimization

As many of us have learned, both in working with SQL and with MDX, the greatest strides in performance tuning can often be gained by tuning the expressions themselves. The simple rearrangement of a query can produce dramatic performance gains, as we will see in this section. We have discussed the control of resource use, to some extent, through the direction of a query's location of execution. We also exposed a scenario in the previous section where, even though the query might contain level populations that fall below the threshold for server processing, client processing can be more expensive than is apparent.

Let's return to our example above, and see what we can do in the way of arranging expressions to make our query less resource intensive. To do so, we will resume where we left off within the Sample Application:

1.  Create a new query to rephrase the last (saved as MXAS11-3), arranged as follows:

-- MXAS11-4:  Optimized Set Operation
SELECT
{[Measures]. [Units Shipped]} ON COLUMNS,
TopCount (
   Generate ( [Warehouse].[City].Members,
      TopCount ( CrossJoin ({ [Warehouse].CurrentMember}, 
         [Product].[Product Name].Members
          ), 7,  [Measures].[Units Shipped] 
       )
   ), 7,  [Measures].[Units Shipped] 
) ON ROWS 
FROM Warehouse

In our rearrangement of query MXAS11-3, we are exploiting the GENERATE function to rephrase our query to "determine the top seven tuples that we obtain from cross-joining the warehouse city and named products, for the current warehouse city. Then marry the results with the rest of the members of the warehouse city level."

The number of combinations that result within this query is 1,721 (1 x 1,560 named products, plus 7 x 23 cities). This represents a significant reduction over the number of combinations in the pre-optimized query, and results in a corresponding reduction of over 95 percent.

As we will see from the next step, the results will be identical to those in the pre-optimized version.

2.  Execute the query using the Run Query button.

The results dataset appears as shown in Illustration 6.


Illustration 6: Results Dataset, Optimized CrossJoin() Query

3.  Save the query as MXAS11-4.

Through this small exercise, we have illustrated the dramatic performance gains that can be obtained by simply rephrasing a query. To summarize the approach, we might keep in mind that we can replace complex CROSSJOIN() statements with a GENERATE() and CROSSJOIN() combination, as we have demonstrated above. The following general rule summarizes the approach:

CrossJoin (X,Y) = 
GENERATE(X, CrossJoin(X.CurrentMember, Y)

Or

CrossJoin (X,Y) = 
GENERATE (CrossJoin(X,Y.CurrentMember), Y)

Let's take a look at another scenario, where the objective of our query is the summation of two measures, Units Shipped and Warehouse Sales, for a given set of products, stores and warehouses (all products, in all non-U.S. stores and warehouses, in our example). We can approach the query in a couple of ways, using calculated measures to effect the summations.

Let's create a straightforward query first, where we create calculated members that we combine with the measures we seek to obtain the correct calculations.

1.  Create the following new query:

-- MXAS11-5:  Pre-optimized Set Operation # 2
WITH 
MEMBER [Product].[Non-US Total] AS
'SUM ({[Product].[All Products].[Drink],  
[Product].[All Products].[Food]})'
MEMBER [Store].[Non-US Total] AS
'SUM ({[Store].[All Stores].[Canada],   
[Store].[All Stores].[Mexico]})'
MEMBER [Warehouse].[Non-US Total] AS
'SUM ({[Warehouse].[All Warehouses].[Canada],  
[Warehouse].[All Warehouses].[Mexico]})' 
SELECT
{[Measures].[Units Shipped],  
[Measures].[Warehouse Sales]} ON COLUMNS,
{[Time].[1998]} ON ROWS
FROM Warehouse
WHERE ( [Product].[Non-US Total],  
[Store].[Non-US Total], [Warehouse].[NON-US Total] )

Now, let's run the query and note the results.

2.  Execute the query using the Run Query button.

The results dataset appears as depicted in Illustration 7.


Illustration 7: Results Dataset, Pre-Optimized Summation Query

3.  Save the query as MXAS11-5.

Now let's consider another approach to obtaining the same objective. This time, we will place each set of calculation logic into a respective calculated measure.

4.  Create the following new query:


-- MXAS11-6:  Optimized Set Operation # 2
WITH 
MEMBER [Measures].[Total Units Shipped] AS
   'SUM (
      {[Warehouse].[All Warehouses].[Canada], 
[Warehouse].[All Warehouses].[Mexico]},
      SUM (
         {[Store].[All Stores].[Canada],  
[Store].[All Stores].[Mexico]},
            SUM (
              {[Product].[All Products].[Drink], 
[Product].[All Products].[Food]},
              [Measures].[Units Shipped] 
)))'
MEMBER [Measures].[Total Warehouse Sales] AS
 'SUM (
      {[Warehouse].[All Warehouses].[Canada], 
[Warehouse].[All Warehouses].[Mexico]},
      SUM (
         {[Store].[All Stores].[Canada] ,[Store].[All Stores].[Mexico]},
            SUM (
              {[Product].[All Products].[Drink],[Product].[All Products].[Food]},
              [Measures].[Warehouse Sales] 
)))'
SELECT
{ [Measures].[Total Units Shipped], 
[Measures].[Total Warehouse Sales] } ON COLUMNS,
{[Time].[1998]} ON ROWS
FROM Warehouse

Again, we will run the query and observe the results.

5.  Execute the query using the Run Query button.

The results dataset appears as shown in Illustration 8.


Illustration 8: Results Dataset, Optimized Summation Query

6.  Save the query as MXAS11-6.

We note that the results are identical for the two approaches to the query (with the exception of the rounding of the Warehouse Sales amounts and the measure names, either of which could be adjusted, were they significant concerns). The second approach, however, performs appreciably better, as would be evident were the query extracting a much larger amount of data from a real-world data source. The following general rule summarizes the use of the commutative nature of the SUM function (and applies equally to the MAX, MIN, and TopCount() functions):


SUM (CrossJoin (X,Y), b) = 
SUM(X, SUM(Y, b))

In both the examples of performance enhancements we have explored above, and in any other potential performance enhancement scenario we might contemplate, we would need to consider the cost / benefit tradeoffs, actual performance metrics against different sources, and whether the differences in performance are too negligible to devote much time to attempting to increase performance further. The point is that tuning of MDX expressions can be accomplished, both in the manner suggested above or through other approaches, to maximize performance, and that beginning with the MDX statements themselves is usually the best approach to any optimization effort. Once the base query is tuned to an optimal state, we can always consider client / server balancing, hardware upgrades, and perhaps further measures to increase performance.

Summary and Conclusion ...

In this lesson, we continued our focus on the use of control of location as a primary intervention type within the realm of options available to us for MDX query optimization. We performed a practice exercise to reinforce the concepts exposed. We then extended our considerations of the available types of intervention to include the optimization of set operations and syntax arrangement considerations. Within our exploration of the importance of the optimization of set operations, we undertook practice examples that illustrated some ways we can rearrange queries to enhance performance, often significantly.

In our next lesson, the final article of the Optimizing MDX mini-series, we will, expose methods of caching to load a commonly used slice of a cube into memory, making for faster retrieval in prospective operations. Our discussion will include various aspects of cache creation, and uses of caching within MDX. In addition, we will touch upon other performance enhancement options, including external functions and cube design modifications and augmentation.

» See All Articles by Columnist William E. Pearson, III



MS SQL Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 




Latest Forum Threads
MS SQL Forum
Topic By Replies Updated
SQL 2005: SSIS: Error using SQL Server credentials poverty 3 August 17th, 07:43 AM
Need help changing table contents nkawtg 1 August 17th, 03:02 AM
SQL Server Memory confifuration bhosalenarayan 2 August 14th, 05:33 AM
SQL Server Primary Key and a Unique Key katty.jonh 2 July 25th, 10:36 AM