MDX Essentials: Numeric Functions: Introduction to the AVG() Function

September 13, 2004

About the Series ...

This article is a member of the series MDX Essentials. The series is designed to provide hands-on application of the fundamentals of the Multidimensional Expressions (MDX) language, with each tutorial progressively adding features designed to meet specific real-world needs.

For more information about the series in general, as well as the software and systems requirements needed for getting the most out of the lessons included, please see the first article, MDX at First Glance: Introduction to MDX Essentials.

Note: Service Pack 3 / 3a updates are assumed for MSSQL Server 2000, MSSQL Server 2000 Analysis Services, and the related Books Online and Samples.

Overview

In this lesson, we will introduce a commonly used numeric function in the MDX toolset, the AVG() function. The general purpose of the AVG() function, as we shall discover, is to return the average of the tuples occupying a set. We will consider elementary uses of the function in this article, and then explore more sophisticated uses in subsequent articles. For now, we will build a foundation in the basics.

The AVG() function can be leveraged in activities that range from the simple to the complex, as is the case with many other MDX functions. We will introduce the function, commenting upon its operation and touching upon variations at a general level, and then we will examine its syntax and undertake practice examples that illustrate uses for the function.
The AVG() Function

Introduction

According to the Analysis Services Books Online, the AVG() function "returns the average value of a numeric expression evaluated over a set." The function uses a numeric expression to indicate the base value for which the average will be calculated. (An example of the numeric expression might be the measure Warehouse Sales in the FoodMart 2000 Warehouse cube.) The AVG() function ignores empty values found within the cells that are associated with the specified set. Its behavior with regard to empty cells can be circumvented in at least a couple of ways, as we shall see in the next section.

After discussing its operation in the next section, we will examine the syntax for the AVG() function. Next, we will undertake practice examples constructed to support hypothetical business needs that illustrate uses for the function. This will allow us to activate what we explore in the Discussion and Syntax sections, by getting some hands-on exposure in creating expressions that leverage the function.

Discussion

To reword our initial explanation of its operation, the AVG() function computes the average of the non-empty values populating the cells of the set specified within the function. Mechanically, this means that the total value (the sum) of the cells inhabiting the set is divided by the number (or count) of populated cells. A key concept here is that a behind-the-scenes count of the populated cells is taking place for use as the "divisor" in the calculation. Empty cells are not included in the divisor. In cases where we wish to count the empty cells, as well, we can force the inclusion of these cells by employing the COALESCEEMPTY() function.
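As a rough illustration of the COALESCEEMPTY() technique just mentioned, the following sketch substitutes zero for each empty cell so that every member of the set contributes to the divisor. It borrows the Store dimension, the Store Country level, the Units Shipped measure and the Warehouse cube from the FoodMart 2000 sample used in this article; the calculated member name [Avg Incl Empty] is hypothetical, chosen only for this illustration.

```mdx
-- Sketch only: the member name [Avg Incl Empty] is an assumption for illustration.
-- CoalesceEmpty() returns 0 wherever Units Shipped is empty, so AVG() no longer
-- sees an empty cell and counts every Store Country in its divisor.
WITH
MEMBER [Store].[Avg Incl Empty] AS
'AVG(
    { [Store].[Store Country].Members },
    CoalesceEmpty( [Measures].[Units Shipped], 0 )
)'
SELECT
    { [Store].[Store Country].Members, [Store].[Avg Incl Empty] } ON COLUMNS,
    { [Product].[Product Family].[Non-Consumable].Children } ON ROWS
FROM [Warehouse]
WHERE ( [Measures].[Units Shipped] )
```

Whether zero is the right stand-in for an empty cell depends on the business meaning of "empty" in the cube at hand; where an empty cell means "not applicable" rather than "none," the default behavior of AVG() is usually the safer choice.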
Another approach might be to work around the normal exclusion of empty cells by simply approaching the computation of the average in a different manner, such as by taking the results obtained by subjecting the set to the SUM() function, which we would then divide by the results returned by subjecting the same set to an appropriate COUNT() function.

NOTE: For more information on the SUM() function, see my Database Journal articles Mastering Time: Period-to-Date Aggregations, and Calculated Members: Leveraging Member Properties (in the MDX in Analysis Services series), both of which contain references to the function. In addition, for a detailed look at the COUNT() function, see my article Basic Numeric Functions: The Count() Function, in the MDX Essentials series, also at Database Journal.

Subjecting an empty set to the AVG() function returns the same result we obtain within a scenario where we divide by zero (commonly denoted by "1.INF" appearing within the affected parts of the dataset).

Let's look at some syntax illustrations to further clarify the operation of AVG().

Syntax

Syntactically, the set we use to specify the range of cells for which we wish to calculate the average is placed within the parentheses to the right of AVG, and separated by a comma from the numeric expression we have described. The syntax is shown in the following string:

    Avg(Set[, Numeric Expression])

The following example expression illustrates a use of the AVG() function. Let's say that information consumers from the FoodMart Logistics department, whose data is housed within the Warehouse cube, come to us with a straightforward request: The consumers wish to see the total national averages for Units Shipped for each of the Category groups composing our Non-Consumable products line. They wish the totals for the member Product Categories to be displayed, with the averages for the three countries in which FoodMart operates, Canada, Mexico and the USA, to be presented side by side.
The basic AVG() function involved, within a core query that presents the information in the manner requested, would be constructed as follows:

    WITH
    MEMBER [Store].[Nat'l Avg] AS
    'AVG( { [Store].[Store Country].Members}, [Measures].[Units Shipped])'
    SELECT
    { [Store].[Store Country].Members, [Store].[Nat'l Avg] } ON COLUMNS,
    {[Product].[Product Family].[Non-Consumable].Children} ON ROWS
    FROM [Warehouse]
    WHERE [Measures].[Units Shipped]

Our query is simply expressing that we wish to retrieve the "average total units shipped for each of our Non-Consumable product categories, by country of store operation." We are ignoring time - the consumers are aware that the cube contains data from two years, and want the information to be based upon the cube as a whole.

The query we construct returns a dataset similar to that depicted in Table 1.
The above example serves to illustrate the treatment, within the AVG() function, of empty cells within the specified set. The results demonstrate clearly that the function excludes empty cells in calculating the average we see.

We will practice the use of the AVG() function in the section that follows. We will start with a relatively simple scenario, and then construct a second, slightly more sophisticated query.
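Before moving on, the SUM() / COUNT() alternative described in the Discussion section can be sketched against the same core query. This is a minimal illustration rather than a definitive pattern: the calculated member name [Avg Sum Over Count] is hypothetical, and the sketch assumes that the INCLUDEEMPTY flag of COUNT() is the "appropriate" option when empty cells should take part in the divisor.

```mdx
-- Sketch only: [Avg Sum Over Count] is an assumed name for illustration.
-- [Nat'l Avg] repeats the AVG() member from the core query, which skips empty cells.
-- [Avg Sum Over Count] divides the sum by a count of all cells, empty or not.
WITH
MEMBER [Store].[Nat'l Avg] AS
'AVG( { [Store].[Store Country].Members }, [Measures].[Units Shipped] )'
MEMBER [Store].[Avg Sum Over Count] AS
'SUM( { [Store].[Store Country].Members }, [Measures].[Units Shipped] )
 / COUNT( { [Store].[Store Country].Members }, INCLUDEEMPTY )'
SELECT
    { [Store].[Store Country].Members,
      [Store].[Nat'l Avg],
      [Store].[Avg Sum Over Count] } ON COLUMNS,
    { [Product].[Product Family].[Non-Consumable].Children } ON ROWS
FROM [Warehouse]
WHERE ( [Measures].[Units Shipped] )
```

Placing the two calculated members side by side makes it easy to see, row by row, where empty cells change the divisor: the two columns should agree wherever every country has a value, and differ wherever one or more countries are empty.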