Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

MS SQL

Posted Sep 13, 2004

MDX Essentials: Numeric Functions: Introduction to the AVG() Function

By William Pearson

About the Series ...

This article is a member of the series MDX Essentials. The series is designed to provide hands-on application of the fundamentals of the Multidimensional Expressions (MDX) language, with each tutorial progressively adding features designed to meet specific real-world needs. For more information about the series in general, as well as the software and systems requirements needed for getting the most out of the lessons included, please see the first article, MDX at First Glance: Introduction to MDX Essentials.

Note: Service Pack 3 / 3a updates are assumed for MSSQL Server 2000, MSSQL Server 2000 Analysis Services, and the related Books Online and Samples.

Overview

In this lesson, we will introduce a commonly used numeric function in the MDX toolset, the AVG() function. The general purpose of the AVG() function, as we shall discover, is to return the average of the tuples occupying a set. We will consider elementary uses of the function in this article, and then explore more sophisticated uses in subsequent articles. For now, we will build a foundation in the basics.

The AVG() function can be leveraged in activities that range from the simple to the complex, as is the case with many other MDX functions. We will introduce the function, commenting upon its operation and touching upon variations at a general level, and then we will:

  • Examine the syntax surrounding the function;
  • Undertake illustrative examples of the uses of the function in practice exercises;
  • Briefly discuss the results datasets we obtain in the practice examples.

The AVG() Function

Introduction

According to the Analysis Services Books Online, the AVG() function "returns the average value of a numeric expression evaluated over a set." The function uses a numeric expression to indicate the base value for which the average will be calculated. (An example of the numeric expression might be the measure Warehouse Sales in the FoodMart 2000 Warehouse cube).

The AVG() function ignores empty values found within the cells that are associated with the specified set. Its behavior with regard to empty cells can be circumvented at least a couple of ways, as we shall see in the next section.

After discussing its operation in the next section, we will examine the syntax for the AVG() function. Next, we will undertake practice examples constructed to support hypothetical business needs that illustrate uses for the function. This will allow us to activate what we explore in the Discussion and Syntax sections, by getting some hands-on exposure in creating expressions that leverage the function.

Discussion

To reword our initial explanation of its operation, the AVG() function computes the average of the non-empty values populating the cells of the set specified within the function. Mechanically, this means that the total value (the sum) of the cells inhabiting the set is divided by the number (or count) of populated cells.

A key concept here is that a behind-the-scenes count of the populated cells is taking place for use as the "divisor" in the calculation. Empty cells are not included in the divisor. In cases where we wish to count the empty cells, as well, we can force the inclusion of these cells by employing the COALESCEEMPTY() function. Another approach might be to work around the normal exclusion of empty cells by simply approaching the computation of the average in a different manner, such as by taking the results obtained by subjecting the set to the SUM() function, which we would then divide by the results returned by subjecting the same set to an appropriate COUNT() function.

NOTE: For more information on the SUM() function, see my Database Journal articles Mastering Time: Period - to - Date Aggregations, and Calculated Members: Leveraging Member Properties (in the MDX in Analysis Services series), both of which contain references to the function. In addition, for a detailed look at the COUNT() function, see my article Basic Numeric Functions: The Count() Function, in the MDX Essentials series, also at Database Journal.

Subjecting an empty set to the AVG() function returns the same result we obtain within a scenario where we divide by zero (commonly denoted by "1.INF" appearing within the affected parts of the dataset).

Let's look at some syntax illustrations to further clarify the operation of AVG().

Syntax

Syntactically, the set we use to specify the range of cells for which we wish to calculate the average is placed within the parentheses to the right of AVG, and separated by a comma from the numeric expression we have described.

The syntax is shown in the following string:

Avg(Set, [, Numeric Expression])

The following example expression illustrates a use of the AVG() function. Let's say that information consumers from the FoodMart Logistics department, whose data is housed within the Warehouse cube, come to us with a straightforward request: The consumers wish to see the total national averages for Units Shipped for each of the Category groups composing our Non-Consumable products line. They wish the totals for the member Product Categories to be displayed, with the averages for the three countries in which FoodMart operates, Canada, Mexico and the USA, to be presented side by side.

The basic AVG() function involved, within a core query that presents the information in the manner requested, would be constructed as follows:

WITH
MEMBER
    [Store].[Nat'l Avg]  
AS 
   'AVG(  { [Store].[Store Country].Members}, [Measures].[Units Shipped])'
SELECT
{ [Store].[Store Country].Members, [Store].[Nat'l Avg] } ON COLUMNS,
    {[Product].[Product Family].[Non-Consumable].Children} ON ROWS
FROM 
   [Warehouse]
WHERE [Measures].[Units Shipped]

Our query is simply expressing that we wish to retrieve the "average total units shipped for each of our Non-Consumable product categories, by country of store operation." We are ignoring time - the consumers are aware that the cube contains data from two years, and want the information to be based upon the cube as a whole.

The query we construct returns a dataset similar to that depicted in Table 1.

Canada

Mexico

USA

National Avg

Carousel

 

427

427

Checkout

 

1,445

1,445

Health & Hygiene

 

11,513

11,513

Household

 

20,765

20,765

Periodicals

 

3,515

3,515

Table 1: Results of a Simple AVG() Function, with Empty Cells in the Specified Set

The above example serves to illustrate the treatment, within the AVG() function, of empty cells within the specified set. The results demonstrate clearly that the function excludes empty cells in calculating the average we see.

We will practice the use of the AVG() function in the section that follows. We will start with a relatively simple scenario, and then construct a second, slightly more sophisticated query.



MS SQL Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 




Latest Forum Threads
MS SQL Forum
Topic By Replies Updated
SQL 2005: SSIS: Error using SQL Server credentials poverty 3 August 17th, 07:43 AM
Need help changing table contents nkawtg 1 August 17th, 03:02 AM
SQL Server Memory confifuration bhosalenarayan 2 August 14th, 05:33 AM
SQL Server Primary Key and a Unique Key katty.jonh 2 July 25th, 10:36 AM


















Thanks for your registration, follow us on our social networks to keep up-to-date