MDX Essentials: Basic Set Functions: The Union() Function
November 10, 2003
About the Series ...
This is the thirteenth article of the series, MDX Essentials. The series is designed to provide hands-on application of the fundamentals of the Multidimensional Expressions (MDX) language, with each tutorial progressively adding features designed to meet specific real-world needs.
For more information about the series in general, as well as the software and systems requirements needed for getting the most out of the lessons included, please see the first article, MDX at First Glance: Introduction to MDX Essentials.
Note: Service Pack 3 updates are assumed for MSSQL Server 2000, MSSQL Server 2000 Analysis Services, and the related Books Online and Samples.
What We Accomplished in our Last Article
In the last article of the series, Basic Set Functions: The Order() Function, we explored the versatile, Order() function, which we find to be useful in many queries - queries that rank from the simplest to the most advanced. The Order() function provides the sorting capabilities we need within MDX; its support of the two general types of sort order, hierarchized and nonhierarchized, make it an important part of our analysis toolsets.
In addition to discussing the purpose of the Order() function, we examined the syntax surrounding its uses, and illustrated its application in practice exercises, providing hands-on exposure to both hierarchized and nonhierarchized order types. Finally, we discussed the results we obtained in each exercise, remarking on the distinguishing characteristics of each.
In this lesson, we will focus our attention on another commonly used MDX tool, the Union() function. The rather straightforward purpose of the Union() function is to combine two sets into one, but the direct and indirect uses to which the function can be put are legion. Suffice it to say that Union() provides important capabilities within MDX, and is thus another valuable part of our analysis toolsets.
Along with an introduction to the Union() function, this lesson will include:
The Union() Function
The Union() function, according to the Analysis Services Books Online, "returns the union of two sets, eliminating duplicates by default." The function provides, moreover, a means of retaining duplicates for those scenarios where this might be desirable. As we shall see, alternative ways of accomplishing unions of sets, both with and without duplicates, are possible in Analysis Services using its extensions to OLE DB MDX.
We will examine the syntax for the Union() function in general, building to the union of two sets after creating an individual query for each, to show that the Union() function does, indeed, generate the results we might expect within the context of an illustrative business need. Next, our practice example will be modified to introduce the possibility of duplication within a Union() function, so that we can examine the means that MDX affords us for managing duplication. Finally, we will present alternative syntax where appropriate, illustrating its use in practice examples that parallel uses for the primary syntax we present, so as to make the comparisons more meaningful.
This will give us a richer understanding of the options we have within the Union() function, together with a feel for the similarities, as well as for the differences, of the results those options afford.
Union() allows us to combine sets. The primary syntax we use to take advantage of Union() affords us an optional flag to force the presentation of duplicates that might occur within the sets we are subjecting to the Union() function. The default is the elimination of duplicates, through the removal of duplicate members from the end ("tail") of the unioned set, as we shall see.
Analysis Services also presents a couple of alternate syntax structures that achieve the effects of Union(), and that carry implicit handling of duplicates within their definitions. As we mentioned earlier, the alternate syntax structures comprise extensions to MDX in OLE DB, and are thus relevant specifically to Analysis Services implementations.
The Union() function acts as a vehicle to allow us to meet myriad common and uncommon business needs, including the obvious need to bring together two sets into a single set, for presentation, computation, and perhaps other purposes. Additional uses I have seen include forced elimination of duplicates between two sets, in scenarios where this default behavior can accomplish an elimination where other attempts fail, or are more costly with regard to operating overhead and so forth.
Let's look at some syntax illustrations to further clarify the operation of Union().
Syntactically, the sets to be combined in the Union() function are placed within the parentheses to the right of Union, and separated by a comma. The optional ALL flag overrides the default behavior of the function with regard to duplicates, and allows duplicates to remain in the set that is produced. The syntax is shown in the following string:
Union(<<Set1>>, <<Set2>>[, ALL])
Alternate forms of the function also exist. These are shown in the following strings:
Alternate Syntax 1:
Alternate Syntax 2:
<<Set1>> + <<Set 2>>