MDX Essentials: Basic Set Functions: Subset Functions: The Subset() Function
July 12, 2004
The Subset() Function
According to the Analysis Services Books Online, the Subset() function "returns «Count» tuples from «Set» as a set, starting at position «Start»." Once we recover from the seemingly redundant explanation, which is in fact a fairly clear description of how the function operates, we can see that Subset() works a little like the substring functionality that appears in various programming environments and query languages. We are focusing on tuples and their positions relative to each other, as opposed to characters, but the conceptual similarity is easy to recognize.
As we shall see, the order of the set elements remains intact within the operation of the function. We control the "range" of the function by providing a count, similar to the way we control the "reach" we obtain in other MDX functions, and similar to the way we use the numeric expression in the Head() and Tail() functions that we explored in our previous two articles. The difference is that our "starting point" is not fixed at the left/beginning or right/ending "side" of the set, as it is for the Head() and Tail() functions, respectively (which behave a bit like the LEFT and RIGHT functions, we might note, in the string-based analogy we cited earlier). We can tell Subset() the exact position at which to begin its work, and the number of elements to capture, by providing the associated «Start» and «Count» specifications.
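The substring analogy can be made concrete with a small Python sketch. This is only an illustration of the idea, not MDX: Python slicing stands in for both a substring operation and for Subset(), and the sample values are invented for the example.

```python
# Illustration only: Python slices stand in for a substring function
# and for Subset(). The sample values are made up for this sketch.
text = "ABCDEF"
start, count = 2, 3

# Substring: `count` characters beginning at zero-based position `start`.
substring = text[start:start + count]

# The same idea applied to an ordered set of elements instead of characters.
elements = ["A", "B", "C", "D", "E", "F"]
selected = elements[start:start + count]

print(substring)  # CDE
print(selected)   # ['C', 'D', 'E']
```

In both cases the original order is preserved, and only the starting position and the count decide what is returned.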
We will examine the syntax for the Subset() function, then look at its behavior based upon different «Start» and «Count» input we might provide. Next, we will undertake practice examples constructed to support hypothetical business needs that illustrate uses for the function. This will allow us to activate what we explore in the Discussion and Syntax sections, by getting some hands-on exposure in creating expressions that leverage the function.
To restate our initial explanation of its operation, the Subset() function iterates through the elements of the specified set and constructs a set by adding the members in the directed range to the new set. The Subset() function starts at a point, or an index («Start» in the syntax model we show in the Syntax section below) that we designate within a set. The function acts to return a range of m tuples from a specified set. We specify m via the «Count» input we provide. The function "counts over" this number of members, "lassoing" them into selection for the new set it creates.
In a manner dissimilar to what we saw for the Head() and Tail() functions in the two immediately previous articles, Subset() manages the absence of a specified numeric expression for «Count» by "defaulting" to include all elements from the «Start» position to the end of the set. (Recall that the Head() and Tail() functions handled the absence of a specified numeric expression by substituting "1" as the range of elements "over" from the beginning and end of the specified set, respectively.)
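The contrast in default behavior can be sketched with a rough Python model. The functions `subset`, `head`, and `tail` below are hypothetical stand-ins written for this illustration (they are not MDX and not part of any library), and the model assumes a non-negative «Start».

```python
from typing import List, Optional

def subset(items: List[str], start: int, count: Optional[int] = None) -> List[str]:
    """Model of Subset(): an omitted count means 'through the end of the set'."""
    if count is None:
        return items[start:]
    return items[start:start + max(count, 0)]

def head(items: List[str], count: int = 1) -> List[str]:
    """Model of Head(): an omitted count defaults to 1."""
    return items[:max(count, 0)]

def tail(items: List[str], count: int = 1) -> List[str]:
    """Model of Tail(): an omitted count defaults to 1."""
    return items[max(len(items) - max(count, 0), 0):]

members = ["A", "B", "C", "D", "E"]
print(subset(members, 2))  # ['C', 'D', 'E']
print(head(members))       # ['A']
print(tail(members))       # ['E']
```

With no count supplied, the Subset() model runs to the end of the set, while the Head() and Tail() models each return a single element.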
Let's look at some syntax illustrations to further clarify the operation of Subset().
Syntactically, the set upon which we seek to perform the Subset operation is specified within the parentheses to the right of Subset, just as we saw with the Head() and Tail() functions in our previous articles. The syntax is shown in the following string.
Subset(«Set», «Start» [, «Count»])
We follow «Set», the set specification, with a comma, which is followed by «Start», the starting position for the operation. «Start» is, in turn, optionally followed by «Count», the count of tuples in the selection range. As we have mentioned, omitting the count value means that the function simply selects all tuples from «Start» to the end of the set. «Start» is zero-based: "0" represents the first member in the set, "1" the second, and so forth.
When the specified «Count» is greater than the number of tuples remaining in the set from the «Start» position onward, everything from «Start» to the end of the set is returned. Moreover, a «Count» of less than 1 results in an empty set (indicated, for example, by a message in the MDX Sample Application that, because "the cellset ... contains no positions," it is unable to display a results dataset).
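These boundary cases can be checked against a small Python model. The `subset` function here is a hypothetical sketch of the behavior described in this section, not MDX itself, and it assumes a non-negative, zero-based «Start».

```python
from typing import List, Optional

def subset(items: List[str], start: int, count: Optional[int] = None) -> List[str]:
    """Rough model of Subset() as described in the text (zero-based start)."""
    if count is None:
        # No count: everything from start to the end of the set.
        return items[start:]
    if count < 1:
        # A count below 1 yields an empty set.
        return []
    # A count that overshoots the end simply stops at the end of the set.
    return items[start:start + count]

members = ["A", "B", "C", "D", "E"]
print(subset(members, 3, 10))  # ['D', 'E']
print(subset(members, 1, 0))   # []
print(subset(members, 1, -2))  # []
```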
The following example expression illustrates the use of the Subset() function, within a context similar to that of an expression we used in discussing the syntax of the Head() and Tail() functions in the immediately preceding two articles. This will illustrate the similarities in the construction of the functions, while exposing the differences in the datasets that they return.
Let's say, again, that a group of corporate-level information consumers within the FoodMart organization wish to see the total Profits by U.S. Warehouse-Country for the last three Quarters of 1998. While we could easily accomplish this with the Tail() function, whose specialty is, after all, returning the "last of" anything, we can accomplish the same results with the Subset() function.
The basic Subset() function, which would specify the "last three Quarters" (the "children" of year 1998) portion of the required result dataset, would be constructed as follows:
Subset(.Children, 1, 3)
This expression would be equivalent to the expression from our last article, Tail(.Children, 3), and would return an identical result dataset. Assuming that we placed the Subset() function above within the column axis definition of a query, and the Warehouse-Country information defined the row axis, our returned dataset would resemble that shown in Table 1.
Table 1: Results Dataset, with Subset() Defining Columns
Just as we saw with the Tail() function in our previous session, Subset() has the effect of compactly expressing that we wish to display the Quarters as shown. The "starting point" is Q2 (position "1", since «Start» is zero-based and Q1 occupies position "0"). From there, the function takes three elements of the set (the Quarters of 1998), in their natural order: Q2, Q3 and Q4.
The primary difference between the two functions, as we can readily see, is that the Subset() function can be used a bit more flexibly. It allows us to specify a "starting point" anywhere in a given set, together with a "range" of selection, whereas the Head() and Tail() functions offer the same range selection only from a fixed starting point at the beginning or end of the set, respectively.
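The equivalence with Tail() in this example, and the extra flexibility of choosing a starting point, can be sketched in Python. The quarter labels and the `subset` and `tail` helpers are illustrative stand-ins for the MDX set and functions, not the actual cube members or API.

```python
from typing import List, Optional

def subset(items: List[str], start: int, count: Optional[int] = None) -> List[str]:
    """Model of Subset() with a zero-based start (illustrative only)."""
    if count is None:
        return items[start:]
    return items[start:start + max(count, 0)]

def tail(items: List[str], count: int = 1) -> List[str]:
    """Model of Tail(): the last `count` elements, in their original order."""
    return items[max(len(items) - max(count, 0), 0):]

# Stand-in for the four Quarters that are the children of year 1998.
quarters = ["Q1", "Q2", "Q3", "Q4"]

print(subset(quarters, 1, 3))  # ['Q2', 'Q3', 'Q4']
print(tail(quarters, 3))       # ['Q2', 'Q3', 'Q4']

# A selection from the middle of the set, which a single Head() or
# Tail() call cannot express on its own.
print(subset(quarters, 1, 2))  # ['Q2', 'Q3']
```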
As was the case with the Tail() and Head() functions, Subset() can be particularly useful in working with the Time dimension. Moreover, the same efficiencies we saw with the other subset functions can be obtained when Subset() is used in conjunction with "family" functions, as with the .Children function above. More compact, reusable coding is often the result.
NOTE: For information surrounding the .Children function, see MDX Member Functions: The "Family" Functions.
We will practice the use of the Subset() function in the section that follows.