Partition a Cube in Analysis Services
Objective and Business Scenario
Partitions allow us to divide a logical cube
into separate "storage containers" for our data and its
aggregations. All MSAS cubes have at least one partition, which is created
along with the cube automatically. Additional partitions are not a requirement,
and, indeed, many cubes perform well enough to meet our business needs with
only the partition with which they were originally created. Nevertheless, very
large cubes can benefit dramatically from a well-conceived partitioning
strategy, whose benefits include enlarged flexibility in data storage and data
source location, as well as enhanced query performance.
The creation
of separate physical partitions means that we can design cube storage around our
expected uses, particularly our expected access frequency, of the data in the
partitions we design. An example of a common partitioning strategy that I see
on client engagements includes the creation of one partition for each quarter
of the current operating year, together with a separate partition for each
prior year.
Because
cube querying, and reporting in general, is typically at its highest frequency
in the current year and its constituent quarters, we would likely want
to design the "current" partitions with MOLAP storage, along
with the aggregations required to deliver rapid query results. Prior years'
partitions would receive perhaps less aggregation, because the further in the
past a given year from the current year, the less routine demand we would anticipate
for more than summary information. The need for access to the partitions in
general would decrease with age, so the storage design strategy for
those years' data might become more focused upon size conservation, and less upon
query performance.
A cube containing financial
information can, therefore, contain a partition for the data of each past year,
and also partitions for each quarter of the current year. At the end of the
year, the four quarterly partitions can be merged into a single partition for
the year.
The "separate
container" concept means each partition can have different combinations of
options, enabling us to design cube storage strategies appropriate to our
unique business requirements. These combinations include:
-
source data
location;
-
aggregation data location;
-
storage mode;
-
aggregation
design.
Many variations with
regard to partition data sources are possible; we will investigate some
of the alternatives in subsequent articles. We can store partition aggregate
data on the Analysis server computer where the partition is defined (the
default) or on another Analysis server computer, as a remote partition.
Storage mode determines whether the partition's aggregate data is stored
in a database specified in the partition's data source (ROLAP), stored completely
within a cube on an Analysis server computer (MOLAP) or stored within a custom
combination of the two (HOLAP).
Aggregation design, which specifies the number and
contents of the aggregations created for the partition, can be different for
each partition. As we learned in Basic Storage
Design, we can tailor a partition's aggregation design, defining
the balance between query performance and storage utilization, with the Storage
Design Wizard. Moreover, with the Usage-Based Optimization Wizard, we
can perform these same actions, as well as optimizing the aggregation design
based on queries (which we can select) that the organization's information
consumers have previously executed against the partition's cube. As we noted in
our articles surrounding Storage Design
and Usage-Based Optimization, the
aggregations that we design in either case are actually created as a part of
the subsequent cube processing cycle.
Considerations and Comments
For purposes of this
exercise, we will prepare a copy of Budget cube, one of the
sample cubes within the FoodMart 2000 sample database that accompanies
the installation of MSAS. The "clone" will allow us to leave the
original sample cube in its existing condition, as we might have saved various
settings, structures, and so forth, for referential or other reasons. There
will therefore be no need to remember to return and remove settings that we
modify for purposes of the lesson, or otherwise restore the original sample to
its previous state.
While the cloning
process is simple, ensure that you have the authority, access and privileges
needed to accomplish the process, and that the copy of an existing MSAS cube
(which we will, of course, rename) presents no other issues in your
environment. After the session, the clone we have created can be deleted or
used for another purpose, whatever is convenient.
If the sample database
/ Budget cube was not installed, or was removed prior to your beginning
this article, please see the MSAS documentation, including the Books
Online, for the straightforward procedure to restore the database from
the archive (.cab) file containing the samples. As of this writing, a
copy of the archive can be obtained from the installation CD or via download
from the appropriate Microsoft site(s).