Another Approach to Local Cube Design and Creation
Our second approach to
creating a local cube will consist of a return to Microsoft Query, to
which we were briefly introduced in Introduction
to Local Cubes,
when we first began the creation of our PivotTable report, in the
creation of the database and connection phases of the PivotTable report
setup. We will use Microsoft Query to create a subset of a
relational data source, and then we will finish the cube creation process
using the OLAP Cube Wizard.
Start with a Relational Data Subset
Many situations arise
where we have neither a server-based cube nor an Analysis Server
available, but in which we can access a relational data source.
For example, we may become involved in the early phases of transitioning an
enterprise to the use of local cubes for remote users (say a team of
sales people or managers who need analytical capabilities on their laptops). We
may need to prototype an eventual solution by outfitting a representative test
group with basic cubes, in order to obtain useful feedback regarding the
contemplated design for a server-based generation and distribution
environment. After obtaining the specifics from the information consumers, we
can more effectively place the final design in production, and allow the server
to generate the cubes in automated fashion for field deployment.
The capability to
design and create cubes directly from a relational data source provides
an excellent opportunity to perform "proof of concept" exercises with
a reality-based model - a model that will often "prompt"
consumers to specify their requirements more clearly. Users that are
unfamiliar with OLAP and its uses can see the model in action, and can relate
it to their day-to-day needs, and signal suggested improvements from the
standpoint of both the overall cube structure (including the dimensions,
measures and other components that are critical to effective design), and
interface and usability considerations of a more mechanical nature.
Furthermore, local
cubes offer the capability of "parallel testing" proposed model changes
(without actually making them to a given production cube), keeping the
production cube build process operating in an insulated manner. Such "paralleling"
is useful both as a troubleshooting and continuous improvement process. For
example, many clients with whom I have worked in the past have asked me to investigate
issues with cube size and performance. The cube under consideration was often
the production model, so we were not afforded the luxury of simply taking it
offline to modify and tune its build cycle.
Using a "zero
based" approach to constructing a new cube (a rapid process using a local
prototype), based upon the currently desired reporting output of the production
cube, many issues were subsequently resolved through a close investigation of
what data in the cube was actually being used in the field; redundancies and
abandoned elements often came to light through a rapid redesign that was based
on current needs. Moreover, examination of the evolutionary "layers"
(which were gradually added onto the original cube structure as consumer needs
became known), at which summarization was being attempted, often led to the
derivation of a creation cycle that was more finely attuned to the effective
use of preprocessing and aggregation at the RDBMS level, where appropriate. In
situations such as these, and in many others, the generation of a local cube
from the relational source tables can be highly useful to the
organization.
As we have mentioned,
we begin the direct-from-relational creation of a cube by first deriving a subset
of the relational data. Preparation, as we shall see, involves 1) setup of
a data source for the relational database, and then 2) creation
of the query that specifies the selection criteria for the subset.
A query tool is needed to precisely identify the parameters of that subset, and
once again, MS Query provides a straightforward means of achieving the
creation of this subset (or rowset, as it is called when using
the SQL - based MS Query tool).
MS Query allows us to perform joins
between the dimension and fact tables, to create calculated columns, and to
otherwise prepare a "virtual warehouse / mart" on a miniature scale
from which to build a cube. After we establish the rowset, we use the OLAP
Cube Wizard to design the cube and its member objects (including measures,
dimensions, hierarchical levels and so forth).