Picking
Up Where We Left Off: The Financial Reporting Cube
As we recall, we set
out in Part I of this article to build a basic Financial
Reporting cube. Along the way, we discussed some of the high-level
requirements one would expect to find in standard financial reporting,
including balance sheets, income statements, accounts receivable and accounts
payable reports. Beginning with the creation of a core Expense cube, we
looked at the components of a cube that could be used to partially support a
simple Income Statement, introducing new concepts to our growing knowledge
about Analysis Services as they became useful. Throughout the lesson, we
re-traced many of the common steps we transition in building any cube, as we
began the construction of a sample finance cube for our hypothetical financial
reporting department.
In Part I we performed setup of the core cube, focusing first on the expense
side of the Income Statement. As we noted, the sample database that we used
does not include a "financial statements fact table;" it focuses more
on the revenues side of the equation in the construction of the ubiquitous
Sales cube that it presents (and that virtually all subject matter authors
select) as a handy model for tutorial and other purposes. Our focus is the
construction of a cube in a way that reflects reality - and that presents scenarios
where the data that we want to analyze are scattered in various tables
throughout the database.
As we begin the second
half of the lesson, our next step is to pull the sales data into our Financial
Reporting cube, and show how we can then integrate the expense and revenue
data in a way that revenues are matched with the direct costs of generating
those revenues. (Sales are matched with the costs of sales, i.e.
the "direct" costs of goods that have been sold, etc.), and netted
together to result in a Gross Margin amount. Other Expenses
(often called "fixed" expenses) then need to be subtracted out of the
Gross Margin to come to the overall Net Profit or Loss.
Our objective is to
illustrate the construction of a basis, in the form of an OLAP cube, for
the production of meaningful statements that reflect an organization's
operating results. There are many approaches available to us to accomplish our
objective within MSSQL Server 2000 Analysis Services, using only the FoodMart
2000 database provided as a sample in implementation of MSSQL Services
Analysis Services. The route we take, while perhaps circuitous from the
perspective of many, will represent an attempt to cover various aspects of one
general approach. There are certainly other, perhaps more efficient /
performance oriented methods for combining data from diverse sources, some of
which we will mention as we progress.
Adding the Revenue Component
As we have mentioned
above, financial reporting often pairs Revenues / Sales with the
associated Cost of Goods Sold, to arrive at a Gross Margin. As
many of us learned in school, an Income Statement, upon which this
article will focus, displays the income (in this case, used synonymously with
the term "Gross Margin"), expenses, and net earnings / net loss
for a particular period of time, such as a three-month period (a fiscal
quarter) or an annual period (a fiscal year). While income statements can
become quite detailed and sophisticated, depending upon the needs and reporting
requirements that drive their production, we will use a simple format within
this article, so as to focus more upon the OLAP aspects, and less upon
accounting terminology and nuances.
To simplify, the Gross
Margin represents the gross profit from sales activity. This gross
profit then has operating expenses (the "fixed" costs of
operating the business) applied. After subtracting these operating expenses
(which include overhead, among many other possible costs) from the contribution
made by the Gross Margin, we can determine the Net Income / Net Loss
of the operation. In our simple income statement structure, the Gross
Margin will be assembled from data in the sales fact table (in the form of store
sales and store costs - assumed here to be directly related, and
appropriately matched from an accounting perspective, for the sake of
simplicity). We will then derive Net Income from the combination of the Gross
Margin and the Expenses that we intend to derive from the Expense
cube we created in Part
I of this lesson.
Suffice it to say that
these, as well as far more complex, accounting treatments can be complicating
factors in the creation of a financial reporting cube. Our case, however, is a
great illustration of the steps we need to take to assemble a reporting cube
from multiple tables - a scenario that is pervasive in the business world.
Many large clients with whom I have acted as a consultant, such as brokerages
and large manufacturing concerns, maintain extensive sales data in one or more
data stores and detailed operating expenses in completely separate stores, for
both analysis and operational purposes. The complexity only grows when
multiple operations are combined to consolidate financial operations on a yet
higher level, and so forth. While centralized data sources (such as
warehouses or data marts) are typically constructed to provide a more unified
reporting platform, as often as not, I come across scenarios much like that in
our lesson, where parts of the reporting whole must be drawn from multiple
sources into an OLAP cube. MSSQL Server 2000 Analysis Services provides
excellent options for handling these scenarios, as we will see in this article
and throughout this series.
Creating the Cube - Revenue Focus
Because our intent is
to "marry" the Revenue and Cost of Goods Sold (and the Gross
Margin that results from the combination of the two) with the Operating
Expense that is summarized in the Expense cube, we need to plan a
cube structure for revenues that mirrors the structure, at least with regard to
the elements upon which we seek to report, of the Expense cube. When we
review the Expense cube as we created it in Part I, we see the following structure:
Illustration
1: The Schema for the Expense Cube
Now, let's take a look
at how sales_fact_1997, the table containing the revenue data from which
we will build the core Revenue cube, stacks up to the structure of the Expense
cube, from the perspective of dimensions and measures, by reviewing the table
structure, as displayed in Illustration 2 below.
Illustration 2: The Design View of the sales_fact_1997
Table (FoodMart.mdb)
It becomes readily
apparent that there are differences that must be overcome in creating a Revenue
cube that closely resembles the Expense cube. This is probably why most
of the articles and books that are readily available to us, at this youthful
point in the life of Analysis Services, tend to supply surrogate databases with
tables that make objectives like our own easier to accomplish, or to simply focus
on simplistic objectives that can be accommodated conveniently by the
existing FoodMart 2000 tables. The fact is that the requirement we have
identified in this comparison, a requirement to align divergent sources, is a
reality that most of us face everyday in the real world, for the reasons we
have noted earlier, and a host of other drivers.
My focus in this and
other articles is to provide practical options for dealing with requirements
that approach reality - while constraining the lessons to the sample
databases and other objects that are common to anyone who would be taking the
time to join our sessions - not to simply teach what is "easily possible"
within the FoodMart sample, or to require anyone who participates to
purchase or otherwise obtain a database that has been, in effect, created to
produce the desired results easily. Limitations of the samples aside, it is
still possible to illustrate many concepts in a realistic manner without
reliance upon outside tools or artificial environments.
To handle the situation
confronting us, we might take numerous approaches, but one that I found both
straightforward and "self-contained" within the FoodMart 2000
database and Analysis Services will be presented. To restate the detailed
objectives in building the Revenue cube, we need to create a cube whose
structure resembles the Expense cube with regard to any dimensions or
measures that will exist in the ultimate central cube. The Revenue cube
would, on the surface, appear to need the following fields to mirror the Expense
cube enough to align itself for "apples to apples" reporting
purposes:
- account_id
- amount
- category_id
- store_id
- time_id
We can safely forego
further consideration of currency_id, as we will assume, for the sake of
our lesson, that both tables are denominated in the same currency. The data
fields above will be useful to us in Revenue cube design and creation,
however, and would appear to need to be addressed. Let's discuss each item
briefly to understand the nature of the road ahead more fully.
First, the account_id
field simply does not exist in the sales_fact_1997 table. However, we
know that all the dollar amounts contained therein either represent sales,
or the costs of the sales (Costs of Goods Sold, in our
earlier discussion). The Expense cube relies upon the accounts
table in the FoodMart 2000 database to provide the data used in
the definition of its account dimension. The simple chart of accounts used in
the company's reporting is represented in the table view of the account table,
as partially pictured below:
Illustration 3: Table View of the
account Table (FoodMart2000.mdb)
We can reasonably
assume that the accounts assigned to the summarized cost of sales for a given
date-row is matched to the corresponding sales from the same date. We can also
see, from the chart of accounts in the account table above, the account
to which the sales are assigned (account_id 3100, Gross Sales),
together with the account to which Costs of Sales are assigned (account_id
3200, Cost of Goods Sold). We therefore know the account_id to
assign to each value. However, herein lies our next complication.
The values themselves
appear on the same row within the table, which means we cannot assign a single account_id
to each line to allow us to become comparable with the layout of the Expense
cube, which is generated from a fact table where one account_id exists
per row. A breakout of the data into a more accommodating fact table design
would be desirable, of course, but perhaps not be an option. I encounter more
complicated renditions of this issue within many client engagements,
particularly where the source system happens to be a relational or hybrid
database. Often these clients do not want to hear that they "need a data
warehouse." Whatever the scenario within which this complication is
found, we will explore one way of approaching it, realizing that many others
are within reach, but with an eye toward working within the constraints and
data structures that I have outlined earlier.
Next stop on the list
of alterations for alignment of the Revenue cube to the Expense
cube is the assignment of the name "amount" to the measure,
versus the currently assigned "store_sales" or "store_cost."
Once the values are rearranged in separate lines, we will see that this is
simple enough.
Category_id does not exist in the sales_fact_1997
table, although this would be simple enough to add. We will forgo this item,
however, as we can see (through a review of the category table) that,
although it appears to have been designed to contain ACTUAL, BUDGET, and other
such classifications (and probably does for a year with which we are not
concerned for our present purposes), a quick review of the expense_fact
table reveals that only ACTUAL appears to be used anyway. As we don't stand to
gain much from adding this to our Financial Reporting Cube (keep in mind
that an added dimension often means an exponential increase in processing
requirements, etc., especially as the warehouse / database grows over the
years), we will simply pass on adding it to the modifications list we are
compiling. Finally, the store_id and time_id requirements can be
obtained from the existing sales_fact_1997 table without further
manipulation.
Preparing the Source Data for the Revenue Cube
In
order to meet our outlined objectives of working within the existing FoodMart
2000 database, we will visit the Access environment briefly to establish a
means of creating a suitable "virtual" fact table to support the
desired Revenue cube design.
As most
of us are aware by this point in the series, FoodMart 2000.mdb is located, by
default in the [Install Directory]:\Program Files\Microsoft Analysis
Services\Samples directory; Once we have located the database on our PC's,
we will navigate to it, and open it in MS Access, where we see the layout shown
in Illustration 4 below.
Illustration
4: Selecting the FoodMart 2000 Database