Building a Data Warehouse Blueprint for Success


One of the most integral components and critical success factors of any enterprise data warehousing initiative is the Solutions Architecture document, a high-level conceptual model of a data warehousing solution. Learn why this collaborative effort that addresses the needs of all major stakeholders, including both the business units and Information Technology (IT), is essential.

One
of the most integral components and critical success factors of any Data
Warehousing initiative is the Solutions Architecture document.

It is the high-level conceptual model
of a data warehousing solution. The solutions architecture creates the roadmap
of the possible solution, providing a context for the planning and construction
of a comprehensive future state environment. It essentially sets the framework
for the development of the data warehousing solution. It is a collaborative
document that addresses the needs of all major stakeholders including both the
business units and the Information Technology (IT). It transfers all of the
ideas, concerns, issues and pain points associated with the current
infrastructure and project from business questions to possible technologically
based answers. It reinforces the scope of the project.

Therefore, the Solutions Architecture
document should not be an academic exercise. It must be a high-level design of
the data warehousing solution specific to the project and should include the
following items.

ETL Framework

Using a complete set of business
requirements, the ETL framework will be used to build solutions to cleanse,
standardize and integrate data from disparate sources into the data warehouse
and/or operational data store.

Staging Area

The staging area is the information
hub primarily used in support of the ETL framework were source data goes
through a series of enrichment processes for the purpose of populating the data
warehouse and/or the operational data store.

Operational Data Store

The operational data store (ODS) is a
set of logically related data structures within a database that primarily
includes integrated, subject oriented, volatile, non-historical atomic data
used specifically in support of transactional processing and timely operational
reporting.

Data Warehouse

The data warehouse is the opposite of
the ODS in that it is a database that contains integrated, historical,
non-volatile, organizational level, atomic data used primarily in downstream
decision support initiatives such as business intelligence.

Data Marts

A data mart is a subject oriented set
of organizational level data created to support analytical reporting
requirements for a specific business unit.

Business Intelligence

Business Intelligence is a logical
grouping of applications and repositories that are specifically architected for
information delivery to the business community.

Steps in building a data warehouse solutions architecture document

As the saying goes, "a picture is
worth a thousand words", this is a great way to start the process. Once a
diagram has been developed, building the outline and related detail become
almost organic. This is especially true with the use of industry standard
enterprise architectural frameworks.

However, the architects on the project
team must choose a format for the solutions architecture document. In my years
of doing data warehousing, two formats are quite common. It can be a high-level
"how-to" guide with the requisite definitions of each of the data
warehouse components. Alternatively, it can be a "roadmap" with
definitions of each data warehouse component along with the descriptions of
related standards and guidelines that need to be adhered to within the Software
Development Life Cycle (SDLC).

The format selection is usually based
on the culture and the maturity of the organization and the team must fully
understand this aspect of the project in order to make the right selection.

The main thing is that independent of
the format chosen, the building of the solutions architecture is a
collaborative effort where the major project stakeholders from the executive
team, business units and IT must provide input into the document to create a
meaningful and usable asset to the organization.

In Today’s World

However, the audience for this
document has dramatically changed over the years. In the mid to late nineties,
this document was created only for the executive and the in-house project
teams. Since the 21st century, IT has gone through a technological
and cultural evolution of sorts and the ways things are done are quite
different now. The solutions architecture document now has a different audience
that it speaks to. This audience is a project team that is not quite
co-located, meaning that the team now includes the executive team, the in-house
project team, the outsourced and the offshore project teams.

Since the architects are the primary
authors of this document, it is therefore their responsibility to complete a
number of exercises specifically created to understand the organizational
communication plans, rules of engagement and project management procedures. It
may seem like this is too much of an extra effort, however if the whole idea
behind the document is to produce a workable solution that the entire project
team thinks is realistic and the executive team has bought into, it’s worth the
extra hours of time spent!

Don’t ever forget that the solutions
architecture document must be able to stand with other project assets within
the organization and also provide a comprehensive guide to the build out of the
data warehousing environment and its related components.

And yes! Of course, I have a couple of
stories that that I would like to share that illustrate these points!

The one that did not work

A few years ago, I was assigned to a
data warehousing project that would completely modernize an aging
infrastructure. My role was that of the data architect, I was responsible for
the design of the entire data layer. The project charter and scope had been
socialized and accepted. So, my thinking is that the easy part is done now the
real work begins. I just did not anticipate how real and tough it would be!

The approach was to have the project
manager author all documents produced from the project with the team (including
the architects) contribute as required in the project plan.

Well what do you think happened?

The project manager completed the
entire document and asked each of the architects to contribute only a diagram
and set of definitions for each component within the diagram. During the review
with the major stakeholders, I just remembered noticing lots of disinterested
looks, few questions being asked except from the IT department who were very
angry that they were not included in the creation of the document. The sections
of the document related to the data layer (including the data warehouse,
staging area, ODS and data marts) was criticized as not being based in reality,
not taking the existing standards into consideration. The whole document was
dismissed as an academic exercise by the IT review team. Needless to say, that
review did not go well and after the meeting was over, I basically walked out
of the conference room feeling like I was blindsided by a train that I did not
see coming!

So what finally happened to that
document? It succeeded in being an item that was checked off on a project plan
as being completed. Last I recall, it is now a very expensive doorstop or bookshelf
filler in somebody’s office! Not a good ending for such an important asset to
the project and the organization.

The one that did work

Recently, I have been working on
assignment as a data architect within a financial company. This project was
scoped to modernize and centralize an IT infrastructure that was extended over
the IT and business landscape. Meaning, IT supported some of the database
applications and the business units supported others.

My role was to complete sections of
the Solutions Architecture document related to the data layer. The approach
used here was very different to my prior assignment. The project management
team assigned the completion of the Solutions Architecture document to the
architecture team!

We agreed that with the completion of
each component of the solutions architecture, we as architects would socialize,
review, address issues, solicit input and provide clarifications with all major
stakeholders on the IT and business teams fully aware that this document would
be used by a project team that was not co-located and included offshore members
as well.

As we stepped through the process and
with every review, it became quite clear that this document had
"legs" to stand up with all other assets of the project as well as a
common belief being shared that the information being produced will be used
extensively as the data warehousing environment is rolled out over time! That’s
a much better ending than the last one! I can assure you this document will NOT
be the dusty pile of papers on any bookshelf anywhere at any time!

Additional Resources

Rensselaer Data Warehouse Project

Data Warehouse Architecture

»


See All Articles by Columnist

Denise Rogers

Denise Rogers
Denise Rogers
enise is a data architect with several years experience in the deployment of data architectures at major healthcare insurance companies and state government. She is a certified PMP that has designed and deployed a number of data solutions ranging from transactional to decision support within various architectural and project management frameworks. She has also spearheaded a number of efforts related to database environment assessments, capacity planning and performance tuning. In the past, Denise has held several user group positions including participation in International DB2 User Group (IDUG) and internal architectural groups. She has presented solutions to division heads at the within state government as well as conducted a number of company related training and information sharing sessions on database performance tuning techniques, best practices, etc . She has also mentored and coached project team members at various levels of expertise including university recruits, business users and senior IT staff. Denise graduated from Greater Hartford Community College Cum Laude in 1983 with an Associate’s degree in Management Information Systems.

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles