OLAP and Data Warehousing - Data Warehousing Solution Architecture (Part 1)July 24, 2002
Data Warehouse is not a software product but is instead the architecture used
for enterprise-level business intelligence information systems organization.
Data Warehouse storage (database) is a central part of such systems; however,
a full-function Data Warehouse solution consists of complementary products
and technologies. The minimal typical components of a decision support system
are:
Data acquisition is the task of bringing data from everywhere to the
data warehouse. Most businesses have several operational systems that handle
the organization's day-to-day processing. These systems serve as the data
source for the warehouse. The systems might reside on a mainframe, in a
client/server database application, in a third-party application with a
proprietary data store, within desktop applications such as spreadsheets
and database applications, or any combination of these. The challenge is
to identify the data sources and develop a solution for extracting and
delivering the data to the warehouse in a timely, scheduled manner.
After collecting the data, you need to transform it. In an ideal organization,
all systems would use the same set of codes and definitions for all data
elements. Data transformation is the cleansing and validation of data for
accuracy and ensuring that all values conform to a standard definition.
After these data transformation tasks are complete, you can add the data
to the warehouse
ETL tools should be reliable and efficient. They should also have known
flexibility, submitting the possibility of configuration and programming
with usage of transformation script language. One more essential
requirement to these tools -- the possibility to access a broad range of data
sources usually provided by the general purpose data access interfaces such as ODBC, OLE DB, JDBC, etc. The industrial software giants
only recently have begun to release products possessing the described
functionality; therefore, for many years it was necessary to use proprietary
solutions oriented at the concrete system, or customized products from small companies such as Prism Solutions, ETI, Carleton and similar. Products released recently from large corporations include Visual Warehouse (from
IBM) and Data Transformation Services (from Microsoft).
The repository provides information about the location and nature of
operational data, which is critical in the construction of a data warehouse.
It acts as a guide to the warehouse data, storing information necessary
to define the migration environment, mappings of sources to targets, translation
requirements, business rules, and selection criteria to build the warehouse.
|