OLAP and Data Warehousing – Data Warehousing Technology Review (Part 2)


Welcome to the follow-up of the first part in Data Warehousing Technology Review. Let’s jump right in.

So, the main idea underlying Data Warehousing is that carrying out
the analysis directly on OLTP-systems is ineffective and undesirable
.
Instead, the idea is to extract the necessary data from several OLTP-systems,
convert it, and then place it in a single data source — a Data Warehouse. This
process is also known as the ETL (Extract Transform Load)
process.

During the ETL process, the data is:




  • Cleared (unnecessary or housekeeping information is eliminated);

  • Aggregated (the sum and average values are calculated);

  • Transformed (data type conversion and storage  structures reorganization
    are executed);

  • Unified (from various data sources); and

  • Synchronized (resulting in one instant of time).

Data Warehouse is a ready platform for DSS (Decision Support Systems) and OLAP systems construction as it contains data having the following
properties:

 



  • Internal integrity
  • Although the data is preempted from several independent operating data
    sources (OLTP-systems), they should be joined by unified naming laws,
    values measurement ways, and common attribute sets. It has the highest value in
    those organizations where a number of various systems are exploited and/or
    similar data is presented in different ways (for example, logical value «True»
    may be stored as 1,-1, True or T. Other variants of this include the same metric within
    several systems being named differently – for example «balance»
    and «totals ».)

    At the stage of storage data loading, the above mentioned mismatches should
    be eliminated (i.e. the data should be checked out, cleared and presented
    in a uniform view). The Mandatory execution of such a procedure allows us to avoid
    many potential problems. Also, the analysis of such integrated data can be performed
    much easier.



  • «Single subject» focus
  • OLTP systems contain gigabytes of data that are rarely interesting for analysis
    — addresses, postal codes, record identifiers, etc. Such information
    is typically not loaded into a warehouse, which limits the considered data
    spectrum to a necessary minimum.

    The information in DW-storage is organized according to the primary aspects
    of company business activity (customers, sales, etc.); this distinguishes
    Data Warehouses from operating databases where the data are organized according
    to business processes (invoice shipment, calculations with customers, goods
    shipment, etc.). Subject orientation (due to corresponding data structures)
    promotes both analysis simplification and faster execution of analytical queries.



  • Historic nature
  • OLTP systems usually cover relatively limited time intervals, whereas
    Data Warehouses frequently contain data for up to several decades. This
    makes them an ideal base for revealing trends and long-term tendencies in business.

    From a technical point of view, the historical nature of data means that
    tables in storage contain a «temporary key», or data are distributed
    under several tables, each of them referring to a fixed-time interval.

     



  • «Read-only» availability
  • Depending of business needs, you choose one of three available methods
    to update data in a Warehouse Full Process – Completely restructures
    a cube based on its current definition and then its data recalculation;
    Incremental Update – Adds new data to a partition in the cube and
    aggregations updating. This method does not process changes to a cube’s
    structure (measures, dimensions) or changes to its existing source data.
    An incremental update creates a temporary partition from the new data and
    merges it into an existing partition); and Refresh Data – Clears and
    reloads a cube’s data and recalculates its aggregations. This method is used
    if the cube’s source data has changed but its structure has not.

    The
    hand-held data modification in storage is practically never performed (actually
    Microsoft SQL Server 7.0 OLAP Services cubes may be updated by the user
    (so called «write-back capabilities»), as it results in data
    integrity violation. Typically, it is not required to minimize the time of DW
    data loading, so the storage structure may be optimized for predefined
    queries execution, achieved by relational scheme de-normalization,
    preliminary data aggregation and, «of most use», indexes creation.


Thus, the data loaded in DW-storage are organized in a completely integrated
structure, have natural internal connections, and gain new properties that
add them the status of INFORMATION.




See All Articles by Columnist
Alexzander Nepomnjashiy

Alexzander Nepomnjashiy
Alexzander Nepomnjashiy
I am a Microsoft SQL Server Database Designer for Neo-Systems North-West - a security services, consulting, and training company. I have over eight years of experience in the IT field. I am currently working on several projects which involve the deployment of Microsoft Windows NT Server/Microsoft SQL Server within an enterprise business/financial environment. My typical role in these projects includes extending and improving our clients' corporate ERP systems to manage retail sales data, predict market changes and calculate trends for future market situations (DSS, OLAP). Also among my responsibilities are the design and administration of Microsoft SQL Server 7.0/2000 databases. I am available to work on a contract basis for the following types of projects: - Technical authoring, including books, articles, and white papers; - Network and systems design and analysis; - Database and software development and analysis; - Short-term consulting projects. I hope you find these articles useful. If you have any ideas for future articles (in a field of Microsoft SQL Server databases design, administration, performance optimization), or if you have anything to say about the ones below, please do not hesitate to contact me! Feel free to forward these articles to all interested associates. Thank You!

Latest Articles