Business Intelligence Terminology 101
January 7, 2011
Turning data into information and information into knowledge for the purposes of insight and fact-based decision support is the purpose of business intelligence. When delving into business intelligence for the first time, there are some important terms to understand. Deanna Dicken walks through some of the basic terminology needed to get you started.
BIDS is short for Business Intelligence Development Studio. This Integrated Development Environment (IDE) is Visual Studio 2008 with some additional project types for building SQL Server Integration Services (SSIS), SQL Server Analysis Services (SSAS), and SQL Server Reporting Services (SSRS) solutions.
At the heart of multidimensional databases (see OLAP below) are cubes. Cubes represent the many perspectives (or dimensions) with which you may want to view the data. Cubes contain levels, measures, dimensions, and hierarchies.
A data mart is typically a smaller, more focused version of a data warehouse (see below). Data marts bring together data for a single business unit or user group for the purposes of analysis.
For instance, you might create a data mart to analyze the operational processing times by step and product for a single business unit. This might help management glean knowledge of long running steps or maybe justify a new piece of equipment.
Data mining is a process by which algorithms and complex searches are applied to data in an attempt to discover patterns and relationships within the data.
When data from one or more data sources are brought together in a central store related to a common subject for the purpose of analysis, this is a data warehousing. The data store is called a data warehouse.
For instance, you might collect sales and product data from all lines of business within the company into a data warehouse to provide for analyzing product sales over time, geographic location and product or product category.
Dimensions can be thought of as perspectives on a subject. Let's say your subject of interest is sales. One way of analyzing sales is by time. How many widgets did we sell last month? Last quarter? Last year? This is a time dimension.
Maybe you want to look at your sales data by geography. What were our sales in Beijing? China? How about all of Asia? This is a geography dimension.
Product or product category might be useful. What was my best selling product? Are there any product categories that aren't hitting their sales targets? This is a product dimension.
Having multiple perspectives or dimensions with which to view the subject is very powerful. This allows you to put them together and get answers to even more detailed questions. What was the worst selling product category in Beijing last quarter?
Short for Extract-Transform-and Load, ETL is the means by which data is pulled from a data source (extracted), scrubbed (transformed), and written (loaded) into the destination database.
A fact table in an OLAP database is essentially the subject of the analysis. The dimensions can be thought of as foreign keys to this table or attributes of the fact. For instance, as described above, you might have sales as your fact table or subject. Attributes of a sale would be things like date and time, customer, product, geographical location, sales clerk, etc. All these characteristics of the sale could be your dimension tables.
Data within a dimension can often be represented through one or more hierarchies. These are parent-child relationships within a dimension. For instance, in a time dimension, a year is a parent to quarter, a quarter is a parent to month, and so on. As you see here the parent-child relationships can nest. These hierarchies then make it possible to drill up or down within a dimension.
A KPI is a Key Performance Indicator. These indicators are measures that are important to the business, typically relating to the overall health of its operation. An example would be profit margin. Profit margin can be trended over time. A significant deviation (for the worst) from expected values can alert management to drill into those measures of profit margin to see which might be causing the issue, allowing for a quick response.
Levels are an implementation of the hierarchies that represent the level of detail viewed in a dimension. In the example above for a time dimension's hierarchies, we had levels for year, quarter, and month.
Numerical data stored on a fact table is a measure. For instance, regarding sales fact, we might have a measure for cost, gross profit, maybe even order fulfillment time. These measures can then be summarized for each level of analysis.
When data is pre-calculated and stored in a cube, this is referred to as MOLAP or Multi-dimensional Online Analytical Processing.
Online Analytical Processing is a strategy for summarizing and analyzing large amounts of data using a multi-dimensional data model. Implementations of OLAP include MOLAP and ROLAP.
Unlike MOLAP, ROLAP (Relational Online Analytical Processing) does not require the data to be pre-summarized or stored in a multi-dimensional store. Instead, ROLAP allows for analysis and on-the-fly computations to occur against the relational database.
A snowflake schema refers to the arrangement of tables in a multi-dimensional data model where dimensions hanging off the fact table also have tables hanging off of them. The links in the data model start to resemble a snowflake with multiple branches hanging off the dimension tables and not the fact table. This occurs with normalized models in a multi-dimensional data store.
With a star schema for multidimensional modeling, the fact table sits in the middle of the diagram and all the dimensions hang directly off of the fact table thus resembling a star. The data in the dimension tables is denormalized to allow all other tables to link directly to the fact table.
SQL Server Analysis Services
Analysis Services is an enterprise level product that ships with SQL Server as a tool to help businesses gain insight into the organization's data and to perform predictive analysis.
I hope that after reading this article you have enough of an idea of basic business intelligence terminology to understand what it takes to design and build your first business intelligence solution.
For More Information