SQL Server 2008 and 2008 R2 Integration Services – Dimension Processing Data Flow Destination

In one of our recently published articles, we have described characteristics and operations of Partition Processing Data Flow Destination, which illustrates one of the many aspects of support for Business Intelligence-related features built into SQL Server 2008 R2 Integration Services. In particular, this SSIS component is capable of performing direct data load from a fact table into an arbitrarily chosen partition of a measure within an Analysis Services cube. In this presentation, we will present its counterpart known as Dimension Processing Data Flow Destination, which provides equivalent functionality in regard to dimensions.

In case you missed our earlier introduction to Analysis Services terminology, here is its brief summary. Cubes are data structures that consist of measures (most commonly, numeric values, which quantify events that need to be analyzed) and dimensions (providing context for these events and corresponding to categories according to which data is aggregated).

Dimensions, in turn, are described by their attributes. An example illustrating this paradigm (which we will employ in our demonstration) is provided by the AdventureWorksDW2008R2 sample data warehouse database, where sales amounts or order quantities represent measures, while customers, geographical locations, or dates constitute dimensions. In order to optimize analysis (by the client) and data processing (on the SQL Server side), multiple, related measures (including their aggregates, arranged based on appropriately chosen dimensions) are combined into measure groups, while attributes of dimensions form hierarchies. Partitions function as storage units for measure groups, with at least one partition allocated to each group.

Dimension processing involves reading records stored in a relational data source, mapping them to corresponding dimension attributes, and populating the latter in the manner described by their properties. In order to properly configure the Dimension Processing Data Flow Destination, you need to identify the correlation between columns in relational tables and their Analysis Services equivalents. This, in turn, requires familiarity with a few additional concepts that dictate the way dimension attributes are defined.

As we mentioned earlier, attributes facilitate categorization of dimensions (in geometrical terms, individual attributes function as coordinates of their dimensions in multidimensional space occupied by the cube they are part of). For example, the most common use of dimension is to provide chronological context of events, which typically involves such attributes as day, month, quarter, year, or day of the week.

Similar to relational structures, each dimension has a key attribute (which usually corresponds to the primary key in the underlying table of a relational database from which the dimension is derived). However, in addition, each attribute has also the KeyColumns property. Its role is to designate a column (or columns) in the relational database table with which the attribute is associated. In some cases this might result in rather confusing or, at best, an unclear naming convention – in particular, when values are numeric or when a composite key (with an attribute bound to multiple columns) is used.

These issues can be remediated by employing the NameColumn property, which contains either a reference to a more descriptive column in the source table or to an expression that yields a user-friendly outcome representing the numeric value.

Finally, you also have an option of assigning values to the attribute by leveraging its ValueColumn property (if the ones obtained from ColumnKey are not appropriate). Each of these properties can be managed using the Properties window of individual attributes, accessible from the Dimension Structure tab of the Designer interface (which, in turn, you can invoke by selecting View Designer entry in the context sensitive menu of any arbitrarily chosen dimension in the Solution Explorer window of Business Intelligence Development Studio).

We will step through two very straightforward examples illustrating the use of Dimension Processing Data Flow Destination. For the sake of simplicity, we will leverage predefined Analysis Services structures referencing data representing business operations of AdventureWorks – a fictitious company manufacturing and selling bicycles (with sample databases downloadable from the CodePlex web site, where you will also find a detailed procedure describing their deployment). The resulting Analysis Services installation includes Source Currency and Destination Currency dimensions, whose processing we will emulate.

To examine the properties of the first of them, select the View Designer entry from the context sensitive menu of the Source Currency.dim node in the Dimensions folder in the Solution Explorer window of Business Intelligence Development Studio. Once the Designer window is displayed, you will find that it consists of two attributes – Source Currency and Source Currency Code. Click on the first of them, switch to its Partitions tab, and expand the KeyColumns entry. As its content will reveal, the underlying TableID and ColumnID are, respectively, Currency and CurrencyName (similarly, Currency and CurrencyKey are used by the Source Currency Code attribute). Note that there is no Currency table in the corresponding AdventureWorks2008R2 relational database, since this name represents a DataTable structure defined as part of Adventure Works.cube design. You can easily confirm this by launching the Designer window displaying the cube’s content. Once you locate the target object, you will be able to determine that its QueryDefinition property is set to:

SELECT CurrencyKey, CurrencyAlternateKey, CurrencyName
FROM DimCurrency

In order to implement the Dimension Processing Data Flow Destination, we will use the above T-SQL query as the data source and map the resulting columns against the properties of Source Currency dimension attributes.

To start, in the Business Intelligence Development Studio, create a new project based on the Integration Services template. Switch to the Data Flow tab and activate the hyperlink displayed in the center of the main window stating ‘No Data Flow task have been added to this package’. Click here to add a new Data Flow task.

With the Data Flow task in place and active, drag an OLE DB Source component from the Toolbox onto the Designer interface. Display its Editor window by selecting Edit… item from its context-sensitive menu. In the Connection Manager section, click on the New… command button and point to the target instance of the AdventureWorksDW2008R2 database. After you have defined a new OLE DB connection manager, ensure that SQL command entry appears in the Data access mode listbox, and copy the T-SQL SELECT statement listed above into the SQL command text window. Preview the input data to verify that your settings are correct and close the Editor window.

Next, drag the Dimension Processing Destination from the Toolbox onto the Designer window; position it directly underneath the OLE DB Source component, and extend the Data Flow Path so the two of them become connected. Launch the Dimension Processing Destination Editor using Edit… context sensitive menu entry. Click on the New… command button to create an Analysis Services Connection Manager for the target cube (containing our designated dimension), which resides either on an Analysis Services instance or is part of a solution including an Analysis Services project. Select Source Currency entry along with the Full Processing method (depending on your requirements, you can choose Add (incremental), Full, or Data only), and switch to the Mappings section, where you need to match input columns (derived from the OLE DB Source and pointing, in our case, to the dbo.DimCurrency table in the AdventureWorksDW2008R2 database) with equivalent properties of dimension attributes:

  • Currency.CurrencyKey maps to Source Currency Code.Key
  • Currency.CurrencyAlternateKey maps to Source Currency Code.Name
  • Currency.CurrencyName maps to Source Currency.Key

The final (but optional) configuration step involves customizing the Advanced section, where you can set such parameters as Key error action (ConvertToUnknown or DiscardRecord), Processing error limit (Ignore errors or Stop on error after specific Number of errors), desired behavior following Specific error conditions (in case of such issues as Key not found, Duplicate key, Null key converted to unknown, or Null key not allowed), as well as assign Error log path. This yields a fully functional SSIS package.

We can apply the same approach to process the Destination Currency dimension. In this case, the KeyColumns of its Destination Currency, Destination Currency Code and Locale attributes reference Destination Currency TableID, which is defined (as you can verify by examining the cube design) as:

CurrencyKey, CurrencyAlternateKey, CurrencyName, 
   CASE WHEN CurrencyAlternateKey = 'ARS' THEN '11274' 
      WHEN CurrencyAlternateKey = 'AUD' THEN '3081' 
      WHEN CurrencyAlternateKey = 'DEM' THEN '1031' 
      WHEN CurrencyAlternateKey = 'GBP' THEN '2057' 
      WHEN CurrencyAlternateKey = 'MXN' THEN '2058' 
      WHEN CurrencyAlternateKey = 'CAD' THEN '4105' 
      WHEN CurrencyAlternateKey = 'SAR' THEN '1025' 
      WHEN CurrencyAlternateKey = 'EUR' THEN '2067' 
      WHEN CurrencyAlternateKey = 'FRF' THEN '1036' 
      WHEN CurrencyAlternateKey = 'BRL' THEN '1046' 
      WHEN CurrencyAlternateKey = 'JPY' THEN '1041' 
      WHEN CurrencyAlternateKey = 'CNY' THEN '2052' 
      WHEN CurrencyAlternateKey = 'VEB' THEN '16394' 
      WHEN CurrencyAlternateKey = 'USD' THEN '1033' 
FROM dbo.DimCurrency
WHERE (CurrencyKey IN (SELECT DISTINCT CurrencyKey
                  FROM dbo.FactCurrencyRate))

Once again, we will use this T-SQL statement to populate the SQL Command property of OLE DB Source in our sample package. In the Dimension Processing Destination Editor interface, select the Destination Currency dimension in the Connection Manager section. Next, switch to the Mapping section and configure it according to the following listing:

  • CurrencyName maps to Destination Currency.Key
  • CurrencyKey maps to Destination Currency Code.Key
  • CurrencyAlternateKey maps to Destination Currency Code.Name
  • LCID maps to Locale.Key

Finally, configure custom error settings in the Advanced section and execute the package to verify its functionality.

See all articles by Marcin Policht

Marcin Policht
Marcin Policht
Being a long time reader, I'm proud to join the crowd of technology gurus gathered here. I have a fair share of Microsoft exams behind me, a couple of acronyms after my name - MCSE, MCSD, and MCT, decent familiarity with majority of MS BackOffice products (SMS, SQL, Exchange, IIS), programming and scripting languages (VB, C++, VBScript with wsh) and several years of practical experience with Windows environment administration and engineering. My focus these days is on SQL Server and Windows 2000, and I'll attempt to share the most interesting experiences with these products.

Latest Articles