Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

MS SQL

Posted Jan 12, 2006

SQL Server 2005 Integration Services - Part 16

By Marcin Policht

Data Flow Script Component

In our previous article of this series, we presented a quick introduction to Data Flow Script Component of SQL Server 2005 Integration Services and followed with more in-depth discussion about its debugging features. However, our coverage would not be complete without spending some time on exploring its versatility, which gives it the unique ability to function as a data source, destination, or transformation. Our goal is to review these three types of configuration as well as provide examples of their implementation.

As we mentioned before, you are prompted to choose one of the Script Component types as soon as you add the Script Component to the Data Flow area of the SSIS Designer interface (via Select Script Component Type dialog box). Your decision affects settings displayed in the Script Transformation Editor dialog box (primarily, the type of entries available in the Inputs and Outputs section and presence of Input Columns). In addition, the choice has impact on automatically generated Visual Basic .NET code, which you can view via the Script section (and obviously determines the content of custom code you will create).

We will start our discussion with the description and an example of the Source Script Component (which corresponds to the first option available in the Select Script Component Type dialog box). As expected, this selection yields configuration that does not require any inputs (Add Input command button is not present on the Inputs and Outputs page) and, consequently, eliminates the need to specify Input Columns (this section is missing from the Script Transformation Editor). The remaining two sections - Connection Managers and Script - allow you to specify the location of a data source and define a way to extract its content (and, potentially, perform some processing tasks), respectively. Customization of the component is carried out in two stages. During the first one, called Metadata Design mode, you define connections, identify SSIS variables to be available from within the code (by filling out the ReadOnlyVariables and ReadWriteVariables entries in the Script section of the Script Transformation Editor), as well as configure outputs and their columns - including assigning a descriptive name and data type to each. The second stage - known as the Code Design mode - involves programming (using Microsoft Visual Studio for Applications), resulting in code responsible for retrieval and manipulation of data to be passed to a next component in data path (in this code, you reference connections and output columns using names you specified in the first stage).

Source Script Component is useful in scenarios, where data source has a difficult to parse format (other than standard ones, such as delimited, tabular, fixed length, XML-based, or supported by OLE DB and .NET data providers). A fairly common example involves text files containing a number of multi-line records (equivalent to rows), with varying number of lines (corresponding to columns) per each. In this article, we will present an implementation of the Source Script Component, which processes such files. In our case, we will generate the file using WMI Data Reader Task, by querying Win32_Product Windows Management Instrumentation class, which contains the listing of all Windows Installer applications on a target computer. Keep in mind that in order to be able to execute this query on a Windows 2003 Server system, you need to ensure that the WMI Windows Installer Provider is installed. (Since this is one of the Windows components, this is accomplished from the Management and Monitoring Tools dialog box, accesible via Add/Remove Windows Components section of Add or Remove Programs Control Panel applet).

Launch SQL Server 2005 Business Intelligence Development Studio and create a new Integration Services project. Place a WMI Data Reader Task on the Control Flow area of the SSIS Designer (by dragging it from the Toolbox) and display its Editor (by selecting the Edit... option from its context sensitive menu). Set the WMI options according to the following listing:

  • WmiConnection - handled via WMI Connection Manager, pointing to the \root\cimv2 namespace. Server name and authentication settings will depend on the system you target and credentials you use when connecting to it - but if you are running the package locally, you can simply rely on the defaults.
  • WmiQuerySourceType - Direct input
  • WqlQuerySource - SELECT * FROM Win32_Product
  • OutputType - Property name and value
  • OverwriteDestination - Overwrite destination
  • DestinationType - File connection
  • Destination - use the File Connection Manager (we will refer to it as WMIProduct throughout this article - but the choice of name is obviously arbitrary), pointing to a local file (e.g. C:\temp\WMIProduct.txt) with the "Create file" option as the "Usage type" parameter.

You can test the functionality of the task using the "Execute Task" context-sensitive menu option to ensure that the WMIProduct.txt file is populated with appropriate content. A single sample data record (there should be one for every Windows Installer application installed on the target computer) should have the following format:

Caption, Microsoft SQL Server 2005 Tools
Description, Microsoft SQL Server 2005 Tools
IdentifyingNumber, {12345678-ABCD-4321-ABCD-123456789ABC}
InstallDate, 20051115
InstallDate2, 20051115000000.000000-000
InstallLocation, 
InstallState, 5
Name, Microsoft SQL Server 2005 Tools
PackageCache, C:\WINDOWS\Installer\5c059.msi
SKUNumber, 
Vendor, Microsoft Corporation
Version, 9.00.1399.06

Next, add the Data Flow Task to the Control Flow tab of the SSIS Designer interface and double click on it to activate the Data Flow area. Drag the Script Component icon from the Toolbox onto it, pick the Source option from the Select Script Component Type dialog box, and display the Script Transformation Editor (by choosing the Edit... context sensitive menu entry) with three entries on its left hand side, named Inputs and Outputs, Script, and Connection Managers. Select the first one and rename Output 0 to ProductOutput (or something else, more meaningful than the default). Expand the node, drilling down to Output Columns level and click on the Add Column command button. Assign a name Caption to the newly created column and ensure that its DataType property is set to string[DT_STR]. While you could repeat the same process for the remaining attributes of the Win32_Product class, based on the sample data record above, in our example (for the sake of simplicity), we will limit the selection to Name, Vendor, and Version only. Switch to the Connection Managers section and click on the Add... command button. Choose the WMIProduct entry from the drop-down list and specify a name via which you will reference this connection in your code (we decided to use WMIProduct). Highlight the Script entry on the left hand side and activate the Microsoft Visual Studio for Applications with Design Script... command button.

At this point, the code window will contain only automatically generated stub of the ScriptMain class, including the CreateNewOutputRows method (which purpose is clearly defined by its name). While there are several other methods, which can be used to further customize processing of the output (for their list refer to BufferWrapper and ComponentWrapper modules in Project Explorer of Visual Studio for Applications), we will deal exclusively with CreateNewOutputRows. Note default references to Microsoft.SqlServer.Dts.Pipeline.Wrapper and Microsoft.SqlServer.Dts.Runtime.Wrapper namespaces (rather than Dts.Runtime, used in the Script Task) in the Imports section. Since we intend to work with a File Connection Manager, add to them the Imports System.IO statement (to simplify notation required to access objects and methods in this namespace). Modify the remaining code by copying the following:

Imports System
Imports System.Data
Imports System.Math
Imports System.IO
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
Public Class ScriptMain
    Inherits UserComponent
    Public Overrides Sub CreateNewOutputRows()
        Dim oFile As File
        Dim oStreamReader As StreamReader
        Dim sLine As String
        Try
            oStreamReader = oFile.OpenText(Me.Connections.WMIProduct.ConnectionString)
            sLine = oStreamReader.ReadLine()
            Do While sLine IsNot Nothing
                If sLine.Split(Chr(44))(0) = "Name" Then
                    ProductOutputBuffer.AddRow()
                    ProductOutputBuffer.Name = sLine.Substring(6)
                ElseIf sLine.Split(Chr(44))(0) = "Vendor" Then
                    ProductOutputBuffer.Vendor = sLine.Substring(8)
                ElseIf sLine.Split(Chr(44))(0) = "Version" Then
                    ProductOutputBuffer.Version = sLine.Substring(9)
                End If
                sLine = oStreamReader.ReadLine()
            Loop
            ProductOutputBuffer.SetEndOfRowset()
        Catch e As Exception
            Me.ComponentMetaData.FireError(1, "Script Component", e.Message, "", 0, True)
        Finally
            oStreamReader.Close()
        End Try
    End Sub
End Class

As part of the CreateNewOutputRows method, we read the text file identified by the WMIProduct Connection Manager (line by line, in a Do While loop) until we reach its end. For each line, we check whether its first part (up to the first comma - which corresponds to the character 44 in ASCII - hence the Chr(44) reference and use of the Split method) matches the string "Name", "Vendor", or "Version". Whenever a new line containing the "Name" entry is detected (assuming that this constitutes the beginning of a new record), we invoke the AddRow() method. The row is populated with name, version, and vendor values until the next line starting with the "Name" string is encountered.

The use of Try... Catch... Finally structure ensures proper error handling and is a standard programming practice. It also takes care of releasing all resources involved in code execution, once the package completes (gracefully or as the result of a failure). To verify results, you can redirect the output to any of the destination components - the outcome should contain listing of names, vendors, and version numbers for all applications installed on a target computer.

» See All Articles by Columnist Marcin Policht



MS SQL Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 




Latest Forum Threads
MS SQL Forum
Topic By Replies Updated
SQL 2005: SSIS: Error using SQL Server credentials poverty 3 August 17th, 07:43 AM
Need help changing table contents nkawtg 1 August 17th, 03:02 AM
SQL Server Memory confifuration bhosalenarayan 2 August 14th, 05:33 AM
SQL Server Primary Key and a Unique Key katty.jonh 2 July 25th, 10:36 AM