SQL Server 2000 DTS Part 9 - DTS Designer Tasks - Page 2

February 10, 2004

As you can see within the displayed window, the multiphase data pump is divided into the following phases and sub-phases:

  • Pre Source Data - The first phase, executed only once per task, before the first row is fetched from the source. This can be used, for example, to create a log file, initialize data, or write a header row.
  • Row Transform - The second phase, encompassing the functionality of the entire SQL Server 7.0 Transform Data task, launched every time a new row is retrieved from the data source and completed prior to writing it to the destination.
  • Post Row Transform - The third phase, performing additional processing depending on the outcome of the second phase. This outcome determines which one of its sub-phases (On Insert Success, On Insert Failure, On Transform Failure) is invoked. On Insert Success is triggered when processing was successful, On Insert Failure indicates problems at the destination (such as inserting null data in a non-nullable column), and Transform Failure is commonly the result of type conversion errors (e.g. character to integer). Post Row Transform is typically followed immediately by the Row Transform phase, fetching the next row from data source, although whether this happens depends on a couple of factors. More specifically, Transform Failure or Insert Failure could result in task termination, depending on the value of the (previously discussed) max error count parameter (configurable from the Options tab of the Transform Data Task Properties dialog box). Other possibilities include the Batch Complete phase, if the batch size has been reached (discussed next) or the Post Source phase if the last row has been processed.
  • Batch Complete - The fourth phase, executing once per batch, once it is completed. This is where the Insert batch size parameter, mentioned earlier in this article (available from the Options tab of the Transform Data Task Properties dialog box), comes into play. With this functionality, you can divide your entire data load into more manageable pieces and handling cleaning and record keeping tasks separately for each.
  • Pump Complete - This takes place after the last row of the data is processed. Since actions performed during this phase have no access to source or destination, they are used typically for data-independent cleanup activities, such as global variable manipulation (which can be used as error counters).
  • Post Source Data - The final stage of the task, which provides data access. It is frequently used to write footer rows to the destination (although it can also include a number of other activities, such as global variable cleanup).

Division of processing into phases provides several benefits. It offers more control over data manipulation as well as error handling and recovery. Transformation can be restarted from the point of failure, without reloading already processed data.

Let's take a look at a sample ActiveX Script demonstrating the multi-phase capabilities of the Transform Data Task. For the sake of example, let's assume we will use an existing transform, producing table consisting of the content of the Shippers table with the total number of orders for each ShipperID, using the batch size of two (which should produce two batches). We will record the processing information in a text file (which name is stored in a global variable "LogFile") including such values as "TotalBatches", "TotalRows", and "FailedRows". Prior to any customization, auto generated code of our ActiveX Script transformation has the following format (Function Main is executed during the Row Transform phase):

Function PreSourceMain()
	PreSourceMain = DTSTransformstat_OK
End Function

Function Main()
	DTSDestination("Orders") = DTSSource("Orders")
	DTSDestination("Phone") = DTSSource("Phone")
	DTSDestination("CompanyName") = DTSSource("CompanyName")
	DTSDestination("ShipperID") = DTSSource("ShipperID")
	Main = DTSTransformStat_OK
End Function

Function TransFailureMain()
	TransFailureMain = DTSTransformstat_OK
End Function

Function InsertSuccessMain()
	InsertSuccessMain = DTSTransformstat_OK
End Function

Function InsertFailureMain()
	InsertFailureMain = DTSTransformstat_OK
End Function

Function BatchCompleteMain()
	BatchCompleteMain = DTSTransformstat_OK
End Function

Function PumpCompleteMain()
	PumpCompleteMain = DTSTransformstat_OK
End Function

Function PostSourceMain()
	PostSourceMain = DTSTransformstat_OK
End Function

Start by creating the "LogFile" global variable from the DTS Package Properties dialog box (you can find more information on the subject in one of our earlier articles) of type string and set its value appropriately (e.g. "d:\logs\MPDPump.log"). Then modify the ActiveX Script so it matches the following code:

'  Copy each source column to the destination column
Function PreSourceMain()
	Const Overwrite = TRUE
	Set oFSO = CreateObject("Scripting.FileSystemObject")
	Set oLogFile = oFSO.CreateTextFile(DTSGlobalVariables("LogFile").Value, Overwrite)
	DTSGlobalVariables("TotalBatches").Value = 0
	DTSGlobalVariables("TotalRows").Value = 0
	oLogFile.Close
	Set oFSO = Nothing
	PreSourceMain = DTSTransformstat_OK
End Function

'  Copy each source column to the destination column
Function Main()
	DTSDestination("Orders") = DTSSource("Orders")
	DTSDestination("Phone") = DTSSource("Phone")
	DTSDestination("CompanyName") = DTSSource("CompanyName")
	DTSDestination("ShipperID") = DTSSource("ShipperID")
	DTSGlobalVariables("TotalRows").Value = _
		DTSGlobalVariables("TotalRows").Value + 1
	MsgBox "Row #: " & DTSGlobalVariables("TotalRows").Value
	Main = DTSTransformStat_OK
End Function

Function TransFailureMain()
	TransFailureMain = DTSTransformstat_OK
End Function

Function InsertSuccessMain()
	InsertSuccessMain = DTSTransformstat_OK
End Function

Function InsertFailureMain()
	InsertFailureMain = DTSTransformstat_OK
End Function

Function BatchCompleteMain()
	DTSGlobalVariables("TotalBatches").Value = _
		DTSGlobalVariables("TotalBatches").Value + 1
	MsgBox "Batch #: " & DTSGlobalVariables("TotalBatches").Value 
	BatchCompleteMain = DTSTransformstat_OK
End Function

Function PumpCompleteMain()
	Const ForReading = 1, ForWriting = 2, ForAppending = 8
	Set oFSO = CreateObject("Scripting.FileSystemObject")
	Set oLogFile = oFSO.OpenTextFile(DTSGlobalVariables("LogFile").Value, ForWriting)
	oLogFile.WriteLine "Total Rows:"  & vbTab & DTSGlobalVariables("TotalRows").Value
	oLogFile.WriteLine "Total Batches:"  & vbTab & DTSGlobalVariables("TotalBatches").Value
	oLogFile.Close
	Set oFSO = Nothing
	PumpCompleteMain = DTSTransformstat_OK
End Function

Function PostSourceMain()
	PostSourceMain = DTSTransformstat_OK
End Function

This trivial example, which displays rows and batch numbers as they are being processed, and writes the total number of batches and rows to a log file, should help you to create scripts that are more elaborate. Refer to the Phased Transformation Samples topic in the SQL Server 2000 Books Online for additional help on this subject (you will find there samples demonstrating TransformFailed and InsertFailed functions). In our next article, we will cover the remaining two DTS tasks.

» See All Articles by Columnist Marcin Policht








The Network for Technology Professionals

Search:

About Internet.com

Legal Notices, Licensing, Permissions, Privacy Policy.
Advertise | Newsletters | E-mail Offers