SQL Server 2000 DTS Part 9 - DTS Designer Tasks
February 10, 2004
In our previous article, we started discussing basic principles and applications of the Transform Data task. We will continue this topic here, focusing on its more advanced features. So far, we have described the purpose of four of the tabs of the Transform Data Task Properties dialog box in the most basic scenario, without resorting to the use of multiple processing phases.
The last tab of the Transform Data Task Properties dialog box, labeled Options, provides a number of choices controlling error handling and more advanced data transformation settings. Errors can be recorded in an exception file, with an arbitrarily assigned location and name. Its type can be either set to be compatible with SQL Server 7.0 implementation (in case you are using parsers written for this version) or customized by selecting separate files for error messages, source, and destination rows (with .source and .dest extenstions, respectively) whenever transform fails. In either case, you can assign formatting details, such as characters used for row and column delimiters, as well as text qualifiers. The next section on the Options tab determines data movement parameters, such as maximum error count (maximum number of errors that can occur before a task is terminated), fetch buffer size (number of rows fetched from a source at a time), first, and last rows (allows limiting the size of transformation, skipping header and footer rows, and resuming transfer in case of a partial failure). Finally, the last section on the Options tab, applicable strictly to SQL Server data transforms, allows you to force fast data load (which is recommended from the performance point of view, although it affects recoverability and should be followed by a full backup). Once enabled, this setting makes it possible to configure a number of suboptions, such as transferring NULL values to a destination column with a default value assigned, or behavior relating to constraints check, table locking or identity insert. You can also specify for data to be processed in batches. The value of the batch size parameter indicates the number of rows to be considered as a single unit (when set to 0, the entire data is loaded in one batch). We will explain the significance of this setting shortly.
The main advantage of the SQL Server 2000 implementation of the Transform Data Task over its SQL 7.0 predecessor is its granularity. The entire data transformation process can be divided into several distinct phases. By default, however, this functionality is hidden. In order to expose it, you need to go back to the SQL Server Enterprise Manager interface and select the Properties item from the context sensitive menu of the Data Transformation Services folder. This will display the Package Properties dialog box, with the Designer section containing a single checkbox labeled "Show multi-phase pump in DTS Designer." Note also, the other settings available within the same page (even though they are not related directly to tasks, you will find them useful in other situations). In particular, you can turn on caching in order to speed up the process of creating and opening packages in the DTS Designer (this setting causes caching of DTS tasks, transforms and OLEDB providers registered on the SQL Server, so they do not have to be reloaded every time Designer is launched). Keep in mind, though, that if you mark the corresponding checkbox, you will need to use the Refresh Cache button (or restart the Designer) in order to make newly installed providers available. Enabling the debugging option simplifies troubleshooting ActiveX scripts by employing the default Windows debuggers in error handling. Finally, you can also allow saving packages to Meta Data Services, which we will be discussing in a future article of this series.
Once the "Show multi-phase pump in DTS Designer" checkbox is marked, click on the OK button, open the Local Packages folder, and open the package we created previously (for steps describing its creation refer to the previous article of this series) in the DTS Designer window. Bring up the Properties dialog box for the Data Transform Task within this package. Make sure that the Source tab contains the following SQL query:
SELECT Shippers.ShipperID, Shippers.CompanyName, Shippers.Phone, COUNT(OrderID) AS Orders FROM Shippers INNER JOIN Orders ON ShipVia = ShipperID GROUP BY Shippers.ShipperID, Shippers.CompanyName, Shippers.Phone
Since we used tempdb as our destination data store, it is likely that you will need to recreate the target table (otherwise you will receive an error message indicating that the target table does not exist). To verify whether this is the case, display the content of the Destination tab. If the table structure is not listed there, click on the Create... button and use the following SQL statement:
CREATE TABLE [ShippersOrders] ( [ShipperID] int NOT NULL, [CompanyName] nvarchar (40) NOT NULL, [Phone] nvarchar (24) NULL, [Orders] int)
Next, switch to the Transformations tab. First, notice an additional, previously not present list box labeled "Phases filter." By default, it will contain the "Row transform" entry, since this is the only one that existed in SQL Server 7.0 and one that is exposed by default in the SQL Server 2000 implementation. However, you can easily verify that the list box contains a number of other entries, corresponding to all phases (and sub-phases) of the Transform Data Task. In order to explore the full functionality of multiphase data pump, we will use ActiveX Script, so remove any existing transformations with the help of the Delete button. Click on New, ensure that ActiveX Script entry is selected in the Create New Transformation dialog box and confirm your choice. In the Transformation Options dialog box, you will notice an additional tab named Phases. From here, you can specify which phases (and sub-phases) your transform will be using. As indicated before, only the Row transform function option is marked. We will modify it shortly. Modify the content of the Source and Destination tabs so that all available columns are listed as selected. Switch to the General tab and click on the Properties... button. This will display the ActiveX Script Transformation Properties dialog box, with all phases listed on the Phases tab (tabs appear in the lower portion of the dialog box). Select the checkboxes next to each of them and auto generate the relevant code with the Auto Gen. button. As soon as this happens, the code will appear in the code window on the right hand side.