Data Transformation Services (DTS)


Overview


Lookups are a very powerful weapon in your DTS arsenal, with more uses than
simply looking a single value. In this article I will try and cover the different
uses I have come up with. I will avoid covering too much of the basics of what a
lookup is since you already have Books Online to do this.


I have created a sample process loosely based on loading data between hospital systems. Your raw data is episodes of care for a patient.
The download contains the raw data files (EpisodeData1.csv, EpisodeData2.csv), a script to create the
database objects (DTSLookup_EpisodeSampleObjects.sql), and a DTS package that
demonstrates the methods shown below. You will notice that I have used a separate connection for the lookups, as this improves performance.

ActiveX Script Transformation:         Lookup T-SQL:     


Raw Data

EpisodeData1.csv
EpisodeID PatientID LocalSpecialty StartDate EndDate LengthOfStay
12345678 563625 12 20000112 20000114 3
56878945 573456 12 20000201 20000115 15
24598756 563625 22 20000405 20000106 2
12547899 258756 22 20000604 20000611 14
12547866 258756 22   20000611 14
36547895 258756 22 20002406 20002407 0
96988789 254688 12 20000701   0

EpisodeData2.csv
EpisodeID PatientID LocalSpecialty StartDate EndDate LengthOfStay
96988789 254688 12 20000701 20000702 2


Simple Lookup


Just for completeness we will start with a simple lookup, which returns one value.
In our sample we use this to transform our treatment specialty from a local code
into a nationally recognised code.

' TransformSpecialty - Simple Lookup 
' (Assign to variable for later use)
iNationalSpecialty = DTSLookups("TransformSpecialty").Execute(DTSSource("LocalSpecialty").Value)
DTSDestination("NationalSpecialty") = iNationalSpecialty
SELECT [National]
FROM LocalSpecialty
WHERE (Local = ?)


Multiple Arguments & Multiple Values


This example is slightly more advanced in that it requires two arguments to return the
result, which itself consists of two values. The multiple arguments is quite obvious,
however the multiple result values are slightly more complicated as it they are returned
in an array. Another feature we have used in this lookup is the ability to call stored procedures as well a the usual full SQL statement.
In our sample we use this derive a resource code and high level banding for an episode of care.
The resource is obviously related to the length of stay in hospital, but is also affected by the specialty.

' ResourceAllocation - Multi-Argument, Multi-Value
aResourceValues = DTSLookups("ResourceAllocation").Execute(iNationalSpecialty, _
	DTSSource("LengthOfStay").Value)
DTSDestination("ResourceCode").Value = CStr(aResourceValues(0))
DTSDestination("ResourceLevel").Value = aResourceValues(1)
spDTSLookup_MultiDemo ?, ?


Writing Data


Lookups aren’t restricted to SELECT statements, and in this example we write
episodes that have invalid dates to a log table for further investigation. The date
validation requires that a valid episode start date is present and either a valid
end date or no end date, which denotes an unfinished episode. In addition the the
source data has dates in yyyymmdd format, which is not a valid VBScript date format, so we use
our own function to convert them into yyyy-mm-dd.

' Check Dates for valid StartDate, and valid EndDate, or no EndDate
If (IsDate(DBDate(DTSSource("StartDate"))) And IsDate(DBDate(DTSSource("EndDate")))) _
  Or (IsDate(DBDate(DTSSource("StartDate"))) And IsNull(DTSSource("EndDate"))) Then

...

Main = DTSTransformStat_OK
Else
	DTSLookUps("LogError").Execute DTSSource("EpisodeID"), DTSSource("PatientID")
	Main = DTSTransformStat_SkipInsert
End If

...
Function DBDate(sDate)
	If Len(sDate) = 8 And IsNumeric(sDate) Then
		DBDate = Left(sDate, 4) & "-" & Mid(sDate, 5, 2) & "-" & Right(sDate, 2)
	ElseIf IsNull(sDate) Then
		DBDate = Null
	Else
		DBDate = "InvalidFormat"
	End If
End Function	
INSERT INTO EpisodeLoadLog (EpisodeID, Error)
VALUES (?, 'Date validation error (PatientID=' + ? + ')')


Deleting Data


The final example deletes data. Taking the concept of unfinished episodes outlined above,
we would expect to get those previously unfinished episodes as completed ones in the
next load. Remember in the initial load we did not have enough information to allocate
our resource information correctly as the length of stay was 0.
In this scenario we will assume that we should only get “duplicate rows” for episodes
that have now closed, so we use the lookup to delete any existing rows for the
specified episode, if they have a NULL end date.

' DeleteUnfinishedEpisode - Deletes an existing episode if it has a no EndDate
DTSLookups("DeleteUnfinishedEpisode").Execute DTSSource("EpisodeID")
DELETE FROM PatientEpisodes
WHERE (EpisodeID = ?) AND (EndDate IS NULL)


The Transformation


If we now take all of the code and put it together in a logical manner,
we get the following ActiveX Transformation Script:

Function Main()
	Dim iNationalSpecialty, aResourceValues

	' Check Dates for valid StartDate, and valid EndDate, or no EndDate
	If (IsDate(DBDate(DTSSource("StartDate"))) And IsDate(DBDate(DTSSource("EndDate")))) _
	  Or (IsDate(DBDate(DTSSource("StartDate"))) And IsNull(DTSSource("EndDate"))) Then

		' DeleteUnfinishedEpisode - Deletes an existing episode if it has a no EndDate
		DTSLookups("DeleteUnfinishedEpisode").Execute DTSSource("EpisodeID")

		DTSDestination("StartDate") = DBDate(DTSSource("StartDate"))
		DTSDestination("EndDate") = DBDate(DTSSource("EndDate"))

		' Direct Transfer
		DTSDestination("EpisodeID") = DTSSource("EpisodeID")
		DTSDestination("PatientID") = DTSSource("PatientID")
		DTSDestination("LengthOfStay") = DTSSource("LengthOfStay")

		' TransformSpecialty - Simple Lookup 
		' (Assign to variable for later use)
		iNationalSpecialty = DTSLookups("TransformSpecialty").Execute(_
			DTSSource("LocalSpecialty").Value)
		DTSDestination("NationalSpecialty") = iNationalSpecialty

		' ResourceAllocation - Multi-Argument, Multi-Value
		aResourceValues = DTSLookups("ResourceAllocation").Execute(iNationalSpecialty, _
			DTSSource("LengthOfStay").Value)
		DTSDestination("ResourceCode").Value = aResourceValues(0)
		DTSDestination("ResourceLevel").Value = aResourceValues(1)

		Main = DTSTransformStat_OK
	Else
		DTSLookUps("LogError").Execute DTSSource("EpisodeID"), DTSSource("PatientID")
		Main = DTSTransformStat_SkipInsert
	End If
End Function

Function DBDate(sDate)
	If Len(sDate) = 8 And IsNumeric(sDate) Then
		DBDate = Left(sDate, 4) & "-" & Mid(sDate, 5, 2) & "-" & Right(sDate, 2)
	ElseIf IsNull(sDate) Then
		DBDate = Null
	Else
		DBDate = "InvalidFormat"
	End If
End Function


Running the sample package


To run the sample package on your own server, follow the steps below.

  • Place EpisodeData1.csv in C:Temp
  • Run DTSLookup_EpisodeSampleObjects.sql. The package connections point to the master DB of the (local) server, so ensure this where you create the objects, or amend the package as appropriate.
  • Open and execute the package. The package has an owner/operator password of swynk,
    which ensures it is compatible with SQL 7.0 (SP1/SP2/SP3) and SQL 2000.

  • Now check that you have 5 episodes in PatientEpisodes, and two errors in EpisodeLoadLog. Note that we also have one unfinished episode.
  • To demonstrate the delete in action, place EpisodeData2.csv in C:Temp, and amend the Text File (Source) connection
    to point to it. Re-execute the package.

  • Now check that the unfinished episode has been updated, well actually it was deleted then re-inserted but you get the idea.


    Download the sample – DTSLookup_EpisodeSample.zip

  • Latest Articles