Using DTS to Detect and Process a FileJanuary 31, 2002 IntroductionThere is nothing better than when a new technology lends itself to solving an old problem; well, there may be some things better, but this will have to do for now. One of the classic problems in computing -- classic being defined as a problem I have seen before -- is processing files that were uploaded to a server. This problem arose in a recent situation and DTS was used to help load an uploaded file into SQL server. In order to process the file the first step is to be able to detect the arrival of the file via FTP. Once this has been accomplished the file needs to be unzipped and then loaded into SQL server. This process occurs each night and there may be more than one file on a nightly bases. The loading of a flat file to SQL server can be handled by a seperate DTS package, but the more difficult task is detecting and unzipping the file. Getting StartedThe following is a diagram of what the final DTS package looks like in the DTS Design window. The package is broken down into four parts. Three of the parts are ActiveX Script Tasks, while the fourth launches another DTS Process. Knowing and experimenting with the various objects one can use in the DTS process quickly expands the tasks that can be accomplished by DTS. DTS can be used to perform transformations between different database formats and to combine fields, and the ability to run ActiveX scripts in the middle of the DTS process allows for additional operations like logging and file manipulation. Check for toImport.txtThe initial part of the DTS process checks for a file created during a previous execution of this DTS package. In creating a package that can run continuously, it is important to check to see if the process has failed previously. In this case, if the file that was to be imported into the SQL server still exists, the previous DTS package failed to properly process and a new process should not start. The script to check for the toImport.txt file follows. Line 5 of this script instantiates a file system object. This object gives access to the file system to look for files and to open a log file. Lines 6-10 set constants for directories and files that will be used in the following scripts. If the constants are not used in the current script they are there for reference as they are used in subsequent scripts. The key directory in the toImport.txt check script is the strPublishDir. This is the directory where the file toImport.txt is created, so a seperate DTS package can use this file to load an SQL table. Line 16 checks to see if a log file currently exists. If the file exists then a file system object is opened to append text (line 21); otherwise, a new file is created (line 17). Lines 18, 19, 21, 22, and 23 write text to the log file. The text varies depending on if the file was created or opened. Line 27 instantiates another file system object. Line 30 checks for the existence of the toImport.txt file. If the file exists then a note is made in the log file and the function main returns a failure (DTSTaskExecResult_Failure). If the file does not exist then line 35 returns a success (DTSTaskExecResult_Success). Line 37 writes a message in the log file, and it is closed on line 38. Lines 40-43 release the objects that were created and the function ends. toImport.txt Check Script
Check for .DONE fileAnother beauty of DTS is that the movement from DTS object to DTS object within a package can be governed on the success or failure of the precious step. In this case, the missing toImport.txt file is a success. After this success, the files needed to be processed can be detected. In this case, two files are uploaded. The first file is a zipped file with a .zip extention and another file with a .done extension. The filename before the period is the same in both cases. This makes the job of the next script to detect the .DONE file and unzip the compressed file. Lines 5-25 are the same as the previous script. The lines set up the needed variables and open the log file for writing. On line 27 a file system object is created to get the files in the FTP directory. Line 28 starts a loop through the files in the FTP folder. On line 29 the file suffix is tested to see if the .DONE file has arrived. If it has arrived the filename is extracted without the extension, and the detection is logged on lines 31 and 32. Line 35 instantiates a file system object that is used in line 37 to make sure the zip file has arrived. Lines 38 and 39 log this result to the log file. The first thing done if the file has arrived is to copy it to an archive directory (line 43) and note it in the log file (lines 44 - 48). Once the file has arrived and been backed up, it can then be unzipped. Line 52 instantiates a shell object. Line 53 creates a command line that is run on line 54. This results in the extraction of a .TXT file from the .ZIP file that has the same filename prefix as the .DONE and .Zip files. In order to DTS the flat file into SQL, programatically, the file resides with the same name and in the same location each time the process runs. Line 60 copies the .TXT file to the publishing directory and renames the file to toImport.txt. The results are logged on line 61. The files used to create the toImport.txt file are deleted on lines 65,67 and 69. Entries are made in the log file on lines 66,68, and 70. Line 71 returns the fact that the process has succeded. Line 73 and 74 are used to log the lack of a .DONE and return a DTS failure if a .DONE file does not exist in the FTP directory. Line 76 and 77 log and then close the log file. Lines 79-82 release the objects used in this function.
Run another DTS ProcessOnce it has been verified that the toImport.txt file exists, the DTS package launches another DTS package. In this case the DTS package loads the toImport.txt file into SQL server. It would be possible to have any DTS package run at this point. Delete toImport.txtIn order for this process to run continuously it is important to delete the toImport.txt file. This is done with the following function. This function uses the techniques used in the above scripts. The existence of the toImport.txt is checked, and if it exists it is deleted.
ConclusionsA DTS package is explored and the scripts used to detect a file arrival are explained. This process is used to load a file into SQL server but the power of SQL server allows this process to be used for more than just loading data into SQL server.
|