SQL Server 2008 R2 Integration Services incorporate a variety of processes facilitating a wide range of extraction, transformation and loading (ETL) scenarios. One of more common examples of such processes involves copying files across distributed, frequently incompatible systems. Traditionally, this has been accomplished by employing File Transfer Protocol. In this article, we will describe the FTP Task that natively delivers this functionality within SSIS packages.
FTP has been a part of the TCP/IP suite of protocols since the early days of the Internet and still remains one of its more popular components. Communication between two systems (where a local one initiating data transmission constitutes a client and a remote one takes the role of a server) is first established on TCP port 21 (this is referred to as the control connection, used to carry metadata, such as instructions and authentication bits). The actual file transfer, which follows, utilizes a separate channel (known as a data connection). Specifics of its configuration depend on the mode of operation that both parties recognize and allow. In the case of active mode, the server opens an additional connection from its local port 20 to a mutually agreed upon port on the client. Passive mode, on the other hand, relies on a session from an arbitrarily chosen client port to a negotiated port on the server. As you can easily conclude, the passive mode tends to be more firewall friendly, since it does not require opening extra inbound ports.
Since its introduction, FTP has been supporting anonymous and basic authentication (with credentials transmitted in clear text). While its Windows-based version also supports integrated authentication, this mechanism is not available when conducting file transfers using FTP Task, so if you have security or confidentiality concerns, you should consider employing additional provisions, such as IPSec.
In order to explore characteristics of the FTP Task incorporated into SQL Server 2008 R2 Integration Services, let’s take a closer look into specifics of its implementation. Start by launching Business Intelligence Development Studio and creating a new project based on the Integration Services template. Drag the FTP Task icon from Toolbar and drop it onto the Designer interface. Before we set its properties, which will establish it as an FTP client, let’s configure package to designate its server counterpart. To accomplish this, you need to define an FTP Connection. Right-click on an empty area of Connection Managers section of the Designer window and choose New Connection… from the context sensitive menu. In the resulting Add SSIS Connection Manager dialog box, highlight the FTP Connection manager for FTP connections entry and click on Add… command button to display FTP Connection Manager Editor.
FTP Connection Manager Editor provides an interface where you can specify Server settings including Server name and its port (set by default to 21), Credentials used for authentication (consisting of the User name and Password), and a number of protocol-specific Options, such as Time-out (in seconds), number of Retries, or Chunk size (in KB), determining amount of data transported in individual FTP packets. You also have ability to designate whether FTP sessions should use passive mode. Test Connection command button gives you ability to quickly verify whether the settings are correct.
At this point, we are ready to configure the newly created FTP Task. Use its context-sensitive menu to display the Editor dialog box. In its General section, choose the existing FTP Connection Manager (alternatively, it is also possible to leverage the New Connection… entry to define a new one) in the FtpConnection listbox. Next, switch to the File Transfer section. Its appearance is dependent on the Operation you select. In particular, you can choose one of the following types (note that those dealing with local files and folders are likely more suitable for File System Task instead):
- Send files – intended for transferring local files identified (depending on the value of IsLocalPathVariable Boolean parameter) either by File Connection Manager (single file only) or an SSIS variable (one or more files, when wildcard characters are used) to a remote destination, defined using the combination of FTP Connection Manager and the Remote Path parameter (representing an existing subfolder within the target FTP root folder). This path can be specified either explicitly or via an SSIS variable (depending on the value of IsRemotePathVariable Boolean parameter). If you expect a destination file to exist, you have an option of overwriting it by setting OverwriteFileAtDest to True. Enable IsTransferAscii parameter if the content being transmitted consists exclusively of ASCII text.
- Receive files – intended for transferring one or more files from a remote destination (defined using the combination of FTP Connection Manager and the RemotePath parameter) to a local folder, identified either by File Connection Manager or an SSIS variable (depending on the value of IsLocalPathVariable parameter). You have the ability to decide whether you want to overwrite files as destination (by leveraging OverwriteFileAtDest parameter), although in this case, the setting applies to the local (rather than remote) file system. Similarly, it is possible to designate a desired transmission mode by setting IsTransferAscii to either True or False.
- Create local directory – facilitates creating a local folder by leveraging either a File Connection Manager pointing to a non-existing folder or an SSIS variable containing its full path (depending on the value of IsLocalPathVariable parameter).
- Create remote directory – facilitates creating a folder at the destination, whose full path is formed by combining location defined by FTP Connection Manager and either the value of RemotePath parameter or an SSIS variable associated with RemoteVariable parameter (depending on the value of IsRemotePathVariable parameter).
- Remove local directory – deletes a local folder identified either by a File Connection Manager (based on the Existing folder usage type) or an SSIS variable (as before, you can choose between them by assigning a desired value to IsLocalPathVariable parameter).
- Remove remote directory – provides equivalent functionality in regard to a remote folder, whose path is constructed by combining location pointed to by FTP Connection Manager and either an arbitrary SSIS variable associated with RemoteVariable parameter or the value assigned explicitly to the RemotePath parameter.
- Delete local files – gives you the ability to delete local files, whose location is identified either by the File Connection Manager (single file only, since the Usage type is set to Existing file) or an SSIS variable (one or more files, when wildcard characters are used) assigned to the LocalVariable parameter.
- Delete remote files – targets one or more files at the remote destination. Their identity is established either by the RemotePath parameter or Remote Variable (representing file system path within the location defined by FTP Connection Manager).
As indicated above, while in some cases it might not be possible to target multiple files to be transferred or deleted using FTP Task, this limitation can be somewhat remediated by leveraging wildcards (for file names only). More specifically, you have an option of using either the asterisk (*) or the question mark (?) characters. The former represents one of more non-matching characters in the target file names or their extensions, while the latter comes handy in situations where all files are named identically with exception of individual characters occupying the same position within each name. If this workaround does not meet your requirements, you have an option of executing FTP Task within a Foreach Loop, which enumerates target files and passes their names to the task via SSIS variables.