SQL Server 2000 Security – Part 12 – Auditing with Meta Data Services

In our previous two articles of this series, we discussed auditing and
security related aspects of SQL Server 2000 Data Transformation Services. Now,
we will combine these topics by presenting DTS Meta Data Services, which
provides the ability to audit creation, execution, and modification of DTS
packages as well as track associated data changes. In particular, you can track
package version history or data lineage. This information is stored according
to the rules of DTS Information Model, which can be accessed and manipulated by
both SQL Server Enterprise Manager and programmatic methods (for review of SQL
Server 2000 DTS, refer to its coverage
on the DatabaseJournal Web site.

As we mentioned in our previous article, the ability to save packages to
Meta Data Services is disabled by default. This is intentional, since its use
is associated with an increase in resource utilization and affects package
execution speed. In order to alter the default setting, in the Package
Properties dialog box, which appears after selecting the Properties entry from
context sensitive menu of the Data Transformation Services node in SQL Server
Enterprise Manager, you need to mark the checkbox labeled "Enable Save To
Meta Data Services". This will make available a new option "Meta Data
Services" in the Location listbox in the Save DTS Package dialog box,
displayed when saving packages from the DTS Designer interface. When you choose
this option, you will also have a chance to specify the target server where the
package meta data information will be stored, authentication information for
connecting to it, and scanning settings. These settings determine whether and
to what extent information about databases and their objects referenced by DTS
packages will be scanned and written into equivalent meta data structures. The
main check box determines if such meta data should be created to begin with.
Once you enable it, you also need to decide whether you intend to:

  • rescan all information about data stores referenced in the
    package into the Meta Data Services even if such information is already present
    from previous scans (corresponding to the "Scan all referenced catalogs
    into repository" option) or use existing information (corresponding to
    "Use scanned catalogs if already present in repository"), which saves
    time and resources but should be used only if data structure has not changed
    between scans.

  • perform the activity defined by the previous selection for all
    data structures referenced in the package (corresponding to "Scan catalog
    always") or limit it only to the ones which have not yet been scanned
    (corresponding to "Scan catalog if not already present in Meta Data
    Services").

You also have an ability to import meta data for a particular database
(rather than for all databases referenced by a DTS package). This can be
accomplished by selecting the "Import Metadata…" item from the
context sensitive menu of Meta Data Services Packages and Meta Data nodes under
the Data Transformation Services folder in the SQL Server Enterprise Manager.
Importing databases will allow you to view their information by clicking on the
Meta Data node. The display is divided into three categories, grouped under
Browse, Lineage, and Package headings:

  • Browse section provides a listing of Database Meta Data,
    including database (its name, SQL Server instance name and version), schema (in
    the case of SQL Server, this refers to the database owner), tables, their
    columns and characteristics for each (datatype, length, scale, precision, and nullability).
    For each table and column, you can find DTS packages that use it as their data
    store connection, as well as related tables and columns that function as a
    source or a destination for its data. This is done by selecting Package and
    Source/Destination items from the context sensitive menu of the icon
    representing individual tables and columns. For this type of statistic to be
    available, you need to import relevant package information into the meta data
    as well (which we will discuss next).

  • The second section deals with lineage, which provides the ability
    to track changes to data. In general, SQL Server 2000 Meta Data Services offers
    two types of lineage:

    • row-level data lineage – identifies the source and destination
      for a particular piece of data (a table row) as well as all transformations
      that were applied to it (including DTS package execution information). This is
      accomplished by assigning a globally unique identifier (GUID) to each value,
      which is used for tracking purposes.

    • column-level data lineage – maintains a record of columns
      involved in transformations (without tracking individual values in each). This
      allows correlating references between packages, which might be using the same
      columns as source or destination for their transformations. It also provides
      package revision and execution history, informing who, where, and when creates,
      runs, and modifies packages.

    In order to help you understand this concept, we
    will demonstrate implementation of both types of lineage using a sample DTS
    package based on a template called Template Demo.dtt, stored in Program
    FilesMicrosoft SQL Server80ToolsTemplatesDTS folder. To create a package,
    right click on the Data Transformation Services node in the SQL Server
    Enterprise Manager and select the All Tasks -> Open Template option from its
    context sensitive menu. Point to the target file in the Select File dialog box,
    click on the Open command button, and choose the New Package entry in the
    Select Package dialog box. This will present you with the DTS Designer
    interface, containing a fairly rudimentary sample package, using a single,
    undefined Transform Data Task and two OLE DB Provider SQL Server connections,
    one pointing to Northwind database as the source and the other referencing tempdb
    as the destination. To define transformation properties, double click on it,
    and from the Properties dialog box, select the Shippers table as its source (on
    the Source tab). You can click on the Preview button to verify that the source
    table exists and has some data in it (three rows by default). Next, switch to
    the Destination tab. This will generate a Package Error informing you that the
    table does not exist at the destination, which makes sense, since we have not
    created it yet. Acknowledge the error, click on the Create button and verify
    that T-SQL code presented there will create the Shippers table at the
    destination. Switch to the Transformations tab and note that, without any
    customization, Transform Data Task simply copies data from Source to
    Destination. Click on OK to close the Properties dialog box. If you execute the
    package at this point, you should receive confirmation of its successful
    completion. You can also easily verify that tempdb will contain the Shippers
    table, with its content identical to Shippers in the Northwind database.

    Note, however, that information about the package
    execution has not been recorded into Meta Data Services and no lineage
    statistics are available. In order to be able to track row-level data lineage,
    we will need to implement the following steps:

    1. Import the
    source and target databases into the Meta Data Services.

    2.
    Switch to the DTS Designer interface and select the Properties
    option from the Package top-level menu. On the Advanced tab in the DTS Package
    Properties dialog box, mark the "Show lineage variables as source
    columns" and "Write lineage to repository" checkboxes, click on
    OK to confirm your choice, and return to the DTS Designer.

    3.
    Double click on the Transform Data task arrow to display the
    Transform Data Task Properties dialog box. Switch to the Transformations tab.
    You will notice that the list of columns in the Source changed and includes DTSLineage_Full
    and DTSLineage_Short. The first entry will store the globally unique identifier
    (GUID) representing the package version, the second one is intended for version
    checksum (unlike with GUID, its uniqueness is not guaranteed – although still
    very likely).

    4.
    Now, you need to create identical columns at the destination. To
    accomplish this, switch to the Destination tab, and click on the Create command
    button. You will notice that the T-SQL statement now contains two additional
    fields Lineage_Full (of type UNIQUEIDENTIFIER) and Lineage_Short (of type INT).
    Rename the table identifier in the CREATE TABLE statement of the Create
    Destination Table dialog box to ShippersMDS and click on the OK button. This
    will create an appropriately structured table at the destination (Note that we
    could have done this at the very beginning as long as the appropriate options
    were selected on the Advanced tab of the package Properties dialog box – we
    went through initial steps purely to demonstrate changes that take place as the
    result of activating these options).

    5.
    Now you need to modify transform properties from the
    Transformations tab. To change the mapping between source and destination
    columns, first click on Select All and then on Delete All. This will remove
    existing mappings. To create a new one, make sure that all Source and
    Destination columns are highlighted and click on the New button. From the
    "Create New Transformation" dialog box, select Copy Column entry, and
    click on OK twice to return to the Transformations tab.

    6.
    Click on OK to close the Transform Data Task Properties dialog
    box and return to the DTS Designer window.

    7.
    Assign to the package a meaningful name and save it to Meta Data
    Services using the Save As item from the Package menu. Ensure you select the
    Meta Data Services as the target location.

    8.
    Execute the package.

    At this point, all relevant row-level lineage
    information should be available. For example, in the Lineage section of the
    Meta Data node of SQL Server Enterprise Manager, you can type either Lineage
    Long or Lineage Short values (you can extract them from ShippersMDS table in
    the tempdb) in the corresponding text boxes, and click on the Find Lineage
    command button. This will display information about the Package responsible for
    creation of the corresponding table entry. Similarly, you can obtain
    information about its source using the Source/Destination context sensitive
    menu option..

    To implement column-level data lineage, all that is
    necessary is to ensure that you mark the "Resolve package references to
    scanned catalog meta data" checkbox in the Scanning Options dialog box
    when saving the package into Meta Data Services. In our case, you can choose
    "Scan all referenced catalogs into repository" and "Scan catalog
    always" options.

  • Package section displays information about packages saved to Meta
    Data, including such information as, for example, their name, version, author,
    creation time, lineage, and execution statistics (account used to launch it and
    launch time). You should be able at this point to see reference to the package
    we just saved and executed.

Note that when saving to Meta Data Services, you lose the ability to assign
owner and user passwords. While, this makes your packages less secure, you have
the ability to audit their use. In addition, keep in mind that this
functionality is used primarily for troubleshooting more complex data
warehousing scenarios, where keeping track of transformations via DTS Packages
would be otherwise fairly difficult.

»


See All Articles by Columnist
Marcin Policht

Marcin Policht
Marcin Policht
Being a long time reader, I'm proud to join the crowd of technology gurus gathered here. I have a fair share of Microsoft exams behind me, a couple of acronyms after my name - MCSE, MCSD, and MCT, decent familiarity with majority of MS BackOffice products (SMS, SQL, Exchange, IIS), programming and scripting languages (VB, C++, VBScript with wsh) and several years of practical experience with Windows environment administration and engineering. My focus these days is on SQL Server and Windows 2000, and I'll attempt to share the most interesting experiences with these products.

Latest Articles