SQL Server 2000 Security – Part 8 – Introduction to Replication Security

So far in our series of articles, we have been discussing relatively generic
SQL Server 2000 security related issues that every database administrator is
likely to encounter on a daily basis. Now it is time to look into less common
scenarios that deal with one specific area of management – database
replication. We will start by presenting its architecture and implementation
options, and follow with their security implications.

The main purpose of replication is data distribution. Placing multiple
copies of the same data at different locations allows you to load balance or
reduce network traffic generated by remote database clients, merge data from multple
sources into one comprehensive target, or separate databases based on their
purpose (e.g. OLTP access can be isolated from read only queries by creating
two data sources, with the read only one being periodically updated). Copying
critical data to a secure location also serves frequently as the basis for
disaster recovery or high availability solutions.

There are different variations of replication, which differ in scope of data
(from stored procedures or user defined functions to multiple tables),
direction of data transfer (one or two directional, with merge functionality
and conflict resolution provisions), propagation method (push or pull), and
update mechanism (more about this shortly). In order to make principles of
replication easier to understand, SQL Server designers decided to apply
terminology derived from the publishing industry when naming its main
components:

  • publisher – SQL Server instance that serves as a source of
    replicated data,

  • distributor -SQL Server instance which manages and monitors data
    flow from a publisher to subscribers, using the distribution database created
    specifically for this purpose,

  • subscriber – SQL Server instance functioning as the target for
    replicated data,

  • article of publication – the smallest unit of replicated data,
    consisting of a table, its subset formed by either selected columns (filtered
    vertically), rows (filtered horizontally), or combination of both, a view,
    user-defined function, or stored procedure.

  • publication – collection of one or more articles grouped in order
    to be replicated together.

Keep in mind that the three architectural elements of replication –
publisher, distributor and subscriber – are, in essence, roles that a
particular SQL Server can serve. These roles are not mutually exclusive – there
are situations where the same system functions as a publisher and a distributor
or as a publisher and a subscriber within the same instance of replication.
Similarly, with multiple replication instances, a system can be a subscriber
for one, and a distributor for another.

There are three types of replication:

  • snapshot – is the simplest type from the conceptual and
    implementation perspective. It involves capturing content of a publication at a
    point in time and copying it to subscribers. This content consists of scripts
    to create database objects included in a publication that are subsequently
    executed on the subscriber, as well as data residing in these objects. Its main
    benefit is simplified maintenance and troubleshooting as well as decreased load
    on replication participants, since they are not responsible for keeping track
    of database changes in between snapshots. On the other hand, lack of
    transactional support is also the main drawback of snapshot replication. Since
    incremental changes to a publisher are not reflected in the subscriber until
    the next snapshot, data between the two tend to be out of synch most of the
    time. In addition, size of replicated data is the same size as the size of the publication,
    every time the snapshot gets replicated – affecting network bandwidth and
    processor utilization across all replication participants. This is the main
    reason that snapshot replication is typically used with small and relatively
    static publications. There is no mechanism to propagate updates on subscribers back
    to the publisher.

  • transactional – overcomes the limitations of snapshot replication
    by keeping track of incremental changes to a publication (taking advantage of
    database transaction logs), propagating them to distributor, which in turn
    distributes them to subscribers, shortly after they take effect (note that
    transactional replication starts with a single snapshot, which provides initial
    synchronization between publisher and subscribers). This ensures that
    subscribers represent an up-to-date state of publisher (degree of
    synchronization between the two is configurable). At the same rate, though,
    this places an additional load on publisher, distributor, and subscribers, and
    increases complexity of configuration and management. This type of replication is
    used in scenarios where data on subscribers needs to remain consistent with the
    original publication. Subscribers are typically treated as read only, although
    there are (relatively limited) configuration options (immediate updating
    subscribers and queued updating subscribers) allowing updates applied to
    subscribers to replicate back to the publisher. In general, though, it is
    recommended to handle updateable subscribers by configuring merge replication
    (presented next).

  • merge – intended for scenarios where content of a publication can
    be updated at multiple locations. Initial snapshot is used to synchronize
    remote subscribers with a publisher, but from this point on, the distinction
    between them becomes less clear. Changes can occur at multiple locations (all
    subscribers and publisher), which introduces a number of issues that need to be
    properly addressed, such as mutual synchronization and conflict resolution. In
    addition, merge replication offers some unique enhancements, such as dynamic
    filtering (which filters replicated data based on parameters provided
    dynamically by subscribers). This type of replication is used in situations
    that involve remote offices or mobile users working with subsets of a
    publication, which are rolled back into the publisher.

Actual work involved in replication is performed by a number of automated
processes, which are implemented as SQL Server Agent jobs, created
automatically when you set up replication via SQL Server Enterprise Manager.
This implies that they run in the security context of the SQL Server Agent
account (on Windows NT and later operating systems) or a logged on user (on
Widows 9x computers). These jobs fall into several distinct categories, known
as replication agents, each based on a separate executable from Program
FilesMicrosoft SQL Server80COM folder or a stored procedure from the
distribution database, and delivering different type of functionality. The
primary ones are as follows:

  • snapshot agent (SNAPSHOT.EXE) – running on distributor, it is
    responsible for creating a snapshot of a publication. The snapshot is then
    copied to a shared folder on distributor, which subsequently is copied to
    subscribers with the help of distribution agent (refer to the next item below).
    This agent is used by every type of replication, since each one involves the creation
    of a snapshot at least once.

  • distribution agent (DISTRIB.EXE) – copies and applies the initial
    snapshot created by the snapshot agent, and, if appropriate, transactional
    changes, prepared by the log reader agent (discussed in the next item below),
    from distributor to subscribers. It is used by snapshot and transactional
    replication. You can configure either one shared agent serving all subscribers
    running on distributor (push subscription) or individual agents operating
    locally on each subscriber (pull subscription).

  • log reader agent (LOGREAD.EXE) – running on distributor, log
    reader agent pulls entries from the transaction log on publisher (the ones
    marked for replication), making them available to the distribution agent. It is
    used only in transactional replication.

  • merge agent (REPLMERG.EXE) – specific to merge replication,
    synchronizes publication content between publisher and its subscribers. Note
    that, in addition to snapshot agent, this is the only one that merge
    replication relies on (providing combined functionality of log reader and
    distribution agents used with transactional replication). Merge agent can
    reside either on a publisher (push subscriptions) or on subscribers (pull
    subscriptions).

  • queue reader agent (QRDRSVC.EXE) – is optional in transactional
    replication (not used for the other two types), applicable only when queued
    updating subscribers in transactional replication are configured (in cases
    where transactional changes to subscribers need to be replicated back to the
    publisher).

There are also several other replication agents whose roles are secondary,
performing mostly cleanup and monitoring tasks, such as agent history clean up
agent, distribution clean up agent, expired subscription clean up agent,
reinitialize subscriptions having data validation failures agent, or
replication agent checkup agent, which are not relevant to our security-focused
discussion, so we will not go beyond just listing them here.

This introductory information will serve as the basis for a discussion on
replication security, in our next article.

»


See All Articles by Columnist
Marcin Policht

Marcin Policht
Marcin Policht
Being a long time reader, I'm proud to join the crowd of technology gurus gathered here. I have a fair share of Microsoft exams behind me, a couple of acronyms after my name - MCSE, MCSD, and MCT, decent familiarity with majority of MS BackOffice products (SMS, SQL, Exchange, IIS), programming and scripting languages (VB, C++, VBScript with wsh) and several years of practical experience with Windows environment administration and engineering. My focus these days is on SQL Server and Windows 2000, and I'll attempt to share the most interesting experiences with these products.
Previous articleOracle’s Wrap Utility
Next articlePadding

Latest Articles