SQL Server 2005 Part 5 – High Availability and Scalability Enhancements

In this article, we continue our discussion on high availability and
scalability enhancements in SQL Server 2005 Beta 2 release. Features such as
clustering, database mirroring, online indexing, fast recovery, database
snapshots, and snapshot isolation transaction level have already been covered
earlier in this series – now we will concentrate on the remaining features,
such as a new method of table and index partitioning, backup and restore
improvements, and new hardware support options.

  •  
    table and index partitioning – intended to improve performance of
    operations performed on large tables. The basic concept is straightforward and
    involves splitting a table into several units (called partitions), which can be
    accessed independently of each other, limiting impact of I/O intensive
    activities performed on the table’s data (queries, data loads, backups and
    restores, maintenance tasks – such as index rebuilds and defragmentations, as
    well as operations that would result in lock escalation to the table level). The
    most common method of splitting data is horizontal partitioning, in which rows
    of a table matching mutually exclusive criteria (such as, range of dates or
    letters in alphabet, for datetime and character data, respectively) are placed
    in designated partitions.

    SQL Server 2000 implementation of horizontal
    partitioning of tables, known as Updateable Distributed Partitioned Views
    (supporting Data Modification statements, unlike its SQL Server 7.0
    predecessor, capable of querying data only), was based on creating identically
    structured tables across separate databases. Separation of data loaded into
    tables was based on a CHECK constraint associated with one of their columns.
    The same partitioned view was then defined on each of the servers hosting
    partitioned data (this required creating linked server entries for each
    participant). Once the view was created, users could access underlying data
    transparently, without being required to know its physical location. Updateable
    Distributed Partitioned Views were a subject of several limitations (outlined
    in more details in the SQL Server 2000 Books Online), however the most relevant
    one was resulting from the fact that partitions had to be spread out across
    multiple SQL Server instances.

    SQL Server 2005 extended the horizontal
    partitioning concept, by allowing placing partitions of the same table within a
    single database. In order to fully realize performance and scalability
    advantages, such partitions should be assigned to separate filegroups based on
    appropriately defined criteria, which are established by partitioning functions
    and associated partitioning schemes. The partition function determines the distribution
    of rows of a table or index across partitions and is applied by comparing its
    formula against values in a designated column (called the partitioning column).
    While the partitioning column will typically contain numeric or date/time
    values, others are also permitted – with exception of timestamp, text, image,
    XML, varchar(max), varbinary(max), CLR user-defined, and alias data types. A partitioning
    scheme assigns each of the partitions to a specific filegroup. Note that in
    order to maximize performance and scalability benefits it is
    important to both properly select the partitioning column and define the
    partitioning function associated with it. The choice should depend on data
    distribution and the type of activities that are typically performed against
    the target table. For example, if data is used to produce monthly reports, then
    partitioning based on the date column using monthly intervals as partition
    boundaries would be a very likely candidate.

    The process of partitioning involves four main
    steps. The first, preliminary step consists of creating filegroups and their
    corresponding files. During the second step, you define a partition function
    using the CREATE PARTITION FUNCTION T-SQL statement, with ranges of values (set
    by the FOR VALUES clause) that determine the beginning and end of each
    partition (and subsequently, their number). For example, the following
    statement:

    CREATE PARTITION FUNCTION pfIncome (money)
    AS RANGE LEFT FOR VALUES ('$30,000', '$90,000')

    is intended to create three
    partitions – the first one, containing values less then or equal to $30,000
    (inclusive, based on the LEFT element of the AS RANGE clause), the second one
    between $30,000 (exclusive) and $90,000 (inclusive), and the third one encompassing
    values of $90,000 and higher.

    In the third step, you create a partition scheme using the CREATE PARTITION
    SCHEME T-SQL statement, which associates partition function to filegroups,
    determining physical layout of data. The association between partitions and
    filegroups is typically one to one, although other arrangements are also possible
    (including specifying extra unassigned partitions for future growth). Following
    our earlier example, we could use the following statement to create a
    partitioning scheme:

    CREATE PARTITION SCHEME psIncome
    AS PARTITION pfIncome
    TO ('fgIncome1', 'fgIncome2', 'fgIncome3')

    This would result in rows within
    each range being redirected into their respective filegroups. By specifying a different
    sequence of (already existing) filegroups – e.g. by specifying TO ('fgIncome1', 'fgIncome1', 'fgIncome2')
    – you could alter the mapping, allocating the first two partitions to the first
    filegroup.

    Once partitions are associated with their respective filegroups, you can
    proceed with the fourth step, in which you create a partitioned table or
    index using CREATE TABLE or CREATE INDEX statements. The ON clause within
    these statements identifies the target partition scheme and partitioning
    column, whose values are compared against the definition of the
    partitioning function.

    For example, the following
    statement would result in the creation of a table with the second column
    designated as the partitioning one:


    CREATE TABLE tbIncome (cSSN char(9), mIncome money)
    ON psIncome(mIncome)

    Once you start performing operations on the target
    table (such as data additions, modifications, or queries), the database engine
    will transparently redirect them to appropriate partitions. For more
    information on the procedure required to set up or modify data partitioning
    (including converting existing non-partitioned tables to partitioned ones and
    vice versa), refer to the relevant topics in the Books Online. You will also
    find there, a more elaborate partitioning example, illustrating this process
    with the AdventureWorks sample database included with the installation of SQL
    Server 2005.

  •  
    backup media set mirroring – increases the reliability of backups
    by performing a concurrent dump of a database to multiple (up to four) backup
    devices, without significant impact on performance. This greatly increases the
    chances for avoiding backup failures due to a faulty tape media or device.
    Note, however, that for mirrored media sets, equivalent backup devices are
    needed (in the case of tape drives, it is recommended to use the same hardware
    models from the same manufacturer). In order to perform mirrored backup of a
    database, you need to add the MIRROR TO clause to the BACKUP DATABASE T-SQL
    statement, followed by the paths of the DISK or TAPE backup devices. In case of
    a backup media error detected during the restore, you can replace the faulty
    one with its mirror.

    Another method increasing the reliability of
    backups involves checksum generation when running BACKUP, which can be enabled
    by including the CHECKSUM clause (backup is also capable of verifying existing
    checksums and detecting torn pages, assuming that the PAGE_VERIFY CHECKSUM and
    TORN_PAGE_DETECTION database options are enabled). With this clause present, the
    generated checksum is recorded on the backup media that can be subsequently
    used to verify the restore operation. Keep in mind, though, that this feature
    introduces negative performance implications (and hence it is disabled by
    default), so you should evaluate its impact before implementing it. You can
    confirm successful completion of backups and media set readability (including
    checksum, if applicable) by executing RESTORE VERIFYONLY against a media or
    backup set. This statement, enhanced in SQL Server 2005, emulates the restore
    procedure without actually performing it. Furthermore, providing that the
    checksum information has been recorded during backup, you can take advantage of
    it by using the CHECKSUM clause when running RESTORE. However, a potential
    problem with this approach is the fact that even a single error encountered
    during restore will result in its failure. If such behavior is not desired, you
    can prevent it by applying the CONTINUE_AFTER_ERROR option, allowing the
    restore to complete (while this is likely to leave a database in inconsistent
    state, you can follow it with a manual repair).

    Another backup reliability enhancement introduced
    in SQL Server 2005 is integration of full-text catalogs into your database
    backups. Full-text catalogs contain full-text indexes, which are used to speed
    up keyword-based queries of text data stored in SQL Server databases in char, varchar,
    nvarchar, and formatted binary columns. While full-text search catalogs are
    stored outside of SQL Server (as part of the file system), they are managed by one
    of SQL Server 2005 components (Microsoft Full-Text for SQL Server service) and
    administered with SQL Server utilities. Integrating full-text search catalogs
    into SQL Server backups and restores resolves the synchronization issues
    present in SQL Server 2000, where each of these procedures had to be performed
    separately. You can also include full-text catalogs when transferring databases
    by combining DETACH and ATTACH operations (this was not the case in SQL Server
    2000).

  •  
    online restore – increases availability by allowing the target
    database to remain online for its duration. Online restore functionality is
    available only in the case of Enterprise and Developer editions of SQL Server
    2005 (at the Beta 2 stage) and applies only to configurations where the
    database consists of multiple filegroups. It can be used when restoring files,
    pages, or when running piecemeal restore (performed in stages, by filegroups,
    starting with the restore of the primary one). Specifics of this process depend
    on a number of circumstances, such as, recovery mode, availability of
    transaction logs necessary to bring database to consistent state, and read-only
    status of filegroups that are to be restored. The online restore feature is useful
    in situations following a failure of a drive hosting an individual filegroup.
    Database administrators might be able to maintain a database on-line, while
    restoring this filegroup only, without impacting access to data residing in
    other filegroups.

  •  
    support for highly available and scalable hardware (since SQL
    Server 2005 relies here on functionality built into the Windows 2003 Server and
    specialized server hardware, you need to have both in place in order to take
    advantage of the features listed below):

    •  
      64-bit version availability – on both Intel (64-bit Itanium and
      Xeon EM64T) and AMD (Opteron with Direct Connect Architecture) platforms, when
      running 64-bit version of Windows 2003 Server – increases considerably the
      amount of supportable memory (up to the theoretical limit of 32TB), eliminating
      4 GB limit inherent to 32-bit platform (although it is possible to push this
      limit to 32GB of RAM by taking advantage of Advanced Windowing Extensions in
      32-bit versions of Windows 2000 and Windows 2003 Servers) and makes it
      comparable with other database management systems operating in 64-bit computing
      environments. The new version includes 64-bit versions of Analysis Services and
      Integration Services, according to the recently released Community Technology
      Preview. SQL Server 2005 has also been designed to operate more efficiently in
      the Non-Uniform Memory Architecture (NUMA) systems.

    •  
      hot-plug memory – takes advantage of the functionality present in
      Windows 2003 Server Enterprise and Datacenter Editions (providing that underlying
      hardware supports it as well) which makes it possible to add (but not remove)
      physical memory while the operating system is running. SQL Server 2005 is
      capable of detecting the change dynamically (previous versions of SQL Server
      could only recognize the amount of memory present at the startup). In order for
      this feature to be available, the SQL Server 2005 needs to have Address
      Windowing Extensions activated (using the awe enabled server
      configuration option) and -h option specified at the startup.

  •  
    increased number of instances supported on a per server basis –
    the number has been raised to 50 with the Enterprise Edition of SQL Server 2005
    (from 16 in SQL Server 2000).

  •  
    reduced number of restarts required for changing configuration
    options – it is no longer necessary to restart SQL server to perform such
    modifications as adjusting CPU affinity or to perform changes to the AWE
    configuration (including, as described earlier, recognizing newly added
    memory).

This concludes our review of high availability and scalability enhancements
in SQL Server 2005 Beta 2. In our next article, we will look into new security
related features.

»


See All Articles by Columnist
Marcin Policht

Marcin Policht
Marcin Policht
Being a long time reader, I'm proud to join the crowd of technology gurus gathered here. I have a fair share of Microsoft exams behind me, a couple of acronyms after my name - MCSE, MCSD, and MCT, decent familiarity with majority of MS BackOffice products (SMS, SQL, Exchange, IIS), programming and scripting languages (VB, C++, VBScript with wsh) and several years of practical experience with Windows environment administration and engineering. My focus these days is on SQL Server and Windows 2000, and I'll attempt to share the most interesting experiences with these products.

Latest Articles