table and index partitioning – intended to improve the performance of
operations performed on large tables. The basic concept is straightforward and
involves splitting a table into several units (called partitions), which can be
accessed independently of each other, limiting the impact of I/O intensive
activities performed on the table's data (queries, data loads, backups and
restores, maintenance tasks such as index rebuilds and defragmentation, as
well as operations that would otherwise result in lock escalation to the table
level). The most common method of splitting data is horizontal partitioning, in
which rows of a table matching mutually exclusive criteria (such as a range of
dates or letters of the alphabet, for datetime and character data,
respectively) are placed in designated partitions.
The SQL Server 2000 implementation of horizontal
partitioning of tables, known as Updateable Distributed Partitioned Views
(supporting data modification statements, unlike its SQL Server 7.0
predecessor, which was capable of querying data only), was based on creating
identically structured tables across separate databases. Separation of the data
loaded into these tables was enforced by a CHECK constraint associated with one
of their columns. The same partitioned view was then defined on each of the
servers hosting partitioned data (this required creating linked server entries
for each participant). Once the view was created, users could access the
underlying data transparently, without being required to know its physical
location. Updateable Distributed Partitioned Views were subject to several
limitations (outlined in more detail in the SQL Server 2000 Books Online); the
most relevant one resulted from the fact that partitions had to be spread out
across multiple SQL Server instances.
SQL Server 2005 extends the horizontal
partitioning concept by allowing partitions of the same table to be placed
within a single database. In order to fully realize performance and scalability
advantages, such partitions should be assigned to separate filegroups based on
appropriately defined criteria, which are established by partition functions
and associated partition schemes. The partition function determines the
distribution of rows of a table or index across partitions and is applied by
comparing its boundary values against values in a designated column (called the
partitioning column). While the partitioning column will typically contain
numeric or date/time values, other types are also permitted, with the exception
of text, ntext, image, XML, timestamp, varchar(max), nvarchar(max),
varbinary(max), CLR user-defined, and alias data types. A partition
scheme assigns each of the partitions to a specific filegroup. Note that in
order to maximize performance and scalability benefits, it is
important to properly select both the partitioning column and the
partition function associated with it. The choice should depend on data
distribution and the type of activities that are typically performed against
the target table. For example, if data is used to produce monthly reports, then
partitioning based on a date column, using monthly intervals as partition
boundaries, would be a very likely candidate.
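To illustrate this scenario, a monthly partition function might be defined along the following lines (the function name and boundary dates are hypothetical; the RANGE RIGHT option places each boundary value, i.e. the first day of a month, in the partition to its right):
CREATE PARTITION FUNCTION pfMonthly (datetime)
AS RANGE RIGHT FOR VALUES ('20060101', '20060201', '20060301')
This would yield four partitions: rows dated before January 1, 2006, rows for each of January and February 2006, and rows dated March 1, 2006 or later.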
The process of partitioning involves four main
steps. The first, preliminary step consists of creating filegroups and their
corresponding files. During the second step, you define a partition function
using the CREATE PARTITION FUNCTION T-SQL statement, with ranges of values (set
by the FOR VALUES clause) that determine the beginning and end of each
partition (and, consequently, their number). For example, the following
statement:
CREATE PARTITION FUNCTION pfIncome (money)
AS RANGE LEFT FOR VALUES ('$30,000', '$90,000')
is intended to create three
partitions: the first one containing values less than or equal to $30,000
(boundary values belong to the partition on their left, as indicated by the
LEFT keyword in the AS RANGE clause), the second one containing values greater
than $30,000 and less than or equal to $90,000, and the third one encompassing
values greater than $90,000.
In the third step, you create a partition scheme using the CREATE PARTITION
SCHEME T-SQL statement, which associates the partition function with
filegroups, determining the physical layout of data. The association between
partitions and filegroups is typically one-to-one, although other arrangements
are also possible (including specifying an extra, not yet assigned filegroup
for future growth, as illustrated in the second example below). Following
our earlier example, we could use the following statement to create a
partition scheme:
CREATE PARTITION SCHEME psIncome
AS PARTITION pfIncome
TO ('fgIncome1', 'fgIncome2', 'fgIncome3')
This would result in rows within
each range being redirected to their respective filegroups. By specifying a
different sequence of (already existing) filegroups, e.g. TO ('fgIncome1',
'fgIncome1', 'fgIncome2'), you could alter the mapping, allocating the first
two partitions to the first filegroup.
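To accommodate future growth, you could instead list one more filegroup than the number of partitions defined by the function (fgIncome4 in this hypothetical variation); the extra filegroup is not initially mapped to any partition, but is marked as NEXT USED and will host the new partition produced by a subsequent ALTER PARTITION FUNCTION ... SPLIT RANGE operation:
CREATE PARTITION SCHEME psIncome
AS PARTITION pfIncome
TO ('fgIncome1', 'fgIncome2', 'fgIncome3', 'fgIncome4')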
Once partitions are associated with their respective filegroups, you can
proceed with the fourth step, in which you create a partitioned table or
index using the CREATE TABLE or CREATE INDEX statement. The ON clause within
these statements identifies the target partition scheme and the partitioning
column, whose values are compared against the boundaries defined by the
partition function.
For example, the following
statement would result in the creation of a table with the second column
designated as the partitioning one:
CREATE TABLE tbIncome (cSSN char(9), mIncome money)
ON psIncome(mIncome)
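Similarly, an index created with its ON clause pointing to the same partition scheme (the index name below is hypothetical) will be aligned with the table, with each index partition stored in the same filegroup as the corresponding data partition:
CREATE NONCLUSTERED INDEX ixIncome
ON tbIncome (mIncome)
ON psIncome(mIncome)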
Once you start performing operations on the target
table (such as data additions, modifications, or queries), the database engine
transparently redirects them to the appropriate partitions. For more
information on the procedures required to set up or modify data partitioning
(including converting existing non-partitioned tables to partitioned ones and
vice versa), refer to the relevant topics in the Books Online. There you will
also find a more elaborate partitioning example, illustrating this process
with the AdventureWorks sample database included with the installation of SQL
Server 2005.
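As a quick way of verifying the resulting distribution of rows, you can query the $PARTITION function (qualified with the name of the partition function), which returns the number of the partition that a given value of the partitioning column maps to:
SELECT $PARTITION.pfIncome(mIncome) AS PartitionNumber, COUNT(*) AS RowCnt
FROM tbIncome
GROUP BY $PARTITION.pfIncome(mIncome)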
backup media set mirroring – increases the reliability of backups
by performing a concurrent dump of a database to multiple (up to four) backup
devices, without significant impact on performance. This greatly increases the
chances of avoiding backup failures caused by faulty tape media or devices.
Note, however, that mirrored media sets require equivalent backup devices
(in the case of tape drives, it is recommended to use the same hardware
model from the same manufacturer). In order to perform a mirrored backup of a
database, you add the MIRROR TO clause to the BACKUP DATABASE T-SQL
statement, followed by the paths of the DISK or TAPE backup devices. If a
backup media error is detected during a restore, you can replace the faulty
media with its mirror.
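For example, the following statement (with hypothetical disk paths; the FORMAT option is required when creating a new mirrored media set) writes identical copies of the backup to two devices:
BACKUP DATABASE AdventureWorks
TO DISK = 'D:\Backups\AdventureWorks.bak'
MIRROR TO DISK = 'E:\Backups\AdventureWorks.bak'
WITH FORMAT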
Another method of increasing the reliability of
backups involves checksum generation when running BACKUP, which can be enabled
by including the CHECKSUM option (backup is also capable of verifying existing
page checksums and detecting torn pages, assuming that the database PAGE_VERIFY
option is set to CHECKSUM or TORN_PAGE_DETECTION, respectively). With this
option present, the generated checksum is recorded on the backup media and can
subsequently be used to verify the restore operation. Keep in mind, though,
that this feature has negative performance implications (and hence is disabled
by default), so you should evaluate its impact before implementing it. You can
confirm successful completion of backups and media set readability (including
checksum verification, if applicable) by executing RESTORE VERIFYONLY against a
media or backup set. This statement, enhanced in SQL Server 2005, emulates the
restore procedure without actually performing it. Furthermore, provided that
the checksum information was recorded during backup, you can take advantage of
it by including the CHECKSUM option when running RESTORE. A potential problem
with this approach, however, is that even a single error encountered during the
restore will result in its failure. If such behavior is not desired, you can
prevent it by applying the CONTINUE_AFTER_ERROR option, allowing the restore to
complete (while this is likely to leave the database in an inconsistent
state, you can follow it with a manual repair).
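The following sequence (with a hypothetical disk path) illustrates these options, generating checksums during backup, verifying them without performing the restore, and finally restoring despite any detected errors:
BACKUP DATABASE AdventureWorks
TO DISK = 'D:\Backups\AdventureWorks.bak'
WITH CHECKSUM

RESTORE VERIFYONLY
FROM DISK = 'D:\Backups\AdventureWorks.bak'
WITH CHECKSUM

RESTORE DATABASE AdventureWorks
FROM DISK = 'D:\Backups\AdventureWorks.bak'
WITH CHECKSUM, CONTINUE_AFTER_ERROR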
Another backup reliability enhancement introduced
in SQL Server 2005 is the integration of full-text catalogs into database
backups. Full-text catalogs contain full-text indexes, which are used to speed
up keyword-based queries of text data stored in SQL Server databases in char,
varchar, nvarchar, and formatted binary columns. While full-text catalogs are
stored outside of SQL Server databases (as part of the file system), they are
managed by one of the SQL Server 2005 components (the Microsoft Full-Text
Engine for SQL Server service) and administered with SQL Server utilities.
Integrating full-text catalogs into SQL Server backups and restores resolves
the synchronization issues present in SQL Server 2000, where each of these
procedures had to be performed separately. You can also include full-text
catalogs when transferring databases by combining DETACH and ATTACH operations
(which was not possible in SQL Server 2000).