Many critical business applications now execute in an environment separate from that of the enterprise database server. The database administrator often finds monitoring and performance tuning of these “distributed” applications to be especially difficult. This article looks at common performance issues of distributed applications and presents advice to assist the IBM DB2 database administrator in mitigating performance problems.
The most common performance issues
in a distributed environment are caused by excessive or unnecessary locks, poor
commit frequency, and inappropriate or volatile data access patterns.
Locking and Data Access: What the DBA Must Know
We love the phone call from the
user that begins with the question, "Why is my application slow"?
What does slow mean? And the onslaught of questions begins … How do we know
it’s slow? Was it slow yesterday? And so forth. To address these questions with
accuracy, having a monitoring tool or set of tools is imperative.
Understand the difference between locks and claims
The DBA needs to understand how and why DB2 uses
transaction locks for concurrently control, and how claims are used for SQL
Consider (carefully) using uncommitted read access for SELECT statements
It is possible, and sometimes
desirable, to SELECT data from a database without acquiring locks on that data.
This can be accomplished at the package level with the appropriate Bind option,
or on a statement-by-statement basis using the WITH UR parameter. This option
allows the application to read while acquiring few locks, but at the risk of
reading uncommitted data.
Keep in mind that binding a package with uncommitted read may raise red flags with auditors
because using uncommitted read risks accessing data that is not yet committed
and may be changed (or deleted!) before the SQL statement finishes. In
addition, some applications (such as payroll) need the data to remain fixed and
unchanged during the course of the application.
Know the lock sizes of the objects and how application SQL will access them
In a dynamic distributed
environment, transaction elapsed time is often a concern. In some cases, to
avoid deadlocks and timeouts on some tables the DBA will consider row-level
locking rather than page-level locking. The effect of this reduced lock
granularity is to minimize the locking footprint of the transaction. However,
there are several side-effects, including additional CPU spent on lock
management and an increase in the absolute number of locks held by the
transaction. These must be taken into account during the database design and
application design phases.
Coordinate application transaction execution with required infrastructure jobs
DB2 has evolved to the point
where utility execution (copy, reorg, and so forth) rarely contends with
application data access. This is primarily due to the Sharelevel Change options
added to the utilities, which allows concurrent update to the data during
One exception to this occurs
during a tablespace reorganization. At the end of the reorganization process,
the utility requires access to the table called a drain that prevents new SQL
statement execution and waits for existing claims to be released.
As the DB2 for z/OS
Administration Guide V8 states,
drain quiesces the applications by allowing each one to reach a commit point,
but preventing any of them, or any other applications, from making a new claim.
When no more claims exist, the process that drains (the drainer) controls
access to the drained object. The applications that were drained can still hold
transaction locks on the drained object, but they cannot make new claims until
the drainer has finished."
This means that in order to
complete successfully the Reorg utility requires a drain on the table being
reorged. This necessitates that applications be designed to commit at
COMMITs and Commit Logic
Determining the correct commit frequency
The COMMIT statement releases
most locks and provides a recovery point. Applications in a distributed
environment will normally commit logic at the end of one or more transactions,
depending upon the purpose of the application. At the database server, commits
result in logging of data denoting the end of the transaction, as well as
externalizing (i.e., writing to DASD) information necessary for recovery.
Since the commit process
involves a certain amount of CPU and I/O, don’t commit too frequently. Here’s
where a monitoring tool that can show you the impact of the commit comes in
handy. Monitoring can point out the effects that multiple distributed
applications can have upon each other, such as deadlocks and timeouts. It also
may indicate other related issues such as concurrent long-running batch
programs, which may interfere with application data access.
Using commits to set recovery points
The primary reason for commit
processing from the application’s point of view is to set a logical point for
recovery. In a distributed environment, it is rare for an application to be
coded with restart logic (that is, the ability to recover from an in-flight
error such as a deadlock or timeout and then re-drive the current transaction
from the beginning). It is still possible for fatal errors to occur. In these cases,
the application design is predicated on the ability of the database management
system to return the database to a consistent state by rolling back recent
changes to the previous commit point.
Data Access Patterns
It is important to understand
how data volatility affects the database.
A hot spot is a physical place
in the database where a lot of data access is happening. The worst situation is
when a lot of inserts, updates, deletes, and selects are happening to a set of
rows in one location. A flood of inserted rows requires DB2 to find space to
store the data, eating up embedded free space. Simultaneous rows updates and
selects can lead to deadlocks and timeouts. Hot spots are common in databases
accessed by distributed applications.
Usually, hot spots can be
predicted (and perhaps avoided) during database design. Take the case of an
account table where new rows are assigned new, ascending account numbers. If
the table is clustered by account number, the physical end of the table now
becomes a hot spot where all new rows will be inserted.
Another example might be a
table keyed by date, where all transactions for a particular date range are
stored (i.e., clustered) close together.
Some hot spots can be avoided,
or at least mitigated, during database design. The DBA can embed free space for
rows to be inserted, and can schedule frequent reorgs to re-distribute rows. It
may be possible to physically cluster data in ways that spread new row inserts
evenly across the table. The DBA will also coordinate with the data owners to
ensure that old or unused data is regularly purged or archived.
The requirement for a regular purge or archive process
Tables of ever-increasing size
usually result in longer and longer query execution times, hence the need for a
purge or data archive process. There are several database design alternatives
that may make purging this easier. One common possibility is to partition data
such that you can purge by partition. For example, a table keyed by date where
the chronologically oldest data is to be purged can be partitioned by date
range. The purge process might then be designed to simply empty the oldest
partition, rather than execute costly SQL delete statements to remove the rows.
Such designs also lend themselves to data archival processes.
Dynamic SQL in a distributed
environment presents several performance challenges to the application team and
the DBA. Typical bottlenecks tend to be due to delays associated with waiting
for locks, access path issues, or other resource constraints.
Application tuning in this
environment should concentrate proactively on good database design to avoid hot
spots, monitoring to detect problems, regular performance reporting and trend