How DBAs Can Tune Distributed IBM DB2 Applications
July 19, 2010
Many critical business applications now execute in an environment separate from that of the enterprise database server. The database administrator often finds monitoring and performance tuning of these "distributed" applications to be especially difficult. This article looks at common performance issues of distributed applications and presents advice to assist the IBM DB2 database administrator in mitigating performance problems.
The most common performance issues in a distributed environment are caused by excessive or unnecessary locks, poor commit frequency, and inappropriate or volatile data access patterns.
Locking and Data Access: What the DBA Must Know
We love the phone call from the user that begins with the question, "Why is my application slow"? What does slow mean? And the onslaught of questions begins How do we know it's slow? Was it slow yesterday? And so forth. To address these questions with accuracy, having a monitoring tool or set of tools is imperative.
Understand the difference between locks and claims
The DBA needs to understand how and why DB2 uses transaction locks for concurrently control, and how claims are used for SQL serialization control.
Consider (carefully) using uncommitted read access for SELECT statements
It is possible, and sometimes desirable, to SELECT data from a database without acquiring locks on that data. This can be accomplished at the package level with the appropriate Bind option, or on a statement-by-statement basis using the WITH UR parameter. This option allows the application to read while acquiring few locks, but at the risk of reading uncommitted data.
Keep in mind that binding a package with uncommitted read may raise red flags with auditors
This is because using uncommitted read risks accessing data that is not yet committed and may be changed (or deleted!) before the SQL statement finishes. In addition, some applications (such as payroll) need the data to remain fixed and unchanged during the course of the application.
Know the lock sizes of the objects and how application SQL will access them
In a dynamic distributed environment, transaction elapsed time is often a concern. In some cases, to avoid deadlocks and timeouts on some tables the DBA will consider row-level locking rather than page-level locking. The effect of this reduced lock granularity is to minimize the locking footprint of the transaction. However, there are several side-effects, including additional CPU spent on lock management and an increase in the absolute number of locks held by the transaction. These must be taken into account during the database design and application design phases.
Coordinate application transaction execution with required infrastructure jobs
DB2 has evolved to the point where utility execution (copy, reorg, and so forth) rarely contends with application data access. This is primarily due to the Sharelevel Change options added to the utilities, which allows concurrent update to the data during utility execution.
One exception to this occurs during a tablespace reorganization. At the end of the reorganization process, the utility requires access to the table called a drain that prevents new SQL statement execution and waits for existing claims to be released.
As the DB2 for z/OS Administration Guide V8 states,
This means that in order to complete successfully the Reorg utility requires a drain on the table being reorged. This necessitates that applications be designed to commit at appropriate points.
COMMITs and Commit Logic
Determining the correct commit frequency
The COMMIT statement releases most locks and provides a recovery point. Applications in a distributed environment will normally commit logic at the end of one or more transactions, depending upon the purpose of the application. At the database server, commits result in logging of data denoting the end of the transaction, as well as externalizing (i.e., writing to DASD) information necessary for recovery.
Since the commit process involves a certain amount of CPU and I/O, don't commit too frequently. Here's where a monitoring tool that can show you the impact of the commit comes in handy. Monitoring can point out the effects that multiple distributed applications can have upon each other, such as deadlocks and timeouts. It also may indicate other related issues such as concurrent long-running batch programs, which may interfere with application data access.
Using commits to set recovery points
The primary reason for commit processing from the application's point of view is to set a logical point for recovery. In a distributed environment, it is rare for an application to be coded with restart logic (that is, the ability to recover from an in-flight error such as a deadlock or timeout and then re-drive the current transaction from the beginning). It is still possible for fatal errors to occur. In these cases, the application design is predicated on the ability of the database management system to return the database to a consistent state by rolling back recent changes to the previous commit point.
Data Access Patterns
It is important to understand how data volatility affects the database.
A hot spot is a physical place in the database where a lot of data access is happening. The worst situation is when a lot of inserts, updates, deletes, and selects are happening to a set of rows in one location. A flood of inserted rows requires DB2 to find space to store the data, eating up embedded free space. Simultaneous rows updates and selects can lead to deadlocks and timeouts. Hot spots are common in databases accessed by distributed applications.
Usually, hot spots can be predicted (and perhaps avoided) during database design. Take the case of an account table where new rows are assigned new, ascending account numbers. If the table is clustered by account number, the physical end of the table now becomes a hot spot where all new rows will be inserted.
Another example might be a table keyed by date, where all transactions for a particular date range are stored (i.e., clustered) close together.
Some hot spots can be avoided, or at least mitigated, during database design. The DBA can embed free space for rows to be inserted, and can schedule frequent reorgs to re-distribute rows. It may be possible to physically cluster data in ways that spread new row inserts evenly across the table. The DBA will also coordinate with the data owners to ensure that old or unused data is regularly purged or archived.
The requirement for a regular purge or archive process
Tables of ever-increasing size usually result in longer and longer query execution times, hence the need for a purge or data archive process. There are several database design alternatives that may make purging this easier. One common possibility is to partition data such that you can purge by partition. For example, a table keyed by date where the chronologically oldest data is to be purged can be partitioned by date range. The purge process might then be designed to simply empty the oldest partition, rather than execute costly SQL delete statements to remove the rows. Such designs also lend themselves to data archival processes.
Dynamic SQL in a distributed environment presents several performance challenges to the application team and the DBA. Typical bottlenecks tend to be due to delays associated with waiting for locks, access path issues, or other resource constraints.
Application tuning in this environment should concentrate proactively on good database design to avoid hot spots, monitoring to detect problems, regular performance reporting and trend analysis.
A Data-Centric Approach to Distributed Application Architecture