The number of applications running dynamic SQL is ever-increasing. Without
some method of containing the onslaught of queries, performance drops and
phone calls increase. What can a DBA do to remain sane and avoid a visit from
the head of IT?
This article addresses common performance issues encountered in applications
that use dynamic SQL, and how to avoid them. While the primary focus is
distributed applications that access IBM DB2 for z/OS on the mainframe, most
of these bottlenecks and their solutions arise with other DBMSs as well.
Dynamic SQL and the Distributed Environment
The mainframe-centric DBA tends to think
of a distributed application as one executing on an application server (usually
in the enterprise’s intranet) that accesses DB2 data on the mainframe. The
application itself is usually coded in Java, although other languages and
environments also exist.
Access to DB2 data can be accomplished
in several ways. These different methods vary in their performance
characteristics, maintainability, and security, and are usually a function of
whether the application is built in-house or purchased. In this article, I will
focus on distributed applications that mainly execute dynamic SQL.
Tuning Strategies
Common tuning strategies available to
the DBA include:
- Application tuning focuses attention on suites of programs that access particular tables, batch windows, units of work, and online service levels. This strategy concentrates on meeting Service-Level Agreements (SLAs) through analysis of transaction units of work, locking effects, commit frequency and transaction design, and data access paths.
- Object tuning concentrates on the general definitions and configurations of tables, indexes, stored procedures, and other database objects.
- SQL tuning involves SQL review and knowledge of potential access paths, table and index data distribution, and statistics. Here, the emphasis is on managing the SQL creation process, creating an inventory of existing (and planned) SQL and access paths, and proactively addressing potential issues.
- Resource constraint tuning is an analysis of possible trade-offs among CPU, elapsed time, I/O, memory, network traffic, and other resources.
- System tuning deals with the IBM DB2 DBMS infrastructure that supports the previously mentioned strategies.
In a distributed environment,
application tuning is the best strategy. The other strategies tend to require
in-depth technical knowledge or advanced performance reporting tools.
Being Proactive About Monitoring
The DBA (or other responsible party)
should set up specific active and historical performance monitoring reports for
each application. In some cases, this should include detail down to specific
SQL statements. Several standard reports are extremely helpful. These top-N
reports contain lists of applications and SQL statements (by environment) that
consume large amounts of resources:
- Worst CPU users
- High synchronous I/O waits
- High number of GetPages
- High number of locks
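On DB2 for z/OS, one convenient raw source for these reports is the dynamic statement cache. A minimal sketch, assuming the cache statistics have been externalized to the V9 explain table DSN_STATEMENT_CACHE_TABLE (verify the column names against your release):

    -- Externalize the statement cache, then list the top ten
    -- statements by accumulated CPU time.
    EXPLAIN STMTCACHE ALL;

    SELECT STMT_ID,
           STAT_EXEC AS EXECUTIONS,
           STAT_CPU AS CPU_SECONDS,
           STAT_GPAG AS GETPAGES,
           STAT_SYNR AS SYNC_READS,
           SUBSTR(STMT_TEXT, 1, 100) AS STMT_FRAGMENT
      FROM DSN_STATEMENT_CACHE_TABLE
      ORDER BY STAT_CPU DESC
      FETCH FIRST 10 ROWS ONLY;

Ordering by STAT_SYNR, STAT_GPAG, or the lock-related suspension counters instead yields the other three top-N reports.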
Simply executing the reports isn’t
enough. They must be relevant to a business unit that has a vested interest in
performance, they should be regularly reviewed, and the process that produces
them may need to be updated.
For example, you may find that some of
your SQL statements responsible for a large number of I/Os are already
optimally tuned. This is common for analytical queries, or those associated
with a data warehouse. Queries that are statistically different from the
others (or whose behavior changes significantly over time) need to be
considered separately.
On the other hand, a query that suddenly
shows up as long-running raises the question: Why did this happen? The query
that used to run for two seconds and now runs for two hours should be assigned
to the programming group for analysis. The query then needs to be documented
or marked in some way so it doesn't reappear on the report.
The next step is to regularly execute
these reports and store the results in a repository (such as a database) where
they can be queried. Make this information easily available to the business
unit so they can investigate problems as they crop up. This allows for
convenient historical analysis, especially if you’re making multiple
performance enhancements and want to correlate their effects.
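The repository need not be elaborate. A sketch of one possible shape, with hypothetical table and column names:

    -- Hypothetical history table for nightly top-N report rows.
    CREATE TABLE PERF.TOPN_HISTORY
      (REPORT_DATE  DATE          NOT NULL,
       REPORT_TYPE  CHAR(8)       NOT NULL,  -- e.g., CPU, SYNCIO, GETPAGES, LOCKS
       ENVIRONMENT  CHAR(8)       NOT NULL,  -- e.g., PROD, TEST
       STMT_TEXT    VARCHAR(4000) NOT NULL,
       METRIC_VALUE DECIMAL(15,2) NOT NULL);

    -- The business unit can then trend a given statement over time:
    SELECT REPORT_DATE, METRIC_VALUE
      FROM PERF.TOPN_HISTORY
      WHERE REPORT_TYPE = 'CPU'
        AND STMT_TEXT LIKE 'SELECT%ACCOUNT%'
      ORDER BY REPORT_DATE;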
Application Tuning for Dynamic SQL
The most common performance-related issues
encountered using dynamic SQL are an overabundance of locks and poor data
access patterns. Here are some points to ponder.
Understand the difference between locks
and claims. The DBA needs to understand how and why DB2 uses transaction locks
for concurrency control and how claims are used for SQL serialization control.
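One quick way to see both forms of serialization on an object is the DISPLAY DATABASE command; a sketch, assuming a database DBACCT and table space TSACCT:

    -DISPLAY DATABASE(DBACCT) SPACENAM(TSACCT) LOCKS
    -DISPLAY DATABASE(DBACCT) SPACENAM(TSACCT) CLAIMERS

The first form shows the transaction locks held on the table space; the second shows the current claimers that a drain would have to wait out.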
Carefully consider using uncommitted
read access for SELECT statements. It is possible, and sometimes desirable, to
SELECT data from a database without acquiring locks on that data. This can be
accomplished at the package level with the appropriate bind option, or on a
statement-by-statement basis using the WITH UR clause. This option lets the
application read while acquiring few locks, but at the risk of reading
uncommitted data.
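A minimal sketch of the statement-level form (table and column names are illustrative); the package-level equivalent is binding with ISOLATION(UR):

    -- Read balances without acquiring row or page locks.
    -- Rows changed by in-flight transactions may be returned.
    SELECT ACCOUNT_NO, BALANCE
      FROM ACCT.ACCOUNT
      WHERE BRANCH_ID = ?
      WITH UR;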
Know the lock sizes of the objects and
how application SQL will access them. In a distributed environment,
transaction elapsed time is often a concern. To avoid deadlocks and timeouts
on some tables, the DBA will consider row-level locking rather than page-level
locking.
The effect of this reduced lock granularity is to minimize the transaction
"locking footprint." However, there are several side effects,
including additional CPU spent on lock management and an increase in the
absolute number of locks the transaction holds. These must be taken into
account during database design and application design phases.
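In DB2 for z/OS, lock size is a table space attribute. A sketch, with illustrative object names:

    -- Switch from page-level to row-level locking. This shrinks the
    -- locking footprint but raises the lock count and the CPU spent
    -- on lock management.
    ALTER TABLESPACE DBACCT.TSACCT
      LOCKSIZE ROW;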
Coordinate application transaction
execution with required infrastructure jobs. DB2 has evolved to the point where
utility execution (copy, reorg, and so forth) rarely contends with application
data access. This is primarily due to the SHRLEVEL CHANGE option added to
the utilities, which allows concurrent updates to the data during utility
execution. One exception occurs during a table space reorganization: at the
end of the reorganization process, the utility requires a form of restricted
access to the table, called a drain, that prevents new SQL statement execution
and waits for existing claims to be released.
As the DB2 V9 for z/OS Administration
Guide states: "The drain quiesces the applications by allowing each
one to reach a commit point, but preventing any of them, or any other
applications, from making a new claim. When no more claims exist, the process
that drains (the drainer) controls access to the drained object. The applications
that were drained can still hold transaction locks on the drained object, but
they can’t make new claims until the drainer has finished."
This means that to complete
successfully, the REORG utility requires a drain on the table being
reorganized. This necessitates that applications be designed to commit at
appropriate points.
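On the utility side, the impact of the final drain can be bounded. A sketch of a REORG control statement, with illustrative values for the drain-related parameters:

    REORG TABLESPACE DBACCT.TSACCT
      SHRLEVEL CHANGE
      -- Give up a drain attempt after 30 seconds, retry up to six
      -- times, and wait 120 seconds between attempts.
      DRAIN_WAIT 30
      RETRY 6
      RETRY_DELAY 120

If applications commit frequently, the drain succeeds quickly; if they do not, these limits cap how long new SQL is held up by a drain attempt.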
Data Access and Its Effects
It’s important to understand how data
volatility affects the database. The "hot spot" is a physical place
in the database where a lot of data access is happening. The worst situation is
when a lot of inserts, updates, deletes, and selects are happening to a set of
rows in one location. A flood of inserted rows requires DB2 to find space to
store the data, eating up embedded free space. Simultaneous updates and selects
to rows can lead to deadlocks and timeouts. Hot spots are common in databases
accessed in a distributed environment.
Usually, hot spots can be predicted (and
perhaps avoided) during database design. Take the case of an account table
where new rows are assigned new, ascending account numbers. If the table is
clustered by account number, the physical "end" of the table now
becomes a hot spot where all new rows will be inserted.
Another example might be a table keyed
by date, where all transactions for a particular date range are clustered. Some
hot spots can be avoided, or at least mitigated, during database design. The
DBA can embed free space for rows to be inserted, and can schedule frequent
reorgs to redistribute rows. It may be possible to physically cluster data in
ways that spread new row inserts evenly across the table. The DBA also can
coordinate with the data owners to ensure old or unused data is regularly
purged or archived.
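For the free-space piece, a minimal sketch at the table space level (values illustrative; the new free space takes effect at the next REORG or LOAD):

    -- Leave 20 percent of each page free, plus one empty page after
    -- every five, so inserts can land near their cluster position.
    ALTER TABLESPACE DBACCT.TSACCT
      PCTFREE 20
      FREEPAGE 5;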
Tables of ever-increasing size usually
result in longer query execution times; this drives the need for a purge or
data archive process. There are several database design alternatives that may
make this easier. One common possibility is to partition data so you can purge
by partition. For example, a table keyed by date where the chronologically
oldest data is to be purged can be partitioned by date range. The purge process
might then be designed to simply empty the oldest partition, rather than
execute costly SQL delete statements to remove the rows. Such designs also lend
themselves to data archival processes.
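A sketch of such a design using V9 table-controlled partitioning (names and ranges illustrative). The oldest partition can be emptied with a LOAD REPLACE of just that partition using an empty input data set, or reused for new data with ROTATE PARTITION, rather than deleting rows one by one:

    CREATE TABLE ACCT.TRAN_HISTORY
      (TRAN_DATE  DATE          NOT NULL,
       ACCOUNT_NO INTEGER       NOT NULL,
       AMOUNT     DECIMAL(11,2) NOT NULL)
      PARTITION BY RANGE (TRAN_DATE)
        (PARTITION 1 ENDING AT ('2010-03-31'),
         PARTITION 2 ENDING AT ('2010-06-30'),
         PARTITION 3 ENDING AT ('2010-09-30'),
         PARTITION 4 ENDING AT ('2010-12-31'));

    -- Reuse the oldest partition for the next quarter instead of
    -- running costly SQL deletes.
    ALTER TABLE ACCT.TRAN_HISTORY
      ROTATE PARTITION FIRST TO LAST
      ENDING AT ('2011-03-31') RESET;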
Summary
Dynamic SQL in a distributed environment
presents several performance challenges to the application team and DBA.
Typical issues involve delays waiting for locks, access path problems, or
other resource constraints. Application tuning in the
distributed environment should proactively concentrate on good database design
to avoid hot spots, monitoring to detect problems, regular performance
reporting, and trend analysis.
Source(s)
IBM DB2 V9 for z/OS Administration Guide, IBM, March 2010.
IBM DB2 V9 for z/OS Performance Monitoring and Tuning Guide, IBM, March 2010.
IBM DB2 for z/OS and OS/390: Squeezing the Most Out of Dynamic SQL, IBM, May 2002.