Database management systems (DBMSs) have simultaneously simplified and complicated the lives of many IT workers. Error codes passed from the database back to the application can take on more than 1000 values. What level of error checking should developers include in applications?
Gone are the days of file processing. In the past, it was enough to
check for end-of-file conditions and key errors. Other errors such as device
failure or resource lockups either caused the program to terminate abnormally
or forced the system to suspend the program until the condition was fixed.
Today, the application is expected to handle all of these conditions and more.
Applications accessing DB2 fall
into this category. Error codes passed back to the application can take on more
than 1000 values. How many of these are actually fatal? Which should the
application check?
The Singleton Select
The term singleton select
refers to a SELECT statement embedded in a program. This statement is expected
to retrieve only a single row and is typically found in random or direct
accessing of data (e.g., retrieving a row from the Customer table based upon a
unique customer number).
Possible problems with this
statement stem from differences between the application developer’s
expectations of the data and data model and the definition of the table. What
if the data model or database design changes? Consider a table with a datetime column
as the key that is defined as unique. Application code is written to select
based on this key. Then, at some point, the table definition is changed to
allow duplicate datetime values. What will happen in the application? Where
does the responsibility lie for error checking in this case?
Multi-Row Retrieval
An application can retrieve
rows from a multi-row result set using a cursor. Cursors appear in several
different SQL statements including Open, Fetch, Update, Delete and Close. Some
languages such as java have constructs that can be used to issue all of these
cursor operations automatically. With other languages such as COBOL, it is the
responsibility of the developer to code each of these operations explicitly.
In either case, some part of
the application must handle error conditions that may occur during cursor
operations. Some of the possible conditions that may be encountered are:
1. DB2 is
not active
2. The
application is not authorized to connect to DB2
3. The
application’s plan or package is invalid or disabled
4. One or
more of the tables being accessed are not available
5. One or
more of the result set columns allow Nulls and no indicator variables are
specified
6. FETCH
was specified for a cursor that was not open
7. OPEN was
specified for a cursor that was already open
8. Update
or Delete was specified for a cursor that was not open
9. Deadlock
or Timeout encountered
Note that some of these issues
(such as 1-4 above) may indicate a system, connection, or configuration issue;
however, the application is still in control! How does it respond to these
errors? Who should it notify?
Other errors (6-8) may indicate
application logic issues. This may be possible if application code is changed
without thorough testing.
Error 9 in the above list is
particularly annoying and can occur under several circumstances. I treat this
in a separate section below.
Row Insertion
Application logic for row
insert may encounter some unique issues. These include:
- Capacity
— Table or index out-of-space conditions - Constraint
Violation — Insert of invalid foreign key values, violation of column check
constraints or uniqueness constraints - Lock
escalation — A high volume of Inserts coupled with few (or no) Commits may lead to
DB2 locking the entire table on behalf of the application
Row Update
An application updating a row
may encounter authorization issues, including violation of column or
referential integrity constraints.
Row Delete
Most errors associated with
Delete operations involve referential integrity. When an application attempts
to delete a row from a parent table, the delete rules for the foreign keys in all
descendent tables determine whether the Delete is successful and, if so, what
other rows in other tables will be affected. Regrettably, the application
cannot ascertain how many other rows are affected unless this is included in
the application logic prior to the Delete.
Connectivity Problems
These involve attempts to
attach to DB2, use of the Connect statement in the distributed environment and
distributed processing errors. What should the program do when it encounters
errors of this kind?
Deadlock, Timeout and Resource Problems
All applications accessing DB2
resources should be prepared for deadlock and timeout conditions. In the
typical scenario, the program is several levels deep in if-then-else logic, has
several cursors open, and has processed and perhaps updated several DB2 tables.
When a resource issue occurs, all is undone.
In this case, DB2 detected a
deadlock or timeout condition or an unavailable resource. It is informing the
application that all update activities (Insert, Update and Delete) in the
current transaction have been rolled back. Should the application report this?
Should it re-attempt the transaction from the beginning (and how)?
This can make program logic
really nasty. Must the application put special logic after every SQL statement
to handle each specific deadlock condition? No. Generally, the program need
only be restarted from the beginning of the most recent transaction.
It is the application’s
responsibility to recover from deadlocks and timeouts in a manner that will
restart the current transaction. In some cases, the program can attempt to
analyze the reasons for the allocation failure and report them.
One last problem involves
exceeding installation-specified resource limits. This normally indicates the
application has used excessive CPU time as defined in the resource limit table.
Summary and Best Practices
What level of error checking
should developers include in applications? The answer lies somewhere between
minimal and too much.
Consider a suite of DB2
applications that do only minimal error checking. Exception conditions that
occur in production must now be handled by either users, help desk staff or
on-call personnel. How many of the situations described so far can be
replicated, let alone fixed? Diagnosis of a problem may require turning traces
on, using an on-line monitor or debugging tool, re-creating the application
situation and hoping the error occurs again. Minimal error checking leads to
execution errors, a larger maintenance staff and disgruntled users.
The alternative is to build as
much error-handling logic into applications as possible. This is most easily
accomplished by inclusion of standard error processing code, perhaps with an
optional subroutine to process "fatal" errors. Such a subroutine
would be responsible for analyzing errors, extracting diagnostic data,
correcting problems if possible, and notifying the proper authorities.
Return codes from the
subroutine would inform the application which steps must be taken.
While completeness has its
virtues, there is a practical limit to error checking. Developers will start
tearing their hair out if every other line they code is error-checking logic.
IT shops must develop installation-wide standard for error checking. These
should include a list of error codes or messages that must be checked
after:
- Every
SQL statement - Specific
SQL statements (such as Insert, Delete, etc.)
Standards would include a set
of error-processing routines that would be required, including methods for
programs to route error and diagnostic information to the proper destination.
The bottom line: Create
meaningful standards for error checking and have applications check for as many
errors as possible consistent with those standards.
References:
IBM: IBM DB2 FOR Z/os Information
Center