Tuning Your ERP System for DB2
March 28, 2011
Most ERP package vendors provide product tuning recommendations, usually in the form of white papers or documentation downloadable from their website. In addition, there are many user groups and conferences (like the International DB2 Users Group) where presentations on tuning are given.
I will not cover all of these issues in this article; you should research them on your own. Instead, I will give you a starting point for developing your own tuning process.
In an ERP package environment, the two most common reasons for "tuning" are poor application behavior and poor system behavior.
Since these reasons are both classified as behaviors, most tuning efforts tend to concentrate on application-related behaviors. The primary culprits are assumed to be poorly-written SQL statements, poor index design, and lack of good data distribution statistics. While these may be good places to start your investigation, remember that you want to determine and fix the underlying cause of the behavior, not just the behavior itself.
The starting point for tuning poor application behavior is usually "problem" SQL statements and index construction and selection. There are many good references and guides available for beginning this process, both from vendors and consultants. Meanwhile, let's look at the system-related behaviors.
ERP Package System Behavior
For most ERP packages, tuning issues typically revolve around two things: network and n-tier configuration, and DASD I/O performance.
For the first of these, the most common issues revolve around n-tier hardware and software configuration. To address them you must measure and monitor network flows and document and confirm proper configurations. This typically involves things like TCP/IP tuning, DB2Connect parameter setting, and the like. Regrettably, much of this will be beyond your control.
To do proper DASD I/O tuning, you must be aware of your DASD configuration. Many shops have moved away from standard "round, brown, and spinning" hardware to RAID configurations. We won't cover specific technology tactics here; instead, we will concentrate on what to measure.
As stated before, ERP packages are usually I/O-bound. Further, much of the I/O time is spent waiting for synchronous reads. These are reads issued by DB2 on behalf of an ERP process, usually a SQL statement, during which the process must wait for the read to complete. This is in contrast to asynchronous reads, such as prefetch, that can execute independently of the SQL statement or process.
Reducing I/O Wait Times
Much of your initial ERP package tuning efforts will be concentrated on reducing waits for synchronous reads. The most common reason for such a wait is a SQL statement that requires access to a table in a non-clustering sequence (usually through an index that has a low ClusterRatio). Thus, reads to the table are random, hopping from one page to another across the tablespace. While this application-related behavior can usually be fixed (perhaps by re-clustering, access path tuning, index changes, etc.) we first need to know that it is happening.
The best source for synchronous I/O wait times is the SMF records produced by DB2. (While it is possible to get a general idea of DASD I/O times from such things as RMF, these tools don't really give an accurate picture of DB2 I/O. This is because the DB2 read and write engines function independently of SQL and other processes. This means that synchronous and asynchronous I/Os will be combined in any reporting.) There are many tools available for gathering this data. The most common seem to be the Accounting Detail and Statistics reports available with the IBM DB2 Performance Monitor (DB2PM). Again, there are also several third-party tools that can produce many of the equivalent reports.
Begin your tuning efforts with the Accounting Detail report, and gather measurements of Class 3 Sync I/O Wait Time and Sync Read I/Os. Take the ratio of wait time to I/Os. The rule of thumb, according to the DB2 Administration Guide, is that less than 40 ms per I/O is good.
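The arithmetic is simple enough to sketch. The figures below are illustrative, not taken from a real report; substitute the Class 3 wait time and sync read I/O counts from your own Accounting Detail report.

```python
# Sketch: average synchronous I/O wait from Accounting Detail figures.
# Input values here are made up for illustration.

def avg_sync_io_wait_ms(class3_sync_wait_secs: float, sync_read_ios: int) -> float:
    """Average wait per synchronous read I/O, in milliseconds."""
    if sync_read_ios == 0:
        return 0.0
    return class3_sync_wait_secs * 1000.0 / sync_read_ios

# Example: 180 seconds of Class 3 sync I/O wait over 6,000 sync read I/Os.
avg_wait = avg_sync_io_wait_ms(180.0, 6000)
print(f"{avg_wait:.1f} ms per I/O")   # prints 30.0 ms per I/O -- under the 40 ms guideline
print("OK" if avg_wait < 40.0 else "investigate DASD response times")
```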
If you are experiencing longer average wait times, you don't have many choices at the system level. Typical responses are listed below.
Reallocate Virtual Pools to a larger size
The problem may lie in the DASD subsystem. A bigger buffer pool may allow page residency time to increase. You can also consider allocating Hiperpools or Data Spaces for the virtual pools, as these will reside outside the DBM1 address space. Have your operating systems group monitor system paging.
Reduce dynamic prefetch activity
Consider more frequent commits, and use OPTIMIZE FOR 1 ROW in certain SQL statements to discourage prefetch-oriented access paths.
Research dataset placement
In a RAID environment, it may no longer make sense to "physically separate" tablespaces from indexes, since these devices simulate MVS volumes at a logical level. However, some hardware implementations have the capability to "pin" datasets or volumes in the RAID cache, basically making the datasets resident in memory on the device. This should reduce waits for I/Os, although it is expensive in terms of hardware usage.
Once you have determined that your wait times are high, you can use RMF reporting to get statistics at the volume level and see whether there is a DASD subsystem problem.
Your next step will be to look at the following:
The Administration Guide has additional tuning suggestions regarding these items.
Additional Subsystem Tuning
On a subsystem-wide note, you need to determine if your total DB2 subsystem is read-intensive or update-intensive. To do this, research your normal DB2 logging rate. Print a copy of a recent Bootstrap Dataset (BSDS) and pick a succession of archive logs that were created during ERP package execution. (Warning! Most of the times in the BSDS are in GMT, not local time. Check the Administration Guide and the comments at the beginning of the BSDS report to determine this.)
Note the begin and end RBAs and times of the Archives and use subtraction to determine elapsed time and RBA range. Divide the total RBA size by the total elapsed time to determine logging rate.
A general guideline:
Note that 1 MB/sec is the equivalent of about 80 cylinders of 3390 DASD per minute.
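The RBA arithmetic above can be sketched as follows. The RBA values and times are invented for illustration (remember that BSDS times are usually GMT); real RBAs come straight out of the BSDS print.

```python
# Sketch: DB2 logging rate from a pair of archive log entries in the BSDS report.
# RBAs are hexadecimal byte offsets into the log; the values below are illustrative.
from datetime import datetime

def logging_rate_mb_per_sec(start_rba_hex: str, end_rba_hex: str,
                            start_time: datetime, end_time: datetime) -> float:
    """Bytes logged between two RBAs, divided by elapsed seconds, in MB/sec."""
    bytes_logged = int(end_rba_hex, 16) - int(start_rba_hex, 16)
    elapsed_secs = (end_time - start_time).total_seconds()
    return bytes_logged / elapsed_secs / 1_000_000

# Two archive logs spanning one hour of ERP execution (times in GMT, as in the BSDS).
rate = logging_rate_mb_per_sec(
    "000000000000", "0000D693A400",
    datetime(2011, 3, 1, 2, 0, 0), datetime(2011, 3, 1, 3, 0, 0))
print(f"{rate:.2f} MB/sec")   # prints 1.00 MB/sec
```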
If your subsystem is update-intensive then you need to consider additional monitoring and tuning. Here are the usual tactics used in this situation. Many of these are not possible in an ERP package environment (rearranging table columns, for example); however, it is more and more common to implement ERP packages in non-exclusive DB2 environments. Consider the following for your DB2 subsystems as a whole:
Keep frequently-updated columns close to each other
DB2 logs an update from the first changed byte to the last, so keeping frequently-updated columns physically adjacent in the table definition reduces logging volume; the Administration Guide covers logging and such tuning tactics in great detail.
Reconsider COMPRESS YES
The cost to decompress the row, perform the update, and recompress the row may be significant in an update-intensive subsystem. There is also the overhead of placing rows in overflow. Consider setting COMPRESS NO for tables that are small in total pages.
Analyze Log Write performance for possible bottleneck(s)
It is possible that your logging process is the bottleneck in an update-intensive system. Check the MSTR address space log for "Log Full" messages; consider increasing the size of the log output buffer (the OUTBUFF ZPARM); and analyze DASD I/O times for your log volumes.
ERP package tuning efforts usually concentrate on increasing throughput by decreasing response times that are primarily caused by I/O-intensive queries. Most commonly the correct approach is to monitor for synchronous read events and review alternatives for reducing them (such as index re-design, clustering, partitioning, and parallelism).