Monitoring an ERP Application in DB2 for z/OS

Monitoring an ERP application suite involves using several basic measurement techniques. Since it probably co-exists with other applications within a DB2 subsystem or data sharing group, a means must be found to separate ERP package performance measures from those of other applications. Additionally, when measuring subsystem performance you must isolate the effects of the package from those of other applications.

Monitoring an ERP Application in DB2 for z/OS

Creating Measurement Processes

There are three things to remember when creating measurement processes:

  1. Implement Automation. One-time measures are certainly useful – performance ‘snapshots’ are sometimes essential for debugging performance problems; however, an ERP package contains many thousands of objects and hundreds of processes (or more). You need a process that develops meaningful measures that allow you to get overall status information as well as the ability to drill down to the details. To do this you must find the means to automate both the data gathering and measurement processes. Many performance measurement products have features for automating either data gathering or performance reporting.
  2. Save Historical Data. It takes time to develop a complete picture of ERP package behavior. It is unrealistic to believe that you will be able to do a good job of monitoring and tuning within a few weeks – or months. You must gather data on a regular basis and store it for later processing. (Relational databases are great for this!) Then, over the course of time, you can do trending analysis and capacity planning. Historical data is essential for regression testing; for example, if you are considering making major performance changes you will want a set of measures both before and after the changes to see if there was any improvement.
  3. Develop graphical reports. A spreadsheet filled with numbers will not be enough. You must decide on the best way to present your data in order to interpret it correctly. Bar charts, scatter diagrams, and pie charts can all be used to good effect.

What to Monitor; What to Measure

There are two general categories of measurements: system-related and application-related. These categories overlap somewhat, but generally speaking system-related measurements are at a high, or global level. Applications demand work and resources from the DB2 subsystem (such as data storage and retrieval) and the subsystem then performs work on the applications’ behalf.

Some common system-related measurements are:

Virtual pool threshold attainment
      Sortpool, EDM Pool, Rid Pool filling or failure
      Logging rate and Log Write performance
      Global buffer pool usage
      Subsystem address space CPU usage

Application-related measurements are more detailed and specific. They deal with suites of SQL statements (either separately or taken in sets). Here, the most common measure deal with I/O:

      Buffer Pool hit rate (or page residency times)
      Waits for synchronous I/O
      Prefetch rates
      Hardware Cache usage
      Index efficiency
      Data access paths

Measurement Classes

There are also two classes of measurements that you will need: measurements of resources (which may be overused, underutilized, or constrained) and measurements of behaviors (which may be symptoms of underlying problems).

Resources

      Memory
            DBM1 Address Space (Virtual Storage constraints)
            EDM Pool, Sort Pool, RID Pool
            Virtual Pools (Page Residency)
      CPU
      I/O Subsystem
      Network

Behaviors

      Locking
      Compression
      Work File Extents
      Deadlocks, Timeouts

The difficulty in dealing with behaviors is that they are symptoms only; they are not the problems themselves. It is easy to fall into the trap of thinking that you have isolated a problem to a few SQL statements. True, making changes (adding OPTIMIZE FOR n ROWS, re-designing indexes, running multi-column RunStats) may increase statement execution speed; however, the net result may be that you have obscured or totally missed the underlying problem(s).

This is why you need to gather measurements from all classes: resources and behaviors, system and application.

In order to better understand what you are measuring, here’s an example from one installation in the early stages of an ERP package tuning effort. Table 1 shows a cross-reference of their measurement categories and classes with specific measurements. Table 2 shows some typical tactics and tuning efforts for each quadrant.

Table 1: Measurement Categories and Classes for an ERP Installation

 
System-Related
Application-Related
Resources
Memory
    DBM1 Addr Space (total)
    SortPool
    RID Pool
    EDM Pool
CPU
    Addr Space Usage
I/O Subsystem
    Log Writing
    Pageset Placement
    Global BP Sizing
CPU
    CPU used by SQL stmts
    Class 1 & Class 2 Times
I/O Subsystem
    Pageset Placement
    Hardware Cache Usage
    "xxLARGE" Tablespaces
Network
    3-Tier Configuration
    DB2 Connect Settings
Behaviors
Compression
    Hardware Compression
Work Files
    Total Size
    Bpool Assignment
    Bpool Thresholds
    Physical Separation
Locking Behaviors
    IRLM Parms
    Incidence of Lock Escalation
Compression
    Selection of Tablespaces
Access Paths
    Access Path Analysis
    Index Selection
Locking Behaviors
    Incidence of Deadlocks
    Incidence of Lock Escalation
    LOCKSIZE settings

Table 2: Typical Tuning Efforts by Measurement Category and Class

 
System-Related
Application-Related
Resources
Memory
    Pool Re-sizing
CPU
    Forbid Parallelism
    Remove Compression
I/O Subsystem
    Catalog/Directory Reorg
    Consider HiperPools
    BP Thresholds
    LOGLOAD/CHKFREQ
Network
    Re-Configure DB2Connect
CPU
    No Compression for "small" tables
    Access Path Tuning
I/O Subsystem
    Implement Data Spaces
    FREEPAGE, PCTFREE
 
Behaviors
Work Files
    Enlarge
Locking Behaviors
    NUMLKUS, NUMLKTS
    IRLMRWT, DEADLOK
Compression
    OFF for Small Tables
Access Paths
    BIND Acquire and Release
    Partitioning
    Optimizer Hints
    Parallelism
    RunStats (multi-column)
Locking Behaviors
    Row-level Locking
    Large PCTFREE; Reorg

So, What Should I Measure?

For ERP systems there are three initial areas to emphasize, depending upon data and transaction volumes. These relate to the package’s use of memory and the I/O subsystem:

  • DBM1 Address Space total real storage
  • the EDM Pool
  • the virtual pools (buffer pools)

The DBM1 Address Space

The DBM1 address space contains many areas of memory called pools essential to proper ERP functioning. Among these are the Virtual Pools, EDM Pool, Sort Pool, RID Pool, and others. You should construct a spreadsheet that lists your various storage allocations.

You can estimate usage based on ZParm settings for most of these pools; however, in a memory-constrained environment you should use whatever tools you have available to get real-time storage measurements. For example, it is possible to use DSNWDMP (described briefly in the IBM DB2 Diagnosis Guide) to dump the DBM1 address space and get actual pool size measurements. Other third-party tools may have summary or detail statistics you can use.

The important things to do (as mentioned at the beginning of this article) are to:

  • Automate the measurements
  • Retain historical data
  • Implement graphical reports

So, you can begin with a simple daily (or hourly) measurement of total DBM1 address space real storage usage. Store the measurements over time, and graph the result. You will probably find that memory usage increases sharply on DB2 subsystem startup. This should be no surprise; many of the pools are allocated only on first use; also, some control blocks stored in DBM1 are related to datasets that are not open yet.

Ensure that ZParms and other changes are controlled (SRTPOOL, MAXRBLK, MAXKEEPD, EDMBSPAC, EDMPOOL, DSMAX, CTHREAD, MAXDBAT, buffer pools, etc.). The settings for these and other ZParms will affect memory allocation in the DBM1 address space.

As the production week rolls on your DBM1 real storage graph should stabilize, particularly as ERP package functions are executed. For example, if your package includes a payroll function you should see real storage allocations jump as payroll begins running.

Over time if you do not see real storage usage stabilize then it is possible that there is a DB2 subsystem problem. Check your maintenance level and refer to IBMLink for APARs related to storage issues.

The EDM Pool

The EDM Pool contains database objects such as database descriptors (DBDs) and, if you have implemented dynamic SQL caching, the dynamic SQL cache. Most third-party tools have a process for measuring the EDM pool contents, including the percentage of the pool taken up by each object category.

Begin with this and (as above) map your EDM Pool usage over time. Pay particular attention to the ratio of Request for Pages to Page Already in Pool for each object category. If these ratios are all above 99% it indicates that, in general, your EDM Pool is functioning well. Whenever DB2 needed a database object or a prepared SQL statement, 99% of the time the object was already in the EDM pool and no I/O to disk was required to fetch it.

The dynamic SQL cache needs to be treated a bit differently. ERP packages typically implement much of their functionality with dynamic SQL. Good performance will necessitate that you cache the prepared versions of these statements. The Global Dynamic Cache is allocated when ZParm CACHEDYN is set to YES. Based on your DBM1 storage allocation report you may consider your environment to be virtual storage constrained. In that case, consider setting ZParm EDMDSPAC to YES. This will place the global cache in a Data Space, thus outside of the DBM1 address space. (Note that you must also set additional ZParms that control the size of this data space. These ZParms all begin with “EDM”.)

One final note on the EDM Pool. There seems to be some difference of opinion on the Pages in Use statistic, especially when it is near 100%. Many experts recommend sizing the EDM Pool to allow a certain percentage (say 10%) of the EDM Pool to be unused. Theoretically, this allows for a cushion in case a long-running (and not committing) transaction is responsible for holding object definitions in the EDM Pool for an extended period.

Continue to measure your EDM Pool usage over time. As the maximum size stabilizes, note the high-water mark. If the pool is near 100% used, increase its size gradually to allow for about 10% non-used. Continue to monitor over time. A sudden jump to 100% used may signal the appearance of a rogue application or process.

The Virtual Pools

With most ERP package processes being I/O-constrained, the virtual pools deserve special attention

First, it is assumed that you have assigned objects to the virtual pools based on activity (random access, sequential access, etc.). (You can verify these assignments by a simple query against SYSTABLESPACE and SYSINDEXSPACE.) Now you will need to collect measurements of virtual pool use. You can begin with the -DISPLAY BPOOL . . . DETAIL command if you wish; alternatively, there are several IBM and third-party tools available for measuring and tuning buffer pools.

Your initial measurements should concentrate on when and how often thresholds were exceeded. After that, you can continue with hit ratios. Threshold settings are described in the Administration Guide, including how best to determine them for random and sequential workloads.

A complete discussion of virtual pool measurements and tuning is beyond the scope of this article. I recommend you begin with the materials on virtual pool tuning in the Administration Guide.

In the next article, I’ll cover tuning an ERP application.

Wikipedia – Enterprise Resource Planning – Jan 01, 2011 –

IBM – IBM InfoCenter – Jan 13, 2011 –

IBM – System z Parallel Sysplex Best Practices – Jan 01, 2011

IBM – Infrastructure Solutions: SAP IS Automotive on an IBM Platform – Dec 01, 2008

» See All Articles by Columnist Lockwood Lyon

Lockwood Lyon
Lockwood Lyon
Lockwood Lyon is a systems and database performance specialist. He has more than 20 years of experience in IT as a database administrator, systems analyst, manager, and consultant. Most recently, he has spent time on DB2 subsystem installation and performance tuning. He is also the author of The MIS Manager's Guide to Performance Appraisal (McGraw-Hill, 1993).

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles