How to Monitor an Oracle Database Automatically
January 19, 2011
Automatic monitoring of an Oracle database often results in time consuming, resource intensive tactic, in terms of DBA hours. A more elegant way to monitor database performance automatically is Oracle's Automatic Database Diagnostic Monitor (ADDM). ADDM captures and analyzes statistical data in the Automatic Workload Repository to determine if potential performance related database issues exist.
Automatic monitoring of an Oracle database, in the past and to some extent still today, has often resulted in DBAs developing a huge variety of scripts that would be scheduled to run against the database, producing many reports for the DBA to comb through. Sometimes, when the DBA got around to it, the DBA would sometimes store the performance statistical data in tables or in flat files for future reference. While this tactic obviously worked for a vast majority of DBAs, it was very time consuming and resource intensive (DBA hours).
For those of us who are either constrained by time, don't know how to script or just want a more elegant way to monitor database performance automatically, there is Oracle's Automatic Database Diagnostic Monitor (ADDM). Oracle describes ADDM quite nicely in their Oracle Database 2 Day + Performance Tuning Guide. (ADDM), which is software actually built into the Oracle Database, captures statistical data in the Automatic Workload Repository (AWR) and analyzes data to determine if there are any potential performance related database issues. Once issues are found ADDM can then help determine what the root cause of the performance problem is and recommend methods to solve them as well as report on the expected benefits of performing a tuning operation.
Most DBAs, who have been around a while, remember the old BSTAT and ESTAT snapshots. ADDM is somewhat similar in this regard, as it performs analysis after an AWR snapshot is taken with results being saved in the database for viewing and interpretation through Oracle's Enterprise Manager (OEM). ADDM analysis takes a top down approach--identifying performance issues at a high level and then refining its analysis of issues until a root cause is determined. ADDM's prime statistic used is DB time to help pinpoint problems; aiming to reduce this time for a given database request/workload. This is a good place to start, as DB time is an indicator of how much time has been spent in the database for wait time and CPU time fulfilling user requests. When DB time is reduced then we can expect results to our database requests to be fulfilled quicker and we can also ask more of our database. ADDM thus reports on system resources that consume a large portion of DB time as the problem areas.
Obviously, or not so obvious, we must be cautious of trying to tune those database requests that inherently require a lot of DB time and we are unable to solve them. Take for instance a large "batch" job that must retrieve every row within a very large table. Depending on other database requests, after we tune this request, this query/report may still show up within ADDM as a very high resource intensive query/report and consume very much DB time.
Just be cautious of those items that cannot be tuned any further.
Taking the analysis a bit further, ADDM will often recommend changes/solutions for a DBA to take. These solutions take the form of making hardware changes such as bigger/new CPUs, altering database configuration/parameter settings, making changes to a schema, changing how your request/application works such as using bind variables, or even using/invoking other advisors within the Oracle database.
Again, be cautious of what you can actually tune. You may get stuck within the goal of tuning because you either cannot afford a new/bigger CPU, can't alter an application, or even changing a database configuration parameter will make something else perform poorly. Making changes to a database is where the real skill of a DBA comes into play. This skill can only be acquired through countless hours of actually knowing how your database works, the workload it responds to, and testing, testing, and testing in a controlled QA/stress test environment. A key point put forth by the Oracle 2 Day + Performance Tuning Guide, and one you should memorize and not take too lightly, is that performance tuning is an interactive process. Even with all things staying the same, consistent workload, the attempt to fix a problem/bottleneck can often cause another bottleneck or shift that bottleneck somewhere else; many tuning cycles may be required before things settle again.
I have always found that keeping precise records of how things performed before a change and after a change is critically helpful. Keep in mind that if you have no idea of how EVERYTHING performed before a database change then there is no way you can determine if a change improved something or not. Also, keep in mind that if you want to back-out a change you should go back to a state that was previously experienced. I don't know how many times that I've seen DBAs supposedly back-out a change only to experience a different bottleneck. This might seem very silly to some but this is where lazy record keeping of changes can cause havoc on yourself and others. Be meticulous about what you change and record everything!
If you are in an Oracle Real Application Cluster (Oracle RAC) environment, you should be aware of the fact that you can use ADDM to help analyze database cluster throughput performance. ADDM for RAC will consider DB time as the sum of times for all database instances and report its findings at the cluster level. This too is very important, as zeroing on performance issues must be considered at times both at the node and cluster level. For the cluster level, insignificant times may be overlooked if only individual nodes were looked at, losing sight of larger DB times when summed across all nodes in a cluster.
Equally, just because large DB times across all nodes may be reported doesn't mean that just one cluster could be having problems. Just remember to understand your environment and drill down to the proper level depending on what you are looking for.
Oracle's Automatic Database Diagnostic Monitor (ADDM) is a great tool to have in your tool belt. Its ability to automatically collect, save, and analyze database statistics reduces DBA time and resources that would be required to accomplish the same goal, but probably less effectively. Having this tool within the database, something not often considered, actually reduces the resources required to run such a tool on the database system and reduces/eliminates the burden of modifying a toolset from the DBA.