DBA Call to Action: Zeroing in on Performance Problems - Page 2
February 13, 2004
When being thrown into a performance problem, you should initially and quickly determine if there are any obvious high-level performance issues. You should look at CPU usage, memory usage, network usage, and the internal Oracle high-level statistics through the V$SYSTEM_EVENT view. Listing 2 gives you a quick list of tools to use that can give you a quick glimpse into your database system.
Listing 2 Where to look for statistics
While top, sar, iostat, and netstat will give you real-time statistics, V$SYSTEM_EVENT will give you an accumulated representation of database statistics from when the database was started. Listing 3 will give you a quick and easy method to determine what the events are that are having trouble within your database. This is where your interview with the users comes into play. Make sure you take your first snapshot of the statistics before the users execute the task that they are experiencing as a problem and then after the task completes you need to take an ending snapshot of statistics. After you have both beginning and ending statistics, issue the SQL provided in Listing 3 to determine the true events and wait times for them.
Make it Simple
At first glance, you may think the above approach to be too simplistic. While you may need to go deeper into a problem area, it would be pre-mature to begin tracing all the applications and users within your system before you get a good handle on the high-level health of your database. I have seen too many DBAs begin to turn on tracing, shut down applications, and literally take over the database with high performance sucking DBA tasks that mask, hide, and provide too much detail before they know where the problem resides. I cannot count the number of database systems I have been asked to evaluate and the issue will reside somewhere in the network or on a client machine that was not performing properly. Performance problems outside of the database were all seen by looking at the high-level statistics and noticing the database was not performing any work what so ever and thus could not be involved in the performance problem. Just imagine the time and effort wasted if you went in and started tracing user sessions and going through trace output to only notice, the sessions were doing nothing.
Going a bit deeper
When you have zeroed in on what the high-level performance issues are, through the high-level statistics, you can now begin to delve deeper into other Oracle related statistics to help you in your diagnosis. Typically the way I do this is through the capture of beginning and ending statistics for other V$ views the same way I did in Listing 3. These also should be wrapped around the user task or workload that is performing poorly. Listing 4 gives you the base tables that I always collect beginning and ending statistics for. After you have collected the statistics, all you need to do is take your favorite DBA script, relate the two tables of statistics by a key and subtract any values with which you are concerned.
All I can tell you is to stick to your guns. Identify the issues with relation to the users and the problems they are experiencing. Your success as a DBA is the pain you can alleviate from your users of the database system. Find those events in the database or system level resources that are consuming the most amount of time and try to reduce them by applying what you have learned about database systems. Try to acknowledge when you are going down the wrong path and do not fall into the trap of trying to solve problems that do not exist. Prioritize your efforts to reduce the pain felt by users and you will succeed.