Oracle Database 11gR2 I/O Performance Tuning: A Primer
June 30, 2010
Tuning the performance of Oracle Database 11gR2 is still just as much of an art as it is a science. Jim Czuprynski focuses on the detection and analysis of performance bottlenecks within the database's underlying input/output (I/O) subsystem.
Synopsis. Oracle 11g Release 2 (11gR2) continues the trend toward automated database performance tuning tools that started in Oracle 10g. However, tuning the performance of an Oracle 11gR2 database is still just as much of an art as it is a science. This article - the first in this series - provides a primer that focuses on one of the most challenging arenas of database performance tuning: the detection and analysis of performance bottlenecks within the database's underlying input/output (I/O) subsystem.
As a fledgling Oracle DBA many years ago, one of the first lessons I learned about database performance tuning was that there are only three potential types of bottlenecks to investigate whenever a database application performance issue rears its head:
Memory. An Oracle database instance may simply not have enough memory in the right places to execute all of the SQL statements issued against it, then the database is said to be memory-bound. For example, while there may be sufficient memory, it may have been seriously mis-allocated between the database's Shared Global Area (SGA) and Program Global Area (PGA). Another possibility is that the memory components within the SGA itself have not been divided up properly, and the database buffer cache is significantly undersized for the workload being generated, or the library cache is oversized as compared to the number of SQL statements being issued.
As of Oracle 11g Release 1 (11gR1), however, these memory bottlenecks are easily detected through the automated diagnostic infrastructure that Automatic Database Diagnostic Monitor (ADDM) provides. Oracle 11gR1's Automatic Memory Management (AMM) feature can also be activated to detect and automatically adjust memory allocations as the database instance's workload changes. Finally, if the Oracle DBA decides against using AMM, Oracle 11g Enterprise Manager (EM) provides several elegant GUI-based memory tuning advisors that encompass the instance's complete memory footprint (SGA + PGA) as well as individual advisors for the SGA, PGA, Database Buffer Cache, and the Library Cache; there are also SQL-based memory advisors for the Streams Pool, Large Pool, and Java Pool.
CPU. A database application's SQL statements may be poorly written, or they may be unable to take advantage of efficient execution plans because of missing optimizer statistics. SQL statements may also be efficiently written but may be heavily dependent on complex arithmetic computations or complex set manipulation. In these cases, a database instance may quickly become CPU bound; in other words, there are simply not enough CPU cycles to complete the necessary calculations required to return a query's result set or to process DML statements.
Fortunately, Oracle 11gR1 has already made it simpler than ever to detect, analyze, and resolve these types of "problem" SQL statements. ADDM, in concert with Automatic Workload Repository (AWR) snapshots and reports, are excellent at identifying the top SQL statements that are most degrading to the overall database instance's performance within any specific time frame or workload, and Active Session History (ASH) reports can analyze an application workload at an even deeper granularity. Oracle 11g's SQL Tuning Advisor and SQL Access Advisor respectively offer specific tuning recommendations for individual SQL statements or even entire application workloads. Finally, Oracle 11g's SQL Performance Analyzer (SPA) and SQL Plan Management (SPM) features can be used to execute performance regression analysis for application workloads and even "lock in" a known, better-performing execution plan when upgrading from a prior database version to 11g.
I/O. In my experience, if a database application is experiencing extremely poor performance, it's more likely to stem from the Oracle database instance being either significantly memory-bound or CPU-bound, so it makes sense that Oracle 11g's performance tuning tools are aimed primarily at those classes of root causes. However, it's also possible for a database instance to suffer from poor I/O performance. A database thus becomes I/O bound when it cannot respond quickly enough to requests to either read data from and/or write data to the underlying I/O subsystem.
Unlike CPU-bound or memory bound systems, however, the identification, analysis, isolation, and eventual repair of the root causes for poor I/O response time is significantly more difficulty because of the complex nature of today's storage platforms. So let me attempt to pull back the curtain on I/O performance tuning by explaining it from a completely different perspective: that of the storage administrator.
Isolating the Real Problem
Here's the first big secret that storage administrators know intimately, but Oracle DBAs sometimes ignore, when tackling an I/O performance tuning issue: No database truly performs I/O directly and immediately against disk. In fact, even in the simplest case - say, a single disk drive with a native installation of either the NTFS or EXT3 file system - disk I/O is always buffered in the cache memory that's part of each individual disk's I/O controller.
Expanding upon this extremely simple case, I/O performance issues can be much more difficult to accurately diagnose because of various factors, including:
For mid-range and enterprise-level high-performance storage systems, other configuration options may introduce additional complexity:
Application Workload I/O Profiles
If the potential complexity of the I/O subsystem doesn't make an Oracle DBA's head spin, here's another factor: I/O performance tuning is often compounded by mixed database application workloads. Only a few years back, Oracle recommended that database application workloads should be isolated at the database level - for example, by creating separate OLTP and DSS databases, each with their own set of storage - and that was certainly helpful when attempting to isolate an I/O performance problem.
In the last few years, however, the dramatic increase in the maturity of Oracle Real Application Cluster (RAC) database software in Oracle Release 10g has made it simpler than ever to deploy RAC database environments in a matter of hours. Oracle RAC certainly adds to the stability and consistency of database applications because the loss of one node or instance no longer means that the Oracle database is completely unavailable. The addition of RAC database services in Oracle 10g also meant it was easier than ever to control and serve up sufficient bandwidth for multiple applications against a single RAC database, but this also means that it's not uncommon today to encounter a single RAC database that's servicing multiple application workloads with dramatically different I/O performance profiles.
The good news is that there are only four basic types of I/O performance profiles that any Oracle DBA need be concerned about:
Here's the second big secret that our storage administrators keenly understand, sometimes better than even experienced Oracle DBAs: Ultimately, all I/O performance tuning revolves around determining what type(s) of application workload(s) the database is actually performing versus what workload the underlying storage has been configured to handle most of the time. So if the underlying storage platform on which the Oracle database's files exist has been tuned primarily for random read/write access, then sequential read/write application workloads (DSS) will most likely perform significantly worse than an OLTP workload; and vice versa, if the storage has been tuned for predominantly sequential read/write access (like a data warehouse), OLTP application workloads are more likely to encounter serious I/O performance bottlenecks.
Enough theory! It's time to put at least some of these concepts to work. The next article in this series will delve more deeply into how to: