Synopsis. Oracle 11g Release 2 (11gR2) provides excellent tools that help isolate and identify potential root causes of poor input/output (I/O) performance. This article – the sixth in this series – illustrates the wealth of I/O performance tuning information that Oracle 11gR2’s Automatic Workload Repository (AWR) reports provide, how to use that information to determine whether any I/O performance bottlenecks actually exist, and how to form initial hypotheses for resolving those bottlenecks within the database’s underlying I/O subsystem.
The prior article in this series continued the demonstration of how to use Quest Software’s Benchmark Factory tool in concert with Oracle Database 11gR2 to:
- Generate repeatable and realistic database application workloads
- Capture appropriate metrics from the target Oracle database and operating system to quantify I/O response times and throughput
Now that I’ve at last successfully generated both TPC-H and TPC-E workloads, it’s time to turn my attention to how well my database’s I/O infrastructure is actually performing … and that means it’s time to pore over the performance metrics that Oracle 11gR2 is so adept at capturing. In this article I’ll therefore concentrate on:
- Evaluating AWR reports and OS statistics for possible indication of any I/O performance bottlenecks
- Formulating initial hypotheses on how well my database’s storage infrastructure is responding to different application workloads
I/O Performance Analysis: First Steps
One of the features of Quest’s Benchmark Factory tool that I really appreciate is the ability to view in real time how the simulated application workload being generated is affecting components of my database system’s performance. Figure 6.1.1 below shows an example of the TPC-E 50-User application workload generation at just about the halfway mark of its execution cycle. As you can see, several “spikes” have occurred as the workload progressed, most notably in the average execution time of the workload and the average throughput in bytes per second.
Figure 6.1.1. Benchmark Factory: Displaying Real-Time Statistics
While this is certainly interesting information, it really tells me very little about how well the database system is really handling the generated workload’s demand. To find that out, I’ll turn my attention to what I’ve found is the most valuable avenue for attacking just about any Oracle database performance issue, whether perceived or real: the Automatic Workload Repository (AWR) reports for the time period(s) when sub-optimal or questionable database response times have been reported.
(Yes, I know: AWR is a feature of the separately licensed Oracle Diagnostics Pack, but in my experience it has become the de facto database performance tuning tool for most Oracle database shops. While Statspack reports certainly provide sufficient detail for I/O performance tuning, AWR reports offer significantly more detail, are easier to navigate, and are more efficient to execute because they draw on performance metric data that has already been captured within AWR snapshots … and those snapshots were gathered in real time using direct memory access.)
I’ve captured the complete contents of the AWR reports for both the TPC-E 50-user workload as well as for the TPC-H 15-user workload, but for now I’ll focus on separate sections of these reports to demonstrate what specific information I normally investigate to determine if there are any real I/O performance-related issues.
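By the way, Enterprise Manager isn’t required to produce these reports. Here’s a minimal sketch of generating a text-format AWR report directly from SQL*Plus via the DBMS_WORKLOAD_REPOSITORY package; the interactive $ORACLE_HOME/rdbms/admin/awrrpt.sql script wraps the same functionality. The DBID and snapshot IDs below are placeholders – substitute values from your own environment:

```sql
-- List the available snapshots so you can pick a begin/end pair that
-- brackets the workload of interest.
SELECT snap_id, begin_interval_time, end_interval_time
  FROM dba_hist_snapshot
 ORDER BY snap_id;

-- Generate a text-format AWR report for that snapshot range.
-- All four values below are placeholders for this sketch.
SELECT output
  FROM TABLE(
         DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_TEXT(
           l_dbid     => 1234567890,  -- your database's DBID
           l_inst_num => 1,           -- instance number
           l_bid      => 101,         -- begin snapshot ID
           l_eid      => 102));       -- end snapshot ID
```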
Comparing AWR Reports from Different Periods. It’s not unusual for Oracle DBAs to support competing workloads at different times of the month, week, and even day – especially for “hybrid” databases that may have started out as OLTP platforms, but now encapsulate a DSS or OLAP reporting component. But trying to identify the I/O performance “footprints” that represent application workloads over different timeframes can be downright tedious! I’ve found that many Oracle DBAs aren’t aware of one of my favorite AWR features: its ability to highlight dramatic differences between two different application workloads by directly comparing their corresponding performance metrics via its Compare Periods feature.
To illustrate, I’ve created an AWR Compare Periods report that shows the difference between the TPC-E and TPC-H workloads. What I really like about this feature is that it immediately highlights the major differences between the application workloads generated during these two timeframes. (A sketch of how to generate this comparison report directly from SQL*Plus follows the list below.) For example, note the significant difference between:
- The number of physical reads in the TPC-H workload versus the number of physical writes in the TPC-E workload
- The number of redo transactions created during TPC-E versus TPC-H workload execution
- The number of logical reads – almost three times as many! – performed against the database’s buffer cache in the TPC-E workload versus the TPC-H workload
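Here’s the sketch I promised: the same Compare Periods report generated straight from SQL*Plus via DBMS_WORKLOAD_REPOSITORY. The interactive $ORACLE_HOME/rdbms/admin/awrddrpt.sql script does the same job; the DBID and snapshot IDs below are, again, placeholders:

```sql
-- Compare the TPC-E snapshot window (first period) against the TPC-H
-- snapshot window (second period). All IDs are placeholders.
SELECT output
  FROM TABLE(
         DBMS_WORKLOAD_REPOSITORY.AWR_DIFF_REPORT_TEXT(
           dbid1 => 1234567890, inst_num1 => 1,
           bid1  => 101, eid1 => 102,    -- TPC-E workload window
           dbid2 => 1234567890, inst_num2 => 1,
           bid2  => 201, eid2 => 202));  -- TPC-H workload window
```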
I/O Performance Analysis: Top Five Wait Events
The first thing I’ll focus on is whether or not there are any I/O-related wait events that need investigation, and I’ll do that by reviewing the Top Five Wait Events section in each of the AWR reports. I’ve reproduced them for the TPC-E and TPC-H application workloads in Figure 6.2.1 and Figure 6.2.2 below. (For an excellent summary of what each of these wait events actually portends, I recommend Section 10.3 of the Oracle Database 11gR2 Performance Tuning Guide.)
| Event | Waits | Time(s) | Avg wait (ms) | % DB time | Wait Class |
|---|---|---|---|---|---|
| direct path read | 248,069 | 46,559 | 188 | 83.28 | User I/O |
| db file sequential read | 16,846 | 4,249 | 252 | 7.60 | User I/O |
| db file parallel read | 481 | 1,436 | 2985 | 2.57 | User I/O |
| DB CPU | | 1,123 | | 2.01 | |
| enq: KO – fast object checkpoint | 666 | 900 | 1351 | 1.61 | Application |
Figure 6.2.1. TPC-E Workload: Top Five Wait Events
| Event | Waits | Time(s) | Avg wait (ms) | % DB time | Wait Class |
|---|---|---|---|---|---|
| direct path write temp | 86,267 | 17,790 | 206 | 55.13 | User I/O |
| direct path read | 139,711 | 6,875 | 49 | 21.30 | User I/O |
| direct path read temp | 1,530,266 | 6,208 | 4 | 19.24 | User I/O |
| DB CPU | | 1,367 | | 4.24 | |
| log file sync | 60 | 4 | 72 | 0.01 | Commit |
Figure 6.2.2. TPC-H Workload: Top Five Wait Events
From these two report sections, it’s pretty obvious that my database is performing significant amounts of I/O during each of these application workload execution timeframes … and that’s not necessarily a bad thing. After all, if my database is spending most of its time executing SQL statements and either reading data from or writing data to my database’s datafiles, then it’s doing work rather than waiting for, say, a latch or mutex to be released. What’s important at this stage of the analysis is to realize that there may indeed be opportunities for tuning the I/O subsystem, or even the database instance itself, to take better advantage of what’s clearly already a pretty well-tuned set of SQL statements.
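Incidentally, these same wait events can be cross-checked in real time against the V$SYSTEM_EVENT view. Here’s a minimal sketch; bear in mind that its counters are cumulative since instance startup, so to reproduce an AWR-style interval you’d sample the view twice and take the difference:

```sql
-- Top 5 User I/O wait events since instance startup (cumulative,
-- not an AWR-style interval delta).
SELECT *
  FROM (SELECT event,
               total_waits,
               ROUND(time_waited_micro / 1e6)   AS time_waited_s,
               ROUND(time_waited_micro / 1000 /
                     NULLIF(total_waits, 0), 1) AS avg_wait_ms
          FROM v$system_event
         WHERE wait_class = 'User I/O'
         ORDER BY time_waited_micro DESC)
 WHERE ROWNUM <= 5;
```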
I/O Performance Analysis: I/O Profiles
I/O Profile Section. Starting with Oracle 11gR1, I/O-related performance metrics and statistics are also available both within Enterprise Manager, for real-time I/O performance “firefighting,” and within AWR snapshots. Oracle 11g breaks this information out across several dimensions:
- I/O By Function. This table shows how read and write I/O activity is distributed across the various foreground and background processes for the database instance. I’ve reproduced this report section for the TPC-E application workload in Figure 6.3.1, which makes it immediately obvious that the majority of the I/O activity is the direct physical loading (via physical reads) of data blocks into the database buffer cache.
- I/O By File Type. As the sample from the TPC-E workload in Figure 6.3.2 shows, this table offers a completely different perspective: which type of file is absorbing the heaviest read or write workload. For the TPC-E application workload, I/O activity is obviously concentrated upon reading from the database’s data files. Also interesting is that small (random) reads outnumber large (sequential) reads by a ratio of roughly 1.5 to 1 – again, not a complete surprise for an intense OLTP workload.
- Combined I/O Within Function and File Type. This table combines the two dimensions for a closer look at which background processes are most heavily performing I/O against which type(s) of files.
| Function Name | Reads: Data | Reqs per sec | Data per sec | Writes: Data | Reqs per sec | Data per sec | Waits: Count | Avg Tm(ms) |
|---|---|---|---|---|---|---|---|---|
| Direct Reads | 240.3G | 183.84 | 182.304 | 0M | 0.00 | 0M | 248.1K | 187.14 |
| Buffer Cache Reads | 295M | 17.27 | .218603 | 0M | 0.00 | 0M | 19.5K | 379.93 |
| Others | 55M | 2.33 | .040756 | 91M | 1.47 | .067433 | 4384 | 8.60 |
| DBWR | 0M | 0.00 | 0M | 35M | 1.87 | .025936 | 837 | 346.45 |
| LGWR | 0M | 0.00 | 0M | 20M | 3.23 | .014820 | 0 | |
| Direct Writes | 0M | 0.00 | 0M | 1M | 0.01 | .000741 | 0 | |
| Streams AQ | 0M | 0.01 | 0M | 0M | 0.00 | 0M | 10 | 69.70 |
| TOTAL: | 240.6G | 203.45 | 182.564 | 147M | 6.58 | .108931 | 272.8K | 198.52 |
Figure 6.3.1. TPC-E Workload: I/O Profile Within Function
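The figures in this table can also be sampled live from the V$IOSTAT_FUNCTION view that Oracle 11g populates for exactly this purpose. Here’s a minimal sketch; the counters are again cumulative since instance startup, so diff two samples to derive per-second rates:

```sql
-- Read/write volume in megabytes, waits, and total wait time by
-- database function, cumulative since instance startup.
SELECT function_name,
       small_read_megabytes  + large_read_megabytes  AS read_mb,
       small_write_megabytes + large_write_megabytes AS write_mb,
       number_of_waits,
       wait_time AS wait_time_ms
  FROM v$iostat_function
 ORDER BY small_read_megabytes + large_read_megabytes
        + small_write_megabytes + large_write_megabytes DESC;
```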
| Filetype Name | Reads: Data | Reqs per sec | Data per sec | Writes: Data | Reqs per sec | Data per sec | Small Read | Large Read |
|---|---|---|---|---|---|---|---|---|
| Data File | 240.5G | 201.17 | 182.528 | 38M | 1.89 | .028159 | 308.56 | 187.42 |
| Control File | 49M | 2.22 | .036310 | 22M | 1.01 | .016302 | 3.68 | 37.67 |
| Flashback Log | 3M | 0.01 | .002223 | 46M | 0.37 | .034087 | 15.88 | 21.00 |
| Other | 3M | 0.04 | .002223 | 21M | 0.08 | .015561 | 8.00 | 8.86 |
| Log File | 0M | 0.00 | 0M | 20M | 3.24 | .014820 | | |
| Temp File | 0M | 0.00 | 0M | 0M | 0.00 | 0M | 116.50 | |
| TOTAL: | 240.6G | 203.44 | 182.568 | 147M | 6.59 | .108931 | 273.77 | 187.40 |
Figure 6.3.2. TPC-E Workload: I/O Profile Within File Type
| Function/File Name | Reads: Data | Reqs per sec | Data per sec | Writes: Data | Reqs per sec | Data per sec | Waits: Count | Avg Tm(ms) |
|---|---|---|---|---|---|---|---|---|
| Direct Reads | 240.3G | 183.84 | 182.304 | 0M | 0.00 | 0M | 0 | |
| Direct Reads (Data File) | 240.3G | 183.84 | 182.304 | 0M | 0.00 | 0M | 0 | |
| Buffer Cache Reads | 294M | 17.27 | .217862 | 0M | 0.00 | 0M | 18.5K | 244.49 |
| Buffer Cache Reads (Data File) | 294M | 17.27 | .217862 | 0M | 0.00 | 0M | 18.5K | 244.49 |
| Others | 52M | 2.29 | .038533 | 70M | 1.39 | .051872 | 3601 | 7.41 |
| Others (Control File) | 49M | 2.22 | .036310 | 21M | 1.01 | .015561 | 2997 | 3.74 |
| Others (Flashback Log) | 3M | 0.01 | .002223 | 47M | 0.37 | .034828 | 523 | 27.71 |
| Others (Data File) | 0M | 0.06 | 0M | 2M | 0.01 | .001482 | 81 | 12.04 |
| DBWR | 0M | 0.00 | 0M | 35M | 1.87 | .025936 | 0 | |
| DBWR (Data File) | 0M | 0.00 | 0M | 35M | 1.87 | .025936 | 0 | |
| LGWR | 0M | 0.00 | 0M | 20M | 3.24 | .014820 | 0 | |
| LGWR (Log File) | 0M | 0.00 | 0M | 20M | 3.24 | .014820 | 0 | |
| Direct Writes | 0M | 0.00 | 0M | 1M | 0.01 | .000741 | 0 | |
| Direct Writes (Data File) | 0M | 0.00 | 0M | 1M | 0.01 | .000741 | 0 | |
| Streams AQ | 0M | 0.01 | 0M | 0M | 0.00 | 0M | 10 | 69.70 |
| Streams AQ (Data File) | 0M | 0.01 | 0M | 0M | 0.00 | 0M | 10 | 69.70 |
| TOTAL: | 240.6G | 203.41 | 182.561 | 126M | 6.51 | .093369 | 22.1K | 205.84 |
Figure 6.3.3. TPC-E Workload: I/O Profile Within Function and File Type
I’ve provided a link to the I/O Profile Section for the TPC-H workload’s corresponding AWR report.
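Similarly, the file-type dimension in Figure 6.3.2 has a live counterpart in V$IOSTAT_FILE. Since that view reports one row per file, here’s a sketch that aggregates on FILETYPE_NAME to mirror the AWR breakdown:

```sql
-- Cumulative read/write volume and request counts rolled up by file type.
SELECT filetype_name,
       SUM(small_read_megabytes  + large_read_megabytes)  AS read_mb,
       SUM(small_write_megabytes + large_write_megabytes) AS write_mb,
       SUM(small_read_reqs)                               AS small_read_reqs,
       SUM(large_read_reqs)                               AS large_read_reqs
  FROM v$iostat_file
 GROUP BY filetype_name
 ORDER BY SUM(small_read_megabytes + large_read_megabytes
            + small_write_megabytes + large_write_megabytes) DESC;
```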
I/O Performance Analysis: Tablespace and Datafile I/O Profiles
The final two sections of the AWR report that are most helpful for isolating I/O bottlenecks describe the I/O patterns for the database’s tablespaces and each tablespace’s corresponding datafiles. For the TPC-E workload, I’ve captured just the five “hottest” tablespaces in terms of combined read/write activity. This report section makes it simple to see that the TPCE_TRADE_DATA tablespace is getting pummeled the hardest. That makes sense, since that tablespace contains the data segments for table TPCE.E_TRADE_DATA, the table most heavily accessed during the TPC-E workload:
| Tablespace | Reads | Av Reads/s | Av Rd(ms) | Av Blks/Rd | Writes | Av Writes/s | Buffer Waits | Av Buf Wt(ms) |
|---|---|---|---|---|---|---|---|---|
| TPCE_TRADE_DATA | 248,884 | 184 | 3.24 | 126.11 | 994 | 1 | 244 | 330.12 |
| TPCE_TRADE_HISTORY_IDX | 2,581 | 2 | 159.05 | 1.25 | 11 | 0 | 95 | 320.21 |
| TPCE_SETTLEMENT_DATA | 2,313 | 2 | 164.38 | 1.29 | 265 | 0 | 25 | 244.40 |
| TPCE_DAILY_MARKET_DATA | 2,417 | 2 | 57.19 | 35.45 | 8 | 0 | 1,037 | 126.75 |
| TPCE_TRADE_HISTORY_DATA | 2,368 | 2 | 185.03 | 1.25 | 29 | 0 | 14 | 165.71 |
Figure 6.4.1. TPC-E Workload: Five “Hottest” Tablespaces
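For what it’s worth, much the same per-tablespace picture can be assembled outside of AWR by joining V$FILESTAT to DBA_DATA_FILES – a handy trick when the Diagnostics Pack isn’t licensed. A minimal sketch (cumulative counters; READTIM is reported in centiseconds):

```sql
-- Physical read/write counts and average read latency per tablespace,
-- cumulative since instance startup.
SELECT df.tablespace_name,
       SUM(fs.phyrds)  AS phys_reads,
       SUM(fs.phywrts) AS phys_writes,
       ROUND(SUM(fs.readtim) * 10 /
             NULLIF(SUM(fs.phyrds), 0), 2) AS avg_read_ms
  FROM v$filestat     fs
  JOIN dba_data_files df ON df.file_id = fs.file#
 GROUP BY df.tablespace_name
 ORDER BY SUM(fs.phyrds + fs.phywrts) DESC;
```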
The picture is quite different, of course, for the tablespaces that hold the TPCH schema objects. Here, just two of the several tablespaces absorbed almost all of the I/O during that workload’s execution, most prominently the temporary tablespace named TPCH_BIGTEMP. This is not unexpected: the TPC-H workload is meant to simulate a DSS environment, which means a fair amount of sorting and aggregation overflows the database’s program global area (PGA) and spills into the temporary tablespace:
| Tablespace | Reads | Av Reads/s | Av Rd(ms) | Av Blks/Rd | Writes | Av Writes/s | Buffer Waits | Av Buf Wt(ms) |
|---|---|---|---|---|---|---|---|---|
| TPCH_BIGTEMP | 1,530,257 | 668 | 0.00 | 6.89 | 86,242 | 38 | 0 | 0.00 |
| TPCH_REF_DATA | 139,842 | 61 | 0.03 | 123.51 | 0 | 0 | 47 | 50.43 |
| SYSAUX | 140 | 0 | 0.00 | 1.86 | 492 | 0 | 0 | 0.00 |
| UNDOTBS1 | 71 | 0 | 0.00 | 1.00 | 364 | 0 | 0 | 0.00 |
| SYSTEM | 0 | 0 | 0.00 | 0.00 | 37 | 0 | 0 | 0.00 |
Figure 6.4.2. TPC-H Workload: Five “Hottest” Tablespaces
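Temporary files don’t appear in V$FILESTAT, but V$TEMPSTAT exposes the equivalent counters, so confirming just how hard TPCH_BIGTEMP is being hit takes one more short sketch:

```sql
-- Cumulative physical reads and writes per temporary tablespace.
SELECT tf.tablespace_name,
       SUM(ts.phyrds)  AS phys_reads,
       SUM(ts.phywrts) AS phys_writes
  FROM v$tempstat     ts
  JOIN dba_temp_files tf ON tf.file_id = ts.file#
 GROUP BY tf.tablespace_name
 ORDER BY SUM(ts.phyrds + ts.phywrts) DESC;
```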
Datafile I/O Metrics. Since this section of the AWR report tends to be among the lengthiest, I’ve provided links to the extremely granular datafile I/O information for the TPC-E workload and the TPC-H workload, respectively. I’ll use the information within these report sections in the next article in this series to probe which datafiles and temporary files might benefit most from faster I/O devices or removal of other potential I/O bottlenecks.
I/O Performance: Hypotheses For Improvement
Now that I’ve gathered these metrics, I can begin to make some intelligent decisions about where to concentrate my I/O performance tuning efforts:
- Is there any way I can improve the ratio of physical reads to logical reads? For example, is it possible to expand the amount of server memory available to the SGA so that buffers are retained in the buffer cache longer? (The buffer cache advisory sketch following this list offers one way to test that idea.)
- Could any of the datafiles behind the tablespaces that the TPC-E and TPC-H workloads access most heavily benefit from I/O devices with faster I/O capabilities?
- Finally, is there anything I can do to eliminate I/O bottlenecks that result from the OS file system itself?
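As promised above, here’s one way to test the first hypothesis before buying a single gigabyte of memory: the buffer cache advisory in V$DB_CACHE_ADVICE estimates how physical read volume would change at other cache sizes (it requires the DB_CACHE_ADVICE initialization parameter to remain at its default of ON). A minimal sketch:

```sql
-- Estimated physical reads at candidate buffer cache sizes; a read
-- factor well below 1 at a larger size suggests more SGA memory for
-- the buffer cache would pay off.
SELECT size_for_estimate AS cache_mb,
       size_factor,
       estd_physical_read_factor,
       estd_physical_reads
  FROM v$db_cache_advice
 WHERE name = 'DEFAULT'
   AND advice_status = 'ON'
 ORDER BY size_for_estimate;
```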
Next Steps
I’ll begin to answer the questions I’ve just posed in the next article in this series, which will concentrate on how to:
- Improve I/O response time via intra-database solutions
- Improve I/O response time through enhancements to the I/O subsystem
References and Additional Reading
Before you experiment with any of these new features, I strongly suggest that you first look over the corresponding detailed Oracle documentation. I’ve drawn upon the following Oracle Database 11g Release 2 documents for this article’s technical details:
E10881-02 Oracle Database 11gR2 New Features
E10595-05 Oracle Database 11gR2 Administrator’s Guide
E10713-03 Oracle Database 11gR2 Concepts
E10820-02 Oracle Database 11gR2 Reference
E10500-02 Oracle Database 11gR2 Storage Administrator’s Guide