I/O Performance Tuning Tools for Oracle Database 11gR2
July 28, 2010
Synopsis. Tuning the I/O performance of an Oracle 11gR2 database is still just as much art as science. Oracle Database 11g Release 2 (11gR2) makes it easier than ever, however, to isolate and identify potential root causes of poor I/O performance. This article, the second in this series, focuses on the Oracle and non-Oracle tools that generate sample workloads and supply enough metrics to detect, and eventually resolve, performance bottlenecks within the database’s underlying input/output (I/O) subsystem.
The prior article in this series discussed much of the theory behind I/O performance tuning, especially as it related to Oracle 11gR2 database applications. This article will address several key practical concepts, including how to:
Let’s start off with a discussion of how to monitor all of the crucial components of any I/O subsystem via a plethora of available tools, some more sophisticated than others. I’m going to focus on the Linux operating system for this article series, but corresponding utilities exist in most cases in the Windows OS environment and certainly exist in all UNIX OS environments.
I/O Performance Monitoring Tools
Most of the tools that Oracle DBAs continue to use for I/O performance monitoring offer the ability to reveal sufficient I/O tuning information from deep within the operating system. These tools break down into two basic groups:
Command-Line Tools. These reliable standard command-line tools have been around since the dawn of computing (well, at least since green characters were typed upon black backgrounds). Since they are character-based, however, they don’t offer much in the way of slick display; on the other hand, they tend to present a relatively small memory footprint - and that may be crucial when trying to analyze an I/O performance problem that’s masquerading as a memory-bound or CPU-bound issue. Here are a few of my favorites:
Oracle-Based Tools. Not surprisingly, the Oracle Database software itself also offers several I/O performance monitoring tools.
Starting in Oracle 11gR1, Enterprise Manager (EM) also offers the ability to view I/O statistics from several different perspectives: by background and foreground process type, by I/O type, and even by resource consumer group. I’ve included a few sample screens below to show this 11g feature, and I’ll be using it extensively in future articles to investigate and drill down through real-time I/O performance metrics.
I/O Generation Tools: Extra-Database
Several tools are useful for what I like to call “extra-database” analysis. In other words, these tools can be used to generate an I/O workload from outside an Oracle database environment. However, these tools must be used carefully – and probably never within a production database environment! – because they literally could not care less about the state of any extant Oracle database files.
By the way, in my opinion the documentation on ORION is just a little too subtle about what it does “under the covers” to provide a write-intensive workload. Here’s a snippet from the output that appeared when I invoked the 64-bit version of ORION from a terminal window:
ORION: ORacle IO Numbers -- Version 188.8.131.52.0
Parse error: View help screen using ./orion_x86_64 -help.

[oracle@11gR2Base ~]$ ./orion_x86_64 -help
ORION: ORacle IO Numbers -- Version 184.108.40.206.0

ORION runs IO performance tests that model Oracle RDBMS IO workloads.
It measures the performance of small (2-32K) IOs and large (128K+) IOs
at various load levels.

Each Orion data point is done at a specific mix of small and large IO
loads sustained for a duration. Anywhere from a single data point to a
two-dimensional array of data points can be tested by setting the right
options.

An Orion test consists of data points at various small and large IO
load levels. These points can be represented as a two-dimensional
matrix: Each column in the matrix represents a fixed small IO load.
Each row represents a fixed large IO load. The first row is with no
large IO load and the first column is with no small IO load. An Orion
test can be a single point, a row, a column or the whole matrix.

The 'run' parameter is the only mandatory parameter. Defaults are
indicated for all other parameters. For additional information on the
user interface, see the Orion User Guide.

<testname> is a filename prefix. By default, it is "orion". It can be
specified with the 'testname' parameter.

<testname>.lun should contain a carriage-return-separated list of LUNs

The output files for a test run are prefixed by <testname>_<date> where
date is "yyyymmdd_hhmm".

The output files are:
    <testname>_<date>_summary.txt -  Summary of the input parameters along
                                     with min. small latency, max large
                                     MBPS and/or max. small IOPS.
    <testname>_<date>_mbps.csv -     Performance results of large IOs in MBPS
    <testname>_<date>_iops.csv -     Performance results of small IOs in IOPS
    <testname>_<date>_lat.csv -      Latency of small IOs
    <testname>_<date>_tradeoff.csv - Shows large MBPS / small IOPS
                                     combinations that can be achieved at
                                     certain small latencies
    <testname>_trace.txt -           Extended, unprocessed output

Mandatory parameters:
    run          Type of workload to run (simple, normal, advanced, dss, oltp)
                 simple   - tests random 8K small IOs at various loads,
                            then random 1M large IOs at various loads.
                 normal   - tests combinations of random 8K small IOs and
                            random 1M large IOs
                 advanced - run the workload specified by the user using
                            optional parameters
                 dss      - run with random 1M large IOs at increasing loads
                            to determine the maximum throughput
                 oltp     - run with random 8K small IOs at increasing loads
                            to determine the maximum IOPS

Optional parameters:
    testname     Name of the test run
    num_disks    Number of disks (physical spindles). Default is the number
                 of LUNs in <testname>.lun
    size_small   Size of small IOs (in KB) - default 8
    size_large   Size of large IOs (in KB) - default 1024
    type         Type of large IOs (rand, seq) - default rand
                 rand - Random large IOs
                 seq  - Sequential streams of large IOs
    num_streamIO Number of concurrent IOs per stream (only if type is seq)
                 - default 4
    simulate     Orion tests on a virtual volume formed by combining the
                 provided volumes in one of these ways (default concat):
                 concat - A serial concatenation of the volumes
                 raid0  - A RAID-0 mapping across the volumes
    write        Percentage of writes (SEE WARNING ABOVE) - default 0
    cache_size   Size *IN MEGABYTES* of the array's cache. Unless this
                 option is set to 0, Orion does a number of (unmeasured)
                 random IO before each large sequential data point. This is
                 done in order to fill up the array cache with random data.
                 This way, the blocks from one data point do not result in
                 cache hits for the next data point. Read tests are preceded
                 with junk reads and write tests are preceded with junk
                 writes. If specified, this 'cache warming' is done until
                 cache_size worth of IO has been read or written.
                 Default behavior: fill up cache for 2 minutes before each
                 data point.
    duration     Duration of each data point (in seconds) - default 60
    num_small    Number of outstanding small IOs (only if matrix is point,
                 col, or max) - no default
    num_large    For random, number of outstanding large IOs. For
                 sequential, number of streams (only if matrix is point,
                 row, or max) - no default
    matrix       An Orion test consists of data points at various small and
                 large IO load levels. These points can be represented as a
                 two-dimensional matrix: Each column in the matrix
                 represents a fixed small IO load. Each row represents a
                 fixed large IO load. The first row is with no large IO load
                 and the first column is with no small IO load. An Orion
                 test can be a single point, a row, a column or the whole
                 matrix, depending on the matrix option setting below
                 (default basic):
                 basic    - test the first row and the first column
                 detailed - test the entire matrix
                 point    - test at load level num_small, num_large
                 col      - varying large IO load with num_small small IOs
                 row      - varying small IO load with num_large large IOs
                 max      - test varying loads up to num_small, num_large
    verbose      Prints tracing information to standard output if set.
                 Default -- not set

Examples
    For a preliminary set of data
        -run simple
    For a basic set of data
        -run normal
    To evaluate storage for an OLTP database
        -run oltp
    To evaluate storage for a data warehouse
        -run dss
    To generate combinations of 32KB and 1MB reads to random locations:
        -run advanced -size_small 32 -size_large 1024 -type rand -matrix detailed
    To generate multiple sequential 1MB write streams, simulating 1MB RAID0 stripes
        -run advanced -simulate RAID0 -stripe 1024 -write 100 -type seq -matrix col -num_small 0
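The matrix description in the help text is easier to picture with a small example. The following Python sketch is my own illustration, not anything ORION ships: the function name orion_points and the load levels passed to it are hypothetical, since ORION chooses its own load levels at run time. It simply enumerates which (small IO load, large IO load) pairs each matrix setting would cover. I have left out the max option because the help text does not spell out exactly which points it visits, and I am assuming the idle point with no load at all is never tested.

```python
from itertools import product


def orion_points(matrix, small_levels, large_levels, num_small=None, num_large=None):
    """Return (small, large) load pairs for one ORION 'matrix' setting.

    small_levels and large_levels are hypothetical, non-zero load levels
    supplied by the caller; ORION derives its own levels at run time.
    """
    small_levels = list(small_levels)
    large_levels = list(large_levels)

    if matrix == "basic":
        first_row = [(sm, 0) for sm in small_levels]   # first row: no large IO load
        first_col = [(0, lg) for lg in large_levels]   # first column: no small IO load
        return first_row + first_col
    if matrix == "detailed":
        # Whole matrix, row by row; skipping the idle (0, 0) cell is an assumption.
        cells = product([0] + large_levels, [0] + small_levels)
        return [(sm, lg) for lg, sm in cells if (sm, lg) != (0, 0)]
    if matrix == "point":
        return [(num_small, num_large)]                # one fixed load level
    if matrix == "col":
        return [(num_small, lg) for lg in large_levels]  # vary large, fix small
    if matrix == "row":
        return [(sm, num_large) for sm in small_levels]  # vary small, fix large
    raise ValueError(f"unsupported matrix option: {matrix!r}")


if __name__ == "__main__":
    for option in ("basic", "detailed"):
        print(option, orion_points(option, [1, 2], [1]))
    print("col", orion_points("col", [1, 2], [1, 2], num_small=4))
```

Seen this way, basic is the cheapest survey (one row plus one column), while detailed grows with the product of the two sets of load levels, which is worth remembering when each data point runs for 60 seconds by default.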
I’ll likewise demonstrate the use of ORION in future articles in this series, so please stay tuned.
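In the meantime, here is a minimal Python sketch of how an ORION invocation could be assembled from the help text above. It only writes the <testname>.lun file and builds the argument list; it does not run ORION. The helper names write_lun_file and build_orion_command are my own, and the device paths in the usage example are hypothetical placeholders. The only flags used are ones that appear in the help output, and the sketch does not check option names or values beyond the mandatory run parameter.

```python
import tempfile
from pathlib import Path

# Workload types listed under "Mandatory parameters" in the ORION help text.
VALID_RUNS = {"simple", "normal", "advanced", "dss", "oltp"}


def write_lun_file(testname, luns, directory="."):
    """Write <testname>.lun with one LUN per line and return its path."""
    path = Path(directory) / f"{testname}.lun"
    path.write_text("\n".join(luns) + "\n")
    return path


def build_orion_command(run, testname="orion", binary="./orion_x86_64", **options):
    """Build (but do not execute) an ORION argument list.

    Extra keyword arguments become '-name value' pairs and are passed
    through unchecked, so misspelled option names are not caught here.
    """
    if run not in VALID_RUNS:
        raise ValueError(f"run must be one of {sorted(VALID_RUNS)}, got {run!r}")
    command = [binary, "-run", run, "-testname", testname]
    for name, value in options.items():
        command += [f"-{name}", str(value)]
    return command


if __name__ == "__main__":
    # Hypothetical scratch devices; never point ORION at volumes holding real data.
    scratch_luns = ["/dev/sdX1", "/dev/sdY1"]
    with tempfile.TemporaryDirectory() as workdir:
        lun_file = write_lun_file("mytest", scratch_luns, directory=workdir)
        print(lun_file.read_text(), end="")
    print(" ".join(build_orion_command(
        "advanced", testname="mytest",
        size_small=32, size_large=1024, type="rand", matrix="detailed")))
```

Keeping the command as a list makes it easy to review before anything touches a disk, which matters because any run with a non-zero write percentage is exactly the kind of test that should stay far away from production storage.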
I/O Generation Tools: Freeware
What if I need to capture and then regenerate a specific application workload? There are several tools available:
I/O Generation Tools: Not-So-Freeware
What if I need to capture and then regenerate a specific application workload, and I have significant budget to spend doing that? While free tools are great, the old saying that “You get what you pay for” certainly holds true here … especially when support for a technical problem or special testing case warrants outside help. The good news here is that there are several tools available; here are three of my favorite ones:
Now that I’ve covered some of the more common I/O load generation and performance testing tools, it’s time to actually tackle some real-world testing. The next article in this series will focus on how to:
References and Additional Reading
Before you experiment with any of these new features, I strongly suggest that you first look over the corresponding detailed Oracle documentation. I’ve drawn upon the following Oracle Database 11g Release 2 documents for this article’s technical details: