While Oracle 11gR2 Database definitely provides some excellent tools to drill down to the potential root causes of poor I/O performance, generating a standard, consistent, and repeatable I/O workload is crucial for fair comparisons. This article – the fourth in this series – discusses which Transaction Processing Council (TPC) sample schema(s) are available, and which one(s) are most appropriate for evaluation of a database system’s underlying input/output (I/O) subsystem based on corresponding application database application workloads.
The prior article in this series showed how to use Oracle’s free extra-database I/O generation tool, ORION, to:
- Implement extra-database I/O performance tests
- Evaluate the results of extra-database I/O performance testing
While most system administrators and storage administrators may claim that this level of testing is sufficient to establish the efficiency of the underlying I/O subsystem, I’m relatively certain that most Oracle DBAs would contend that extra-database testing has not even waged half of the battle. After all, the real test of how heavily an Oracle database will tax its underlying I/O subsystem and related infrastructure is to actually tax that infrastructure using representative database application workloads.
This article will therefore aim its sights at how to:
- Choose appropriate database schema(s) for realistic testing
- Create example TPC-E and TPC-H database schemas
- Perform initial loading of these schemas using Quest Benchmark Factory
TPC Sample Schemas and Workloads:
As I mentioned in the first article in this series, the Transaction Processing Council (TPC) offers several standardized database schemas and corresponding application workloads that can be used to simulate different database performance profiles. To provide my Oracle database with repeatable yet significantly diverse application workloads, I’ve chosen the two most popular standard schemas:
- The TPC-E Standard schema is a relatively newer workload that appears to be targeted as the replacement for the original OLTP standard schema, TPC-C. While the TPC-C schema provides a relatively small number of tables (8) and few indexes and constraints, TPC-E provides over thirty (33) tables and corresponding indexes and constraints. The TPC-E application workload simulates a high-volume OLTP financial transaction environment (e.g. ATM transactions, bond trading, stock market trades) that also incorporates the need to report against transactions in real time. It’s therefore perfect for evaluating an intense random read / random write database application I/O workload profile.
- At the other end of the spectrum, the TPC-H Standard is a true data warehousing (DW) schema. Its eight tables encompass a traditional dimensional / fact table entity DSS structure, and it’s aimed squarely at duplicating sequential read database application performance during reporting and duplicating sequential write database application performance during extraction, transformation and loading (ETL) operations,
Detailed information on these schemas, exacting specifications for generation of consistent application workloads, and accurate metrics for comparison of different system’s responses is available at the Transaction Processing Council’s web site. At the end of this article, I’ve included links to the most recent versions of these documents as of this writing.
Using Benchmark Factory: Loading the TPC-H Schema
I’m certain that any Oracle DBA with reasonable skills could construct these schemas, write code to generate sufficient sample data, and then load them in the exact ratios that the TPC standards demand for accurate comparisons. As I mentioned in the prior articles in this series, however, there are excellent alternatives that can help any overly-leveraged Oracle DBA to leverage her limited spare time. To construct and load these schemas quickly and efficiently, I’ll utilize one of my favorite tools: Quest’s Benchmark Factory.
I’ll use Benchmark Factory to illustrate how to create and load the TPC-H schema with a relatively small amount of sample data that’s populated in exactly the proportions mandated by the TPC-H Standard. Before I do that, however, I’ll create a new schema, TPCH, and separate tablespaces for each table and index so that it’ll be easier to migrate these tablespaces to different disk configurations in later experiments. I’ve captured these tablespace creation commands in Listing 4.1.
To insure speedier table loading and index creation within this schema, I’ve also temporarily deactivated the creation of archived redo logs and flashback logs via SQL*Plus commands:
SQL> shutdown immediate; Database closed. Database dismounted. ORACLE instance shut down. SQL> startup mount; ORACLE instance started. Total System Global Area 810053632 bytes Fixed Size 2217712 bytes Variable Size 398461200 bytes Database Buffers 406847488 bytes Redo Buffers 2527232 bytes Database mounted. SQL> alter database flashback off; Database altered. SQL> alter database noarchivelog; Database altered. SQL> alter database open; Database altered.
Now that my database environment is ready to go, I’ll use Benchmark Factory to complete the entire table creation and loading process, including index creation. As shown in Figures 4.1.1 – 4.1.3 below, the first thing I’ll need to do is prepare a profile under which my load scenario will be executed.
Figure 4.1.1. Benchmark Factory: Creating a Profile.
Figure 4.1.2. Benchmark Factory: Choosing the Profile’s Database Driver.
Figure 4.1.3. Benchmark Factory: Creating a Profile.
I’ll next attach a load scenario to the profile I’ve just created. Benchmark Factory makes it extremely simple to select an appropriate scenario from several “canned” ones, so I’ll set up a Benchmark Factory job that will handle the creation of all required tables, indexes and constraints necessary for a complete TPC-H benchmark as shown in Figures 4.2.1 – 4.2.4 below.
Figure 4.2.1. Benchmark Factory: Creating a Load Scenario.
Figure 4.2.2. Benchmark Factory: Creating a Load Scenario.
Figure 4.2.3. Benchmark Factory: Selecting a Benchmark.
Figure 4.2.4. Benchmark Factory: Attaching a Profile to the Selected Benchmark.
Once I’ve chosen TPC-H as my benchmark, Benchmark Factory offers me options to load the corresponding TPC-H tables in the exact proportions that the TPC-H Standard demands – including a pretty accurate estimate of the amount of disk space required to load these tables – as shown in Figure 4.3.1 below.
Figure 4.3.1. Benchmark Factory: Specifying the TPC-H Benchmark Scale.
Clicking on the Advanced tab of this screen offers me the ability to specify multiple attributes for each table and index in the TPC-H schema:.
Figure 4.3.2. Benchmark Factory: Listing TPC-H Schema Objects.
I chose the same settings for all objects during their creation via Benchmark Factory. As Figure 4.3.3 shows for table H_PART, for example, I specified:
- In which tablespace each object should reside
- How many parallel processes should be used to speed object creation
- Whether redo logging should be activated during the loading
- Whether table monitoring and caching should be activated after the table has been created
Figure 4.3.3.Benchmark Factory: Specifying Table Creation and Loading Options.
(If you find this type of object-by-object editing tedious, be aware that it’s also possible to edit the XML configuration files that Benchmark Factory uses to store all of this information to make global changes to these specifications.) I’ve included the XML file I edited during my configuration efforts in Listing 4.2.
Now that I’ve made all the desired adjustments to the TPCH schema’s tables and indexes, it’s time to submit the schema creation job. At this point in the dialogue, I can either simply save this Benchmark Factory job for later execution, or submit the job for immediate execution:
Figure 4.4.1. Benchmark Factory: Submitting the TPC-H Loading Job.
I can view the progress of this job as it executes in real time by clicking on the corresponding Messages link from within Benchmark Factory’s Jobs Queue page:
Figure 4.4.2. Benchmark Factory: Monitoring the TPC-H Loading Job.
When the TPC-H loading job reaches completion, Benchmark Factory returns the final statistics for the loading process. I’ve excerpted the final few messages from its Messages log below.
...
Loading table H_Lineitem ( 5994000 (99.90%) rows completed )
Loading table H_Lineitem ( 5995000 (99.92%) rows completed )
Loading table H_Lineitem ( 5996000 (99.93%) rows completed )
Loading table H_Lineitem ( 5997000 (99.95%) rows completed )
Loading table H_Lineitem ( 5998000 (99.97%) rows completed )
Completed loading table H_Lineitem.
Create Indexes
Test completed successfully.
** Job finished.
Total Time to perform the job:
Start Time: 10/3/2010 12:17:17 AM
End Time : 10/3/2010 12:26:50 AM
Elapsed : 0 Days 00:09:32.190
At last, a TPC schema, which I can test against! This entire run took less than 75 minutes to complete. I’ve included the DDL for the tables that Benchmark Factory created in Listing 4.3, and the DDL for the resulting indexes in Listing 4.4.
I repeated this same process for the TPC-E schema, creating a new user account (TPCE), building separate tablespaces for each table and index, selecting all of the standard tables and indexes, and then setting identical parameters for each table as shown in Figure 4.3.3. Once all tables and indexes had been created for both schemas, I brought my Oracle database back into ARCHIVELOG mode via SQL*Plus:
SQL> shutdown immediate; Database closed. Database dismounted. ORACLE instance shut down. SQL> startup mount; ORACLE instance started. Total System Global Area 810053632 bytes Fixed Size 2217712 bytes Variable Size 398461200 bytes Database Buffers 406847488 bytes Redo Buffers 2527232 bytes Database mounted. SQL> alter database archivelog; Database altered. SQL> alter database flashback on; Database altered. SQL> alter database open; Database altered.
Next Steps
I’ve now got my TPC-E and TPC-H database schemas in place, so it’s time to move on to some serious I/O performance simulation and investigation. The next article in this series will focus on how to:
- Generate realistic workloads using Benchmark Factory
- Capture appropriate metrics from my Oracle database and the operating system to quantify I/O response times and throughput
References and Additional Reading
The latest versions of the Transaction Processing Council standards for benchmarking the TPC-E and TPC-H application workloads can be found at:
TPC-E (Extreme OLTP) Standard Specification, v1.1.12:
http://tpc.org/tpce/spec/v1.12.0/TPCE-v1.12.0.pdf
TPC-H (Data Warehouse) Standard Specification, v.1.0.8:
http://tpc.org/tpch/spec/tpch3.8.0.pdf