Free Newsletters:
DatabaseJournal  
DBANews
Database Journal
Search Database Journal:
 
HOME News MS SQL Oracle DB2 Access MySQL PostgreSQL PHP SQL Etc Scripts Links Discussion
internet.com

» HOME
» NEWS
» FEATURES
» SERIES
MS SQL
Oracle
MS Access
MySQL
DB2
» RESOURCES
Products
Scripts
Links
» DISCUSSION
» TECH JOBS

Marketplace Partners
Be a Marketplace Partner




internet.commerce
Be a Commerce Partner
Promotional Pens
Laptops
Domain registration
Desktop Computers
Web Hosting Directory
Online Education
KVM Switch over IP
Online Shopping
Prepaid Phone Card
Data Center Solutions
Promos and Premiums
Find Software
Phone Cards
Remote Online Backup




MySpace Joins eBay, Yahoo in Open Profile Push

News Corp. Unit Under Fire for Ties to Hacker

Are Non-PC Devices Hurting 'Net Innovation?

internet.com
IT
Developer
Internet News
Small Business
Personal Technology
International

Search internet.com
Advertise
Corporate Info
Newsletters
Tech Jobs
E-mail Offers


Linked Data Planet Conference & Expo

CA ERwin® Data Modeler Proven database design and modeling. Efficiently analyze, design and deploy effective database solutions. Whitepaper: Manage SQL Server Deployments
Try it free: CA ERwin® Data Modeler


Guide to Oracle 11g and Database Migration
Oracle Database 11g includes more features for self-management and automation, which makes it easier for customers to cost-effectively manage their data. Download this Internet.com eBook for an overview of some of the new features in 11g and for an overview of the issues you need to consider as you prepare for a database migration. »
Innovate Faster with Oracle Database 11g
Read this in-depth analysis of 56 customers, which shows significant differences between the value software vendors Oracle and SAP deliver to midsize companies. »
Oracle Business Intelligence Standard Edition One
Find out how Newport Beach, CA-based Mobilitie is shaking up the telecom industry by leveraging technology to provide an entirely different financial model for deploying, upgrading, and owning wireless and wireline network assets. »
Business Intelligence and Enterprise Performance Management: Trends for Emerging Businesses
Quickly implementing an ERP software solution can be of tremendous benefit; however, companies often struggle to balance the benefits of reducing implementation time and cost with the risks of an accelerated deployment. Read this white paper to learn about easy-to-follow best practices for achieving a successful accelerated implementation. »
Making the Case for Oracle Database on Windows
Users benefit as vendors reduce enterprise complexity and deliver integration. »

Production Manager (hands on)
Aquent
US-MA-Cambridge

Justtechjobs.com Post A Job | Post A Resume
Oracle
April 30, 2008
Oracle Database 11g: Database Replay, Part 1
By Jim Czuprynski

Synopsis. Oracle DBAs have long wished to be able to capture applications’ workloads in a production environment and then replay that workload against a test environment to determine the impact of proposed database or application changes on database performance. Oracle Database 11gR1’s new Database Replay feature enables an Oracle DBA to capture, massage, and then replay workloads either selectively or in their entirety across a wide range of database environments and platforms. This article provides a primer for using Oracle 11g Database Replay to effectively predict the performance of rapidly changing applications within increasingly fluid database environments.

If my time served in information technology has taught me anything, it’s the utter truth behind Murphy’s Law: “Anything that can go wrong will go wrong.” Over the years, I’ve also learned the veracity of several corollaries to Murphy’s Law, including the fact that “interchangeable parts won’t” and especially the rejoinder that “Murphy was actually an optimist.” I hope to add my own corollary to these grim testimonials one day with the following observation obtained at great pain: “Nothing works in production like it does in the test environment.”

All attempts at ill humor aside, Oracle DBAs face a severe challenge on a regular basis: to predict accurately how the next set of changes to the database’s application code, database patch set, or hardware configuration will affect negatively the entire database environment. When I say the entire environment, I do mean that literally: every SQL statement that any running application has executed, regardless if it’s just a simple query or a complex batch job that issues a huge volume of DML statements, must be captured. This challenge has become even more acute because today’s typical application workloads are generated across a landscape that spans myriad technologies: n-tier application servers, web farms, and even client-server legacy applications. Moreover, when an application appears to be performing poorly, the root cause of a detected performance issue is often nearly impossible to track down; it may be the indirect result of an incorrect network configuration, the application server’s configuration, or even the application client’s environment variable settings.

The only chance to accomplish this lofty yet crucial goal is via an application workload capture-and-playback product suite. This type of application is specifically designed to capture the complete workload that’s been executed against the current production database environment (P+0) and then play back precisely that same workload against the next iteration of the production environment (P+1). However, my past experience has shown this means asking upper management to open the company checkbook to purchase a relatively expensive third-party solution (e.g. HP’s LoadRunner tool). The cost of licensing this software, the cost of procuring and configuring the servers it runs upon, and most especially the manpower expense to configure and prove out the capture-and-playback suite even before workloads can be captured and replayed may easily run into the six-figure USD range. And that’s why many IT organizations have surrendered to the perception that it’s simply impossible to test for application system regression in a cost-effective manner.

Beyond Performance Regression

My prior article series about Oracle Database 11g’s new SQL statement performance tuning features – SQL Performance Analyzer (SPA) and SQL Plan Management (SPM) – discussed how Oracle 11g allows the DBA to easily isolate SQL statements whose performance has either progressed, remained unchanged, or regressed after the application environment has been changed. Any robust capture and playback suite must certainly capture and compare the differences between the original (P+0) versus the future (P+1) system, application, and database performance statistics, especially SQL statements that are now performing dramatically worse. However, there are two other types of regression that any robust capture-and-playback suite needs to address as well:

Error Regression. When a captured workload is being played back, it’s not unusual to encounter errors, and in fact this is almost certainly a desirable outcome. For example, I’d like to verify that an expected exception like the violation of referential integrity (i.e. violation of a primary key, foreign key, unique key, CHECK, or NOT NULL constraint) actually arises and is trapped correctly. Also, I’d like to insure that an appropriate and expected exception is raised whenever a crucial business rule is violated. For example, if during payroll check processing an employee’s net salary doesn’t balance to gross pay less all deductions, I’d like that exception to be trapped as an error.

Therefore, any robust capture and playback suite must be capable of monitoring three types of error regression:

  • Have all expected errors occurred?
  • Next, have any error conditions been raised that were not expected? This obviously indicates that severe error regression is possible because of system or application changes.
  • Finally, have any expected errors not occurred? This condition is much more subtle, because it indicates that something sinister has changed within the system or application, and crucial business rules are either being misapplied or not applied at all.

Data Regression. Any robust capture and playback suite also must pay strict attention to any variance in the data itself after the playback has completed. For example, if I’m testing a mission-critical financial system, I must insure that the identical financial transactions have completed in the appropriate order and all accounts total to the same balances in the P+1 environment as they would in the P+0 environment. If I don’t receive identical results, I must assume that something in my application, database, or environment has changed to prevent a perfect playback.

Another crucial feature of the capture and playback suite: It must be able to insure that the captured workload is played back against the P+1 environment only after that environment has been reset to reflect the state of the P+0 environment at the time workload capture was initiated. Otherwise, there’s a chance that “false positive” indications of data regression will be detected that have nothing to do with changes to the application, database, or environment.

The metaphor of a player piano roll especially appeals to me when I imagine how a workload’s individual, dependent transactions need to be captured and then played back: Once the pianist has finished playing a piece of music, the piano roll has not only captured the exact notes performed, but also the exact frequency at which individual keys were struck. In essence, it gave the listener an exact duplicate that encapsulates the maestro’s playing style, including all of her performance’s subtle pauses. (Humorous aside to readers younger than 25: If you’ve never seen a player piano roll, please substitute “MP3” or even “WAV file” … or ask one of your older co-workers to explain how things were “back in the old days.”)

Database Replay: Features Summary

Thankfully, Oracle 11g’s new Database Replay (DBR) suite offers all of the features I’ve just described. DBR permits an Oracle DBA to:

  • Capture a workload that’s being generated on a production database system. This includes the ability to capture concurrent execution of the same SQL statements across multiple sessions while simultaneously gathering all dependent transactions.
  • Massagethe captured workload before it’s eventually executed against a representative test system. This allows the DBA to adjust the frequency at which the workload is played back, as well as the capability to remap connections to different user sessions, different services, or – in the case of playback on an Oracle 11g Real Application Clusters (RAC) test system – one or more database instances.
  • Replay the captured and “massaged” workload on a test system. The test system is configured to conform to the P+1 configuration so that the DBA can accurately determine exactly how the workload will react to any proposed system changes, including application changes, software patches, and even hardware upgrades. The target for the test system can be a test or QA database environment, or even a snapshot standby database. (More on this last item in later articles.)
  • Perform regression analysis to highlight any differences that exist between the P+0 and P+1 simulated workload. DBR will automatically identify and analyze the vectors of any error regression, data regression, or SQL statement regression.

The beauty of Database Replay is that it eliminates the necessity to fashion simulated workloads to perform regression analysis. Instead, the DBA can execute precisely the same SQL statements that have been recorded, so this tends to provide a more accurate picture of system regression because other extraneous factors (e.g. network latency) are reduced or eliminated. Best of all, since all the recorded SQL statements that comprise the workload are played back, there’s virtually no chance that even seemingly insignificant or rarely executed code will be ignored as a possible vector for regression. This can be crucial in determining if an application will scale properly in a Real Application Clusters (RAC) clustered database environment.

The next four sections of this article provide a high level primer on how Database Replay features interact to accomplish their common goal: an accurate determination of just how much regression can be expected when migrating a production system from P+0 to P+1. In the next articles in this series, I’ll demonstrate how to utilize Database Replay to capture, preprocess, replay, and analyze results from a DBR operation.

Phase 1: Recording a Workload

Oracle 11g Enterprise Manager Database Control provides an intuitive interface to Database Replay features that enables capture, massage, replay, and regression analysis of a workload for performance, error, and data regression. The interface also provides excellent feedback on the status of each phase of a Database Replay task set. Figure 1 below shows the initial screen that’s displayed when a Database Replay operation is initiated by selecting the Database Replay option from the Real Application Testing section of EM’s Software and Support tab:


Figure 1: Database Replay Master Panel.

During this phase of a Database Replay operation, the complete workload as seen from the perspective of the production database is captured and recorded. The DBA simply needs to insure that there is a sufficiently “interesting” (i.e. pertinent) workload running on the production system; DBR does the rest by capturing all executing SQL statements issued from all external clients. This includes:

  • SQL queries, DML statements, and DDL statements
  • PL/SQL blocks and Remote Procedure Calls (RPCs)
  • Object Navigation requests and OCI calls

Note that while the DBR capture operation is running, Oracle 11g doesn’t halt any running background jobs; any internal clients continue making requests as well.

DBR records the workload via a series of shadow processes. These shadow processes then filter the necessary information to precisely reproduce the system workload, and finally write this metadata into a series of XML files – one set of files for each process -- for processing and playback at a later time. About the only concern an Oracle 11g DBA might have, is whether there is sufficient space on the file system for the numerous XML files that are written during this process.

Phase 2: “Massaging” the Workload

Once a DBR workload has been recorded, it almost certainly will need some slight adjustments before it’s replayed. For example, it may be necessary to remap certain external client connections so they can be replayed accurately in the P+1 environment. During this phase, DBR prepares workload-specific metadata for its eventual replay, and any parameters that might affect the outcome of the replay can be modified at this stage.

This preprocessing must occur on the same database version as the replay, but as long as the database versions match, the scrubbing operation can be performed on a production, test, or other database system. In fact, Oracle highly recommends performing this metadata “massaging“ on a host other than the production server so as not to affect the health or performance of that server.

Phase 3: Replaying the Workload

Now that the workload has been massaged, it’s time to initiate its replay. Once the metadata scrubbing described above is done, designated DBR replay client(s) can replay the workload as often as analyses are required.

Resetting the Test Environment. Before any replays are initiated, however, the Oracle DBA must first “reset” the target database and host environment being used for testing, because it’s vital that the test server matches the production server before any changes are applied; otherwise, unexpected regression may occur (a.k.a. a false positive). Fortunately, this is much simpler to perform with the advent of FLASHBACK DATABASE features in Oracle 10g. Other alternatives include normal point-in-time incomplete recovery via RMAN, or even DataPump Export and Import. Once the testing environment has been reset properly, the Oracle DBA next applies all proposed changes to the production system on the test server so that it is now in state P+1, and then she transfers the previously captured workload to the P+1 server.

Replaying the Workload via the Replay Driver. When it’s finally time to replay the previously captured workload against the P+1 server, a special application called the Replay Driver begins sending requests to the target RDBMS. Since the Replay Driver is client-agnostic, it makes no difference to Oracle 11g as to what type of client was sending the requests initially. The Replay Driver consumes the recorded workload and sends appropriate requests to the P+1 system so that it behaves as if external clients were actually issuing the requests.

Since it will distribute all workload capture streams between all replay clients, the Replay Driver can take into account network bandwidth, CPU, and memory capacity. The Replay Driver can also take advantage of remapped connection strings so that they correlate one-to-one (i.e. single-instance to single-instance) or many-to-one (i.e. single node to several Real Application Cluster nodes), and that means that connection load balancing can be taken into account. Just as important, the Replay driver will ignore any activity that was originally generated by internal clients (e.g. EM Database Control) and simply not replay that activity. It will also ignore any recorded activity that utilized connections to external databases via database links or accessed directory objects.

One other intriguing advantage of using Database Replay: A captured workload can be played back in either synchronous or asynchronous replay mode. In synchronous mode, each transaction is replayed in exactly the order that it was recorded; however, DBR can also replay a workload back in asynchronous mode, i.e. without regard to transaction synchronicity, so that a heavier-than-recorded workload can be generated. This is an especially useful feature when attempting to perform a “test to destruction” of a new or modified database environment.

DBR Workload Replay Scope. Oracle 11gR1 Database Replay can accurately evaluate the following types of changes to a database environment during the workload replay phase:

  • Database upgrades
  • Database patches
  • Changes to database schemas
  • Changes to initialization parameters
  • Modifications to one or more Real Application Cluster (RAC) nodes as well as their interconnect configurations
  • Upgrades to the operating system via OS patches and upgrades
  • OS platform modifications, including transitioning from 32-bit to 64-bit environments
  • Changes to server memory or CPU configurations
  • Modifications to the database’s storage configurations, including migration of database files between a named file system (e.g. EXT3, NTFS), ASM storage, and/or raw storage

DBR Workload Replay Limits. Database Replay’s simulation abilities do have some noteworthy (and justifiable) limitations:

  • SQL*Loader direct path loading cannot be replayed; however, conventional path SQL*Loader operations are indeed replayable.
  • Import and export operations, whether via traditional or DataPump methods, are not replayable.
  • Oracle Shared Server sessions cannot be traced.
  • Flashback Database recoveries and Flashback Query operations are not replayable.
  • Oracle Streams, including Advanced Queuing (AQ) operations that aren’t PL/SQL based, can’t be replayed.
  • Distributed transaction processing, including remote COMMIT operations, is only replayable as local transactions.
  • Oracle Call Interface (OCI)-based object navigation isn’t replayable.

For the most part, these limitations do make sense. For example, a Flashback Database operation is essentially an incomplete recovery of the database, so it’s not part of normal transaction processing, and I certainly wouldn’t be concerned about whether its performance has regressed. While the limitation against Shared Server sessions does make sense, there are still a few database shops out there that do utilize Shared Server for connection pooling, so this is a minor disappointment.

Phase 4: Regression Analysis

After the workload replay is completed, Database Replay provides several different analyses of how the replayed workload’s performance varied in the P+1 environment versus its original performance in the P+0 environment. As I discussed at the beginning of this article, any good regression testing suite has the capability to trap and analyze performance regression, data regression, and error regression, and DBR definitely doesn’t disappoint on these features.

For example, DBR can detect immediately any performance differences at a summary level via its set of Capture and Replay reports. From these reports, it’s possible to drill down into detailed analysis of database performance stored within Automatic Database Diagnostic Monitor (ADDM) runs, Automatic Workload Repository (AWR) reports, and Active Session History (ASH) reports.

Whatever the source of the divergence, DBR post-replay analysis can identify and handle two different flavors of divergence:

  • Online divergence may indicate that the database replay has badly malfunctioned, and should probably be halted, as the results of the replay are possibly less than meaningful.
  • Offline divergence is actually an expected result of a successful database replay operation; this type of divergence is typically detected and measured after the replay is completed.

Next Steps

Enough theory! Any tool this elegant deserves a significant evaluation, so in the next article in this series, I’ll demonstrate how to:

  • Capture a simple workload on a single-instance Oracle 11gR1 database
  • Massage (i.e. pre-process) that workload for eventual replay
  • Replay the massaged workload on a two-node Oracle 11gR1 Real Application Cluster (RAC) clustered database
  • Identify possible issues that might arise during transition to a similar target environment

References and Additional Reading

While I’m hopeful that I’ve given you a thorough grounding in the technical aspects of the features I’ve discussed in this article, I’m also sure that there may be better documentation available since it’s been published. I therefore strongly suggest that you take a close look at the corresponding Oracle documentation on these features to obtain crystal-clear understanding before attempting to implement them in a production environment. Please note that I’ve drawn upon the following Oracle Database 11gR1 documentation for the deeper technical details of this article:

B28274-01 Oracle Database 11gR1 Performance Tuning Guide

B28279-02 Oracle Database 11gR1 New Features Guide

» See All Articles by Columnist Jim Czuprynski

Tools:
Add databasejournal.com to your favorites
Add databasejournal.com to your browser search box
IE 7 | Firefox 2.0 | Firefox 1.5.x
Receive news via our XML/RSS feed

Oracle Archives

Flash Demo: Learn how IBM Information Server Blade is easy to manage, highly scalable and efficient.
Whitepaper: HP Integrated Citrix XenServer for HP ProLiant Servers. Sponsored by HP, Citrix, and Intel.
Download: Solaris 8 Migration Assistant. Run Solaris 8 apps on the latest SPARC systems and Solaris 10.
Data Sheet: IBM Information Server Blade
Five Trends for Application Development. Download Your Complimentary Report. Exclusive. Act Now.


Latest Forum Threads
Oracle Forum
Topic By Replies Updated
GET DATA FROM .DBF FILE, ORACLE 9i revelation 5 May 5th, 10:55 AM
Could not locate Java runtime. Oracle installation error revelation 0 April 10th, 12:06 AM
Database Backup junOOni 4 March 20th, 06:28 AM
Helpme to How to Write Text File intelram_18 1 March 17th, 02:54 PM







JupiterOnlineMedia

internet.comearthweb.comDevx.commediabistro.comGraphics.com

Search:

Jupitermedia Corporation has two divisions: Jupiterimages and JupiterOnlineMedia

Jupitermedia Corporate Info


Legal Notices, Licensing, Reprints, & Permissions, Privacy Policy.

Advertise | Newsletters | Tech Jobs | Shopping | E-mail Offers

Solutions
Whitepapers and eBooks
Microsoft Article: HyperV-The Killer Feature in WinServer ‘08
Avaya Article: How to Feed Data into the Avaya Event Processor
Microsoft Article: Install What You Need with Win Server ‘08
HP eBook: Putting the Green into IT
Whitepaper: HP Integrated Citrix XenServer for HP ProLiant Servers
Intel Go Parallel Portal: Interview with C++ Guru Herb Sutter, Part 1
Intel Go Parallel Portal: Interview with C++ Guru Herb Sutter, Part 2--The Future of Concurrency
Avaya Article: Setting Up a SIP A/S Development Environment
IBM Article: How Cool Is Your Data Center?
Microsoft Article: Managing Virtual Machines with Microsoft System Center
HP eBook: Storage Networking , Part 1
Microsoft Article: Solving Data Center Complexity with Microsoft System Center Configuration Manager 2007
MORE WHITEPAPERS, EBOOKS, AND ARTICLES
Webcasts
Intel Video: Are Multi-core Processors Here to Stay?
On-Demand Webcast: Five Virtualization Trends to Watch
HP Video: Page Cost Calculator
Intel Video: APIs for Parallel Programming
HP Webcast: Storage Is Changing Fast - Be Ready or Be Left Behind
Microsoft Silverlight Video: Creating Fading Controls with Expression Design and Expression Blend 2
MORE WEBCASTS, PODCASTS, AND VIDEOS
Downloads and eKits
Sun Download: Solaris 8 Migration Assistant
Sybase Download: SQL Anywhere Developer Edition
Red Gate Download: SQL Backup Pro and free DBA Best Practices eBook
Red Gate Download: SQL Compare Pro 6
Iron Speed Designer Application Generator
MORE DOWNLOADS, EKITS, AND FREE TRIALS
Tutorials and Demos
How-to-Article: Preparing for Hyper-Threading Technology and Dual Core Technology
eTouch PDF: Conquering the Tyranny of E-Mail and Word Processors
IBM Article: Collaborating in the High-Performance Workplace
HP Demo: StorageWorks EVA4400
Intel Featured Algorhythm: Intel Threading Building Blocks--The Pipeline Class
Microsoft How-to Article: Get Going with Silverlight and Windows Live
MORE TUTORIALS, DEMOS AND STEP-BY-STEP GUIDES