Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

DB2

Posted Dec 28, 2010

PureScale Performance Proven

By Julian Stuhler

DB2 for z/OS provides mainframe users with unmatched levels of resilience and scalability through technology known as data sharing. IBM announced that similar capabilities would be delivered for DB2 for LUW, in an optional facility dubbed pureScale. Read on to learn whether DB2 pureScale has met or exceeded expectations.

Welcome to my last column for 2010. Rather than the traditional look back on the past year, this month I’d like to share some practical experiences for one of the technologies I’ve mentioned several times recently: pureScale.

Background

I’ve covered the major pureScale concepts in previous columns, but here’s a quick refresher.

For many years, DB2 for z/OS has been able to provide mainframe users with unmatched levels of resilience and scalability courtesy of some rather neat technology known as data sharing. This makes use of IBM’s Parallel Sysplex technology to allow many DB2 subsystems (or “members”) to share the same data in a shared-disk architecture. In October 2009, IBM announced that similar capabilities would be delivered for the DB2 for Linux, Unix and Windows product, in an optional facility dubbed pureScale.

As shown in the diagram below, a component known as a CF (aka “coupling facility”, “clustering facility” or more properly “PowerHa pureScale Server”) handles the difficult job of coordinating the updates made by each member to ensure data integrity is maintained. Each member has direct access to the CF via an InfiniBand high-speed network interconnect, minimizing the performance overhead and providing excellent scalability, (IBM has measured near-linear scalability right up to the architectural limit of 128 members).

Data on shared disk subsystem

One of the design goals for pureScale was to minimize the impact to the applications running in the cluster, and although there may be some need to make minor changes to eke out the very best performance, it is perfectly possible for an application to run on a pureScale cluster without making any changes whatsoever. It’s possible to run two CFs in a duplexed arrangement, with DB2 automatically keeping primary and secondary CF in sync. So, with dual CFs and multiple DB2 members all hosted in separate physical boxes and a fault-tolerant disk subsystem, there’s no single point of failure – losing a member, a CF or a physical disk still allows processing to continue (albeit at a potentially slower pace due to each surviving server having to shoulder more of the processing load). This is therefore a true “active/active” clustering solution.

Finally, IBM is introducing an interesting new licensing option with pureScale. Daily Licensing effectively allows a customer to pay for the capacity they are actually using at any given time, rather than having to size (and pay for) a given environment for the peak capacity which may be only needed for a few days per year. This is shown in the two sample diagrams below, with the red areas on the second chart showing the potential capacity/license savings with this model.

potential capacity/license savings

potential capacity/license savings

I’ve gone on the record stating that pureScale is the single most important development in DB2 for LUW for the past decade, and that means that my organization has been busy building practical experience in the new technology. Here are a few of the more interesting discoveries and technology validations we’ve been making recently (with thanks to my colleague James Gill who has done most of the hard work).

Resilience

One of the major benefits of going down the pureScale route is resilience: it is possible to configure a system so there is no single point of failure, and the loss of any given component will not result in an application outage. Automatic Online Member Recovery makes it possible for DB2 to detect the loss of a member (if an LPAR or server crashes, for example) and automatically restart the failed environment on one of the surviving LPARs to allow recovery action to be taken. In the meantime, client connections are automatically re-routed to the surviving members, with workload balancing ensuring that all of them shoulder an equal proportion of the increased load. Once the failed server/LPAR is available again, DB2 automatically detects the fact and restarts the failed member on the original host. Once again, the workload balancing feature will re-distribute incoming work to ensure all members receive approximately the same load.

A similar situation exists in the event of a primary CF failure: DB2 will detect the fact that it’s no longer getting a heartbeat from the CF and temporarily suspend all work until the secondary CF is brought completely up to date. Once that’s done, the secondary CF takes over as the new primary (in simplex mode) and work is allowed to continue. In practice, this means a “blip” in transaction response times while the secondary CF takes over. (Note if the secondary CF fails DB2 merely continues in simplex mode until such time as the CF can be re-started).

Below is a chart of some internal testing showing a steady transaction rate (using an 80/20 read/write ratio with 100% data sharing) until the primary CF is intentionally crashed. As you can see, the throughput drops to zero for around 10 seconds until the secondary CF takes over and work resumes again, with no transactions lost.

chart of some internal testing showing a steady transaction rate (using an 80/20 read/write ratio with 100% data sharing)

The table below shows the various failure scenarios we’ve tested, and confirms that pureScale actually delivers on the promise of automatic and transparent recovery in the event of a loss of any single component.

pureScale actually delivers on the promise of automatic and transparent recovery

Performance

Delivering a resilient system is all well and good, but if it doesn’t perform adequately due to excessive clustering overheads the technology is useless. Therefore, we have also been seeing quite how far we can push our system (which is based on low-end commodity hardware). Our original results were quite impressive: a 2-member cluster (based on an Intel D510M0, dual core 1GHz Atom processor, 3GB RAM, 40GB SSD) managed 1000 transactions per second against a 2.5M row table (32 concurrent threads, 25ms think time, 80/20 read/write ratio, 100% data sharing). That’s an incredible achievement for a couple of low-end boxes running “netbook” processors – especially as it was “out of the box” with no performance tuning or optimization done to the DB2 configuration.

The chart below shows even more impressive numbers. This workload was run on the same hardware as I’ve outlined above, but on a tuned DB2 system. This time around, we had 14 client connections with a 1ms think time running against a 250,000 row table, and saw an amazing 5,500 transactions per second with the member’s showing just 50% CPU load – see chart below.

5,500 transactions per second with the member’s showing just 50% CPU load

Of course, these are simulated OLTP workloads running against hardware that is not officially supported by pureScale (only IBM x and p servers are currently supported) so your mileage will definitely vary. However, those kinds of transaction rates would have been firmly in mainframe territory not so long ago.

Summary

In our testing to date, DB2 pureScale has met or exceeded every one of our expectations, delivering on the promise of a highly robust, scalable and efficient clustering solution for DB2 customers. We’ll be continuing our research as new capabilities are delivered during 2011, and I’ll keep you updated on the results.

» See All Articles by Columnist Julian Stuhler



DB2 Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 



















Thanks for your registration, follow us on our social networks to keep up-to-date