PureScale Performance Proven
December 28, 2010
DB2 for z/OS provides mainframe users with unmatched levels of resilience and scalability through technology known as data sharing. IBM announced that similar capabilities would be delivered for DB2 for LUW, in an optional facility dubbed pureScale. Read on to learn whether DB2 pureScale has met or exceeded expectations.
Welcome to my last column for 2010. Rather than the traditional look back on the past year, this month Id like to share some practical experiences for one of the technologies Ive mentioned several times recently: pureScale.
Ive covered the major pureScale concepts in previous columns, but heres a quick refresher.
For many years, DB2 for z/OS has been able to provide mainframe users with unmatched levels of resilience and scalability courtesy of some rather neat technology known as data sharing. This makes use of IBMs Parallel Sysplex technology to allow many DB2 subsystems (or members) to share the same data in a shared-disk architecture. In October 2009, IBM announced that similar capabilities would be delivered for the DB2 for Linux, Unix and Windows product, in an optional facility dubbed pureScale.
As shown in the diagram below, a component known as a CF (aka coupling facility, clustering facility or more properly PowerHa pureScale Server) handles the difficult job of coordinating the updates made by each member to ensure data integrity is maintained. Each member has direct access to the CF via an InfiniBand high-speed network interconnect, minimizing the performance overhead and providing excellent scalability, (IBM has measured near-linear scalability right up to the architectural limit of 128 members).
One of the design goals for pureScale was to minimize the impact to the applications running in the cluster, and although there may be some need to make minor changes to eke out the very best performance, it is perfectly possible for an application to run on a pureScale cluster without making any changes whatsoever. Its possible to run two CFs in a duplexed arrangement, with DB2 automatically keeping primary and secondary CF in sync. So, with dual CFs and multiple DB2 members all hosted in separate physical boxes and a fault-tolerant disk subsystem, theres no single point of failure losing a member, a CF or a physical disk still allows processing to continue (albeit at a potentially slower pace due to each surviving server having to shoulder more of the processing load). This is therefore a true active/active clustering solution.
Finally, IBM is introducing an interesting new licensing option with pureScale. Daily Licensing effectively allows a customer to pay for the capacity they are actually using at any given time, rather than having to size (and pay for) a given environment for the peak capacity which may be only needed for a few days per year. This is shown in the two sample diagrams below, with the red areas on the second chart showing the potential capacity/license savings with this model.
Ive gone on the record stating that pureScale is the single most important development in DB2 for LUW for the past decade, and that means that my organization has been busy building practical experience in the new technology. Here are a few of the more interesting discoveries and technology validations weve been making recently (with thanks to my colleague James Gill who has done most of the hard work).
One of the major benefits of going down the pureScale route is resilience: it is possible to configure a system so there is no single point of failure, and the loss of any given component will not result in an application outage. Automatic Online Member Recovery makes it possible for DB2 to detect the loss of a member (if an LPAR or server crashes, for example) and automatically restart the failed environment on one of the surviving LPARs to allow recovery action to be taken. In the meantime, client connections are automatically re-routed to the surviving members, with workload balancing ensuring that all of them shoulder an equal proportion of the increased load. Once the failed server/LPAR is available again, DB2 automatically detects the fact and restarts the failed member on the original host. Once again, the workload balancing feature will re-distribute incoming work to ensure all members receive approximately the same load.
A similar situation exists in the event of a primary CF failure: DB2 will detect the fact that its no longer getting a heartbeat from the CF and temporarily suspend all work until the secondary CF is brought completely up to date. Once thats done, the secondary CF takes over as the new primary (in simplex mode) and work is allowed to continue. In practice, this means a blip in transaction response times while the secondary CF takes over. (Note if the secondary CF fails DB2 merely continues in simplex mode until such time as the CF can be re-started).
Below is a chart of some internal testing showing a steady transaction rate (using an 80/20 read/write ratio with 100% data sharing) until the primary CF is intentionally crashed. As you can see, the throughput drops to zero for around 10 seconds until the secondary CF takes over and work resumes again, with no transactions lost.
The table below shows the various failure scenarios weve tested, and confirms that pureScale actually delivers on the promise of automatic and transparent recovery in the event of a loss of any single component.
Delivering a resilient system is all well and good, but if it doesnt perform adequately due to excessive clustering overheads the technology is useless. Therefore, we have also been seeing quite how far we can push our system (which is based on low-end commodity hardware). Our original results were quite impressive: a 2-member cluster (based on an Intel D510M0, dual core 1GHz Atom processor, 3GB RAM, 40GB SSD) managed 1000 transactions per second against a 2.5M row table (32 concurrent threads, 25ms think time, 80/20 read/write ratio, 100% data sharing). Thats an incredible achievement for a couple of low-end boxes running netbook processors especially as it was out of the box with no performance tuning or optimization done to the DB2 configuration.
The chart below shows even more impressive numbers. This workload was run on the same hardware as Ive outlined above, but on a tuned DB2 system. This time around, we had 14 client connections with a 1ms think time running against a 250,000 row table, and saw an amazing 5,500 transactions per second with the members showing just 50% CPU load see chart below.
Of course, these are simulated OLTP workloads running against hardware that is not officially supported by pureScale (only IBM x and p servers are currently supported) so your mileage will definitely vary. However, those kinds of transaction rates would have been firmly in mainframe territory not so long ago.
In our testing to date, DB2 pureScale has met or exceeded every one of our expectations, delivering on the promise of a highly robust, scalable and efficient clustering solution for DB2 customers. Well be continuing our research as new capabilities are delivered during 2011, and Ill keep you updated on the results.