DB2 for z/OS provides mainframe users with unmatched levels of resilience and scalability through technology known as data sharing. IBM announced that similar capabilities would be delivered for DB2 for LUW, in an optional facility dubbed pureScale. Read on to learn whether DB2 pureScale has met or exceeded expectations.
Welcome to my last column for 2010. Rather than the
traditional look back on the past year, this month I’d like to share some
practical experiences with one of the technologies I’ve mentioned several times
recently: pureScale.
Background
I’ve covered the major
pureScale concepts in previous columns, but here’s a quick refresher.
For many years, DB2 for z/OS has been able to provide
mainframe users with unmatched levels of resilience and scalability courtesy of
some rather neat technology known as data sharing. This makes use of IBM’s
Parallel Sysplex technology to allow many DB2 subsystems (or “members”) to
share the same data in a shared-disk architecture. In October 2009, IBM
announced that similar capabilities would be delivered for the DB2 for Linux,
UNIX and Windows product, in an optional facility dubbed pureScale.
As shown in the diagram below, a component known as a CF (aka “coupling facility”, “clustering facility” or, more properly, “PowerHA pureScale Server”) handles the difficult job of coordinating the updates made by each member to ensure data integrity is maintained. Each member has direct access to the CF via an InfiniBand high-speed network interconnect, minimizing the performance overhead and providing excellent scalability (IBM has measured near-linear scalability right up to the architectural limit of 128 members).
One of the design goals for pureScale was to minimize the impact on applications running in the cluster, and although minor changes may be needed to eke out the very best performance, it is perfectly possible for an application to run on a pureScale cluster without any changes whatsoever. It’s possible to run two CFs in a duplexed arrangement, with DB2 automatically keeping the primary and secondary CFs in sync. So, with dual CFs and multiple DB2 members all hosted in separate physical boxes, plus a fault-tolerant disk subsystem, there’s no single point of failure: losing a member, a CF or a physical disk still allows processing to continue (albeit at a potentially slower pace, as each surviving server has to shoulder more of the processing load). This is therefore a true “active/active” clustering solution.
Finally, IBM is introducing an interesting new licensing
option with pureScale. Daily Licensing effectively allows a customer to pay for
the capacity they are actually using at any given time, rather than having to
size (and pay for) a given environment for peak capacity that may only be needed for a few days per year. This is shown in the two sample diagrams below,
with the red areas on the second chart showing the potential capacity/license
savings with this model.
I’ve gone on the record stating that pureScale is the single
most important development in DB2 for LUW for the past decade, and that means
that my organization has been busy building practical experience in the new
technology. Here are a few of the more interesting discoveries and technology
validations we’ve been making recently (with thanks to my colleague James Gill, who has done most of the hard work).
Resilience
One of the major benefits of going down the pureScale route
is resilience: it is possible to configure a system so there is no single point
of failure, and the loss of any given component will not result in an
application outage. Automatic Online Member Recovery makes it possible for DB2
to detect the loss of a member (if an LPAR or server crashes, for example) and
automatically restart the failed environment on one of the surviving LPARs to
allow recovery action to be taken. In the meantime, client connections are
automatically re-routed to the surviving members, with workload balancing ensuring
that all of them shoulder an equal proportion of the increased load. Once the failed server/LPAR is available again, DB2 automatically detects this and restarts the failed member on its original host, and the workload balancing feature once again redistributes incoming work so that all members receive approximately the same load.
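To take advantage of this automatic re-routing and workload balancing, the client driver needs to be told to use it. Below is a minimal sketch of how that might look from a Java application using the IBM Data Server Driver for JDBC and SQLJ; the host names, port, database name and credentials are made up, and the exact property names should be checked against your own driver level rather than taken as gospel.

// Minimal sketch: enabling workload balancing and automatic client reroute
// from a Java application via the IBM Data Server Driver for JDBC and SQLJ.
// Host names, port, database name and credentials are hypothetical.
import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

public class PureScaleConnect {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.setProperty("user", "db2inst1");       // hypothetical credentials
        props.setProperty("password", "secret");

        // Ask the driver to balance transactions across all active members.
        props.setProperty("enableSysplexWLB", "true");

        // Seed an alternate member in case the host named in the URL is down
        // at first connect; after that the driver maintains the member list
        // it receives from the server.
        props.setProperty("clientRerouteAlternateServerName", "member2.example.com");
        props.setProperty("clientRerouteAlternatePortNumber", "50000");

        // Type 4 connection to any member of the cluster (hypothetical URL).
        Connection conn = DriverManager.getConnection(
                "jdbc:db2://member1.example.com:50000/TESTDB", props);

        // ... run SQL as normal; no application changes are needed beyond this.
        conn.close();
    }
}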
A similar situation exists in the event of a primary CF
failure: DB2 will detect the fact that it’s no longer getting a heartbeat from
the CF and temporarily suspend all work until the secondary CF is brought
completely up to date. Once that’s done, the secondary CF takes over as the new
primary (in simplex mode) and work is allowed to continue. In practice, this
means a “blip” in transaction response times while the secondary CF takes over.
(Note that if the secondary CF fails, DB2 simply continues in simplex mode until the CF can be restarted.)
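Whichever way the interruption arrives (a brief CF takeover “blip”, or a member failure that rolls back in-flight work and re-routes the connection), applications ride through it most gracefully if each unit of work is wrapped in simple retry logic. The sketch below shows one way to do that in Java; it assumes the JCC driver’s convention of raising error code -4498 when a connection has been re-established after a failover, and the table and SQL are purely illustrative.

// Illustrative retry sketch for surviving a member failover from application
// code. Error code -4498 is the value the JCC driver reports when a broken
// connection has been re-established and the in-flight transaction rolled
// back; the account table and SQL shown here are hypothetical.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class RetryableUnitOfWork {
    private static final int MAX_ATTEMPTS = 3;

    public static void transfer(Connection conn, int fromId, int toId, double amount)
            throws SQLException {
        for (int attempt = 1; ; attempt++) {
            try {
                conn.setAutoCommit(false);
                try (PreparedStatement ps = conn.prepareStatement(
                        "UPDATE accounts SET balance = balance + ? WHERE id = ?")) {
                    ps.setDouble(1, -amount); ps.setInt(2, fromId); ps.executeUpdate();
                    ps.setDouble(1, amount);  ps.setInt(2, toId);   ps.executeUpdate();
                }
                conn.commit();
                return; // success
            } catch (SQLException e) {
                if (e.getErrorCode() == -4498 && attempt < MAX_ATTEMPTS) {
                    continue; // transaction already rolled back; simply re-drive it
                }
                throw e;
            }
        }
    }
}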
Below is a chart of some internal testing showing a steady
transaction rate (using an 80/20 read/write ratio with 100% data sharing) until
the primary CF is intentionally crashed. As you can see, the throughput drops
to zero for around 10 seconds until the secondary CF takes over and work resumes, with no transactions lost.
The table below shows the various failure scenarios we’ve
tested, and confirms that pureScale actually delivers on the promise of
automatic and transparent recovery in the event of a loss of any single
component.
Performance
Delivering a resilient system is all well and good, but if it doesn’t perform adequately due to excessive clustering overheads, the technology is useless. Therefore, we have also been testing just how far we can push our system (which is based on low-end commodity hardware). Our original results were quite impressive: a 2-member cluster (based on an Intel D510MO board with a dual-core 1GHz Atom processor, 3GB RAM and a 40GB SSD) managed 1,000 transactions per second against a 2.5M row table (32 concurrent threads, 25ms think time, 80/20 read/write ratio, 100% data sharing). That’s an incredible achievement for a couple of low-end boxes running “netbook” processors, especially as it was “out of the box” with no performance tuning or optimization done to the DB2 configuration.
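For the curious, the shape of this workload is easy to reproduce. The sketch below is an illustrative reconstruction of that kind of driver (not our actual test harness): a pool of threads, each issuing an 80/20 mix of single-row reads and updates against a keyed table with a configurable think time, while the main thread reports transactions per second. The connection details, table and column names are all made up.

// Rough sketch of an OLTP workload driver: 32 threads, 25ms think time,
// 80/20 read/write mix against a 2.5M-row keyed table. Each statement
// auto-commits, so every statement counts as one transaction.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;

public class OltpDriver {
    static final int THREADS = 32;        // concurrent client threads
    static final int THINK_TIME_MS = 25;  // simulated user think time
    static final int TABLE_ROWS = 2_500_000;
    static final double READ_RATIO = 0.8; // 80/20 read/write mix
    static final AtomicLong txCount = new AtomicLong();

    public static void main(String[] args) throws Exception {
        for (int i = 0; i < THREADS; i++) {
            new Thread(OltpDriver::runClient).start();
        }
        while (true) {                    // report throughput once a second
            long before = txCount.get();
            Thread.sleep(1000);
            System.out.println("tps: " + (txCount.get() - before));
        }
    }

    static void runClient() {
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:db2://member1.example.com:50000/TESTDB", "db2inst1", "secret");
             PreparedStatement read = conn.prepareStatement(
                     "SELECT payload FROM test_table WHERE id = ?");
             PreparedStatement write = conn.prepareStatement(
                     "UPDATE test_table SET payload = ? WHERE id = ?")) {
            ThreadLocalRandom rnd = ThreadLocalRandom.current();
            while (true) {
                int key = rnd.nextInt(TABLE_ROWS) + 1;
                if (rnd.nextDouble() < READ_RATIO) {
                    read.setInt(1, key);
                    try (ResultSet rs = read.executeQuery()) { rs.next(); }
                } else {
                    write.setString(1, "value-" + rnd.nextInt(1000));
                    write.setInt(2, key);
                    write.executeUpdate();
                }
                txCount.incrementAndGet();
                Thread.sleep(THINK_TIME_MS);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}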
The chart below shows even more impressive numbers. This workload was run on the same hardware outlined above, but on a tuned DB2 system. This time around, we had 14 client connections with a 1ms think time running against a 250,000 row table, and saw an amazing 5,500 transactions per second with the members showing just 50% CPU load.
Of course, these are simulated OLTP workloads running
against hardware that is not officially supported by pureScale (only IBM System x and Power servers are currently supported), so your mileage will definitely vary.
However, those kinds of transaction rates would have been firmly in mainframe
territory not so long ago.
Summary
In our testing to date, DB2 pureScale has met or exceeded
every one of our expectations, delivering on the promise of a highly robust,
scalable and efficient clustering solution for DB2 customers. We’ll be
continuing our research as new capabilities are delivered during 2011, and I’ll
keep you updated on the results.