In October 2009, IBM announced that some rather neat technology known as data sharing would be delivered for the IBM DB2 for Linux, Unix and Windows product, in an optional facility dubbed pureScale. Julian Stuhler takes a closer look at this technology and the possible implications for the entire IBM DB2 world.
Introduction
For many years, DB2 for z/OS has been able to provide
mainframe users with unmatched levels of resilience and scalability courtesy of
some rather neat technology known as data sharing. This makes use of IBM’s
Parallel Sysplex technology to allow many DB2 subsystems (or “members”) to
share the same data in a shared-disk architecture. In October 2009, IBM
announced that similar capabilities would be delivered for the DB2 for Linux,
Unix and Windows product, in an optional facility dubbed pureScale. In this
month’s column, I’m going to take a closer look at this technology and the
possible implications for the entire DB2 world.
Why pureScale?
Organizations may have many specific requirements for their chosen
database system, but there are a few generic needs that most of them will share.
These include old-fashioned virtues such as integrity, security,
recoverability, reliability and performance. Larger organizations will
typically add even more demanding items to that list, such as scalability (the
ability to significantly increase the available computing capacity to cope with
increased workload without making fundamental changes to the application) and
resilience (the ability to maintain a service despite and hardware/software
failures that may occur, by eliminating or minimizing single points of
failure).
Traditionally these last two requirements have been met by
hosting applications on mainframe computing platforms (and DB2 for z/OS Data
Sharing in particular). However, as more and more organizations base
mission-critical applications on DB2 for Linux, Unix and Windows, there is
increasing demand for mainframe-like levels of resilience and scalability on these
other platforms. Even in its initial incarnation, the pureScale feature goes a
long, long way to plugging that gap.
A pureScale Primer
PureScale adopts many of the same concepts and terminology
as the well-established DB2 for z/OS Data Sharing technology, usually
considered to be the “gold standard” for shared-disk database architectures.
Multiple DB2 instances, or “members” accept and service incoming DB2 work, with
all of them accessing a single copy of the data (usually held on a shared,
high-performance, fault tolerant disk subsystem).
So, how do you stop multiple processes all updating the same
data at the same time? That’s where the clever technology known as the
“coupling facility” (or CF) comes in. The CF is responsible for co-ordinating
the activities of all of the DB2 members in the pureScale group, and takes the
form of a dedicated unit officially called a “PowerHa pureScale Server” (now
you know why I’m calling it a CF…).
The CF holds shared locking information and cached data
pages of interest to one or more members of the group. Each member has direct
access to the CF via an InfiniBand high-speed network interconnect, minimizing
the performance overhead.
One of the design goals for pureScale was to minimize the
impact to the applications running in the cluster, and although there may be
some need to make minor changes to eke out the very best performance, it is
perfectly possible for an application to run on a pureScale cluster without
making any changes whatsoever.
Workload balancing facilities are provided, to allow work to
be intelligently distributed to various member DB2 systems based on how busy
each one is. Again, for most applications no changes will be required to take
advantage of this.
How does all of this relate to the resilience and
scalability requirements we discussed earlier? Well hopefully the scalability
one is fairly obvious; a pureScale user that needs to expand the available
computing resource in order to handle higher workload volumes can simply add
additional members to the group. IBM has used the hard-won experience of DB2
for z/OS Data Sharing to build in lots of optimizations to minimize the sharing
overheads, providing excellent scalability. In lab tests, that scalability has
been impressively close to being linear (e.g. doubling the number of members
almost doubles the available capacity) but as always your mileage may vary. New
capacity-based charging models are also being introduced, allowing users to
rapidly scale up and scale down their available resource in a very
cost-effective manner.
The resilience angle becomes apparent when you realize that
it’s possible to run two CFs in a duplexed arrangement, with DB2 automatically
keeping primary and secondary CF in sync. So, with dual CFs and multiple DB2
members all hosted in separate physical boxes and a fault-tolerant disk
subsystem, there’s no single point of failure – losing a member, a CF or a
physical disk still allows processing to continue (albeit at a potentially
slower pace due to each surviving server having to shoulder more of the
processing load). This is therefore a true “active/active” clustering
solution.
One more point before we go on to look at the potential
impact of this technology. It’s important to understand that the performance of
a pureScale cluster is critically dependent upon the speed of the interconnects
between the various members and the CF. Therefore, this technology is suitable
for providing local resilience only (i.e. within the same machine room). If you
also require an off-site DR capability (just in case the apocryphal Jumbo jet
really does decide to use your server room as an emergency landing strip),
you’ll need to combine pureScale with other solutions such as HADR, which are
capable of operating over greater distances due to their asynchronous nature.
The Bottom Line
So far so good, but what impact is this technology likely to
have and why aren’t we hearing more about it?
Well, the second of those questions is perhaps easier to
answer than the first. pureScale was announced in October 2009 and the initial
release went GA shortly thereafter, but since then IBM has been busy adding new
capabilities and optimizations to the technology while working with a select
group of customers and partners to build skills, awareness and experience in
the new technology. DB2 for z/OS Data Sharing took several releases to really
bed in, and my bet is that we’ll see a new version of the pureScale technology
later in 2010 that will be a higher profile affair than the original
announcement, with strong partner support from the likes of SAP and some
interesting customer case studies.
And the impact? I’ve already gone on record as saying that
pureScale is the single biggest development in the DB2 for LUW product in the
last ten years and I stand by that remark. While DB2 for z/OS will remain the
“gold standard” for scalability and ultimate availability, pureScale makes it
possible for a whole new set of users to benefit from IBMs unmatched experience
with shared-disk architecture. If this article is the first time you’ve heard
of pureScale, you can bet you’re going to hear plenty more about it in the
future.