In-memory databases are becoming increasingly popular and an ever-more important factor in performance-critical activities such as stream processing and deep data analytics. Join Julian Stuhler as he delves into the world of in-memory databases: a technology that’s both reassuringly familiar and intriguingly novel at the same time.
Happy New Year, and welcome to the first column of 2011. The
next year shows no signs of being any less hectic than the last, and I’m
looking forward to sharing interesting IBM Information Management news and
developments with you over the next 12 months. In this column, I’ll be taking a
look at in-memory databases: a technology that’s both reassuringly familiar and
intriguingly novel at the same time.
The basic concept of data in memory is certainly nothing
new. Early hard disks were slow by today’s standards, but even with the
increased rotational speeds and bit density of today’s technology it still
takes several orders of magnitude longer to read or write data to a magnetic disk
when compared to accessing it directly in memory. “I/O avoidance” is therefore
a major preoccupation for anyone concerned about the overall performance of an
IT system, and a great way of avoiding I/O is to keep data in memory once it
has been read, so that next time it’s needed it can be accessed immediately.
The very earliest file management and database systems were
severely memory constrained and had no choice but to read and write data
directly from and to (slow) magnetic disk systems. As computing platforms
evolved and more memory became available, a new generation of systems began to cache
commonly used data to prevent it from constantly being read and re-written to
disk. In DB2, these memory areas were (and still are) known as “buffer pools”. This
feature was present in the very first release of the product on the mainframe
in the mid-1980s, and remains a critical factor in the overall performance of most
DB2 applications today. Of course, as buffer-pools are stored in volatile
memory additional steps need to be taken to ensure that committed updates made
to cached data are recoverable in the event of a system crash (hence DB2’s “log
write ahead” protocol).
Many presentations and papers have been written on the art
and science of DB2 buffer pool tuning. Various approaches exist for sizing the
pools, setting the various thresholds that govern their activity and segmenting
data across them. Facilities such as sequential prefetch allow data to be
asynchronously loaded into a buffer pool in advance of the application actually
needing it, massively reducing the I/O delay for sequential operations.
A favorite tuning technique used by many DBAs involves
setting up one or more buffer pools that are dedicated to small, high-access
tables (such as “code” or “lookup” tables). Such buffer pools are typically
sized so that they are able to hold the entire table and/or indexes,
effectively eliminating all physical IO operations once each page has been
accessed once and read into the pool. DB2 10 for z/OS takes this one stage further
and introduces a new feature (via the PGSTEAL(NONE) attribute) which causes DB2
to automatically read all of the relevant pagesets into the
buffer pool at the first SQL access. Furthermore, the optimizer knows and takes
into account that the data is “pinned” in memory and assumes that no I/O will
be required when making access path decisions.
The advent of 64-bit support on most processors and
operating systems has opened up more possibilities for storing greater amounts of
data in buffer pools during the past couple of years. However, you still need
to have enough physical memory in your server to support those pools, and even
at today’s relatively low prices it rapidly becomes uneconomical to configure
systems beyond a few hundreds of Gigabytes.
Enter the SSD
So, what else has changed to open a new chapter in this
story? One answer to that question is the SSD, or Solid State Disk (aka Solid State Drive). This is essentially a device that uses flash-based, non-volatile memory
to store data, while presenting itself to the outside world (and the operating
system) as just another hard disk drive. Compared to conventional magnetic
media, SSDs are typically somewhat more expensive, but much faster (up to 70
times the number of I/O operations per second), quieter and use significantly
less power. As such, SSDs represent a compromise between the outright speed of
conventional memory access and the low cost of a traditional magnetic disk
drive (see diagram below).
SSDs have been around for while but as the chart above indicates,
they are falling rapidly in price: they are currently around 10 times the price
of a Hard Disk Drive (HDD) on a cost-per-megabyte basis but that is expected to
fall to just 2-3 times over the next 12-24 months.
Today, the SSD is mainly being used as a straightforward
replacement for conventional HDDs. For a DB2 system that is predominantly
I/O-constrained, moving the high-access tablespaces onto SSDs can give some
dramatic and immediate performance benefits with relatively minor
implementation cost and effort.
At the same time, many database vendors are beginning to
extend and enhance their traditional RDBMs systems to make them “SSD aware” and
specifically exploit their unique characteristics. For example, some vendors
are looking at the possibility of multi-tier cache systems based on the “temperature”
of the data, as shown in the diagram below.
In this kind of environment, conventional buffer pools would
continue to host the “hot” data exactly as they currently do. However, the
system administrator can also create a new type of larger second-level buffer
pool on an SSD, aimed at hosting “warm” data, which may be needed again shortly.
Instead of “casting out” updated pages all the way back to the disk when the
conventional buffer pool needs to free up space, the page will instead be moved
to the second-level pool where it can be quickly and efficiently retrieved
again if needed. Conventional “cast out” processing can occur to write updated pages
from either the conventional or level-2/SSD buffer pool to disk when they become
It remains to be seen what sort of real-world performance that
such an architecture could deliver, but internal lab testing has shown up to
2.6x performance improvements for a sample TPC C workload when compared to a more
conventional 2-tier approach.
Dedicated Data in Memory Database Systems
Finally, no discussion on in-memory databases would be
complete without at least a brief mention of the high performance in-memory database
systems that have been built from the ground up to support today’s search and
social networking sites. Google, Twitter, Facebook and many others all rely on various
forms of in-memory database to provide rapid response times in the face of
ever-increasing data volumes. Other data in memory solutions such as SolidDB
(acquired by IBM in 2007) provide a somewhat more generalized solution to address
many of the same issues, and can even be used as a kind of high performance
front end to more conventional disk-based databases.
From a simple HDD swap to SSD-aware RDBMS systems and
beyond, in-memory databases are becoming increasingly popular and an ever-more
important factor in performance-critical activities such as stream processing
and deep data analytics. Expect to hear much more about them over the next few