Cooling Your Overloaded Database with New DB2 UDB EEE Partitions
February 25, 2003
Today the server hardware is so powerful that it is not unusual to
find standalone low cost SMP (Symmetric Multiprocessing Machine) servers in
different clustered and non-clustered configurations. Data volume and the
number of concurrent database connections on those systems are rising higher
and higher. When we approach warning levels in our hardware utilization
we can simply expand the system with more hardware resources (disk space,
processors, memory) and new DB2 UDB EEE database partitions.
This article covers:
UDB EEE Terminology
Database Design Consideration
Database Configuration Inventory
for Adding New Database Partition
DB2 UDB EEE Terminology
Database Instance is a database manager logical environment consisting of
several database engine processes and their memory allocations. A database is
created within a database instance.
Node is a physical or logical machine hosting one or more DB2
UDB EEE instances or partitions.
Database Partition is a logical database unit with private data and index
subsets, configuration files and transaction logs. Database partitions can be
configured to support Intra-partition (SQL query broken in smaller parts and
executed in parallel on single partition), and inter-partition parallelism (SQL
query executed in parallel on several partitions), or both of them.
Catalog partition is the main database partition which is running DB2
UDB EEE instance manager. This partition has the number 0 and holds the DB2 UDB
EEE system tables that are not distributed to any other partition.
Coordinator partition (dispatcher) is a partition
that coordinates query execution, sends query parts on other partitions for
execution, assembles the result data set and returns it to the requester.
group is a named
set of one or more database partitions that belong to the same database. Node
group provide logical layer between tablespaces and partitions.
Partitioned Database Design Consideration
- Hardware Configuration
Partition design depends on the underlying hardware
configuration. In physical partition design, we have separate partitions on
separate hosts connected together with high-speed links. Today a logical
partition design, where we have several standalone n-CPU machines with a large amount
of RAM and several database partitions running on the same physical host, is
often used. In this configuration, every partition has exclusive access to its
own disk and memory. CPU resources are shared between partitions.
DB2 UDB EEE
The failover function is strictly dependent on underlying
cluster software such as IBM HACMP (High Availability Cluster Multi-Processing),
Veritas Cluster Server, Sun Cluster, and Microsoft Cluster Server. The main
consideration in failover design is the proper distribution of data from the
failed partition to all others after failover. Cluster software is unaware of
partitioning schema inside the database. The default behavior of just switching
the database partition from one node to another would create an unbalanced
database configuration, with one or more overloaded nodes. For that reason we
must configure the cluster software with a takeover list and DB2 UDB EEE with
multiple partitions per node so that each surviving node takes over the same amount of data.
Recommended Data Volume per Partition
the best performance, it is recommended to hold to a maximum of 50-100 GB of user
data per partition.
CPU Number Per Partition
This is the optimal performance that we can get with 2 CPUs per
Partitions Number Per Table
The number of partitions per user table varies and is dependant on
table size. The largest table should span all partitions while small tables
should be partitioned using the join key as a partition key. Small tables should
be grouped together on catalog node.
Buffer Pool Configuration
pool configuration depends on the tablespace(s) page size. For every page size
that differs from the default (4KB), we have to calculate and create a new
buffer pool for that tablesapce(s).