DB2 Enterprise Edition
DB2 Enterprise Edition (DB2 Enterprise) is the premier data server offering from IBM that is the foundation of many mission-critical systems and the primary focus of this book. It is fully Web-enabled, scalable from single core servers to symmetric multicore servers and to massively parallel systems.
This edition of DB2 is considered to be the full-function offering from IBM (for example, it's the only DB2 edition that includes table partitioning). It is available on the same supported flavors of Linux, UNIX, and Windows as DB2 Workgroup. Applications built for DB2 Enterprise can scale upward and execute on massively parallel systems or downward to smaller servers.
DB2 Enterprise is meant for large and mid-sized departmental servers. DB2 Enterprise includes all the functions of the DB2 Express and DB2 Workgroup editions, and more. Additionally, there is a set of feature packs that are exclusive to this edition, such as the new DB2 9 Storage Optimization feature that provides deep row compression, and more.
Note - Some feature packs, such as the Database Partitioning and the Storage Optimization features, are not available using the authorized user metric.
DB2 Enterprise can be licensed using the same VU metric as DB2 Express and DB2 Workgroup; however, in the case of DB2 Enterprise, there is no RAM or VU server rating limits for the server on which this product is installed. DB2 Enterprise can also be licensed via the Authorized User metric. In contrast to DB2 Express and DB2 Workgroup, when licensing DB2 Enterprise using the Authorized User metric, the minimum number of Authorized Users for which it must be licensed is 25 Authorized Users for every 100 VUs on the server where it is installed. For example, if you installed DB2 Enterprise on an 8-core System p server (rated at 800 VUs), you would have to buy 200 Authorized User licenses at a minimum to use this licensing option, even if you intended for this data server to support fewer than 200 users.
Add-on Feature Packs for DB2 Enterprise Edition
DB2 Enterprise also comes with a set of add-on feature packs just like DB2 Express and DB2 Workgroup. For the most part (the exception being the pureXML feature pack), the extensibility features that can be purchased for DB2 Enterprise are unique to this edition of the DB2 family and provide even richer enterprise-capabilities to the data run-time environment.
The way you license DB2 Enterprise feature packs must be identical to how the DB2 Enterprise data server was licensed. This is one area where DBAs must pay particular attention when deciding how to license their DB2 Enterprise data servers because some feature packs aren't available with the DB2 Enterprise Authorized User license.
The feature packs that are available in DB2 Enterprise via Authorized User or Value Unit licensing are:
Performance Optimization Feature Pack
DBAs get licenses to install DB2 Query Patroller and the DB2 Performance Expert tool with this feature pack. The DB2 Query Patroller component of this feature pack is the same one that's part of the Workload Management feature pack available for DB2 Express and DB2 Workgroup.
The DB2 Performance Expert is a separately purchasable tool that's used to simplify performance management and tuning. It offers DBAs a consistent view into their instances, subsystems, databases, and applications across the DB2 family. It has a set of pre-canned reports to identify resource shortage and exception conditions in DB2 including locking conflicts, deadlocks, and application and SQL statements causing high workload. The DB2 Performance Expert also includes a set of detailed reports about SQL, database, and buffer pool activity with trend analysis and what-if hypothesis testing for performance evaluation.
Note - The version of the DB2 Performance Expert that's part of this feature pack can only be used for DB2 data servers running on Linux, UNIX, and Windows. The DB2 Performance Expert tool when purchased outside of this feature pack can be used across the DB2 family.
pureXML Feature Pack
This feature pack provides the same capabilities as the one offered for DB2 Express and DB2 workgroup detailed earlier in this section.
Advanced Access Control Feature Pack
Provides label-based access control (LBAC) protection services to data stored in a DB2 Enterprise data server. Using this feature, data stewards can control the read and write access of a user at the table column and row level. LBAC implements data access controls by creating an affinity between columns and generated protection security labels. If users attempt to get data from a table, they must have matching label credentials (or a parent label) granted to them. When there's a match, access is permitted; without a match, access is denied.
You can use this feature pack to create a security framework whose architecture is built on a hierarchal representation that matches the data access hierarchy, an array of that business entity, or a mix of the two. LBAC can also be used to restrict access to XML documents in columns (though as of the time of this writing, you cannot attach labels to fragments within the document itself).
Geodetic Data Management Feature Pack
This feature pack includes the DB2 Geodetic Data Management Extender that can be used to provide advanced spatial analysis capabilities. What separates this feature from the free DB2 Spatial Extender that comes with all DB2 data servers is that the Geodetic Data Management feature pack contains a built-in set of algorithms that take into consideration the curvature of objects, such as the earth's surface and so on. For example, maps generally are associated with some sort of applied projection in consideration of the map's purpose. The Mercator projection is very popular for navigational maps. When you look at a map generated using this projection, you'll notice that the top and bottom of the map seem much bigger than they really are. The fact that Greenland is one fourteenth the size of Africa often comes as a surprise when people think back to their public school atlas this feature pack compensates for these distortions.
While distortions caused by projections may not be of significance to applications that attempt to locate an address or division of a city, for weather pattern analysis or defense programs it could be very significant. If you can't afford to lose accuracy because of a projection, the Geodetic Data Management feature pack may be appropriate for your applications.
Real-Time Insight Feature Pack
Useful for managing large volumes of incoming data streams. Existing infrastructures can be easily overwhelmed when trying to manage large volumes of incoming data. Incoming data with message rates of tens to hundreds of thousands of messages per second can make it difficult to analyze this high volume of data.
The DB2 Real-Time Insight feature pack is powered by the DB2 Data Stream Engine (not discussed in this chapter since it's beyond the scope of this book) that enables organizations to store and forward high volumes of data from multiple data streams. The data messages from the feed can be aggregated, filtered, and enriched in real time before being stored or forwarded.
DB2 Data Stream Engine can load high volumes of data into the DB2 data server and make that data available to queries in real time through SQL. One example is a financial market data stream that provides information about financial transactions, such as stock trades and quotes.
The benefits of the Real-Time Insight Feature include:
Scalable solution loads large volumes of data with high throughput and low latency
Simultaneous storing and publishing of data from multiple feeds
Insight into the data with filtering and aggregation from feeds before storing and publishing
Maintenance of metadata, such as current state, for entities that are processed from the feeds
Simultaneous persistence of data to multiple database servers on multiple hosts
Real-time access by use of shared memory storage
Easy access to both real-time and historical data through standard SQL, C-API, and Java API interfaces.
Mobility on Demand Feature Pack
This feature pack provides the components of DB2 Everyplace Enterprise Edition detailed earlier in this chapter. When using DB2e via the feature pack, you need to be aware that the DB2 SyncServer must be collocated with the DB2 Enterprise data server (this is the reason for its reduced cost when compared to a full licensed version of DB2 Everyplace Enterprise Edition).
Homogeneous Federation Feature Pack
This feature pack provides the same capabilities as the one offered for DB2 Express and DB2 Workgroup detailed earlier in this section.
Two feature packs available in DB2 Enterprise only through Value Unit licensing (in other words, you can't buy these feature packs with DB2 Enterprise servers that are licensed with authorized users) are:
Database Partitioning Feature (DPF)
DB2 Enterprise provides the capability to enable DB2 to partition data across clusters or massively parallel servers. To the end user or application developer, a partitioned database appears to be on a single system, yet SQL statements are processed in parallel across all servers, thus increasing the execution speed for any given query.
The DPF delivers the true principals of scalability to a DB2 Enterprise environment, namely:
Double the resources, double the data: Each partition processes the same amount of data as before, and response times and throughput remain constant.
Double the resources, keep data constant: Each partition processes half the amount of data as before, and response times will be cut in half, and throughput will double.
Keep resources constant, double the data: Each partition processes double the amount of data as before, response times should double, and throughput will be cut in half.
You can partition data using the DPF across logical (within a larger SMP) and physical servers. An example of a partitioned database across multiple physical servers (though each server typically is small SMP server) is shown in Figure 17.
DB2 partitioned across multiple servers using DPF
In Figure 17 you can see that there are actually six copies of DB2. However, they all appear as a single copy to applications and administrators. Imagine the performance difference between a single copy of DB2 scanning 600,000 rows versus each copy of DB2 owning its own data and resources scanning just 100,000 rows. This is the power of the DPF.
Parallelism in DB2 is automatic and extended to the hash partitioning algorithm used in the DPF. If you are selecting just a few records (where partitioning key value = X), then DB2 will send that query directly to the node that contains that data. However, if you are scanning large amounts of data (as shown in Figure
1-7, typical in data warehousing), then DB2 will send the query to all partitions in the cluster and automatically parallelize the data access operations, driving more resources (RAM, CPU, and I/O) to get the job done faster.
It's not just query performance and faster maintenance operations that are delivered by the DPF. There is a significant resource savings per server because each server owns 1/nth of that data and generally requires fewer resources. Compare the partitioned database (shown on the left side in Figure 18) and the non-partitioned database on the right.
Saving resources with the Database Partitioning Feature
You can see that the servers comprising the partitioned database on the left side of Figure 18 require less memory, as each server is only responsible for 1/nth of the data. In contrast, the non-partitioned database on the right side of Figure 18 requires that much more memory be allocated to the buffer pool to accommodate the data.
Finally, DB2 can mix its intra-partition parallelism capabilities (where it runs components of an SQL statement in parallel on a single server) with its inter-partition parallelism (the DPF feature) as shown in Figure
Two kinds of parallelism with DB2 and DPF
You should consider the DPF feature pack as an add-on to DB2 Enterprise in the following circumstances:
The speed of database utility operations is key to your business operations. Operations like reorganizations, backups, and so on are parallelized with DPF so they can be performed much quicker. For example, if the six servers in Figure
1-7 all run the backup utility, they each only need to back-up one-sixth of the total data, and this operation should complete in one-sixth of the time when compared to a non-partitioned database.
You need to shrink your batch window because of long extract, transform, and load processes (ETL). For example, data server load jobs are also parallelized. For example, if you had to load 600 GB of data, only 100 GB of data would need to be loaded into each server.
Rolling window data update requirements for the warehouse make parallel SQL processing and additional log space essential. Not only does DPF give you more power to process the SQL, it gives you more resources (log space, memory, and so on).
DPF should be considered if the database contains more than 400 GB of RAW data. Other indicators for using DPF are the total number of rows in the table and whether scan performance is critical to workload performance. Large databases can be supported with DB2 Enterprise Edition, but large databases typically benefit from the DPF.
Your environment is characterized by complex queries that involve large aggregations, multi-table joins, and so on. Having multiple servers working on the same SQL problem can generally return results faster than a single server.
Your servers have plenty of available memory. Even though DB2 has 64-bit memory support for non-partitioned databases, multiple partitions have proven to provide more linear scalability and more efficient usage of memory than SMP parallelism alone.
DB2 Storage Optimization Feature
This feature provides deep row compression for you DB2 tables, as well as a backup compression utility that can significantly reduce the on disk size of your database backups. Tests have shown that more than 70 percent of table compression (data only) can be achieved when using this feature pack. Note that indexes remain uncompressed for performance reasons.
A number of other benefits, in addition to disk savings from having smaller tables, arise from row compression. Backups will be smaller (even without backup compression); maintenance operations (like backup) should run faster since there are fewer data pages to backup up; Q/A and test environments will be smaller; heating, ventilation, and air conditioning (HVAC) charges will be decreased since you need fewer disks; and query performance likely will increase because each I/O to disk brings back more rows into memory.
Row compression will drive up the CPU utilization of a data server. However, most systems (especially data warehouse systems) are I/O bound and, since compression will allow more rows on a data page and therefore more rows in the resident memory buffers, you'll likely see improved overall performance of your application.
Backup compression is also a part of this feature pack. The database backup architecture in DB2 is based on an open pluggable interface that allows you to use either the default compression algorithm that's shipped with DB2 or one of your own. The compression algorithm you use is embedded within the backup image so that it can be restored on a different server. Quite simply, if disaster happens and you lose the server where the compression algorithm is located, you can still get your data back because the compression algorithm is part of the backup image.