Oracle 10gR2 RAC Load Balancing Features
March 21, 2007
Synopsis. To insure an even workload distribution, a clustered database must employ methods to distribute incoming sessions evenly and effectively across the various components of its cluster. This article explores how Oracle 10gR2 takes advantage of simple client-side connection load balancing, server-side load balancing, and the new Load Balancing Advisor (LBA) features to significantly improve the distribution of workloads across the appropriate nodes in an Oracle 10gR2 Real Application Clusters (RAC) clustered database environment.
Oracle 10g Real Application Clusters (RAC) provides a robust, high-availability (HA) architecture that is scalable to an unprecedented size as business computing requirements increase. Since Oracle has placed the concept of services as the centerpiece of this architecture, it also means that application workloads can be distributed appropriately across the entire clustered database. This virtually eliminates the possibility that any one application will starve because of resource requirements during its peak processing periods. It also guarantees that there is always a bit of Kentucky windage in the overall estimates for the required peak capacity of any Oracle 10g database system.
Oracle 10g provides several different methods to support load balancing capabilities for this service-based architecture, including client-side connection load balancing, client-side connection failover, and server-side connection load balancing. Since these methods existed before Oracle 10g, Ill briefly review them to explain their relative benefits and shortfalls.
Client-Side Load Balancing Methods
Client-Side Connection Load Balancing. This load balancing method has been available since Oracle 8i. When a user session attempts to connect to the database, the databases listener will assign the session randomly to one of the listed multiple listening endpoints. Listing 1 shows an example of the TNSNAMES.ORA network configuration file entries for an alias named CSLB that uses the LOAD_BALANCING=ON directive to implement this load balancing feature.
While this load balancing method is certainly simplest to implement, it also has an obvious limitation: the listener has no idea if the session has been assigned to an endpoint whose corresponding database server is already overloaded. Moreover, since the listener is essentially picking the connection completely at random, there is no guarantee that the connection chosen will even be available at that time. This may force the session to wait for a considerable length of time perhaps even minutes, a relative eternity in computing timeframes! until the operating system indicates to the listener that the connection is unavailable, which causes the user session to fail eventually with an ORA-01034 ORACLE not available error.
Client-Side Connection Failover. Obviously, balancing the overall number of connections is a desirable goal, but what happens if the chosen connection point is unavailable because the database servers listener is no longer active? To forestall this, Oracle 8i provided the capability to determine if the connection that has been chosen at random is still alive and, if not, continue to try the other listed connection points until a live connection is found. This simple method tends to limit the chance of a lost connection, but unfortunately it must rely on TCP/IP timeouts to determine if a connection is alive or dead, and this means that an application may wait several seconds (or even longer!) before it receives a notification that the connection has been terminated.
Ive laid out the TNSNAMES.ORA entries to activate client-side connection failover in Listing 2. They are almost identical to Listing 1 with the notable exception of one more directive: FAILOVER=ON.
Server-Side Load Balancing
The two previous methods will adequately handle the distribution of user sessions across available resources while helping to guarantee that no session will wait an excessive time to find a currently active address on which to connect. Clearly, a better solution was needed, and Oracle 9i offered one: server-side load balancing. This method divides the connection load evenly between all available listeners by determining the total number of connections on each listener, and then distributing new user session connection requests to the least loaded listener(s) based on the total number of sessions already connected. While a bit more complex to implement because it requires configuration of multiple listeners, it most definitely helps to even out connections across all available listeners in a database system.
To implement server-side load balancing, at least two listeners must be configured. Also, the REMOTE_LISTENERS initialization parameter must be added to the databases PFILE or SPFILE so that the database knows to search out the value provided in that parameter in the database servers TNSNAMES.ORA configuration file. When server-side load balancing is activated, each listener that contributes a listening endpoint communicates with the other listener(s) via each database instances PMON process. Oracle then determines how many user connections each listener is servicing, and it will distribute any new connection requests so that the load is balanced evenly across all servers. The entries in TNSNAMES.ORA direct the listeners to share information about the relative load connectivity.
As shown in Listing 3, Ive gathered these required changes to TNSNAMES.ORA. Ive also created a new PL/SQL package, HR.PKG_LOAD_GENERATOR, that incorporates three different methods to generate user sessions in an attempt to overload an Oracle 10gR2 database listener. A simple shell script, RandomLoadGenerator.sh, calls a few SQL command files that in turn make calls to the packages procedures and thus create a sample workload of approximately 40 connections against an Oracle 10gR2 database that contains the standard sample schemas.
Load Balancing In Oracle 10g Real Application Clusters Environments
These three methods are actually quite effective for distribution of incoming connections evenly across multiple listeners in any single-instance database configuration. An Oracle 10gR2 Real Application Clusters (RAC) clustered database, on the other hand, needs more robust load balancing capabilities because of the nature of that environment.
A RAC clustered database comprises at least two (and usually many more) nodes, each running a separate instance of the clustered database. In addition, a RAC database usually needs to supply a minimum amount of connections and resources to several applications, each with dramatically different resource needs depending on the current business processing cycle(s), so the application load thats placed on each instance in the clustered database therefore can be dramatically different at different times of the day, week, month, and year. Finally, its likely that a RAC clustered database will need to guarantee a minimum cardinality (i.e. a specific number of nodes on which the application needs to run at all times) to one or more mission-critical applications.
RAC Services. Starting in Oracle 8i, an Oracle database could dynamically register a database directly with its corresponding listener(s) based on the settings for the SERVICE_NAMES initialization parameter through the databases Process Monitor (PMON) background process. To completely support this feature, Oracle strongly suggested that the SERVICE_NAME parameter should be used instead of the original SID parameter in the TNSNAMES.ORA configuration file so that an incoming user session could immediately identify the database instance to which a session intended to connect.
Oracle 10g RAC leverages this service naming feature to distribute application connections efficiently across a RAC clustered database. For example, a clustered database may need to support three different applications, OLTP, DSS, and ADHOC. The OLTP application is the main order entry application for this enterprise computing environment, and therefore it needs a minimum cardinality of two cluster database instances at all times. The DSS application, on the other hand, supports extraction, transformation and load (ETL) operations for the enterprises data warehouse, and thus it requires a minimum cardinality of just one instance. Likewise, the ADHOC application supports OLAP and general user query execution against the data warehouse, but it too only requires a minimum cardinality of a single instance.
Oracle 10gR2 RAC: Server-Side Connect-Time Load Balancing
To demonstrate the implementation of load balancing features in a RAC environment, Ill use a relatively straightforward testing platform: a simple two-node RAC clustered database, RACDB, with two instances, RACDB1 and RACDB2, configured on two nodes (RACLINUX1 and RACLINUX2, respectively). Ive set up this configuration using two VMWare Virtual Machines, each running CentOS Linux Enterprise Server 3 Release 8 (kernel 2.4.21-40) as the guest configuration.
Listing 4 shows the SRVCTL commands Ive issued to create and start three new application services, OLTP, DSS, and ADHOC, on the RACDB clustered database. Note that Ive specified both the RACDB1 and RACDB2 instances as the preferred instances for all three applications. When I execute SRVCTL commands to create these services, Oracle 10g automatically adds these service name values to the SERVICE_NAMES parameter for each instance in the cluster database.
Ive also set up three application aliases for these applications in the client TNSNAMES.ORA entries configuration files. Note that Ive specified the SERVICE_NAME parameter as RACDB so that all nodes in the cluster can participate in distributing the load of these applications across the cluster. Ive also used the two nodes virtual IP addresses (raclinux1-vip and raclinux2-vip) as the connection points for these services. This guarantees that if any one of the listeners or instances servicing these applications should fail, ONS will automatically relocate any new connection requests to a new listener alias on another surviving node.
To complete the configuration of server-side connection load balancing for this RAC clustered database, note that Ive set the *.REMOTE_LISTENERS=RACDB_LISTENERS initialization parameter in the databases shared SPFILE. Ive also added a corresponding RACDB_LISTENERS entry to the TNSNAMES.ORA file in the Oracle Home of each node in the cluster. Each databases PMON process will now automatically register the database with the databases local listener as well as cross-register the database with the listeners on all other nodes in the cluster. In this mode, the nodes themselves decide which node is least busy, and then will connect the client to that node.
Its also important to realize that in a RAC environment, the server-side load balancing methodology differs slightly from the methodology used in a single-instance environment because Oracle 10gR2 discriminates whether the incoming connection has been requested as either a dedicated or a shared server connection: