Protecting Oracle Instance with Local Clustering
March 13, 2003, by Marin Komadina
Corporations have long worked hard to keep their systems running under all conditions. For many e-commerce and business applications, extended database unavailability means lost revenue. With a wide range of solutions in use (local disk mirroring, RAID, local clustering, remote disk mirroring, replication, and local clustering with Oracle Parallel Server or Real Application Clusters), we need to choose the one best suited to our requirements. One of those solutions is local clustering with Sun Cluster software.
This article covers:
Local Clustering Definition
A local cluster is defined as two or more physical machines (nodes) that share common disk storage and a logical IP address. Clustered nodes exchange cluster information over one or more heartbeat links. The cluster software collects this information and checks the state of both nodes. On an error condition, the software executes a predefined script and switches the clustered services over to a secondary machine. The Oracle instance, as one of the clustered services, is shut down together with the listener process and restarted on the secondary (surviving) node.
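The switchover described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not how Sun Cluster is implemented: the Node class, the service names, the check_cluster function and the max_missed threshold of three heartbeats are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical cluster node holding the services it currently runs."""
    name: str
    alive: bool = True
    services: list = field(default_factory=list)


def failover(primary, secondary, log):
    """Stop each clustered service on the primary and start it on the secondary."""
    for service in list(primary.services):
        log.append(f"stop {service} on {primary.name}")
        primary.services.remove(service)
        secondary.services.append(service)
        log.append(f"start {service} on {secondary.name}")


def check_cluster(primary, secondary, missed_heartbeats, max_missed=3):
    """Trigger a failover when the primary is down or too many heartbeats were missed.

    max_missed is an assumed threshold, not a documented Sun Cluster setting.
    Returns the list of actions taken (empty when nothing was done).
    """
    log = []
    if not primary.alive or missed_heartbeats >= max_missed:
        failover(primary, secondary, log)
    return log


if __name__ == "__main__":
    node_a = Node("node-a", services=["logical-ip", "listener", "oracle-instance"])
    node_b = Node("node-b")
    for entry in check_cluster(node_a, node_b, missed_heartbeats=3):
        print(entry)
```

The point of the sketch is that the logical IP address, the listener and the instance move as one group, so clients reconnect to the same address once the surviving node has restarted the services.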
HA Oracle Agent
The HA Oracle Agent software controls Oracle database activity on Sun Cluster nodes. The agent performs fault checking using two processes on the local node and two processes on the remote node, querying the V$SYSSTAT view for active sessions. If the database has no active sessions, the HA Agent opens a test transaction: it connects and executes create, insert, update, and drop table commands in series. The resulting error codes are validated against a special action file.
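The probing sequence described above might look roughly like the following sketch. It is not the agent's actual code. To keep the example runnable anywhere, Python's built-in sqlite3 module stands in for Oracle, the active session count is passed in as a plain number instead of being read from V$SYSSTAT, and the table name ha_probe and the function names are hypothetical.

```python
import sqlite3

# The four statements of the test transaction, executed in series.
# The table name ha_probe is an assumption made for this sketch.
TEST_STATEMENTS = (
    "CREATE TABLE ha_probe (id INTEGER, note TEXT)",
    "INSERT INTO ha_probe VALUES (1, 'ping')",
    "UPDATE ha_probe SET note = 'pong' WHERE id = 1",
    "DROP TABLE ha_probe",
)


def run_test_transaction(connect):
    """Connect, then run create, insert, update and drop.

    Returns None on success, or a short error description on failure.
    """
    try:
        conn = connect()
    except sqlite3.Error as exc:
        return f"connect failed: {exc}"
    try:
        for statement in TEST_STATEMENTS:
            conn.execute(statement)
        conn.commit()
        return None
    except sqlite3.Error as exc:
        return f"statement failed: {exc}"
    finally:
        conn.close()


def probe(active_sessions, connect):
    """Return (healthy, detail) for one probe cycle."""
    if active_sessions > 0:
        # Live sessions are taken as proof that the database is working.
        return True, "active sessions present"
    error = run_test_transaction(connect)
    if error is None:
        return True, "test transaction succeeded"
    return False, error


if __name__ == "__main__":
    print(probe(0, lambda: sqlite3.connect(":memory:")))
```

Skipping the test transaction when sessions are active keeps the probe cheap on a busy database, and the full create-to-drop cycle runs only when there is no other evidence that the instance is usable.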
# Action file for HA-DBMS Oracle fault monitor
# State  DBMS_er  proc_di  log_msg  timeout  int_err  new_sta  action    message
---
  co     *        *        *        *        1        *        stop      Internal HA-DBMS Oracle error connecting to db
  on     28       *        *        *        *        di       none      Session killed by DBA, will reconnect
  *      50       *        *        *        *        di       takeover  O/S error occurred while obtaining an enqueue
  co     0        *        *        1        0        *        restart   A timeout has occured during connect
--
Takeover - the cluster software switches the services to another node.
Stop - the cluster stops the DBMS.
None - no action is taken.
Restart - the database is restarted locally on the same node.
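To make the role of the action file concrete, here is a minimal sketch of how such a table could be matched against an observed condition. The matching rules are assumptions, not documented agent behavior: "*" is treated as a wildcard, the first matching row wins, and "none" is returned when no row matches. The function names are hypothetical, and the column names keep the abbreviations used in the excerpt above.

```python
# Columns that are compared against the observed condition.
MATCH_FIELDS = ("state", "dbms_er", "proc_di", "log_msg", "timeout", "int_err")

# Rows copied from the action file excerpt above.
ACTION_ROWS = """\
co * * * * 1 * stop Internal HA-DBMS Oracle error connecting to db
on 28 * * * * di none Session killed by DBA, will reconnect
* 50 * * * * di takeover O/S error occurred while obtaining an enqueue
co 0 * * 1 0 * restart A timeout has occured during connect
"""


def parse_rows(text):
    """Turn action file rows into a list of rule dictionaries."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Six match fields, then new_sta, action, and the free-text message.
        parts = line.split(None, 8)
        rules.append({
            "match": dict(zip(MATCH_FIELDS, parts[:6])),
            "new_sta": parts[6],
            "action": parts[7],
            "message": parts[8],
        })
    return rules


def choose_action(rules, observed):
    """Return (action, message) of the first rule whose fields all match.

    Assumption: "*" matches anything, and fields missing from `observed`
    only match a wildcard.
    """
    for rule in rules:
        if all(pattern == "*" or pattern == str(observed.get(name))
               for name, pattern in rule["match"].items()):
            return rule["action"], rule["message"]
    return "none", "no matching rule"


if __name__ == "__main__":
    rules = parse_rows(ACTION_ROWS)
    print(choose_action(rules, {"state": "on", "dbms_er": 50}))
```

With these assumptions, an error code of 50 in any state selects takeover, while error 28 in the "on" state selects none, which matches the messages attached to those rows.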
The HA Oracle Agent requires the Oracle configuration files (listener.ora, oratab, and tnsnames.ora) to reside in a single predefined location, /var/opt/oracle.
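Before enabling the agent on a node, it can be useful to confirm that all three files are in place. The following is a minimal sketch of such a check. The function name missing_config_files is hypothetical, and the check only tests for the presence of the files, not their contents.

```python
from pathlib import Path

# File names and directory as listed in the text above.
REQUIRED_FILES = ("listener.ora", "oratab", "tnsnames.ora")


def missing_config_files(config_dir="/var/opt/oracle"):
    """Return the required configuration files that are absent from config_dir."""
    base = Path(config_dir)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_config_files()
    if missing:
        print("Missing in /var/opt/oracle:", ", ".join(missing))
    else:
        print("All HA Oracle Agent configuration files are present.")
```

Running the same check on every cluster node helps catch the case where the files exist on the primary but were never copied to the secondary.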