Activating the Fast-Start Failover Observer
Now that the configuration of FSFO is complete, all I need to do is enable the
configuration via DGMGRL as shown below. Note that I'm also enabling logging of
Data Guard Broker activity for the command-line utility so that I can track any
unexpected issues related to FSFO's performance or configuration:
[oracle@11gStdby ~]$ dgmgrl -logfile 11gStdby1_observer.log
DGMGRL for Linux: Version 11.1.0.6.0 - Production
Copyright (c) 2000, 2005, Oracle. All rights reserved.
Welcome to DGMGRL, type "help" for information.
DGMGRL> connect sys/oracle
Connected.
DGMGRL> ENABLE FAST_START FAILOVER;
Enabled.
Finally, it's time to start up FSFO. Once again, I'll use DGMGRL to start the
Fast-Start Failover Observer process:
DGMGRL> START OBSERVER;
Once the FSFO is started, I can confirm that it's been activated properly with
the SHOW CONFIGURATION, SHOW DATABASE, and SHOW FAST_START FAILOVER commands:
DGMGRL> show configuration verbose
Configuration
Name: MAA_orcl
Enabled: YES
Protection Mode: MaxPerformance
Databases:
orcl_primary - Primary database
orcl_stdby1 - Physical standby database
- Fast-Start Failover target
Fast-Start Failover: ENABLED
Threshold: 180 seconds
Target: orcl_stdby1
Observer: 11gStdby
Lag Limit: 30 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Current status for "MAA_orcl":
Warning: ORA-16608: one or more databases have warnings
DGMGRL> show database orcl_primary
Database
Name: orcl_primary
Role: PRIMARY
Enabled: YES
Intended State: TRANSPORT-ON
Instance(s):
orcl_primary
Current status for "orcl_primary":
SUCCESS
DGMGRL> show database orcl_stdby1
Database
Name: orcl_stdby1
Role: PHYSICAL STANDBY
Enabled: YES
Intended State: APPLY-ON
Instance(s):
orcl_stdby1
Current status for "orcl_stdby1":
SUCCESS
DGMGRL> show fast_start failover
Fast-Start Failover: ENABLED
Threshold: 180 seconds
Target: orcl_stdby1
Observer: 11gStdby
Lag Limit: 30 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Configurable Failover Conditions
Health Conditions:
Corrupted Controlfile YES
Corrupted Dictionary YES
Inaccessible Logfile NO
Stuck Archiver NO
Datafile Offline YES
Oracle Error Conditions:
(none)
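For a scripted check, the "Key: value" lines in this output are easy to pick apart. The following is only a minimal sketch in Python, assuming the SHOW FAST_START FAILOVER output has already been captured as text; the function name parse_fsfo_settings is hypothetical and is not part of any Oracle tooling.

```python
def parse_fsfo_settings(output: str) -> dict:
    """Collect 'Key: value' lines from captured SHOW FAST_START FAILOVER output."""
    settings = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        # Skip lines without a colon and section headers whose value is empty.
        if sep and value.strip():
            settings[key.strip()] = value.strip()
    return settings


# Sample text copied from the DGMGRL session shown above.
SAMPLE = """\
Fast-Start Failover: ENABLED
Threshold: 180 seconds
Target: orcl_stdby1
Observer: 11gStdby
Lag Limit: 30 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Health Conditions:
"""

settings = parse_fsfo_settings(SAMPLE)
print(settings["Fast-Start Failover"], settings["Threshold"])
```

The sketch deliberately ignores the health-condition table, whose rows have no colon; those would need a separate rule if they matter to the check.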
Automatic Detection of Failover Conditions: An Example
Now that FSFO is fully configured and is ready to detect a failover situation,
I'll use the same technique I used in the prior article about Data Guard
failover to simulate a failure of the primary database: I'll simply issue the
kill -9 <pid> command against its System Monitor (SMON) background process.
Once again, the death of the primary database is almost immediately recorded in
its alert log:
. . .
Tue Aug 25 18:54:10 2009
Errors in file /u01/app/oracle/diag/rdbms/orcl_primary/orcl_primary/trace/orcl_primary_pmon_6166.trc:
ORA-00474: SMON process terminated with error
PMON (ospid: 6166): terminating the instance due to error 474
Instance terminated by PMON, pid = 6166
. . .
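Entries like these are straightforward to spot programmatically. As a rough sketch only, assuming the alert log text has been read into a string, the Python below pulls out the error number PMON reports when it terminates the instance; the helper name find_instance_terminations is my own invention for illustration.

```python
import re

# Matches the PMON message shown in the alert log excerpt above.
TERMINATION = re.compile(r"terminating the instance due to error (\d+)")


def find_instance_terminations(alert_log_text: str) -> list:
    """Return the error numbers reported when the instance was terminated."""
    return [int(m.group(1)) for m in TERMINATION.finditer(alert_log_text)]


# Excerpt copied from the primary database's alert log above.
EXCERPT = """\
Tue Aug 25 18:54:10 2009
ORA-00474: SMON process terminated with error
PMON (ospid: 6166): terminating the instance due to error 474
Instance terminated by PMON, pid = 6166
"""

print(find_instance_terminations(EXCERPT))
```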
Just as before, the loss of connectivity to the primary database is reflected
within the alert log of the corresponding physical standby database by its
Remote File Server (RFS) background process:
. . .
Tue Aug 25 18:54:49 2009
RFS[2]: Possible network disconnect with primary database
Tue Aug 25 18:54:49 2009
RFS[1]: Possible network disconnect with primary database
Tue Aug 25 18:55:49 2009
. . .
This time, however, there's a dramatic difference! After approximately three
minutes have elapsed, which matches the 180-second Fast-Start Failover
threshold shown earlier, there's a sudden flurry of activity at the physical
standby site as the FSFO automatically detects the failure of the primary
database. In Listing 7.1, I've captured the alert logs of both databases as
well as the Data Guard Broker log entries to show all of the actions that
Oracle 11g initiates during a Fast-Start Failover. After the automatic failover
is complete, the Data Guard configuration fully reflects the successful actions
of the FSFO:
DGMGRL> show configuration verbose
Configuration
Name: MAA_orcl
Enabled: YES
Protection Mode: MaxPerformance
Databases:
orcl_stdby1 - Primary database
orcl_primary - Physical standby database (disabled)
- Fast-Start Failover target
Fast-Start Failover: ENABLED
Threshold: 180 seconds
Target: orcl_primary
Observer: 11gStdby
Lag Limit: 30 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Current status for "MAA_orcl":
Warning: ORA-16608: one or more databases have warnings
DGMGRL> show database verbose orcl_stdby1
Database
Name: orcl_stdby1
OEM Name: orcl_11gStdby1
Role: PRIMARY
Enabled: YES
Intended State: TRANSPORT-ON
Instance(s):
orcl_stdby1
Properties:
DGConnectIdentifier = 'orcl_stdby1'
ObserverConnectIdentifier = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'OPTIONAL'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'AUTO'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '4'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = ''
LogFileNameConvert = '/u01/app/oracle/oradata/orcl/, /u01/app/oracle/oradata/stdby/'
FastStartFailoverTarget = 'orcl_primary'
StatusReport = '(monitor)'
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
HostName = '11gStdby'
SidName = 'orcl_stdby1'
StandbyArchiveLocation = '/u01/app/oracle/flash_recovery_area/STDBY/'
AlternateLocation = ''
LogArchiveTrace = '0'
LogArchiveFormat = 'log_%s_%t_%r.arc'
LatestLog = '(monitor)'
TopWaitEvents = '(monitor)'
Current status for "orcl_stdby1":
Warning: ORA-16829: fast-start failover configuration is lagging
DGMGRL> show database verbose orcl_primary
Database
Name: orcl_primary
OEM Name: orcl_11gPrimary
Role: PHYSICAL STANDBY
Enabled: NO
Intended State: APPLY-ON
Instance(s):
orcl_primary
Properties:
DGConnectIdentifier = 'orcl_primary'
ObserverConnectIdentifier = ''
LogXptMode = 'ASYNC'
DelayMins = '0'
Binding = 'OPTIONAL'
MaxFailure = '0'
MaxConnections = '1'
ReopenSecs = '300'
NetTimeout = '30'
RedoCompression = 'DISABLE'
LogShipping = 'ON'
PreferredApplyInstance = ''
ApplyInstanceTimeout = '0'
ApplyParallel = 'AUTO'
StandbyFileManagement = 'AUTO'
ArchiveLagTarget = '0'
LogArchiveMaxProcesses = '4'
LogArchiveMinSucceedDest = '1'
DbFileNameConvert = ''
LogFileNameConvert = ''
FastStartFailoverTarget = 'orcl_stdby1'
StatusReport = '(monitor)'
InconsistentProperties = '(monitor)'
InconsistentLogXptProps = '(monitor)'
SendQEntries = '(monitor)'
LogXptStatus = '(monitor)'
RecvQEntries = '(monitor)'
HostName = '11gPrimary'
SidName = 'orcl_primary'
StandbyArchiveLocation = '/u01/app/oracle/flash_recovery_area/ORCL/'
AlternateLocation = ''
LogArchiveTrace = '0'
LogArchiveFormat = 'log_%s_%t_%r.arc'
LatestLog = '(monitor)'
TopWaitEvents = '(monitor)'
Current status for "orcl_primary":
Error: ORA-16661: the standby database needs to be reinstated
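The status line at the end of each SHOW DATABASE listing is what tells me whether a reinstatement is pending. A minimal sketch of checking for it in Python follows, assuming the DGMGRL output has been captured as text; current_statuses and needs_reinstate are hypothetical helper names, not Oracle-supplied functions.

```python
import re

# Header line that precedes each status, e.g.  Current status for "orcl_primary":
STATUS_HEADER = re.compile(r'^Current status for "([^"]+)":\s*$')


def current_statuses(output: str) -> dict:
    """Map each reported name to the status line that follows its header."""
    statuses = {}
    lines = output.splitlines()
    for i, line in enumerate(lines[:-1]):
        match = STATUS_HEADER.match(line.strip())
        if match:
            statuses[match.group(1)] = lines[i + 1].strip()
    return statuses


def needs_reinstate(status: str) -> bool:
    """ORA-16661 is the error shown above for a database awaiting reinstatement."""
    return "ORA-16661" in status


# Status lines copied from the two SHOW DATABASE VERBOSE listings above.
SAMPLE = """\
Current status for "orcl_stdby1":
Warning: ORA-16829: fast-start failover configuration is lagging
Current status for "orcl_primary":
Error: ORA-16661: the standby database needs to be reinstated
"""

pending = [name for name, status in current_statuses(SAMPLE).items()
           if needs_reinstate(status)]
print(pending)
```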
Reinstating the Original Primary Database
My previous example of initiating Fast-Start Failover brings to light an
interesting situation: What if the primary database was actually completely
healthy at the time that FSFO acknowledged the conditions for Fast-Start
Failover? Here's where the brilliance of enabling Flashback Logging on both the
primary and physical standby databases really shines through: with a single
command, it's a simple matter to reinstate the original primary database as a
physical standby database.
To illustrate, I'll issue the REINSTATE DATABASE command from a DGMGRL session
connected to the new primary database, ORCL_STDBY1, and I'll designate the
original primary database, ORCL_PRIMARY, as the target of the reinstatement:
DGMGRL> reinstate database orcl_primary
Once again, there's a flurry of activity on the original primary database as
Data Guard Broker attempts, and successfully completes, the reinstatement. I've
captured the pertinent alert log entries from the ORCL_PRIMARY database in
Listing 7.2, and DGMGRL reflects the appropriate Data Guard configuration once
the reinstatement has completed:
DGMGRL> show configuration verbose
Configuration
Name: MAA_orcl
Enabled: YES
Protection Mode: MaxPerformance
Databases:
orcl_stdby1 - Primary database
orcl_primary - Physical standby database
- Fast-Start Failover target
Fast-Start Failover: ENABLED
Threshold: 180 seconds
Target: orcl_primary
Observer: 11gStdby
Lag Limit: 30 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Current status for "MAA_orcl":
SUCCESS
Switching Back. Since the original primary database is now successfully
restored as part of the Data Guard environment, I'll ask the original primary
and physical standby databases to switch roles with the SWITCHOVER command:
DGMGRL> switchover to orcl_primary;
Performing switchover NOW, please wait...
New primary database "orcl_primary" is opening...
Operation requires shutdown of instance "orcl_stdby1" on database "orcl_stdby1"
Shutting down instance "orcl_stdby1"...
ORA-01109: database not open
Database dismounted.
ORACLE instance shut down.
Operation requires startup of instance "orcl_stdby1" on database "orcl_stdby1"
Starting instance "orcl_stdby1"...
ORACLE instance started.
Database mounted.
Switchover succeeded, new primary is "orcl_primary"
DGMGRL> show configuration verbose;
Configuration
Name: MAA_orcl
Enabled: YES
Protection Mode: MaxPerformance
Databases:
orcl_primary - Primary database
orcl_stdby1 - Physical standby database
- Fast-Start Failover target
Fast-Start Failover: ENABLED
Threshold: 180 seconds
Target: orcl_stdby1
Observer: 11gStdby
Lag Limit: 30 seconds
Shutdown Primary: TRUE
Auto-reinstate: TRUE
Current status for "MAA_orcl":
SUCCESS
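To confirm that the roles really have returned to their original assignments, the "name - role" lines in the SHOW CONFIGURATION output can be compared against what I expect. The Python below is only an illustrative sketch, assuming the output has been captured as text and that database names contain only letters, digits, and underscores; database_roles is a hypothetical helper name.

```python
import re

# Matches lines such as:  orcl_primary - Primary database
ROLE_LINE = re.compile(r"^([A-Za-z0-9_]+)\s+-\s+(.+)$")


def database_roles(output: str) -> dict:
    """Map each database name to the role text DGMGRL lists beside it."""
    roles = {}
    for line in output.splitlines():
        match = ROLE_LINE.match(line.strip())
        if match:
            roles[match.group(1)] = match.group(2).strip()
    return roles


# Lines copied from the SHOW CONFIGURATION VERBOSE output above.
SAMPLE = """\
Databases:
orcl_primary - Primary database
orcl_stdby1 - Physical standby database
- Fast-Start Failover target
Fast-Start Failover: ENABLED
"""

roles = database_roles(SAMPLE)
print(roles["orcl_primary"], "/", roles["orcl_stdby1"])
```

The continuation line that begins with a hyphen ("- Fast-Start Failover target") is skipped on purpose, since it carries no database name of its own.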
Deactivating Fast-Start Failover
To deactivate Fast-Start Failover, all I need to do is issue the DISABLE
FAST_START FAILOVER command from within a DGMGRL session:
DGMGRL> DISABLE FAST_START FAILOVER;
Disabled.
Note that this only disables the possibility of future Fast-Start Failovers
until I re-enable the Fast-Start Failover configuration with the ENABLE
FAST_START FAILOVER command; all of the Fast-Start Failover configuration
details I've so carefully constructed are still intact.
Next Steps
The next article in this series will explore how to construct and maintain a
Logical Standby database in Oracle 11g, focusing on its usefulness in data
warehouse and data mart environments.
References and Additional Reading
While I'm hopeful that I've given you a thorough grounding in the technical
aspects of the features I've discussed in this article, I'm also sure that
there may be better documentation available since it's been published. I
therefore strongly suggest that you take a close look at the corresponding
Oracle documentation on these features to obtain a crystal-clear understanding
before attempting to implement them in a production environment. Please note
that I've drawn upon the following Oracle Database 11g documentation for the
deeper technical details of this article:
B28279-02 Oracle Database 11g New Features Guide
B28294-03 Oracle Database 11g Data Guard Concepts and Administration
B28295-03 Oracle Database 11g Data Guard Broker
B28320-01 Oracle Database 11g Reference Guide
Also, this white paper about Fast-Start Failover Best Practices on Oracle
Technology Network (OTN) helps clarify this feature set.
See All Articles by Columnist Jim Czuprynski