Oracle Health Checks with Health Monitor; Using PL/SQL or ADR

November 6, 2008

Keep on top of corruption in your Oracle database with health checks.

The old saying, there are 100 ways to.... And in keeping with that saying, Oracle, in Release 11g, has introduced database health checks under a framework called health monitor. This health monitor framework is designed to run diagnostic checks on your Oracle database. It will look at different components of the database—checking for corruption, both physical and logical. While not 100 ways to..., Oracle has, as we have looked at in the last few articles on ADR, also tried to implement and interface into the health checks with ADR. Nevertheless, as we will see, it might not be the best way to do this quite yet.

So, do you have datafile, block, undo, redo, dictionary, or xyz corruption in your database? You might actually be running just fine and not even know it. As we will see, in the following example(s), I have a fresh install of Oracle and it has corruption in it. I’d really like to track that down; but that is for another day. When running health checks, you, as a DBA, can decide to run them adhoc and in a totally reactive mode where someone might complain about an error message or you might see an Oracle alert that begs the question for one of these health checks to be run. You can also run health checks using the DBMS_HM PL/SQL package. Depending on how these run on your system, how long they take, and how often you see corruption, you might want to schedule these as a job. However, please take note that this is a new feature and with all new features, you should be 100% confident they will not cause corruption or lock your system. Just test and monitor those runs appropriately. Also note that these health checks can be run with the database online (OPEN or MOUNT) and available, or when the database is offline (NOMOUNT). As there are multiple types of health checks, there are also various health checks that can only be run with the database offline. For instance, the redo integrity check can only be run offline, which makes complete sense when you think of it. Without going into all of the different corruption/integrity checks, you can use the following SQL to look at them. It gives you a good idea as to which checks can be run while the database is online or offline.

SQL> SELECT name,internal_check,offline_capable,description FROM v$hm_check;
NAME                           I O DESCRIPTION
------------------------------ - - ----------------------------------------------
HM Test Check                  Y Y Check for HM Functionality
DB Structure Integrity Check   N Y Checks integrity of all database files
Data Block Integrity Check     N Y Checks integrity of a datafile block
Redo Integrity Check           N Y Checks integrity of redo log content
Logical Block Check            Y N Checks logical content of a block
Transaction Integrity Check    N N Checks a transaction for corruptions
Undo Segment Integrity Check   N N Checks integrity of an undo segment
All Control Files Check        Y Y Checks all control files in the database
CF Member Check                Y Y Checks a multiplexed copy of the control file
All Datafiles Check            Y Y Check for all datafiles in the database
Single Datafile Check          Y Y Checks a datafile
Log Group Check                Y Y Checks all members of a log group
Log Group Member Check         Y Y Checks a particular member of a log group
Archived Log Check             Y Y Checks an archived log
Redo Revalidation Check        Y Y Checks redo log content
IO Revalidation Check          Y Y Checks file accessability
Block IO Revalidation Check    Y Y Checks file accessability
Txn Revalidation Check         Y N Revalidate corrupted txn
Failure Simulation Check       Y Y Creates dummy failures
Dictionary Integrity Check     N N Checks dictionary integrity

Because some of the health checks require additional input parameters, for the sake of simplicity, I have decided to show just one where there are no input parameters required. If you would like to see the checks that require parameters, you can run the following SQL I found in the Oracle documentation:

SELECT c.name check_name, p.name parameter_name, 
       p.type, p.default_value, p.description
  FROM v$hm_check_param p, v$hm_check c
 WHERE p.check_id = c.id and c.internal_check = 'N'
 ORDER BY c.name;

I don’t know how many times I’ve encountered dictionary problems--everything from erroneous information in the dictionary tables to locking problems--so I am semi-excited about the Dictionary Integrity Check. The dictionary integrity check will take a look at the integrity of core dictionary objects, including: tab$, clu$, fet$, uet$, seg$, undo$, ts$, file$, obj$, ind$, icol$, col$, user$, con$, cdef$, ccol$, bootstrap$, objauth$, ugroup$, tsq$, syn$, view$, typed_view$, superobj$, seq$, lob$, coltype$, subcoltype$, ntab$, refcon$, opqtype$, dependency$, access$, viewcon$, icoldep$, dual$, sysauth$, objpriv$, defrole$, and ecol$. That’s a nice long list as far as I’m concerned and the check should take a look at the contents of dictionary entries, constraints, and referential integrity.

To run the dictionary integrity check manually, we need to issue the following DBMS_HM.RUN_CHECK PL/SQL package—giving the additional name for the run. In this example, I’ll name my run DICTIONARY_CHECK_11032008.

BEGIN
    DBMS_HM.RUN_CHECK('Dictionary Integrity Check', 
    'DICTIONARY_CHECK_11032008');
END;
/

Now this is where it gets a bit interesting. The DBMS_HM.RUN_CHECK stores its information in ADR so you can either look at a report using the DBMS_HM PL/SQL package or the ADRCI utility. To me, this isn’t much of a choice as the DBMS_HM PL/SQL package will output the report in HTML, XML, or text format. The ADRCI utility only outputs the report in XML. However, we will attack this from both so that we can see where the deficiencies in ADR exist and where ADR and DBMS_HM cross.

Since the results of the DBMS_HM.RUN_CHECK stores its information in ADR, we can easily see the HM_RUN by issuing the following ADRCI command.

adrci> show hm_run
ADR Home = /opt/app/oracle/diag/rdbms/db11fs/db11FS:
*****************************************************
**********************************************************
HM RUN RECORD 3
**********************************************************
   RUN_ID                        41
   RUN_NAME                      DICTIONARY_CHECK_11032008
   CHECK_NAME                    Dictionary Integrity Check
   NAME_ID                       24
   MODE                          0
   START_TIME                    2008-11-03 18:32:10.819805 -07:00
   RESUME_TIME                   <NULL>
   END_TIME                      2008-11-03 18:32:18.827387 -07:00
   MODIFIED_TIME                 2008-11-03 18:32:18.827387 -07:00
   TIMEOUT                       0
   FLAGS                         0
   STATUS                        5
   SRC_INCIDENT_ID               0
   NUM_INCIDENTS                 0
   ERR_NUMBER                    0
   REPORT_FILE                   <NULL>

The most notable piece of information, or lack of information, is the REPORT_FILE that is <NULL>. This column is only populated when a report is created. The creation of a report can be done through the DBMS_HM package or through the ADRCI CREATE REPORT command.

For the DBMS_HM package use:

SET LONG 100000
SET LONGCHUNKSIZE 1000
SET PAGESIZE 1000
SET LINESIZE 512
SELECT DBMS_HM.GET_RUN_REPORT('DICTIONARY_CHECK_11032008') FROM DUAL;

For the ADRCI CREATE REPORT command use:

CREATE REPORT hm_run DICTIONARY_CHECK_11032008

When using the DBMS_HM package the output looks like:

DBMS_HM.GET_RUN_REPORT('DICTIONARY_CHECK_11032008')
------------------------------------------------------------------
Basic Run Information
 Run Name                     : DICTIONARY_CHECK_11032008
 Run Id                       : 41
 Check Name                   : Dictionary Integrity Check
 Mode                         : MANUAL
 Status                       : COMPLETED
 Start Time                   : 2008-11-03 18:32:10.819805 -07:00
 End Time                     : 2008-11-03 18:32:18.827387 -07:00
 Error Encountered            : 0
 Source Incident Id           : 0
 Number of Incidents Created  : 0
Input Paramters for the Run
 TABLE_NAME=ALL_CORE_TABLES
 CHECK_MASK=ALL
Run Findings And Recommendations
 Finding
 Finding Name  : Dictionary Inconsistency
 Finding ID    : 42
 Type          : FAILURE
 Status        : OPEN
 Priority      : CRITICAL
 Message       : SQL dictionary health check: file$ pk 42 on object FILE$
               failed
 Message       : Damaged rowid is AAAAARAABAAAAByAAF - description: No further
               damage description available
 Finding
 Finding Name  : Dictionary Inconsistency
 Finding ID    : 45
 Type          : FAILURE
 Status        : OPEN
 Priority      : CRITICAL
 Message       : SQL dictionary health check: dependency$.dobj# fk 126 on
               object DEPENDENCY$ failed
 Message       : Damaged rowid is AAAABnAABAAAOiHABI - description: No further
               damage description available
 Finding
 Finding Name  : Dictionary Inconsistency
 Finding ID    : 48
 Type          : FAILURE
 Status        : OPEN
 Priority      : CRITICAL
 Message       : SQL dictionary health check: dependency$.dobj# fk 126 on
               object DEPENDENCY$ failed
 Message       : Damaged rowid is AAAABnAABAAAQXqAA6 - description: No further
               damage description available

After running either the DBMS_HM package or the ADRCI CREATE REPORT command, we can reissue the ADRCI show hm_run command and now the REPORT_FILE information is populated.

REPORT_FILE /opt/app/oracle/diag/rdbms/db11fs/db11FS/hm/HMREPORT_DICTIONARY_CHECK_11032008.hm

When using the ADRCI utility, you must first run a command to generate the report file if it does not exist, and then run another command to display its contents. You can also view the report through ADRCI by issuing the following command. Because the output is in XML it will not be shown here.

show report hm_run DICTIONARY_CHECK_11032008

Regardless of the method used, these checks are going to be a vital piece of information to investigate corruption in an Oracle database. While I have not performed any elaborate testing on timing, locking, or performance issues with these I will caution their use. Anytime corruption checks are run, there has to be some form of locking on the objects. It will only take time to watch these procedures run and the actual effects they have on a database. But then again, if you have problems, don’t you want to find them? Just another feather in the DBA hat to play with.

» See All Articles by Columnist James Koopmann








The Network for Technology Professionals

Search:

About Internet.com

Legal Notices, Licensing, Permissions, Privacy Policy.
Advertise | Newsletters | E-mail Offers