Oracle RAC: Checking RAC status with SQL/OS level statements

Brief intro


Our last article was about monitoring RAC, which we will get back to in upcoming articles. Monitoring can be a daunting task for a DBA but fortunately, the EM console is a good place to start. There are vendors that deliver products to monitor RAC–we did a 3-part article reviewing the beta product from Quest software called SoRAC or Spotlight on RAC. They are good graphical tools but when it comes to recovering the scenarios like voting disks, crs files and other scenarios you need more than just that.


In this article, we will attempt to run typical scripts to check the status of our RAC nodes. We will also do a quick run on the liveliness of the cluster and then go on to run our typical scripts to see the status of our datafiles. Let us start by checking the status of our RAC on ESX 3.0.1 Server.


Doing quicks checks to check the livliness of the RAC


We have gone through this command several times. It resides under the 4ORACLE_HOME/CRS/bin directory and there are several commands to perform various tasks, such as displaying individual resource and starting and stopping those resources. Anyhow, do the crs –help to get a complete list of each command.

[[email protected] bin]$ crs_stat -t
Name Type Target State Host
————————————————————
ora.esxrac.db application ONLINE ONLINE vm02
ora….c1.inst application ONLINE ONLINE vm01
ora….c2.inst application ONLINE ONLINE vm02
ora….serv.cs application ONLINE ONLINE vm02
ora….ac1.srv application ONLINE ONLINE vm01
ora….ac2.srv application ONLINE ONLINE vm02
ora….SM1.asm application ONLINE ONLINE vm01
ora….01.lsnr application ONLINE ONLINE vm01
ora.vm01.gsd application ONLINE ONLINE vm01
ora.vm01.ons application ONLINE ONLINE vm01
ora.vm01.vip application ONLINE ONLINE vm01
ora….SM2.asm application ONLINE ONLINE vm02
ora….02.lsnr application ONLINE ONLINE vm02
ora.vm02.gsd application ONLINE ONLINE vm02
ora.vm02.ons application ONLINE ONLINE vm02
ora.vm02.vip application ONLINE ONLINE vm02

By checking the status of individual nodes and all the necessary applications, we can see that the VIP, GSD, Listener and the ONS daemons are alive.

[[email protected] bin]$ srvctl status nodeapps -n vm01
VIP is running on node: vm01
GSD is running on node: vm01
Listener is running on node: vm01
ONS daemon is running on node: vm01
[[email protected] bin]$ srvctl status nodeapps -n vm02
VIP is running on node: vm02
GSD is running on node: vm02
Listener is running on node: vm02
ONS daemon is running on node: vm02

Now checking the status of the ASM on both nodes…

[[email protected] bin]$ srvctl status asm -n vm01
ASM instance +ASM1 is running on node vm01.
[[email protected] bin]$ srvctl status asm -n vm02
ASM instance +ASM2 is running on node vm02.

What about the database status?

 [[email protected] bin]$ srvctl status database -d esxrac
Instance esxrac1 is running on node vm01
Instance esxrac2 is running on node vm02

What would be the status of the service that we created at the end of the database installation?

[[email protected] bin]$ srvctl status service -d esxrac
Service fokeserv is running on instance(s) esxrac2, esxrac1

Cluster Status

[[email protected] bin]$ crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
[[email protected] bin]$ ssh vm01
The authenticity of host ‘vm01 (172.22.107.25)’ can’t be established.
RSA key fingerprint is c0:0c:75:15:9a:e5:fc:69:5c:0c:e5:c8:94:00:52:c4.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added ‘vm01,172.22.107.25’ (RSA) to the list of known hosts.
Enter passphrase for key ‘/u01/app/oracle/.ssh/id_rsa’:
Last login: Fri May 4 12:56:08 2007 from 172.22.206.62
[[email protected] ~]$ crsctl check crs
CSS appears healthy
CRS appears healthy
EVM appears healthy
[[email protected] ~]$ exit
logout
Connection to vm01 closed.
[[email protected] bin]$

Querying the RAC

SQL*Plus: Release 10.2.0.1.0 – Production on Fri May 25 10:15:27 2007
Copyright (c) 1982, 2005, Oracle. All rights reserved.

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 – Production
With the Partitioning, Real Application Clusters, Oracle Label Security, OLAP and Data Mining Scoring Engine options


Here we quickly query the gv$instance cluster view to get the instances, hostname, and status of our node apps.

SQL> select instance_name, host_name, archiver, thread#, status
2 from gv$instance
3 /
INSTANCE_NAME HOST_NAME ARCHIVE THREAD# STATUS
————– ——————— ——- ——– ——
esxrac1 vm01.wolga.nl STARTED 1 OPEN
esxrac2 vm02.wolga.nl STARTED 2 OPEN

Checking the SGA (System Global Area) on one of the nodes (Note: Both nodes have identical setup).

SQL> show sga
Total System Global Area 608174080 bytes
Fixed Size 1220820 bytes
Variable Size 142610220 bytes
Database Buffers 457179136 bytes
Redo Buffers 7163904 bytes

Getting a list of all our datafiles…

SQL> select file_name, bytes/1024/1024
2 from dba_data_files
3 /
FILE_NAME BYTES/1024/1024
———————————————– —————
+ORADATA/esxrac/datafile/system.259.620732719 500
+ORADATA/esxrac/datafile/undotbs1.260.620732753 200
+ORADATA/esxrac/datafile/sysaux.261.620732767 670
+ORADATA/esxrac/datafile/example.263.620732791 150
+ORADATA/esxrac/datafile/undotbs2.264.620732801 200
+ORADATA/esxrac/datafile/users.265.620732817 5
6 rows selected.

Getting the status of all the groups, type, membership (if any)…

SQL> select group#, type, member, is_recovery_dest_file
2 from v$logfile
3 order by group#
4 /
GROUP# TYPE MEMBER IS_
—— ——- ————————————————— —
1 ONLINE +ORADATA/esxrac/onlinelog/group_1.257.620732695 NO
1 ONLINE +FLASH_RECO_AREA/esxrac/onlinelog/group_1.257.620732699 YES
2 ONLINE +ORADATA/esxrac/onlinelog/group_2.258.620732703 NO
2 ONLINE +FLASH_RECO_AREA/esxrac/onlinelog/group_2.258.620732707 YES
3 ONLINE +ORADATA/esxrac/onlinelog/group_3.266.620737527 NO
3 ONLINE +FLASH_RECO_AREA/esxrac/onlinelog/group_3.259.620737533 YES
4 ONLINE +ORADATA/esxrac/onlinelog/group_4.267.620737535 NO
4 ONLINE +FLASH_RECO_AREA/esxrac/onlinelog/group_4.260.620737539 YES

Querying the v$asm_diskgroup view…

select group_number, name,allocation_unit_size alloc_unit_size,state,type,total_mb,usable_file_mb
from v$asm_diskgroup;
GROUP_NUMBER NAME ALLOC_UNIT_SIZE STATE TYPE TOTAL_MB USABLE_FILE_MB
———— —————————— ————— ———– —— ———- ————–
1 FLASH_RECO_AREA 1048576 CONNECTED EXTERN 10236 2781
2 ORADATA 1048576 CONNECTED NORMAL 20472 8132

Querying v$asm_disk for our volumes (remember the ones we created first on OS level with the asmlib) :

select name, path, header_status, total_mb free_mb, trunc(bytes_read/1024/1024) read_mb, trunc(bytes_written/1024/1024) write_mb
from v$asm_disk;
NAME PATH HEADER_STATU FREE_MB READ_MB WRITE_MB
—– ———- ———— ———- ———- ———-
VOL1 ORCL:VOL1 MEMBER 10236 39617 15816
VOL2 ORCL:VOL2 MEMBER 10236 7424 15816
VOL3 ORCL:VOL3 MEMBER 10236 1123 13059

All datafiles in one go:

SQL> select name from v$datafile
2 union
3 select name from v$controlfile
4 union
5 select name from v$tempfile
6 union
7 select member from v$logfile
8 /
NAME
——————————————————————————–
+FLASH_RECO_AREA/esxrac/controlfile/current.256.620732691
+FLASH_RECO_AREA/esxrac/onlinelog/group_1.257.620732699
+FLASH_RECO_AREA/esxrac/onlinelog/group_2.258.620732707
+FLASH_RECO_AREA/esxrac/onlinelog/group_3.259.620737533
+FLASH_RECO_AREA/esxrac/onlinelog/group_4.260.620737539
+ORADATA/esxrac/controlfile/current.256.620732689
+ORADATA/esxrac/datafile/example.263.620732791
+ORADATA/esxrac/datafile/sysaux.261.620732767
+ORADATA/esxrac/datafile/system.259.620732719
+ORADATA/esxrac/datafile/undotbs1.260.620732753
+ORADATA/esxrac/datafile/undotbs2.264.620732801
+ORADATA/esxrac/datafile/users.265.620732817
+ORADATA/esxrac/onlinelog/group_1.257.620732695
+ORADATA/esxrac/onlinelog/group_2.258.620732703
+ORADATA/esxrac/onlinelog/group_3.266.620737527
+ORADATA/esxrac/onlinelog/group_4.267.620737535
+ORADATA/esxrac/tempfile/temp.262.620732779
17 rows selected.

Listing all the tablespaces…

SQL> select tablespace_name, file_name
2 from dba_data_files
3 union
4 select tablespace_name, file_name
5 from dba_temp_files
6 /
TABLESPACE_NAME FILE_NAME
—————— ——————————————————————————–
EXAMPLE +ORADATA/esxrac/datafile/example.263.620732791
SYSAUX +ORADATA/esxrac/datafile/sysaux.261.620732767
SYSTEM +ORADATA/esxrac/datafile/system.259.620732719
TEMP +ORADATA/esxrac/tempfile/temp.262.620732779
UNDOTBS1 ` +ORADATA/esxrac/datafile/undotbs1.260.620732753
UNDOTBS2 +ORADATA/esxrac/datafile/undotbs2.264.620732801
USERS +ORADATA/esxrac/datafile/users.265.620732817
7 rows selected.

This script will give you information of the +ASM1 instance files:

SQL> select group_number, file_number, bytes/1024/1024/1024 GB, type, striped, modification_date
2 from v$asm_file
3 where TYPE != ‘ARCHIVELOG’
4 /
GRP_NUM FILE_NUM GB TYPE STRIPE MODIFICAT
——- ——– ——– ————— —— ———
1 256 .01 CONTROLFILE FINE 04-MAY-07
1 257 .05 ONLINELOG FINE 25-MAY-07
1 258 .05 ONLINELOG FINE 24-MAY-07
1 259 .05 ONLINELOG FINE 24-MAY-07
1 260 .05 ONLINELOG FINE 25-MAY-07
1 261 .00 PARAMETERFILE COARSE 24-MAY-07
2 256 .01 CONTROLFILE FINE 04-MAY-07
2 257 .05 ONLINELOG FINE 25-MAY-07
2 258 .05 ONLINELOG FINE 24-MAY-07
2 259 .49 DATAFILE COARSE 04-MAY-07
2 260 .20 DATAFILE COARSE 04-MAY-07
2 261 .65 DATAFILE COARSE 23-MAY-07
2 262 .03 TEMPFILE COARSE 04-MAY-07
2 263 .15 DATAFILE COARSE 04-MAY-07
2 264 .20 DATAFILE COARSE 04-MAY-07
2 265 .00 DATAFILE COARSE 04-MAY-07
2 266 .05 ONLINELOG FINE 24-MAY-07
2 267 .05 ONLINELOG FINE 25-MAY-07
18 rows selected.

More detailed information:

SQL> select group_number, file_number, compound_index, incarnation, block_size, bytes/1024/1024/1024 GB, type, striped,
2 creation_date, modification_date
3 from v$asm_file
4 where TYPE != ‘ARCHIVELOG’
5 /
GRP_NUM FILE_NUM COMPOUND_INDEX INCARNATION BLOCK_SIZE GB TYPE STRIPE CREATION_ MODIFICAT
——- ——– ————– ———– ———- ——– ————— —— ——— ———
1 256 16777472 620732691 16384 .01 CONTROLFILE FINE 24-APR-07 04-MAY-07
1 257 16777473 620732699 512 .05 ONLINELOG FINE 24-APR-07 25-MAY-07
1 258 16777474 620732707 512 .05 ONLINELOG FINE 24-APR-07 24-MAY-07
1 259 16777475 620737533 512 .05 ONLINELOG FINE 24-APR-07 24-MAY-07
1 260 16777476 620737539 512 .05 ONLINELOG FINE 24-APR-07 25-MAY-07
1 261 16777477 620737547 512 .00 PARAMETERFILE COARSE 24-APR-07 24-MAY-07
2 256 33554688 620732689 16384 .01 CONTROLFILE FINE 24-APR-07 04-MAY-07
2 257 33554689 620732695 512 .05 ONLINELOG FINE 24-APR-07 25-MAY-07
2 258 33554690 620732703 512 .05 ONLINELOG FINE 24-APR-07 24-MAY-07
2 259 33554691 620732719 8192 .49 DATAFILE COARSE 24-APR-07 04-MAY-07
2 260 33554692 620732753 8192 .20 DATAFILE COARSE 24-APR-07 04-MAY-07
2 261 33554693 620732767 8192 .65 DATAFILE COARSE 24-APR-07 23-MAY-07
2 262 33554694 620732779 8192 .03 TEMPFILE COARSE 24-APR-07 04-MAY-07
2 263 33554695 620732791 8192 .15 DATAFILE COARSE 24-APR-07 04-MAY-07
2 264 33554696 620732801 8192 .20 DATAFILE COARSE 24-APR-07 04-MAY-07
2 265 33554697 620732817 8192 .00 DATAFILE COARSE 24-APR-07 04-MAY-07
2 266 33554698 620737527 512 .05 ONLINELOG FINE 24-APR-07 24-MAY-07
2 267 33554699 620737535 512 .05 ONLINELOG FINE 24-APR-07 25-MAY-07
18 rows selected.

Conclusion


You need more than a bunch of scripts to monitor your RAC database. Normally in most shops managing and monitoring the RAC from the DB console or even a grid console makes the task much easier. However, you can become a RAC expert if you can practice and play with your RAC with various scenarios. We will start playing with our RAC and attempt to simulate all kinds of crash scenarios. In addition, with 11G coming up we will try to migrate our RAC to 11G (whenever it is publicly available).


» See All Articles by Columnist Tarry Singh

Tarry Singh
I have been active in several industries since 1991. While working in the maritime industry I have worked for several Fortune 500 firms such as NYK, A.P. Møller-Mærsk Group. I made a career switch, emigrated, learned a new language and moved into the IT industry starting 2000. Since then I have been a Sr. DBA, (Technical) Project Manager, Sr. Consultant, Infrastructure Specialist (Clustering, Load Balancing, Networks, Databases) and (currently) Virtualization/Cloud Computing Expert and Global Sourcing in the IT industry. My deep understanding of multi-cultural issues (having worked across the globe) and international exposure has not only helped me successfully relaunch my career in a new industry but also helped me stay successful in what I do. I believe in "worknets" and "collective or swarm intelligence". As a trainer (technical as well as non-technical) I have trained staff both on national and international level. I am very devoted, perspicacious and hard working.

Latest Articles