Oracle 10g Backup Guide: A Small County Government Approach

by

Kevin E. Medlin

Abstract

Database backups are one of the
most important parts of a database administrator’s job. Backup strategies need
to be reviewed on a regular basis. Backups themselves need to be tested
frequently. This document offers one approach to database backups for Oracle
10g databases on Windows 32 and 64 bit servers. Recovery Manager (RMAN) is
employed as the primary backup application. RMAN is fast, flexible, and can
compress the sometimes-large backup files. A Recovery Catalog is also used.
Data Pump exports are used as a secondary backup application. All steps are
automated, and scripts are provided with explanations in the document.

We have Oracle database 10g
installed on Windows 2003, 32 and 64 bit servers. We keep our databases up
almost 24×7, but do perform server re-boots in the early Sunday morning hours. We
also occasionally take the servers down at scheduled times for maintenance on
some of those Sunday mornings for things such as Windows patching and Oracle
patching. We use Recovery Manager (RMAN) to perform online backups of our
databases. RMAN is the Oracle suggested method for performing backup and
recovery on databases (Chien, 2005, p. 3). RMAN is a great tool, but since we
use Oracle standard edition not all of the benefits are available (parallelism,
block media recovery, mean time to recover (MTTR), among others). Still, we are
able to work RMAN into our backup strategy in a major way. With RMAN backups,
we have been able to “refresh” our test databases upon request from developers
or management. We also perform data pump exports on Oracle database 10g
databases as additional safety measures. We use a combination of Windows batch
scripts, SQL scripts, RMAN scripts, and scheduled tasks to automate these
operations. Our main goal is to keep things as uniform as possible across the
servers in an attempt to keep things simple. For the most part, we have been
successful.

Our backup strategy is simple; take
a weekly RMAN backup and archivelogs the rest of the week. Take a secondary
backup of an export or data pump. Of course, there is much more to it than
that. I will give the list of steps and then explain each one. In the
explanations, there will be setup information, scripts, suggestions, etc. I
recommend reading through this document entirely before
beginning/updating/altering any current backup plans you currently have in
place.

Our 9 steps for a great 10g nightly backup strategy are;

1.  Delete
old log files and rename current logs.

2.  Delete
all RMAN backup files.

3.  Perform
a level 0 (zero) RMAN backup.

4.  Create
clone files.

5.  Create
archivelog backup, which includes Recovery Catalog housekeeping.

6.  Delete
data pump export files.

7.  Perform
data pump export.

8.  Check
logs for errors.

9.  Page
and/or email short error description.

1. Delete old log files and rename current logs

This is performed every day. It is
good practice to create a log file for all scripts. In step 8, I check all the
logs for any errors so all current logs are renamed. When they are renamed, it
is easy to tell which errors refer to old jobs. These all need to be deleted eventually
so as to not create a space issue.

Code Listing 1:


qgrep -l rman D:oracleadmincommonbackuplogs* >> %LOGFILE%
del /Q D:oracleadmincommonbackuplogs*.oldlog3 >> %LOGFILE%
ren D:oracleadmincommonbackuplogs*.oldlog2 *.oldlog3 >> %LOGFILE%
ren D:oracleadmincommonbackuplogs*.oldlog1 *.oldlog2 >> %LOGFILE%
ren D:oracleadmincommonbackuplogs*.log *.oldlog1 >> %LOGFILE%

Tip: Try to automate log file
cleanup. It’s hard to remember everything!

2. Delete all RMAN backup files

This is only performed before a level
0 RMAN backup. Our RMAN backups are performed to the X: drive, X:RMAN to be
exact. We size this drive to hold our level 0 backups, the archivelogs, the
archivelog backups, and the data pump exports. Clearing out the RMAN files on a
weekly basis assures that there will be enough space on the drive for the next
week for backups.

Code Listing 2:


# Running these commands will show the files that will be deleted in the next step.
FORFILES /p x:rman /m *.bks /d -0 /c “CMD /C echo @PATH@FILE @FDATE.@FTIME” >> %logfile%
FORFILES /p x:rman /m *cfile* /d -0 /c “CMD /C echo @PATH@FILE @FDATE.@FTIME” >> %logfile%

# These commands perform the actual deletion.
FORFILES /p x:rman /d -0 /c “CMD /C del *.bks” >> %logfile%
FORFILES /p x:rman /d-0 /c “CMD /C del *cfile*” >> %logfile%

Tip: Definitely automate deletion
of large files on disk. You will surely run out of space at a bad time.

The X factor

The X: drive is a critical piece of
this backup ballet. We regularly clone our production databases to test and
development databases on other servers, or alternate nodes. RMAN likes to get
its files from where it backed them up. The easiest way to do this is back up
to tape. This way, it makes no difference what server you are on when you
perform the clone. When you tell RMAN your target database, it goes straight to
the media management layer (MML) for the files it needs. Our problem with tape
was unreliability with tapes or tape drives. Our solution was to back up to
disk. The problem was going to be copying files back and forth from server to
server to the same drive mapping. What we needed to do was map a SAN drive to
our production server and then have our test server map to the same SAN drive
at boot time. There was no way to have our SAN do this, but we could do it with
a command at the command line from the server. We were able to solve this issue
with a service from the Windows Resource Kit called AutoExNT. It basically
allows us to run a batch file at boot time (Fenner, 2007). We are able to put
the command in there and now production and test are mapped to the same X:
drive.

The X-files factor

AutoExNT works awesomely for the
32-bit servers. The problem comes when you have a 64-bit server. Unfortunately,
there are no Windows Resource Kits available for 64-bit Windows, so no
AutoExNT. Luckily, we were able to persuade our development staff to create a
“Launcher” service for us that works on 64-bit Windows. It is the same thing as
AutoExNT. Whatever we put in the batch file is executed when the server boots.

3. Perform a level 0 (zero) RMAN backup

This step is usually performed once
per week. We have some larger, more active databases that create huge amounts
of archivelogs. In a recovery or cloning scenario, the time to restore the
database will take too long. In these instances, we will take more than one
level 0 backup during the week. In the level 0 RMAN backup, we perform no
Recovery Catalog maintenance. The script is called DBname_lvl_0_disk.rcv.

Code Listing 3:


#************* Configure controls for Recovery Manager *******************
#***************** in case they have been changed ************************

configure controlfile autobackup on;
configure controlfile autobackup format for device type disk to ‘X:rmanCFILE_%F.bks’;
configure channel 1 device type disk format ‘X:rmanBACKUPPIECE_%d_%U_%T_%s.bks’ maxpiecesize 2048M;
configure retention policy to recovery window of 34 days;
#************* Perform weekly RMAN level 0 Database Backup ***************
backup as compressed backupset incremental level = 0 database;
#************* Confirm level 0 Backup is valid ***************************
restore database check logical validate;

The first command configures RMAN so
the control file and spfile will be backed up automatically. The second command
instructs RMAN to name the file in a particular format. In this case, when
backing up to disk call the file ‘X:rmanCFILE_F%.bks’. Line three says create
a disk channel and call it “1”. Name the backup pieces
‘X:rmanBACKUPPIECE_%d_%U_%T_%s.bks’ and any backup piece created should be
no larger than 2 GB. The fourth line tells the Recovery Catalog that the
backups for the target database should be considered good for 34 days. The
fifth line actually performs the backup. It tells RMAN to create a compressed
backup set, which really means one or more compressed backup pieces. It also
says take a full backup of only the database. No archivelogs will be backed
up. Since storage is a major issue, compressing backups has really helped out
(Freeman, 2004, p. 90). We have found this to be one of the most useful RMAN
features. The last line performs a check on the backup that just finished. It
reads the backup files and validates them to make sure they are not corrupt or
damaged in some way. I highly advise NOT skipping this step. Nothing could be
worse than needing to restore a database and finding out too late that one or
more of the files are no good!

Tip: You can use the files while
they are being validated. Yes, it’s true! I have begun a clone or restore many
times after the backup was completed but before the validation was done.

I call the RMAN level 0 backup script
using a Windows batch file. The batch file is called DBname_lvl_0_disk.bat.

Code Listing 4:


set NLS_DATE_FORMAT=DD-MON-YYYY HH24:MI:SS
set NLS_LANG=AMERICAN_AMERICA.WE8MSWIN1252
set CURDIR=D:oracleadmincommonbackup

cd %CURDIR%

rman target ID/pword@DBname catalog rcatID/rcatpword@rcatname log=logsDBname_lvl_0_disk.log @DBname_lvl_0_disk.rcv

page_on_backup.vbs DBname_level_0_disk.log page_DBname_level_0.log DBname

The first two lines set operating
system environment variables. We prefer the more detailed date mask of "05-DEC-2007
23:59:59" rather than "05-DEC-2007". The date format becomes
more important during times of recovery. Setting the NLS_LANG variable removes
any doubt about which character set the database is using (Bersinic &
Watson, 2005, ch. 23:p. 8). The third and fourth lines are important for using
scheduled tasks. Windows needs to be directed to where the RMAN script is, so set
the directory and then move there. Next, RMAN is actually called. The target
and catalog are both logged into. A log file is created in a separate directory
inside the current directory called “logs” and the script in code listing 4 is
called. If there are any errors, a Visual Basic script is called that pages
support personnel. If there are no errors then an email of the log file is
sent. There will be more details on paging in section 9.

4. Create clone files

This is a pivotal step to automating
the “refresh” for test databases using RMAN backups. The clone files batch jobs
create the actual “duplicate database” statements we use to clone our
production databases to our test areas. This is a SQL script called
create_TESTDB_clone_files.sql that runs on the production server.

Code Listing 5:


set echo off;
set feedback off;
set linesize 140;
SET PAGESIZE 0;
set trimspool on;
ALTER SESSION SET NLS_DATE_FORMAT=’YYYY-MON-DD HH24:MI:SS’;
select checkpoint_change# from v$database;
alter system archive log current;
select sysdate from dual;
— **************************************************************************
— **************************************************************************
— **************************************************************************
— ******************************* TESTDBSERVER *****************************
— ******************************* TESTDBSERVER *****************************
— ******************************* TESTDBSERVER *****************************
— **************************************************************************
— **************************************************************************
— **************************************************************************
–TESTDB1

spool \TESTDBSERVERd$oracleadmincommoncloneclone_to_TETSTDB1.rcv
select ‘duplicate target database to TESTDB1 until time ‘ ||””|| sysdate ||”’;’ from dual;
spool off;
–TESTDB2
spool \TESTDBSERVERd$oracleadmincommoncloneclone_to_TETSTDB2.rcv
select ‘duplicate target database to TESTDB2 until time ‘ ||””|| sysdate ||”’;’ from dual;
spool off;
–TESTDB3
spool \TESTDBSERVERd$oracleadmincommoncloneclone_to_TETSTDB3.rcv
select ‘duplicate target database to TESTDB23until time ‘ ||””|| sysdate ||”’;’ from dual;
spool off;
— **************************************************************************
— **************************************************************************
— **************************************************************************
— ********************************* THE END ********************************
— ********************************* THE END ********************************
— ********************************* THE END ********************************
— **************************************************************************
— **************************************************************************
— **************************************************************************

alter system archive log current;
select sysdate from dual;
select checkpoint_change# from v$database;
select sysdate from dual;
exit;

This script runs on the production
server and spools the output to the test server. The first thing you notice is
the NLS_DATE_FORMAT setting. This is being set the same way it was set in the
level 0 backup. Next, you see that we have selected the system change number or
SCN. Databases can also be cloned and/or recovered by using the SCN (Greenwald,
Stackowiak & Stern, 2004, p. 151). We used to duplicate using the SCN but
no longer do. We didn’t remove this step because we like to see the SCN in the
log file. In case of a production recovery scenario, the SCN is available in
one additional location. In the next statement, we archive the current redo
log. We have been performing RMAN duplications since Oracle 8i and always had
issues with the logs. This was the only sure fire way we could make it work
every time. Next, we select the sysdate, we like to see it under the SCN.
Dropping down to the first spool statement, you see that an RMAN script is
being written to TESTDBSERVER called clone_to_TESTDB1.rcv. There will be only
one line in the script and when written, will look like this:

Code Listing 6:

duplicate target database to TESTDB1 until time '2007-DEC-05 19:55:00';

You can write a separate clone
script for each test database on your test database server. As you can see,
this is what we have done. We have some production database servers with more than
one production database. For those we just have two of these scripts we run,
one against each production database creating a cloning script for each test
database.

Code Listing 7:


set NLS_DATE_FORMAT=DD-MON-YYYY HH24:MI:SS
set NLS_LANG=AMERICAN_AMERICA.WE8MSWIN1252
set CURDIR=D:oracleadmincommonbatch

cd %CURDIR%

sqlplus system/pword@DBNAME @create_TESTDB_clone_files.sql > logscreate_TESTDB_clone_files.log

The batch file for this script is
simple. The first two lines set your environment. The third and fourth lines
are important for using scheduled tasks. Windows needs to be directed to where
the SQL script is, so set the directory and then move there. Next, SQLPlus is
actually called and runs the script to create the clone files. A log file is
created in a separate directory inside the current directory called “logs”.

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles