RMAN Tablespace Point in Time Recovery

A user deletes data from a
table and commits it. How do you retrieve that data? If using a version of
Oracle with flashback technology – AND you are made aware of the error while
the undo information is still retained – that’s not so much of a problem. If
running in noarchivelog mode, and given that you have a cold backup or export
lying around, the recovery process is fairly cut and dry: restore the entire
database or import the table from the export.

One check to make in the
import approach is for a referential integrity “action,” that is, is there a
“on delete cascade” minefield waiting for you to step in? And don’t forget
about triggers. In other words, more than one table may need to be recovered. Even
a standby database can leave you in the hurt locker. If the transported redo
has been applied, you now have the problem (i.e., the missing data) in two
places.

One of the best situations
to be in is running in archivelog mode and using RMAN as your backup mechanism
or process. RMAN tablespace point in time recovery (TSPITR) can be used to
restore the data.

Background
Nailing Down the Setup Steps
4 Initialization Parameters
3 Tablespaces
2 Net8/Net Services configuration files
1 parameter file (and maybe one password file)
Almost ready…
Create the test environment
In Closing

Background

This is one of those
critical skills where you will be glad you have put your hands on the keyboard
and practiced this several times ahead of the time when you need to do this for
real.

When it comes to backup and
recovery, with the emphasis on recovery, Oracle documentation (to include notes
on MetaLink) is full of sage advice. One such warning (going back as far as 8i)
states the following:

Do not perform RMAN
TSPITR for the first time on a production system or when you have a time
constraint.

Another classic one is about
not putting yourself into a situation worse than you already are in. Unintended
“change vectors” to data are one thing; your mission is to prevent that from
turning into a job change vector because you trashed the production database
and cannot recover it.

If you use a recovery
catalog, you have unlimited attempts to get things right. If not using one, you
have one shot at getting the recovery point correct. Once recovered (but you
didn’t go far enough back), the backup you were using cannot be used again for
that tablespace.

The root process of an RMAN
TSPITR is based on creating a clone of the production database. This is where
some of the existing documentation gets murky. You’ll see references to a term
called auxiliary set, which includes a backup control file, the system
tablespace, datafiles containing rollback (or undo) segments, and optionally, a
temporary tablespace. The lifespan of the clone is what separates how TSPITR
can be done. And what about redo logs? How do they factor in?

In the “official” RMAN
TSPITR process (fully automated), the clone exists but for a short time. Once
it has served its purpose (as a temporary repository/instance used to
restore/recover a tablespace to a point in time), it dies in place. In fact,
Oracle kills it for you. In a variation of the “official” process, you create a
clone database (using RMAN) whose end state is as of whatever point in time
(obviously in the past) you desire. The clone lives on in this case. The official
TSPITR process recovers the affected tablespace in its entirety. The other
process creates the tablespace in a clone database, and from there, you can
single out the affected table. From that point, export/import, CTAS, or insert
into via selecting across a database link are three ways to get the table’s
data restored.

Nailing Down the Setup Steps

Let’s create a 4-3-2-1 model
for the setup steps. The steps pertain to older versions of Oracle, but will
work in at least up to10gR2. Much of the setup is taken care of for you in 10g;
just tell Oracle where the auxiliary instance work area is located. The steps
include editing or identifying:

4 initialization parameters

3 tablespaces (possibly
more)

2 Net8/Net Services configuration
files

1 parameter file (and maybe
one password file)

4 Initialization Parameters

The four init.ora parameters
are lock_name_space, db_file_name_convert, log_file_name_convert, and
control_files. The three tablespaces are the one you need to recover, System,
and Rollback or Undo. The two Net Services files requiring editing are tnsnames.ora
and listener.ora. The one copy of a parameter file is a copy of the production
database’s initialization parameter file.

You use the same db_name
value as what is in production. The lock_name_space parameter value is a name
you can give the auxiliary instance (“clone” works well enough) and is what
distinguishes the clone from production.

You do not need to copy a
control file into the clone working directory. One will be created for you, but
you must specify a different name than what production uses.

Here is an interesting
question, somewhat hard to sort out in the documentation. Do you, or do you not,
need to include the datafiles for SYSTEM and RBS/UNDO? In the official method,
you DO NOT need to specify these tablespaces and their associated datafiles if
they’re going to reside in a translated path. What is a translated path?

The name of the production
database used in this example is db10, and the name of the auxiliary
instance/database is clone. Suppose your SYSTEM datafile lives in this path and
is named:

D:\oracle\product\10.2.0\oradata\db10\system01.dbf

Let’s translate db10 into
clone, so that the to-be-restored datafile is:

D:\oracle\product\10.2.0\oradata\clone\system01.dbf

How is this translation
accomplished? One way is to use the DB_FILE_NAME_CONVERT parameter, and it
would be specified by:

db_file_name_convert=(‘db10’,’clone’)

Wherever Oracle finds “db10”
in a path to a datafile, it will translate db10 into clone. This is where using
optimal flexible architecture pays off. Suppose your SYSTEM datafiles are
spread out like so:

/u001/oradata/db10/system_01.dbf
/u002/oradata/db10/system_02.dbf
/u003/oradata/db10/system_03.dbf

Create corresponding
directories for clone and when all goes well, the clone will have its system
datafiles automatically created for you as:

/u001/oradata/clone/system_01.dbf
/u002/oradata/clone/system_02.dbf
/u003/oradata/clone/system_03.dbf

The same thing will take
place for rollback/undo and for the errant tablespace’s files. A similar
parameter takes care of the redo log files:

log_file_name_convert=(‘db10’,’clone’)

Log files can also be
specified/created in the RMAN run block. However, log files have to be
specified somewhere, because even if using Oracle managed files in the
auxiliary instance, they will not be created for you.

Other parameters such as
dump locations and memory settings can be changed as well. The file renaming
can be done other ways, either using a paired old versus new path in the
convert parameters, or by explicitly setting a new name in the RMAN run block.

To summarize the parameters going
from db10 to clone:

db_name=’db10′
lock_name_space=’clone’
db_file_name_convert=("db10","clone")

log_file_name_convert=("db10","clone")

control_files=’D:\oracle\product\10.2.0\oradata\clone\clone_control01.ctl’

3 Tablespaces

The minimum set is the one
you need to recover, plus system and rollback/undo. As mentioned, system and
rollback/undo can be handled for you. If you want to restore the datafiles
elsewhere other than under a translated path, use the set newname clause in the
RMAN run block. You can explicitly identify the target tablespace’s datafiles
when renaming them, or use their respective file IDs. The order of precedence
(i.e., the file name conversion versus using newname) is listed in the
documentation as:

1.
SET NEWNAME

2.
CONFIGURE AUXNAME

3.
DB_FILE_NAME_CONVERT

4.
AUXILIARY DESTINATION argument to
RECOVER TABLESPACE

In line with this
identification, it is handy to have a listing of the file ID numbers and the
filenames. Error messages such as the one below, are not hard to decipher as
far as the affected tablespace is concerned, but what if the file is an Oracle
managed file named O1_MF_USERS_3PCS61ON_.DBF?


SQL> select * from emp;
select * from emp
              *
ERROR at line 1:
ORA-00376: file 4 cannot be read at this time
ORA-01110: data file 4: ‘D:\ORACLE\PRODUCT\10.2.0\ORADATA\DB10\USERS01.DBF’

We got lucky with this one
because the tablespace name is clear, but that may not be the case. Oracle will
use the first eight characters of the tablespace name. Given there are two
tablespaces, one named EIGHTLONG1 and the other EIGHTLONG2, which file (shown
below) belongs to which tablespace?

Are other tablespaces
needed? Run a query against SYS.TS_PITR_CHECK using the specific columns or
select all columns like so:


select *
from sys.ts_pitr_check
where (ts1_name = ‘USERS’ and ts2_name != ‘USERS’)
or (ts1_name != ‘USERS’ and ts2_name = ‘USERS’);

Resolve dependencies as
needed. Indexes, for example, can be dropped and rebuilt, so you don’t necessarily need to take
along an index tablespace. Be sure to get the DDL for re-creating them, and
also for any constraints that may be dropped.

2 Net8/Net Services configuration files

This step is quite easy –
add entries into the tnsnames.ora and listener.ora files for the auxiliary
instance. If sqlnet.ora has names.default_domain in it, then the entry in
tnsnames.ora must account for that parameter.

1 parameter file (and maybe one password file)

Already covered from the “4”
part, but you need an editable version of the parameter file. The production
database only needs to be mounted in order to create a pfile from spfile. Of
course, verify beforehand how the production database was started to begin
with. What’s in the spfile may not be what’s in the pfile, and vice versa.
Overwriting the pfile via the create pfile from spfile command may cause
recent/needed parameters to be removed.

There is no need to copy a
control file from production; RMAN is going to create one for you. Create a new
password file for the clone, or copy/rename one from production. RMAN always
expects you’re connecting as SYS, but when starting the clone (as in SQL*Plus),
connecting as sys as sysdba needs to be authenticated.

Almost ready…

Since RMAN is being used,
two or three connections need to be made. One is to target (i.e., production)
and the other is to the auxiliary instance. The auxiliary or clone is started
using NOMOUNT, and when connecting to auxiliary, that state will be reflected
in RMAN.


RMAN> connect auxiliary sys/oracle@clone
connected to auxiliary database: DB10 (not mounted)

The third connection is
based on using a recovery catalog. If using one, that connection needs to be
made. If not using one, then connecting to target and auxiliary is all that is
needed.

Create the test environment

I used a bare bones database
– all of the demo schemas have been removed except for SCOTT. If you leave OE
in place, there will be a dependency listed by the output from sys.ts_pitr_check.
Run the database in archivelog mode, and take a backup using RMAN. Switch some
log files, and then delete from emp (and commit). Note the time when the commit
took place. That will be the time (something before then) to recover the
tablespace to. As an alternative, identify a log sequence number (or SCN) prior
to where/when the data was deleted.

If you get stuck on the
NLS_DATE_FORMAT and language settings (RMAN will output errors about the format
picture ending, encountered a “}” or expecting a format other than what is set),
use sysdate. You can truncate sysdate and add that part of a day to get to the
just before delete time (which is more re-runnable) or figure sysdate minus
some number of minutes (but has to be adjusted each time to make sure the time
goes back far enough).

Startup nomount the
auxiliary instance. If that is successful, you’re ready to enter RMAN. The first
part of an RMAN session is shown below. No requirement to list backups, and
the absence of any information is a good sign something is wrong with your RMAN
setup, that is, where did the backups disappear to (or were there any in the
first place?).


Recovery Manager: Release 10.2.0.1.0 – Production on Fri Dec 14 13:47:12 2007
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
RMAN> connect target sys/oracle
connected to target database: DB10 (DBID=1211451278)
RMAN> list backup;
using target database control file instead of recovery catalog

The output of list backup is
not shown. The connection to auxiliary and the run block used to issue the
recovery are shown next.


RMAN> connect auxiliary sys/oracle@clone
connected to auxiliary database: DB10 (not mounted)
RMAN> run {
2>   allocate auxiliary channel dev1 type disk;
3>   recover tablespace users until time “to_date(’14-dec-2007 13:43:00′,’dd-mon-rrrr hh24:mi:ss’)”;
4> }
allocated channel: dev1
channel dev1: sid=155 devtype=DISK
Starting recover at 14-DEC-07
RMAN-05026: WARNING: presuming following set of tablespaces applies to specified point in time
List of tablespaces expected to have UNDO segments
tablespace SYSTEM
tablespace UNDOTBS1

Other output will be shown,
and towards the end, you’ll see an export being taken. The output shown below
is the very end of the RMAN recovery process.


Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 – Production
With the Partitioning, OLAP and Data Mining options
Export file created by EXPORT:V10.02.01 via conventional path
About to import Tablespace Point-in-time Recovery objects…
import done in WE8MSWIN1252 character set and AL16UTF16 NCHAR character set
. importing SYS’s objects into SYS
. importing SCOTT’s objects into SCOTT
. . importing table                         “DEPT”
. . importing table                          “EMP”
. . importing table                        “BONUS”
. . importing table                     “SALGRADE”
. importing SYS’s objects into SYS
Import terminated successfully without warnings.
host command complete
sql statement: alter tablespace  USERS online
sql statement: alter tablespace  USERS offline
sql statement: begin dbms_backup_restore.AutoBackupFlag(TRUE); end;
auxiliary instance file D:\ORACLE\PRODUCT\10.2.0\ORADATA\CLONE\CLONE_CONTROL01.CTL deleted
auxiliary instance file D:\ORACLE\PRODUCT\10.2.0\ORADATA\CLONE\SYSTEM01.DBF deleted
auxiliary instance file D:\ORACLE\PRODUCT\10.2.0\ORADATA\CLONE\UNDOTBS01.DBF deleted
auxiliary instance file D:\ORACLE\PRODUCT\10.2.0\ORADATA\CLONE\TEMP01.DBF deleted
auxiliary instance file D:\ORACLE\PRODUCT\10.2.0\ORADATA\CLONE\REDO01.LOG deleted
auxiliary instance file D:\ORACLE\PRODUCT\10.2.0\ORADATA\CLONE\REDO02.LOG deleted
auxiliary instance file D:\ORACLE\PRODUCT\10.2.0\ORADATA\CLONE\REDO03.LOG deleted
Finished recover at 14-DEC-07
RMAN>

Note how Oracle has removed
the files created for the clone/auxiliary instance. Is the data restored yet?
What is the status of the recovered tablespace? The answers are no and offline.
You have to online the production tablespace before the data will be available.
Prior to onlining the tablespace, take a backup of that tablespace. The data
should now be restored as it was prior to the erroneous delete statement.

In Closing

If the setup seems like a
lot of work, it’s because it may be unfamiliar to you. Following the 4-3-2-1
countdown model helps to frame what tasks are needed. Once everything is
correctly configured and the run command is issued, you’re pretty much home
free. After the data has been restored, rebuild any dependencies and indexes.
As mentioned, you can take an index (or any other, for that matter) tablespace
along for the ride. Export/import may not be the best course of action when
compared to creating an index with parallel execution operations enabled.

RMAN has changed a lot from
version 8 to what it is now. The syntax shown in the example looks different
than what is shown in the 10gR2 documentation. The commands below are different
only in where the auxiliary destination is specified.

Oracle8i	Oracle10gR2
RMAN> run { allocate auxiliary channel dev1 type disk; recover tablespace users until LOGSEQ 1300 THREAD 1; }	RMAN> RECOVER TABLESPACE users UNTIL LOGSEQ 1300 THREAD 1 AUXILIARY DESTINATION '/disk1/auxdest';

Oracle8i

Oracle10gR2

RMAN> run {

allocate auxiliary channel dev1 type disk;

recover tablespace users until LOGSEQ 1300 THREAD 1;

}

RMAN> RECOVER TABLESPACE users
UNTIL LOGSEQ 1300 THREAD 1
AUXILIARY DESTINATION '/disk1/auxdest';

The more streamlined
approach in 10g starts as shown below (using another ad hoc tablespace named
USERS2, same delete/commit of data in one table, and still requires a backup
and onlining of the tablespace).


Recovery Manager: Release 10.2.0.1.0 – Production on Mon Dec 17 02:06:21 2007
Copyright (c) 1982, 2005, Oracle.  All rights reserved.
RMAN> connect target sys/oracle
connected to target database: DB10 (DBID=1173937598)
RMAN> RECOVER TABLESPACE users2
2> UNTIL time “to_date(’17-dec-2007 02:04:00′,’dd-mon-rrrr hh24:mi:ss’)”
3> AUXILIARY DESTINATION ‘C:\oracle\product\10.2.0\flash_recovery_area\CLONE’;
Starting recover at 17-DEC-07
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=138 devtype=DISK
RMAN-05026: WARNING: presuming following set of tablespaces applies to specified point in time
List of tablespaces expected to have UNDO segments
tablespace SYSTEM
tablespace UNDOTBS1
Creating automatic instance, with SID=’Aqhp’
initialization parameters used for automatic instance:
db_name=DB10
compatible=10.2.0.1.0
db_block_size=8192
db_files=200
db_unique_name=tspitr_DB10_Aqhp
large_pool_size=1M
shared_pool_size=110M
#No auxiliary parameter file used
db_create_file_dest=C:\oracle\product\10.2.0\flash_recovery_area\CLONE
control_files=C:\oracle\product\10.2.0\flash_recovery_area\CLONE/cntrl_tspitr_DB10_Aqhp.f

Syntax differences should
also be taken into account when documenting a recovery plan. Find at least one
way that works in your environment and has been tested. When it comes time to
do this for real, you’ll be glad you practiced it.

»

See All Articles by ColumnistSteve Callan