RMAN Tablespace Point in Time Recovery

December 31, 2007

A user deletes data from a table and commits it. How do you retrieve that data? If using a version of Oracle with flashback technology – AND you are made aware of the error while the undo information is still retained – that’s not so much of a problem. If running in noarchivelog mode, and given that you have a cold backup or export lying around, the recovery process is fairly cut and dry: restore the entire database or import the table from the export.

One check to make in the import approach is for a referential integrity “action,” that is, is there a “on delete cascade” minefield waiting for you to step in? And don’t forget about triggers. In other words, more than one table may need to be recovered. Even a standby database can leave you in the hurt locker. If the transported redo has been applied, you now have the problem (i.e., the missing data) in two places.

One of the best situations to be in is running in archivelog mode and using RMAN as your backup mechanism or process. RMAN tablespace point in time recovery (TSPITR) can be used to restore the data.

Background

This is one of those critical skills where you will be glad you have put your hands on the keyboard and practiced this several times ahead of the time when you need to do this for real.

When it comes to backup and recovery, with the emphasis on recovery, Oracle documentation (to include notes on MetaLink) is full of sage advice. One such warning (going back as far as 8i) states the following:

Do not perform RMAN TSPITR for the first time on a production system or when you have a time constraint.

Another classic one is about not putting yourself into a situation worse than you already are in. Unintended “change vectors” to data are one thing; your mission is to prevent that from turning into a job change vector because you trashed the production database and cannot recover it.

If you use a recovery catalog, you have unlimited attempts to get things right. If not using one, you have one shot at getting the recovery point correct. Once recovered (but you didn’t go far enough back), the backup you were using cannot be used again for that tablespace.

The root process of an RMAN TSPITR is based on creating a clone of the production database. This is where some of the existing documentation gets murky. You’ll see references to a term called auxiliary set, which includes a backup control file, the system tablespace, datafiles containing rollback (or undo) segments, and optionally, a temporary tablespace. The lifespan of the clone is what separates how TSPITR can be done. And what about redo logs? How do they factor in?

In the “official” RMAN TSPITR process (fully automated), the clone exists but for a short time. Once it has served its purpose (as a temporary repository/instance used to restore/recover a tablespace to a point in time), it dies in place. In fact, Oracle kills it for you. In a variation of the “official” process, you create a clone database (using RMAN) whose end state is as of whatever point in time (obviously in the past) you desire. The clone lives on in this case. The official TSPITR process recovers the affected tablespace in its entirety. The other process creates the tablespace in a clone database, and from there, you can single out the affected table. From that point, export/import, CTAS, or insert into via selecting across a database link are three ways to get the table’s data restored.

Nailing Down the Setup Steps

Let’s create a 4-3-2-1 model for the setup steps. The steps pertain to older versions of Oracle, but will work in at least up to10gR2. Much of the setup is taken care of for you in 10g; just tell Oracle where the auxiliary instance work area is located. The steps include editing or identifying:

4 initialization parameters

3 tablespaces (possibly more)

2 Net8/Net Services configuration files

1 parameter file (and maybe one password file)

4 Initialization Parameters

The four init.ora parameters are lock_name_space, db_file_name_convert, log_file_name_convert, and control_files. The three tablespaces are the one you need to recover, System, and Rollback or Undo. The two Net Services files requiring editing are tnsnames.ora and listener.ora. The one copy of a parameter file is a copy of the production database’s initialization parameter file.

You use the same db_name value as what is in production. The lock_name_space parameter value is a name you can give the auxiliary instance (“clone” works well enough) and is what distinguishes the clone from production.

You do not need to copy a control file into the clone working directory. One will be created for you, but you must specify a different name than what production uses.

Here is an interesting question, somewhat hard to sort out in the documentation. Do you, or do you not, need to include the datafiles for SYSTEM and RBS/UNDO? In the official method, you DO NOT need to specify these tablespaces and their associated datafiles if they’re going to reside in a translated path. What is a translated path?

The name of the production database used in this example is db10, and the name of the auxiliary instance/database is clone. Suppose your SYSTEM datafile lives in this path and is named:

D:\oracle\product\10.2.0\oradata\db10\system01.dbf

Let’s translate db10 into clone, so that the to-be-restored datafile is:

D:\oracle\product\10.2.0\oradata\clone\system01.dbf

How is this translation accomplished? One way is to use the DB_FILE_NAME_CONVERT parameter, and it would be specified by:

db_file_name_convert=(‘db10’,’clone’)

Wherever Oracle finds “db10” in a path to a datafile, it will translate db10 into clone. This is where using optimal flexible architecture pays off. Suppose your SYSTEM datafiles are spread out like so:

/u001/oradata/db10/system_01.dbf
/u002/oradata/db10/system_02.dbf
/u003/oradata/db10/system_03.dbf

Create corresponding directories for clone and when all goes well, the clone will have its system datafiles automatically created for you as:

/u001/oradata/clone/system_01.dbf
/u002/oradata/clone/system_02.dbf
/u003/oradata/clone/system_03.dbf

The same thing will take place for rollback/undo and for the errant tablespace’s files. A similar parameter takes care of the redo log files:

log_file_name_convert=(‘db10’,’clone’)

Log files can also be specified/created in the RMAN run block. However, log files have to be specified somewhere, because even if using Oracle managed files in the auxiliary instance, they will not be created for you.

Other parameters such as dump locations and memory settings can be changed as well. The file renaming can be done other ways, either using a paired old versus new path in the convert parameters, or by explicitly setting a new name in the RMAN run block.

To summarize the parameters going from db10 to clone:

db_name='db10'
lock_name_space='clone'
db_file_name_convert=("db10","clone")
log_file_name_convert=("db10","clone")
control_files='D:\oracle\product\10.2.0\oradata\clone\clone_control01.ctl'

3 Tablespaces

The minimum set is the one you need to recover, plus system and rollback/undo. As mentioned, system and rollback/undo can be handled for you. If you want to restore the datafiles elsewhere other than under a translated path, use the set newname clause in the RMAN run block. You can explicitly identify the target tablespace’s datafiles when renaming them, or use their respective file IDs. The order of precedence (i.e., the file name conversion versus using newname) is listed in the documentation as:

1.      SET NEWNAME

2.      CONFIGURE AUXNAME

3.      DB_FILE_NAME_CONVERT

4.      AUXILIARY DESTINATION argument to RECOVER TABLESPACE

In line with this identification, it is handy to have a listing of the file ID numbers and the filenames. Error messages such as the one below, are not hard to decipher as far as the affected tablespace is concerned, but what if the file is an Oracle managed file named O1_MF_USERS_3PCS61ON_.DBF?

SQL> select * from emp;
select * from emp
              *
ERROR at line 1:
ORA-00376: file 4 cannot be read at this time
ORA-01110: data file 4: 'D:\ORACLE\PRODUCT\10.2.0\ORADATA\DB10\USERS01.DBF'

We got lucky with this one because the tablespace name is clear, but that may not be the case. Oracle will use the first eight characters of the tablespace name. Given there are two tablespaces, one named EIGHTLONG1 and the other EIGHTLONG2, which file (shown below) belongs to which tablespace?

Are other tablespaces needed? Run a query against SYS.TS_PITR_CHECK using the specific columns or select all columns like so:

select *
from sys.ts_pitr_check
where (ts1_name = 'USERS' and ts2_name != 'USERS')
or (ts1_name != 'USERS' and ts2_name = 'USERS');

Resolve dependencies as needed. Indexes, for example, can be dropped and rebuilt, so you don't necessarily need to take along an index tablespace. Be sure to get the DDL for re-creating them, and also for any constraints that may be dropped.

2 Net8/Net Services configuration files

This step is quite easy – add entries into the tnsnames.ora and listener.ora files for the auxiliary instance. If sqlnet.ora has names.default_domain in it, then the entry in tnsnames.ora must account for that parameter.

1 parameter file (and maybe one password file)

Already covered from the “4” part, but you need an editable version of the parameter file. The production database only needs to be mounted in order to create a pfile from spfile. Of course, verify beforehand how the production database was started to begin with. What’s in the spfile may not be what’s in the pfile, and vice versa. Overwriting the pfile via the create pfile from spfile command may cause recent/needed parameters to be removed.

There is no need to copy a control file from production; RMAN is going to create one for you. Create a new password file for the clone, or copy/rename one from production. RMAN always expects you’re connecting as SYS, but when starting the clone (as in SQL*Plus), connecting as sys as sysdba needs to be authenticated.

Almost ready...

Since RMAN is being used, two or three connections need to be made. One is to target (i.e., production) and the other is to the auxiliary instance. The auxiliary or clone is started using NOMOUNT, and when connecting to auxiliary, that state will be reflected in RMAN.

RMAN> connect auxiliary sys/oracle@clone

connected to auxiliary database: DB10 (not mounted)

The third connection is based on using a recovery catalog. If using one, that connection needs to be made. If not using one, then connecting to target and auxiliary is all that is needed.

Create the test environment

I used a bare bones database – all of the demo schemas have been removed except for SCOTT. If you leave OE in place, there will be a dependency listed by the output from sys.ts_pitr_check. Run the database in archivelog mode, and take a backup using RMAN. Switch some log files, and then delete from emp (and commit). Note the time when the commit took place. That will be the time (something before then) to recover the tablespace to. As an alternative, identify a log sequence number (or SCN) prior to where/when the data was deleted.

If you get stuck on the NLS_DATE_FORMAT and language settings (RMAN will output errors about the format picture ending, encountered a “}” or expecting a format other than what is set), use sysdate. You can truncate sysdate and add that part of a day to get to the just before delete time (which is more re-runnable) or figure sysdate minus some number of minutes (but has to be adjusted each time to make sure the time goes back far enough).

Startup nomount the auxiliary instance. If that is successful, you’re ready to enter RMAN. The first part of an RMAN session is shown below. No requirement to list backups, and the absence of any information is a good sign something is wrong with your RMAN setup, that is, where did the backups disappear to (or were there any in the first place?).

Recovery Manager: Release 10.2.0.1.0 - Production on Fri Dec 14 13:47:12 2007

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

RMAN> connect target sys/oracle

connected to target database: DB10 (DBID=1211451278)

RMAN> list backup;

using target database control file instead of recovery catalog

The output of list backup is not shown. The connection to auxiliary and the run block used to issue the recovery are shown next.

RMAN> connect auxiliary sys/oracle@clone

connected to auxiliary database: DB10 (not mounted)

RMAN> run {
2>   allocate auxiliary channel dev1 type disk;
3>   recover tablespace users until time "to_date('14-dec-2007 13:43:00','dd-mon-rrrr hh24:mi:ss')";
4> }

allocated channel: dev1
channel dev1: sid=155 devtype=DISK

Starting recover at 14-DEC-07
RMAN-05026: WARNING: presuming following set of tablespaces applies to specified point in time

List of tablespaces expected to have UNDO segments
tablespace SYSTEM
tablespace UNDOTBS1

Other output will be shown, and towards the end, you’ll see an export being taken. The output shown below is the very end of the RMAN recovery process.

Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Production
With the Partitioning, OLAP and Data Mining options

Export file created by EXPORT:V10.02.01 via conventional path
About to import Tablespace Point-in-time Recovery objects...
import done in WE8MSWIN1252 character set and AL16UTF16 NCHAR character set
. importing SYS's objects into SYS
. importing SCOTT's objects into SCOTT
. . importing table                         "DEPT"
. . importing table                          "EMP"
. . importing table                        "BONUS"
. . importing table                     "SALGRADE"
. importing SYS's objects into SYS
Import terminated successfully without warnings.
host command complete

sql statement: alter tablespace  USERS online

sql statement: alter tablespace  USERS offline

sql statement: begin dbms_backup_restore.AutoBackupFlag(TRUE); end;
auxiliary instance file D:\ORACLE\PRODUCT\10.2.0\ORADATA\CLONE\CLONE_CONTROL01.CTL deleted
auxiliary instance file D:\ORACLE\PRODUCT\10.2.0\ORADATA\CLONE\SYSTEM01.DBF deleted
auxiliary instance file D:\ORACLE\PRODUCT\10.2.0\ORADATA\CLONE\UNDOTBS01.DBF deleted
auxiliary instance file D:\ORACLE\PRODUCT\10.2.0\ORADATA\CLONE\TEMP01.DBF deleted
auxiliary instance file D:\ORACLE\PRODUCT\10.2.0\ORADATA\CLONE\REDO01.LOG deleted
auxiliary instance file D:\ORACLE\PRODUCT\10.2.0\ORADATA\CLONE\REDO02.LOG deleted
auxiliary instance file D:\ORACLE\PRODUCT\10.2.0\ORADATA\CLONE\REDO03.LOG deleted
Finished recover at 14-DEC-07

RMAN>

Note how Oracle has removed the files created for the clone/auxiliary instance. Is the data restored yet? What is the status of the recovered tablespace? The answers are no and offline. You have to online the production tablespace before the data will be available. Prior to onlining the tablespace, take a backup of that tablespace. The data should now be restored as it was prior to the erroneous delete statement.

In Closing

If the setup seems like a lot of work, it’s because it may be unfamiliar to you. Following the 4-3-2-1 countdown model helps to frame what tasks are needed. Once everything is correctly configured and the run command is issued, you’re pretty much home free. After the data has been restored, rebuild any dependencies and indexes. As mentioned, you can take an index (or any other, for that matter) tablespace along for the ride. Export/import may not be the best course of action when compared to creating an index with parallel execution operations enabled.

RMAN has changed a lot from version 8 to what it is now. The syntax shown in the example looks different than what is shown in the 10gR2 documentation. The commands below are different only in where the auxiliary destination is specified.

Oracle8i

Oracle10gR2

RMAN> run {
allocate auxiliary channel dev1 type disk;
recover tablespace users until LOGSEQ 1300 THREAD 1;
}
  RMAN> RECOVER TABLESPACE users
    UNTIL LOGSEQ 1300 THREAD 1
	  AUXILIARY DESTINATION '/disk1/auxdest';

The more streamlined approach in 10g starts as shown below (using another ad hoc tablespace named USERS2, same delete/commit of data in one table, and still requires a backup and onlining of the tablespace).

Recovery Manager: Release 10.2.0.1.0 - Production on Mon Dec 17 02:06:21 2007

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

RMAN> connect target sys/oracle

connected to target database: DB10 (DBID=1173937598)

RMAN> RECOVER TABLESPACE users2
2> UNTIL time "to_date('17-dec-2007 02:04:00','dd-mon-rrrr hh24:mi:ss')"
3> AUXILIARY DESTINATION 'C:\oracle\product\10.2.0\flash_recovery_area\CLONE';

Starting recover at 17-DEC-07
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=138 devtype=DISK
RMAN-05026: WARNING: presuming following set of tablespaces applies to specified point in time

List of tablespaces expected to have UNDO segments
tablespace SYSTEM
tablespace UNDOTBS1

Creating automatic instance, with SID='Aqhp'

initialization parameters used for automatic instance:
db_name=DB10
compatible=10.2.0.1.0
db_block_size=8192
db_files=200
db_unique_name=tspitr_DB10_Aqhp
large_pool_size=1M
shared_pool_size=110M
#No auxiliary parameter file used
db_create_file_dest=C:\oracle\product\10.2.0\flash_recovery_area\CLONE
control_files=C:\oracle\product\10.2.0\flash_recovery_area\CLONE/cntrl_tspitr_DB10_Aqhp.f

Syntax differences should also be taken into account when documenting a recovery plan. Find at least one way that works in your environment and has been tested. When it comes time to do this for real, you’ll be glad you practiced it.

» See All Articles by Columnist Steve Callan








The Network for Technology Professionals

Search:

About Internet.com

Legal Notices, Licensing, Permissions, Privacy Policy.
Advertise | Newsletters | E-mail Offers