Recovering from Loss of All Control Files

In a prior article about backup and recovery exercises, one of the scenarios dealt with recovering from the loss of a control file. In that scenario, the database was running with more than one control file, so recovering from the (self-induced, but for instructional purposes) media failure was pretty simple. When installing Oracle and creating a seed database, few DBAs have missed the incongruence between Oracle’s advice to multiplex – specifically in the sense of using more than one disk – control files and the installer’s creation of the control files in the same directory.

It’s easy to understand why the files are created in the same directory. It’s a trade-off between installing the software in a relatively simple manner versus requiring users, most of which are probably quite new at this, to have more than one disk. In other words, how many of you bang on Oracle with your work or home PC, and of those computers, how many have more than one disk/drive? The installer gives DBAs a good running start on getting a database created, and having it up and running in short order. The installation is not perfect, but on the other hand, it’s not that bad either.

Given that virtually all OUI/standard template creation of databases will have the control files in the same location, it is not surprising to see questions on the various Oracle question and answer forums in the nature of, “I’ve lost all of my control files, and I don’t have a backup. Urgently waiting your reply.” Despite the best advice, it happens. In fact, it happens in production environments per what the forum posters claim.

For those of you who find yourself in this situation, the good news is that yes, you can recover the database. Even better, the recovery steps are not that hard to perform. The bad news is that you may lose some data. “Data,” in this sense, is data within the prior version of the control files. If you were using the control file as the RMAN repository, the current set of backup information will be lost. However, you can manually add what appears to be orphaned backup sets/pieces to the repository (and should this happen to you, it will also serve as a good example of why the repository should be stored in a recovery catalog as opposed to solely within the target’s control files). Let’s start the example by removing/deleting the control files.

Assuming you are doing this on Windows, the files will be locked. This exercise requires a shutdown of the database. Take a cold backup so you can recover (technically, restore) in case something in your environment goes awry. I’ve performed this several times on my at work “bang around” database. Really, this works.

Shown below are the current files of my DB10 database, plus a Backup folder with copies of all files. The database is shut down.

The control files will be renamed and then a startup will be attempted.

--Clean shutdown to release the locks on the files
 
SQL> conn sys/oracle as sysdba
Connected.
SQL> shutdown
Database closed.
Database dismounted.
ORACLE instance shut down.
 
--Startup issued with the control files missing
 
SQL> startup
ORACLE instance started.
Total System Global Area  289406976 bytes
Fixed Size                  1290184 bytes
Variable Size             130023480 bytes
Database Buffers          150994944 bytes
Redo Buffers                7098368 bytes
ORA-00205: error in identifying control file, check alert log for more info

Now we can treat this situation as having to create control files. What is the current state of the instance/database? If the control files are missing, then the database cannot be mounted, so at most, the database state is nomount. Any attempt to mount will fail because of the missing files.

SQL> select status from v$instance;
STATUS
------------
STARTED
SQL> alter database mount;
alter database mount
*
ERROR at line 1:
ORA-00205: error in identifying control file, check alert log for more info

No problem, we’ll use the command to create a control file from trace.

SQL> ALTER DATABASE BACKUP CONTROLFILE TO TRACE;
ALTER DATABASE BACKUP CONTROLFILE TO TRACE
*
ERROR at line 1:
ORA-01507: database not mounted

As you can see, not being mounted is a show stopper for this approach. What is required is to manually create a statement which when run, creates a control file. We need the names of all of the datafiles in the database for this step, so it is helpful to have those listed ahead of time (which I did in this case, see the first diagram). Using what Windows is reporting as the file sizes (in KB), we can construct a CREATE CONTROLFILE statement as shown:

CREATE CONTROLFILE REUSE DATABASE "DB10" NORESETLOGS NOARCHIVELOG 
 MAXLOGFILES 50 
 MAXLOGMEMBERS 3 
 MAXDATAFILES 300 
 MAXINSTANCES 8 
 MAXLOGHISTORY 500 
 LOGFILE 
  GROUP 7 'D:\oracle\product\10.2.0\oradata\db10\redo07.log' SIZE 5121K, 
  GROUP 8 'D:\oracle\product\10.2.0\oradata\db10\redo08.log' SIZE 5121K, 
  GROUP 9 'D:\oracle\product\10.2.0\oradata\db10\redo09.log' SIZE 5121K 
 DATAFILE 
  'D:\oracle\product\10.2.0\oradata\db10\users01.dbf' SIZE 79368K, 
  'D:\oracle\product\10.2.0\oradata\db10\undotbs01.dbf' SIZE 128008K, 
  'D:\oracle\product\10.2.0\oradata\db10\system01.dbf' SIZE 614408K, 
  'D:\oracle\product\10.2.0\oradata\db10\sysaux01.dbf' SIZE 593928K, 
  'D:\oracle\product\10.2.0\oradata\db10\example01.dbf' SIZE 174088K;

Here is where the old match-the-filename-to-the-tablespace-name naming convention comes in handy. We don’t have to match the datafile names to the tablespace names, but we do have to match the redo log filename to the redo log group. That can be tricky because with N groups, there will be N! ways of making those assignments. In this example, we are fortunate because the log member has a name indicative of the log group to which it belongs. Let’s issue the statement and see what happens.

SQL> CREATE CONTROLFILE REUSE DATABASE "DB10" NORESETLOGS NOARCHIVELOG 
  2          MAXLOGFILES 50 
  3          MAXLOGMEMBERS 3 
  4          MAXDATAFILES 300 
  5          MAXINSTANCES 8 
  6          MAXLOGHISTORY 500 
  7          LOGFILE 
  8                  GROUP 7 'D:\oracle\product\10.2.0\oradata\db10\redo07.log'  SIZE 5121K,
  9                  GROUP 8 'D:\oracle\product\10.2.0\oradata\db10\redo08.log'  SIZE 5121K,
 10                  GROUP 9 'D:\oracle\product\10.2.0\oradata\db10\redo09.log'  SIZE 5121K 
 11          DATAFILE 
 12                  'D:\oracle\product\10.2.0\oradata\db10\users01.dbf' SIZE 79368K, 
 13                  'D:\oracle\product\10.2.0\oradata\db10\undotbs01.dbf' SIZE 128008K, 
 14                  'D:\oracle\product\10.2.0\oradata\db10\system01.dbf' SIZE 614408K, 
 15                  'D:\oracle\product\10.2.0\oradata\db10\sysaux01.dbf' SIZE 593928K, 
 16                  'D:\oracle\product\10.2.0\oradata\db10\example01.dbf' SIZE 174088K;
 CREATE CONTROLFILE REUSE DATABASE "DB10" NORESETLOGS NOARCHIVELOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-01163: SIZE clause indicates 9921 (blocks), but should match header 9920
ORA-01110: data file 4: 'D:\oracle\product\10.2.0\oradata\db10\users01.dbf'

This error looks pretty serious – we have to start figuring out the number of blocks for at least this file and all others? Yes, but it is easy. The size can be figured as follows:

Expected size = Expected # of blocks * db_block_size / 1024

Extract the db_block_size from “show parameter db_block” and see (in my database) 8192. The USERS datafile size in the CREATE CONTROLFILE statement should then be 9920 * 8192 / 1024 = 79360K. Replace the OS reported value of 79368 with 79360 in the CREATE CONTROLFILE statement and re-issue it:

CREATE CONTROLFILE REUSE DATABASE "DB10" NORESETLOGS NOARCHIVELOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-01163: SIZE clause indicates 16001 (blocks), but should match header 16000
ORA-01110: data file 2: 'D:\oracle\product\10.2.0\oradata\db10\undotbs01.dbf'

The datafile for the UNDO tablespace has the same problem USERS did, but we can see a trend: the number of blocks is off by one, so let’s try downsizing the size the same way (subtract 8K) for this and the remaining datafiles.

SQL> CREATE CONTROLFILE REUSE DATABASE "DB10" NORESETLOGS NOARCHIVELOG 
  2          MAXLOGFILES 50 
  3          MAXLOGMEMBERS 3 
  4          MAXDATAFILES 300 
  5          MAXINSTANCES 8 
  6          MAXLOGHISTORY 500 
  7          LOGFILE 
  8                  GROUP 7 'D:\oracle\product\10.2.0\oradata\db10\redo07.log'  SIZE 5121K,
  9                  GROUP 8 'D:\oracle\product\10.2.0\oradata\db10\redo08.log'  SIZE 5121K,
 10                  GROUP 9 'D:\oracle\product\10.2.0\oradata\db10\redo09.log'  SIZE 5121K 
 11          DATAFILE 
 12                  'D:\oracle\product\10.2.0\oradata\db10\users01.dbf' SIZE 79360K, 
 13                  'D:\oracle\product\10.2.0\oradata\db10\undotbs01.dbf' SIZE 128000K, 
 14                  'D:\oracle\product\10.2.0\oradata\db10\system01.dbf' SIZE 614400K, 
 15                  'D:\oracle\product\10.2.0\oradata\db10\sysaux01.dbf' SIZE 593920K, 
 16                  'D:\oracle\product\10.2.0\oradata\db10\example01.dbf' SIZE 174080K;
CREATE CONTROLFILE REUSE DATABASE "DB10" NORESETLOGS NOARCHIVELOG
*
ERROR at line 1:
ORA-01503: CREATE CONTROLFILE failed
ORA-01163: SIZE clause indicates 10242 (blocks), but should match header 10240
ORA-01517: log member: 'D:\oracle\product\10.2.0\oradata\db10\redo07.log'

It appears the datafiles part worked, and now the redo log sizes need some adjustment. Using a value of 5120K instead of the OS reported value of 5121K yields:

SQL> CREATE CONTROLFILE REUSE DATABASE "DB10" NORESETLOGS NOARCHIVELOG 
  2          MAXLOGFILES 50 
  3          MAXLOGMEMBERS 3 
  4          MAXDATAFILES 300 
  5          MAXINSTANCES 8 
  6          MAXLOGHISTORY 500 
  7          LOGFILE 
  8            GROUP 7 'D:\oracle\product\10.2.0\oradata\db10\redo07.log'  SIZE 5120K,
  9            GROUP 8 'D:\oracle\product\10.2.0\oradata\db10\redo08.log'  SIZE 5120K,
 10            GROUP 9 'D:\oracle\product\10.2.0\oradata\db10\redo09.log'  SIZE 5120K 
 11           DATAFILE 
 12                'D:\oracle\product\10.2.0\oradata\db10\users01.dbf' SIZE 79360K, 
 13                'D:\oracle\product\10.2.0\oradata\db10\undotbs01.dbf' SIZE 128000K, 
 14                'D:\oracle\product\10.2.0\oradata\db10\system01.dbf' SIZE 614400K, 
 15                'D:\oracle\product\10.2.0\oradata\db10\sysaux01.dbf' SIZE 593920K, 
 16                'D:\oracle\product\10.2.0\oradata\db10\example01.dbf' SIZE 174080K;
Control file created.

Why did the redo log size get reduced by 1K instead of 8K as with the datafiles? Or, for what are approximately 5MB in size files, why are there so many more blocks (over 10,000) when compared to what the approximately 78MB USERS datafile has (9,900)? The answer lies within the size of the redo log blocks. On Windows, these blocks can be 512 bytes. For the redo logs then:

Expected size = Expected # of blocks * 512 / 1024

The 1K reduction comes from the difference of two blocks at 512 bytes each.

Now that the control file(s) has been created, what is the state of the database? Verify that your files have been created, and then select the status from V$INSTANCE and see MOUNTED. If the database is mounted, can it be opened? The answer is yes.

SQL> select status from v$instance;
STATUS
------------
MOUNTED
SQL> alter database open;
Database altered.

Connect as a user and perform any operation which requires the temporary tablespace (or group, in 10g and above). What do you see? Up to a point, the database may appear to fine, but what does the following query and result tell you?

SQL> select * from dba_temp_files;
no rows selected

This presents an interesting situation: the temp tablespace is online and at the same time, there are no datafiles associated with it. In fact, you can perform a shutdown and startup and see that the temporary tablespace temp file is untouched. If the query or statement you used to test the database required some temp space for sorting, you will see the ORA-25153 error.

[/home/oracle]$ oerr ora 25153
25153, 00000, "Temporary Tablespace is Empty"
// *Cause: An attempt was made to use space in a temporary tablespace with
//         no files.
// *Action: Add files to the tablespace using ADD TEMPFILE command.

Very easy fix, and once that is complete, your database is fully functional.

In Closing

It is interesting how the loss of a special file is easily restored or fixed when compared to what it takes to restore a datafile. The example presented here should give you some confidence that losing all of your control files isn’t the end of the world. Nonetheless, keep in mind that if your RMAN repository was in the control files, you will need to re-create it.

» See All Articles by Columnist Steve Callan

Steve Callan
Steve Callan
Steve is an Oracle DBA (OCP 8i and 9i)/developer working in Denver. His Oracle experience also includes Forms and Reports, Oracle9iAS and Oracle9iDS.

Get the Free Newsletter!

Subscribe to Cloud Insider for top news, trends & analysis

Latest Articles