Oracle: Preventing Corruption Before it’s Too Late: Part 1

Every
DBA has a personal horror story about Oracle database corruption. A few DBAs,
those that never taste this experience are missing the most challenging part of
the DBA profession.

Those
mind-jarring calls in the middle of the night to come and repair a database
problem only to discover upon arrival that the last good backups was made 3
days ago, hardware has error and above all the project manager is screaming
from his warm bed: " The database must be recovered by tomorrow morning!"

Have
you ever experienced this? Well, if not, you don’t know what you are missing.

Luckily,
database corruptions are very rare, but when they happen, they can have a
catastrophic effect on the whole company’s business. The first occurrence of a database
problem may not be detected until much later, after the data corruption has actually
occurred. One
can use several techniques to determine whether the database is corrupt, but the
main thing is also to understand the nature of the corruption. In this article,
I will discuss Oracle data-block corruption and will describe several features
of Oracle that may be used at an early stage to detect database corruption.

Part 1

  • Oracle
    Block Structure

  • System
    Components and Corruption Type

  • Oracle
    Hard Corruption

Part 2

  • Oracle
    Soft Corruption

  • Oracle
    Bug Corruption

  • Database
    Recommendations

  • Oracle
    Hardware Assisted Resilient Data (HARD)

  • Conclusion

Oracle Block Structure

Oracle
block is a main database element, composed of several operating system blocks. The
data block size is defined during database creation with initialization
parameter db_block_size. Assuming a standard UNIX configuration, with regular file
system structure we have the following physical block structure:

By
default, Oracle extent has 5 Oracle blocks and each Oracle block has several
operating system blocks. On the above picture is shown an Oracle block, which
has 5 operating system blocks. Every OS block has a standard structure
consisting of a block header and footer and a portion with the actual data.

Every
Oracle data block has an internal, binary format with a defined structure
consisting of a fixed block header, block directory, ITL (transaction list), block
free space and data portion. Inside the fixed block header is stored data
block address (DBA). DBA is a 48 bit integer block address, used for checking block
integrity. The block address has two parts: the file number and the relative
block number.

The
relative block number describes block positioning relative to the tablespace to
which the block belongs and the absolute file number describes block
positioning globally in the database (FILE# in V$DATAFILE). Two different blocks
in the database might have the same relative number, but they always have a different
absolute block number. Using dbms_utility package we can get a block’s actual,
absolute address using the block’s relative address:


SQL> variable dba varchar2;
SQL> exec :dba := dbms_utility.make_data_block_address(101,5); -> relative address, file number
SQL> print dba
10059 -> absolute block address

Each
formatted block has a block header. Some blocks also have a footer in case that
block is a split block. The block header and footer, together make the "Block
Wrapper".

Inside
the Oracle block, Oracle save rows of data from user database objects. Every
row, saved inside the block has a unique, ROWID address. The ROWID address is
an 18-digit number with the following format

object_number.relative_file_number.block_number.row_number.

System Components and
Corruption Type

A modern computer system has many different system layers.
Several of them interact in takeover and transport of the Oracle data block
from memory to the disk.

Every new component expands the possibility of a new
error. An Oracle
database block has to pass several communication points on the way to the
physical storage:

  • Oracle SGA and
    sessions’ PGA (System RAM)

  • UNIX
    buffer cache (System RAM)

  • disk
    controller cache (on-board external)

  • Storage
    Area Network cache (SAN box)

  • Physical
    disk spindle

Oracle
data is being transferred from the database buffers (Oracle SGA or sessions’ PGA), to the UNIX buffer cache,
and then using ASYNC or SYNC IO UNIX system calls and OS device drivers, over the
disk controller to the hard disk. For a storage implementation with SAN
(Storage Area Network), data is buffered in the SAN cache and transferred in the
background to the physical device.

During Oracle block transport, a block
can be corrupted due to the following:

  • System Memory
    (memory or paging problems, caching problems)

  • Disk Controller
    (Bad I/O hardware or firmware, caching problems )

  • Disk Device (
    problem with device driver or mechanical problem)

  • Others ( operating
    system bugs, oracle bugs, disk repair utilities )

An
Oracle block can be soft or hard corrupted.

Hard
corruption can occur due to a hardware or software error, in the memory or on
the disk. The block is not readable at all, or block content has no known Oracle
block format. An Oracle
block is soft corrupted if the block has a format similar to Oracle block
format, but content does not fit the physical or logical Oracle database
structure. .

Marin Komadina
Marin Komadina
Marin was born June 27, 1968 in Zagreb, Croatia. He graduated in 1993 form The Faculty for Electrotechnology and Computer Sciences, University of Zagreb in Croatia. He started his professional career as a System specialist and DBA for the Croatian company Informatika System. His most important project was the development and implementation of the enterprise, distributed point of sales solution, based on the Oracle technology. In 1999, Marin became the company CTO, where he played an active role in company development and technical orientation. After Informatika System, Marin worked as an IT Manager Assistant for the Austrian international retail company "Segro," on location in Graz (Austria) and Zagreb (Croatia). He was responsible for the company's technical infrastructure and operational support. Segro used IBM technology, OS/400 operating system and DB2 database. In 1998, Marin joined the international telecommunication company VIPNet GSM that was a part of greater concern, Mobilkom Austria& Western Wireless Int. USA. After one year, Marin took over the IT System Manager position, where he managed many multi-platform, telecommunication projects and was leading the IT system department. In 2001, Marin started to work in Germany as a senior system architect. He is currently working for German banks on different banking projects.

Latest Articles