Oracle: Preventing Corruption Before it’s Too Late – Part 2

Oracle Soft and Bug Corruption


A block is soft corrupted when its format differs from the standard Oracle
block format. Oracle soft data corruption (logical, software) is usually
detected when data is read from disk into the database buffer cache. In the
buffer cache, the Oracle kernel examines the block content, reading the block's
type, incarnation, version, sequence number, checksum and data block address
(DBA), depending on the database settings. In the same way, whenever Oracle
modifies a data block, a health check is performed on the block to confirm that
it is fully consistent. Any errors found cause an internal error to be
signalled.

By default, Oracle will not dig deeply into the block content; it just takes a
quick look at the block header. If the header does not conform to the standard
rules and the block structure is not regular, the block is considered corrupt.
However, this does not always mean that the block on disk is truly corrupt.
That needs to be confirmed separately.
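
The quick header check can be pictured with a small Python sketch. The 8-byte
header used here (a 4-byte DBA followed by a 4-byte sequence number) is a
hypothetical toy layout chosen only for illustration, not Oracle's real block
format, and the function names are assumptions as well.

```python
import struct

# Hypothetical toy header: 4-byte DBA + 4-byte sequence number (big-endian).
# This is NOT Oracle's on-disk block format, only an illustration.
HEADER = struct.Struct(">II")


def make_block(dba, seq, payload=b""):
    """Build a toy block whose header records its own address."""
    return HEADER.pack(dba, seq) + payload


def quick_header_check(block, expected_dba):
    """Return True when the DBA stored in the header matches the address
    the block was requested from. The payload is never inspected."""
    if len(block) < HEADER.size:
        return False
    header_dba, _seq = HEADER.unpack_from(block)
    return header_dba == expected_dba


good = make_block(15742, 1, b"row data")
misplaced = make_block(14237, 1, b"row data")
print(quick_header_check(good, 15742))       # True
print(quick_header_check(misplaced, 15742))  # False
```

Because only the header is examined, a failed check of this kind says that the
block looks wrong as it was read, not that the copy on disk is damaged.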

Oracle error indicating soft corruption:

ORA-00600: internal error code, arguments: [3339], [RBA1], [RBA2], [], [], [], [], []

Here, RBA1 is the block address read from the block header and RBA2 is the real
physical block address in the database. The Oracle database engine performs a
block check on every block read into the database buffer cache. If the block
content is incorrect, an error message is generated. Oracle marks such a block
as soft corrupted by changing several bytes in the block header, and then skips
the soft corrupted block regardless of any readable information it contains.
Let’s look at several different situations:

ORA-00600: internal error code, arguments: [3339], [0], [15742], [], [], [], [], []

The above error occurs when the calculated DBA (the real physical block
location) and the DBA read from the block header do not match. The difference
can result from an operating system repair attempt after a system crash, or
from faulty ASYNC I/O processing.
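
When many such messages have to be reviewed, the two addresses can be pulled
out of the error text mechanically. The following Python sketch is one way to
do it; the function name `parse_3339` is hypothetical, and it assumes the
arguments are printed as decimal integers, as in the examples in this article.

```python
import re

# Captures the bracketed argument list of an ORA-00600 message as raw text.
ORA_600 = re.compile(r"ORA-00600: internal error code, arguments: (.*)")


def parse_3339(line):
    """Return (header_dba, calculated_dba) for an ORA-00600 [3339] line,
    or None when the line is some other message."""
    match = ORA_600.search(line)
    if not match:
        return None
    args = re.findall(r"\[([^\]]*)\]", match.group(1))
    if len(args) < 3 or args[0] != "3339":
        return None
    try:
        # Assumption: both addresses are written in decimal.
        return int(args[1]), int(args[2])
    except ValueError:
        return None


line = ("ORA-00600: internal error code, arguments: "
        "[3339], [0], [15742], [], [], [], [], []")
print(parse_3339(line))  # (0, 15742)
```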

ORA-00600: internal error code, arguments: [3339], [12222222], [144665742], [], [], [], [], []

This error occurs when both addresses, read and calculated, contain large
numbers. Possible reasons are an incorrect entry in the block header (pointing
to a non-existent block) caused by faulty memory modules, or a block that
belongs to a large database file (greater than 2GB) and has been written in the
wrong place.

The message "write blocks out of sequence" for files greater than 4.3 GB
indicates this kind of corruption. Since Oracle supports only 2GB, the
operating system has to translate the address and position blocks inside large
files at the correct location.

ORA-00600: internal error code, arguments: [3339], [14237], [15742], [], [], [], [], []

In the above example, both block addresses are real: one comes from the block
header and one is calculated by Oracle. The problem is that the DBA from the
block header is offset from the true address in the database. The reason could
be an operating system failure that wrote into the block header the address of
the block that was previously read into database memory.

ORA-00600: internal error code, arguments: [3339], [14237], [15742], [], [], [], [], []

In the above example, Oracle considered the block it was reading from disk to
be soft corrupted because the DBA in its header differed from the one
requested. The reason for this behaviour may be an extremely high stress load
on the system, causing the operating system to have a read failure and thus
retrieve the wrong block from the disk.
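
The situations above can be told apart roughly from the two numbers alone. The
sketch below is only an illustration of that reasoning: the function name, the
category labels, and the `max_valid_dba` bound (the highest address that
actually exists in the database, which the caller must supply) are all
assumptions, and the result hints at where to look rather than giving a
diagnosis.

```python
def classify_mismatch(header_dba, calculated_dba, max_valid_dba):
    """Sort an ORA-00600 [3339] address pair into rough categories.

    max_valid_dba is a hypothetical upper bound on addresses that exist
    in the database; it is not something Oracle reports in the error.
    """
    if header_dba == calculated_dba:
        return "consistent"
    if header_dba > max_valid_dba or calculated_dba > max_valid_dba:
        # Like the example where both numbers are implausibly large.
        return "address out of range"
    if header_dba == 0:
        # Like the first example, where the header address is 0.
        return "header address is zero"
    # Both addresses are plausible but differ by some offset.
    return "offset of %d blocks" % (header_dba - calculated_dba)


limit = 100000  # assumed size of the address space, for illustration only
print(classify_mismatch(0, 15742, limit))
print(classify_mismatch(12222222, 144665742, limit))
print(classify_mismatch(14237, 15742, limit))
```

Note that the last two examples in the text produce the same error arguments,
so the numbers alone cannot separate a bad write from a bad read.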

Errors in file /opt/oracle/admin/ALSY1/bdump/ckpt_5514_alsy1.trc:
ORA-01242: data file suffered media failure:

ORA-01122: database file 13 failed verification check 
ORA-01110: data file 13: '/u04/oradata/ALSY1/cwrepo01.dbf' 
ORA-01251: Unknown File Header Version read for file number 13 

If Oracle fails to verify the block header content, the operation will finish
with an error message indicating an Unknown File Header Version. Again, this
can be the result of Oracle memory mishandling.

Bug Corruption

Due to bugs in the Oracle code, or perhaps because of imperfect interaction
between the Oracle code and the underlying operating system, diverse block
corruptions occur. Oracle fixes bugs with every new version. Unfortunately,
new ones appear. Here are just a few examples of Oracle bug corruption:

  • A corrupted database due to the auto extend bug:


    Corrupt block dba: 0x24024cd0 file=5. blocknum=150736.
    found during buffer read on disk type:0. ver.
    dba:0x00000000 inc:0x00000000 seq:0x00000000 incseq:0x00000000
    Entire contents of block is zero – block never written
    Reread of block=24024cd0 file=9. blocknum=150736. found same corupted Data

  • Block information mishandled (under certain conditions) while reading the
    database block, so that a good block is reported as corrupted. This is a
    bug in Oracle versions 8.1.x – 9.x, where Oracle will raise the error:

    ORA-600 [kcoapl_blkchk][ABN][RFN][INT CODE] 

    This points to a failure condition during the block check. It only
    happens when block checking is enabled.

  • A problem with a faulty database trigger operation, causing data block
    corruption:


    ORA-01115: IO error reading block from file 6 (block # 14873)
    ORA-01110: data file 6: '/oracle/artist/artist01.dbf'
    ORA-27091: skgfqio: unable to queue I/O
    IBM AIX RISC System/6000 Error: 9: Bad file number

  • A bug in the operating system that, while performing a system check,
    corrupts a good Oracle block and causes a database file to go offline:


    SQL> SELECT * FROM ARTIST_HISTORY;
    ERROR:
    ORA-01115: IO error reading block from file 12 (block # 2342)
    ORA-01110: data file 12: '/oracle/artist/artist01.dbf'
    ORA-27091: skgfqio: unable to queue I/O
    OSD-04006: ReadFile() failure, unable to read from file
    O/S-Error: (OS 23) Data error (cyclic redundancy check)
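
Excerpts like the ones above are usually found by searching alert logs and
trace files. As a minimal sketch, the Python below collects the corrupt block
reports and the data file names from a piece of log text. The function name
`scan_log` and the regular expressions are assumptions based only on the line
formats shown in this article, and the file name pattern expects plain
single quotes around the path.

```python
import re

# Patterns modelled on the log excerpts quoted in this article.
CORRUPT_BLOCK = re.compile(
    r"Corrupt block dba: (0x[0-9a-fA-F]+) file=(\d+)\. blocknum=(\d+)\."
)
DATA_FILE = re.compile(r"ORA-01110: data file (\d+):\s*'([^']+)'")


def scan_log(text):
    """Collect corrupt-block reports and data file names from log text."""
    blocks = [
        {"dba": int(dba, 16), "file": int(file_no), "block": int(block_no)}
        for dba, file_no, block_no in CORRUPT_BLOCK.findall(text)
    ]
    # Map each reported file number to the path named in ORA-01110.
    files = {int(num): path for num, path in DATA_FILE.findall(text)}
    return blocks, files


sample = """\
Corrupt block dba: 0x24024cd0 file=5. blocknum=150736.
ORA-01122: database file 13 failed verification check
ORA-01110: data file 13: '/u04/oradata/ALSY1/cwrepo01.dbf'
"""
blocks, files = scan_log(sample)
print(blocks)
print(files)
```

Grouping the results by file number makes it easier to see whether the errors
cluster on a single data file or device.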

Marin Komadina
Marin was born June 27, 1968 in Zagreb, Croatia. He graduated in 1993 from the Faculty for Electrotechnology and Computer Sciences, University of Zagreb, Croatia. He started his professional career as a system specialist and DBA for the Croatian company Informatika System. His most important project was the development and implementation of an enterprise, distributed point-of-sale solution based on Oracle technology. In 1999, Marin became the company CTO, where he played an active role in company development and technical orientation. After Informatika System, Marin worked as an IT Manager Assistant for the Austrian international retail company "Segro," on location in Graz (Austria) and Zagreb (Croatia). He was responsible for the company's technical infrastructure and operational support. Segro used IBM technology, the OS/400 operating system and the DB2 database. In 1998, Marin joined the international telecommunication company VIPNet GSM, which was part of a larger concern, Mobilkom Austria & Western Wireless Int. USA. After one year, Marin took over the IT System Manager position, where he managed many multi-platform telecommunication projects and led the IT system department. In 2001, Marin started to work in Germany as a senior system architect. He is currently working for German banks on different banking projects.
