Oracle: Preventing Corruption Before it's Too Late - Part 2
October 30, 2003

Oracle Soft and Bug Corruption
By default, Oracle does not dig deeply into block content; it only takes a quick look at the block header. If the header does not conform to the standard rules and the block structure is not regular, the block is considered corrupt. However, this does not always mean that the block on disk is truly corrupt. That fact needs to be confirmed.

Oracle error indicating soft corruption:

ORA-00600: internal error code, arguments: [3339], [RBA1], [RBA2], [], [], [], [], []

Here RBA1 is the block address read from the block header, and RBA2 is the real physical block address in the database.

The Oracle database engine checks every block read into the database buffer cache. If the block content is incorrect, an error message is generated. Oracle marks this type of block as soft corrupted by changing several bytes in the block header. Oracle will then skip soft corrupted blocks, regardless of any readable information they contain.

Let's look at several different situations:

ORA-00600: internal error code, arguments: [3339], [0], [15742], [], [], [], [], []

The above error occurs when the calculated DBA (the real physical block location) and the DBA read from the block header do not match. The difference can result from an operating system repair attempt after a system crash, or from faulty ASYNC I/O processing.

ORA-00600: internal error code, arguments: [3339], [12222222], [144665742], [], [], [], [], []

This error occurs when both addresses, read and calculated, contain large numbers. Possible reasons are an incorrect entry in the block header (pointing to a non-existent block) due to faulty memory modules, or the block is part of a large database file (greater than 2GB) and has been written in the wrong place. The message "write blocks out of sequence" for files greater than 4.3 GB indicates this kind of corruption.
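When many of these messages turn up in an alert log or trace file, the two addresses in an ORA-00600 [3339] line can be extracted mechanically. The following is a minimal Python sketch, not an Oracle tool: the function names parse_3339 and describe are hypothetical, and the regular expression is inferred only from the sample messages quoted in this article.

```python
import re

# Pattern inferred from the sample messages in this article; it is an
# assumption, not an official description of Oracle's message format.
ORA_3339 = re.compile(
    r"ORA-00600: internal error code, arguments: "
    r"\[3339\], \[(\d+)\], \[(\d+)\]"
)


def parse_3339(line):
    """Return (header_dba, calculated_dba), or None if the line does not match."""
    match = ORA_3339.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def describe(line):
    """Summarise the address mismatch reported by one ORA-00600 [3339] line."""
    parsed = parse_3339(line)
    if parsed is None:
        return "not an ORA-00600 [3339] message"
    header_dba, calculated_dba = parsed
    if header_dba == 0:
        # The block header carried no usable address at all.
        return f"header address is zero, expected {calculated_dba}"
    # Both addresses are present but disagree; report the gap between them.
    return (
        f"header address {header_dba} differs from expected "
        f"{calculated_dba} by {calculated_dba - header_dba}"
    )
```

Passing the first sample message above to describe reports a zero header address with 15742 expected. A summary like this only helps sort messages into the situations described here; it does not confirm whether the block on disk is really corrupt.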
Since Oracle supports only 2GB, the operating system has to translate the addresses and position blocks inside large files at the correct location.

ORA-00600: internal error code, arguments: [3339], [14237], [15742], [], [], [], [], []

In the above example, both block addresses are real: one comes from the block header and one is calculated by Oracle. The problem is that the DBA from the block header is offset from the true address in the database. The reason for this could be an operating system failure that wrote into the block header the address of the previous block, the one last read into database memory.

ORA-00600: internal error code, arguments: [3339], [14237], [15742], [], [], [], [], []

In the above example, Oracle considered the block it was reading from disk to be soft corrupted because the DBA in its header differed from the one requested. The reason for this behaviour may be an extremely high stress load on the system, causing the operating system to have a read failure and retrieve the wrong block from the disk.

Errors in file /opt/oracle/admin/ALSY1/bdump/ckpt_5514_alsy1.trc:
ORA-01122: database file 13 failed verification check
ORA-01110: data file 13: '/u04/oradata/ALSY1/cwrepo01.dbf'
ORA-01251: Unknown File Header Version read for file number 13

If Oracle fails to verify block header content, the operation finishes with an error message indicating an Unknown File Header Version. Again, this can be the result of Oracle memory mishandling.

Bug Corruption

Diverse block corruptions occur because of bugs in the Oracle code, or perhaps because of imperfect interaction between Oracle code and the underlying operating system. Oracle fixes bugs with every new version. Unfortunately, new ones appear. Here are just a few examples of Oracle bug corruption: