“The Chain Gang”, a Riveting Tale of Chained and Migrated Rows in Oracle

Chained and migrated rows in an Oracle database can seriously impact I/O performance, especially with Exadata. In a ‘normal’ Oracle database [read that as “non-Exadata, non-ASM” database] chained rows in small numbers, usually 1% or less of the total rows in a table, are generally a nuisance. The performance hit they generate in such a situation is small and may not even be noticed by the end users. Add more chained/migrated rows to that mix, though, and the increase in I/O work starts becoming noticeable. So let’s look into what chained and migrated rows are, how they get created, what problems they cause and how to deal with them.

Chained and migrated rows are a unique lot: their data lives in two or more data blocks, linked from the “head” rowid, usually the rowid of the original row. The chained piece or pieces come into being when a row is updated and the data block containing it can no longer store the entire row. A migrated row, by contrast, is one moved entirely to a new block, with a ‘pointer’ left behind under the original rowid. Let’s draw a rather simplified picture to illustrate the chaining concept:

Original row -- Data fits into the available space in the block

       ------------------------------------------------------
000001:|Bonsai Bob|14734|23-NOV-2013|343.27|Duluth|Dept 300 |
       ------------------------------------------------------

Update is made -- Data now takes more space than block has available

       ------------------------------------------------------
000001:|Samurai Sam The Singing Elf|14734|23-NOV-2013|343.27|Duluth|Dept 300 |
       ------------------------------------------------------

Data no longer 'fits' in one block, it's divided between blocks --

       ------------------------------------------------------
000001:|Samurai Sam The Singing Elf|14734|23-NOV-2013|000017|
       ------------------------------------------------------
000017:|343.27|Duluth|Dept 300                              |
       ------------------------------------------------------

Notice there is a ‘pointer’ to the next piece of the row at the end of the original row; one such pointer is created for every ‘split’ the row takes to make the data ‘fit’. It’s also possible for a row to be migrated, in its entirety, to another block to minimize the trips Oracle has to take to retrieve the data. This occurs when the original row was small and occupied minimal space in a crowded data block; updates cause it to expand in length, forcing Oracle to relocate the data to another block, or blocks, with more available space:

Original row -- Data occupies available space

       ------------------------------------------------------
000009:|< more data in block  >|Bonsai Bob|14734|23-NOV-2013|
       ------------------------------------------------------

Update is made -- Data no longer fits

       ------------------------------------------------------
000009:|Samurai Sam The Singing Elf|14734|23-NOV-2013|343.27|Duluth|Dept 300 |
       ------------------------------------------------------

Data is divided between blocks --

       ------------------------------------------------------
000009:||000019|
       ------------------------------------------------------
000019:|Samurai Sam The Singing Elf|14734|23-NOV-2013|000037|
       ------------------------------------------------------
000037:|343.27|Duluth|Dept 300                              |
       ------------------------------------------------------

The original rowid now has a pointer to the new location of the modified data. That migrated location may also be chained to an additional block because the length of the modified data eventually exceeds the data block size.

What does this mean for performance? It depends, really, on the overall number of rows in the table or schema that are chained or migrated. For a database not on Exadata, or one not using ASM, each chained/migrated row requires at least two fetches to return the data, which at least doubles the work Oracle has to do to process that row. For one, ten or one hundred rows this may be a very small decrease in performance. When 10% or more of the data in a table is chained/migrated, though, I/O processing noticeably slows down; this is common for tables where the row length can exceed the block size. Without intervention Oracle usually sets the block size to 8192 bytes, or 8K. After taking into consideration the ‘overhead’ introduced by Oracle the maximum available space in an 8K block is roughly 7900 to 8000 bytes, so any table row longer than that will create a chained row even in an empty block. This can be changed at database creation, or by creating tablespaces using a larger block size. [Smaller block sizes can also be used, but they would not benefit any effort to reduce row chaining.] The largest block size supported by Oracle is 32K, so if a table row exceeds that length when fully populated chained rows are guaranteed to exist. The more ‘passes’ Oracle must make to fetch a row the longer the I/O waits can be, resulting in diminished performance, so it’s best to minimize their occurrence.
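If a table’s rows simply cannot fit into the default 8K block, one option is a tablespace with a larger block size. A minimal sketch follows; the tablespace, datafile and table names are hypothetical, the 64M cache size is arbitrary, and the platform must support a 32K block size. A buffer cache for the non-default block size has to be configured first:


SQL> alter system set db_32k_cache_size = 64M;

System altered.

SQL> create tablespace tbs_32k datafile '/u01/oradata/tbs32k_01.dbf' size 1g blocksize 32k;

Tablespace created.

SQL> alter table wide_table move tablespace tbs_32k;

Table altered.

SQL>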

Exadata and ASM present a different scenario. Since a diskgroup is striped across all available disks associated with that group, the pieces of a chained row can exist on different disks and, for Exadata, on different storage cells. For a ‘regular’ database using ASM performance can be slower because Oracle not only has to visit different blocks to return the data, it may have to visit different disks as well, introducing additional latency. On top of that, in an Exadata database chained rows that are not on the same disk, or not accessed by the same storage cell, cause Oracle to revert to regular block I/O, eliminating any chance of using a Smart Scan and the optimizations a Smart Scan would provide.

The question to answer is “How do I eliminate, or at least minimize, chained and migrated rows?” The first, and most effective, way is to design tables with the database block size in mind, taking care not to exceed the available space in the block when a row is fully populated and all variable-length columns contain data sized at the maximum defined length. It is also possible to reserve more of that space for row growth by setting two parameters, PCTFREE and PCTUSED, which tell Oracle how to allocate the data block space. PCTFREE reserves the declared percentage of the block for updates to existing rows; once the free space falls below that threshold the block is taken off of the ‘available’ list for inserts. PCTUSED governs when a block goes back onto that list: only after the used space drops below the PCTUSED percentage. If rows are being constantly updated, such as in a financial system, it might be best to set PCTUSED to a smaller value than the default (which is 40) and PCTFREE to 100-PCTUSED, to ensure more space is available for updates to existing data. I have seen at least one system where PCTUSED was set to 1 and PCTFREE set to 99, allowing 1 row per block with plenty of room for updates; in that system the occurrence of chained rows dropped significantly. Please note that making such a change to an existing table using ‘alter table … ‘ won’t affect the currently populated data blocks. To do that in Oracle releases 9.0.1.x and later you’ll need to perform a move operation, to rebuild the table with the new PCTUSED and PCTFREE settings:


SQL> alter table empdrop move pctfree 99 pctused 1;

Table altered.

SQL>

Unfortunately such a move can increase the table size considerably, since the rows per block have now been reduced to a much smaller value. A move also assigns new rowids, leaving any indexes on the table UNUSABLE until they are rebuilt. [If you’re still in the ‘dark ages’, using Oracle 8.1 or earlier, you’ll need to export the table, pre-create it with a script modified to set PCTUSED and PCTFREE ‘properly’, then import the data ignoring the table create error.] This is not a task to be taken lightly, nor one for ‘canned’ scripts and ‘best practice’ values, as each implementation is different. Simply because one table benefited from such drastic settings for PCTUSED and PCTFREE doesn’t mean EVERY table will need such a dramatic change. Base your changes on the incidence of chained rows and on the maximum row length as compared to the database block size.
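Since the move example above assigns new rowids, a quick check afterwards shows which indexes need attention; the index name here is hypothetical and the STATUS values will, of course, depend on your own tables:


SQL> select index_name, status from user_indexes where table_name = 'EMPDROP';

INDEX_NAME                     STATUS
------------------------------ --------
EMPDROP_PK                     UNUSABLE

SQL> alter index empdrop_pk rebuild;

Index altered.

SQL>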

“How do I know if I have chained rows?”, you ask. Oracle keeps a system statistic, ‘table fetch continued row’, that increments each time a continued/migrated row is fetched. It doesn’t tell you WHERE those rows are; however, the ANALYZE TABLE command is still available to provide that information. It can take a while to run on a large table, but by default it will populate a table named, brilliantly enough, CHAINED_ROWS, designed to contain all of the information you could need to identify which rows in a table are chained. There is no way to gather this information at the schema level (DBMS_STATS doesn’t have any procedure or function to gather that data) but you can create a script to do the deed. There is some preparation involved, namely creating the CHAINED_ROWS table, but Oracle has conveniently provided a script for that named utlchain.sql, located in $ORACLE_HOME/rdbms/admin:


rem
rem $Header: utlchain.sql 07-may-96.19:40:01 sbasu Exp $
rem
Rem Copyright (c) 1990, 1995, 1996, 1998 by Oracle Corporation
Rem NAME
REM    UTLCHAIN.SQL
Rem  FUNCTION
Rem    Creates the default table for storing the output of the
Rem    analyze list chained rows command
Rem  NOTES
Rem  MODIFIED
Rem     syeung     06/17/98  - add subpartition_name
Rem     mmonajje   05/21/96 -  Replace timestamp col name with analyze_timestam
Rem     sbasu      05/07/96 -  Remove echo setting
Rem     ssamu      08/14/95 -  merge PTI with Objects
Rem     ssamu      07/24/95 -  add field for partition name
Rem     glumpkin   10/19/92 -  Renamed from CHAINROW.SQL
Rem     ggatlin    03/09/92 -  add set echo on
Rem     rlim       04/29/91 -         change char to varchar2
Rem   Klein      01/10/91 - add owner name for chained rows
Rem   Klein      12/04/90 - Creation
Rem

create table CHAINED_ROWS (
  owner_name         varchar2(30),
  table_name         varchar2(30),
  cluster_name       varchar2(30),
  partition_name     varchar2(30),
  subpartition_name  varchar2(30),
  head_rowid         rowid,
  analyze_timestamp  date
);
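
Running that script from SQL*Plus is a one-liner (the ? shorthand expands to $ORACLE_HOME):


SQL> @?/rdbms/admin/utlchain.sql

Table created.

SQL>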

Oracle also provides another version of that table in the $ORACLE_HOME/rdbms/admin/utlchn1.sql script; this one uses the UROWID datatype, which supports both Oracle and non-Oracle database rowid values:


Rem
Rem $Header: utlchn1.sql 24-jun-99.07:57:57 echong Exp $
Rem
Rem utlchn1.sql
Rem
Rem  Copyright (c) Oracle Corporation 1998, 1999. All Rights Reserved.
Rem
Rem    NAME
Rem      utlchn1.sql - 
Rem
Rem    DESCRIPTION
Rem      
Rem
Rem    NOTES
Rem      
Rem
Rem    MODIFIED   (MM/DD/YY)
Rem    echong      06/24/99 - rename
Rem    syeung      06/22/98 - add subpartition_name
Rem    echong      06/05/98 - chained rows table with urowid type
Rem    echong      06/05/98 - Created
Rem

create table CHAINED_ROWS (
  owner_name         varchar2(30),
  table_name         varchar2(30),
  cluster_name       varchar2(30),
  partition_name     varchar2(30),
  subpartition_name  varchar2(30),
  head_rowid         urowid,
  analyze_timestamp  date
);

Generally speaking you probably won’t need the version using the UROWID data type, but either table can be used to report chained rows in an Oracle database.

Once the CHAINED_ROWS table is created it’s a simple task to generate a script to analyze all of the tables in a given schema for chained rows:


select 'analyze table '||owner||'.'||table_name||' list chained rows;'
from dba_tables
where owner = upper('&1');

Spool the output from that query to a file, then execute it from the SQL> prompt. Remember that by default the command inserts data into the CHAINED_ROWS table. You can, if you’re feeling adventurous, create a similar table with a different name and have ANALYZE TABLE insert the chained row data there by telling Oracle where you want the data to go. For example, if you decide to create a table named BAD_JUJU with the same structure as the CHAINED_ROWS table you can do this:


select 'analyze table '||owner||'.'||table_name||' list chained rows into bad_juju;'
from dba_tables
where owner = upper('&1');

You can then query your BAD_JUJU table for chained row information.
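
Creating such a table doesn’t require hand-typing the DDL; since BAD_JUJU simply mirrors the CHAINED_ROWS structure, a CTAS with an always-false predicate will clone it empty:


SQL> create table bad_juju as
  2  select * from chained_rows where 1=0;

Table created.

SQL>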

If the ‘table fetch continued row’ statistic reports 0 then you have nothing to worry about and nothing to ‘fix’:


SQL> select value from v$sysstat where name = 'table fetch continued row';

     VALUE
----------
         0

SQL>

In reality that isn’t likely to happen as some of the SYS-owned tables can, and do, contain chained rows. This is the result from a small, personal database I have on my laptop:


SQL> select value from v$sysstat where name = 'table fetch continued row';

     VALUE
----------
       778

SQL>

Let’s see where those chained rows are:


SQL> select owner_name, table_name, count(*) chain_ct
  2  from bad_juju
  3  group by owner_name, table_name;

OWNER_NAME                     TABLE_NAME                       CHAIN_CT
------------------------------ ------------------------------ ----------
SYS                            SMON_SCN_TIME                           1
SYS                            TABPART$                                1
SYS                            WRI$_DBU_FEATURE_USAGE                  5
APEX_030200                    WWV_FLOW_STEPS                         10
SYS                            IDL_CHAR$                             165
SYS                            METASTYLESHEET                        113
APEX_030200                    WWV_FLOW_TEMPLATES                     18
MDSYS                          SDO_DATUMS                              2
SYS                            COLLECTION$                             2
SYS                            IDL_UB1$                             6683
APEX_030200                    WWV_FLOW_STEP_PROCESSING               57
MDSYS                          SDO_ELLIPSOIDS                          8
SYS                            HIST_HEAD$                            265
SYS                            USER$                                   1
MDSYS                          SDO_COORD_OPS                          45
SYS                            IDL_SB4$                               31
SYS                            IND$                                    4
SYS                            METHOD$                                 3
SYS                            RESULT$                                 3
SYS                            TRIGGER$                                1
APEX_030200                    WWV_FLOW_PAGE_PLUG_TEMPLATES            1
SYS                            PARAMETER$                             13
SYS                            TAB$                                    2
SYS                            VIEW$                                  67
APEX_030200                    WWV_FLOW_PAGE_PLUGS                    36
MDSYS                          SDO_COORD_OP_PARAM_VALS               116
MDSYS                          SDO_COORD_OP_PATHS                     53
SYS                            ATTRIBUTE$                             29
SYS                            IDL_UB2$                              837
SYS                            JAVA$MC$                              124

30 rows selected.

SQL>

Notice that no user tables or vendor-supplied application schemas have chained rows, only Oracle-supplied tables, so do not be surprised if you analyze the tables in a given schema and find there are no chained rows to address. Is it a good idea to change PCTFREE and PCTUSED on those Oracle-supplied tables? That’s your decision to make; if recursive SQL calls to these tables are consuming large amounts of time it may be beneficial to change those values. I suspect, though, that it’s not that big of a performance hit and that you’d be better off addressing chained rows, should they exist, in any application schemas present in your database.
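To put such counts into perspective you can compare them to the total row counts recorded in DBA_TABLES; a rough sketch is below. NUM_ROWS is only as fresh as the last statistics gathering, so treat the percentages as approximate:


select b.owner_name, b.table_name, count(*) chain_ct, t.num_rows,
       round(100*count(*)/nullif(t.num_rows,0),2) pct_chained
from bad_juju b join dba_tables t
  on t.owner = b.owner_name and t.table_name = b.table_name
group by b.owner_name, b.table_name, t.num_rows
order by pct_chained desc nulls last;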

With Exadata chained rows do more than simply increase the I/O workload; they can shift it from the storage cells to the database server by reverting to regular block I/O rather than using a Smart Scan. Since Smart Scans can’t process chained rows that span disks and/or storage cells, this eliminates any use of Storage Indexes, Predicate Filtering and Column Projection. There may be chained rows that can be processed by a Smart Scan (where the row is chained on the same disk accessed by the same storage cell) but those are very likely rare occurrences. Exadata provides three statistics regarding chained rows and their processing: ‘chained rows processed by cell’, ‘chained rows rejected by cell’ and ‘chained rows skipped by cell’. These report the cumulative number of rows in each ‘category’ from the time the instance was started. Each statistic, and what it reports, is listed below:


chained rows processed by cell

This statistic, chained rows processed by cell, reports on the chained rows that were processed entirely within a storage cell, that is, where the row pieces (or at least the ones Oracle needed to return results) are located within the storage accessed by a given storage cell. The most common case is intra-block chaining, where the next row piece is located in the same data block as the head piece. This is a special case of row chaining for rows with more than 255 columns; since the next piece is in the same block as the head it can be fetched with no additional effort.

chained rows rejected by cell

This counter records the chained rows where the next piece is not in the same block or cell, as described in the previous section. This statistic is apparently incremented only in special cases, as it is not incremented very often in the systems I have dealt with. When it is incremented the Smart Scan fell back to regular block I/O to process the row.

chained rows skipped by cell

This is the statistic most often incremented when a Smart Scan needs to revert to regular block I/O. It’s incremented when a chained row is found and the remaining pieces reside across the entire storage stripe, spanning storage cells in the process. It isn’t entirely clear when this counter, rather than the previous one, should be incremented in such situations; in my experience this is the one most often incremented when Smart Scans revert to regular block I/O to process chained rows.
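
To see how your own system is faring, all three counters can be queried from V$SYSSTAT just as ‘table fetch continued row’ was earlier; the values are cumulative since instance startup:


select name, value
from v$sysstat
where name like 'chained rows%cell'
order by name;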

The same methods used for non-Exadata databases can be applied to those on Exadata, although since ASM stripes the diskgroup across all available disks it may not be possible to correct every chained row that exists. That does not mean the chained/migrated row situation should not be addressed, but remember that the storage volume on Exadata systems is oftentimes far larger than on conventional databases using commodity hardware, a fact that can make any process intended to reduce or eliminate chained rows take longer. If the performance hit from the chained rows is large enough, any effort to reduce that I/O workload is worth the time and trouble it may take to implement.

Chained and migrated rows can be a bother, and in large volumes they can introduce real performance issues. Sometimes it’s not possible to eliminate them because the row length exceeds even the largest available database block size. But for tables where the row length easily fits within the database block there are ways to mitigate their creation: altering how the block space is allocated using PCTUSED and PCTFREE, moving the table or tables to a tablespace having a larger block size, or moving the table in place to relocate the rows and reduce, or eliminate, the existing chaining. This isn’t a quick operation, and it certainly should not be done online, especially in production databases, but in many cases the problem can be successfully addressed.
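
As a parting illustration, a minimal end-to-end pass for a single table might look like the sketch below; the owner, table and index names are hypothetical, and the move should be run in a maintenance window since it blocks DML against the table:


-- list the chained/migrated rows (into the default CHAINED_ROWS table)
analyze table app_owner.orders list chained rows;

-- rebuild the table in place; migrated rows are re-inserted whole
alter table app_owner.orders move pctfree 20;

-- the move assigns new rowids, so rebuild any indexes it left unusable
alter index app_owner.orders_pk rebuild;

-- clear the old entries and verify the chaining is reduced or gone
truncate table chained_rows;
analyze table app_owner.orders list chained rows;
select count(*) from chained_rows where table_name = 'ORDERS';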
