Using Index-Organized Tables in Oracle

Since version 8.0, Oracle has provided an interesting construct called the Index-Organized Table, or IOT. Such tables are essentially indexes whose leaf nodes contain the actual table data. Under the right conditions an IOT can be very beneficial, and under the wrong conditions it can be more of a problem than a solution. Let's look at what an IOT is, the proper conditions for its use and what can happen when those conditions aren't met.

The IOT was designed for use with short rows accessed solely through range scans of the primary key index. Since an IOT orders the table data by the primary key values this speeds up read access considerably, and the short rows ensure that small amounts of data are all that are fetched. Of course the down side to this is that inserts behave like index inserts, not table inserts, as the key order must be maintained. This increases the work an insert must perform but if the rows are short the benefits can outweigh this cost. An example, by Jonathan Lewis, highlights how the cost of using an IOT can be a problem when the right conditions aren’t met.
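To illustrate the intended use case, a short-row lookup table is a natural fit, since the entire row can live in the index leaf blocks and every access is by primary key. The table below is a hypothetical example (not part of Jonathan's script); the table and column names are invented for illustration:

-- hypothetical short-row lookup table; with no overflow clause,
-- every column is stored in the primary key index leaf blocks
create table state_lookup (
        state_cd   varchar2(2),
        state_name varchar2(30),
        constraint state_lookup_pk primary key (state_cd)
)
organization index;

Because there is no overflow segment, a primary key lookup or range scan touches only index blocks, which is exactly the situation the IOT was designed for.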

Jonathan’s script is shown below; it creates an IOT with an overflow segment and places all columns except the primary key into that overflow segment:

create table iot1 (
        id1     number(7,0),
        id2     number(7,0),
        v1      varchar2(10),
        v2      varchar2(10),
        padding varchar2(500),
        constraint iot1_pk primary key(id1, id2)
)
organization index
including id2
overflow
;

insert into iot1
with generator as (
        select  --+ materialize
                rownum id
        from    dual
        connect by
                level <= 1e4
)
select
        mod(rownum,311)                 id1,
        mod(rownum,337)                 id2,
        to_char(mod(rownum,20))         v1,
        to_char(trunc(rownum/100))      v2,
        rpad('x',500,'x')               padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e5
;

commit;

begin
        dbms_stats.gather_table_stats(
                ownname          => user,
                tabname          => 'IOT1',
                method_opt       => 'for all columns size 1'
        );
end;
/

alter system flush buffer_cache;

select table_name, blocks from user_tables where table_name = 'IOT1' or table_name like 'SYS_IOT_OVER%';
select index_name, leaf_blocks from user_indexes where table_name = 'IOT1';

set autotrace traceonly
select max(v2) from iot1;
set autotrace off

Once the table is created, statistics are collected and a query is run that would generate a full table scan plan for a heap table. Since an IOT has no separate table segment, the query generates an index fast full scan instead, but the statistics reported don't match those an index fast full scan would normally produce:
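As an aside, the data dictionary records which segment is which: USER_TABLES exposes the IOT_TYPE and IOT_NAME columns, so a query like the following (using the table name from the example) shows the base IOT and its system-generated overflow segment:

select table_name, iot_type, iot_name
from   user_tables
where  table_name = 'IOT1'
or     iot_name   = 'IOT1';

The base table reports an IOT_TYPE of 'IOT' and the overflow segment reports 'IOT_OVERFLOW', with IOT_NAME pointing back to the parent table.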

SQL> select table_name, blocks from user_tables where table_name = 'IOT1' or table_name like 'SYS_IOT_OVER%';

TABLE_NAME                 BLOCKS
-------------------------- ----------
IOT1
SYS_IOT_OVER_133724              8074

SQL> select index_name, leaf_blocks from user_indexes where table_name = 'IOT1';

INDEX_NAME           LEAF_BLOCKS
-------------------- -----------
IOT1_PK                      504

SQL> set autotrace traceonly
SQL> select max(v2) from iot1;

---------------------------------------------------------------------------------
| Id  | Operation             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |         |     1 |     4 | 99793   (1)| 00:00:04 |
|   1 |  SORT AGGREGATE       |         |     1 |     4 |            |          |
|   2 |   INDEX FAST FULL SCAN| IOT1_PK |   100K|   390K| 99793   (1)| 00:00:04 |
---------------------------------------------------------------------------------

     100376  consistent gets
       8052  physical reads

SQL> set autotrace off

Oracle reports 504 leaf blocks in the index segment and 8074 blocks in the overflow segment. The physical reads for this query almost exactly match the number of blocks in the overflow segment, which indicates that more is going on than a simple index fast full scan. The cause is the overflow segment, since that is where the V2 values reside. Because of how the IOT is designed, Oracle cannot simply scan the overflow segment on its own; it must visit the index and then probe the overflow segment for the desired value or values. This is confirmed by the "table fetch continued row" statistic, which was close to the value reported for "consistent gets": the overflow segment essentially contains chained rows (chained to the primary key entries), so a "table fetch continued row" count that large is exactly what one would expect from an IOT constructed like the one in this example. This increases the cost of using an IOT, sometimes considerably.
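One way to verify this in your own session (assuming SELECT access to the V$ views) is to check the "table fetch continued row" statistic before and after running the query against the IOT:

select sn.name, ms.value
from   v$statname sn,
       v$mystat   ms
where  sn.statistic# = ms.statistic#
and    sn.name = 'table fetch continued row';

The block counts are also consistent with simple arithmetic: 100,000 rows carrying roughly 520 bytes of overflow columns each is about 52MB of data, which, allowing for block overhead and free space in 8KB blocks, lands in the neighborhood of the 8074 overflow blocks reported.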

When used under the proper conditions an IOT can be a performance booster, especially for small lookup tables with short rows. Trying to shoehorn an OLTP application table into an IOT with an overflow segment may prove less than desirable, as performance can suffer when Oracle must scan that overflow segment to return the desired values.


David Fitzjarrell
David Fitzjarrell has more than 20 years of administration experience with various releases of the Oracle DBMS. He has installed the Oracle software on many platforms, including UNIX, Windows and Linux, and monitored and tuned performance in those environments. He is knowledgeable in the traditional tools for performance tuning – the Oracle Wait Interface, Statspack, event 10046 and 10053 traces, tkprof, explain plan and autotrace – and has used these to great advantage at the U.S. Postal Service, American Airlines/SABRE, ConocoPhilips and SiriusXM Radio, among others, to increase throughput and improve the quality of the production system. He has also set up scripts to regularly monitor available space and set thresholds to notify DBAs of impending space shortages before they affect the production environment. These scripts generate data which can also be used to trend database growth over time, aiding in capacity planning. He has used RMAN, Streams, RAC and Data Guard in Oracle installations to ensure full recoverability and failover capabilities as well as high availability, and has configured a 'cascading' set of DR databases using the primary DR databases as the source, managing the archivelog transfers manually and monitoring, through scripts, the health of these secondary DR databases. He has also used ASM, ASMM and ASSM to improve performance and manage storage and shared memory.
