An interesting question was posted on the Oracle General Database forums recently inquiring about creating a B-Tree index on a BLOB to speed queries looking for NULLs in that BLOB. The conventional wisdom states that, because it’s a BLOB, NULLs shouldn’t be present. And because it’s a BLOB, containing binary data, creating such an index shouldn’t be possible, but through the magic of Oracle (and a possible bug) that ‘wisdom’ was proven wrong. Let’s look at that situation and see how someone can make an ‘end run’ around Oracle and do the seemingly impossible.
For those who don’t use them or who may not be familiar with the term, a BLOB is a Binary Large OBject, a datatype that stores, well, binary data. As such there should be no NULL values in binary data (it’s all 0s and 1s and the character(?) associated with NULL has a binary representation of 0 [looking at an ASCII table]) the benefit of having such an index is questionable, at best. It IS possible to set a BLOB to NULL, however, as the following example proves:
SQL> update blobbo set yorm = null;
1 row updated.
SQL>
Even with that it can make someone wonder what actual benefit a B-Tree index can provide on binary data. Looking at the complete example that was run may provide some additional insight. It begins, simply enough, by creating a table to ‘experiment on’ and then trying to create a B-Tree index on all columns, including the BLOB:
SQL>
SQL>
SQL> set echo on linesize 150 pagesize 60
SQL>
SQL> create table blobbo(
2 yorm blob,
3 qwert number default 4);
Table created.
SQL>
SQL> create index blobbo_blob_ix on blobbo(yorm, qwert);
create index blobbo_blob_ix on blobbo(yorm, qwert)
*
ERROR at line 1:
ORA-02327: cannot create index on expression with datatype LOB
SQL>
The initial attempt to create the index failed, even with the non-BLOB column specified. One can create a B-Tree index by specifying a constant value along with the column name, which is the workaround for non-BLOB columns to get NULLs in the index [since the entire key is not NULL] so let’s try that avenue of attack:
SQL>
SQL> create index blobbo_blob_ix on blobbo(yorm, 'X');
Index created.
SQL>
And we have a B-Tree index on a BLOB. As shown above, the BLOB can be set to NULL by assignment:
SQL>
SQL> insert into blobbo(yorm) values(empty_blob());
1 row created.
SQL>
SQL> update blobbo set yorm = null;
1 row updated.
SQL>
Since a NULL and the LOB locator established by using the empty_blob() function look the same to the naked eye it’s difficult to tell if the query output shows a NULL exists:
SQL>
SQL> select * from blobbo;
YORM
------------------------------------------------------------------------------------------------------------------------------------------------------
QWERT
----------
4
SQL>
The execution plan shows what is expected, a full table scan for the unqualified query:
SQL>
SQL> select * from table(dbms_xplan.display_Cursor());
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 4gcbmd14ap73j, child number 0
-------------------------------------
select * from blobbo
Plan hash value: 925165977
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 3 (100)| |
| 1 | TABLE ACCESS FULL| BLOBBO | 1 | 2015 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
18 rows selected.
SQL>
Writing a qualified query, to find records where our BLOB is NULL, shows a different, but expected, plan:
SQL>
SQL> select * from blobbo where yorm is null;
YORM
------------------------------------------------------------------------------------------------------------------------------------------------------
QWERT
----------
4
SQL>
SQL> select * from table(dbms_xplan.display_Cursor());
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID cz0x5cbkwqkbu, child number 0
-------------------------------------
select * from blobbo where yorm is null
Plan hash value: 1649995771
------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 1 (100)| |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| BLOBBO | 1 | 2015 | 1 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | BLOBBO_BLOB_IX | 1 | | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("YORM" IS NULL)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
24 rows selected.
SQL>
Setting the BLOB to an actual, non-NULL value and querying the table shows the expected non-index plan, because we have no function-based index on the BLOB (and actually, would have no reason to attempt to create one):
SQL>
SQL> update blobbo set yorm = hextoraw(rpad('EAEAEAFF',32764, 'CDEF')) where qwert = 4;
1 row updated.
SQL>
SQL> commit;
Commit complete.
SQL>
SQL> select * from blobbo where yorm is not null;
YORM
------------------------------------------------------------------------------------------------------------------------------------------------------
QWERT
----------
EAEAEAFFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCD
EFCDEFCDEF
4
SQL>
SQL> select * from table(dbms_xplan.display_Cursor());
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 3d24twzgz2zry, child number 0
-------------------------------------
select * from blobbo where yorm is not null
Plan hash value: 925165977
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 3 (100)| |
|* 1 | TABLE ACCESS FULL| BLOBBO | 1 | 2015 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("YORM" IS NOT NULL)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
23 rows selected.
SQL>
SQL> select * from blobbo where dbms_lob.instr(yorm,hextoraw('EAEAEAFF')) > 0;
YORM
------------------------------------------------------------------------------------------------------------------------------------------------------
QWERT
----------
EAEAEAFFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCDEFCD
EFCDEFCDEF
4
SQL>
SQL> select * from table(dbms_xplan.display_Cursor());
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID bj30wyv2sf5jg, child number 0
-------------------------------------
select * from blobbo where dbms_lob.instr(yorm,hextoraw('EAEAEAFF')) > 0
Plan hash value: 925165977
----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 3 (100)| |
|* 1 | TABLE ACCESS FULL| BLOBBO | 1 | 2015 | 3 (0)| 00:00:01 |
----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("DBMS_LOB"."INSTR"("YORM",HEXTORAW('EAEAEAFF'))>0)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
23 rows selected.
SQL>
As mentioned earlier the ability to create such an index on a BLOB is likely a bug that should be addressed, but likely won’t be because it’s not common to want a B-Tree index on a BLOB, much less common to create one. One hint that this might be a bug is the fact that an index key can only be as long as 80% of the column length, and since a BLOB easily exceeds the maximum length of a VARCHAR2 field, problems can arise from such an index. With the index in place on our BLOB, trying to get the length of it throws an error:
SQL> select dbms_lob.getlength(yorm) from blobbo;
select dbms_lob.getlength(yorm) from blobbo
*
ERROR at line 1:
ORA-01555: snapshot too old: rollback segment number 1 with name "_SYSSMU1_1607784644$" too small
ORA-06512: at "SYS.DBMS_LOB", line 837
SQL>
Dropping the ‘errant’ index solves the problem:
SQL>
SQL> drop index blobbo_blob_ix;
Index dropped.
SQL>
SQL> select dbms_lob.getlength(yorm) from blobbo;
DBMS_LOB.GETLENGTH(YORM)
------------------------
16382
SQL>
Even though a B-Tree index on a BLOB column can be created it seems to be of limited, if not questionable, use since NULLs (as ‘seen’ in character data) can’t exist in the binary data stream. It also appears to create problems when performing standard operations on a BLOB, such as finding the length.
Sometimes what