NULL values can be a performance issue as, absent an index of some sort, searching for them involves a full table scan. For small tables (less than 10,000 rows) this usually isn’t a time sink; however, in enterprise tables (with millions of rows) a full table scan can consume considerable time. Oracle provides two ways to create an index that can reliably be used with nulls; a bitmap index and a function-based index. Let’s look at these two in action and see when each is applicable.
Start by creating at table with nullable and non-nullable columns:
SQL> --
SQL> -- Create a table with nullable columns
SQL> --
SQL>
SQL> create table nulltest(
2 my_id number not null,
3 could_be_null varchar2(40),
4 is_not_null varchar2(40) not null);
Table created.
NULL values are selectively loaded into the table using the mod() function, and statistics are gathered:
SQL>
SQL> --
SQL> -- Load data so that indexes
SQL> -- will be able to find NULLs
SQL> --
SQL> -- Only FBIs and bitmap indexes
SQL> -- do this
SQL> --
SQL>
SQL> begin
2 for i in 1..1000 loop
3 if mod(i,73) = 0 then
4 insert into nulltest
5 values(i, null, 'This is not a null value');
6 else
7 insert into nulltest
8 values(i, 'In among the NULLs','This is still not a null value');
9 end if;
10 end loop;
11
12 commit;
13 end;
14 /
PL/SQL procedure successfully completed.
SQL>
SQL> --
SQL> -- Gather stats
SQL> --
SQL>
SQL> exec dbms_stats.gather_table_stats(null, 'NULLTEST');
PL/SQL procedure successfully completed.
A common query for NULL values is executed against the table with the expected results:
SQL>
SQL> --
SQL> -- Query for NULL values
SQL> --
SQL> -- Autotrace to get the plan
SQL> --
SQL> -- No surprise, it will be a full table scan
SQL> -- and the my_id values are not in order from
SQL> -- low to high
SQL> --
SQL>
SQL> set autotrace on
SQL> select * From nulltest where could_be_null is null;
MY_ID COULD_BE_NULL IS_NOT_NULL
---------- ---------------------------------------- ----------------------------------------
511 This is not a null value
584 This is not a null value
657 This is not a null value
73 This is not a null value
146 This is not a null value
219 This is not a null value
292 This is not a null value
365 This is not a null value
438 This is not a null value
730 This is not a null value
803 This is not a null value
876 This is not a null value
949 This is not a null value
13 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3036052294
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 13 | 702 | 5 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| NULLTEST | 13 | 702 | 5 (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("COULD_BE_NULL" IS NULL)
SQL>
A function-based index can be used to speed access to rows containing NULL values, which is a very good plan if the application code is already using a NVL() construct to filter NULL values:
SQL>
SQL> --
SQL> -- Let's try to fix that with an FBI
SQL> --
SQL> -- Works but the function found in the index
SQL> -- MUST be used in the query
SQL> --
SQL> -- Good for application code already using functions
SQL> -- to manage NULL values as the function in the code
SQL> -- will be used in the index
SQL> --
SQL> -- The bad part of that is additional application
SQL> -- code that doesn't use exactly the same function
SQL> -- call won't use the index
SQL> --
SQL>
SQL> create index nulltest_fbi on nulltest(nvl(could_be_null, 'X'));
Index created.
SQL>
SQL> --
SQL> -- Query for NULL values
SQL> --
SQL> -- Autotrace to get the plan
SQL> --
SQL> -- Existing application code will work
SQL> --
SQL> -- New application code won't if the function
SQL> -- call isn't exactly the same
SQL> --
SQL>
SQL> --
SQL> -- Works, matches application code
SQL> --
SQL> select * From nulltest where nvl(could_be_null, 'X') = 'X';
MY_ID COULD_BE_NULL IS_NOT_NULL
---------- ---------------------------------------- ----------------------------------------
511 This is not a null value
584 This is not a null value
657 This is not a null value
73 This is not a null value
146 This is not a null value
219 This is not a null value
292 This is not a null value
365 This is not a null value
438 This is not a null value
730 This is not a null value
803 This is not a null value
876 This is not a null value
949 This is not a null value
13 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2411740502
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 540 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| NULLTEST | 10 | 540 | 3 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | NULLTEST_FBI | 4 | | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(NVL("COULD_BE_NULL",'X')='X')
SQL> --
SQL> -- Doesn't 'work', doesn't match application code
SQL> --
SQL> select * From nulltest where nvl(could_be_null, 'Y') = 'Y';
MY_ID COULD_BE_NULL IS_NOT_NULL
---------- ---------------------------------------- ----------------------------------------
511 This is not a null value
584 This is not a null value
657 This is not a null value
73 This is not a null value
146 This is not a null value
219 This is not a null value
292 This is not a null value
365 This is not a null value
438 This is not a null value
730 This is not a null value
803 This is not a null value
876 This is not a null value
949 This is not a null value
13 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3036052294
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 756 | 5 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| NULLTEST | 14 | 756 | 5 (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(NVL("COULD_BE_NULL",'Y')='Y')
SQL>
Hopefully the same developer that wrote the code originally is still around managing the package or procedure so that additional code, filtering NULL values, will use the same function originally called. If, on the other hand, a new developer adds code that doesn’t match the original NVL() call (a real possibility with modular code), then the index will, unfortunately, not be used, as illustrated above. One way around this issue is to use a bitmap index, an index structure that does index NULL values. If you drop the original function-based index and replace it with a bitmap version you see:
SQL>
SQL> --
SQL> -- Drop the FBI and start over
SQL> --
SQL>
SQL> drop index nulltest_fbi;
Index dropped.
SQL>
SQL> --
SQL> -- Create a bitmap index instead
SQL> --
SQL> -- Works without code modification
SQL> --
SQL>
SQL> create bitmap index nulltest_bmp_idx on nulltest(could_be_null);
Index created.
SQL>
SQL> --
SQL> -- Query for NULL values
SQL> --
SQL> -- Autotrace to get the plan
SQL> --
SQL> -- Since it's a 'straight' "IS NULL" condition
SQL> -- the index works
SQL> --
SQL> -- Not dependent on application coding
SQL> --
SQL>
SQL> --
SQL> -- Expected to work here
SQL> --
SQL> select * From nulltest where could_be_null is null;
MY_ID COULD_BE_NULL IS_NOT_NULL
---------- ---------------------------------------- ----------------------------------------
511 This is not a null value
584 This is not a null value
657 This is not a null value
73 This is not a null value
146 This is not a null value
219 This is not a null value
292 This is not a null value
365 This is not a null value
438 This is not a null value
730 This is not a null value
803 This is not a null value
876 This is not a null value
949 This is not a null value
13 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2563750542
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 13 | 702 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| NULLTEST | 13 | 702 | 3 (0)| 00:00:01 |
| 2 | BITMAP CONVERSION TO ROWIDS | | | | | |
|* 3 | BITMAP INDEX SINGLE VALUE | NULLTEST_BMP_IDX | | | | |
--------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("COULD_BE_NULL" IS NULL)
SQL> --
SQL> -- Even works here
SQL> --
SQL> select count(*) from nulltest where could_be_null is null;
COUNT(*)
----------
13
Execution Plan
----------------------------------------------------------
Plan hash value: 4083363316
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 19 | 1 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 19 | | |
| 2 | BITMAP CONVERSION COUNT | | 13 | 247 | 1 (0)| 00:00:01 |
|* 3 | BITMAP INDEX FAST FULL SCAN| NULLTEST_BMP_IDX | | | | |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("COULD_BE_NULL" IS NULL)
SQL>
With a bitmap index, the plan shows the index was accessed to find the NULLs; however, the predicate information shows that Oracle filtered the results, rather than use them to directly access the rows as in the prior query. The fast full scan returns all of the index entries, and Oracle then filters out the null-referencing bits. It’s still much faster than a full table scan to find NULL values.
A ‘traditional’ B-Tree index can also include NULLs if the entire key isn’t NULL, but the nullable column must be first in the column list or else Oracle won’t use it. Create an index with the nullable column second in the list and see what happens:
SQL>
SQL> --
SQL> -- Let's try to index more than one column
SQL> --
SQL> -- Drop the bitmap index first
SQL> --
SQL>
SQL> drop index nulltest_bmp_idx;
Index dropped.
SQL>
SQL> --
SQL> -- Create a plain-vanilla index with
SQL> -- a not-null and nullable column
SQL> --
SQL>
SQL> create index nulltest_idx on nulltest(is_not_null, could_be_null);
Index created.
SQL>
SQL> --
SQL> -- Let's see if Oracle uses it
SQL> --
SQL>
SQL> set autotrace on
SQL> select * From nulltest where could_be_null is null;
MY_ID COULD_BE_NULL IS_NOT_NULL
---------- ---------------------------------------- ----------------------------------------
511 This is not a null value
584 This is not a null value
657 This is not a null value
73 This is not a null value
146 This is not a null value
219 This is not a null value
292 This is not a null value
365 This is not a null value
438 This is not a null value
730 This is not a null value
803 This is not a null value
876 This is not a null value
949 This is not a null value
13 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3036052294
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 13 | 702 | 5 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| NULLTEST | 13 | 702 | 5 (0)| 00:00:01 |
------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("COULD_BE_NULL" IS NULL)
SQL>
Now reverse the column order and try again:
SQL>
SQL> --
SQL> -- Let's try again, reversing the column order
SQL> --
SQL>
SQL> drop index nulltest_idx;
Index dropped.
SQL>
SQL> create index nulltest_idx on nulltest(could_be_null, is_not_null);
Index created.
SQL>
SQL> --
SQL> -- Let's see if Oracle uses it
SQL> --
SQL>
SQL> set autotrace on
SQL> select * From nulltest where could_be_null is null;
MY_ID COULD_BE_NULL IS_NOT_NULL
---------- ---------------------------------------- ----------------------------------------
146 This is not a null value
219 This is not a null value
292 This is not a null value
438 This is not a null value
511 This is not a null value
584 This is not a null value
657 This is not a null value
365 This is not a null value
73 This is not a null value
876 This is not a null value
949 This is not a null value
730 This is not a null value
803 This is not a null value
13 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3191058504
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 13 | 702 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| NULLTEST | 13 | 702 | 3 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | NULLTEST_IDX | 13 | | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("COULD_BE_NULL" IS NULL)
SQL>
Remember, a traditional B-Tree index may not be the most reliable index to create for NULL values; the nullable column must be the leading column and the entire index key cannot be NULL. Given those constraints the best choices are to use a function-based index (especially good when application code already uses a NVL() call as the exact call in the code can be used to build the index) or to use a bitmap index (which indexes NULLs absent any function calls). Either one can improve application or query performance by eliminating a full table scan.