Oracle Indexing - What, Where, When?
April 10, 2008
If you are new to databases, or perhaps new to Oracle, you may find the discussion on indexes and indexing strategy complicated. Don't fret. To get started it's fairly straightforward, and as long as you pay attention to the options relevant to day-to-day dba needs, it should remain fairly simple.
Common Usage Indexes
The most common index type is the b-tree index. It is named b-tree after a computer science construct of the same name. Whenever you issue the basic CREATE INDEX statement without further modifications, you're creating a b-tree index. Without going into a lot of depth about b-trees which you can investigate on your own, basically these store the values of the column you have created the index on, and pointers to the actual table data to find the row itself. Keep in mind that also means multiple lookups, one for various nodes and the leaf node of the index, and then the table row itself. That's why Oracle's optimizer will choose in certain circumstances to do full table scans rather than index lookup, because it may actually be faster. Also note that if your index is on more than one column, the leading column is very important. For example if you have a multi-column (called concatenated) index on columns c and d in that order, you can also do a query on column c alone and use that index. In some other cases using Oracle's skip- scan technology, one can do a query on non-leading columns as well. Do a google search on "skip scan index access" for details.
We mentioned that Oracle would choose not to use an index sometimes, if you're reading a lot of rows, or your index is not selective, or you're using a column other than the leading one in a concatenated index. What about if you want to do a case-insensitive search? Something like:
WHERE UPPER(first_name) = 'JOHN'
This won't use an index on first_name. Why? Because Oracle would have to go and apply the UPPER function on ALL values in the index, so it might as well do the full table scan. This was such a common need that Oracle created the function-based index for this purpose.
reverse key indexes
You also may see these indexes, or want to use them from time to time. Consider a column, which includes names like "restaurant A", "restaurant B", "restaurant C" and so on. Perhaps a not very glamorous example, but the point is a column with many unique values but not much variation at the front. Using a reverse-key index would be ideal here, because Oracle will simple REVERSE the string before throwing it into the b-tree. So, the result will be a more balanced, useful, and ultimately fast index.
More Exotic Index Types on Offer
Oracle offers quite a few more sophisticated types of indexes as well. Please note, these should be used after you've fully read the docs as they fill very specific niches.
Have a column, which is not very selective, such as gender? You might consider using a bitmap index on it. That's what they were created for. But also consider what's happening behind the scenes. Generally bitmap indexes become useful when you have a whole bunch of them on different columns so that they can all be used together to be more selective on rows that otherwise you'd need a full table scan for. So one, use them when you can have quite a few on different columns. Secondly, these indexes were designed for data warehouses, so the presumption is data that does not change much. They are not meant for transactional or high update databases. Updates on tables with bitmap indexes are, shall we say, less than efficient.
bitmap join indexes
These indexes take bitmap indexes one step further. They completely take the bitmapped columns out of the table data, and store it in the index. The presumption is that those sets of columns will always be queried together. Again, these are meant for data warehousing databases. The create statement looks like a CREATE BITMAP INDEX except it has a WHERE clause at the end!
This is really an option to a normal b-tree index. It results in fewer leaf nodes, so less overall I/O and less to cache. All of this means Oracle's optimizer will be just a little more likely to use these than a normal uncompressed index. There is a cost to all of this, in CPU to uncompress when you access these. Also, in reading about how the optimizer uses these, and choosing the proper compression level, it begins to look daunting. Your mileage may vary.
These are a special type of function-based index. They are obviously optimized for ORDER BY x, y, z DESC clauses.
If you have a partitioned table, a whole world of new index types opens up to you, from ones that index across all the partitions (global) to ones that are focused on each partition individually (local). Check the documentation for details.
index organized tables
Imagine you take your concatenated index, and extend it to all the columns in the table. Then you can remove the table itself. That's what an index organized table is.
I personally have never seen these in the wild. All I've read is they have performance issues left and right. Basically, you take two tables with one column in common, and that column has a cluster index on it.
These indexes are used when creating custom indextypes for user defined datatypes.
These are new in 11g. They are created as a normal index, but invisible to the cost based optimizer. This can allow you to test out performance of large queries, without impacting the live running application.
virtual (no segment) indexes
Another tool for the testers and developers. They allow you to test new indexes and their effect on query plans, without actually building them. On gigabyte tables, the index build can be very resource intensive, and take a lot of time. See also the Virtual Index Wizard of OEM.
There are other types of indexes as well, such as Oracle TEXT for indexing CLOB or other large text data, and Oracle Spatial. Investigation of those is left as an exercise to the reader.
It's All About the Optimizer
Having worked extensively with MySQL, and some other databases, I can tell you it is not Oracle's user-friendliness that makes it the world leader. I could go digress on this point, but primarily Oracle's bread and butter is it's optimizer. This is the special sauce. And it keeps getting better and better. There are whole books written on the topic of the Oracle's CBO (Cost Based Optimizer) discussing hints (comments embedded in SQL to push the optimizer one way or another), strategies for analyzing your tables and indexes, and histograms for those finicky columns where data distribution is not balanced.
Besides keeping your statistics up to date, you'll want to always test your new queries. Use the explain plan mechanism, and optimize to reduce overall I/O and computational sorting and merging of data, and you will be on the path to better performance.
Although the Oracle landscape of index types can be intimidating, there are really only a few that you'll use most often day-to-day. Furthermore, the optimizer has gotten so good that despite what the naysayers may go on about; on the whole Oracle is good at getting your data efficiently. That doesn't mean you don't need to tune your SQL, but it does mean if you keep your statistics up to date, and ask Oracle for the minimal dataset you need, it'll probably get it for you lickety-split!