Getting Inside the Optimizer
June 27, 2007
Everyone has them, even you. Dont be afraid to admit you have some less than optimal code consuming resources in one or more of your production databases. The tuning advisors output of, Hey DBA, you should do a better job tuning statement with a SQL ID 8ycxdfgyre2 is useful, but what happens when the session information is gone? The output shown below isnt exactly very helpful either.
When Oracle is looking at other (and hopefully better) plans, what does the optimizer look at? Put another way, if you had a tool that could display different execution plans, wouldnt it be useful to see what those plans are, and even better, pick up some ideas on how to improve the efficiency of your SQL code? Toads SQL Optimizer tool can do just that, and its interesting to see what Toad (or Oracle behind the scene) comes up with in terms of alternative execution plans.
Ill offer up some code in a production environment Im familiar with (its a data warehouse) and toss it into Toads SQL Optimizer. Our guinea pig code selects several items from a joined view and table, and is written as shown below.
Select Product_id, Email_Format_id, Build_id, Email_Address_id, Email_id, 'V', Build_Date, trunc(build_date), Email_Message_id, Content_Sub_Product, Dw_Unique_ID, Domain_Id FROM Build_Stage_View a, Email_Address_Dim b WHERE a.email_address = b.email_address AND build_date = :i_build_date AND product_id = :i_product_id AND build_id = :i_build_id AND email_format_code = :i_email_format_code AND log_id is null;
The steps to get to the different execution plans (and their associated costs) are to open a SQL Editor window in a session as the owner of the code/objects, paste in the code, click on the tuning fork icon, wait for the initial analysis to be completed, and then click Optimize in the icon array within the SQL Optimizer tabbed canvas.
There are two options you can set, and these are analogous to setting a value for deep thinking in a chess program. In the top right area of the Toad interface, there are what appear to be hyperlinks for setting the optimization and index expert intelligence levels. I set the levels to 5 for this example, and Toad originally had them at 4. The higher the level, the longer it will take to get results.
Shown below is what Toad returns to you.
The first item that stands out is that the Original SQL was (relatively speaking) good to begin with as it shares the lowest overall cost with another option. Is this outcome due to luck, or was some thought put into how the original statement was crafted to begin with? Having the advantage of knowing the schema, the original SQL was written following two very basic, and as it turns out, effective guidelines found in many books and in Oracles documentation: apply the most restrictive filtering early on in the WHERE clause, and use indexes on columns involved in joins.
In the seven alternatives presented by Toad, none of them involved rewriting of the original statement. All of the alternatives used hints, and just using hints alone made the cost vary from 376 to 6180. The use of a hint in newer versions of Oracle should be avoided unless you know something the CBO doesnt or wont know. In RBO (rule based optimizer) days, using hints was pretty much required if you wanted to really tweak performance.
The hints used in the alternatives are (in order of lowest to highest resulting cost):
Hints are explained in Chapter 16 of the Performance Tuning Guide, and the details about the hints are contained in Chapter 2 (under Comments) in the SQL Reference Guide. Using the USE_MERGE hint is fairly common when joined tables are involved, as are USE_NL and USE_HASH. In this example, however, tossing in a USE_MERGE hint made that alternative the more costly plan.
To Index or Not?
Aside from cost, seeing the other execution plans is valuable in that you can see where, and just as importantly, where/when not, indexes are used.
Putting an index on a table does not mean that index will be used. It all boils down to selectivity. If Oracle determines that a full table scan is better (or required), then the index is not used. Step 6 shows that a full table scan was used, and the comment about that step (also shown in Toad in a dialog frame) affirms that output by stating Every row in the table <OWNER.TABLE_NAME> is read.
This brings up another point: if the index was not used, is it ever used? After all, whats the point of creating and maintaining an index if it is never used? A quick and easy way to determine whether or not an index is being used is to turn on monitoring, and yes, you can have Oracle monitor more than one index at a time. ALTER INDEX <INDEX_NAME> MONITORING USAGE turns it on, and NOMONITORING USAGE turns it off. To view the usage (in terms of yes or no), run a query against the V$OBJECT_USAGE dynamic view.
It should be obvious that just because a particular statement (i.e., one that you are investigating/analyzing) doesnt use an index that no statements will not or are not at some point in the future. It may be days before an index is used, so give the monitoring enough time to watch the index usage. A week should be long enough to be able to make that determination without having to hunt down every job or cron. On the other hand, you may need to selectively target when monitoring takes place if the statement in question is related to an end of month type of scenario.
Another factor in all of whats been presented so far concerns statistics, and if you want to be more sophisticated, toss in stored outlines. A discussion about gathering statistics and using outlines goes beyond the scope of this article, but as a minimum, feeding fresh information (statistics) into the optimizer is an important consideration.
In a follow up article to this one, well take a look at query rewrites. Alternative scenarios involving query rewrites generated by Toads SQL Optimizer are also quite informative in that they can show you better ways or different methods of coding a statement. From an instructional standpoint, the query rewrites can help reinforce best practices and help clear up questions about how to approach common situations. One such situation is the EXISTS versus IN question. Seeing a rewrite of an in-line query makes the rule easy to follow: larger table on the outside uses IN on the smaller table, and smaller table on the outside uses EXISTS on the larger table.