Optimizer hints are an
interesting feature of Oracle. On one hand, the documentation goes out of its
way to tell you to use them sparingly, but on the other hand, you can choose
from more than 60 hints in 10g and more than 70 in 11g (64 and 71, to be
precise). That’s a lot to choose from, and even with a decent working knowledge
of them, you probably cannot accurately describe more than 15 to 20 of them.
You would be correct to
assume that the hints are categorized, but even with that assumption, can you
name the categories? If not by the category name Oracle uses, then if asked
about different tuning or usage scenarios, a moderately experienced DBA can hit
upon the purpose or function of a category. These questions aren’t meant to be “Trivia
Pursuit, the Oracle edition,” but rather, a means to identify and clarify some
aspects of optimizer hints.
The optimizer goals are also
values for the OPTIMIZER_MODE initialization parameter, so there is a minor bit
of crossover between a statement-level hint and a system-level setting.
The table below shows the
hints (from 11g) by category and name (and number). Italicized and asterisked
(and red if reading on the Web) hints are deprecated and are not counted in the
71 mentioned earlier. Bold font hints are new in 11g, and RULE is no longer
supported.
Optimization |
Access |
Other |
Join |
ALL_ROWS FIRST_ROWS
|
CLUSTER FULL HASH INDEX NO_INDEX INDEX_ASC INDEX_DESC INDEX_COMBINE INDEX_JOIN INDEX_FFS INDEX_SS INDEX_SS_ASC INDEX_SS_DESC NATIVE_FULL_OUTER_JOIN NO_NATIVE_FULL_OUTER_JOIN NO_INDEX_FFS NO_INDEX_SS |
APPEND NOAPPEND CACHE NOCACHE CURSOR_SHARING_EXACT DRIVING_SITE DYNAMIC_SAMPLING MODEL_MIN_ANALYSIS MONITOR NO_MONITOR OPT_PARAM PUSH_PRED NO_PUSH_PRED PUSH_SUBQ NO_PUSH_SUBQ PX_JOIN_FILTER NO_PX_JOIN_FILTER QB_NAME RESULT_CACHE NO_RESULT_CACHE |
USE_HASH NO_USE_HASH USE_MERGE NO_USE_MERGE USE_NL USE_NL_WITH_INDEX NO_USE_NL |
Join |
Query |
XML (2) |
Parallel |
ORDERED LEADING |
FACT NO_FACT MERGE NO_MERGE NO_EXPAND USE_CONCAT REWRITE NO_REWRITE NOREWRITE* UNNEST NO_UNNEST STAR_TRANSFORMATION NO_STAR_TRANSFORMATION NO_QUERY_TRANSFORMATION |
NO_XMLINDEX_REWRITE NO_XML_QUERY_REWRITE |
PARALLEL NOPARALLEL* NO_PARALLEL PARALLEL_INDEX NO_PARALLEL_INDEX NOPARALLEL_INDEX* PQ_DISTRIBUTE |
The deprecated hints are
just name changes to match other NO_WHATEVER formatting.
Several changes are found in
the installation footprint for 11g. One major addition is that of Oracle
Warehouse Builder. Another is the inclusion of SQL Developer. After starting
SQL Developer, look over on the top right part of the window and click the
“Snippets” button. If the button is not present, make it visible via the View
> Snippets menu. Change the drop-down selector to Optimizer Hints. The
default list contains 56 hints. Several index hints, MONITOR, the NATIVE ones,
OPT_PARAM, to name a few, are not listed. In other words, none of the new hints
in 11g are included in SQL Developer. That doesn’t mean they can’t be used –
the point is that the syntax is not automatically included for you. However,
you do have the option of adding more to the library.
Hint Overkill
Let’s come back to the
admonition about using hints sparingly. Are you a better coder than what Oracle
comes up with via an execution plan? I would say there are plenty of developers
and DBAs (i.e., whomever is coding SQL) who can match what Oracle comes up with
without ever having to use a hint. Any why should they have to? Having
recent/valid statistics and following best practices or recommendations for
crafting SQL statements more often than not will not need any pushing or
nudging of the optimizer.
Here is an extract from an
AWR report covering a 24-hour period. It’s more of a big picture across the day
as opposed to something more granular like an hour or so. The top five timed
events are pretty typical (as in not obscure), but how much of them is good (or
bad)?
This is an order of
magnitude difference between db file sequential read (index usage) and its
counterpart, db file scattered read (table scan). Is that good or bad? Let’s
see what’s up with background wait events. Why would a background wait even be
of interest here?
This system has millions of
waits related to log files. What’s happening with those? Using Toad (or a query
where you can get the same information), how frequently are the log files
switching?
During business hours, this
database is being slammed with log file switches. What generates input into the
redo logs? Not only DML on tables, but also what takes place with indexes, that
is, index maintenance.
The investigative path, so
far, started with a report. In actuality, it could have started with complaints
from users about an application being slow, or appearing to have become slower
lately. There was a big number for a common event. The big number by itself
doesn’t necessarily mean anything until it is put into context with related
events, items, or statistics. Sequential read and heavy redo log switching
frequency sounds like there could be a problem with too many indexes being
used, or used in the sense that redundant indexes are also being updated along
with the essential or set-covering ones. To confirm this hypothesis, we need to
look at the SQL statements, and this is where we come back to the use of hints.
In SQL ordered by Gets, the top lines bear investigating.
And further down there is a
lot of the same SQL text (a lot more than what the picture shows, but you get
the idea).
Drilling down to the SQL ID,
the jobs all point back to the same package and subprograms. Crack open the
code and it becomes almost a foregone conclusion where part of the problem lay:
lots and lots of hints, and bad ones at that (cursor name is partially obscured
for privacy).
Using the INDEX (table name
or alias, or indexspec) hint, now formally knowing it is an access path
hint, is based on what knowledge? That you want to tell Oracle to do what it
wants to do in the first place (find the best execution plan for you) or tell
Oracle that it must use an index (if it exists) when, in fact, a full table
scan would be more efficient? Or consider this: some developer is coding in
index hints that are essentially useless. Case in point: explain plans with and
without an index hint are shown below.
Base table is MY_OBJECTS,
which is a CTAS from ALL_OBJECTS (and table is analyzed after creating an index
name IDX_MY_OBJ).
SQL> explain plan for 2 select /*+ index (a) */ 3 object_name 4 from my_objects a 5 where object_type = 'INDEXTYPE'; Explained. SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------------------- Plan hash value: 28651213 ------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 18 | 648 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| MY_OBJECTS | 18 | 648 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IDX_MY_OBJ | 18 | | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("OBJECT_TYPE"='INDEXTYPE') 14 rows selected. SQL> explain plan for 2 select object_name 3 from my_objects 4 where object_type = 'INDEXTYPE'; Explained. SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT -------------------------------------------------------------------------------------------- Plan hash value: 28651213 ------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 18 | 648 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| MY_OBJECTS | 18 | 648 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | IDX_MY_OBJ | 18 | | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("OBJECT_TYPE"='INDEXTYPE') 14 rows selected.
No surprise, Oracle wanted
to use that index anyway (it’s very selective). Let’s drop the index and run it
again.
SQL> drop index idx_my_obj; Index dropped. SQL> explain plan for 2 select object_name 3 from my_objects 4 where object_type = 'INDEXTYPE'; Explained. SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT --------------------------------------------------------------------------------- Plan hash value: 880823944 -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 18 | 648 | 118 (3)| 00:00:02 | |* 1 | TABLE ACCESS FULL| MY_OBJECTS | 18 | 648 | 118 (3)| 00:00:02 | -------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("OBJECT_TYPE"='INDEXTYPE')
We incurred a full table
scan, but it wasn’t too painful as the table is not that big. Now, create an
index where OBJECT_TYPE is at the end, and force Oracle to consider all indexes
on that table (which is only one, but one is enough).
SQL> create index idx_big on my_objects(owner, object_id, data_object_id,timestamp, object_name); Index created. SQL> exec dbms_stats.gather_table_stats('SCOTT','MY_OBJECTS',cascade => true); PL/SQL procedure successfully completed. SQL> explain plan for 2 select /*+ index (a) */ 3 object_name 4 from my_objects a 5 where object_type = 'INDEXTYPE'; Explained. SQL> select * from table(dbms_xplan.display); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------ Plan hash value: 85280455 ------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 36 | 1487 (1)| 00:00:18 | |* 1 | TABLE ACCESS BY INDEX ROWID| MY_OBJECTS | 1 | 36 | 1487 (1)| 00:00:18 | | 2 | INDEX FULL SCAN | IDX_BIG | 40859 | | 392 (1)| 00:00:05 | ------------------------------------------------------------------------------------------
Although cost isn’t always
an accurate discriminator between plans, when you consider rows evaluated,
bytes, cost and time, you can plainly see that forcing a relatively bad index
on a statement, especially by being lazy (more on that in a moment) is, well,
dumb.
Using INDEX by itself means
someone probably doesn’t know what they’re doing. In fact, this person may be
doing more damage than good by using an inappropriate hint. Damage in this case
refers to poor performance and unnecessary resource usage. Those archived redo
logs take up space too.
What’s even worse about this
database is that tables are over-indexed. There are tables with an index based
on columns A, B, and C, an index on C, B, A, an index on B, C, and yet another
on A, C and B. What does this represent? A DBA who has no idea how to tune and
thinks that matching an index to every WHERE clause contained in the code ready
to be put into production will make things better. The truth is, the DBA made a
significant contribution to the poor performance of this database.
In Closing
Hints, as Oracle recommends,
should be used sparingly. When and where is that? Sorry to be ambiguous, but
the answer is: it depends. For whatever reason, using hint X may make a
difference (for the better, obviously) in an execution plan. How do you
discover this? Under some narrow conditions, and also by trial and error. Maybe
the plan is based on bad statistics that cannot be changed, so try something
else hint-wise is one situation. It just depends.
The take-away here is this:
Oracle has lots of hints to choose from. Know what they are and how they are
different from one another before tossing them into production (plus don’t
forget to stay abreast of changes in what’s available in the first place). With
good statistics, you normally/generally/usually need not ever include them into
DML and select statements.