1. What is Optimizer?
In
Oracle, a query may be executed in more than one way. The execution plan that
has the best ranking or the lowest cost is the one that will return output with
the fastest rate and optimal utilization of resources. The execution plan is
generated by the Optimizer. Optimizer is an 'engine' running in the database
that is dedicated to deriving a list of execution paths based on various
conditions and then choosing the most efficient for running a query. Once an
execution plan choice is made, it is then carried out to arrive at the output.
In
Oracle, Optimizer relates to DML statements.
2. Why Optimize?
You
know it! Optimizing a query aims at executing it in the shortest time and with
optimal use of resources, thus making it fast and efficient. By resources, here
I mean CPU utilization, hard disk I/O, memory consumption and to some extent,
network operations. Irrespective of how big or rich your server is in terms of
these resources, improper or sub-optimal queries will always be expensive and
may drag your session or impact other process on the server.
The
extent to which a query is expensive will depend on lot of factors, including
the size of the result set to be fetched, the size of the data being scanned to
retrieve the result set and the load on the system at that point in time.
Proper optimization of statements will save your users lot of runtime wastage
and unwanted resource utilization.
3. Available Optimizers
Oracle
has two modes for Optimizer to decide on the best execution plan, Rule based
and Cost based. This article concentrates on Cost Based Optimizer and Rule
based is described in brief.
3.1 Rule Based Optimizer (RBO)
RBO
follows a simple ranking methodology. Fifteen ranking points are designed in
this optimizer. When a query is received, the optimizer evaluates the number of
points that are satisfied. The execution path with the best rank (lowest
number) is then chosen for executing the query. The fifteen-point ranking is
mentioned below.
- Single row by ROWID
- Single row by cluster join
- Single row by hash cluster with unique or primary key
- Single row by unique or primary key
- Cluster join
- Hash cluster key
- Indexed cluster key
- Composite key
- Single column indexes
- Bounded range on index columns
- Unbounded range on indexed columns
- Sort merge join
- MAX or MIN on indexed column
- ORDER BY on indexed columns
- Full table scan
For
example, If I fire a query on a table that has two columns that are searched
for exact match (equal-to) in the where clause condition, one being the primary
key and the other column has a non-unique key, RBO will prefer the primary key
(rank 4) to the non-unique key (rank 9).
When
more than one table is accessed in a query, the optimizer needs to decide which
should be the driving table. The RBO generates a set of join orders, each with
a different table as the first table. Then the most optimal plan is chosen from
the resulting set of execution plans.
The
optimizer evaluates the execution plans for various conditions such as (fewest
nested-loop, fewest sort-merge joins, table with the best ranking access path,
etc.). If there is still a tie, the optimizer chooses the execution plan for
which the first table appears later in the query's FROM clause. Hence, it is a
conventional coding practice to put the driving table at the extreme right,
followed by other tables in order of access in the FROM clause, i.e., the
ordering of tables based on their access is from right to left.
Please
note that the operators being used for searching the columns also play a role
in deciding the ranking. Sometimes even the age of an index is considered for
ranking!
For
example the below table shows what index is used if column1 and column2 have
indexes on them and if both are being referred in the where clause with
"=" operator.
Example:
select * from am79 where col1 = 1 and col2 = 'amar';
-- here both col1 and col2 are indexed.
-------------------------------------------------------------------------------------
Normal index types | Index used in RBO
column1(a) column2(b) column1+column2(c) |
-------------------------------------------------------------------------------------
non-unique non-unique c
non-unique non-unique a + b
non-unique non-unique non-unique c
unique non-unique a
unique non-unique a
unique unique b (the most recent index created)
unique unique unique c
-------------------------------------------------------------------------------------
-The above is tested on Oracle 8.1.7.1.
-In case of non-unique single column indexes, both indexes are used.
-In case of unique indexes, they are not combined for execution plan, any one is taken.
-Preference is given to the index available with the "=" operator column, than with
others operators.
-Don't create bitmap & function-based indexes, these will not work in RBO.
-------------------------------------------------------------------------------------
RBO
was the preferred choice for most setups in earlier releases of oracle as the
execution paths were consistent and uniform. Queries would behave the same way
if run on different databases of the same application.