Course Review: Performance Optimization for Developers by Hotsos

In today’s economic conditions, many
companies have made significant cuts in training budgets. Significant even
means no budget. In some cases, if you’re willing to pay for training out of pocket,
you may get to attend on paid time as opposed to taking vacation days. This is
where I found myself in wanting to take a class presented by Hotsos. Training
courses typically run in the range of $500 or more per day, and if you are
paying this cost out of pocket, you want to get your money’s worth. I recently
attended the three day Performance Optimization for Developers – SQL and PL/SQL
class and this article is a review of the training provided by Hotsos.

If you register at Hotsos’ site, you
will receive emails announcing upcoming training events. Many, if not all, of
the class announcements have an incentive of giving you a significant discount
for early registration. The POD3 course lists at $1875, but early registration (if
offered) drives the fee down to $1500. Once you attend a particular course, you
can enroll in future versions using an alumni rate. The POD3 alumni cost is
currently listed at $600. The “parent” version of this course is Optimizing
Oracle SQL, Intensive (OPINT). This course is five days in length and differs
from POD3 in that POD3 is all lecture whereas OPINT is 40% lab exercises. At
five days, the list price is $3600 ($1000 alumni).

The syllabus shown at the Hotsos web
site lists what you will learn, and the course covered exactly that. The list
of topics for POD3 includes:

What is a Logical IO, and why is
it important?
SQL statement execution
fundamentals
Tools for SQL statement problem
diagnosis and repair
Access methods
Coding techniques that may prevent
index use
Join methods
Understanding complex Explain
Plans
What are things that I can do to
write better SQL?
Plus…optimizing PL/SQL

The last item is the third day as
another variation of this course is only two days (2 days of SQL, 1 day of
PL/SQL).

If you’ve tried to become better at
writing efficient SQL statements, chances are you’ve read several books or
online articles about what takes place within the RDBMS when a statement is
issued. Understanding wait events and system events are key to writing
efficient statements. POD3 takes you through an introduction of database
architecture and the internals of the buffer cache. Data – and other bits of
information – is read into and out of the buffer cache. Data goes in via
physical IO, and data goes out via logical IO.

Overall, DBTIME consists of time waited
within the instance plus time executed. The time executed is driven by SQL
tuning or optimization and the main goal is to lower the number of logical IOs.
This leads to the utility of the Hotsos Test Harness.

Every assertion and claim about a
technique is backed up or demonstrated by measurement of results. In this case,
the results are a comparison between the numbers of logical IOs when using one
approach versus another. The harness is used extensively throughout the course
and what it can do is amazingly robust. If you have time before the course, it
would be worthwhile to become familiar with the main driver scripts. It is so
extensive that learning how to use it at a basic level will require several
hours of experimentation, and those are hours better spent on your time as
opposed to during expensive course time.

Hotsos provides the harness for free.
Additionally, all of the course examples are available for download.

downloads

What are some of the specific take-aways
in the course?

Reading an explain plan is covered in
detail. True or false: an explain plan outlines what the optimizer will do. If
you want to know what Oracle actually did, you need to trace the execution of a
statement. The explain plan is nothing more than an estimate of what the
optimizer will do, so the answer is false. In a building block approach, once
the fundamentals of reading and understanding an explain are explained, the
course moves on to a coverage of access and join methods.

Access and join methods go the heart of
getting data back to a user. How does Oracle access the data (by index or table
scan) and how does it join rowsets, or intermediate results? This section
covers how Oracle works in index access techniques (index unique scan, range
scan, skip scan, full scan, fast full scan and join). When looking at statistics, you will understand what the "buffer
is pinned count" statistic is and how it relates to table access by index
rowed. If you go to all the effort of creating indexes, then you should
also know what to look for in terms of operations that prevent or bypass the
use of them, and this is covered as well.

A very good explanation of join methods
includes coverage of nested loops, sort merge, hash join and Cartesian joins
and leads up what this is all about: writing SQL right.

The topics in this section include:

INLIST iterators
Expressions using CASE, coalesce
and nullif
Semi and anti joins
Scalar subqueries
Analytic functions and SQL
modeling
WITH clause subquery factoring
Using MERGE

For WHERE clauses (or predicates) using
IN, Oracle frequently transforms the list to a series of OR predicates. When
there are many items in the list, Oracle may decide to bypass an index and use
a full table scan. What is another way (at least two exist) to make using a
list more efficient? One method is to create a global temporary table populated
with the list items, and then use a join between the GTT and the table being
filtered.

Speaking of IN, we frequently have a
need to use NOT IN. Related to NOT IN is NOT EXISTS. Are these functionally
equivalent anti-join constructs? Here is a simple example to show what happens
when a null value exists in the column being evaluated. Without firing up the
test harness, this can be demonstrated using the EMP table in the SCOTT schema
(assuming the standard 14 row table exists and all employees have a department
number).

SQL> select * from dept;

    DEPTNO DNAME          LOC
---------- -------------- -------------
        10 ACCOUNTING     NEW YORK
        20 RESEARCH       DALLAS
        30 SALES          CHICAGO
        40 OPERATIONS     BOSTON

SQL> select * from dept where deptno not in (select deptno from emp);

    DEPTNO DNAME          LOC
---------- -------------- -------------
        40 OPERATIONS     BOSTON

In this case, NOT IN worked because all
records have a non-null value for DEPTNO. Insert a record, leaving DEPTNO null
and re-run the same query.

SQL> insert into emp (empno, ename) values (1234, 'NODEPT');

1 row created.

SQL> select * from dept where deptno not in (select deptno from emp);

no rows selected

What happens if NOT EXISTS is used, just
in case a null value exists?

SQL> select * from dept d where not exists 
  2  (select deptno from emp where deptno = d.deptno);

    DEPTNO DNAME          LOC
---------- -------------- -------------
        40 OPERATIONS     BOSTON

Of course, the original query can be
rewritten to take care of a NULL value possibility.

SQL> select * from dept where deptno not in 
  2  (select deptno from emp where deptno is not null) ;

    DEPTNO DNAME          LOC
---------- -------------- -------------
        40 OPERATIONS     BOSTON

The NOT IN versus NOT EXISTS difference,
which works fine as long as no nulls exists, is subtle and can cause erroneous
results. It is quite likely that many ad hoc SQL writers are unaware of this
difference. Overall, this demonstration in the POD3 class may be one of those
“a-ha” moments for you.

The last section of the course, covering
the third day, is on optimizing PL/SQL. You should already know the basics of
PL/SQL if attending this part of the course. The list of topics includes:

Code logic order
Datatype conversions
Bind variables
All about cursors
Collections
Bulk operations
Table functions

The table functions section is
noteworthy as it presents a very good example of using a pipelined table
function. What can this be used for? It can return data while the function is
still executing, allows you to transform data without having to stop off in
working or temp tables, and can reduce memory requirements, to name a few
benefits. The transform data feature can be used to pivot a table. Within the
free code depot (the POP class files), take a look at the pipelined_table_fx_demo1.sql
file for a complete setup and demonstration.

In Closing

Overall, this was an excellent class,
well worth the money. Very well organized, professionally presented, and the
demonstrations reinforce the concepts and techniques presented. The course book
can be used as a reference book if need be, and all of the examples have
references to the demo code. In a follow-up article, we’ll take a look at using
the harness.

»

See All Articles by ColumnistSteve Callan

Course Review: Performance Optimization for Developers by Hotsos

In Closing

Steve Callan

Company

Categories