Course Review: Performance Optimization for Developers by Hotsos
June 24, 2009
In todays economic conditions, many companies have made significant cuts in training budgets. Significant even means no budget. In some cases, if youre willing to pay for training out of pocket, you may get to attend on paid time as opposed to taking vacation days. This is where I found myself in wanting to take a class presented by Hotsos. Training courses typically run in the range of $500 or more per day, and if you are paying this cost out of pocket, you want to get your moneys worth. I recently attended the three day Performance Optimization for Developers SQL and PL/SQL class and this article is a review of the training provided by Hotsos.
If you register at Hotsos site, you will receive emails announcing upcoming training events. Many, if not all, of the class announcements have an incentive of giving you a significant discount for early registration. The POD3 course lists at $1875, but early registration (if offered) drives the fee down to $1500. Once you attend a particular course, you can enroll in future versions using an alumni rate. The POD3 alumni cost is currently listed at $600. The parent version of this course is Optimizing Oracle SQL, Intensive (OPINT). This course is five days in length and differs from POD3 in that POD3 is all lecture whereas OPINT is 40% lab exercises. At five days, the list price is $3600 ($1000 alumni).
The syllabus shown at the Hotsos web site lists what you will learn, and the course covered exactly that. The list of topics for POD3 includes:
The last item is the third day as another variation of this course is only two days (2 days of SQL, 1 day of PL/SQL).
If youve tried to become better at writing efficient SQL statements, chances are youve read several books or online articles about what takes place within the RDBMS when a statement is issued. Understanding wait events and system events are key to writing efficient statements. POD3 takes you through an introduction of database architecture and the internals of the buffer cache. Data and other bits of information is read into and out of the buffer cache. Data goes in via physical IO, and data goes out via logical IO.
Overall, DBTIME consists of time waited within the instance plus time executed. The time executed is driven by SQL tuning or optimization and the main goal is to lower the number of logical IOs. This leads to the utility of the Hotsos Test Harness.
Every assertion and claim about a technique is backed up or demonstrated by measurement of results. In this case, the results are a comparison between the numbers of logical IOs when using one approach versus another. The harness is used extensively throughout the course and what it can do is amazingly robust. If you have time before the course, it would be worthwhile to become familiar with the main driver scripts. It is so extensive that learning how to use it at a basic level will require several hours of experimentation, and those are hours better spent on your time as opposed to during expensive course time.
Hotsos provides the harness for free. Additionally, all of the course examples are available for download.
What are some of the specific take-aways in the course?
Reading an explain plan is covered in detail. True or false: an explain plan outlines what the optimizer will do. If you want to know what Oracle actually did, you need to trace the execution of a statement. The explain plan is nothing more than an estimate of what the optimizer will do, so the answer is false. In a building block approach, once the fundamentals of reading and understanding an explain are explained, the course moves on to a coverage of access and join methods.
Access and join methods go the heart of getting data back to a user. How does Oracle access the data (by index or table scan) and how does it join rowsets, or intermediate results? This section covers how Oracle works in index access techniques (index unique scan, range scan, skip scan, full scan, fast full scan and join). When looking at statistics, you will understand what the "buffer is pinned count" statistic is and how it relates to table access by index rowed. If you go to all the effort of creating indexes, then you should also know what to look for in terms of operations that prevent or bypass the use of them, and this is covered as well.
A very good explanation of join methods includes coverage of nested loops, sort merge, hash join and Cartesian joins and leads up what this is all about: writing SQL right.
The topics in this section include:
For WHERE clauses (or predicates) using IN, Oracle frequently transforms the list to a series of OR predicates. When there are many items in the list, Oracle may decide to bypass an index and use a full table scan. What is another way (at least two exist) to make using a list more efficient? One method is to create a global temporary table populated with the list items, and then use a join between the GTT and the table being filtered.
Speaking of IN, we frequently have a need to use NOT IN. Related to NOT IN is NOT EXISTS. Are these functionally equivalent anti-join constructs? Here is a simple example to show what happens when a null value exists in the column being evaluated. Without firing up the test harness, this can be demonstrated using the EMP table in the SCOTT schema (assuming the standard 14 row table exists and all employees have a department number).
SQL> select * from dept; DEPTNO DNAME LOC ---------- -------------- ------------- 10 ACCOUNTING NEW YORK 20 RESEARCH DALLAS 30 SALES CHICAGO 40 OPERATIONS BOSTON SQL> select * from dept where deptno not in (select deptno from emp); DEPTNO DNAME LOC ---------- -------------- ------------- 40 OPERATIONS BOSTON
In this case, NOT IN worked because all records have a non-null value for DEPTNO. Insert a record, leaving DEPTNO null and re-run the same query.
SQL> insert into emp (empno, ename) values (1234, 'NODEPT'); 1 row created. SQL> select * from dept where deptno not in (select deptno from emp); no rows selected
What happens if NOT EXISTS is used, just in case a null value exists?
SQL> select * from dept d where not exists 2 (select deptno from emp where deptno = d.deptno); DEPTNO DNAME LOC ---------- -------------- ------------- 40 OPERATIONS BOSTON
Of course, the original query can be rewritten to take care of a NULL value possibility.
SQL> select * from dept where deptno not in 2 (select deptno from emp where deptno is not null) ; DEPTNO DNAME LOC ---------- -------------- ------------- 40 OPERATIONS BOSTON
The NOT IN versus NOT EXISTS difference, which works fine as long as no nulls exists, is subtle and can cause erroneous results. It is quite likely that many ad hoc SQL writers are unaware of this difference. Overall, this demonstration in the POD3 class may be one of those a-ha moments for you.
The last section of the course, covering the third day, is on optimizing PL/SQL. You should already know the basics of PL/SQL if attending this part of the course. The list of topics includes:
The table functions section is noteworthy as it presents a very good example of using a pipelined table function. What can this be used for? It can return data while the function is still executing, allows you to transform data without having to stop off in working or temp tables, and can reduce memory requirements, to name a few benefits. The transform data feature can be used to pivot a table. Within the free code depot (the POP class files), take a look at the pipelined_table_fx_demo1.sql file for a complete setup and demonstration.
Overall, this was an excellent class, well worth the money. Very well organized, professionally presented, and the demonstrations reinforce the concepts and techniques presented. The course book can be used as a reference book if need be, and all of the examples have references to the demo code. In a follow-up article, well take a look at using the harness.