Database Journal
MS SQL Oracle DB2 Access MySQL PostgreSQL Sybase PHP SQL Etc SQL Scripts & Samples Links Database Forum

» Database Journal Home
» Database Articles
» Database Tutorials
MS SQL
Oracle
DB2
MS Access
MySQL
» RESOURCES
Database Tools
SQL Scripts & Samples
Links
» Database Forum
» Sitemap
Free Newsletters:
DatabaseDaily  
News Via RSS Feed


follow us on Twitter
Database Journal |DBA Support |SQLCourse |SQLCourse2
 

Featured Database Articles

Oracle

Posted Nov 24, 2003

Oracle Optimizer: Moving to and working with CBO - Part 5 - Page 2

By Amar Kumar Padhi

Generating Histograms

These statistics could be generated using the following commands:

DBMS_STATS.GATHER_TABLE_STATS with METHOD_OPT => 'FOR COLUMNS <colname> SIZE xxx'

or

ANALYZE command for column statistics with SIZE option.

The SIZE option in the above command specifies the number of buckets to be stored

e.g.: Gather histogram statistics for AM20.ENAME. Since there are 17 distinct values in the column, I can choose to have about 1 to 17 buckets in the histogram. As the frequency of each value is less, an approximate of 5 buckets is chosen initially.

SQL> exec dbms_stats.gather_table_stats(ownname => 'SYS', tabname => 'AM20',-
> method_opt => 'FOR COLUMNS ENAME SIZE 5');

PL/SQL procedure successfully completed.


SQL> SELECT table_name, column_name,num_distinct,num_buckets,last_analyzed,
  2         avg_col_len
  3  from   dba_tab_col_statistics
  4  where  table_name = 'AM20';

TABLE_NAME COLUMN_NAM NUM_DISTINCT NUM_BUCKETS LAST_ANAL AVG_COL_LEN
---------- ---------- ------------ ----------- --------- -----------
AM20       ENAME                17           5 17-OCT-03           6


SQL> select owner, table_name, column_name,
  2         to_char(endpoint_number, '99999.99') EP_NO,
  3         trunc(endpoint_value) EP_VAL,
  4         endpoint_actual_value EP_A_VAL
  5  from   dba_tab_histograms
  6  where  table_name = 'AM20'
  7  and    column_name = 'ENAME'
  8  order by endpoint_number;

OWNER      TABLE_NAME COLUMN_NAM EP_NO         EP_VAL EP_A_VAL
---------- ---------- ---------- --------- ---------- -----------
SYS        AM20       ENAME            .00 3.3888E+35 ADAMS
SYS        AM20       ENAME           1.00 3.6507E+35 FORD
SYS        AM20       ENAME           2.00 4.0632E+35 NARAYAN
SYS        AM20       ENAME           3.00 4.3269E+35 SUNIL
SYS        AM20       ENAME           4.00 4.3788E+35 TURNER
SYS        AM20       ENAME           5.00 4.5305E+35 WARD

6 rows selected.

Again, if you are maintaining histograms for a frequently changing column, it is recommended to update the histograms more frequently to provide accurate statistics to the optimizer.

Histogram on indexed columns

Oracle provides a method of analyzing indexed columns that need to be provided with histogram statistics. This can be done by using the INDEX_HISTOGRAM view.

e.g.: Column ename is provided with an index in table AM20. This will be used to analyze the density of the values. As can be seen below, the INDEX_HISTOGRAM view shows the occurrences of the distinct values. This information can be analyzed to find out if the data is skewed.

SQL> create index am20_ind1 on am20(ename);

Index created.

SQL> analyze index am20_ind1 validate structure;

Index analyzed.

SQL> select * from index_histogram;

        REPEAT_COUNT KEYS_WITH_REPEAT_COUNT
-------------------- ----------------------
                   0                      0
                   1                     12
                   2                      0
                   3                      0
                   4                      1
                   5                      1
                   6                      0
                   7                      0
                   8                      1
                   9                      2
                  10                      0
                  11                      0
                  12                      0
                  13                      0
                  14                      0
                  15                      0

16 rows selected.

Histogram use

The following is a simple example of how the histogram is actually used by the optimizer to pick up an index based on the selectivity of the column present in the predicate.

e.g.: There are 600 records in the table AM20. ENAME data 'SUNIL' has 512 records and 'TURNER' has 9 records. Without the histogram statistics, the index is used to retrieve values for both the queries as shown below.

SQL> select * from am20 where ename = 'TURNER';
...
9 rows selected.

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=FIRST_ROWS (Cost=1 Card=1 Bytes=11)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'AM20' (Cost=1 Card=6 Bytes=66)
   3    2       INDEX (RANGE SCAN) OF 'AM20_N1' (NON-UNIQUE) (Cost=1 Card=6)


SQL> select * from am20 where ename = 'SUNIL';
...
512 rows selected.

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=FIRST_ROWS (Cost=1 Card=1 Bytes=11)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (BY INDEX ROWID) OF 'AM20' (Cost=1 Card=6 Bytes=66)
   3    2       INDEX (RANGE SCAN) OF 'AM20_N1' (NON-UNIQUE) (Cost=1 Card=6)

The use of index in retrieving 9 rows is fast, but to retrieve 512 records (85% of the table) with an index is overkill. The optimizer is not equipped with the data distribution statistics to make this decision. Histogram data is provided and the query is re-run as shown below, with the right execution plan this time.

SQL> exec dbms_stats.gather_table_stats(ownname => 'SYS', tabname => 'AM20',-
> method_opt => 'FOR COLUMNS ENAME SIZE 75');

PL/SQL procedure successfully completed.


SQL> select * from am20 where ename = 'TURNER';
...
512 rows selected.

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=FIRST_ROWS (Cost=1 Card=1 Bytes=10)
   1    0   SORT (AGGREGATE)
   2    1     TABLE ACCESS (FULL) OF 'AM20' (Cost=1 Card=9 Bytes=90)



Oracle Archives

Comment and Contribute

 


(Maximum characters: 1200). You have characters left.

 

 




Latest Forum Threads
Oracle Forum
Topic By Replies Updated
Oracle Data Mining: Classification jan.hasller 0 July 5th, 07:19 AM
Find duplicates - Unique IDs Lava 5 July 2nd, 08:30 AM
no matching unique or primary key rcanter 1 April 25th, 12:32 PM
Update values of one table based on condition of values in other table using Trigger Gladiator 3 February 29th, 06:01 PM