Oracle Optimizer: Moving to and working with CBO - Page 3
July 28, 2003
3.2 Cost based optimizer (CBO)
CBO follows Expense calculation methodology. All execution plans are tagged with a cost, the one with the lowest cost will be chosen. The higher the cost the more resources will be used by the execution plan, the lower the cost, the more efficient the query is.
CBO uses all available information-statistics and histograms stored in the dictionary, user provided hints and supplied parameter settings to arrive at the cost. CBO generates all possible permutations of access methods and then chooses what fits best. The number of permutations depends on the number of tables present in the query and can sometimes be around 80,000 permutations or even more! Please refer to the parameter section in part 2 of this series for setting related parameters.
CBO may also perform operations such as query transformation, view merging, OR transformation, push join predicates, etc. that would change the original statement and alter existing or add new predicates, all with the aim of deriving new access plans that could be better than the existing ones. Note that transformation does not affect the data that is returned, only the execution path. Please refer to the parameter section in part 2 of this series for information related to this.
Statistics provide critical input in order for CBO to work properly; these are generated for data storing objects and include information such as the number of rows in a table, distinct values in a column, number of leaf blocks in an index, etc. The more accurate the statistics, the more efficient the results provided by Optimizer. Please refer to the Generating statistics section in part 3 of this series for how this information is generated and how best we can maintain it.
Statistics may be exact or estimated. Statistics generated with a COMPUTE clause analyzes all of the data in the object. This gives the optimizer accurate information to work on and arrive at a good execution plan.
Statistics generated with an ESTIMATE clause analyzes data in the object to the extent of sample size mentioned. Sample size may be specified as number of rows or percentage of rows that should be randomly analyzed to generate the statistics. Optionally block sampling may also be specified. This saves on time if there are many huge tables in the system. The guarantee of good execution plans will depend on how close the estimated value is to the exact values. You can try out your setup at different sample sizes to arrive at an appropriate figure or have different estimation levels for different types of tables, but the idea is to get as close to accuracy as feasible.
Statistics are stored in a data dictionary in tables owned by SYS user. The following views display the statistics collected for tables, columns and indexes.
Statistics for individual partitions of a table can be seen from DBA_TAB_PARTITIONS. Cluster statistics is available from DBA_CLUSTERS. Object table statistics are present in DBA_OBJECT_TABLES.
DBA_TAB_COL_STATISTICS shows similar data. Partitioned Table column statistics can be viewed from DBA_PART_COL_STATISTICS and DBA_SUBPART_COL_STATISTICS.
Statistics for individual partitions of indexes can be seen from DBA_IND_PARTITIONS and DBA_IND_SUBPARTITIONS.
Dictionary tables related to Histogram information are discussed later.