CBO Cost Estimation

David Clement

October 16, 2007

Oracle version 10gR2


When evaluating the performance of an Oracle SQL statement, you often encounter a cost estimate from the cost-based optimizer (CBO). For example, issuing an 'explain plan' request from TOAD will get a plan that starts something like this:

Operation                                  Object Name   Rows   Bytes   Cost

SELECT STATEMENT  Optimizer Mode=ALL_ROWS                  11              6
  FILTER
    HASH GROUP BY                                         495              6
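
The same kind of plan can be produced without TOAD. Here is a minimal sketch from SQL*Plus, using EXPLAIN PLAN and DBMS_XPLAN; the query and the table name some_table are hypothetical.

EXPLAIN PLAN FOR
  SELECT owner, COUNT(*)
    FROM some_table               -- hypothetical table
   GROUP BY owner;

-- Format the plan just loaded into PLAN_TABLE, including the Cost column.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);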

The last column in such a plan is the cost. What is it for, how is it used, and what does it mean?

The purpose of the cost is to direct the cost-based optimizer. The CBO always chooses the less costly of any two execution paths. The estimated costs are not a speedometer; the real performance may differ significantly from the estimate. 

The cost estimates in an explain plan bubble up, that is, each step's cost includes the costs of the steps beneath it. Consequently the estimated cost of the whole SQL statement is in the top line.

The following excerpt shows how each cost reflects the cost of its own processing step plus the costs of the steps that feed into it from below.

Operation                  Object Name    Rows   Bytes    Cost
...
HASH JOIN                                  1 K   104 K   39544
  TABLE ACCESS FULL        SMALL_TABLE       1      12       3
  HASH JOIN                                27 K     1 M   39540
...

The lowest line is a costly hash join; the line above it is an inexpensive full table scan of something called small_table; the top line is a hash join of the results of the full table scan and the results of the previous hash join. So the top hash join has a cost of 39544 = 39540 + 3 + 1, representing the cost of the lower hash join plus the cost of the full table scan plus the small cost of the top hash join itself.
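
The cumulative figures are visible directly in PLAN_TABLE, so arithmetic like the above can be checked by hand. A minimal sketch, assuming PLAN_TABLE holds a single statement loaded by EXPLAIN PLAN as in the earlier example:

-- COST is cumulative: each step's figure includes the costs of its child steps,
-- which PARENT_ID identifies.
SELECT id, parent_id, operation, options, object_name, cost
  FROM plan_table
 ORDER BY id;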

As for what the cost estimates mean, they are expressed in units of sreadtim, where sreadtim means 'the time taken by a single-block read'; in effect the calculation treats sreadtim = 1 as the unit of cost. The costing algorithm is described in the Oracle documentation.

Cost = (   #SRds * sreadtim
         + (#MRds / db_file_multiblock_read_count) * mreadtim
         + #CPUCycles / cpuspeed
       ) / sreadtim

where #SRds is the number of single-block reads, #MRds is the number of multiblock reads, db_file_multiblock_read_count is an initialization parameter, #CPUCycles is the number of CPU cycles, mreadtim is the time taken by a multiblock read, and cpuspeed is CPU cycles per second (see Jonathan Lewis, Cost-Based Oracle Fundamentals, Apress, 2006, p. 4).

The numbers of reads and CPU cycles come from the execution plan. For a given table access by ROWID, say, the CBO calculates how many single-block reads must be executed against the relevant index and table; it makes similar calculations for the other access methods.
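
A worked example may make the formula concrete. The following sketch plugs made-up figures into the I/O terms of the formula as quoted above, with the CPU term omitted to keep the units simple: 100 single-block reads, 800 blocks read by multiblock reads, db_file_multiblock_read_count = 8, sreadtim = 10 ms, and a hypothetical mreadtim of 30 ms.

-- All figures below are hypothetical.
SELECT (  100 * 10            -- #SRds * sreadtim
        + (800 / 8) * 30      -- (#MRds / db_file_multiblock_read_count) * mreadtim
       ) / 10 AS cost         -- dividing by sreadtim expresses the result in single-block reads
  FROM dual;

The result is 400; that is, the statement is costed as the equivalent of 400 single-block reads.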

The timing values themselves (sreadtim, mreadtim, cpuspeed) are derived differently, and they change during the life of an Oracle instance. On installation they take default values. Industry-average figures for the I/O times can be stated as follows.

In newer storage subsystems, an average single-block read shouldn't take more than 10ms (milliseconds) or 1cs (centisecond). You should expect an average wait time of 4 to 8ms (0.4 to 0.8cs) with SAN (storage area network) due to large caches.

(Shee, Deshpande, and Gopalakrishnan, Oracle Wait Interface, McGraw-Hill, 2004, p. 107.) Since the CBO estimates the cost of a query in units equivalent to single-block reads, and an average single-block read is said to take 1 centisecond, you would expect a SQL statement with an estimated cost of 100 to consume a full second of execution time.
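
Under that default assumption, converting a plan's cost into an expected elapsed time is simple arithmetic. A minimal sketch, again assuming PLAN_TABLE holds one explained statement; the factor 0.01 is the default 1 centisecond per cost unit.

-- id = 0 is the top line of the plan (SELECT STATEMENT), which carries
-- the cumulative cost of the whole statement.
SELECT cost, cost * 0.01 AS estimated_seconds
  FROM plan_table
 WHERE id = 0;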

That this is the intended interpretation is confirmed, among other places, at the end of the standard text-based ADDM report, where the following informational boilerplate appears.

The analysis of I/O performance is based on the default assumption that the average read time for one database block is 10000 micro-seconds.

A centisecond is 10,000 microseconds, so this is the same assertion that Shee et al. make in the passage quoted above. It is also consistent with the historical fact that Oracle's optimization and timing reports were always in units of 1 centisecond or coarser until Oracle 9i began reporting in microseconds. It seems that, out of the box, the CBO estimates the cost of its execution plans in units of a centisecond, taken to be the time consumed by one single-block read, and the ADDM reports are based on the same numbers.

However, this still does not tell us whether the CBO's current estimates have changed from the default, or what those estimates now are. For that, it is necessary to take one more thing into account. Since Oracle 9i, the database has been able to gather system statistics, either by issuing test reads or by timing the actual workload. Because of this, at some point after installation the CBO may have better metrics available to it than the default estimate of 1 centisecond per single-block read.
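
One way to refresh these metrics is to gather workload system statistics over a representative interval with DBMS_STATS. This is only a sketch; on a real system you would coordinate it with the DBA.

-- Start collecting workload system statistics.
EXEC DBMS_STATS.GATHER_SYSTEM_STATS('START');

-- ... let a representative workload run for a while ...

-- Stop collecting; the measured values are written to SYS.AUX_STATS$.
EXEC DBMS_STATS.GATHER_SYSTEM_STATS('STOP');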

The way to verify the CBO's current metrics is to issue this query.

select sname, pname, pval1 from sys.aux_stats$;

SNAME          PNAME      PVAL1
-------------  --------  ------
SYSSTATS_MAIN  SREADTIM   7.618
SYSSTATS_MAIN  MREADTIM  14.348
SYSSTATS_MAIN  CPUSPEED     507

In the aux_stats$ view, pval1 for SREADTIM gives the single-block read time in milliseconds. So, in this example, the single-block read time is considerably faster than the default estimate: 7.618 milliseconds is 0.7618 of a centisecond, so the correct value is not sreadtim = 1 but sreadtim = 0.7618.
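
With the measured value in hand, the earlier cost-to-time conversion can use it in place of the default. Another sketch, again assuming PLAN_TABLE holds a single explained statement:

-- SREADTIM is stored in milliseconds, so divide by 1000 to get seconds.
SELECT p.cost,
       p.cost * s.pval1 / 1000 AS estimated_seconds
  FROM plan_table p,
       sys.aux_stats$ s
 WHERE p.id = 0
   AND s.sname = 'SYSSTATS_MAIN'
   AND s.pname = 'SREADTIM';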

To understand optimizer costing on any system, it is necessary to go beyond the default values and to find out the current metrics. You can find more detailed discussion of the CBO's collection of statistics in an article by Kimberly Floss at oracle.com.

I hope this introduction to CBO cost estimation has been useful. If you have read this far, you might want to continue your investigation by following the references above.