Chapter 17. Cost-Based Optimizers

   

Supposedly, a rule-based optimizer (RBO) differs from a cost-based optimizer (CBO). Consider this SQL statement:

 SELECT * FROM Table1   WHERE column2 = 55 

Assume that column2 is an indexed, non-unique column. A rule-based optimizer will find column2 in the system catalog and discover that it is indexed, but not uniquely indexed. The RBO then combines this data with the information that the query uses the equals operator. A common assumption in the field of optimization is that " = <literal> " search conditions will retrieve 5% of all rows. (In contrast, the assumption for greater-than conditions is that they will retrieve 25% of all rows.) That is a narrow search, and usually it's faster to perform a narrow search with a B-tree rather than scan all rows in the table. Therefore the rule-based optimizer makes a plan: find matching rows using the index on column2 .

Notice that the rule-based optimizer is using a non-volatile datum (the existence of an index) and a fixed assumption (that equals searches are narrow).

A cost-based optimizer can go further. Suppose the system catalog contains three additional pieces of information: (1) that there are 100 rows in Table1 , (2) that there are two pages in Table1 , and (3) that the value 55 appears 60 times in the index for column2 . Those facts change everything. The equals operation will match on 60% of the rows, so it's not a narrow search. And the whole table can be scanned using two page reads, whereas an index lookup would take three page reads (one to lookup in the index, two more to fetch the data later). Therefore the cost-based optimizer makes a different plan: find matching rows using a table scan.

Notice that the cost-based optimizer is using volatile data (the row and column values that have been inserted) and an override (that the contents are more important than the fixed assumptions).

In other words, a cost-based optimizer is a rule-based optimizer that has additional, volatile information available to it so that it can override the fixed assumptions that would otherwise govern its decisions. The terminology causes an impression that one optimizer type is based on rules while the other is based on cost. That's unfortunate because both optimizer types use rules and both optimizer types have the goal of calculating cost. The reality is that cost-based is an extension of rule-based, and a better term would have been something like "rule-based++."

Most vendors claim that their DBMSs have cost-based optimizers, as you can see from Table 17-1. The claims don't mean much by themselves . What's important is whether the optimizer estimates cost correctly and how it acts on the estimate. In this chapter, we'll look at the actions DBMSs take to fulfill their claims.

Table 17-1. Cost-Based Optimizers
  Claims to be CBO "Explains" the Access Plan "Updates" Statistics for the Optimizer
IBM Yes EXPLAIN RUNSTATS
Informix Yes SET EXPLAIN UPDATE STATISTICS
Ingres Yes EXECUTE QEP optimizedb utility
InterBase Yes SELECT PLAN SET STATISTICS
Microsoft Yes EXPLAIN UPDATE STATISTICS
MySQL No EXPLAIN ANALYZE TABLE
Oracle Yes EXPLAIN PLAN FOR ANALYZE
Sybase Yes SET SHOWPLAN ON UPDATE STATISTICS

Notes on Table 17-1:

  • Claims to be CBO column

    This column is "Yes" if the DBMS's documentation makes the claim that it operates with a cost-based optimizer.

  • "Explains" the Access Plan column

    This column shows the non-standard statement provided by the DBMS so that you can examine the access plan the optimizer will use to resolve an SQL statement. For example, if you want to know how Oracle will resolve a specific SELECT statement, just execute an EXPLAIN PLAN FOR statement for the SELECT.

  • "Updates" Statistics for the Optimizer column

    This column shows the non-standard statement or utility the DBMS provides so that you can update volatile information, or statistics, for the optimizer. For example, if your DBMS is MySQL and you've just added many rows to a table and want the optimizer to know about them, just execute an ANALYZE TABLE statement for the table.

   


SQL Performance Tuning
SQL Performance Tuning
ISBN: 0201791692
EAN: 2147483647
Year: 2005
Pages: 125

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net