Comparing the Cost of Performing Queries | Oracle Database 10g SQL (Osborne ORACLE Press Series)

The Oracle database software uses a subsystem known as the optimizer to generate the most efficient path to access the data stored in the tables. This path that the optimizer generates to access the data for each query is known as the execution plan. Oracle10 g automatically gathers statistics about the data in your tables and indexes in order to generate the most optimal execution plan; this is known as cost-based optimization.

Comparing the execution plans generated by the optimizer will allow you to judge the relative cost of one SQL statement versus another. You can use the results to make your statements optimal. In this section, you ll learn how to view and interpret a couple of example execution plans.

Note

Database versions prior to Oracle10 g don t automatically gather statistics, and the optimizer automatically defaults to rule-based optimization. Rule-based optimization uses syntactic rules to generate the execution plan. Cost-based optimization is typically better than rule-based optimization since the former uses actual information gathered from the data in the tables and indexes. If you re using Oracle9 i or below, you can gather statistics yourself and you ll learn how to do that later in the section, Gathering Table Statistics.

Examining Execution Plans

The optimizer generates an execution plan for a SQL statement. You can examine the execution plan using the SQL*Plus EXPLAIN PLAN command. The EXPLAIN PLAN command populates a table named plan_table with the SQL statement s execution plan; plan_table is often referred to as the plan table. You may then examine that execution plan by querying the plan table. The first thing you must do is to create the plan table.

Creating the Plan Table

To create the plan table you must run the SQL*Plus script utlxplan.sql contained in the directory ORACLE_HOME\rdbms\admin . ORACLE_HOME is the directory where the Oracle database software is installed. The following example shows the command to run the utlxplan.sql script:

 SQL>  @ e:\oracle\ora10\rdbms\admin\utlxplan.sql

Note	You ll need to replace the directory path with the path for your environment.

Creating a Central Plan Table

Your DBA could create one central plan table. That way, individual users wouldn't have to create their own plan tables. To do this, your DBA must perform the following steps:

Create the plan table in a schema of their choice by running the utlxplan.sql script.
Create a public synonym for the plan table.
Grant access on the plan table to the public.

For example:

@@ e:\oracle\ora10\rdbms\admin\utlxplan.sql
CREATE PUBLIC SYNONYM plan_table FOR plan_table;
GRANT SELECT, INSERT, UPDATE, DELETE ON plan_table TO PUBLIC;

The utlxplan.sql script contains the following statement that creates the plan table:

 create table PLAN_TABLE (statement_id varchar2(30),  timestamp date,  remarks varchar2(80),  operation varchar2(30),  options varchar2(255),  object_node varchar2(128),  object_owner varchar2(30),  object_name varchar2(30),  object_instance numeric,  object_type varchar2(30),  optimizer varchar2(255),  search_columns number,  id numeric,  parent_id numeric,  position numeric,  cost numeric,  cardinality numeric,  bytes numeric,  other_tag varchar2(255),  partition_start varchar2(255),  partition_stop varchar2(255),  partition_id numeric,  other long,  distribution varchar2(30),  cpu_cost numeric,  io_cost numeric,  temp_space numeric,  access_predicates varchar2(4000),  filter_predicates varchar2(4000),  projection varchar2(4000),  time numeric);

The most important columns in the plan table are shown in Table 16-1.

Table 16-1: Plan Table Columns
Column	Description
statement_id	Name you assign to the execution plan.
operation	Database operation performed, which can be scanning a table scanning an index accessing rows from a table by using an index joining two tables together sorting a row set For example, the operation for accessing a table is TABLE ACCESS .
options	Name of the option used in the operation. For example, the option for a complete scan is FULL .
object_name	Name of the database object referenced in the operation.
object_type	Attribute of object. For example, a unique index has the attribute of UNIQUE .
id	Number assigned to this operation in the execution plan.
parent_id	Parent number for the current step in the execution plan. The parent_id value relates to an id value from a parent step.
position	Processing order for steps that have the same parent_id .
cost	Estimate of units of work for operation. Cost-based optimization uses disk I/O, CPU usage, and memory usage as units of work. So, the cost is an estimate of the number of disk I/Os and the amount of CPU and memory used in performing an operation.

Generating an Execution Plan

Once you've created the plan table, you can use the EXPLAIN PLAN command to generate an execution plan for a SQL statement. The syntax for the EXPLAIN PLAN command is as follows :

 EXPLAIN PLAN SET STATEMENT_ID =  statement_id  FOR  sql_statement;

where

statement_id specifies the name that you assign to the execution plan. This can be alphanumeric text.
sql_statement specifies the SQL statement for which the execution plan is to be generated.

The EXPLAIN PLAN command in the following example populates the plan table with the execution plan for a query that retrieves all rows from the customers table (notice the statement _id is CUSTOMERS ):

 EXPLAIN PLAN SET STATEMENT_ID = 'CUSTOMERS' FOR SELECT customer_id, first_name, last_name FROM customers;

Once you ve run this command, you may examine the execution plan stored in the plan table.

Note	The SELECT in the EXPLAIN PLAN statement doesn t actually return rows retrieved from the customers table. The EXPLAIN PLAN statement simply generates the execution plan that would be used if the SELECT were to be performed.

Querying the Plan Table

For querying the plan table, I have provided a SQL*Plus script named explain_plan.sql in the SQL directory where you extracted this book s Zip file. The script will prompt you for the statement_id and will then display the execution plan for the associated statement.

The explain_plan.sql script is as follows:

 -- The explain_plan.sql script displays the -- execution plan for the specified statement_id UNDEFINE v_statement_id; SELECT  id   DECODE(id, 0, '', LPAD(' ', 2*(level - 1)))  ' '   operation  ' '   options  ' '   object_name  ' '   object_type  ' '   DECODE(cost, NULL, '', 'Cost = '  position) AS execution_plan FROM plan_table CONNECT BY PRIOR id = parent_id AND statement_id = '&&v_statement_id' START WITH id = 0 AND statement_id = '&v_statement_id';

An execution plan stored in the plan table is organized into a hierarchy of operations similar to a tree. The operation with the id column value of 0 is the root of the hierarchy, and all the other operations in the plan stem from this operation. The SELECT statement in the explain_plan.sq l script shows the operations performed in the execution plan, starting with the operation having an id value of 0, and navigating downward through the hierarchy of relationships between each of the operations.

The following example shows how to run the explain_plan.sql script to retrieve the plan created earlier for the query on the customers table (that plan had a statement_id of CUSTOMERS ):

 SQL>  @ e:\sql_book\sql\explain_plan.sql  Enter value for v_statement_id:  CUSTOMERS  old 12: statement_id = '&&v_statement_id' new 12: statement_id = 'CUSTOMERS' old 14: statement_id = '&v_statement_id' new 14: statement_id = 'CUSTOMERS' EXECUTION_PLAN ---------------------------------------------- 0 SELECT STATEMENT Cost = 4 1 TABLE ACCESS FULL CUSTOMERS TABLE Cost = 1

Operations are executed in the following order: from inside out, and from top to bottom. Each operation feeds its results back up the chain to its immediate parent operation; the parent is then executed. In the output from the explain_plan.sql script, the operation ID is shown on the far left.

In the example execution plan, operation 1 is run first, with the results of that operation being passed to operation 0. Operation 1 involves a full table scan ”indicated by the string TABL E ACCESS FULL ”on the customers table. A full table scan is performed because the original SELECT statement specified that all rows be retrieved from the table (i.e., no WHERE clause is used to restrict the rows). The total cost of the entire SELECT statement is four work units, as indicated in the cost part of operation 0 in the execution plan. A work unit is the amount of processing the software has to do to perform a given operation.

Note

If you re using a version of the database prior to Oracle10 g, then the output for the overall statement cost may be blank. That s because earlier database versions don t automatically collect table statistics. In order to gather statistics, you have to use the ANALYZE command. You ll learn how to do that later in the section, Gathering Table Statistics.

Execution Plans Involving Table Joins

Execution plans for table joins are more complex. The following example generates the execution plan for a query that joins the products and product_types tables:

 EXPLAIN PLAN SET STATEMENT_ID = 'PRODUCTS' FOR SELECT p.name, pt.name FROM products p, product_types pt WHERE p.product_type_id = pt.product_type_id;

The execution plan for this query is as follows:

 0 SELECT STATEMENT Cost = 7 1 MERGE JOIN Cost = 1 2 TABLE ACCESS BY INDEX ROWID PRODUCT_TYPES TABLE Cost = 1 3 INDEX FULL SCAN PRODUCT_TYPES_PK INDEX (UNIQUE) Cost = 1 4 SORT JOIN Cost = 2 5 TABLE ACCESS FULL PRODUCTS TABLE Cost = 1

Note	If you run the example, you may get a slightly different execution plan depending on the version of the database you are using and the settings of the parameters in the database s init.ora configuration file.

This time the execution plan is more complex, and you can see the hierarchical relationships between the various operations. As mentioned, operations are executed from inside out and from top to bottom. The order in which the operations are executed in the example is 3, 2, 5, 4, 1, and 0. Table 16-2 describes each operation in the order in which they are performed.

Table 16-2: Execution Plan Operations
Operation ID	Description
3	Full scan of the index product_types_pk (which is a unique index) to obtain the addresses of the rows in the product_types table. The addresses are in the form of ROWID values, which are passed to operation 2.
2	Accesses rows in the product_types table using the list of ROWID values passed from operation 3. The rows are passed to operation 1.
5	Accesses rows in the products table, which are passed to operation 4.
4	Sorts the rows passed from operation 5, which are passed to operation 1.
1	Merges the rows passed from operations 2 and 5. The merged rows are passed to operation 0.
	Returns the rows from operation 1 to the user . Total cost of the SELECT is 7 work units.

Gathering Table Statistics

If you re using a version of the database prior to Oracle10 g (such as Oracle9 i ), then you ll have to gather table statistics yourself using the ANALYZE command. By default, if no statistics are available then rule-based optimization is used. Rule-based optimization isn t usually as good as cost-based optimization.

The following examples use the ANALYZE command to gather statistics for the products and product_types tables:

 ANALYZE TABLE products COMPUTE STATISTICS; ANALYZE TABLE product_types COMPUTE STATISTICS;

Once the statistics have been gathered, cost-based optimization will be used rather than rule-based optimization.

Comparing Execution Plans

By comparing the total cost shown in the execution plan for different SQL statements, you can determine the value of tuning your SQL. In this section, you ll compare two execution plans to see the benefit of using EXISTS rather than DISTINCT (a tip I gave earlier). The following query uses EXISTS :

 -- GOOD (uses EXISTS rather than DISTINCT) SELECT product_id, name FROM products outer WHERE EXISTS  (SELECT 1  FROM purchases inner  WHERE inner.product_id = outer.product_id);

The execution plan for this query is as follows:

 0 SELECT STATEMENT Cost = 5 1 MERGE JOIN SEMI Cost = 1 2 TABLE ACCESS BY INDEX ROWID PRODUCTS TABLE Cost = 1 3 INDEX FULL SCAN PRODUCTS_PK INDEX (UNIQUE) Cost = 1 4 SORT UNIQUE Cost = 2 5 INDEX FULL SCAN PURCHASES_PK INDEX (UNIQUE) Cost = 1

As you can see, the total cost of the query is 5 work units. The next query uses DISTINCT :

 -- BAD (uses DISTINCT when EXISTS would work) SELECT DISTINCT pr.product_id, pr.name FROM products pr, purchases pu WHERE pr.product_id = pu.product_id;

The execution plan for this query is as follows:

 0 SELECT STATEMENT Cost = 6 1 SORT UNIQUE Cost = 1 2 MERGE JOIN Cost = 1 3 TABLE ACCESS BY INDEX ROWID PRODUCTS TABLE Cost = 1 4 INDEX FULL SCAN PRODUCTS_PK INDEX (UNIQUE) Cost = 1 5 SORT JOIN Cost = 2 6 INDEX FULL SCAN PURCHASES_PK INDEX (UNIQUE) Cost = 1

The cost for the query is 6 work units. This query is more costly than the first, which only had a cost of 5 work units. If the purchases table contained more rows, then the cost difference would be even higher.