6.4 Comparison Conditions | Oracle High Performance Tuning for 9i and 10g

< Day Day Up >

Different comparison conditions can have sometimes vastly different effects on the performance of SQL statements. Let's examine each in turn with various options and recommendations for potential improvement. The comparison conditions are listed here.

Equi, anti, and range
- expr { [!]= | > | < | <= | >= } expr
- expr [ NOT ] BETWEEN expr AND expr
LIKE pattern matching
- expr [ NOT ] LIKE expr
Set membership
- expr [ NOT ] IN expr
- expr [ NOT ] EXISTS expr
  
  Note
  Oracle Database 10 Grid IN is now called an IN rather than a set membership condition in order to limit confusion with object collection MEMBER conditions.
- Groups
- expr [ = | != | > | < | >= | <= ] [ ANY | SOME | ALL ] expr

6.4.1 Equi, Anti, and Range

Using an equals sign (equi) is the fastest comparison condition if a unique index exists. Any type of anti comparison such as != or NOT is looking for what is not in a table and thus must read the entire table; sometimes full index scans can be used. Range comparisons scan indexes for ranges of rows. Let's look at some examples.

This example does a unique index hit; using the equals sign an exact hit single row is found.

EXPLAIN PLAN SET statement_id='TEST' FOR       SELECT * FROM generalledger WHERE generalledger_id =          100;     Query                                  Cost     Rows     Bytes ------------------------------------   ----   ------   ------- SELECT STATEMENT on                       3        1        26  TABLE ACCESS BY INDEX ROWID on     GENERALLEDGER                          3        1        26   INDEX UNIQUE SCAN on XPKGENERALLEDGER   2        1

The anti (!=) comparison finds everything but the single row specified and thus must read the entire table.

EXPLAIN PLAN SET statement_id='TEST' FOR       SELECT * FROM generalledger WHERE generalledger_id !=          100;     Query                             Pos     Cost     Rows      Bytes ------------------------------   ----   ------   -------   ------- SELECT STATEMENT on                493     493    752739  19571214 TABLE ACCESS FULL on GENERAL         1     493    752739  19571214

In the next case using the range (<) comparison searches a range of index values rather than a single unique index value.

EXPLAIN PLAN SET statement_id='TEST' FOR       SELECT * FROM generalledger WHERE generalledger_id < 10;     Query                                 Cost     Rows     Bytes -----------------------------------   ----   ------   ------- SELECT STATEMENT on                      4        1        26  TABLE ACCESS BY INDEX ROWID on     GENERALLEDGE                          4        1        26   INDEX RANGE SCAN on XPKGENERALLEDGER   3        1

In the next example the whole table is read rather than using an index range scan because most of the table will be read and thus the Optimizer considers reading the table as being faster.

EXPLAIN PLAN SET statement_id='TEST' FOR       SELECT * FROM generalledger WHERE generalledger_id >=          100;     Query                             Pos   Cost     Rows      Bytes ------------------------------   ----   ----   -------   ------- SELECT STATEMENT on               493    493    752740  19571240  TABLE ACCESS FULL on GENERAL       1    493    752740  19571240

Here the BETWEEN comparison causes a range scan on an index because the range of rows is small enough to not warrant a full table scan.

EXPLAIN PLAN SET statement_id='TEST' FOR       SELECT * FROM generalledger       WHERE generalledger_id BETWEEN 100 AND 200;     Query                                  Cost     Rows     Bytes ------------------------------------   ----   ------   ------- SELECT STATEMENT on                       4        1        26  TABLE ACCESS BY INDEX ROWID on     GENERALLEDGE                           4        1        26   INDEX RANGE SCAN on XPKGENERALLEDGER    3        1

6.4.2 LIKE Pattern Matching

The approach in the query plan used by the Optimizer will depend on how many rows are retrieved and how the pattern match is constructed.

This query finds one row.

EXPLAIN PLAN SET statement_id='TEST' FOR       SELECT * FROM supplier WHERE name like '24/7 Real          Media, Inc.';     Query                                  Cost     Rows     Bytes ------------------------------------   ----   ------   ------- SELECT STATEMENT on                       2        1       142  TABLE ACCESS BY INDEX ROWID on SUPPLIER  2        1       142   INDEX UNIQUE SCAN on AK_SUPPLIER_NAME   1        1

This query also retrieves a single row but there is a wildcard pattern match and thus a full table scan is the result.

EXPLAIN PLAN SET statement_id='TEST' FOR       SELECT * FROM supplier WHERE name LIKE '21st%';     Query                                 Cost     Rows     Bytes -----------------------------------   ----   ------   ------- SELECT STATEMENT on                     13      491     69722  TABLE ACCESS FULL on SUPPLIER          13      491     69722

The next query finds almost 3,000 rows and thus a full scan of the table results regardless of the exactness of the pattern match.

Tip

A pattern match using a % full wildcard pattern matching character anywhere in the pattern matching string will usually produce a full table scan.

SQL> SELECT COUNT(*) FROM supplier WHERE name LIKE '%a%';     COUNT(*) --------     2926     EXPLAIN PLAN SET statement_id='TEST' FOR       SELECT * FROM supplier WHERE name LIKE '%a%';     Query                                 Cost     Rows     Bytes -----------------------------------   ----   ------   ------- SELECT STATEMENT on                     13      194     27548  TABLE ACCESS FULL on SUPPLIER          13      194     27548

In general, since LIKE will match patterns which are in no way related to indexes, LIKE will usually read an entire table.

6.4.3 Set Membership

IN should be used to test against literal values and EXISTS is often used to create a correlation between a calling query and a subquery. IN is best used as a pre-constructed set of literal values. IN will cause a subquery to be executed in its entirety before passing the result back to the calling query. EXISTS will stop once a result is found.

EXPLAIN PLAN SET statement_id='TEST' FOR       SELECT * FROM coa WHERE type IN ('A','L','I','E');     Query                        Cost     Rows     Bytes --------------------------   ----   ------   ------- SELECT STATEMENT on             1       38       950  TABLE ACCESS FULL on COA       1       38       950

There are two advantages to using EXISTS over using IN. The first advantage is the ability to pass values from a calling query to a subquery, never the other way around, creating a correlated query. The correlation allows EXISTS the use of indexes between calling query and subquery, particularly in the subquery. The second advantage of EXISTS is, unlike IN, which completes a subquery regardless, EXISTS will halt searching when a value is found. Thus the subquery can be partially executed, reading fewer rows.

EXPLAIN PLAN SET statement_id='TEST' FOR       SELECT * FROM coa WHERE EXISTS             (SELECT type FROM type WHERE type = coa.type);     Query                                 Cost     Rows     Bytes -----------------------------------   ----   ------   ------- SELECT STATEMENT on                      1       55      1485  NESTED LOOPS SEMI on                    1       55      1485   TABLE ACCESS FULL on COA               1       55      1375   INDEX UNIQUE SCAN on XPKTYPE                    6        12

Now let's compare the use of IN versus the use of EXISTS. The next two examples both use indexes and have the same result. The reason why IN is the same cost as EXISTS is because the query contained within the IN subquery matches an index based on the single column it selects.

EXPLAIN PLAN SET statement_id='TEST' FOR       SELECT stock_id FROM stock s WHERE EXISTS       (SELECT stock_id FROM stockmovement WHERE stock_id =          s.stock_id);     Query                                 Cost     Rows     Bytes -----------------------------------   ----   ------   ------- SELECT STATEMENT on 119 118 944  NESTED LOOPS SEMI on 119 118 944   INDEX FULL SCAN on XPKSTOCK 1 118 472   INDEX RANGE SCAN on XFK_SM_STOCK 1 570175 2280700     EXPLAIN PLAN SET statement_id='TEST' FOR       SELECT stock_id FROM stock WHERE stock_id IN       (SELECT stock_id FROM stockmovement);     Query                                 Cost     Rows     Bytes -----------------------------------   ----   ------   ------- NESTED LOOPS SEMI on                   119      118       944  SELECT STATEMENT on                   119      118       944   INDEX FULL SCAN on XPKSTOCK            1      118       472   INDEX RANGE SCAN on XFK_SM_STOCK       1   570175   2280700

Now let's do some different queries to show a very distinct difference between IN and EXISTS. Note how the first example is much lower in cost than the second. This is because the second option cannot match indexes and executes two full table scans.

EXPLAIN PLAN SET statement_id='TEST' FOR       SELECT * FROM stockmovement sm WHERE EXISTS       (SELECT * FROM stockmovement             WHERE stockmovement_id = sm.stockmovement_id);     Query                            Cost     Rows     Bytes      Sort ------------------------------   ----   ------   -------   ------- SELECT STATEMENT on              8593   570175  16535075  MERGE JOIN SEMI on              8593   570175  16535075   TABLE ACCESS BY INDEX       ROWID on SM                 3401   570175  13684200    INDEX FULL SCAN on       XPKSTOCKMOVEMENT            1071   570175    SORT UNIQUE on                5192   570175   2850875  13755000      INDEX FAST FULL SCAN on       XPKSTMOVE                    163   570175   2850875      EXPLAIN PLAN SET statement_id='TEST' FOR       SELECT * FROM stockmovement sm WHERE qty IN          (SELECT qty FROM stockmovement);     Query                            Cost     Rows     Bytes      Sort ------------------------------   ----   ------   -------   ------- SELECT STATEMENT on             16353   570175  15964900  MERGE JOIN SEMI on             16353   570175  15964900   SORT JOIN on                  11979   570175  13684200  45802000    TABLE ACCESS FULL on      STOCKMOVEMENT                 355   570175  13684200   SORT UNIQUE on                 4374   570175   2280700  13755000    TABLE ACCESS FULL on      STOCKMOVEMENT                 355   570175   2280700

Now let's go yet another step further and restrict the calling query to a single row result. What this will do is ensure that EXISTS has the best possible chance of passing a single row identifier into the subquery, thus ensuring a unique index hit in the subquery. The StockMovement table has been joined to itself to facilitate the demonstration of the difference between using EXISTS and IN. Note how the IN subquery executes a full table scan and the EXISTS subquery does not.

EXPLAIN PLAN SET statement_id='TEST' FOR       SELECT * FROM stockmovement sm       WHERE EXISTS(             SELECT qty FROM stockmovement             WHERE stockmovement_id = sm.stockmovement_id)       AND stockmovement_id = 10;     Query                                 Cost     Rows     Bytes -----------------------------------   ----   ------   ------- 1. SELECT STATEMENT on                   2        1        29 2.  NESTED LOOPS SEMI on                 2        1        29 3.   TABLE ACCESS BY INDEX ROWID on        STOCKMOVEMENT                     2        1        24 4.   INDEX UNIQUE SCAN on        XPK_STOCKMOVEMENT                 1   570175 3.   INDEX UNIQUE SCAN on        XPK_STOCKMOVEMENT                 1        5     EXPLAIN PLAN SET statement_id='TEST' FOR       SELECT * FROM stockmovement sm       WHERE qty IN (SELECT qty FROM          stockmovement)       AND stockmovement_id = 10;     Query                                 Cost     Rows     Bytes -----------------------------------   ----   ------   ------- 1. SELECT STATEMENT on                 563        1        28 2.  NESTED LOOPS SEMI on               563        1        28 3.   TABLE ACCESS BY INDEX ROWID on       STOCKMOVEMENT                      2        1        24 4.   INDEX UNIQUE SCAN on       XPK_STOCKMOVEMENT                  1   570175 3.  TABLE ACCESS FULL on       STOCKMOVEMENT                    561   570175   2280700

The benefit of using EXISTS rather than IN for a subquery comparison is that EXISTS can potentially find much fewer rows than IN. IN is best used with literal values and EXISTS is best used as applying a fast access correlation between a calling and a subquery.

6.4.4 Groups

ANY, SOME, and ALL comparisons are generally not very conducive to SQL tuning. In some respects they are best not used.

< Day Day Up >