Designing Indexes for Performance

Team-Fly    

 
DB2 Universal Database for OS/390 v7.1 Application Certification Guide
By Susan Lawson
Table of Contents
Chapter 17.  Application Performance and Optimization

Designing Indexes for Performance

The following sections show the different access paths provided by indexes. We should ensure that indexes are in place to provide the best access path possible for all critical queries.

Indexes for Efficient Access

DB2 uses the following index access paths:

  • Matching index scan (MATCHCOLS > 0)

  • Index screening

  • Nonmatching index scan (ACCESSTYPE = I and MATCHCOLS = 0)

  • IN-list index scan (ACCESSTYPE = N)

  • Multiple index access (ACCESSTYPE is M, MX, MI, or MU)

  • One-fetch access (ACCESSTYPE = I1)

  • Index-only access (INDEXONLY = Y)

  • Equal unique index (MATCHCOLS = number of index columns )

Matching Index Scan (MATCHCOLS > 0)

In a matching index scan, predicates are specified on either the leading or all of the index key columns. These predicates provide filtering; only specific index pages and data pages need to be accessed. If the degree of filtering (cardinality) is high, the matching index scan is efficient.

Index matching predicates are applied in the sequence of the index columns. Therefore, coding sequence is not important for index matching predicates. DB2 evaluates WHERE clause predicates based upon the following sequence:

  1. Indexed predicates

    - Matching predicates

    - Index screening predicates

  2. Stage 1 predicates

  3. Stage 2 predicates

Excluding index matching predicates, within each stage (screening, stage 1 or 2), predicates are generally applied in the following sequence:

  1. All equal predicates (and IS NULL).

  2. All range predicates (and IS NOT NULL).

  3. All other predicates.

The final rule for predicate evaluation dictates that within each of the above guidelines, predicates are evaluated in the sequence that they are coded. This gives some control, and therefore the programmer should code the most restrictive predicates first.

Index Screening

In index screenin g, predicates are specified on index key columns but are not part of the matching columns. Those predicates improve subsequent data page access by reducing the number of rows that qualify while searching the index. For example, with an index on T(C1,C2,C3,C4) in the following SQL statement, C3 > 0 and C4 = 2 are index screening predicates, whereas C1 = 1 is a matching index predicate.

 SELECT *FROM T  WHERE C1 =1 AND C3 >0 AND C4 =2 AND C5 =8; 
Nonmatching Index Scan (ACCESSTYPE = I and MATCHCOLS = 0)

In a nonmatching index scan, no matching columns are in the index. Hence, all the index keys must be examined. Because a nonmatching index scan usually does not provide strong filtering, only a few cases provide an efficient access path if subsequent data pages must also be accessed. If the access path is index-only, then a nonmatching index scan may prove beneficial, especially if the index is smaller than the tablespace.

NOTE

graphics/note_icon.jpg

An index-only nonmatching index scan may not always be more efficient than a tablespace scan. A tablespace may be smaller than an index if the number of index keys is large or the tablespace is compressed. Also, assuming both are of similar size , an index scan may be less efficient because the scan must follow the leaf page pointer chain, which may not be sequential due to index page splits .


IN-List Index Scan (ACCESSTYPE = N)

An IN-list index scan is a special case of the matching index scan, in which a single indexable IN predicate is used as a matching equal predicate. You can regard the IN-list index scan as a series of matching index scans with the values in the IN predicate being used for each matching index scan. The following example has an index on (C1,C2,C3,C4) and might use an IN-list index scan:

 SELECT *FROM T  WHERE C1=1 AND C2 IN (1,2,3) AND C3>0 AND C4<100; 

This example could result in an ACCESSTYPE = N and MATCHCOLS = 3 (C1, C2, C3). C4 would be an index screening predicate.

NOTE

graphics/note_icon.jpg

At most, one IN predicate can be a matching predicate; the exception, however, is a noncorrelated IN subquery. IN-list predicates cannot be matching predicates for MX access or list prefetch.


Multiple Index Access (ACCESSTYPE is M, MX, MI, or MU)

Multiple index access uses more than one index to access a table. It is a good access path when

  • No single index provides efficient access.

  • A combination of index accesses provides efficient access.

RID lists are constructed for each of the indexes involved. The unions (OR conditions) or intersections (AND conditions) of the RID lists produce a final list of qualified RIDs that is used to retrieve the result rows, using list prefetch. You can consider multiple index access as an extension to list prefetch with more complex RID retrieval operations in its first phase. The complex operators are union and intersection. DB2 may choose multiple index access for the following query:

 SELECT *FROM EMP  WHERE (AGE =34)OR (JOB ='MANAGER'); 

For this query,

  • EMP is a table with columns EMPNO, EMPNAME, DEPT, JOB, AGE, and SAL.

  • EMPX1 is an index on EMP with key column AGE.

  • EMPX2 is an index on EMP with key column JOB.

One-Fetch Access (ACCESSTYPE = I1)

One-fetch index access requires retrieving only one row. It is the best possible access path and is chosen whenever it is available. It applies to a statement with a MIN or MAX column function: The order of the index allows a single row to give the result of the function. Either an ascending or descending index can be utilized to satisfy a MIN or MAX function using one-fetch index access.

Index-Only Access (INDEXONLY = Y)

With index-only access, the access path does not require any data pages because the access information is available in the index. Conversely, when an SQL statement requests a column that is not in the index, updates any column in the table, or deletes a row, DB2 has to access the associated data pages. Because the index is generally smaller than the table itself, an index-only access path usually processes the data efficiently .

NOTE

graphics/note_icon.jpg

The number of levels of the index can determine whether index-only access will be chosen instead of index and data access via a different index. Assume IX1 has four levels and provides an index-only access path, and IX2 has three levels and is not index-only. DB2 may choose IX2 and access the data pages, since both access paths result in four I/Os.


Equal Unique Index (MATCHCOLS = number of index columns)

An index that is fully matched and unique, and in which all matching predicates are equal predicates, is called an equal unique index case. This case guarantees that only one row is retrieved. If there is no one-fetch index access available, this is considered the most efficient access over all other indexes that are not equal unique. (The uniqueness of an index is determined by whether or not it was defined as unique.) Sometimes DB2 can determine that an index that is not fully matching is actually an equal unique index case. This is based upon the existence of another unique index with a subset of the key columns.

Indexes to Help Avoid Sorts

As well as providing selective access to data, indexes can also order data, sometimes eliminating the need for sorting. Some sorts can be avoided if index keys are in the order needed by ORDER BY, GROUP BY, a join operation, or DISTINCT in a column function. A DISTINCT sort can also be avoided if a unique index exists on the selected columns and/or WHERE clause columns. In other cases, as when list prefetch is used, the index does not provide useful ordering, and the selected data might have to be sorted.

When it is absolutely necessary to prevent a sort, consider creating an index on the column or columns necessary to provide that ordering. Consider also using the clause OPTIMIZE FOR 1 ROW to discourage DB2 from choosing a sort for the access path. Consider the following query:

 SELECT C1,C2,C3 FROM T  WHERE C1 >1 ORDER BY C1 OPTIMIZE FOR 1 ROW; 

An ascending index on C1 or an index on (C1,C2,C3) could eliminate a sort.

NOTE

graphics/note_icon.jpg

The OPTIMIZE FOR 1 ROW has implications other than avoiding a sort and therefore should be utilized only when necessary. Consider specifying a value that represents the number of rows that are required to be processed by the application.


Not all sorts are inefficient. For example, if the index that provides ordering is not an efficient one and many rows qualify, it is possible that using another access path to retrieve and then sort the data could be more efficient than the inefficient, ordering index.

Index Advisor Facility

The index advisor is a management tool that provides assistance in the designing of indexes on tables. It is useful in the following situations:

  • Finding the best indexes for a problem query.

  • Finding the best indexes for a set of queries (workload) subject to resource limits that are optionally applied.

  • Testing out an index on a workload without having to create the index.

There are two concepts associated with this facility: workload and virtual indexes.

A workload is a set of SQL statements (SELECT, INSERT, UPDATE, DELETE) that DB2 has to process over a given period of time. The information in the workload is concerned with the type and frequency of the SQL statements over a given period of time. The index advisor uses this workload information in conjunction with the database information to recommend indexes. The goal of the advising engine is to minimize the total workload cost.

Virtual indexes are indexes that do not exist in the current database schema. These indexes could be either recommendations that the facility has made or indexes that are being proposed to be created but should be modeled for the effect using the advisor facility.

For DB2 for OS/390 and z/OS, this tool works on a set of dynamic SQL statements (only dynamic SQL is currently supported; static will be delivered in a later release). This process is performed by creating a modeling database for DB2 for OS/390 on a DB2 for UNIX, Windows, or Linux workstation using configuration parameters to mimic DB2 for OS/390 and z/OS. One way to use this tool is to invoke it through the Control Center.


Team-Fly    
Top


DB2 Universal Database for OS. 390 v7. 1 Application Certification Guide
DB2(R) Universal Database for OS/390 V7.1 Application Certification Guide (IBM DB2 Certification Guide Series)
ISBN: 0131007718
EAN: 2147483647
Year: 2002
Pages: 163
Authors: Susan Lawson

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net