Online Programming Guidelines


Utilize the following techniques to create efficient online DB2 applications.

Limit the Number of Pages Retrieved

To achieve subsecond transaction response time, try to limit the number of pages retrieved or modified. When subsecond response time is not required, the number of pages to be accessed can be increased until the service level agreement is not met. In general, try to avoid having an impact on more than 100 pages in online transactions.

Limit Online Joins

When you're joining rows, try to limit the number of rows returned by the transaction. There is a practical limit to the amount of data that a user can assimilate while sitting in front of a computer screen. Whenever possible, set a low limit on the number of rows returned (for example, approximately 125 rows, or 5 screens of data). For data-intensive applications, adjust this total as required, with the understanding that performance may degrade as additional data is accessed and returned to the screen.


As of Version 8, DB2 enables up to 225 tables to be referenced in a single SQL statement. This limit was driven by ERP vendors who developed their applications on other DBMS platforms. Just because DB2 supports up to 225 tables in a SQL statement does not mean you should code such SQL statements, particularly online. As indicated previously, limit online joins to retrieve only the amount of data that actually can be consumed by an online user.


NOTE

DB2 allowed up to 225 tables to be referenced in a single SQL statement as of DB2 V6, but you had to jump through hoops to make it happen. In reality, DB2 allowed up to 15 query blocks per SQL statement, and each query block could reference up to 15 tables; thus, 15 x 15 = 225. As of DB2 V8, you can reference 225 tables in each SQL statement without worrying about the number of query blocks.


Limit Online Sorts

To reduce online data sorting, try to avoid using GROUP BY, ORDER BY, DISTINCT, and UNION unless appropriate indexes are available. Of course, if your transaction absolutely requires sorting (to return data in a particular sequence, to group columns, or to remove duplicates), then perform the sort in DB2. Doing so will almost always outperform returning the data and sorting it in the program logic (or using an external sort package, such as DFSORT or Syncsort).

Issue COMMITs Before Displaying

Always issue commits (CICS SYNCPOINT, TSO COMMIT, or IMS CHKP) before sending information to a terminal.

Modularize Transactions

When possible, design separate transactions for selecting, updating, inserting, and deleting rows. This way, you can minimize page locking and maximize modular program design.

Minimize Cascading DELETEs

Avoid online deletion of parent table rows involved in referential constraints specifying the CASCADE delete rule. When a row in the parent table is deleted, multiple deletes can occur in dependent tables, which degrades online performance.

Keep in mind that as of V6, triggers also can cause cascading data modification. Be sure to include the impact of triggers when analyzing the overall effect of cascading deletes invoked by referential integrity.

Be Aware of Overactive Data Areas

An overactive data area is a portion of a table or index that is accessed and updated considerably more than other tables (or portions thereof) in the online application.

Overactive data areas are characterized by the following features: a relatively small number of pages (usually 10 pages or fewer, and sometimes only 1 row), and a large volume of retrievals and updates (usually busy more than half the time that the online application is active).

Overactive data areas can be caused, for example, by using a table with one row (or a small number of rows) to assign sequential numbers for columns in other tables or files, or by using a table to store counters, totals, or averages of values stored in other tables or files. You also can cause overactive data areas when you use tables to implement domains that are volatile or heavily accessed by the online system. These situations cause many different programs to access and modify a small amount of data over and over. Unless overactive data areas are monitored and maintained, they can cause an inordinate number of resource unavailable and timeout abends.

Reduce the impact of overactive data areas by designing transactions with the following characteristics:

  • Issue OPEN, FETCH, UPDATE, and CLOSE cursor statements (hereafter referred to as update sequences) as close to each other as possible.

  • Invoke update sequences as rapidly as possible in the transaction; in other words, do not place unrelated operations in the series of instructions that update the overactive data area.

  • Code as few intervening instructions as possible between the OPEN, FETCH, and CLOSE statements.

  • Place the update sequence as close to the transaction commit point as possible (that is, near the end of the transaction code).

  • Isolate the active range to a single partition (or several partitions). Assign the partitions to a dedicated buffer pool (perhaps with a related hiperspace) and to a device and controller that has excess capacity during peak periods.

  • Try to use DDL options to reduce the impact of overactive data areas and increase concurrent access. You can do so by increasing free space on the table space and indexes for the tables, or by reducing the MAXROWS table space parameter (or adding a large column to the end of the row for each table, thus reducing the number of rows per page).
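The DDL adjustments above can be sketched as follows. This is illustrative only; the database and table space names are hypothetical, and the PCTFREE value should be tuned to your workload:

```sql
-- Hypothetical table space MYDB.TSCOUNT holding a heavily updated counter table.

-- Limit each page to a single row so that concurrent transactions
-- updating different rows never contend for the same page.
ALTER TABLESPACE MYDB.TSCOUNT
    MAXROWS 1;

-- Alternatively (or additionally), increase free space per page
-- to spread rows across more pages.
ALTER TABLESPACE MYDB.TSCOUNT
    PCTFREE 25;
```

A REORG is required before a changed MAXROWS or PCTFREE value takes effect for existing data.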

Consider Using TIMESTAMP for Sequencing

Sometimes you will need to automatically set the value for a column. Consider using TIMESTAMP data types instead of sequentially assigned numbers. You can generate timestamps automatically using the CURRENT TIMESTAMP special register (or the NOT NULL WITH DEFAULT option). A timestamp column has the same basic functionality as a sequentially assigned number, without the requirement of designing a table to assign sequential numbers. Remember, a table with a sequencing column can cause an overactive data area.
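A minimal sketch of this approach follows; the table, column, and index names are hypothetical:

```sql
-- TRAN_TS is populated automatically by DB2 at INSERT time because
-- NOT NULL WITH DEFAULT on a TIMESTAMP column defaults to the
-- current timestamp.
CREATE TABLE TRAN_LOG
    (TRAN_ID    CHAR(8)      NOT NULL,
     TRAN_TEXT  VARCHAR(254) NOT NULL,
     TRAN_TS    TIMESTAMP    NOT NULL WITH DEFAULT);

-- A unique index guards against the rare duplicate-timestamp case;
-- an INSERT colliding at the same microsecond fails with SQLCODE -803.
CREATE UNIQUE INDEX XTRAN_TS
    ON TRAN_LOG (TRAN_TS);
```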

A column defined with the TIMESTAMP data type is marked by the date and time (down to the microsecond) that the row was inserted or updated. These numbers are serial unless updates occur across multiple time zones. Although duplicate timestamps can be generated if two transactions are entered at the same microsecond, this circumstance is rare. You can eliminate this possibility by coding a unique index on the column and checking for a -803 SQLCODE (duplicate index entry).

The only other drawback is the size of the timestamp data type. Although physically stored as only 10 bytes, the timestamp data is presented to the user as a 26-byte field. If users must remember the key, a timestamp usually does not suffice.

A common workaround for numbers that must be random is to use the microsecond portion of the timestamp as a random number generator to create keys automatically, without the need for a table to assign them. Note, though, that these numbers will not be sequenced by order of input.
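For example, the microsecond portion can be extracted with the MICROSECOND built-in function; how the value is folded into a generated key is application-specific:

```sql
-- Returns an integer between 0 and 999999 suitable for use as a
-- pseudo-random component of a generated key.
SELECT MICROSECOND(CURRENT TIMESTAMP)
FROM   SYSIBM.SYSDUMMY1;
```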


Consider Using IDENTITY Columns or SEQUENCE Objects

At times it is necessary for an application to have a column that stores sequential values. For every new row that is inserted, the previous value needs to be incremented and then stored in the new row. DB2 provides two mechanisms for automatically generating these sequential values: IDENTITY columns and SEQUENCE objects.


IDENTITY columns were added to DB2 V7 (actually, they were made available during the V6 refresh); SEQUENCE objects were added to DB2 V8. Both provide a means of creating a column that is automatically incremented by DB2. In general, SEQUENCE objects are more flexible and manageable than IDENTITY columns. However, be sure to consult Chapter 5, where the details of each are discussed in depth.
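The two mechanisms can be sketched as follows; the table, column, and sequence names are hypothetical:

```sql
-- IDENTITY column: DB2 generates the value automatically at INSERT.
CREATE TABLE ORDER_T
    (ORDER_NO  INTEGER GENERATED ALWAYS AS IDENTITY
                       (START WITH 1, INCREMENT BY 1),
     CUST_NO   INTEGER NOT NULL);

-- SEQUENCE object: a standalone object whose values are requested
-- explicitly, so one sequence can feed multiple tables.
CREATE SEQUENCE INVOICE_SEQ
    AS INTEGER
    START WITH 1
    INCREMENT BY 1
    NO CYCLE;

CREATE TABLE INVOICE_T
    (INVOICE_NO INTEGER NOT NULL,
     CUST_NO    INTEGER NOT NULL);

INSERT INTO INVOICE_T (INVOICE_NO, CUST_NO)
    VALUES (NEXT VALUE FOR INVOICE_SEQ, 100);
```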


Consider Using ROWID for Direct Row Access

When the table you need to access contains a ROWID column, you can use that column to directly access a row without using an index or a table space scan. DB2 can directly access the row because the ROWID column contains the location of the row. Direct row access is very efficient.

To use direct row access, you must first SELECT the row using traditional methods. DB2 will either use an index or a scan to retrieve the row the first time. Be sure to retrieve the ROWID column and store it in a host variable. After the row has been accessed once, the ROWID column can be used to directly access the row again. Simply include a predicate in the WHERE clause for the ROWID column, such as the following:

 

 WHERE ROWID_COL = :HVROWID 

Of course, DB2 may revert to an access type other than direct row access if it determines that the row location has changed. You must plan for the possibility that DB2 will not choose to use direct row access, even if it indicates its intent to do so during EXPLAIN . If the predicate you are using to do direct row access is not indexable, and if DB2 is unable to use direct row access, a table space scan will be used instead. This can negatively impact performance.
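The two-step pattern can be sketched as follows, assuming a hypothetical table EMP_PHOTO with a ROWID column named EMP_ROWID; the host variable names are illustrative:

```sql
-- Step 1: retrieve the row conventionally (index or scan) and save
-- its ROWID in a host variable.
SELECT EMP_ROWID, PHOTO
    INTO :HVROWID, :HVPHOTO
    FROM EMP_PHOTO
    WHERE EMPNO = :HVEMPNO;

-- Step 2: subsequent statements in the same program qualify on the
-- ROWID column, allowing DB2 to use direct row access.
UPDATE EMP_PHOTO
    SET   PHOTO = :HVPHOTO
    WHERE EMP_ROWID = :HVROWID;
```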

For a query to qualify for direct row access, the search condition must be a stage 1 predicate of one of the following forms:

  • A simple Boolean predicate formulated as

     

     COLUMN = non-column expression 

    where COLUMN is a ROWID data type and non-column expression contains a ROWID value

  • A simple Boolean predicate formulated as

     

     COLUMN IN 'list' 

    where COLUMN is a ROWID data type and the values in the list are ROWID values and an index is defined on the COLUMN

  • A compound predicate using AND where one of the component predicates fits one of the two previous descriptions

CAUTION

Do not attempt to "remember" ROWID values between executions of an application program because the ROWID value can change due to a REORG, for example.

Additionally, do not attempt to use ROWID values across tables, even if those tables are exact shadow copies. The ROWID values will not be the same across tables.


Do Not INSERT into Empty Tables

Avoid inserting rows into empty tables in an online environment. Doing so causes multiple I/Os when you're updating indexes and causes index page splits. If you must insert rows into an empty table, consider one of the following options. You can format the table by prefilling it with index keys that can be updated online instead of inserted. This way, you can reduce I/O and eliminate index page splitting because the index is not updated.

Another option is to partition the table so that inserts are grouped into separate partitions. This method does not reduce I/O, but it can limit page splitting because the index updates are spread across multiple index data sets instead of confined to just one.

Increase Concurrent Online Access

Limit deadlock and timeout situations by coding applications to increase their concurrency. One option is to code all transactions to access tables in the same order. For example, do not sequentially access departments in alphabetical order by DEPTNAME in one transaction, from highest to lowest DEPTNO in another, and from lowest to highest DEPTNO in yet another. Try to limit the sequential access to a table to a single method.

Another option is to update and delete using the WHERE CURRENT OF cursor option instead of using independent UPDATE and DELETE statements. A third option for increasing online throughput is to plan batch activity in online tables during inactive or off-peak periods.
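A cursor-controlled update can be sketched as follows; the table and host variable names are illustrative (DEPT resembles the DB2 sample database):

```sql
EXEC SQL
    DECLARE CSR1 CURSOR FOR
        SELECT  DEPTNO, DEPTNAME
        FROM    DEPT
        WHERE   ADMRDEPT = :HV-ADMRDEPT
        FOR UPDATE OF DEPTNAME
END-EXEC.

EXEC SQL OPEN CSR1 END-EXEC.
EXEC SQL FETCH CSR1 INTO :HV-DEPTNO, :HV-DEPTNAME END-EXEC.

EXEC SQL
    UPDATE DEPT
        SET   DEPTNAME = :HV-NEW-DEPTNAME
        WHERE CURRENT OF CSR1
END-EXEC.
```

Because the UPDATE acts on the row just fetched, no separate searched UPDATE (with its own access path and locking) is required.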

Consider Saving Modifications Until the End of the UOW

You can write an application program so that all modifications occur at the end of each unit of work instead of spreading them throughout the program. Because modifications do not actually occur until the end of the unit of work, the placement of the actual SQL modification statements is of no consequence to the eventual results of the program. If you place inserts, updates, and deletes at the end of the unit of work, the duration of locks held decreases. This technique can have a significant positive impact on concurrency and application performance.

Use OPTIMIZE FOR 1 ROW to Disable List Prefetch

Turning off list prefetch for online applications that display data on a page-by-page basis is often desirable. When you use list prefetch, DB2 acquires a list of RIDs from matching index entries, sorts the RIDs, and then accesses data pages using the RID list. The overhead associated with list prefetch usually causes performance degradation in an online, paging environment. OPTIMIZE FOR 1 ROW disables list prefetch and enhances performance.
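The clause is simply appended to the SELECT statement; the table and column names below are illustrative (EMP resembles the DB2 sample database):

```sql
SELECT   EMPNO, LASTNAME, SALARY
FROM     EMP
WHERE    WORKDEPT = :HV-WORKDEPT
ORDER BY LASTNAME
OPTIMIZE FOR 1 ROW;
```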

Implement a Repositioning Cursor for Online Browsing

Use repositioning techniques, similar to those discussed for repositioning batch cursors, to permit online browsing and scrolling of retrieved rows by a primary key. Implement this cursor to reposition using a single-column key:

 

 EXEC SQL
     DECLARE SCROLL0 CURSOR FOR
         SELECT   PROJNO, PROJNAME, MAJPROJ
         FROM     PROJ
         WHERE    PROJNO > :LAST-PROJNO
         ORDER BY PROJNO
 END-EXEC.

You have two options for repositioning cursors when browsing data online. Both are efficient if indexes appear on columns in the predicates. Test both in your critical online applications to determine which performs better.

The first uses predicates tied together with OR:

 

 EXEC SQL
     DECLARE SCROLL1 CURSOR FOR
         SELECT   PROJNO, ACTNO, ACSTDATE,
                  ACSTAFF, ACENDATE
         FROM     PROJACT
         WHERE    (PROJNO = :LAST-PROJNO
         AND       ACTNO = :LAST-ACTNO
         AND       ACSTDATE > :LAST-ACSTDATE)
         OR       (PROJNO = :LAST-PROJNO
         AND       ACTNO > :LAST-ACTNO)
         OR       (PROJNO > :LAST-PROJNO)
         ORDER BY PROJNO, ACTNO, ACSTDATE
 END-EXEC.

The second uses predicates tied together with AND:

 

 EXEC SQL
     DECLARE SCROLL2 CURSOR FOR
         SELECT    PROJNO, ACTNO, ACSTDATE,
                   ACSTAFF, ACENDATE
         FROM      PROJACT
         WHERE     (PROJNO >= :LAST-PROJNO)
         AND NOT   (PROJNO = :LAST-PROJNO AND ACTNO < :LAST-ACTNO)
         AND NOT   (PROJNO = :LAST-PROJNO AND ACTNO = :LAST-ACTNO
         AND        ACSTDATE <= :LAST-ACSTDATE)
         ORDER BY  PROJNO, ACTNO, ACSTDATE
 END-EXEC.

The rows being browsed must have a primary key or unique index that can be used to control the scrolling and repositioning of the cursors. Otherwise, rows might be eliminated because the cursors cannot identify the last row accessed and displayed. If all occurrences of a set of columns are not displayed on a single screen, and more than one row has the same values, rows are lost when the cursor is repositioned after the last value (a duplicate) on the previous screen.



DB2 Developer's Guide (5th Edition)
ISBN: 0672326132
Year: 2004