Performance Considerations for LUW


This section discusses how to obtain the best performance from your database system with relative ease. Where GUI tools are available on a platform, the focus is placed on them; otherwise, the equivalent commands are discussed.

The Configuration Advisor

After you have created stored procedures, triggers, and UDFs, it is a good idea to perform an overall tuning of the database system to ensure that they have a properly configured environment in which to run. The easiest and quickest way to tune your database is through the Configuration Advisor GUI, which is launched from the Control Center. This utility lets you tune in a matter of minutes what used to take days, and provides a good starting point from which additional tuning can be applied if desired.

The Configuration Advisor will ask you a series of questions about your environment that you answer during the course of several screens. At the end, it will generate a script of recommendations that can be instantly applied, or it can be saved and applied at a later time. The script can even be scheduled through the Task Scheduler.
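If you prefer to work from the command line, the AUTOCONFIGURE command produces the same kind of recommendations. The following is a minimal sketch; the keyword values shown are illustrative answers to the interview questions and should be adjusted for your environment:

 connect to sample

 -- Generate recommendations only; change APPLY NONE to APPLY DB AND DBM
 -- to have DB2 apply them to the database and instance configuration.
 autoconfigure using
     mem_percent     80
     workload_type   mixed
     num_local_apps  10
     num_remote_apps 40
 apply none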

To launch the Configuration Advisor, from the Control Center right-click on the database that you want to configure and select Configuration Advisor. Figure 12.1 displays one of the "interview" screens.

Figure 12.1. The Configuration Advisor.


When you complete the interview, you will see a list of recommendations similar to what appears in Figure 12.2.

Figure 12.2. The Configuration Advisor's recommendations.


The Package Cache

The package cache is a repository for statement information for both dynamic and static SQL. It improves statement performance by reducing lock contention in the system catalog tables, keeping frequently used access plans in memory, and increasing the sharing of dynamic SQL statements between applications.

A package is made up of sections and information about the compilation environment, such as optimization level, isolation level, and so on. A section is a compiled executable object that contains logic to satisfy the SQL. Every SQL request is associated with a specific section. These sections and their related packages can be found in the SYSCAT.PACKAGES system catalog view.

The package cache is created when the database is activated (either by the ACTIVATE DB command or by the first user connection) and remains in memory until the database is deactivated (either through the DEACTIVATE DB command or when all users have disconnected). For best performance, activate the database explicitly rather than relying on the first connection to load DB2's memory structures. The package cache is shared among all connections to the database.
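For example, to pre-load the database memory structures for the SAMPLE database, and to release them when all work is complete, you might issue:

 activate database sample      -- allocate memory structures before the first connection
 deactivate database sample    -- release them once all applications have disconnected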

When a DB2 agent servicing a connection is about to execute a dynamic SQL statement, it checks the package cache for a statement with identical text. If one is found, the agent copies the package into its own memory and executes the section. This is why it is important to use parameter markers in dynamic SQL: they increase the chances that the same statement text will be found in the package cache.
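As an illustration (the table is the EMPLOYEE table used in the examples later in this chapter), the two literal statements below each occupy their own entry in the package cache, whereas the version with a parameter marker is prepared once and reused:

 -- Each distinct literal produces a different statement text, so no reuse:
 SELECT * FROM db2admin.employee WHERE empno = '000010'
 SELECT * FROM db2admin.employee WHERE empno = '000020'

 -- One shared statement text; only the parameter value changes at execution time:
 SELECT * FROM db2admin.employee WHERE empno = ?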

Monitoring SQL Performance

Sometimes you may find that certain queries are not executing as quickly as you would like. This is typically caused by a number of factors, the most common of which is a table scan being performed instead of an index scan. A table scan may be chosen if the current database statistics are out of date and indicate that the table contains very few rows when in reality it may contain thousands. It can also be caused by the lack of an appropriate index.

The best way to find slowly executing SQL is by using a Statement Event Monitor, which will capture both dynamic and static SQL. The basic steps are described in Table 12.1.

Table 12.1. Creating a Statement Event Monitor

Command:     connect to <db_name>
Description: Obtain a connection to the database to monitor.

Command:     create event monitor stmtmon for STATEMENTS write to file 'D:\temp\eventmonitoring' buffersize 64 nonblocked
Description: Create an event monitor that will record statement events as they occur.

Command:     set event monitor stmtmon state 1
Description: Enable the statement event monitor to begin recording statement activity.

Command:     *run the application*
Description: Run the application to generate statements to be recorded.

Command:     set event monitor stmtmon state 0
Description: Turn off the statement event monitor.

Command:     db2evmon -db <db_name> -evm stmtmon > D:\temp\eventmonitoring\mon.txt
Description: Output the recorded information to a text file for analysis purposes.

Command:     drop event monitor stmtmon
Description: Drop the statement event monitor object once it is no longer needed.


Figure 12.3 shows a sample of output generated by the statement monitor.

Figure 12.3. Sample statement event monitor output.
 12) Statement Event ...
   Appl Handle: 8
   Appl Id: *LOCAL.DB2.014B04013609
   Appl Seq number: 0001
   Record is the result of a flush: FALSE
   -------------------------------------------
   Type     : Static
   Operation: Open
   Section  : 2
   Creator  : DB2ADMIN
   Package  : P6280365
   Consistency Token  : MBDcQXBU
   Package Version ID  :
   Cursor   : CURS2
   Cursor was blocking: TRUE
   -------------------------------------------
   Start Time: 02-23-2004 20:36:09.070311
   Stop Time:  02-23-2004 20:36:09.098640
   Exec Time:  0.028329 seconds
   Number of Agents created: 1
   User CPU: 0.000000 seconds
   System CPU: 0.010014 seconds
   Fetch Count: 0
   Sorts: 0
   Total sort time: 0
   Sort overflows: 0
   Rows read: 8
   Rows written: 0
   Internal rows deleted: 0
   Internal rows updated: 0
   Internal rows inserted: 0
   Bufferpool data logical reads: 0
   Bufferpool data physical reads: 0
   Bufferpool temporary data logical reads: 0
   Bufferpool temporary data physical reads: 0
   Bufferpool index logical reads: 0
   Bufferpool index physical reads: 0
   Bufferpool temporary index logical reads: 0
   Bufferpool temporary index physical reads: 0
   SQLCA:
    sqlcode: 0
    sqlstate: 00000

 13) Statement Event ...
   Appl Handle: 8
   Appl Id: *LOCAL.DB2.014B04013609
   Appl Seq number: 0001
   Record is the result of a flush: FALSE
   -------------------------------------------
   Type     : Dynamic
   Operation: Execute
   Section  : 4
   Creator  : NULLID
   Package  : SYSSH200
   Consistency Token  : SYSLVL01
   Package Version ID  :
   Cursor   : SQL_CURSH200C4
   Cursor was blocking: FALSE
   Text     : SELECT * FROM DB2ADMIN.EMPLOYEE
              WHERE EMPNO = ?
   -------------------------------------------
   Start Time: 02-23-2004 20:36:09.065773
   Stop Time:  02-23-2004 20:36:09.099012
   Exec Time:  0.033239 seconds
   Number of Agents created: 1
   User CPU: 0.000000 seconds
   System CPU: 0.000000 seconds
   Fetch Count: 39
   Sorts: 0
   Total sort time: 0
   Sort overflows: 0
   Rows read: 39
   Rows written: 0
   Internal rows deleted: 0
   Internal rows updated: 0
   Internal rows inserted: 0
   Bufferpool data logical reads: 39
   Bufferpool data physical reads: 0
   Bufferpool temporary data logical reads: 0
   Bufferpool temporary data physical reads: 0
   Bufferpool index logical reads: 0
   Bufferpool index physical reads: 0
   Bufferpool temporary index logical reads: 0
   Bufferpool temporary index physical reads: 0
   SQLCA:
    sqlcode: 100
    sqlstate: 02000

As you can see, the text of dynamic SQL statements is displayed, but for static statements you only see a package and a section. You need to perform one more step to view the static SQL. But before you do that, there are some things that you should look for in the output. Expensive SQL usually has one or more of the following characteristics:

  • Rows Read. Identifies the number of rows read by a statement and does not count index rows or direct table reads. A very high number could mean that an index is needed or that statistics are out of date.

  • Exec Time. The actual execution time for the statement. High execution times should be investigated further.

  • Sort Overflows. Identifies where costly sort overflows are occurring. This could indicate the need for an index, execution of the RUNSTATS command, or a larger sort heap.

Once you have found some statements to investigate, make note of the following:

  • For dynamic SQL: Text

  • For static SQL: Section and Package

Using Explain to Analyze Access Plans

As mentioned, for static SQL you only see the section and package number. To determine the SQL text, you will need to use the db2expln utility:

 db2expln -database <dbname> -schema <schema> -package <package> -section <section> -output <outfile.txt>
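For example, to retrieve the static statement captured as record 12 in Figure 12.3 (creator DB2ADMIN, package P6280365, section 2), and assuming the SAMPLE database, the invocation would look something like this:

 db2expln -database sample -schema db2admin -package P6280365 -section 2 -output expln.txt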

The output you see is actually a text-based Explain of the access plan. This reference does not cover interpreting the text-based explain output; here it is simply used to determine the text of the SQL statement. Once you have the statement, you can use Visual Explain (see Figure 12.4) to see what is happening in terms of the access plan.

Figure 12.4. Visual Explain graph.


From the Control Center, right-click on the database for which you want to explain an SQL statement and select Explain SQL. In the SQL text box, you can input your SQL statement. In this case, we will explain SELECT * FROM DB2ADMIN.EMPLOYEE WHERE EMPNO = ?. Then click on OK to generate the query graph.
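If you would rather stay on the command line, a rough equivalent is to populate the explain tables with the EXPLAIN statement and format them with db2exfmt. This sketch assumes the explain tables already exist (they can be created with the EXPLAIN.DDL script shipped in sqllib/misc):

 -- Capture the access plan without executing the statement.
 EXPLAIN PLAN FOR SELECT * FROM db2admin.employee WHERE empno = ?

 -- Format the most recently explained statement into a readable report.
 db2exfmt -d sample -1 -o explain.txt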

When analyzing the Explain output, try to identify the following:

  • Expensive operations such as large sorts, sort overflows, and heavy table usage that could benefit from more sort space, better indexes, updated statistics, or different SQL.

  • Table scans that could benefit from an index.

  • Poor predicate selectivity that could be caused by outdated statistics.

If you see table scans and you believe that a proper index exists and is simply not being chosen by the optimizer, you will want to perform a REORG followed by a RUNSTATS on the table(s) in question (explained later in the chapter). If you find the table scan is still occurring, it's time to use the Design Advisor.

The Design Advisor

The Design Advisor is used to recommend indexes, Materialized Query Tables (MQTs), and Multidimensional Clustering tables (MDCs) for a selected set of SQL statements. You can input SQL manually, import SQL from packages, or import the SQL of explained statements. Having the wizard generate your MQTs, MDCs, and indexes based on your specified workload greatly simplifies what could otherwise be a complicated analysis.

To launch the Design Advisor, right-click on the target database within the Control Center and select Design Advisor. The Design Advisor is straightforward to use. Figure 12.5 shows you how to define a workload.
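The same analysis can also be run from the command line with the db2advis utility. A sketch, assuming the workload statements have been saved in a file named workload.sql:

 -- Recommend indexes (I), MQTs (M), and MDC tables (C) for the workload,
 -- limiting the advisor to roughly 10 minutes of analysis.
 db2advis -d sample -i workload.sql -m IMC -t 10 -o recommendations.ddl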

Figure 12.5. Importing statements into the Design Advisor.


Large Object (LOB) Considerations

With regular data types, DB2 uses the buffer pool(s) to cache data and index pages for faster in-memory reads and writes. However, LOBs can be as large as 2GB, and the system may not have that much memory. Even if it does, it makes little sense to page all existing data out of the buffer pool just to serve one LOB object. Therefore, LOBs are accessed directly from disk without going through the buffer pool; this is called a direct I/O operation. As you may imagine, such access is slower than in-memory buffer pool access. In this case the system cache may be useful, because the operating system can cache the LOBs instead of DB2, which is advantageous if the LOBs are reused.
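One way to take advantage of the system cache is to place the LOB data in a table space that uses file system caching. The following is only a sketch with illustrative table space, container, and table names; the FILE SYSTEM CACHING clause is available on newer DB2 versions:

 -- DMS table space for regular data.
 CREATE TABLESPACE datats
     MANAGED BY DATABASE USING (FILE 'datats.dat' 10000)

 -- Table space for LOB (long) data that lets the operating system cache its pages.
 CREATE LARGE TABLESPACE lobspace
     MANAGED BY DATABASE USING (FILE 'lobspace.dat' 10000)
     FILE SYSTEM CACHING

 -- Keep regular columns in datats, but route the CLOB column to lobspace.
 CREATE TABLE resume_store (
     empno  CHAR(6) NOT NULL,
     resume CLOB(1M)
 ) IN datats LONG IN lobspace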

Note

For storage and performance reasons, do not use LOBs for small data values. Use VARCHAR or VARCHAR FOR BIT DATA if possible, which can hold a maximum of 32,672 bytes of data.


Temporary Tables Considerations

Recall that temporary tables can remain memory bound until the buffer pool assigned to the user temporary table space is exhausted. For best performance, you should ensure that the buffer pool assigned to the user temporary table space is sufficiently large.

Consider the scenario where a temporary table session.employee is created and then populated with the contents of the employee table in the SAMPLE database. We want to determine if the temporary table remains fully memory-bound. The easiest way to determine if a temporary table is being paged to disk is to look at table space snapshots as illustrated in Figure 12.6.

Figure 12.6. Using table space snapshots to determine whether temporary tables are memory-bound.
 CREATE USER TEMPORARY TABLESPACE usertempspace
     MANAGED BY SYSTEM USING ('usertempspace')
     BUFFERPOOL ibmdefaultbp

 DECLARE GLOBAL TEMPORARY TABLE SESSION.EMPLOYEE LIKE EMPLOYEE
     ON COMMIT PRESERVE ROWS

 UPDATE MONITOR SWITCHES USING BUFFERPOOL ON          -- (1)
 RESET MONITOR ALL                                    -- (2)

 INSERT INTO SESSION.EMPLOYEE SELECT * FROM EMPLOYEE

 GET SNAPSHOT FOR TABLESPACES ON SAMPLE               -- (3)

After declaring the temporary table, we turn on the buffer pool monitor switch on Line (1) and reset DB2's monitors on Line (2). We then populate the temporary table by copying the contents of the employee table into session.employee. GET SNAPSHOT FOR TABLESPACES on Line (3) reveals whether any part of the temporary table was flushed to disk.

The snapshot output will be grouped into logical sections by table space. The section of interest, of course, will be the data collected for the user temporary table space we created (usertempspace). Figure 12.7 provides parts of this output. Because the full output is very large, only the statistics that are of primary interest in this example are listed.

Figure 12.7. Snapshot output is grouped into logical sections by table space.
 Tablespace name                             = USERTEMPSPACE
   Tablespace ID                             = 3
   Tablespace Content Type                   = User Temporary data
   Buffer pool ID currently in use           = 1
   Number of used pages                      = 1
 ...
   Buffer pool temporary data logical reads   = 37
   Buffer pool temporary data physical reads  = 0
 ...
   Buffer pool temporary index logical reads  = 0
   Buffer pool temporary index physical reads = 0

You can see from the output in Figure 12.7 that there have been no temporary data physical reads and no temporary index physical reads; the temporary table has therefore remained memory-bound. Had any portion of the temporary table been paged to disk, the physical read counters would not be zero. If physical reads are occurring, one solution may be to increase the size of the buffer pool assigned to the user temporary table space. If your temporary tables are quite large, simply adding more memory may not yield sufficient performance gains, and the other alternatives discussed in the following sections should be investigated.
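For example, if the user temporary table space uses IBMDEFAULTBP as in Figure 12.6, the buffer pool could be enlarged with a statement along these lines (the size is in pages and is purely illustrative):

 ALTER BUFFERPOOL ibmdefaultbp SIZE 20000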

Improving Performance

Temporary tables perform well because

  • Data manipulation is not logged.

  • There is no need for locking.

  • Temporary tables can be memory-bound.

The only tunable characteristic is the way memory is used by the temporary table. The following sections discuss methods of tuning the performance of temporary tables.

Creating Indexes

In some cases, you may want to store a large amount of data in temporary tables. Large table scans on temporary tables will consume precious CPU cycles. DB2 allows you to create indexes on temporary tables; the syntax is the same as for persistent tables. Like temporary tables, indexes on temporary tables must also be created in the SESSION schema. Figure 12.8 illustrates.

Figure 12.8. An example of creating an index on a temporary table.
 DECLARE GLOBAL TEMPORARY TABLE session.temp1 (id INT, c2 INT)

 CREATE INDEX session.temp1idx ON session.temp1 (id)

Ongoing Maintenance to Keep Performance at Its Peak

You should perform two very important maintenance activities on a regular basis to keep your database performing consistently well: REORG and RUNSTATS. REORG is used to eliminate fragmented data and reclaim space from deleted rows in tables and indexes. It should be run whenever there have been a considerable number of INSERT, UPDATE, or DELETE operations. RUNSTATS collects statistics for tables and indexes that the DB2 cost-based optimizer uses to make educated decisions when generating access plans for SQL. It is always a good idea to perform a RUNSTATS after a large number of INSERT, UPDATE, or DELETE operations; after adding an index; and after performing a REORG. All tables (including the system catalog tables) typically benefit from RUNSTATS. RUNSTATS can now be throttled so that it can run while the database is in use with little (if any) impact on users.
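As a sketch, using the EMPLOYEE table from the earlier examples, a typical maintenance pass might look like this (clauses such as UTIL_IMPACT_PRIORITY are optional and depend on your DB2 version):

 -- Reorganize the table and its indexes to remove fragmentation.
 REORG TABLE db2admin.employee
 REORG INDEXES ALL FOR TABLE db2admin.employee

 -- Refresh statistics, including distribution and detailed index statistics,
 -- and throttle the utility so it has minimal impact on running work.
 RUNSTATS ON TABLE db2admin.employee
     WITH DISTRIBUTION AND DETAILED INDEXES ALL
     UTIL_IMPACT_PRIORITY 10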

You can use the Control Center to configure these activities. In the Control Center, right-click on your database and select Configure Automatic Maintenance. Inside, you can specify online and offline maintenance windows and define the exact behavior of the maintenance commands. You can also optionally supply a list of contacts who should be notified of operational outcomes. Figure 12.9 shows the Summary page.

Figure 12.9. Automatic Maintenance configuration summary.


In order for your static SQL to take advantage of the new statistics, packages containing SQL will need to be rebound to the database. This is most easily accomplished through the db2rbind command, which will rebind all the packages in the database. When the package is rebound, a new access plan will be generated for the static SQL, and this plan will be optimized using the current statistics.

The specific command is: db2rbind <dbname> -l <logfile.out> ALL. REBIND can also be used to rebind individual packages.

To avoid having to run this command manually, open the DB2 Task Center and define a DB2 command task containing the db2rbind command shown above. You can also enable notification in the Task Center, similar to the notification described previously.


