< Day Day Up > |
All of the guidelines thus far in this chapter have explored ways to make retrieving data more efficient. But data also must be modified, and you need to ensure that data modification is performed efficiently , too. Under normal circumstances, you can modify data in a DB2 table in six ways:
This section provides tips for the efficient implementation of the first three methods . You can find guidelines for the others as follows :
Limit Updating Indexed ColumnsWhen you update columns in indexes, a corresponding update is applied to all indexes in which the columns participate. Updating can have a substantial impact on performance due to the additional I/O overhead. Use FOR UPDATE OF CorrectlySpecify only those columns that actually will or can be updated in the FOR UPDATE OF column list of a cursor. DB2 does not use any index that contains columns listed in the FOR UPDATE OF clause. Consider Using DELETE/INSERT Instead of FOR UPDATE OFIf all columns in a row are being updated, use DELETE on the old row and use INSERT on the new one rather than the FOR UPDATE OF clause. Update Multiple RowsYou have two options for updating data using the SQL UPDATE verb:
If the data does not have to be retrieved by the application before the update, use the direct SQL UPDATE statement. A cursor UPDATE with the WHERE CURRENT OF option performs worse than a direct UPDATE for two reasons. First, the rows to be updated must be retrieved from the cursor a row at a time. Each row is fetched and then updated. A direct UPDATE affects multiple rows with one statement. Second, when using a cursor, you must add the overhead of the OPEN and CLOSE statement. Update Only Changed ColumnsUPDATE statements should specify only columns in which the value will be modified. For example, if only the ACSTAFF column of the DSN8810.PROJACT table should be changed, do not code the following: EXEC SQL FETCH C1 INTO :HV-PROJNO, :HV-ACTNO, :HV-ACSTAFF, :HV-ACSTDATE, :HV-ACENDATE END-EXEC. MOVE 4.5 TO HV-ACSTAFF. UPDATE DSN8810.PROJACT SET PROJNO = :HV-PROJNO, ACTNO = :HV-ACTNO, ACSTAFF = :HV-ACSTAFF, ACSTDATE = :HV-ACSTDATE, ACENDATE = :HV-ACENDATE WHERE CURRENT OF C1; Although the host variables contain the same data currently stored in the table, you should avoid this type of coding. DB2 checks to see whether the data is different before performing the update. If none of the values are different from those already stored in the table, the update does not take place. Performance may suffer, though, because DB2 has to perform the value checking. You can avoid this situation by coding the UPDATE statement as follows: UPDATE DSN8810.PROJACT SET ACSTAFF = :HV-ACSTAFF WHERE CURRENT OF C1; Disregard this guideline when the application you are developing requires you to code a complicated check algorithm that DB2 can perform automatically. Because of the complexity of the code needed to check for current values, implementing this type of processing is not always feasible . Nevertheless, try to avoid specifying useless updates of this type when issuing interactive SQL. Consider Dropping Indexes Before Large InsertionsWhen you execute a large number of INSERT s for a single table, every index must be updated with the columns and the appropriate RIDs (record IDs) for each inserted row. For very large insertions, the indexes can become disorganized, causing poor performance. Dropping all indexes for the table, performing the INSERT s, and then re-creating the indexes might be more efficient. The trade-offs to consider are
Exercise Caution When Issuing Ad Hoc DELETE StatementsBe extremely careful when issuing SQL DELETE statements outside the control of an application program. Remember that SQL acts on a set of data, not just one row. All rows that qualify based on the SQL WHERE clause are updated or deleted. For example, consider the following SQL statement: DELETE FROM DSN8810.DEPT; This SQL statement, called a mass DELETE , effectively deletes every row from the DEPT table. Normally, this result is undesirable. Exercise Caution When Issuing Ad Hoc UPDATE StatementsWhen issuing an ad hoc UPDATE , take care to specify an appropriate WHERE clause. Consider the following SQL statement: UPDATE DSN8810.DEPT SET DEPTNAME = 'NEW DEPARTMENT'; This SQL statement changes the value of the DEPTNAME column for every row in the table to the value 'NEW DEPARTMENT' . This result occurs because no WHERE clause is coded to limit the scope of the UPDATE . Requests of this nature are not usually desirable and should be avoided. Mass DELETE Versus LOADSometimes you need to empty a table. You can do so by issuing a mass DELETE or by loading an empty data set. A mass DELETE usually is more efficient when you're using segmented table spaces. Loading an empty data set usually is more efficient when you're using simple or partitioned table spaces.
With DB2 Version 7 comes the ability to code a self-referencing sub- SELECT on searched UPDATE and DELETE statements. In previous releases of DB2, the WHERE clause cannot refer to the table (or view) being modified by the statement. For example, the following SQL is legitimate as of DB2 V7, and can be used to implement a 10% raise for employees who earn less than their department's average salary: UPDATE DSN8710.EMP E1 SET SALARY = SALARY * 1.10 WHERE SALARY < (SELECT AVG(SALARY) FROM DSN8710.EMP E2 WHERE E1.WORKDEPT = E2.WORKDEPT); DB2 will evaluate the complete subquery before performing the requested UPDATE . Use INSERT and UPDATE to Add Long ColumnsThe maximum length of a string literal that can be inserted into DB2 is 255 characters . This restriction poses a problem when you must insert a LONG VARCHAR column in an ad hoc environment. To get around this limitation, issue an INSERT followed immediately by an UPDATE . For example, if you need to insert 260 bytes of data into a LONG VARCHAR column, begin by inserting the first 255 bytes as shown: INSERT INTO your.table COLUMNS (LONG_COL, other columns ) VALUES (' first 255 bytes of LONG_COL ', other values ); Follow the INSERT with an UPDATE statement to add the rest of the data to the column, as in the following example: UPDATE your.table SET LONG_COL = LONG_COL ' remaining 5 bytes of LONG_COL ', WHERE KEY_COL = 'key value'; For this technique to be successful, a unique key column (or columns) must exist for the table. If each row cannot be uniquely identified, the UPDATE cannot be issued because it might update more data than you want. CAUTION Prior to DB2 V6 (and DB2 V5 with a retrofit APAR) the maximum length of a string literal that could be inserted into DB2 was limited to 254 characters, instead of 255.
The ability to code a self-referencing sub- SELECT on searched UPDATE and DELETE statements was added to DB2 Version 7. In previous releases of DB2, the WHERE clause cannot refer to the table (or view) being modified by the statement. For example, the following SQL is legitimate as of DB2 V7, and can be used to implement a 10% raise for employees who earn less than their department's average salary: UPDATE DSN8810.EMP E1 SET SALARY = SALARY * 1.10 WHERE SALARY < (SELECT AVG(SALARY) FROM DSN8810.EMP E2 WHERE E1.WORKDEPT = E2.WORKDEPT); DB2 will evaluate the complete subquery before performing the requested UPDATE .
DB2 Version 8 offers the intriguing new ability to SELECT from an INSERT statement. To understand why we might need to do so, let's first review some background data. In some cases, it is possible to perform actions that generate new data for an inserted row before it gets saved to disk. For example, a BEFORE TRIGGER might change data before it is even recorded to disk, but the application program will not have any knowledge of this change that is made in the trigger. Using general registers, identity columns, and user -defined defaults can produce similar effects. What can be done if the program needs to know the final column values? Prior to DB2 V8, this is difficult and inefficient to implement. For example, to retrieve that last identity column value, a separate singleton SELECT statement is needed to retrieve the result of IDENTITY_VAL_LOCAL() into a host variable. The SELECT FROM INSERT syntax solves this problem. It allows you to both insert the row and retrieve the values of the columns with a single SQL statement. It performs very well, because it performs both the INSERT and the SELECT as a single operation. Consider the following example: SELECT PRSTDATE FROM INSERT (PROJNO, DEPTNO, RESPEMP, PRSTDATE) INTO DSN8810.PROJ VALUES('000001', 'B01', '000020', CURRENT DATE); The data is inserted as specified in the VALUES clause, and then retrieved as specified in the SELECT . Without the ability to select PRSTDATE , the program would have no knowledge of the value supplied to PRSTDATE , because it was assigned using CURRENT DATE . Instead, the program would have had to issue a separate SELECT statement to retrieve the actual date. With this new syntax, the program can retrieve the CURRENT DATE value that was just inserted into PRSTDATE without adding overhead. List Columns for INSERTWhen you are coding an INSERT statement in an application program, list the column names for each value you are inserting. Although you could merely align the values in the same order as the column names in the table, doing so leads only to confusion. Furthermore, if ALTER is used to add new columns to the table, every INSERT statement that does not explicitly list the columns being inserted will fail. The proper format is INSERT INTO DSN8810.DEPT (DEPTNO, DEPTNAME, MGRNO, ADMRDEPT) VALUES ('077', 'NEW DEPARTMENT', '123456', '123') ;
When you need to INSERT multiple rows, consider deploying a multi-row INSERT to place more than one row at a time into the table. This capability is new as of DB2 Version 8. By inserting multiple rows at once, your request may become more efficient. More information on multi-row INSERT s is provided in Chapter 11. |
< Day Day Up > |