The CHECK DATA Option

 <  Day Day Up  >  

The CHECK DATA Option

The CHECK DATA utility is used to verify the accuracy and integrity of data in DB2 tables.

Referential Integrity Checking

One function of the CHECK DATA option of the CHECK utility checks the status of referential constraints. It is used to validate foreign key values in the rows of a dependent table against primary key values in its associated parent table. For example, consider a referential constraint defined in the DB2 sample tables. The DSN8810.DEPT table has a foreign key, RDE , defined on the column MGRNO . It references the primary key of DSN8810.EMP , which is the EMPNO column. The CHECK DATA utility can be used to verify that all occurrences of MGRNO in the DSN8810.DEPT sample table refer to a valid EMPNO in the DSN8810.EMP sample table.

CHECK DATA can run against a single table space, multiple table spaces, or a single partition of a partitioned table space.

CHECK DATA can delete invalid rows and copy them to an exception table. The CHECK DATA utility resets the check pending status if constraint violations are not encountered or if the utility was run with the DELETE YES option.

The JCL in Listing 31.1 can be used to check data in the DB2 sample tables that contain referential constraints.

Listing 31.1. CHECK DATA JCL
 //DB2JOBU  JOB (UTILITY),'DB2 CHECK DATA',MSGCLASS=X,CLASS=X, //   NOTIFY=USER //* //**************************************************************** //* //*            DB2 CHECK DATA UTILITY //* //**************************************************************** //* //UTIL EXEC DSNUPROC,SYSTEM=DSN,UID='CHEKDATA',UTPROC=" //* //*  UTILITY WORK DATASETS //* //DSNUPROC.SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(5,1)) //DSNUPROC.SORTWK02 DD UNIT=SYSDA,SPACE=(CYL,(5,1)) //DSNUPROC.SORTOUT DD DSN=&&SORTOUT, //        UNIT=SYSDA,SPACE=(CYL,(5,1)) //DSNUPROC.SYSERR DD DSN=&&SYSERR, //        UNIT=SYSDA,SPACE=(CYL,(1,1)) //DSNUPROC.SYSUT1 DD DSN=&&SYSUT1, //        UNIT=SYSDA,SPACE=(CYL,(5,1)) //DSNUPROC.UTPRINT DD SYSOUT=X //* //*  UTILITY INPUT CONTROL STATEMENTS //*       This CHECK DATA statement checks DSN8810.DEPT for //*       referential constraint violations, deletes all //*       offending rows, and places them into the exception //*       table, DSN8510.DEPT_EXCPTN. //* //DSNUPROC.SYSIN    DD  *     CHECK DATA TABLESPACE DSN8D81A.DSN8S81D     FOR EXCEPTION IN DSN8810.DEPT             USE DSN8810.DEPT_EXCPTN     SCOPE ALL     DELETE YES /* // 

NOTE

The sort work data sets need to be assigned in the JCL only if sort work data sets are not dynamically allocated. Additionally, you should consider explicitly defining sort work data sets when checking very large tables.


Check Constraint Checking

The second function of the CHECK DATA option of the CHECK utility checks the status of check constraints. It is used to validate column values against check constraints defined on those columns . For example, consider a check constraint defined on the SALARY column of the DSN8810.EMP table as follows :

 

 CONSTRAINT CHECK_SALARY CHECK (SALARY < 50000.00) 

All values of the SALARY column must be less than 50000.00 or they are in violation of the check constraint. The CHECK DATA utility can be used to verify that all occurrences of SALARY in the DSN8810.EMP sample table actually contain a valid SALARY conforming to the check constraint.

The columns of a table can contain values that violate the check constraint in the following two circumstances:

  1. When a table that already contains data is altered to add a check constraint, enforcement of the constraint depends upon the value of the DB2 CURRENT RULES special register. If the value of the CURRENT RULES register is DB2 , check constraint enforcement is deferred during table alteration and the table is placed in a check pending state. If the value of the CURRENT RULES register is STD , check constraint enforcement is immediate. If no rows violate the constraint, the alteration proceeds normally. If existing rows do violate the constraint, the table is placed in a check pending state.

  2. When the LOAD utility is executed specifying the ENFORCE NO clause.

The syntax and JCL specification for checking check constraints is the same as that used for checking referential constraints.

LOB Reference Checking

The third function of the CHECK DATA option of the CHECK utility checks the status of LOB references. It is used to validate LOB columns against the LOB pointers to the auxiliary table. Before running CHECK DATA to check LOBs, be sure to first run CHECK LOB to ensure the validity of the LOB table space and run CHECK INDEX or REBUILD INDEX on the auxiliary table index to be sure it is valid.

CHECK DATA can be run against base table spaces only, not LOB table spaces.

The JCL in Listing 31.2 can be used to check data for the DB2 sample table that contains LOB columns.

Listing 31.2. CHECK DATA JCL (for LOB References)
 //DB2JOBU  JOB (UTILITY),'DB2 CHECK DATA',MSGCLASS=X,CLASS=X, //   NOTIFY=USER //* //**************************************************************** //* //*            DB2 CHECK DATA UTILITY //* //**************************************************************** //* //UTIL EXEC DSNUPROC,SYSTEM=DSN,UID='CHEKDATA',UTPROC=" //* //*  UTILITY WORK DATASETS //* //DSNUPROC.SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(5,1)) //DSNUPROC.SORTWK02 DD UNIT=SYSDA,SPACE=(CYL,(5,1)) //DSNUPROC.SORTOUT DD DSN=&&SORTOUT, //        UNIT=SYSDA,SPACE=(CYL,(5,1)) //DSNUPROC.SYSERR DD DSN=&&SYSERR, //        UNIT=SYSDA,SPACE=(CYL,(1,1)) //DSNUPROC.SYSUT1 DD DSN=&&SYSUT1, //        UNIT=SYSDA,SPACE=(CYL,(5,1)) //DSNUPROC.UTPRINT DD SYSOUT=X //* //*  UTILITY INPUT CONTROL STATEMENTS //*       This CHECK DATA statement checks DSN8810.EMP_PHOTO_RESUME //*       for LOB reference problems. //* //DSNUPROC.SYSIN    DD  *     CHECK DATA TABLESPACE DSN8D81L.DSN8S81B     SCOPE AUXONLY    AUXERROR REPORT /* // 

The SCOPE Parameter

The SCOPE parameter is used to set the scope of the rows in the table space that are to be checked. There are four SCOPE options:

PENDING

Indicates that only rows in CHECK PENDING status are to be checked for the specified table spaces, partitions, and tables. The referential integrity check, constraint check, and LOB column check are all performed. If this option is run on a table space that is not in CHECK PENDING status, the table space is ignored.

AUXONLY

Indicates that only the LOB column check is to be performed for table spaces having tables with LOB columns. The referential integrity and constraint checks are not performed.

ALL

Indicates that all dependent tables in the specified table spaces are to be checked. The referential integrity check, constraint check, and the LOB check are performed.

REFONLY

Indicates that all dependent tables in the specified table spaces are to be checked. However, only the referential integrity check and constraint check are performed. The LOB column check is not performed.


NOTE

PENDING is the default option if SCOPE is not specified.


The AUXERROR Parameter

The AUXERROR parameter is used to specify the action to take when LOB reference problems are encountered. There are two options:

REPORT

Indicates that the base table space is set to the auxiliary CHECK PENDING ( ACHKP ) status.

INVALIDATE

Indicates that the base table LOB column is set to an invalid status. The base table space is set to auxiliary warning ( AUXW ) status.


For both REPORT and INVALIDATE a LOB column check error is reported with a warning message.

NOTE

REPORT is the default option if AUXERROR is not specified.


CHECK DATA Phases

There are six phases of the CHECK DATA utility:

UTILINIT

Sets up and initializes the CHECK DATA utility.

SCANTAB

Extracts keys by index or table space scan and places them in the SYSUT1 DD.

SORT

Sorts the foreign keys using the SORTOUT DD (if the foreign keys were not extracted using an index).

CHECKDAT

Compares the extracted foreign keys to the index entries for the corresponding primary key. This phase also issues error messages for invalid foreign keys.

REPORTCK

Copies the invalid rows to the specified exception table and then deletes them from the source table if the DELETE YES option was chosen .

UTILTERM

Performs the final utility cleanup.


Estimating CHECK DATA Work Data Set Sizes

graphics/v7_icon.gif

The CHECK DATA utility requires the use of work data sets to accomplish referential constraint checking. Of course, as of DB2 V7 you can set up IBM DB2 utilities to allocate the required data sets dynamically. But this is not required; you may still choose to explicitly allocate utility work data sets. If you choose to do so, the following formulas can help you estimate the sizes of the work data sets required by the CHECK DATA utility. These calculations provide estimated data set sizes. More complex and precise calculations are in the DB2 Utility Reference manual. The formulas presented here, however, produce generally satisfactory results.


 

 SYSUT1 = (size of the largest foreign key + 13) x (total number of rows in the table to be graphics/ccc.gif checked) x (total number of foreign keys defined for the table) 

NOTE

If any number is 0, substitute 1.


 

 SORTOUT = (size of SYSUT1) SORTWKxx = (size of SORTOUT) x 2 SYSERR = (number of estimated referential constraint violations) x 60 

NOTE

Allocate at least one cylinder to the SYSERR data set.


After calculating the estimated size, in bytes, for each work data set, convert the number into cylinders , rounding up to the next whole cylinder. Allocating work data sets in cylinder increments enhances the utility's performance.

CHECK DATA Locking Considerations

The CHECK DATA utility can run concurrently with the following utilities: DIAGNOSE , MERGECOPY , MODIFY , REPORT , and STOSPACE .

CHECK DATA , when run specifying DELETE NO , will drain write claim classes for the table space and indexes being processed . When DELETE YES is specified, all claim classes are drained for the table space and indexes impacted.

When CHECK DATA is run against an individual partition, DB2 also drains the write claim class for the logical partition of the (type 2) indexes impacted if DELETE NO is specified. If DELETE YES is specified, DB2 drains all claim classes for the logical partition of the (type 2) indexes being acted upon. Regardless of the other options specified, if the FOR EXCEPTION option is specified, the table space containing the exception table (and any indexes) will have all claim classes drained.

CHECK DATA Guidelines

Use CHECK DATA to Ensure Data Integrity

Favor the use of the CHECK DATA utility to reset the check pending status on DB2 table spaces. CHECK DATA is the only way to verify, in an automated fashion and on demand, that DB2 table data is referentially intact and that the data conforms to all check constraints. The alternate methods of resetting the check pending status are as follows:

  • Running the REPAIR utility, specifying SET NOCHECKPEND for the appropriate table spaces

  • Issuing the START DATABASE command, specifying ACCESS(FORCE)

Neither option ensures data integrity.

Another valid way to reset the check pending status is with the LOAD utility, specifying the ENFORCE CONSTRAINTS option. However, this requires a sequential data set suitable for loading, and this type of data set is not readily available for most application table spaces. Even if a load data set is available, the data it contains might be out of date, and thus of little benefit.

Use SCOPE PENDING

Specify the SCOPE PENDING option when executing the CHECK DATA utility to reduce the amount of work the utility must perform. With the SCOPE PENDING option, CHECK DATA checks only the rows that need to be checked for all tables in the specified table space. This means that only data in check pending is checked. If the table space is not in check pending, the CHECK DATA utility issues a message and terminates processing. This is the most efficient way to execute the CHECK DATA utility because it minimizes runtime by avoiding unnecessary work. The alternative is to specify SCOPE ALL , which checks all dependent tables in the specified table spaces.

Run CHECK DATA When Data Integrity Is Questionable

Execute CHECK DATA after the following:

  • Loading a table without specifying the ENFORCE CONSTRAINTS option.

  • A check constraint is added to a table and data within an existing row of that table violates the constraint.

  • A table is altered to add a check constraint and the CURRENT RULES special register contains DB2 .

  • When row violations are encountered by the CHECK DATA utility using the DELETE NO option.

  • The partial recovery of table spaces in a referential set.

Both situations result in DB2 placing the loaded or recovered table spaces into a check pending status. The CHECK DATA utility is necessary to ensure referentially sound data and to remove the check pending status, permitting future data access.

Bypass CHECK DATA Only When Data Integrity Is Verifiable

After a full recovery of all table spaces in a referential set, you might want to bypass the execution of the CHECK DATA utility. Depending on the order in which the recovery took place, some table spaces are placed in a check pending status. If you have followed the COPY guidelines presented in this book, however, the full recovery of a table space set is referentially sound. In this case, the REPAIR utility specifying the SET NOCHECKPEND option can be used instead of CHECK DATA , because CHECK DATA would be a waste of time.

Define Exception Tables for Tables That Require CHECK DATA

An exception table stores the rows that violate the referential constraint being checked. An exception table should be identical to the table being checked but with the addition of two columns: one column identifies the RID of the offending row, and the other identifies a TIMESTAMP that indicates when the CHECK DATA utility was run.

These two columns can have any name as long as it isn't the same name as another column in the table. The names used in the following example are recommended because they clearly identify the column's use. To avoid ambiguity, use the same column names for all exception tables. The exception table can be created using the following DDL statements:

 

 CREATE TABLE   DSN8810.DEPT_EXCPTN   LIKE DSN8610.DEPT; ALTER TABLE   DSN8810.DEPT_EXCPTN   ADD   RID         CHAR(4); ALTER TABLE   DSN8810.DEPT_EXCPTN   ADD   CHECK_TS    TIMESTAMP; 

The exception table does not need to be empty when the CHECK DATA utility is run because the TIMESTAMP column identifies which execution of CHECK DATA inserted the offending rows.

Do not create a unique index for any exception table. A unique index could cause the CHECK DATA utility to fail because of the insertion of non-unique key values. Non-unique indexes should not pose a problem.

Place the exception tables in a segmented table space. You also can place multiple exception tables in a single segmented table space.

Use DELETE YES for Optimum Automation

Rows that violate the referential constraint can be deleted from the table being checked if the DELETE YES parameter was specified. This is often the preferred method of executing the CHECK DATA utility in a production environment because the elimination of constraint violations is automated. If the deleted rows are needed, they can be retrieved from the exception table.

If DELETE NO is specified instead of DELETE YES , the CHECK DATA utility does not reset the check pending flag, but the rows in violation of the constraint are identified for future action.

A problem can occur, however, when you run the CHECK DATA utility with the DELETE YES option. When a row is deleted from the dependent table, it could cause cascading deletes to one or more dependent tables. This may result in valid data being deleted if the violation is caused by a missing primary key in a parent table. For this reason, you might want to avoid the DELETE YES option. At any rate, exercise caution when checking data with DELETE YES .

Be Aware of Inconsistent Indexes

If rows that appear to be valid are deleted, ensure that the indexes defined for the dependent and parent tables are valid. If data in either index is invalid, the CHECK DATA utility might indicate referential constraint violations, even though there are none. Indexes can be checked for validity using the CHECK INDEX utility (discussed in the next section).

Also, ensure that the parent table contains all expected data. If rows are missing because of improper deletions or partial loads, CHECK DATA will delete the foreign key rows as well (if DELETE YES was specified).

Consider Checking at the Partition Level

CHECK DATA can be executed at the partition level. Choosing to check at the partition level provides the following benefits:

  • Pinpoint integrity checking can be performed. If the user has a good idea which partition has a data integrity problem, CHECK DATA can be run on that partition only.

  • A regularly scheduled CHECK DATA pattern can be established, whereby a single partition is checked daily (or weekly). This establishes a data-integrity checking process that eventually checks the entire table, but not so frequently as to cause availability problems.

Rerun CHECK DATA After an Abend

The CHECK DATA utility cannot be restarted. If it abends during execution, determine the cause of the abend, terminate the utility, and rerun it. Common causes for CHECK DATA abends are lockout conditions due to concurrent data access and changes to the table being checked (for example, new columns), without corresponding changes to the exception table.

Do Not Use CHECK DATA for Informational Referential Constraints

The CHECK DATA utility cannot be used to check informational referential constraints . An informational referential constraint is a DB2 DDL constraint that is added as information only (usually to support automatic query rewrite for materialized query tables).

An information referential constraint documents a constraint that is managed programmatically, not by DB2 declarative RI.

NOTE

Some third-party ISVs offer CHECK utilities that can be used to manage user-defined referential constraints. If you are using a CHECK utility other than IBM's, consult the vendor manual to determine whether it handles informational referential constraints.


 <  Day Day Up  >  


DB2 Developers Guide
DB2 Developers Guide (5th Edition)
ISBN: 0672326132
EAN: 2147483647
Year: 2004
Pages: 388

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net