LOB Guidelines

< Day Day Up >

The following guidelines can be used to help you implement multimedia object/relational databases using LOBs with DB2.

Do Not Edit the `ROWID`

The ROWID columns required for implementing LOBs should be generated by DB2. The ROWID column cannot be modified. When supplying a value to a ROWID column in a LOAD , the value should have been previously generated by DB2 and then unloaded at some point. Do not attempt to create a ROWID value; they should always be generated by DB2 to ensure accuracy.

NOTE

You cannot use LOAD to load ROWID values if the ROWID column was created with the GENERATED ALWAYS clause.

Define the `ROWID` Columns As `NOT NULL`

A ROWID column cannot be null. You should explicitly indicate NOT NULL when defining the column. If you do not DB2 will implicitly create the column as NOT NULL . This is the exact opposite of what DB2 will do for other columns that do not specify NOT NULL . That is, when NOT NULL is specified, DB2 will make the column nullable (except for a ROWID column). This is another good reason for never relying upon defaults ”because defaults can be confusing.

Implement LOBs with Care

When LOB columns are implemented and populated , they can consume a large amount of space. Be absolutely certain that the large objects are required by your applications before storing them using DB2.

Using LOBs Versus `VARCHAR` and `VARGRAPHIC` Data Types

A column defined as a VARCHAR data type holds alphanumeric data. The size of a VARCHAR column can range from 1 byte to a maximum of 8 bytes less than the record size. The record size depends upon the size of the table space page and whether or not an EDITPROC is being used. Table 9.3 outlines the maximum size of a VARCHAR column based on the page size of the table space.

Table 9.3. Maximum size of a `VARCHAR` column

EDITPROC Used?	Page 4K	Page 8K	Page 16K	Page 32K
YES	4046	8128	16320	32704
NO	4056	8138	16330	32714

If the VARCHAR specifies a size greater than 255, it is considered a LONG VARCHAR column.

Similar to VARCHAR columns, VARGRAPHIC columns are variable in size, but are used to store binary data. The size of a VARGRAPHIC column can range from 1 byte to a maximum of 2 bytes less than one half the record size. If the VARGRAPHIC specifies a size greater than 127, it is considered a LONG VARGRAPHIC column.

In general, BLOBs are preferable to VARGRAPHIC data types, and CLOBs or DBCLOBs are preferable to large VARCHAR data types. The LOB data types can be used to store larger amounts of data and have less impact on the other data elements of the table because LOBs are not stored in the same physical table space.

However, for smaller amounts of data VARCHAR and VARGRAPHIC data types can be easier to implement, administer, and manage. When dealing with character data less than 255 bytes or graphic data less than 127 bytes, consider using VARCHAR and VARGRAPHIC data types.

Use LOBs with User -Defined Distinct Types

Usually, you should not create tables with the LOB data types. Instead, for each LOB you wish to store, create a user-defined distinct type to use. Failure to do so can make it difficult to understand the type of data being stored in the table. For example, if you wish to store audio sound bites, consider creating a UDT such as

 CREATE DISTINCT TYPE SOUND_BITE AS BLOB(1M)

Then, when you create the table, you can use the UDT. Instead of specifying BLOB(1M) as the data type you can specify SOUND_BITE , such as

 CREATE TABLE CAMPAIGN_DETAILS (CANDIDATE_LNAME     CHAR(40) NOT NULL,  CANDIDATE_FNAME     CHAR(25) NOT NULL,  ELECTION_YR         INTEGER,  SPEECH_SAMPLE       SOUND_BITE)

Isn't it easier to determine that the SPEECH_SAMPLE column is actually audio because you used the SOUND_BITE data type? If you specified the underlying type, BLOB(1M) instead, it might be a photo or a movie of the candidate delivering the speech or some other binary object.

Use LOB Locators to Save Program Storage

By using LOB locators instead of directly accessing the actual LOB data, you can manipulate LOB data in your programs without retrieving the data from the DB2 table. This is a good technique because it reduces the amount of storage required by your program.

Defer Evaluation of LOB Expressions

LOB data is not moved until the program assigns a LOB expression to a target destination. So, when you use a LOB locator with string functions and operators, you can create an expression that DB2 does not evaluate until the time of assignment. This is called deferring evaluation of a LOB expression. Deferring evaluation can improve LOB I/O performance.

Use the Sample LOB Applications Shipped with DB2

DB2 ships with several sample applications that use LOB columns. Use these samples as templates to assist you when writing applications that use LOB columns. The sample applications include

`DSN8DLPL`	A C program that uses LOB locators and `UPDATE` statements to move binary data into a column of type BLOB.
`DSN8DLRV`	A C program that uses a LOB locator to manipulate CLOB data.
`DSNTEP2`	The dynamic SQL sample program written in PL/I allocates an SQLDA for rows that include LOB data and uses that SQLDA to describe a SQL statement and fetch data from LOB columns.

Consider Using UDFs to Limit LOB Overhead

You can create and use UDFs designed to return only a portion of a LOB, thereby limiting the transfer of LOB data to only the portion that the application requires. This can greatly reduce the amount of traffic required between a client application and the database server.

For example, consider a query designed to return a CLOB column, for example the EMP_RESUME column in DSN8810.EMP . The CLOB column contains character text. You can select the CLOB_LOCATOR into a host variable and then use the POSSTR() function to find the offset of a specific string within the CLOB. The CLOB_LOCATOR in the host variable is passed as the argument to POSSTR() . Finally, the SUBSTR() function can be used to select a portion of the CLOB.

Use Indicator Variables with LOB Locators for Nullable LOBs

DB2 uses indicator variables differently for LOB locators. When you SELECT a column that is null into a host variable (other than a LOB locator) an associated indicator variable is assigned a negative value to indicate that this column is set to null. But DB2 uses indicator variables a little differently for LOB locators because a LOB locator can never be null.

When you SELECT a LOB column using a LOB locator, and the LOB column is set to null, DB2 will assign a negative value to the associated indicator variable. But the LOB locator value does not change.

So, when using use LOB locators to retrieve data from columns that can contain nulls, always define indicator variables for the LOB locators. After fetching data into the LOB locators check the indicator variables. If the indicator variable indicates that the LOB column is null, do not use the value in the LOB locator. It is not valid because the LOB column is null.

Avoid Logging Large LOBs

Because LOBs are typically very large, logging changes to LOBs can become quite inefficient. Avoid logging by specifying LOG NO when creating the LOB table space. The default is LOG YES . If the size of the LOB is greater than 1MB, favor specifying LOG NO . For smaller LOBs, specifying LOG NO still can be beneficial, though.

When LOG NO is specified for the LOB table space, changes to the LOB column are not written to the log. LOG NO has no effect on a commit or rollback operation; the consistency of the database is maintained regardless of whether the LOB value is logged. When LOG NO is specified, changes to system pages and to the auxiliary index are still logged.

Another consideration when deciding whether or not to log LOBs is how frequently the LOB column is updated. For LOBs that are rarely or never changed, logging does not provide any advantage. Many LOB columns, such as graphics, are just replaced entirely when changes need to be made.

Keep in mind, though, that whenever logging is turned off, some form of recoverability still needs to be established. This can be as simple as retaining the input documents somewhere so they can be re-inserted to the LOB column in case of a database problem.

Isolate LOBs in Their Own Buffer Pool

Take special care when assigning LOBs to a buffer pool. Use a bufferpool that is not shared with other, non-LOB data. Additionally, assign the deferred write threshold ( DWQT ) to for the LOB bufferpool(s).

For LOBs that are not logged, changed pages are written at COMMIT . With DWQT set to , the writes will be processed in the background, continually, instead of all at once when committed. For LOBs that are logged, setting DWQT to avoids huge amounts of data being written at DB2 checkpoints.

For more information on buffer pools and DWQT , refer to Chapter 28, "Tuning DB2's Components."