A large object is a data type used by DB2 to manage unstructured data. DB2 provides three built-in data types for storing large objects:
Actually, the three LOB data types can be used to store 1 byte less than 2 gigabytes of data.
BLOBs, CLOBs, and DBCLOBs are collectively referred to as LOBs. The three LOB data types are designed to
efficiently
store and access large amounts of unstructured data. DB2 understands that it is expensive to move and manipulate large objects. Therefore, LOBs are treated differently than the other standard built-in data types.
LOBs are not stored in the same structure as the rest of the data in the DB2 table. Instead, the table contains a descriptor that points to the actual LOB value. The LOB value itself is stored in separate LOB table space in an auxiliary table.
Application programs are written using LOB locators. A LOB locator represents the value of a LOB but does not actually contain the LOB data. This method is used because LOBs are typically very large and therefore expensive in terms of the resources required to move and store the LOB value. By using LOB locators, programs can avoid the expense associated with materializing LOBs.
When LOBs are created, the DBA can specify whether LOBs are to be logged or not. Once again, because LOBs are very large, logging them can be quite expensive and
consume
a large amount of log storage.
LOB
Columns
Versus
VARCHAR
and
VARGRAPHIC
Columns
It has been possible to store multimedia data in DB2 databases since Version 1 using
VARCHAR
and
VARGAPHIC
columns. But these data types provide limited functionality and
usefulness
when compared to LOBs.
The maximum
size
of a
VARCHAR
or
VARGRAPHIC
column is 32KB. This limitation may not pose a problem for smaller databases, but modern (often Web-enabled) applications usually require larger multimedia data. A 32KB text document is not a very large document at all. And 32KB is miniscule when it comes to storing multimedia data types such as audio, video, graphics, and images.
NOTE
One of the biggest considerations when using LOBs is their size. DBAs and developers who decide to use LOBs will need to understand that even small or simple graphics and text files can consume a large amount of space.
For
comparative
purposes, consider that the document used to produce this chapter is approximately 95KB in size. In practice, many business documents are much larger.
Once again, for comparative purposes, the file used to produce Figure 2.3 (in Chapter 2) is approximately 39KB in size. And the graphic contained in that file is quite simple compared to many other types of business graphics.
When you are sure that the text or graphic you wish to store will always consume less than 32KB of storage, then you can use a
VARCHAR
or
VARGRAPHIC
data type instead of one of the LOB data types. However, LOB data types might still be preferable because of the efficient manner in which they are handled by DB2. Remember,
VARCHAR
and
VARGRAPHIC
data is stored with the rest of the data in the table space, as opposed to LOB data, which is stored in an auxiliary LOB table space.
Creating Tables That Contain LOB Columns
There are four basic steps required to create and populate a table that uses LOB data types.
The first step is to define the appropriate columns in the DB2 table. Define one
ROWID
column and as many LOB columns as needed. Only one
ROWID
column is required regardless of the number of LOB columns you specify. The
ROWID
and LOB columns are defined using the
CREATE TABLE
or
ALTER TABLE
statement. The definition of the LOB column must specify whether the column is a BLOB, CLOB, or DBCLOB. Furthermore, you must specify a size for the LOB. Failure to specify a size causes DB2 to use the following default:
-
For BLOBs
” 1 MB (or 1,048,576 bytes)
-
For CLOBs
” 1,048,576 single byte characters
-
For DBCLOBs
” 524,288 double-byte characters
NOTE
Regardless of the length you specify, a BLOB column is stored as a long string column of varying length.
The LOB column in the DB2 table will contain only information about the LOB, not the actual data value. The table containing the LOB definition is referred to as the base table.
The
ROWID
column is used by DB2 to locate the LOB data. A
ROWID
is a unique 19-byte system generated value. If you are adding a LOB column and a
ROWID
column to an existing table, you must use two
ALTER TABLE
statements. Add the
ROWID
with the first
ALTER TABLE
statement and the LOB column with the second
ALTER TABLE
statement.
In the second step you will need to create a table and a table space to store the LOB data. The table is referred to as an auxiliary table; the table space is called a LOB table space. The base table can be in a partitioned table space but the LOB table space cannot be partitioned.
NOTE
The LOB table space must be created in the same database as the base table.
If the base table is not partitioned, you must create one LOB table space and one auxiliary table for each LOB column. If the table space containing the base table is partitioned, you must create one LOB table space and one auxiliary table for each partition, for each LOB. For example, if your base table has six partitions, you must create six LOB table spaces and six auxiliary tables for each LOB column. To further
illustrate
the base table to auxiliary table relationship, refer to Figure 9.1.
Figure 9.1. Base table to auxiliary table relationship for storing LOBs.
Use the
CREATE LOB TABLESPACE
statement to create LOB table spaces and the
CREATE AUXILIARY TABLE
statement to create auxiliary tables.
The third step is to create a unique index on the auxiliary table. Each auxiliary table must have exactly one index. The
CREATE INDEX
statement is used to create the auxiliary table index. Do not specify any columns for the index key. When a
CREATE INDEX
is issued against an auxiliary table DB2 will implicitly define the index key on the
ROWID
column.
The final step is to populate the LOB data into the table. Though we know that the actual LOB data is stored in an auxiliary table in a LOB table space and not in the base table, when you populate the LOB data you must reference the base table. If the total length of the LOB column and the base table row is less than 32KB, you can use the
LOAD
utility to populate the data into DB2. If the LOB column is greater in size you must use
INSERT
or
UPDATE
statements. When using
INSERT
to populate the LOB data you must ensure that your application has access to adequate storage to hold the entire LOB value that is to be inserted.
A Sample Table Using LOB Columns
Consider the steps you would need to take to add an org chart to the DSN8810.DEPT sample table. The org chart is a BLOB of no more than 5 megabytes in size. The first step would be to alter the table to add two columns: a
ROWID
column and a BLOB column. For example:
ALTER TABLE DSN8810.DEPT
ADD ROW_ID ROWID GENERATED ALWAYS;
COMMIT;
ALTER TABLE DSN8810.DEPT
ADD DEPT_ORG_CHART BLOB(5M);
COMMIT;
The
next
step would be to create the LOB table space and auxiliary table for the LOB column. For example:
CREATE LOB TABLESPACE TDORGCHT
IN DSN8D81A
LOG NO;
COMMIT;
CREATE AUXILIARY TABLE DEPT_ORGCHART_TAB
IN DSN8D81A. TDORGCHT
STORES DSN8810.DEPT
COLUMN DEPT_ORG_CHART;
COMMIT;
Following this you must create the index on the auxiliary table. Remember, you do not need to specify columns for the index key when an index is defined on an auxiliary table. The following SQL
CREATE
statement defines the auxiliary table index:
CREATE UNIQUE INDEX XDEPTORG
ON DEPT_ORGCHART_TAB;
COMMIT;
NOTE
If
BIND
parameter
SQLRULES
is set to
STD
, or if special register
CURRENT RULES
has been set to
STD
, DB2 will automatically create the LOB table space, auxiliary table, and auxiliary index when you issue the
ALTER TABLE
statement to add the LOB column.
Accessing LOB Data
LOB columns can be accessed using SQL just like other columns, in most cases. For example, you can code an SQL
SELECT
statement to retrieve the resume information stored in the
EMP_RESUME
column of the
DSN8810.EMP
table as
follows
SELECT EMPNO, EMP_RESUME
FROM DSN8810.EMP;
When embedding SQL in application programs you need to take the size of LOBs into consideration. By using a LOB locator you can manipulate LOB data without actually moving the data into a host variable. A LOB locator is a reference to the large object, and not the LOB data itself. Figure 9.2 illustrates this principle.
Figure 9.2. Using LOB locators.
A LOB locator is associated with a LOB data value or LOB expression, not with a row in a DB2 table or an actual physical storage location in a DB2 table space. So, once you
SELECT
the LOB value using a LOB locator the value of the locator should not change, but the actual value of the LOB might change.
DB2 provides two statements to work with LOB locators:
|
FREE LOCATOR
|
Removes the association between the LOB locator and its LOB value before a unit of work ends.
|
|
HOLD LOCATOR
|
Maintains the association between a LOB locator and its LOB value after the unit of work ends. After issuing the
HOLD LOCATOR
statement, the LOB locator will keep its association with the LOB data value until the program ends or
FREE LOCATOR
is issued.
|
NOTE
You cannot use
EXECUTE IMMEDIATE
with the
HOLD LOCATOR
or
FREE LOCATOR
statements when issuing dynamic SQL.
By using LOB locators your application programs will require significantly less memory than would be required if entire LOB values were returned to the program. The LOB locator can be returned from queries, inserted into new tables, and used by the application code like any other host variable. LOB locators enable the application to
defer
actually accessing the large object itself until the application needs the contents of that object.
You will need to
DECLARE
host variables to hold the LOB data or LOB locators. The host variables must be declared of SQL type BLOB, CLOB, or DBCLOB. DB2 will generate an appropriate declaration for the host language. For example, review Tables 9.1 and 9.2 for COBOL host variable declarations for LOB
variables
and LOB locators. In SQL statements you must refer to the LOB host variable or locator variable specified in the SQL type declaration. In host language statements (such as COBOL) you must use the variable generated by DB2.
Table 9.1. LOB Variable Declarations
|
Declared in the Program
|
Generated by DB2
|
01 BLOB-VAR USAGE IS
SQL TYPE IS BLOB(1M).
|
01 BLOB-VAR.
02 BLOB-VAR-LENGTH PIC 9(9) COMP.
02 BLOB-VAR-DATA.
49 FILLER PIC X(32767).
Repeat above line 31 times.
49 FILLER PIC X(32).
|
01 CLOB-VAR USAGE IS
SQL TYPE IS CLOB(40000K).
|
01 CLOB-VAR.
02 CLOB-VAR-LENGTH PIC 9(9) COMP.
02 CLOB-VAR-DATA.
49 FILLER PIC X(32767).
Repeat above line 1249 times.
49 FILLER PIC X(1250).
|
01 DBCLOB-VAR USAGE IS
SQL TYPE IS CLOB(40000K).
|
01 DBCLOB-VAR.
02 DBCLOB-VAR-LENGTH PIC 9(9) COMP.
02 DBCLOB-VAR-DATA.
49 FILLER PIC G(32767)
USAGE DISPLAY-1.
Repeat above 2 lines 1249 times.
49 FILLER PIC G(1250).
|
The size limitation for COBOL variables is 32,767 bytes. This is a limit of the COBOL compiler. That is why DB2 generates multiple declarations of 32,767 bytes until it
reaches
the 1M specification.
Table 9.2. LOB Locator Variable Declarations
|
Declared in the Program
|
Generated by DB2
|
01 BLOB-LOC USAGE IS
SQL TYPE IS BLOB-LOCATOR.
|
01 BLOB-LOC PIC S9(9) USAGE IS BINARY.
|
01 CLOB-LOC USAGE IS
SQL TYPE IS CLOB-LOCATOR.
|
01 CLOB-LOC PIC S9(9) USAGE IS BINARY.
|
01 DBCLOB-LOC USAGE IS
SQL TYPE IS DBCLOB-LOCATOR.
|
01 DBBLOB-LOC PIC S9(9) USAGE IS BINARY.
|
NOTE
The sizes of the LOBs you can declare and manipulate depend on the limits of the host language and the amount of storage available to your program. LOB host variables can be defined for the C, C++, COBOL, Assembler, PL/I, and Fortran programming languages.
LOB Materialization
When DB2 materializes a LOB it places the LOB value into contiguous storage in a data space. The amount of storage that is used in data spaces for LOB materialization depends on the size of the LOB data and the number of LOBs being materialized.
Because LOBs are usually quite large, LOB materialization should be avoided until it is
absolutely
required. DB2 will perform LOB materialization under the following circumstances:
-
When a LOB host variable is assigned to a LOB locator host variable in an application program
-
When a program calls a UDF that specifies a LOB as at least one of the arguments
-
When a LOB is moved into or out of a stored procedure
-
When a LOB is converted from one CCSID to another
By reducing the number of times you take these actions in your programs you can minimize LOB materialization and enhance the performance of applications that access LOB data. You cannot completely eliminate LOB materialization. However, using LOB locators you can minimize its impact on your applications.
LOBs and Locking
A lock that is held on a LOB value is referred to as a LOB lock.
When a row is read or modified in a table containing LOB columns the application will obtain a normal transaction lock on the base table. The locks on the base table also control concurrency for the LOB table space. When locks are not
acquired
on the base table, because of
ISOLATION(UR)
for example, DB2 maintains data consistency by using locks on the LOB table space.
Regardless of the isolation level, for other reasons DB2 also obtains locks on the LOB table space and the LOB values stored in that LOB table space. For more details on LOB locking refer to Chapter 23, "Locking DB2 Data."