Understanding Data Structures | DB2 for z/OS Version 8 DBA Certification Guide

A database is an organized collection of related objects. SQL is used throughout the database industry as a common method of issuing database queries. SQL is considered a language, composed of statements, functions, and data types. An SQL statement is used to access database objects using relational operations. Before examining the SQL language, you need to understand some DB2 terminology. This chapter will be referring to the basic data structures, or objects, that are defined for each DB2 database. These objects include

Tables: data types, aliases, and synonyms
Views
Materialized query tables
Indexes: keys
Table spaces
Index spaces
Sequence objects
Databases
Storage groups

Figure 4-1 shows the relationship among the DB2 objects.

Figure 4-1. A hierarchy of DB2 structures

A table is an unordered set of data records. It consists of columns and rows. Each column is based on a data type. Once created and populated with data, tables are referenced in the FROM clause of the SQL SELECT statements or in automatic query rewrite. Tables are of four types:

Permanent (base) tables
Auxiliary tables
Temporarydeclared or globaltables
Materialized query tables (MQTs)

This chapter discusses only tables created with a DDL statement. These tables are created using the CREATE TABLE, CREATE AUXILIARY TABLE, or CREATE GLOBAL TEMPORARY TABLE statement, and each is a logical representation of the way the data is physically stored on disk. This chapter also discusses how to define MQTs, but their use and the declared temporary tables will be discussed in Chapter 6.

Data types are used to define the type of data that will occupy each column in a DB2 table, and to provide the length of a column. Data that does not match the data type defined to the column will not be allowed in that column. Data types may be string, numeric, date, time, row IDs, LOB (large object), and user defined. Depending on the data type, the length can be fixed or varying. Data types can be categorized as built in or user defined.

Data types that are defined by DB2 are referred to as built in. User-defined data types allow you to define your own type of data to be stored in the column. These user-defined data types are based on existing built-in data types.

An alias is a pointer to another table and is a substitute for the three-part name of a table or a view. An alias can be qualified by an owner ID, and the table referenced can be on the local site or on a remote site. Aliases are often used in place of any three-part named table in order to make the references portable, as DB2 on z/OS is the only server that supports three-part name syntax. Aliases are not dropped if the table they point to is dropped.

Synonyms are used to refer to a table using a different name or to refer to another owner's table as if you were the owner. A synonym, a private pointer to a table, can be referenced only by its owner and is not allowed to be qualified.

Unlike an alias, a synonym can be used to refer only to a table in the same subsystem in which it was defined. If the table is dropped, so too is the synonym.

Views are alternative ways of viewing data in one or more base tables or other views. A view is not a physical object and does not store data. Rather, a view is a DB2 catalog entry that, when accessed, executes an SQL statement that retrieves data from other tables or views. Views can be used to limit access to certain kinds of data, create customized views of data for users, or allow for alterations of tables without affecting application programs. Views can be read only if necessary, or they can be updatable, insertable, or deleteable. Using views has some restrictions.

Materialized query tables (MQTs) are objects created to allow whole or parts of each query to be precomputed and then use computed results to answer future queries. MQTs provide the means to save the results of prior queries and then reuse the common query results in subsequent queries. This helps avoid redundant scanning, aggregating, and joins.

Indexes are ordered sets of pointers associated with a table. Indexes can be created on permanent tables or on declared temporary tables. The index is based on the values of one or more columns in the table. The index values can be either ascending or descending. Each table can have one or more indexes, and indexes can also be used to enforce uniqueness in the data values and clustering sequence. Indexes are used to improve performance when accessing data. If an index is used, large scans of data can be avoided.

Indexes are physical objects that are created in index spaces, which are data sets like those that support the table spaces. The index space is automatically created in the same database as the table on which it is defined.

Indexes are either partitioned or nonpartitioned. A partitioned index is created on a partitioned table space and can be primary or secondary. A nonpartitioned index is used on tables defined in segmented, simple table spaces or as an alternative index on a partitioned table space.

NOTE

The maintenance overhead of indexes will negatively impact the performance of INSERT, UPDATE, and DELETE statements.

Many keys can be defined on a table. A key is a named ordered set of columns on one table. A composite key is one made of several columns. Each key will play a particular role for a given table.

Table spaces are objects that consist of one or more data sets used to store DB2 data. Tables are created in table spaces, which can have one or many tables, depending on the type of table space defined.

Table spaces are of four types: simple, segmented, partitioned, and LOB. Both simple and segmented table spaces allow for multiple tables, but only segmented table spaces should be used, owing to the many performance issues.

Simple table spaces can contain one or more tables; however, the rows of the tables are not kept on separate pages, which can cause concurrency and space usage issues.
Segmented table spaces can contain one or more tables, and the space is divided into same-size segments. Each segment contains rows from only one table.
Partitioned table spaces divide the data into several separate data sets, or partitions. A partitioned table space can have only one table. Up to 4,096 partitions of up to 64GB each are allowed. Each partition can have different characteristics, such as volume placement and free space.
LOB table spaces are required to hold large-object data and are associated with the table space that holds the logical LOB column.

When an index is created, an index space also is created. An index space is a set of virtual storage access method (VSAM) linear data sets that hold index data. These index spaces are implicitly associated with the database that contains the table on which the index is defined.

A sequence object is a user-defined object that generates a sequence of numeric values according to the specifications in which it was created. These standalone objects provide an incremental counter generated by DB2.

Databases in DB2 for z/OS are collections of table spaces and index spaces. The database on the z/OS platform does not have any physical storage characteristics but acts more like an umbrella over all its dependent objects. This database allows for defaults to be established for the table space and indexes within in.

Storage groups are used to define a list of disk volumes to DB2 for its use when creating physical objects, such as table spaces and index spaces. The storage may be allocated when the tables are loaded with data or when they are created. Storage groups are created using DDL.