Schemas, Collections, Libraries, and Databases | IBM i5/iSeries Primer(c) Concepts and Techniques for Programmers, Administrators, and Sys[... ]ators

Non-i5 systems often allow for the existence of multiple databases. Each database contains a discrete set of related tables, with no inherent connection between the various databases. The schema provides a logical structure to the objects within the database, creating subgroups to avoid duplicate names or other confusion.

The i5 traditionally did not allow for multiple named DB2 databases on the same system (although these became available on V5R2 through the use of independent disk pools). Schemas are implemented as libraries within the DB2 on the i5, so that all objects within the same schema are also within the same library.

Collection is another name for a schema, and it is implemented in the same manner. This term was used heavily by IBM in the past, although recently it has been replaced with the more universally acceptable term "schema."

Typically, all the schemas and collections created on the i5 server exist within the same database. Pay close attention to users’ library lists to avoid accidentally accessing objects in the wrong schema. Within SQL, you may identify the specific schema being used by preceding the object name with the schema and a period, as shown here:

      SELECT * FROM MYSCHEMA.MYTABLE

Depending on which tool you are using to execute the SQL statement and its configuration, you may need to use the system naming convention instead of the SQL convention shown above. In that case, you would use a ‘/’ as the separator, as shown below:

      SELECT * FROM MYSCHEMA/MYTABLE

The current default schema can be changed using the syntax shown here:

      SET SCHEMA MYSCHEMA

Other ways allow you to more permanently set the default schema. The user profile can also identify the default library (schema) to use. Within the GUI SQL Scripting tool, the JDBC setup function provides the same option. It is good that SQL provides the ability to explicitly name the schema, but if you are doing so in every SQL statement you create, you will soon grow tired of all the prefixing!

As each schema is created, a catalog is created for that schema, to help track and manage the objects within the schema. Along with the catalog, the system also creates two objects for use in journaling, QSQJRN and QSQJRN0001.

Catalogs

When you create an SQL schema, the system also creates a number of management objects within the schema. Table 25.1 list these objects.

Table 25.1: System Catalog Files
Table	Description
QSQJRN0001	Journal receiver
QSJRN	Journal
SYSCATALOGS	Relational databases
SYSCHKCST	Check constraints
SYSCOLUMNS	Column attributes
SYSCST	All constraints
SYSCSTCOL	Columns used in a constraint
SYSCSTDEP	Constraint dependencies on tables
SYSINDEXES	Indexes
SYSKEYCST	Keys (unique, primary, and foreign key)
SYSKEYS	Index keys
SYSPACKAGE	Packages
SYSREFCST	Referential constraints
SYSTABLES	List of tables and views
SYSTRIGCOL	Columns used in a trigger
SYSTRIGDEP	Objects used in a trigger
SYSTRIGGERS	Triggers
SYSTRIGUPD	Columns in the WHEN clause of a trigger
SYSVIEWDEP	View dependencies on tables
SYSVIEWS	View definitions

Each of the "SYS" tables listed above are actually views. The system maintains a master catalog in the library QSYS2. So, each of the tables listed here exists in QSYS2, and each schema contains a group of views over the system tables that selects only those entries that are relevant for their own schema.

Bearing that in mind, if you want to review information about all the tables in the system, issue a command such as:

      SELECT * FROM QSYS2/SYSTABLES

Or, if you only want to see data for a specific schema, use:

      SELECT * FROM your-schema/SYSTABLES

And, if you only want to see data for the current schema, use:

      SELECT * FROM SYSTABLES

Tables

Traditional databases use physical files to contain the data used by the applications. In the SQL terminology used on most computer systems, these files are called tables. The tables are made up of a list of columns that define the data items (fields) held within the table. As the application runs, rows (records) are added to the table. Tables within SQL can become fairly complex, but well start with the simpler versions.

To create a sample inventory transaction table, issue a command such as:

      CREATE TABLE INVTRAN      (ITEM CHAR (7),      LOC CHAR (10),      QTY DEC (7, 2),      TDATE DATE,      TTIME TIME)

The table INVTRAN is created in your current library on the i5, with five columns. The file is initially empty; use INSERT statements to load data into the table. By default, the table has no indexes built over it. Indexes can be built over the table using the CREATE INDEX command discussed later in this chapter.

Each column has a specific definition, including a data type and possibly a length. Commonly used data types are displayed in Table 25.2.

Table 25.2: SQL Data Types
INT/INTEGER	Large integer
FLOAT	Floating point
NUMERIC (length, scale)	Zoned decimal format; uses one byte per digit, with the sign on the last half byte
DEC/DECIMAL (length, scale)	Packed decimal format; uses one half byte per digit and sign
VARCHAR (length)	Variable length character field
DATE	Eight-digit date
TIME	Six-digit time
TIMESTAMP	Combination of date and time including a six-digit microsecond
DOUBLE	Double-byte floating point
CLOB (length)	Character large object
BLOB (length)	Binary large object
DBCLOB (length)	Double byte character large object
DATALINK	URL
User-defined ()	A user-defined data type

Views

A view is similar to the logical file typically used in the traditional databases on the i5. Its role is to provide a predefined grouping or redefinition of the columns within a table. For example:

      CREATE VIEW BDVIEW AS        ( SELECT NAME,        BMONTH CONCAT '/' CONCAT BDAY CONCAT '/' CONCAT BYEAR        AS BIRTHDATE        FROM EMPMAST)

As you can see, the view simply defines a SELECT statement as a new entity within the database. No actual object exists for the view; no overhead is incurred by its existence. Any number of views may be created without reducing system performance. Views are useful in that they allow the database administrator to define various complex instructions and conversions once for a given file. So, in the example above, any user who needs the complete birth date from the Employee Master File could use BDVIEW to retrieve the birth date, rather than write the lengthy concatenation string themselves. Over time, this saves users hours of frustration.

Another type of view to consider is the summary view. The summary view uses column functions to provide information such as totals, averages, maximums, and the like. Having views predefined with this information already available makes certain complex SQL statements much easier to code. For example, we might choose to create a summary view over the Inventory File that looks like this:

      CREATE VIEW INVTOT AS        (SELECT ITEM_NUMBER,        SUM(INVENTORY_QUANTITY) AS TOTAL        FROM INVMAST GROUP BY ITEM_NUMBER )

This simple example defines the view INVTOT as having the total on-hand quantity for any given item. This total amount recalculates automatically each time the table is referenced. In many applications, a master file holds the current on-hand balance; hence, a significant programming effort is required to keep that figure in sync with the detail that supports it. The technique of using a summary view allows developers to greatly reduce the amount of programming and CPU effort required to maintain the "total" columns within master files by letting the database do it itself. If you perform far more reads than updates on those columns, however, there could be a performance cost associated to implementing this technique. As always, test and evaluate your decisions in database design.

Aliases and Multimember Processing

Occasionally, traditional databases will make use of something know as multimember processing. There is nothing new about a "member": Virtually every database has at least one member. These members represent a discrete set of records within the table. Some databases refer to these members as data sets. Typically, data sets are used to logically separate distinct groups of records to improve performance. For example, by bundling all the records for one year in the same member within the history file, inquiries for that year run much faster than if they had to sort through the entire history file.

In certain cases, this kind of processing may still be warranted, but in general, I believe that the advent of EVIs and better query optimization allow us to get away from multimember processing. If you decide that multimember processing is appropriate for your application, however, or if you are simply forced to deal with it against your will, rest assured that SQL does provide a simple mechanism for this process.

By default, SQL expects that all database files will have only one member. You can identify specific members within a table using an alias. An alias simply defines a pointer from a distinct named object to a specific member within a table. DB2 UDB for the i5 actually implements this as a DDM file that points back to the local system. A DDM file is a tool that was often used for remote database communications many years ago; there is no need to deal with the DDM file itself. Simply issue the CREATE statement as shown:

      CREATE ALIAS MYALIAS FOR MYSCHEMA/MYFILE (MYMEMBER)

Once created, the alias exists as a permanent object within the database. Any number of aliases may be defined over a table. Each one should have its own unique name, and it must reference a member different from the rest.