User-Defined Data Types and Strong Typing

< Day Day Up >

User -Defined Data Types and Strong Typing

User-defined data types allow you to create custom data types based on existing DB2 data types. UDTs can be beneficial when you need specific data types geared toward your organization's data processing requirements. One example where UDTs may prove useful is to define new data types for foreign currencies, for example,

 CREATE DISTINCT TYPE AUSTRALIAN_DOLLAR AS DECIMAL(11,2); CREATE DISTINCT TYPE EURO AS DECIMAL(11,2); CREATE DISTINCT TYPE US_DOLLAR AS DECIMAL(11,2); CREATE DISTINCT TYPE JAPANESE_YEN AS DECIMAL(15,2);

DB2 enforces strong typing on user-defined data types. Strong typing prohibits non-defined operations between different types. For example, the following operation will not be allowed due to strong typing:

 TOTAL = AUSTRALIAN_DOLLAR + EURO

Strong typing, in this case, helps us to avoid an error. Adding two different currencies together, without converting one currency to the other, will always result in nonsense data. Think about it: You cannot add a handful of Australian coins with a handful of U.S. coins and come up with anything meaningful (or, perhaps more importantly, spendable).

Consider another example where UDFs have been defined to convert currency amounts. If a specific conversion function is defined that accepts US_DOLLAR data types as input, you would not want to accept other currencies as input. Doing so would most likely cause the UDF to convert the currency amount incorrectly. For example, consider the UDF USDTOYEN() created as follows :

 CREATE FUNCTION USDTOYEN(US_DOLLAR)   RETURNS JAPANESE_YEN . . .

This function accepts a US_DOLLAR amount and converts it to JAPANESE_YEN . Consider the problems that could occur if, instead of a US_DOLLAR input, an AUSTRALIAN_DOLLAR amount was allowed to be specified. Without strong typing, the function would use the conversion routines for US_DOLLAR and arrive at the wrong JAPANESE_YEN amount for the input argument, which was actually specified as an AUSTRALIAN_DOLLAR . With strong typing, the function will reject the request as an error.

When using UDTs you can define only those operations that are pertinent for the UDT. For example, not all numbers should be available for math operations like addition, subtraction, multiplication, and division. A Social Security number, for instance, should always be numeric, but never needs to participate in mathematical equations. Other examples include credit card numbers, account numbers, and vehicle identification numbers . By assigning these types of data items to UDTs you can eliminate operations that do not make sense for the data type.

To summarize, strong typing ensures that only functions, procedures, comparisons, and assignments that are defined for a data type can be used.

User-Defined Distinct Types and LOBs

One of the most important uses for UDTs is to better define the contents of LOB columns. LOB columns allow large multimedia objects, such as audio, video, and large text documents, to be stored in DB2 columns . DB2 supports three types of LOB data types:

BLOB Binary large object
CLOB Character large object
DBCLOB Double-byte character large object

For more details on DB2's object/relational support, refer to Chapter 9, "Large Objects and Object/Relational Databases." For the purposes of this chapter, it is sufficient to know that these types of columns can be created to house complex, unstructured data.

Let's look at a quick example. Suppose you want to create a DB2 table that contains an audio data column. You could define the column as a BLOB , such as in the following statement:

 CREATE TABLE  userid  .MOVIE   (MOVIE_ID       INTEGER      NOT NULL,    MOVIE_NAME     VARCHAR(50)  NOT NULL,    MOVIE_REVIEW   BLOB(1M),    ROW_ID         ROWID GENERATED ALWAYS) IN  database.tablespace  ;

NOTE

A ROWID must appear in every table that contains a BLOB column or a UDT based on a BLOB data type. The role of the ROWID data type is explained further in Chapter 9.

However, this does not help us to know that the column contains audio. All we know is that the column contains a BLOB ”which might be audio, video, graphic, and so on. We might surmise from the column name that the contents are audio, but it might be a video review. To rectify the potential confusion, you can create a UDT of type AUDIO as follows:

 CREATE DISTINCT TYPE AUDIO AS BLOB(1M);

Then create the table specifying the column as the new UDT, instead of just as a BLOB . In fact, you could also create a video user-defined data type and store the actual video contents of the movie in the table as well, as shown in the following:

 CREATE DISTINCT TYPE VIDEO AS BLOB(2G); CREATE TABLE  userid  .MOVIE   (MOVIE_ID       INTEGER      NOT NULL,    MOVIE_NAME     VARCHAR(50)  NOT NULL,    MOVIE_REVIEW   AUDIO,    MOVIE          VIDEO) IN  database.tablespace  ;

This table DDL is much easier to read and understand than if both the MOVIE_REVIEW and the MOVIE columns were defined only as BLOB s.

The AUDIO and VIDEO UDTs that you created can now be used in the same way that you use DB2's built-in data types.

Using UDTs for Business Requirements

Another good use of UDTs is to take advantage of strong typing in applications. Remember that strong typing means that only those functions, comparisons, and assignments that are defined for a particular UDT can be executed. How is this an advantage? Consider the scenario where two table are defined, one containing an INTEGER column named SHOE_SIZE , and the other table containing an INTEGER column named IQ_RATING . Because both are defined as INTEGER data types, it is permissible to compare SHOE_SIZE to IQ_RATING . There really is no reason to permit this, and the results will be meaningless. To disable this ability, you could create two UDTs as follows:

 CREATE DISTINCT TYPE SHOESIZE AS INTEGER; CREATE DISTINCT TYPE IQ AS INTEGER DECIMAL(11,2);

The SHOE_SIZE column can then be created as a SHOESIZE data type, and the IQ_RATING column can be created as the IQ data type. Then it will be impossible to compare the two columns because they are of different data types and DB2 enforces strong typing. Furthermore, when UDTs are used as arguments to functions, that function must be defined to accept that UDT. For example, if you needed to determine average shoe sizes, the AVG function could not be used. But, you could create a sourced UDF that accepts the SHOESIZE data type as input, as shown in the following:

 CREATE FUNCTION AVG(SHOESIZE)   RETURNS INTEGER   SOURCE SYSIBM.AVG(INTEGER);

NOTE

The built-in functions are within the SYSIBM schema.

An alternative to creating sourced functions is to use casting functions in your expressions. Casting allows you to convert a source data type into a target data type. Whenever a UDT is created, two casting functions are created: one to convert the UDT to the base data type, and another to convert the base data type to the new UDT. For example, when we created two UDTs named SHOESIZE and IQ , four casting functions were created as follows:

`IQ(INTEGER)`	Accepts an `INTEGER` and converts it to `IQ`
`INTEGER(IQ)`	Accepts an `IQ` and converts it to `INTEGER`
`SHOESIZE(INTEGER)`	Accepts an `INTEGER` and converts it to `SHOESIZE`
`INTEGER(SHOESIZE)`	Accepts a `SHOESIZE` and converts it to `INTEGER`

The casting functions have the same names as the target data types. These casting functions are created automatically by DB2 behind the scenes. You do not need to do anything in order for them to exist other than to create the UDT.

So, to use casting functions to provide an average SHOE_SIZE , you could code the following instead of creating a sourced AVG function:

 SELECT AVG(INTEGER(SHOE_SIZE))...

You must understand, though, that strong typing applies not only to user-defined functions and built-in scalar and column functions, but also to DB2's built-in operators, also referred to as the infix operators . These are plus ( + ), minus ( - ), multiply ( * ), divide ( / ), and concatenation ( or CONCAT ). It is best to create sourced functions for these operations, instead of casting, if you want to use them with UDTs ”for example

 CREATE FUNCTION '+' (SHOESIZE, SHOESIZE)   RETURNS SHOESIZE   SOURCE SYSIBM. '+' (INTEGER, INTEGER);

NOTE

The built-in operators are within the SYSIBM schema.

Without this sourced function, it would not be possible to add two SHOESIZE columns using SQL. This is probably fine, because there is no real need to add two shoe sizes together in the real world. Of what possible value would the result be? So, it is best to create sourced functions only for those built-in infix operators that make sense and are required for business reasons.

For example, it would be wise to create sourced infix operator functions for the AUSTRALIAN_DOLLAR , EURO , US_DOLLAR , and JAPANESE_YEN data types we discussed earlier, because it makes sense to add, subtract, multiply, and divide currencies. Using sourced functions is easier and more effective than casting. Consider which is easier

 USD_AMT1 * USD_AMT2

 DECIMAL(USD_AMT1) * DECIMAL(USD_AMT2)

Clearly the first alternative is better.

This same problem does not exist for comparison operators, because the CREATE DISTINCT TYPE statement has a clause to automatically create comparison operators for UDTs. The clause is WITH COMPARISONS . For example, if the EURO UDT is defined as follows

 CREATE DISTINCT TYPE EURO AS DECIMAL(11,2) WITH COMPARISONS

You will be able to use the following comparison operators on columns defined as the EURO data type:

 BETWEEN NOT BETWEEN IN NOT IN IS NULL IS NOT NULL >     >=     > <     <=     < =     <>     =

Always specify the WITH COMPARISONS clause when creating UDTs unless

The UDT is based on a BLOB , CLOB , or DBCLOB .

or
The UDT is not based on VARCHAR or VARGRAPHIC and has a length greater than 255 bytes.

Assigning Values and UDTs

When assigning values to columns, the value must be of the data type of the column or of a compatible data type. For UDTs, you must use the casting functions to assign values to columns of a UDT.

For example, if you wanted to assign the value of a column named YEN_AMT , which is defined as a JAPANESE_YEN , to a column named EURO_AMT , which is defined as EURO , you must create a UDF that converts yen to euros. If one does not exist, you cannot assign a yen amount to the EURO column. Think about it: If the value of YEN_AMT was 50,000, simply assigning 50,000 to the EURO_AMT would result in a lot more value because 50,000 yen is a much smaller amount of money than is 50,000 euros.

The bottom line is that a conversion UDF is required for assignment of one UDT data type to another UDT data type.

If you are using host variables , DB2 makes the task a bit easier. You can assign the value of a column defined as a UDT to a host variable, if DB2 allows you to assign the underlying source data type to the host variable. For example, consider that a host variable, :HV-1 , is defined in a COBOL program as a valid DECIMAL (that is PIC S9(7)V9(2) COMP-3 ). You can assign a column of type US_DOLLAR , USD_AMT , to a host variable underlying data type of US_DOLLAR is DECIMAL(9,2) . For example,

 SELECT  USD_AMT INTO :HV-1 FROM    TAB-USD;

But, when you assign a value in a host variable to a column defined as a UDT, the type of the host variable can be cast to the UDT. So, a host variable defined as PIC S9(7)V9(2) COMP-3 in a COBOL program is fine for the USD_DOLLAR column. However, if the host variable were of a non-compatible data type, the assignment would fail. So the following statement is valid only if :HV-1 is defined appropriately:

 INSERT INTO TAB-USD VALUES (. . ., :HV-1, . . .) ;

User-Defined Distinct Type Guidelines

The following guidelines can be used to help you implement effective and efficient user-defined functions for your organization.

Naming User-Defined Functions

The rules for naming user-defined distinct types are similar to those for naming user-defined functions. However, the UDT name in combination with the schema name must be unique. DB2 will not allow a UDT to be created if the schema and UDT name matches another existing data type.

Furthermore, the name of the user-defined distinct type cannot be any of the following system defined key words:

`ALL`	`AND`
`ANY`	`BETWEEN`
`DISTINCT`	`EXCEPT`
`EXISTS`	`FALSE`
`FOR`	`FROM`
`IN`	`IS`
`LIKE`	`MATCH`
`NOT`	`NULL`
`ONLY`	`OR`
`OVERLAPS`	`SIMILAR`
`SOME`	`TABLE`
`TRUE`	`TYPE`
`UNIQUE`	`UNKNOWN`
`=`	`=`
`<`	`<=`
`>`	`>=`
`<`	`>`
`<>`

Comparing UDTs to Base Data Types

DB2 does not let you compare data of a UDT to data of its source type. However, you can compare a UDT to its source data type by using a cast function. For example, to compare a US_DOLLAR column, namely USD_AMT , to a DECIMAL column, namely DEC_AMT , you could use the following SQL:

 WHERE  USD_AMT > US_DOLLAR(DEC_AMT)

Cast Constants and Host Variables to UDTs

Constants and host variables will not be defined as user-defined distinct types. They will generally be specified in the underlying base data type of the UDT. For example, consider a UDF that is created to convert JAPANESE_YEN to EURO values. This UDF might look like the following:

 CREATE FUNCTION CONVERT_YEN_EURO(JAPANESE_YEN)     RETURNS EURO   EXTERNAL NAME 'YENEURO'   PARAMETER STYLE DB2SQL   LANGUAGE C;

This UDF will accept only the JAPANESE_YEN data type as an input parameter. To use this UDF with a host variable or constant, you must use casting functions. For example, to convert 50,000 Japanese yen to euros, you could call the UDF with the following code:

 CONVERT_YEN_EURO(JAPANESE_YEN(50000.00)

In this case, the underlying base data type as defined for JAPANESE_YEN is DECIMAL(11,2) . The same basic idea can be used for host variables, substituting the host variable name for the constant 50000.00 .

For dynamic SQL statements, if you want to use a parameter marker with a UDT, you can cast it to the data type of the UDT as follows:

 WHERE CAST (? AS US_DOLLAR) > USD_AMT

Of course, you also could code the inverse of this operation as follows:

 WHERE ? > DECIMAL(USD_AMT)

Using `UNION` with UDTs

DB2 enforces strong typing of UDTs in UNION statements. When you use a UNION to combine column values from several tables, the columns still must be UNION -compatible as described in Chapter 1, "The Magic Words." Recall that union compatibility means that for the two sets of columns being unioned

The two sets must contain the same number of columns.
Each column of the first set must be either the same data type as the corresponding column of the second set or convertible to the same data type as the corresponding column of the second set.

So, if you were to UNION data from a USD_SALES table and a YEN_SALES table, the following statement would not work because the data types are not compatible:

 SELECT  YEN_AMT FROM    YEN_SALES UNION SELECT  USD_AMT FROM    USD_SALES;

Instead, you would have to ensure that the amounts were cast to the same data type. This can be done by using the automatic casting functions built by DB2 when the UDTs were created, or by using UDFs you may have created for converting currencies. A valid example using the casting functions follows:

 SELECT  DECIMAL(YEN_AMT) FROM    YEN_SALES UNION SELECT  DECIMAL(USD_AMT) FROM    USD_SALES;

The results are all returned as decimal values. However, the results may not be useful because you will not know which amounts represent yen and which represent U.S. dollars. It would be better to use conversion functions to convert one currency to the other, for example, creating a UDF named CONVERT_YEN_USD to convert yen amounts to U.S. dollar amounts and using it as follows:

 SELECT  CONVERT_YEN_USD(YEN_AMT) FROM    YEN_SALES UNION SELECT  USD_AMT FROM    USD_SALES;

In this case, the results are all returned in U.S. dollars. This makes the results easier to interpret and understand.

< Day Day Up >