Types of User-Defined Functions

 <  Day Day Up  >  

Types of User -Defined Functions

There are two ways of creating a user-defined function: You can code your own function program from scratch, or you can edit an existing function.

Two types of user-defined functions can be written from scratch: scalar functions and table functions . Recall from Chapter 3, "Using DB2 Functions," that scalar functions are applied to a column or expression and operate on a single value. Table functions are a different type of function that, when invoked, return an entire table. A table function is specified in the WHERE clause of a SELECT statement taking the place of a table, view, synonym, or alias.

Scalar and table user-defined functions are referred to as external functions because they are written and developed outside of (or external to) DB2. External UDFs must be written in a host programming language. DB2 user-defined functions can be written in Assembler, C, C++, COBOL, Java, or PL/I.

A third type of user-defined function can be created from another existing function. This is a sourced function. A sourced function is based on a function that already exists ”it can be based on a built-in function or another user-defined function that has already been created. A sourced function can be based on an existing scalar or column function. So, external functions are coded from scratch, whereas sourced functions are created based on a pre-existing function (see Figure 4.1).

Figure 4.1. Sourced versus external UDFs.

graphics/04fig01.gif


User-defined functions are similar in functionality to application subroutines, but user-defined functions are different because they can be used inside SQL statements. In fact, the only way that user-defined functions can be executed is from within an SQL statement. This gives them great power. A user-defined function is not a substitute for an application subroutine, subprogram, or procedure. Instead, user-defined functions are used to extend the functionality of the SQL language.

The Schema

User-defined functions, user-defined distinct types, stored procedures, and triggers all are associated with a schema. By default, the schema name is the authid of the process that issues the CREATE FUNCTION , CREATE DISTINCT TYPE , CREATE PROCEDURE , or CREATE TRIGGER statement.

A schema, therefore, is simply a logical grouping of procedural database objects (user-defined functions, user-defined distinct types, stored procedures, and triggers).

You can specify a schema name when you create a user-defined function, type, or trigger. If the schema name is not the same as the SQL authorization ID, the issuer of the statement must have either SYSADM or SYSCTRL authority, or the authid of the issuing process must have the CREATEIN privilege on the schema.

For example, the following statement creates a user-defined function named NEWFUNC in the schema named MYSCHEMA :

 

 CREATE FUNCTION MYSCHEMA.NEWFUNC ... 

If the MYSCHEMA component was not included in the CREATE statement, the schema would default to the authid of the person (or process) that executed the CREATE statement. In short, the schema is set to the owner of the function. If the CREATE statement was embedded in a program, the owner is the authid of the owner of the plan or package; if the statement is dynamically prepared, the owner is the authid in the CURRENT SQLID special register.

Creating User-Defined Functions

Before using DDL to create a user-defined function, the function program should be coded and prepared. This requires the developer to write the program, precompile, compile, link-edit the program, BIND the DBRM for the program (if the function contains SQL statements), and then test the program to be sure it is free of bugs .

Then, before the user-defined function can be used, it must be registered to DB2 using the CREATE FUNCTION DDL statement. For example, assume that you have written a user-defined function program. Further assume that the program returns the number of days in the month for a given date. The following is a simplified version of the CREATE FUNCTION statement that could be used to register the UDF to DB2:

 

 CREATE FUNCTION DAYSINMONTH(DATE)   RETURNS INTEGER   EXTERNAL NAME 'DAYMTH'    LANGUAGE COBOL; 

This statement creates a UDF named DAYSINMONTH , with one parameter of DATE data type, that returns a single value of INTEGER data type. The external name for the function program is DAYMTH and it is coded in COBOL.

NOTE

Most of the parameters have been omitted from this simple CREATE FUNCTION example. The parameters available for the CREATE FUNCTION statement are discussed in depth later in this chapter.


After the user-defined function has been created, and the appropriate authority has been GRANT ed, the UDF can be used in an SQL statement as follows :

 

 SELECT EMPNO, LASTNME, BIRTHDATE, DAYSINMONTH(BIRTHDATE) FROM   DSN8810.EMP WHERE  DAYSINMONTH(BIRTHDATE) < 31; 

The result of this statement would be a list of employees whose birth date falls in a month having fewer than 31 days (that is, February, April, June, September, and November). This assumes that the program for the user-defined function DAYSINMONTH is correctly coded to examine the date specified as input and return the actual number of days in the month.

There are many different characteristics that need to be determined and specified when creating UDFs. Table 4.1 outlines the characteristics and whether each applies to external functions, sourced functions, or both.

Table 4.1. Characteristics of DB2 User-Defined Functions

Characteristic

Definition

Validity

UDF name (Input parameter types)

Name of the UDF and its B parameters.

B

RETURNS [TABLE]

Output parameter types.

B

SPECIFIC

Specific name.

B

graphics/v7_icon.gif PARAMETER CCSID

Encoding scheme for string parameters: ASCII, EBCDIC, or UNICODE.

B

EXTERNAL NAME

Name of the UDF program.

X

LANGUAGE

Programming language used to write the UDF program.

X

[NOT] DETERMINISTIC

Whether the UDF program is deterministic.

X

...SQL...

Whether or not SQL is issued in the UDF program and if SQL modifies or just reads DB2 data. Valid options are NO SQL , MODIFIES SQL DATA , READS SQL DATA , or CONTAINS SQL .

X

SOURCE

Name of the source function.

S

PARAMETER STYLE

The linkage convention used by the UDF program. DB2 SQL indicates parameters for indicator variables are associated with each input value and the return value to allow for NULL s.

X

FENCED

The UDF code cannot run in the DBMS address space.

X

...NULL...

Whether the function is called if any input arguments are NULL at execution time.

X

[NO] EXTERNAL ACTION

Whether the UDF performs an action that changes the state of objects that DB2 does not manage (such as files).

X

[NO] SCRATCHPAD

Whether or not a scratchpad is used to save informationfrom one invocation of the UDF to the next .

X

[NO] FINAL CALL

Whether or not a final call is made to the UDF program to free system resources.

X

[DIS]ALLOW PARALLEL

Whether or not parallel processing is permitted.

X

[NO] COLLID

Package collection ID of UDF package. NO indicates same as calling program.

X

WLM ENVIRONMENT

The name of the WLM environment.

X

ASUTIME [NO] LIMIT

CPU resource limit for an invocation of a UDF.

X

STAY RESIDENT

Whether or not the UDF load module stays in memory.

X

PROGRAM TYPE

Whether the program runs as a main routine ( MAIN ) or as a subroutine ( SUB ).

X

SECURITY

Type of security to beused: DB2 , USER , or DEFINER .

X

graphics/v7_icon.gif SPECIAL REGISTERS

Indicates whether values of special registers are inherited or defaulted.

X

graphics/v7_icon.gif STATIC DISPATCH

At function resolution time, DB2 chooses a function based on the static (or declared) types of the function parameters.

X

RUN OPTIONS

LE/370 runtime options.

X

CARDINALITY

An estimate of the expected number of rows returned by a table function.

X

[NO] DBINFO

Whether or not an additional argument is passed when the UDF is invoked.

X


NOTE

If the Validity column of Table 4.1 contains the value X, the characteristic applies to external functions only; S means it applies to sourced functions only; and B means it applies to both external and sourced functions.


UDF Programming Language Options

DB2 offers several application programming languages in which UDF programs can be written. The following languages are supported by DB2 for UDF creation:

  • Assembler

  • C

  • C++

  • COBOL

  • graphics/v7_icon.gif Java (new as of DB2 V7)

  • PL/I

DB2 also supports SQL scalar functions that are written using only SQL functionality.

Keep in mind, too, that all UDF programs must be designed to run in IBM's Language Environment (LE/370).

How Functions Are Executed

User-defined functions run in WLM-managed stored procedure address spaces. To execute a user-defined function, simply reference the function in an SQL statement. The SQL statement can be issued dynamically or statically, as part of an application program or via ad hoc SQL ” anywhere SQL can be run, the UDF can be coded. External UDFs require WLM running in GOAL mode.

When a function is invoked in an SQL statement, DB2 must choose the correct function to run to satisfy the request. DB2 will check for candidate functions to satisfy the function request. The manner in which DB2 chooses which function to run is based on the following criteria:

  • First of all, the schema must match. If the function being invoked is fully qualified, the schema must match for the function to be considered a candidate for execution. If the function being invoked is not fully qualified, DB2 will check the SQL path of the invoking process to find a function with a matching schema.

  • Of course, the name must match the function being invoked for the user-defined function to be considered a candidate for execution.

  • The number of parameters for the user-defined function must match the number of parameters specified by the invoked function. Additionally, the data type of each parameter must match, or be promotable to, the data types specified for each parameter in the function being invoked. Refer to Table 4.2 for a list of which data types are promotable to other data types. The data types in the first column can be promoted to the data types in the second column. When performing function resolution, the earlier the data type in the second column appears, the more preferable it is to the other promotable data types.

    To clarify this requirement, consider the following example:

     

     SELECT XSCHEMA.FUNCX(COLA) FROM   TABLE; 

    The data type of COLA is SMALLINT . Furthermore, two user-defined functions have been created named FUNCX , both in the same schema, XSCHEMA . Both FUNCX UDFs require one parameter, but one is defined with an INTEGER data type and the other with a data type of REAL . The SMALLINT data type is promotable to both INTEGER and REAL , but, because INTEGER appears first in the promotion list, the FUNCX with the INTEGER parameter will be used instead of the one with the REAL parameter.

  • The appropriate authority must exist. That is, the invoking authid must have the authority to execute the user-defined function.

  • Finally, the timestamp of the BIND for the user-defined function must be older than the timestamp of the BIND for the package or plan that invokes the function.

NOTE

For a function that passes a transition table, the data type, length, precision, and scale of each column in the transition table must match the data type, length, precision, and scale of each column of the table specified in the function definition.


For unqualified UDFs, it is possible that two or more candidate functions will fit equally well. In this case, the user-defined function whose schema name is earliest in the SQL path will be chosen for execution.

For example, suppose functions XSCHEMA.FUNC1 and YSCHEMA2.FUNC1 both fit the function resolution criteria equally well. Both have the same function name but different schema names . Both also fit the rest of the criteria regarding number of parameters, parameter data types, and requisite authority. If the SQL path is

 

 "ZSCHEMA"; "YSCHEMA"; "SYSPROC"; "SYSIBM"; "XSCHEMA"; 

DB2 will select function YSCHEMA.FUNC1 because YSCHEMA is before XSCHEMA in the SQL path.

The SQL path is specified to DB2 in one of two ways. The SQL path is determined by the CURRENT PATH special register for dynamically prepared SQL statements. For dynamic SQL, the SQL path can be set by issuing the SET CURRENT PATH statement. The PATH parameter of the BIND and REBIND commands is used to specify the SQL path for SQL containing UDFs in plans and packages.

DB2 supports function overloading. This means that multiple functions can have the same name and DB2 will decide which one to run based on the parameters. Consider an example where an application developer writes a UDF that overloads the addition operator + . The UDF is created to concatenate text strings together. The + function is overloaded because if it is acting on numbers it adds them together; if the function is acting on text it concatenates the text strings.

Table 4.2. Data Type Promotability

Data Type

Can Be Promoted to

CHAR or GRAPHIC

CHAR or GRAPHIC

 

VARCHAR or VARGRAPHIC

 

CLOB or DBCLOB

VARCHAR or VARGRAPHIC

VARCHAR or VARGRAPHIC

 

CLOB or DBCLOB

CLOB or DBCLOB

CLOB or DBCLOB

BLOB

BLOB

SMALLINT

SMALLINT

 

INTEGER

 

DECIMAL

 

REAL

 

DOUBLE

INTEGER

INTEGER

 

DECIMAL

 

REAL

 

DOUBLE

DECIMAL

DECIMAL

 

REAL

 

DOUBLE

REAL

REAL

 

DOUBLE

DOUBLE

DOUBLE

DATE

DATE

TIME

TIME

TIMESTAMP

TIMESTAMP

ROWID

ROWID

UDT

UDT (with the same name)


The process of following these steps to determine which function to execute is called function resolution .

NOTE

When automatic rebind is invoked on a package or plan that contains UDFs, DB2 will not consider any UDF created after the original BIND or REBIND was issued. In other words, only those UDFs that existed at the time of the original BIND or REBIND are considered during function resolution for plans and packages bound as a result of automatic rebind.


DSN_FUNCTION_TABLE and EXPLAIN

You can use EXPLAIN to obtain information about DB2 function resolution. To use EXPLAIN to obtain function resolution information, you must create a special table called DSN_FUNCTION_TABLE . When EXPLAIN is executed and UDFs are used, DB2 will store function resolution details for each UDF in the statement, package, or plan, in DSN_FUNCTION_TABLE .

Information will be populated in DSN_FUNCTION_TABLE when you execute an EXPLAIN on an SQL statement that contains one or more UDFs, or when you run a program whose plan is bound with EXPLAIN(YES) , and the program executes an SQL statement that contains one or more UDFs.

NOTE

EXPLAIN actually can be used to return a lot more information about SQL statements, including the actual access paths used by DB2 to run the SQL. Refer to Chapter 25, "Using EXPLAIN," for an in-depth exploration of using EXPLAIN and interpreting its results.


Remember, you must create a table named DSN_FUNCTION_TABLE before you can use EXPLAIN to obtain function resolution details. A sample CREATE statement for this table follows:

 

 CREATE TABLE  userid  .DSN_FUNCTION_TABLE   (QUERYNO        INTEGER      NOT NULL WITH DEFAULT,    QBLOCKNO       INTEGER      NOT NULL WITH DEFAULT,    APPLNAME       CHAR(8)      NOT NULL WITH DEFAULT,    PROGNAME       CHAR(8)      NOT NULL WITH DEFAULT,    COLLID         CHAR(18)     NOT NULL WITH DEFAULT,    GROUP_MEMBER   CHAR(8)      NOT NULL WITH DEFAULT,    EXPLAIN_TIME   TIMESTAMP    NOT NULL WITH DEFAULT,    SCHEMA_NAME    CHAR(8)      NOT NULL WITH DEFAULT,    FUNCTION_NAME  CHAR(18)     NOT NULL WITH DEFAULT,    SPEC_FUNC_NAME CHAR(18)     NOT NULL WITH DEFAULT,    FUNCTION_TYPE  CHAR(2)      NOT NULL WITH DEFAULT,    VIEW_CREATOR   CHAR(8)      NOT NULL WITH DEFAULT,    VIEW_NAME      CHAR(18)     NOT NULL WITH DEFAULT,    PATH           VARCHAR(254) NOT NULL WITH DEFAULT,    FUNCTION_TEXT  VARCHAR(254) NOT NULL WITH DEFAULT   ) IN  database.tablespace  ; 

After executing EXPLAIN for an SQL statement that uses a UDF, each of these columns will contain information about the UDF chosen during function resolution. The actual definition of the information contained in the columns of DSN_FUNCTION_TABLE is shown in Table 4.3.

Table 4.3. DSN_FUNCTION_TABLE Columns

Column Name

Description

QUERYNO

Indicates an integer value assigned by the user issuing the EXPLAIN or by DB2. Enables the user to differentiate between EXPLAIN statements.

QBLOCKNO

Indicates an integer value enabling the identification of subselects or a union in a given SQL statement. The first subselect is numbered 1 ; the second, 2 ; and so on.

APPLNAME

Contains the plan name for rows inserted as a result of running BIND PLAN specifying EXPLAIN(YES) . Contains the package name for rows inserted as a result of running BIND PACKAGE with EXPLAIN(YES) . Otherwise , contains blanks for rows inserted as a result of dynamic EXPLAIN statements.

PROGNAME

Contains the name of the program in which the SQL statement is embedded.

COLLID

Contains the collection ID for the package.

GROUP_MEMBER

Indicates the member name of the DB2 that executed EXPLAIN . The column is blank if the DB2 subsystem was not in a data sharing environment when EXPLAIN was executed.

EXPLAIN_TIME

Contains a TIMESTAMP value indicating when the EXPLAIN that created this row was executed.

SCHEMA_NAME

Contains the name of the schema for the invoked function.

FUNCTION_NAME

Contains the name of the UDF to be invoked.

SPEC_FUNC_NAME

Contains the specific name of the UDF to be invoked.

FUNCTION_TYPE

Contains a value indicating the type of function to be invoked:

 

SU

Scalar Function

 

TU

Table Function

VIEW_CREATOR

If the function is referenced in a CREATE VIEW statement, this column contains the creator name for the view. If not, the column is left blank.

VIEW_NAME

If the function is referenced in a CREATE VIEW statement, this column contains the name for the view. If not, the column is left blank.

PATH

Contains the value of the SQL path at the time DB2 performed function resolution for this statement.

FUNCTION_TEXT

Contains the first 100 bytes of the actual text used to invoke the UDF, including the function name and all parameters.


CAUTION

For UDFs specified in infix notation, the FUNCTION_TEXT column of DSN_FUNCTION_TABLE will contain the function name only. For example, suppose that * is UDF in the following reference:

 

 COL1*COL6 

In this case, the FUNCTION_TEXT column will contain only the value * and not the entire reference ( COL1*COL6 ).


Table Functions

Table functions are different in nature from scalar functions. A table function is designed to return multiple columns and rows. Its output is a table. An example using a table function follows:

 

 SELECT WINNER, WINNER_SCORE, LOSER, LOSER_SCORE FROM   FOOTBALL_RESULTS(5) WHERE  LOSER_SCORE = 0; 

In this case, the table function FOOTBALL_RESULTS() is used to return the win/loss statistics for football games . The table function can be used in SQL statements, just like a regular DB2 table. The function program is designed to fill the rows and columns of the "table." The input parameter is an INTEGER value corresponding to the week the game was played ; if is entered, all weeks are considered. The previous query would return all results where the losing team was shut out (had 0 points) during the fifth week of the season .

The following or similar CREATE FUNCTION statement could be used to define the FOOTBALL_RESULTS() function:

 

 CREATE FUNCTION FOOTBALL_RESULTS(INTEGER)  RETURNS TABLE (WEEK INTEGER,                 WINNER CHAR(20),                 WINNER_SCORE INTEGER,                 LOSER CHAR(20),                 LOSER_SCORE INTEGER)  EXTERNAL NAME FOOTBALL    LANGUAGE C  PARAMETER STYLE DB2SQL    NO SQL    DETERMINISTIC  NO EXTERNAL ACTION    FENCED    SCRATCHPAD    FINAL CALL    DISALLOW PARALLEL    CARDINALITY 300; 

The key parameter is the RETURNS TABLE parameter, which is used to define the columns of the table function. The function program must create these rows itself or from another data source, such as a flat file.

The value supplied for the CARDINALITY parameter is only an estimate. It is provided to help DB2 optimize statements using the table function. It is possible to return more or fewer rows than is specified in CARDINALITY .

Sourced Functions

Sourced functions are created from already existing built-in (scalar and column) and user-defined (scalar) functions. The primary reason to create a sourced function is to enable functions for user-defined distinct data types. This is required because DB2 implements strong typing.

More information on sourced functions and strong typing is provided later in this chapter in the section "User-Defined Data Types and Strong Typing." For now, though, the following is an example of creating a sourced UDF:

 

 CREATE FUNCTION FINDWORD (DOCUMENT, VARCHAR(50))   RETURNS INTEGER   SPECIFIC FINDWORDDOC SOURCE SPECIFIC FINDWORDCLOB; 

In this example, a new function, FINDWORD , is created from an existing function, FINDWORDCLOB . The function finds the location of the supplied word ( expressed as a VARCHAR(50) value) in the supplied DOCUMENT . The function returns an INTEGER indicating the location of the word in the DOCUMENT . DOCUMENT is a user-defined type based on a CLOB data type.

User-Defined Function Guidelines

The following guidelines can be used to help you implement effective and efficient user-defined functions for your organization.

Naming User-Defined Functions

The rules for naming user-defined functions are somewhat complex. The UDF name can be the same as another UDF, even if it is in the same schema. However, to give one function the same name as another function in the same schema, the number of parameters and the data type of the parameters must differ . DB2 will not allow a UDF to be created if the schema, UDF name, number of parameters, and data type of each parameter match another existing UDF.

Furthermore, the name of the user-defined function cannot be any of the following system defined key words:

ALL

AND

ANY

BETWEEN

DISTINCT

EXCEPT

EXISTS

FALSE

FOR

FROM

IN

IS

LIKE

MATCH

NOT

NULL

ONLY

OR

OVERLAPS

SIMILAR

SOME

TABLE

TRUE

TYPE

UNIQUE

UNKNOWN

=

=

<

<=

>

>=

<

>

<>

 

External UDF Program Restrictions

When you develop programs for external user-defined functions, DB2 places certain restrictions on the type of services and functions that can be used. Keep the following restrictions in mind as you code your external UDF programs:

  • COMMIT and ROLLBACK statements cannot be issued in a user-defined function. The UDF is part of the unit of work of the issuing SQL statement.

  • RRSAF calls cannot be used in user-defined functions. DB2 uses the RRSAF as its interface to user-defined functions. Therefore, any RRSAF calls made within the UDF code will be rejected.

  • If your user-defined function does not specify either the EXTERNAL ACTION or SCRATCHPAD parameter, the UDF may not execute under the same task each time it is invoked.

  • All open cursors in user-defined scalar functions must be closed before the function completes, or DB2 will return an SQL error.

  • The host language that is used to write UDF programs can impose restrictions on UDF development as well. Each programming language has its own restrictions and limits on the number of parameters that can be passed to a routine in that language. Be sure to read the programming guide for the language being used (before you begin coding) to determine the number of parameters allowed.

  • WLM running in GOAL mode is required for UDFs.

NOTE

The limitation on the number of parameters for the programming language to be used can impact table UDFs because table functions often require a large number of parameters (that is, at least one output parameter for every column of the table).


Starting and Stopping UDFs

Starting and stopping external UDFs can be an administrative burden . When UDFs fail, they will be stopped (unless you specify otherwise when the UDF is created). In order for any application to call the UDF again, it must first be started again. You should develop administrative procedures such that appropriate personnel have the authority to start and stop UDFs.

To start a UDF, you must issue the following DB2 command:

 

 -START FUNCTION SPECIFIC 

This will cause the named UDF to be activated. You do not need to issue a START command for new UDFs; DB2 will automatically activate a new UDF the first time an SQL statement invokes the function. For example, the following command starts the function XFUNC in the schema XSCHEMA :

 

 -START FUNCTION SPECIFIC(XCHEMA.XFUNC) 

CAUTION

The -START FUNCTION SPECIFIC command will not refresh the Language Environment in the WLM-established stored procedure address space. In order to refresh the Language Environment to establish a new load module for a UDF, you must issue the following WLM command:

 

 VARY WLM, APPLENV=applenv,REFRESH 


Similarly, you can issue the -STOP FUNCTION SPECIFIC command to stop a UDF. Stopping a UDF prevents DB2 from accepting SQL statements that invoke that function. Any SQL statements that have already been queued or scheduled by DB2 will continue to run. This command will not prevent these SQL statements from running.

UDF administration is complicated by the fact that only users with one of the following authorities (or qualities) can execute the START and STOP FUNCTION SPECIFIC commands:

  • The owner of the UDF

  • SYSOPR authority

  • SYSCTRL authority

  • SYSADM authority

CAUTION

Remember, these commands (both START FUNCTION SPECIFIC and STOP FUNCTION SPECIFIC ) apply to external UDFs only; built-in functions and sourced UDFs cannot be stopped and started using DB2 commands.


Starting and Stopping All Functions

Issuing either the -START FUNCTION SPECIFIC or the “STOP FUNCTION SPECIFIC command without any arguments starts, or stops, all UDFs known to the system. For example

 

 -START FUNCTION SPECIFIC 

starts all UDFs for the DB2 subsystem on which the command was issued.

Exercise Control Over UDF Stopping

In Version 8, DB2 for z/OS gives you greater control over UDF execution and utilization. You can now specify for each stored procedure or UDF the maximum number of failures that are allowed before DB2 stops the routine. Previously, when a UDF failed, DB2 stopped it, resulting in administrative overhead ”particularly in a test environment.

By specifying the most appropriate value for an individual routine, you can let some routines continue to be invoked for development and debugging and stop other routines for maintenance before they cause problems in a production environment.

The available options that can be specified using CREATE or ALTER FUNCTION are

  • STOP AFTER FAILURE : ” Indicating that the UDF is to be stopped every time it fails

  • STOP AFTER n FAILURES : ” Indicating that the UDF is to be stopped only after every n th failure

  • CONTINUE AFTER FAILURE : ” Indicating that the UDF is not to be stopped when it fails

These options should only be specified on external UDFs, not on sourced UDFs.

Keep It Simple

Each user-defined function program should be coded to perform one and only one task. The UDF program should be as simple as possible while still performing the desired task. Do not create overly complex UDF programs that perform multiple tasks based on the input. It is far better to have multiple UDFs, each performing one simple task, than to have a single, very complex UDF that performs multiple tasks . The UDF program will be easier to code, debug, understand, and maintain when it needs to be modified.

Use DSN_FUNCTION_TABLE

To be sure that the right UDF is being chosen during function resolution, use EXPLAIN to populate DSN_FUNCTION_TABLE . It is only by reading the contents of DSN_FUNCTION_TABLE that you can ascertain which UDF was chosen for execution by DB2 during function resolution.

Promote UDF Reuseability

User-defined functions should be developed with reuseability in mind. After the UDF has been coded and registered to DB2, it can be shared by multiple applications. It is wise to code your UDFs such that they perform simple, useful tasks that can be used by many applications at your site.

Reusing UDFs in multiple applications is better than creating multiple UDFs having the same (or similar) functionality for each application. You should promote reuseability while at the same time keeping the UDF code as simple as possible.

Promote Consistent Enterprisewide SQL

Consider replicating UDFs across subsystems in order to keep the SQL language consistent across the enterprise. This might add a few maintenance steps; however, the gain is a more mobile development staff.

Handle UDF Abends

When an external UDF abends, the invoking statement in the calling program receives an error code, namely SQLCODE -430 . The unit of work containing the invoking statement must be rolled back. The calling program should check for the “430 SQLCODE and issue a ROLLBACK when it is received.

Furthermore, the UDF might need to be restarted depending on the setting of the [STOP CONTINUE] AFTER FAILURE option for the UDF.

Invoke UDFs Using Qualified Names

Use the qualified name of a function in the invoking SQL statement. By doing so, you simplify function resolution. DB2 will only search for functions in the specific schema you code. Therefore, DB2 is more likely to choose the function you intend, and the function resolution process will take less time to complete, because fewer functions will qualify as candidates.

Consider Using SET CURRENT PACKAGE PATH

The new special register, CURRENT PACKAGE PATH , was added with DB2 Version 8. It is particularly useful for those organizations that use more than one collection for packages. In prior releases, applications that do not use plans must issue the SET CURRENT PACKAGE PATH statement each time a package from a different collection is used. With the CURRENT PACKAGE PATH special register, an application programmer can specify a list of package collections in one SET CURRENT PACKAGE PATH statement.

Using SET CURRENT PACKAGE PATH can reduce network traffic, simplify application coding, and result in improved processing and elapsed time.

CAST Parameters to the Right Data Type

Use the CAST function to cast the parameters of the invoked UDF to the data types specified in the user-defined function definition. This assists the function resolution process to choose the correct function for execution.

For example, consider a sample UDF named TAXAMT . It requires one input parameter, which is defined as DECIMAL(9,2) . If you want to pass a column defined as INTEGER to the UDF, use the CAST function as follows to cast the value of the integer column to a DECIMAL(9,2) value:

 

 SELECT TAXAMT(CAST (INT_COL AS DECIMAL(9,2))) FROM   TABLE; 

Define UDF Parameter Data Types Efficiently

Avoid defining UDF parameters using the following data types: CHAR , GRAPHIC , SMALLINT , and REAL . Instead, use VARCHAR , VARGRAPHIC , INTEGER , and DOUBLE , respectively.

To clarify this guideline, consider a UDF named FUNCX that is defined with a parameter of data type SMALLINT . To invoke this UDF, the parameter must be of data type SMALLINT . Using a data type of INTEGER will not suffice. For example, the following statement will not resolve to FUNCX , because the constant 500 is of type INTEGER , not SMALLINT :

 

 SELECT FUNCX(500) FROM   TABLE; 

The same line of thinking applies to CHAR , GRAPHIC , and REAL data types. Of course, you could use the CAST function as described previously to resolve this problem. But it is better to avoid the problem altogether by specifying VARCHAR , VARGRAPHIC , INTEGER , and DOUBLE as parameter data types instead.

Choose Parameter Data Types for Portability

If you need to ensure that your UDFs are portable across platforms other than DB2 for z/OS, avoid defining UDFs with parameter data types of FLOAT or NUMERIC . Instead, use DOUBLE or REAL in place of FLOAT , and DECIMAL in place of NUMERIC .

UDFs Do Not Require Parameters

It is possible to code user-defined functions that have no parameters. However, when creating and executing the UDF, you still need to specify the parentheses with no value supplied for a parameter. For example, to create a procedure named FLOWERS() that requires no parameters, you should code the following:

 

 CREATE FUNCTION FLOWERS(). . . 

Similarly, to execute the UDF, you would code it in an SQL statement with the parentheses, but without specifying any parameter values, as shown in the following:

 

 SELECT FLOWERS() FROM   TABLE; 

Use the Sample User-Defined Functions As Templates

IBM provides quite a few sample programs for user-defined functions. Examine these samples for examples of how to implement effective DB2 user-defined functions. There are sample function programs for

  • Converting date and time formats

  • Returning the name of the day or month for a specific date

  • Formatting floating point data as a currency value

  • Returning DB2 catalog information for DB2 objects

  • Returning a table of weather data

These functions can be used as samples to learn how to code function programs for your specific needs and requirements.

SQL Usage Options Within External UDFs

There are four options for external functions regarding their usage of SQL:

  • NO SQL Indicates that the function cannot execute SQL statements. However, non-executable SQL statements, such as DECLARE CURSOR , are not restricted.

  • MODIFIES SQL DATA Indicates that the function can execute any legal SQL statement that can be issued by a UDF.

  • READS SQL DATA Indicates that the function can execute SQL statements that access data, but cannot modify data (this is the default SQL usage option for UDFs).

  • CONTAINS SQL Indicates that the function can execute SQL statements as long as data is neither read nor modified, and the SQL statement is legal for issuance by a UDF.

Table 4.4 indicates which SQL statements are valid for each type of SQL usage just described.

Table 4.4. Using SQL Within User-Defined Functions

SQL Statement

NO SQL

CONTAINS SQL

READS SQL

MODIFIES SQL

ALLOCATE CURSOR

N

N

Y

Y

ALTER

N

N

N

Y

ASSOCIATE LOCATORS

N

N

Y

Y

BEGIN DECLARE SECTION

Y

Y

Y

Y

CALL

N

Y

Y

Y

CLOSE

N

N

Y

Y

COMMENT ON

N

N

N

Y

COMMIT

N

N

N

N

CONNECT

N

N

N

N

CREATE

N

N

N

Y

DECLARE CURSOR

Y

Y

Y

Y

DECLARE GLOBAL

       

TEMPORARY TABLE

N

Y

Y

Y

DECLARE STATEMENT

Y

Y

Y

Y

DECLARE TABLE

Y

Y

Y

Y

DELETE

N

N

N

Y

DESCRIBE

N

N

Y

Y

DESCRIBE CURSOR

N

N

Y

Y

DESCRIBE INPUT

N

N

Y

Y

DESCRIBE PROCEDURE

N

N

Y

Y

DROP

N

N

N

Y

END DECLARE SECTION

Y

Y

Y

Y

EXECUTE

N

Y

Y

Y

EXECUTE IMMEDIATE

N

Y

Y

Y

EXPLAIN

N

N

N

Y

FETCH

N

N

Y

Y

FREE LOCATOR

N

Y

Y

Y

GRANT

N

N

N

Y

HOLD LOCATOR

N

Y

Y

Y

INCLUDE

Y

Y

Y

Y

INSERT

N

N

N

Y

LABEL ON

N

N

N

Y

LOCK TABLE

N

Y

Y

Y

OPEN

N

N

Y

Y

PREPARE

N

Y

Y

Y

RELEASE

N

N

N

N

RENAME

N

N

N

Y

REVOKE

N

N

N

Y

ROLLBACK

N

N

N

N

SELECT

N

N

Y

Y

SELECT INTO

N

N

Y

Y

SET

N

Y

Y

Y

SET CONNECTION

N

N

N

N

SIGNAL SQLSTATE

N

Y

Y

Y

UPDATE

N

N

N

Y

VALUES

N

N

Y

Y

VALUES INTO

N

Y

Y

Y

WHENEVER

Y

Y

Y

Y


CAUTION

When a stored procedure is called from a user-defined function, it must allow for the same or more restrictive data access as the calling UDF. For example, a UDF defined as READS SQL DATA can call a procedure defined as READS SQL DATA or CONTAINS SQL . It cannot call a procedure defined as MODIFIES SQL DATA . The hierarchy of data access from least to most restrictive is

 

 MODIFIES SQL DATA READS SQL DATA CONTAINS SQL 


When to DISALLOW PARALLEL Operations

A table function cannot operate in parallel, so the DISABLE PARALLEL parameter should be specified when issuing a CREATE FUNCTION statement for a table UDF.

Some functions that are NOT DETERMINISTIC can receive incorrect results if the function is executed by parallel tasks. Specify the DISALLOW PARALLEL option for these functions.

Likewise, some functions that rely on a SCRATCHPAD to store data between UDF invocations might not function correctly in parallel. Specify the DISALLOW PARALLEL option for these functions, too.

DETERMINISTIC Versus NOT DETERMINISTIC

Be sure to specify accurately whether the UDF will always return the same result for identical input arguments. If the UDF always returns the same result for identical input arguments, the UDF is DETERMINISTIC . If not, the UDF should be identified as NOT DETERMINISTIC . Any UDF that relies on external data sources that can change should be specified as NOT DETERMINISTIC . Other examples of functions that are not deterministic include any UDF that contains SQL SELECT , INSERT , UPDATE , or DELETE statements or a random number generator.

DB2 uses the [NOT] DETERMINISTIC parameter to optimize view processing for SQL SELECT , INSERT , UPDATE , or DELETE statements that refer to the UDF. If the UDF is NOT DETERMINISTIC , view merge and parallelism will be disabled when the UDF is specified.

Choose the UDF SECURITY Option Wisely

The SECURITY parameter indicates how the UDF will interact with an external security product, such as ACF2 or RACF. If SECURITY DB2 is specified, the UDF does not require an external security environment.This is the default value for SECURITY . If the UDF accesses resources protected by an external security product, the access is performed using the authid that is associated with the WLM-established stored procedure address space.

If SECURITY USER is specified, an external security environment should be established for the function. If the function accesses resources that the external security product protects, the access is performed using the primary authid of the process that invoked the UDF.

The third and final option for SECURITY is DEFINER . If this option is chosen and the UDF accesses resources protected by an external security product, the access is performed using the primary authid of the owner of the UDF.

Handling Null Input Arguments

There are two option for handling null input arguments in user-defined functions: RETURNS NULL ON NULL INPUT and CALLED ON NULL INPUT . If nulls are to be allowed to be specified as input to a UDF, the UDF must be programmed to test for and handle null inputs.

If RETURNS NULL ON INPUT is specified when the UDF is created, the function is not called if any of the input arguments are null. The result of the function call is null.

If CALLED ON NULL INPUT is specified when the UDF is created, the function is called whether any input arguments are null or not. In this case, the UDF must test for null input arguments in the function program.

UDF Scratchpads

The [NO] SCRATCHPAD clause should be specified to indicate whether DB2 provides a scratchpad for the UDF to utilize. In general, external UDFs should be coded as reentrant, and a scratchpad can help to store data between invocations of the UDF. A scratchpad provides a storage area for the UDF to use from one invocation to the next.

If a scratchpad is specified, a length should be provided. The length can be from 1 to 32,767; the default is 100, if no length is specified.

The first time the UDF is invoked, DB2 allocates memory for the scratchpad and initializes it to contain all binary zeroes. The scope of a scratchpad is a single SQL statement. A separate scratchpad is allocated for each reference to the UDF in the SQL statement. So, if the UDF is specified once in the SELECT -list and once in the WHERE clause, two scratchpads would be allocated. Furthermore, if the UDF is run in parallel, one scratchpad is allocated for each parallel task.

CAUTION

Take care when using SCRATCHPAD with ALLOW PARALLEL because results can be difficult to predict.Consider, for example, a UDF that uses the scratchpad to count the number of times it is invoked.The count would be thrown off if run in parallel because the count would be for the parallel task, not the UDF.For this reason, be sure to specify DISALLOW PARALLEL for UDFs that will not operate in parallel.


If the UDF acquires system resources, be sure to specify the FINAL CALL clause to make sure that DB2 calls the UDF one last time, so the UDF can free the system resources it acquired .

Specify EXTERNAL ACTION UDFs in SELECT -List to Ensure Processing

To make sure that DB2 executes a UDF with external actions for each row of the result set, the UDF should be in the SELECT -list of the SQL statement. The access path chosen by DB2 determines whether UDFs in predicates are executed. Therefore, to be sure the external actions in a UDF are processed , the UDF should be invoked in the SELECT -list, not just in a predicate.

Specify CARDINALITY Option to Guide Results for Table UDFs

DB2 Version 8 adds the capability to override the expected cardinality of a table UDF on a case-by-case basis. The cardinality of the UDF is specified in the SYSIBM.SYSROUTINES catalog table when the UDF is registered to DB2. Consider overriding cardinality when you expect the results to be significantly greater than or less than the documented cardinality. Doing so enables DB2 to better gauge the access path for SQL queries against table UDFs.

There are two options for using the CARDINALITY keyword:

  • CARDINALITY n Indicates the expected number of rows to be returned by the table UDF. In this case, n must be an integer constant.

  • CARDINALITY MULTIPLIER n Indicates a multiplier for the cardinality (as stored in SYSIBM.SYSROUTINES ); the result of multiplying the CARDINALITY column in the catalog by the multiplier n specifies the expected number of rows to be returned by the UDF. In this case, n can be any numeric constant.

Consider the following two examples:

 

 SELECT * FROM TABLE(FOOTBALL_RESULTS(0) CARDINALITY 28) AS WEEK2; SELECT * FROM TABLE(FOOTBALL_RESULTS(0) CARDINALITY MULTIPLIER 2) AS WEEK2; 

For the sake of argument, let's assume that the CARDINALITY column in SYSIBM.SYSROUTINES for this UDF is 14 (the normal number of NFL games in a week). Either of the two SQL statements can be used to estimate the number of rows to be returned after two weeks of play.

For the MULTIPLIER option, we can specify non-integers. If only half a week's football results were available, we could change the cardinality using the multiplier as follows:

 

 SELECT * FROM TABLE(FOOTBALL_RESULTS(0) CARDINALITY MULTIPLIER 0.5) AS HALFWEEK; 

 <  Day Day Up  >  


DB2 Developers Guide
DB2 Developers Guide (5th Edition)
ISBN: 0672326132
EAN: 2147483647
Year: 2004
Pages: 388

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net