Creating a UDF | SQL Server 2000 Stored Procedures Handbook (Experts Voice)

Creating a UDF is similar to creating a stored procedure. To being with, we use the CREATE FUNCTION statement, followed by the name of the function we are creating. This is followed with any incoming parameters, although there are no output parameter definitions, as you would find with stored procedures. Any output data is placed in a RETURNS statement that follows the input parameter definition. The remainder of building a function then follows a similar manner to that of building a stored procedure.

It is important to know that functions can be defined in two different ways, and before we can create a UDF we need to know about these two types. Now, let's discuss them.

Types of Functions

No matter how the information is returned, functions fall into two categories – deterministic and non-deterministic. It is important to know what type of function you are creating, as this affects the design of the function, such as the code that can be placed into the function. Based on the T-SQL placed within the function SQL Server decides if the function is deterministic or not.

Deterministic Functions

Deterministic functions will always return the same information each time they are called, providing that the parameters passed remain the same. The Factorial stored procedure of Chapter 1 is one such example. Another example would be a function that returns the area of a circle when we pass in the radius.

The requirements for a function to be deterministic are:

The function must be schema bound (even if it doesn't access any schema objects).
All functions (in-built or user-defined) called by it must also be deterministic.
No external objects can be used. In other words, the function can only reference objects having local scope.
They cannot call any extended stored procedures.

Non-Deterministic Functions

Non-deterministic functions can return different results each time they are called with a specific set of input values. For example, a function retrieving the price of a share on the stock market or the balance of a customer's bank account would be non-deterministic. Keep in mind that between calls a non-deterministic function may not return the same information.

We can tell if a function is deterministic or not by checking the value of its OBJECTPROPERTY through a SELECT statement. This property is defined and set by the SQL Server.

We can check the determinism of the function by executing the following code (execute it on the Northwind database):

     SELECT OBJECTPROPERTY(OBJECT_ID('dbo.fn_EmpName'),'IsDeterministic')

A return value of 0, false, indicates the function is non-deterministic, while 1, true, indicates that it is deterministic.

Computed Columns

Deterministic functions are important when creating an index on a view or computed column.

Important

If an index on a view or computed column refers to a user defined function, then that function must be deterministic.

When you create an index on a computed column, the index must calculate the values for the computed column to enable it to build the keys for the index. The index will then only work while any data modification on the column follows exactly the same algorithm; in other words, the function used to calculate the value needs to be deterministic. An error will be raised if you try to modify key values that use a algorithm different from that used when the index was originally created. If you do want to alter the computation, for example if you have a column that computes a tax amount, and the tax rate alters, then you need to drop the index and recreate it.

Example of Creating a UDF

Now, let's create a simple UDF. It will format an employee's name from the Employee table of Northwind database and return the employee's title, initial of first name, and last name. This is a deterministic function:

     CREATE FUNCTION fn_EmpName (@EmpID INT)     RETURNS VARCHAR(50)     AS     BEGIN       DECLARE @NewName VARCHAR(50)       SELECT @NewName = RTRIM(TitleOfCourtesy) + ' ' + LEFT(FirstName,1) +         '. ' + RTRIM(LastName)       FROM dbo.Employees       WHERE EmployeeID = @EmpID       RETURN @NewName     END

Creating functions requires some amount of planning, as UDFs have some limitations, as compared to stored procedures. We will be discussing these limitations in the next section.

For now, we will attempt to convert a non-deterministic function into a deterministic one.

Converting a Non-Deterministic Function to Deterministic

This process is somewhat tricky, as we are trying to include a non-deterministic function within our UDF, which is not allowed.

Important

Any system function, used within a UDF must be deterministic. If you create a UDF which is non-deterministic, it can be used within another UDF.

The follow example demonstrates how to get around such a problem.

The non-deterministic function, in our example, adds the number of business days, which we pass as a parameter, to the current date and return the new date. Note that our business day won't include weekends. The value returned by our UDF is a DATETIME data type and will be the new date. Our function makes use of the GETDATE() function, which is non-deterministic:

     CREATE FUNCTION fn_AddWorkDays (@NoDays INT)     RETURNS DATETIME     AS     BEGIN       DECLARE @Today DATETIME       SET @Today = GETDATE()       -- This should not happen but just in case the function is       --called on a weekend       IF DATEPART(dw,@Today) IN (1,7)       RETURN NULL       --Main Part of the function       WHILE @NoDays > 0       BEGIN         IF DATEPART(dw,@Today) = 6         SET @Today = DATEADD(day,3,@Today)         ELSE         SET @Today = DATEADD(day,1,@Today)         SET @NoDays = @NoDays - 1       END       RETURN @Today     END

When we execute this code in Query Analyzer we will receive the following error:

     Server: Msg 443, Level 16, State 1, Procedure fn_AddWorkDays, Line 7     Invalid use of 'getdate' within a function.

This error occurs because of the GETDATE() function, which is non-deterministic, and as we mentioned earlier we cannot have non-deterministic functions inside of a UDF.

Now, to get around this problem we need to modify the function, so that it contains only deterministic function calls. This means we cannot use the GETDATE() function; instead, we will have to start with our own date. In other words, we will have to pass it in as a parameter.

Now, take a look at the modified function:

     CREATE FUNCTION fn_AddWorkDays (@NoDays INT, @InDate DATETIME)     RETURNS DATETIME     AS     BEGIN       IF DATEPART(dw,@InDate) IN (1,7)       RETURN NULL       DECLARE @Today DATETIME       SET @Today = @InDate       ...

Here, the caller is required to provide the current date. This might be something that you would want to put in the function's documentation, so that any future amendments don't try to put GETDATE() in. Of course, this implies that the function's caller passes in the necessary date.

Although we may view this as a deterministic function, SQL Server sees this as non-deterministic. The reason is that DATEPART() is non-deterministic, even although it will always return the same value for the same input parameters. DATEADD() is also seen by SQL Server as being non-deterministic. However, using these wouldn't stop your UDFs from being deterministic.

As with stored procedures, the BEGIN and END statements surround multi-statement lines of code, and you code functions as you would code any other stored procedure, with one or two exceptions.

Errors cannot be raised using the RAISERROR statement and functions must have an explicit RETURN statement, compared to stored procedures which can have an explicit or implicit RETURN.

In our above example, we return a NULL value if the date is invalid. You have to be aware of this if you are coding functions, as returning a bad value can cause problems. For example, in our above example if we didn't return a NULL value and we happened to pass an invalid date, then it might be difficult to interpret the output of this function. If the return value was used in a WHERE clause, then you could even find erroneous data being processed.

Keep your function as short and succinct as possible, as the function could be called against every row of returned data in a rowset; for example, if the function is formatting text, such as concatenating the initial o f an employee's first name and last name into a single column. If the function is deterministic and the value returned is constant, then it would be better to call the function and place the value in to a local variable, rather than calling the function many times. This is quite a performance gain, especially when working with a large number of rows.

Now let's look at calling a UDF.