Chapter 13: CLR Functions and Triggers | Microsoft SQL Server 2005 Stored Procedure Programming in T-SQL & .NET

This chapter explores the details of developing SQL Server 2005 managed functions and managed triggers.

CLR Function

SQL Server 2005 also supports the creation of CLR user-defined functions. It is possible to create three different types of these functions:

Scalar-valued
Table-valued
User-defined Aggregations

I will show you how to create scalar-and table-valued CLR functions in this chapter, and I will cover User-defined Aggregations in the next chapter.

CLR functions are created as CLR stored procedures—as static functions or methods with a SqlFunction attribute in a .NET class. The class is then deployed on SQL Server as an assembly, and finally functions are then linked with their names in a database. From that point on, the functions are accessible from Transact-SQL or managed code.

Scalar-valued CLR Functions

Functions of this type return a single value (scalar) to the caller. Like their Transact-SQL counterparts, they can be used as part of any Transact-SQL expression (for example, in Where clauses, select lists, or conditional statements). They can have input parameters or return values of any SQL Server data type except these:

text
ntext
image
timestamp
rowversion
cursor
table

Managed code must use .NET data types that are compatible with SQL Server data types (as described earlier in the "CLR Data Types Mapping" section in Chapter 12. In the following function, I used the .NET function for conversion of dates. It supports some predefined formats and cultures and also custom formats based on a set of format specifiers. It goes far beyond the limited set of formats available in the Transact-SQL Convert function. The CLR function is preceded by a SqlFunction attribute. The return statement is used to perform conversion and return the result to the caller.

      [SqlFunction]      public static String cf_DateConv(DateTime dt, string format)      {              return dt.ToString(format);      }

User-defined CLR functions can be deployed in Visual Studio in the same way as stored procedures. If you want to deploy them manually, the only difference is that you will use a Create Function statement, and that it will contain a Returns clause that defines the data type of the returned value:

      CREATE ASSEMBLY MyCShrpUdf FROM 'MyCShrpUdf.dll'      GO      CREATE FUNCTION dbo.cf_DateConv()      RETURNS Varchar(30)      AS EXTERNAL NAME MyCShrpUdf.CLRModules.cf_DateConv

You can use statements such as the following one to test it:

 SELECT [Asset5].[dbo].[cf_DateConv]  (GetDate(),'dddd - d - MMMM')

The result, on a machine using U.S. English culture, will be

 ----------------------- Sunday - 4 - December (1 row(s) affected)

The next function performs almost the opposite operation. It is used when you have string dates in another language (culture) and some unusual format. It uses the Parse method of the DateType class to recognize the string as a date written in a specific language (culture).

      [SqlFunction]      public static String cf_DateConv_DtFmtCult(string dt,                                                 string format,                                                 string culture)      {          CultureInfo MyCultureInfo = new CultureInfo(culture);          DateTime MyDateTime = DateTime.Parse(dt, MyCultureInfo);          return MyDateTime.ToString(format);      }

You can test it like this:

 SELECT [dbo].cf_DateConv_DtFmtCult ('12 June 2002',                                     'dddd - d - MMMM, yyyy',                                     'de-DE'}

Note

To use the CultureInfo class, you must include a reference to the System.Globalization namespace in your code.

The following function verifies that the string that is passed as a parameter is a valid string. It is based on the Regex.IsMatch function that is defined in the System. Text.RegularExpressions namespace.

 [SqlFunction] public static bool cf_IsValidEmail(string email) {         //Is it a valid email address:         return Regex.IsMatch(email,      @"^([\w-]+\.)*?[\w-]+@[\w-]+\.([\w-]+\.)*?[\w]+$"); }

You can test it with this:

 select dbo.cf_IsValidEmail('dejanOtrigonblue.com') select dbo.cf_IsValidEmail('dejan@trigon.blue..com')

The result is

 ----- 1 (1 row(s) affected) ----- 0 (1 row(s) affected)

SqlFunction Attribute

The primary purpose of this attribute is to notify the .NET compiler that the following managed static function will be converted to a CLR function. However, there are some additional useful properties, explained next, that you can use with the attribute.

DataAccess

The DataAccess property of the SqlFunction attribute helps the compiler to optimize the function by not using an in-process provider code if the function does not require it. The simple functions shown earlier did not use this additional property. Instead, they relied on the default behavior of CLR functions in SQL Server—that data is not accessed. This default is equivalent to declaring it using the DataAccess property:

 [SqlFunction(DataAccess = DataAccessKind.None)]

The following function accesses data through the database and requires the additional property:

 [SqlFunction(DataAccess = DataAccessKind.Read)] public static int cf_OrderCount() {     using (SqlConnection sqlConn         = new SqlConnection("context connection=true"))     {         sqlConn.Open();         SqlCommand sqlCmd = new SqlCommand(             "select count(*) AS 'Order Count' from dbo.OrderHeader",              sqlConn);         return (int)sqlCmd.ExecuteScalar();     } }

You can test it with this:

 SELECT Asset5.dbo.cf_OrderCount()

Note

SQL Server 2005 Books Online contains the following statement: "The SqlFunction attribute indicates whether or not the function accesses or modifies data…." I believe that this is a mistake in the documentation, and that the only allowed enumerators are DataAccessKind.Read and DataAccessKind.None.

IsDeterministic

This attribute allows SQL Server 2005 to optimize indexing of the results of user-defined functions that are used in calculated columns.

 [SqlFunction(IsDeterministic = true)]

IsPrecise

This attribute is also used to help SQL Server decide whether the results of the function should be indexed if used in a computed column. The idea is that floating point operations are not precise and therefore cannot be used on indexed computed columns.

 [SqlFunction(IsDeterministic = true, IsPrecise = true)]

SystemDataAccess

This attribute defines whether a CLR function requires access to system virtual tables and catalogs. Possible values are SystemDataAccessKind. Read and SystemDataAccessKind. None (which is the default value).

 [SqlFunction(SystemDataAccess = SystemDataAccessKind.Read)]

Name

By default, Visual Studio 2005 will name the CLR function using the name of the static method in managed code. This attribute is used to force Visual Studio 2005 to change the name of the CLR function.

 [SqlFunction(Name = "cf_OrderCount",                      DataAccess = DataAccessKind.Read)] public static int OrderCount() ...

Table-valued CLR Functions

Table-valued CLR functions (CLR TVFs) can be used as Transact-SQL TVFs in any Transact-SQL construct in which a table can be used.

The CLR TVF interface is very strange and complex from the standpoint of object-oriented languages such as C# and Visual Basic .NET, so these functions must be declared in a very specific way and require a relatively complex and unusual structure.

Creating a Table-valued CLR Function

To define a CLR TVF, you must define two methods in managed code:

InitMethod A function that returns an object of the IEnumerable class (or some other class derived from IEnumerable), which is preceded by the SqlFunction attribute and which will become a CLR TVF
FillRow A method that returns (fills) a single result row

Note

It is not required, but it is acceptable to actually name tbese two methods using tbese names. Personally, I like to name InitMethod after tbe function tbat it is implementing, and I typically leave tbe second metbod as FillRow. However, even if tbe names are changed, I will continue to refer to tbese methods as InitMethod and FillRow.

The SqlFunction attribute of InitMethod uses FillRowMethodName to point to the FillRow method. This is the place where you can change the name of the FillRow method if you want:

 [SqlFunction(FillRowMethodName = "FillRow",

With the attribute set in this way, the framework calls the MoveNext() method once for each row of the object (row set) to be returned. It then calls the FillRow() method each time. The procedure must have a generic parameter of the type object (which can be filled with anything) and a list of out parameters that correspond to the columns of the CLR TVF:

 public static void FillRow(object obj,                            out string fileName, out long size) {     Object[] row = (object[])obj;     fileName = (string)row[0];     size = (long)row[1]; }

This procedure returns (as output parameters) the content of a generic (input) object (parameter).

Data for the generic object is collected in the first function. An object of the ArrayList class is filled with generic objects that contain file information. In the following function, the system loops through files and gathers full filenames (path) and sizes in bytes. At the end, the ArrayList object is returned as a return value of the function, since it implements the IEnumerable interface (that is, it is derived from the interface):

 public static IEnumerable ClrTvfFolderList(string folder) {     ArrayList fileArray = new ArrayList();     // loop through files in the folder     foreach (string file in Directory.GetFiles(folder,         "*.*", SearchOption.TopDirectoryOnly))     {         FileInfo fi = new FileInfo(file);         object[]    row  =  new  object[2];         row[0]    =   fi.FullName;         row[l]    =   fi.Length;         fileArray.Add(row);     }     return fileArray; }

To pull everything together, I have added references to System.IO and System. Collections and wrapped everything in a public class:

 using System; using System.Data; using System.Data . SqlClient; using System.Data.SqlTypes; using Microsoft.SqlServer.Server; using System.IO; using System.Collections; public partial class UserDefinedFunctions {     [SqlFunction(FillRowMethodName = "FillRow",         TableDefinition = "fileName nvarchar (max) , size bigint")]     public static IEnumerable ClrTvfFolderList(string folder)     {        ArrayList fileArray = new ArrayList();        // loop through files in the folder        foreach (string file in Directory.GetFiles(folder,            "*.*", SearchOption.TopDirectoryOnly))     {        FileInfo fi = new FileInfo(file);        object!]   row  =  new  object [2] ;        row[0]    =   fi.FullName;        row[l]    =   fi.Length;        fileArray.Add(row);     }     return f i1eArray;   }   public static void FillRow(object obj, out string fileName, out long siz)   {       Object!] row = (object [] } obj ;       // get data from obj       fileName = (string)row[0];       size = (long)row[1];   } };

To deploy the function using Visual Studio 2005,1 set Permission Level to External and made the function part of the dbo schema. It is also critical to add a TableDefinition attribute in the SqlFunction identifier. It should contain a list of column definitions (result set) for the CLR TVF. Only then will Visual Studio 2005 know how to populate the Table clause of the Create Function statement. However, if you are deploying the function manually, you do not need to have this attribute.

To deploy the procedure manually, you should use a script that creates the assembly and creates the function:

 CREATE ASSEMBLY ClrTvfFolderList FROM' C:\Proj ects\ClrTvfFolder\bin\Debug\ClrTvfFolderList.dll' WITH PERMISSION_SET = External_Resources GO CREATE FUNCTION ClrTvfFolderList(string folder) RETURNS TABLE (fileName nvarchar(max), size bigint) AS EXTERNAL NAME ClrTvfFolderList.UserDefinedFunctions.ClrTvfFolderList GO

The Create Function statement can be used to set the name of the function. It could be different from, or the same as, the name of the method it references in the External Name clause. The format of External Name is Assembly. Class.Method.

Note

Some may argue that you can achieve a similar result using xp_cmdshell. While this is true, xp_cmdshell also opens a door to your system that is difficult to guard. It is less risky to create this type of assembly to give you limited functionality while satisfying your requirements.

The following table-valued function parses a delimited string and returns it as a recordset. The function can be used to accept arrays as input values for stored procedures:

 using System; using System.Data; using System.Data.Sql; using System.Data.SqlTypes; using System.Data.SqlClient; using Microsoft.SqlServer.Server; using System.Text.RegularExpressions; using System.Text; using System.Collections; namespace Samples.SqlServer {     public class stringsplits     {         [SqlFunction(             FillRowMethodName = "FillRow",             TableDefinition = "segment nvarchar(max)")]         public static IEnumerable clrtvf_Split(SqlString str, string splitChar)         {             string[] m_strlist;             if (!str.IsNull)             {                 m_strlist = str.Value.Split(splitChar.ToCharArray());                 return m_strlist;             }             else                 return "";         }         public static void FillRow(Object obj, out string segment)         {             segment = (string)obj;         }     } };

Most of the work is performed using the .NET Framework Split() function, which parses the string and splits it into an array.

To test this function, use something like this:

 select * from clrtvf_Split('Memory, CPU, Resolution, Size, Weight, Color, HDD-count, HDD-capacity', ' , ' )

You will get the following result:

 -------------- Segment Memory CPU Resolution Size Weight Color HDD-count HDD-capacity (8 row(s) affected)

The following function is another version of the Split function. It differs slightly from the previous one in that it returns a row number (array index) in the result. It is based on the use of the SplitParts struct, designed to store pairs of string segments and indexes in the array.

 using System; using System.Data; using System.Data.Sql; using System.Data.SqlTypes; using System.Data.SqlClient; using Microsoft.SqlServer.Server; using System.Text.RegularExpressions; using System.Text; using System.Collections; namespace Samples.SqlServer {     public class stringsplit     {         internal struct SplitParts         {             // structure that will be used to store parts             public int id;             public string result;             public SplitParts(int splitld, string split)             {                 id = splitld;                 this.result = split;             }         }         [Microsoft.SqlServer.Server.SqlFunctionAttribute(              FillRowMethodName = "FillRow",              TableDefinition = "segment nvarchar(max),                                 i int"}]         public static IEnumerable clrtvf_Splitl(SqlString str,                                                 string splitChar)         {             // split str using delimiter in splitChar             //and return it as table valued function             // in format: segment varchar(max), i (row_number) int             string [] m strlist;             SplitParts [] a;             if (!str.IsNull)             {                 m_strlist = str.Value.Split(splitChar.ToCharArray(}};                 int count = m_strlist.GetUpperBound(0};                 a = new SplitParts[count + 1] ;                 for (int i = 0; i < m_strlist.GetUpperBound(0} + 1; i++)                 {                     a[i] = new SplitParts(i, m_strlist[i]};  ;                 }                 return a;             }             else                 return "";         }         public static void FillRow(Object obj, out string segment, out int i)         {             SplitParts ab = (SplitParts}obj;             segment = ab.result;             i = ab.id;         }     } }

Note

It is possible to do this in other ways as well. For example, you could create a static variable to serve as a counter of rows. The problem with such a solution is that the use of static variables is not permitted in Safe assemblies.

To test the function, execute the following:

 select * from clrtvf_Splitl('Memory, CPU, Resolution, Size, Weight,' + 'Color, HDD-count, HDD-capacity', ', ' )

The result will be

 str                           ind -----------                   ----------- Memory                        0 CPU                           1 Resolution                    2 Size                          3 Weight                        4 Color                         5 HDD-count                     6 HDD-capacity                  7 (8 row(s) affected)

Output Values in Visual Basic

Output values for the FillRow method in C# are parameters passed by reference. Visual Basic .NET does not support output parameters, although it does support pass-by-reference parameters. You need to add a special <Out()> attribute to the ByRef declaration of the parameter:

 Public Shared Sub FillRow (ByVal obj As Object,                            <Out()> ByRef Segment As SqlChars,                            <Out()> ByRef i As Sqllnt32)

For everything to work, you also need to add a reference to the System.Runtime. InteropServices namespace in the Visual Basic .NET assemblies.

Using Nonstandard .NET Framework Assemblies

The following table-valued function returns a list of logical drives on the system, along with their types and free space values. It is based on the class library in the SystemManagement namespace, which is not in the list of assemblies/namespaces that are loaded by default in SQL Server. Thus, this function not only demonstrates how to create a table-valued function and how to read free space on a drive, but also how to use nonstandard .NET Framework assemblies.

The function uses ManagementObjectSearcher to query ("SELECT * From Win32_LogicalDisk"!☺!) the system for disks registered in Windows and return it as ManagementObjectCollection. Data is then loaded to the array of the LogicalDriveList struct and finally returned as a table by the FillRow function.

 using System; using System.Data; using System.Data.Sql; using System.Data.SqlTypes; using System.Data.SqlClient; using Microsoft.SqlServer.Server; using Systern.Management; using System.Collections; using System.Collections.Generic; internal struct LogicalDriveList {    public int Driveld;    public string DriveLogicalName;    public Int64 DriveFreeSpace;    public string DriveType;    public LogicalDriveList(int driveld,                            string driveLogicalName,                            string driveType,                            int driveFreeSpace)    {       DriveId = driveld;       this.DriveLogicalName = driveLogicalName;       this.DriveType = driveType;       this.DriveFreeSpace = driveFreeSpace;    } } public class GetDrives  {    [Microsoft.SqlServer.Server.SqlFunctionAttribute(        FillRowMethodName = "FillRow",        TableDefinition = "i int,                           DriveLogicalName nvarchar(255),                           DriveType nvarchar(255),                           DriveFreeSpace bigint")]    public static IEnumerable clrtvf_GetLogicalDrives()    {        List<LogicalDriveList> drs = new List<LogicalDriveList>(),        ManagementObjectCollection queryCollection = getDrivesO;        const int Removable = 2;        const int LocalDisk = 3;        const int Network = 4;        const int CD = 5;        int count = 0;        string driveType = string.Empty;        foreach (ManagementObject mo in queryCollection)        {            count++;            switch (int.Parse(mo["DriveType"].ToString()))            {                case Removable: //removable drives                    driveType = "Removable";                    break;                case LocalDisk: //Local drives                    driveType = "LocalDisk";                    break;                case CD: //CD rom drives                    driveType = "CD";                    break;                case Network: //Network drives                    driveType = "Network";                    break;                default: //defalut to folder                    driveType = "UnKnown";                    break;            }            LogicalDriveList a = new LogicalDriveList();            a.DriveId = count;            a .DriveLogicalName = mo["Name"] .ToStringf);            a.DriveType = driveType;           if (mo["FreeSpace"] != null)               a.DriveFreeSpace = Int64.Parse(mo["FreeSpace"]-ToString());           else               a.DriveFreeSpace = 0;           drs.Add(a);        }        return drs;    }    protected static ManagementObjectCollection getDrivesO    {       //get drive collection       ManagementObjectSearcher query = new           ManagementObjectSearcher("SELECT * From Win32_LogicalDisk ");       ManagementObjectCollection queryCollection = query.Get();       return queryCollection;    }    public static void FillRow(Object obj,        out int i,        out string DriveLogicalName,        out string DriveType,        out Int64 DriveFreeSpace)    {           LogicalDriveList Id = (LogicalDriveList)obj;           DriveLogicalName = Id.DriveLogicalName;           i = Id.Driveld;           DriveType = Id.DriveType;           DriveFreeSpace = Id.DriveFreeSpace;    } }

The System. Management namespace/assembly can be registered using the following Transact-SQL statement (just make sure that you are referencing the right folder and version of the .NET Framework):

 CREATE ASSEMBLY Management FROM 'C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\System.Management.dll' WITH permission_set=unsafe; go

Note that it must be declared as Unsafe since it accesses resources that are outside of SQL Server. Management Studio will issue a set of warnings:

 Warning: The Microsoft .Net frameworks assembly 'system.management, version=2.0.0.0, culture=neutral, publickeytoken=b03f5f7fIld50a3a, processorarchitecture=msil.' you are registering is not fully tested in SQL Server hosted environment. Warning: The Microsoft .Net frameworks assembly  'system.configuration, version=2.0.0.0, culture=neutral, publickeytoken=b03f5f7fIld50a3a, processorarchitecture=msil.' you are registering is not fully tested in SQL Server hosted environment. Warning: The Microsoft .Net frameworks assembly  ...

Now, you also need to include in your project a reference to the new component that has been registered on the SQL Server:

Choose Project | Add Reference from the Visual Studio 2005 menu.
In the Add References dialog box, select the SQL Server tab and then select Management in the list of component names (see Figure 13-1).

image from book
Figure 13-1: Adding a reference to a component of a database project

Since the function references objects that are not in SQL Server, you also need to declare the project Unsafe (Project | Options | Database | Permission Level). Now you can deploy it and then test the function using:

 select * from dbo.clrtvf_GetLogicalDrives()

The result will look like this:

 DriveLogicalName DriveType          i         DriveFreeSpace A:               Removable          1         0 C:               LocalDisk          2         20331072000 D:               Network            3         140518309888 E:               CD                 4         0 F:               Network            5         140518297600 H:               Network            6         81000570880 N:               Network            7         120188755968 T:               Network            8         174118621184 W:               Network            9         120188755968

Note

All examples of CLR TVFs so far have been based on eitber data from external sources or data generated inside tbe function. If a CLR TVF needs access to SQL Server data (for example, if it executes a Transact-SQL query), you need to set tbe SqlFunctionAttribute attribute to DataAccess=DataAccessKind.Read.

Differences Between Transact-SQL and CLR TVFs

The most important difference is that a Transact-SQL table-valued function must materialize the whole result as an intermediate table. Since Transact-SQL TVFs are based on tables, a developer can specify some constraints such as Unique index and nullability of columns.

CLR table-valued functions are based on a streaming model. As soon as a row is materialized, it is streamed to the caller, which can consume it immediately. The caller can use rows without waiting for a whole table to be materialized or received. This is the primary reason why CLR TVFs can be up to three times faster than their Transact-SQL counterparts.

CLR TVF s cannot return non-Unicode strings such as varchar, char, text, or varchar (max). They must be replaced with their Unicode counterparts.