Designing Your Own User-Defined Aggregates Using the CLR


Tip

To practice the steps in this section, CLR Integration must be enabled on your SQL Server. To learn about this setting, see the SQL Server Books Online topic "Surface Area Configuration for Features."


SQL Server 2005 allows you to create new aggregate functions written directly using CLR (common language runtime) languages. For example, imagine that you need to calculate the average elapsed time between orders. This is not a very practical example, but it will show you how to work with the Date, Time, and TimeSpan datatypes, which can be hard to manage using only T-SQL.

Creating an Aggregate Function Stub

1.

From the Start menu, select All Programs | Microsoft Visual Studio 2005 | Microsoft Visual Studio 2005.

2.

From the File menu, select New, and then Project. This will open the New Projects dialog box, as shown below.

3.

In the Project Types pane, expand the Visual Basic node and select the Database project type. In the Templates pane, select SQL Server Project. Specify a name and file path for this project, and click the OK button to create it.

4.

The Add Database Reference dialog box, shown below, will open. Choose the database reference you wish to use and click the OK button to connect. If you do not have any database references set up, you will be prompted to create a new one. When choosing or creating a reference, be sure to specify AdventureWorks as the database to connect to.

5.

If you are prompted as shown below to enable SQL/CLR debugging, click the Yes button to enable SQL/CLR debugging.

6.

From the View menu, select Solution Explorer. Within Solution Explorer, right-click your project and select Add, and then select Aggregate from the context menus, as shown here:

7.

In the Add New Item dialog box, name the aggregate function WAVG.vb (for weighted average) as shown below and click the Add button.

8.

When you add the WAVG.vb item, Visual Studio adds the following code stub to the aggregate structure.

<Serializable()> _ <Microsoft.SqlServer.Server.SqlUserDefinedAggregate(Format.Native)> _ Public Structure WAVG Public Sub Init()     ' Put your code here End Sub Public Sub Accumulate(ByVal value As SqlString)     ' Put your code here End Sub Public Sub Merge(ByVal value As WAVG)     ' Put your code here End Sub Public Function Terminate() As SqlString     ' Put your code here     Return New SqlString("") End Function ' This is a place-holder field member Private var1 As Integer End Structure


There are four procedures defined in this structure: Init, Accumulate, Merge, and Terminate. You will have to add the appropriate code to them to ensure that your function will execute properly.

Before you start, you have to define what datatype the aggregate will manage. For this example, because we will operate over dates and times, define the datatype as SqlDateTime.

Because you will be working with elapsed times, a TimeSpan datatype might seem like the right type to return. However, this datatype is not usable in SQL Server. Instead, you will have to return a string with a representation of the time span.

To calculate the weighted average, you need to total all the time spans for a set of records and then divide this total by the number of records in the set.

Adding Code to the Stubs

1.

To perform the required calculation, first create three local variables within the WAVG function. Enter the declarations as shown below:

Public Structure WAVG     Private Ticks As Long 'Cumulate the ticks between dates     Private Previous As SqlDateTime 'Store the previous dates                                     'to obtain the elapsed time     Private Count As Integer 'Qty of records processed


2.

Aggregate functions normally execute over grouped sets of records. For each group, the executor will call the Init procedure before processing the first record. In the Init procedure, you need to initialize your variables using the following code:

Public Sub Init()     Count = 0     Previous = Nothing     Ticks = -1 'To detect the first record End Sub


3.

For each record in the group, the executor will call the Accumulate procedure, which is detailed below. Note that you need to change the argument for this procedure from the default datatype of SqlString to a SqlDateTime.

Public Sub Accumulate(ByVal value As SqlDateTime)     Dim span As New TimeSpan(0)     If Ticks > -1 Then         span = New TimeSpan(Ticks)         span = span.Add(value.Value.Subtract(Previous.Value))     Else         Ticks = 0     End If     Previous = value     Count += 1     Ticks = span.Ticks End Sub


The first time the executor calls this procedure, we will have no previous date. Therefore, the number of Ticks between the date currently being examined and the previous date will be zero. The code stores the actual date in the Previous variable in order to make our calculation the next time Accumulate is called. Also, we add one to the Count variable to keep track of the number of records we have examined.

Subsequent calls of the Accumulate procedure calculate the different time spans between dates. Since the Ticks variable is no longer equal to -1, the procedure creates a TimeSpan and adds to it the difference between the actual date and the previous one. It also can preserve the date in the Previous variable and increments the Count variable as before.

4.

When there are no more records for the set, the executor calls the Terminate function. Enter it as detailed below:

Public Function Terminate() As String     If Ticks <= 0 Then         Return New TimeSpan(0).ToString     Else         Dim Resp As Long = CLng(Ticks / Count)         Dim RespDate As New System.TimeSpan(Math.Abs((Resp)))         Return RespDate.ToString     End If End Function


If no time spans were calculated, the Ticks variable will be zero. In this case, we would return the string representation of a new TimeSpan with zero ticks.

If time spans are calculated, then we divide the sum of these differences by the count of records processed and then create a new TimeSpan variable to hold the result. Then, we return the string representation of that TimeSpan.

Note

SQL Server might also separate the work into smaller pieces whose results need to be combined by calling the Merge method. In a real application, you will need to implement this function appropriately.

5.

To test the function, edit the Test.sql file that Visual Studio automatically added to your project. This file can be found under the Test Scripts folder in Solution Explorer. In this file, enter the following SELECT statement that uses your function.

SELECT CONVERT(nvarchar(7), OrderDate, 111) AS Period,     dbo.WAVG(OrderDate) as Span FROM Sales.SalesOrderHeader GROUP BY CONVERT(nvarchar(7), OrderDate, 111)


6.

Then select Build <ProjectName> from the Build menu, and select Start Debugging from the Debug menu to run the script.

Below you can see part of the result set.

Table 5-9. Results

Period

Span

2001/07

03:54:46.9565217

2001/08

03:07:00.7792208

2001/09

03:22:43.1067961

2001/10

03:34:55.5223881

2001/11

02:41:14.1312741

2001/12

02:24:57.9865772

2002/01

03:09:28.4210526

2002/02

02:35:31.2000000

2002/03

02:44:15.5133080

2002/04

02:51:08.8524590

2002/05

02:24:28.8963211

2002/06

02:28:05.1063830

2002/07

02:12:55.3846154

2002/08

01:42:51.4285714

2002/09

02:15:08.7378641

2002/10

02:23:02.7814570

2002/11

02:08:05.8895706

...

...


If you take a closer look at the Accumulate procedure, you will notice that it creates a TimeSpan variable each time it runs; meanwhile it stores the value that it calculates in the Ticks variable. Why not just use a TimeSpan variable? The problem here is that the CLR aggregate functions have to be serialized between calls. The way this serialization occurs is defined by one attribute in the declaration of the function. For example, look at the declaration below:

<Microsoft.SqlServer.Server.SqlUserDefinedAggregate(Format.Native)> _ Public Structure WAVG


The Format argument establishes the serialization format. In the Native format, you can only serialize Value types, not Reference types like CLR classes (including System.String) or your own classes. In this example, we can translate the value we need to preserve between calls by storing its representation in the Ticks variable of datatype long. However, if you need to use Reference types, you can change the Format argument to Format.UserDefined. If you make this change, however, you will have to implement your own serialization mechanism. Refer to the SQL Server Books Online topic "Invoking CLR User-Defined Aggregate Functions" for more information about serialization mechanisms.

The complete example of this CLR aggregate function is included in the sample files for this chapter in the WAVG folder.




Solid Quality Learning - Microsoft SQL Server 2005. Applied Techniques Step by Step
Microsoft SQL Server 2005: Applied Techniques Step by Step (Pro-Step by Step Developer)
ISBN: 0735623163
EAN: 2147483647
Year: 2006
Pages: 115

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net