Setting Up a Test Environment


There are a number of ways to test your CLR executable with Visual Studio once it's deployed, but I don't expect you'll be that happy with the default behavior unless you make a few changes before trying to use it. Consider that SQL Server CLR executables are typically called from T-SQL stored procedures, functions, triggers, scripts, or batches. No, it does not make much sense to call a CLR-based stored procedure from a Windows Forms application, ASP application, or Web Service; you could simply execute the same code from within the context of the application. So, my testing scenarios will be based on T-SQL scripts. This means that if your skills lean toward SQL Server Management Studio (SSMS), go for it. The test scripts I created in the early days, before Visual Studio debugging was hooked up, were executed by SSMS. Now that Visual Studio debugging is wired up, I'm still using SSMS; it still exposes more developer-friendly features.

When you try to debug a function with Visual Studio (by clicking the debug "run" button), Visual Studio looks for a default "debug script". However, before this T-SQL is executed, Visual Studio (by default) insists on rebuilding and redeploying the project. This can be problematic with CLR user-defined types and some of the more complex CLR executables. To get around this problem, I created a special solution configuration used just for testing; I'll discuss this a bit later.

To help choose a testing scenario that makes sense for you and best leverages your skills, let's look at a number of ways you can test your new SQL Server CLR executable:

  • Visual Studio includes a fairly simple (but unpopulated) SQL script (test.sql) that can be used to test your newly created function, assuming you add the T-SQL to invoke your newly defined executable. While the script shows how to call some of the more basic CLR executables, none of the specifics is filled in; you'll have to do that yourself.

  • Code your own T-SQL test script (as I'm about to illustrate) and add it to the project. Point Visual Studio at this script by setting the "default test script" (right-click the project).

  • Use SQL Server Management Studio's Object Explorer to generate a "SELECT TO" script that can be used to test the executable in a Query window.

  • Use a custom test script that's executed by the SQLCMD command-line tool or the SQL Server Management Studio Query window.

Addressing the CLR Function in T-SQL

CLR functions must be explicitly referenced in T-SQL by owner. This means that you'll need to prefix the function name with the owner name when calling it in your test script, as shown in Figure 13.18. Yes, in this script, the database name is also included; it does not hurt to be over-explicit when addressing objects, and it saves SQL Server some time when resolving the object reference.
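The addressing pattern looks something like the following sketch; the database and function names here are my own placeholders, since the book's actual script appears in Figure 13.18:

```sql
-- Hypothetical names; the book's generated script is shown in Figure 13.18.
-- Three-part addressing: database.owner.function
SELECT [MyDatabase].[dbo].[fnFahrenheitToCelsius](212.0E0) AS Celsius
```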

Figure 13.18. Addressing a CLR function


Incidentally, this script was generated by SQL Server Management Studio. I simply right-clicked the selected function and chose "Script to SELECT TO" in a new query window.

To use this script, you'll have to replace the parameter placeholder <@dblTempIn, float,> with a parameter value.

Notice that this (scalar) function's "return" value is captured by a SELECT. Remember, you can call the CLR function as you would any T-SQL function, as shown in Figure 13.19. In this case, I call the Fahrenheit-to-Celsius conversion function with input parameters of 32° and 212° (freezing and boiling, for those who skipped Physics 101).
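A minimal test script along these lines might read as follows; this is a sketch, and the variable names are my assumptions (the book's actual script is shown in Figure 13.19):

```sql
-- Hypothetical sketch of the test script shown in Figure 13.19
DECLARE @dblTempIn float, @dblTempOut float

SET @dblTempIn = 32.0E0                              -- freezing point
SELECT @dblTempOut = [dbo].[fnFahrenheitToCelsius](@dblTempIn)
SELECT @dblTempOut AS CelsiusFreezing                -- expect 0

SET @dblTempIn = 212.0E0                             -- boiling point
SELECT @dblTempOut = [dbo].[fnFahrenheitToCelsius](@dblTempIn)
SELECT @dblTempOut AS CelsiusBoiling                 -- expect 100
```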

Figure 13.19. Test script to exercise the fnFahrenheitToCelsius function.


Executing a Custom Test Script

Now that I have a custom test script, I'm ready to test it with Visual Studio or SQL Server Management Studio. Let's step through the process of importing a new test script and executing it.

1.

Right-click the "Test Scripts" folder in your project and choose "Add Test Script". This opens a new test script. Paste in the T-SQL from SQL Server Management Studio or a T-SQL editor. An alternative to this approach is to choose "Add" | "Existing Item" and browse to the query script .SQL file you've previously built.

2.

Remember, I'm in Visual Studio now, so you can set breakpoints in this test script file. I'll set one on the first SET statement (line 5 in Figure 13.20).

Figure 13.20. Setting a breakpoint in the new SQL test script.


3.

To get Visual Studio to use test1.sql as the default test script, right-click on the selected file and choose "Set as Default Debug Script".

4.

You're ready to step into this code, so press F5. Note that the build and deploy operations are repeated. While this is not that terrible when working with function and stored-procedure CLR executables, when you start working with UDTs, this approach simply won't work; they can't be rebuilt that easily.

5.

The test script stops at the breakpoint (line 5). Watch the Output (Debug) window as you step into the test script. The next line (line 6) executes the newly deployed function, which causes SQL Server to load a long set of DLLs to support the CLR executable. These include:

  • Mscorlib.DLL

  • SqlAccess.DLL

  • System.Data.DLL

  • System.DLL

  • System.Transactions.DLL

  • System.Security.DLL

  • System.XML.DLL

  • Microsoft.VisualBasic.DLL

  • The application symbols (when debugging)

Once the DLLs are loaded, Visual Studio executes an "auto-attach" to attach the debugger to the target SQL Server process. I show you how to attach to a specific SQL Server instance later in this chapter.

If you don't stop execution with a breakpoint, the text displayed in the Output (debug) window might not appear in sequential order. For example, without a breakpoint, the DLLs appear to be loaded and the auto-attach occurs after the SELECT executes.


The SELECT statements executed in the test script output their data to the Output (debug) window, as shown in Figure 13.21.

Figure 13.21. The Output (debug) window displays output from the test script.


As you can see, creating a CLR executable function is not that hard. However, the question remains, is this more efficient than executing the equivalent logic in T-SQL?

Evaluating CLR Executable Performance

Let's take this simple (thus, easily understood) temperature-conversion function and build a test script that exercises the function and records how long it takes to execute, so the results can be used to evaluate performance. Ah, you need to keep in mind that when testing performance, you need to simulate the actual production configuration as closely as possible. No, you won't always be able to set up a multi-million-row test database, but one thing you'll need to do is avoid using Visual Studio test scripts to evaluate performance. As you'll see, Visual Studio adds considerable overhead while debugging any SQL executable, whether it's a stored procedure or a CLR executable, as long as you've enabled "debug" mode on the connection.

Tip

Debug mode with SQL Server debugging enabled can impact the behavior of the connection pool.


The test script used to evaluate the performance difference between calling the temperature-conversion CLR executable function and the equivalent T-SQL is shown in Figure 13.22. This script was run in SQL Server Management Studio and again as a test script in Visual Studio. The difference was dramatic: in Visual Studio, the script run in debug mode took 10 times longer to execute the SQL CLR code. This is due (in part) to the extra overhead Visual Studio imposes when you ask to debug the CLR procedure using the designated test script.
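A timing harness along these lines can be sketched as follows; the loop count and names are my assumptions, and the book's actual script is in Figure 13.22:

```sql
-- Hypothetical sketch of a timing loop; the book's script is in Figure 13.22
DECLARE @i int, @Start datetime, @Dummy float
SET @i = 0
SET @Start = GETDATE()
WHILE @i < 100000
BEGIN
    SET @Dummy = dbo.fnFahrenheitToCelsius(212.0E0)   -- CLR version
    SET @i = @i + 1
END
SELECT DATEDIFF(ms, @Start, GETDATE()) AS [CLR elapsed (ms)]

-- Repeat the same loop against the T-SQL version and compare the timings
```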

Figure 13.22. Test script to compare CLR executable vs. T-SQL execution times.


This means you'll want to build test scripts to test and debug (and step through) CLR executables using Visual Studio, but that's it. Don't depend on the timings you get back from Visual Studio for performance metrics. I created a report to show the results of these test runs as shown in Figure 13.23. Note the scale of the X-axis.

Figure 13.23. Comparing CLR executable and T-SQL performance.


In this test (using SQL Server Management Studio), the CLR-based temperature conversion function runs over seven times slower than the equivalent T-SQL code. This means that the relatively simple task of performing a temperature conversion should not be done in a CLR executable; yes, this is an example of what not to do.
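For reference, the equivalent T-SQL scalar function is tiny, which helps explain why the CLR call overhead dominates. A sketch (the function name is my assumption):

```sql
-- Hypothetical T-SQL equivalent of the CLR conversion function
CREATE FUNCTION dbo.fnFahrenheitToCelsiusTSQL (@dblTempIn float)
RETURNS float
AS
BEGIN
    RETURN (@dblTempIn - 32.0E0) * 5.0E0 / 9.0E0
END
```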

Let's take a look at a few situations where CLR executables are supposed to make more sense.

IMHO

Are CLR executables a solution looking for a problem?


Evaluating the GPS Example

I keep hearing that the best use of CLR executables is where T-SQL is forced to perform "CPU-intensive" or complex mathematical functions. I've also heard that the calculations needed to work with Global Positioning System (GPS) data are sufficiently complex that they could best be done in a CLR executable. To this end, I created a sample database[7] that has the (rough) GPS coordinates of about 1,200 U.S. cities. This data was fed into a formula to determine the distance between each of these cities. The resulting table has about 1.5 million rows. The distance calculation is performed in both a T-SQL function and an equivalent CLR function; the test suite compares the total execution time of both techniques. In this test, the CLR implementation is slower, but only by 0.005 percent when processing the 1,478,523 rows; basically, a dead heat. However, in this case, I found it easier to code the CLR function, as it affords a far richer choice in conditional logic and error checking, as well as a familiar development toolset. I expect T-SQL developers might find just the opposite. I also expect that as the computational needs of your project increase, the difference in performance will be more pronounced; that is, the CLR implementation might make more sense.

[7] This is the GPS database on the book's DVD.
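The book's distance formula isn't reproduced in this section, but a great-circle (haversine) calculation is representative of the math involved. A hedged T-SQL sketch follows; the function name and the choice of formula are my assumptions, not necessarily what the sample database uses:

```sql
-- Hypothetical great-circle (haversine) distance in miles;
-- the book's actual formula may differ
CREATE FUNCTION dbo.fnDistanceMiles
    (@Lat1 float, @Lon1 float, @Lat2 float, @Lon2 float)
RETURNS float
AS
BEGIN
    DECLARE @dLat float, @dLon float, @a float
    SET @dLat = RADIANS(@Lat2 - @Lat1)
    SET @dLon = RADIANS(@Lon2 - @Lon1)
    SET @a = SQUARE(SIN(@dLat / 2.0E0))
           + COS(RADIANS(@Lat1)) * COS(RADIANS(@Lat2))
           * SQUARE(SIN(@dLon / 2.0E0))
    RETURN 3956.0E0 * 2.0E0 * ASIN(SQRT(@a))   -- 3956 = Earth radius in miles
END
```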

The bottom line? Well, given that the CLR GPS calculation came pretty close to matching T-SQL performance, there are going to be situations where CLR executables can make a (positive) performance difference. I would expect these "better" solutions to be more complex than the single GPS distance calculation before you could justify their use on the basis of performance alone. Let's keep experimenting to see if there are other ways to use a CLR executable that can really justify their use.

Creating a Table Value Function

One of the more interesting (and useful) features first enabled in SQL Server 2000 is table-valued functions (TVFs), which permit T-SQL functions to return a code-generated rowset to other T-SQL logic, such as a SELECT statement. I talked about this functionality before, when I showed you how to implement a T-SQL IN clause that takes a dynamic list of arguments passed as a delimited string. In that example, I used a T-SQL TVF to convert a delimited string to a SQL Server table that's fed into the IN clause via a SELECT. Based on the number of times developers ask for this capability, I'm surprised that Microsoft has not built it into T-SQL itself.

At this point, I want to show you how to implement a TVF using a CLR function. It's only a bit more involved than a typical CLR function, as you have to implement the interfaces that the CLR compiler and SQL Server expect to see when you return a "table" structure. An important difference between T-SQL TVFs and those you're about to build in a CLR function is that a T-SQL TVF is persisted (at least temporarily) to TempDB before the results are returned to the caller. This means disk I/O is required to implement them. In the CLR case, the rows are streamed into memory instead of stored to disk[8], so the rows can be passed immediately to the caller as they're constructed.

[8] I expect that the data "inserted" into a T-SQL TVF is actually cached to memory, not hard disk, at least for smaller tables.

Coding the TVF

In order to implement a CLR streaming TVF, you'll have to add support for the IEnumerable Framework interface. This is really pretty easy (despite what you might have read in some of the online articles). You'll also need to set specific attributes on the CLR function so that the compiler knows what the function is trying to return.

Start a new project and set up a CLR function as I did in the previous exercise. In this case, I need to tell the compiler that I plan to return a table. This is accomplished by defining the FillRowMethodName and TableDefinition in addition to the function Name. To help remind you of these attribute arguments on the SqlFunction definition, Visual Studio exposes an IntelliSense dialog, as shown in Figure 13.24.

Figure 13.24. Visual Studio helps define the function attributes with IntelliSense.


In this case, I'm going to call the TVF "DelimitedStringToTable" and name the routine that returns "items" from the generated table "GetItems". The table returned is defined with a single nvarchar column that limits the individual delimited items to 500 (Unicode) characters; that should be plenty. All of the work is done in the InitMethod routine, which simply uses the Visual Basic (or C#) Split function to break up the delimited string into individual elements. Note that InitMethod returns IEnumerable, which deals with all of the plumbing behind the scenes. The entire CLR function is shown in Figure 13.25.
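The pattern described above can be sketched in Visual Basic as follows. This is a hedged sketch: the attribute values and routine names follow the text, but the details of the book's actual code (shown in Figure 13.25) may differ:

```vb
' Hedged sketch of the CLR streaming TVF described in the text;
' the book's actual implementation is shown in Figure 13.25
Imports System.Collections
Imports System.Data.SqlTypes
Imports Microsoft.SqlServer.Server

Partial Public Class UserDefinedFunctions
    <SqlFunction(Name:="DelimitedStringToTable", _
                 FillRowMethodName:="GetItems", _
                 TableDefinition:="Item nvarchar(500)")> _
    Public Shared Function InitMethod(ByVal sDelimitedList As SqlString, _
            ByVal sDelimiter As SqlString) As IEnumerable
        ' Split does all the work; each array element becomes a row
        Return Split(sDelimitedList.Value, sDelimiter.Value)
    End Function

    ' Called once per element to fill in each row's single column
    Public Shared Sub GetItems(ByVal obj As Object, ByRef Item As SqlString)
        Item = New SqlString(CStr(obj))
    End Sub
End Class
```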

Figure 13.25. The DelimitedStringToTable CLR function.


I also wrote a T-SQL routine that duplicates the CLR Split function, shown in Figure 13.27, so I could compare performance between the two implementations. To test the two TVF routines, I created a test harness, as shown in Figure 13.26.

Figure 13.26. Testing the CLR and T-SQL TVF functions.


In stark contrast to earlier tests, when I ran this test, I discovered that the T-SQL version is dramatically slower than the CLR version (by over three times). The fact that CLR languages know how to split a delimited string with a single line of code probably plays an important role in the performance boost.

Figure 13.27. The DelimitedStringToTableTSQL T-SQL function implementation.
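The T-SQL version has to loop with CHARINDEX and insert each element into a table variable, which illustrates why it does so much more work than a single Split call. A hedged sketch follows; the book's actual implementation (Figure 13.27) may differ:

```sql
-- Hypothetical sketch of a T-SQL delimited-string TVF;
-- the book's version is shown in Figure 13.27
CREATE FUNCTION dbo.DelimitedStringToTableTSQL
    (@List nvarchar(4000), @Delimiter nvarchar(10))
RETURNS @Items TABLE (Item nvarchar(500))
AS
BEGIN
    DECLARE @Pos int
    SET @List = @List + @Delimiter                    -- ensure a trailing delimiter
    SET @Pos = CHARINDEX(@Delimiter, @List)
    WHILE @Pos > 0
    BEGIN
        INSERT @Items (Item) VALUES (LEFT(@List, @Pos - 1))
        SET @List = SUBSTRING(@List, @Pos + LEN(@Delimiter), LEN(@List))
        SET @Pos = CHARINDEX(@Delimiter, @List)
    END
    RETURN
END
```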


Evaluating Performance with the SQL Profiler

The difference in performance was measured by watching both queries execute with the SQL Profiler; the results are shown in Figure 13.28. If you inspect the Profiler trace, you can see that the CLR function test consumed 391 CPU milliseconds, while the T-SQL version took 1,219, a big difference. But also note that the CLR function required no disk I/O, while the T-SQL version required 25,000 I/O reads. Remember, a T-SQL TVF uses TempDB to store its data. It's also important to note that the duration was skewed to the CLR side, but not by as much as the difference in CPU time.

Figure 13.28. Using the Profiler to compare CLR and T-SQL TVF performance.


It looks like we've found a good use for CLR executables. Yes, I agree that when you need to perform string manipulation with T-SQL (as when parsing XML), the Framework's native string-handling capabilities are far richer than those supported by T-SQL. This can help improve server-side executable performance.

Let's see how a TVF can help in an ordinary query. One of the ways to use TVFs is to process pick lists. For example, when a multi-select ListBox is used to pick one or more items that focus the WHERE clause, the T-SQL can use an IN expression to narrow the resultset. One of the problems with the IN expression is that it can't accept a parameter. However, it can accept a rowset from a SELECT. Here's where I use the CLR TVF (DelimitedStringToTable) to feed the IN expression. The example in Figure 13.29 shows how this is implemented in a stored procedure.

Figure 13.29. Using a CLR Table Value Function to feed an IN expression.


And Figure 13.30 shows how I tested this stored procedure by passing a delimited string as a stored procedure parameter.

Figure 13.30. Passing a delimited string to an IN clause via a TVF.
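The pattern in Figures 13.29 and 13.30 can be sketched like this; the table, column, and procedure names are my assumptions, not the book's actual code:

```sql
-- Hypothetical sketch of feeding an IN expression from the CLR TVF;
-- the book's actual code is shown in Figures 13.29 and 13.30
CREATE PROCEDURE dbo.GetCustomersByState
    @StateList nvarchar(500)
AS
SELECT CustomerID, CompanyName, State
  FROM dbo.Customers
 WHERE State IN (SELECT Item
                   FROM dbo.DelimitedStringToTable(@StateList, N','))
GO

-- Test it by passing a delimited string as the parameter
EXEC dbo.GetCustomersByState N'WA,OR,CA'
```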





Hitchhiker's Guide to Visual Studio and SQL Server: Best Practice Architectures and Examples, 7th Edition (Microsoft Windows Server System Series)
ISBN: 0321243625
Year: 2006
Pages: 227
