Now that you understand how to build CLR functions, you're ready to step through the implementation of the other CLR executables. Thankfully, most of the other executable types share a few attributes in common, so you won't have to start from scratch. We start with stored procedureslike CLR functions, CLR stored procedures can be prototyped and tested quite easily with Visual Studio 2005. However, there are a number of tips I can provide to make the process of coding, deploying, and debugging your CLR stored procedure easier. Let's start by comparing functions and stored procedures. Table 13.1 highlights the fundamental differences.
Note that functions are generally used in expressions and aren't meant to return more than a single rowset (as a SQL Table). The function's return value is more flexible, in that you can return almost any datatype, while a stored procedure RETURN returns only a SqlInt32 value. However, that's deceptive, in that a stored procedure can return virtually any number of OUTPUT parameters of almost any type. When stored procedures are coded, the arguments passed in the "parameters collection" (the arguments passed to the CLR stored procedure) must be specifically designated so the CLR compiler knows which parameters are input and which are output. Sure, an output parameter can also accept an input value, and in testing, you might have to force a value to get Visual Studio to accept it. Later, I'll show you how to return one or more rowsets and "PRINT" messages from a CLR stored procedureagain, something you can't return from a function. Tip Later in this chapter, I illustrate how to code and debug a CLR stored procedure that accepts a CLR UDT as an input parameter. The example stored procedure I'm going to build addresses one of the reputed weaknesses in T-SQLnamely, string handling. The challenge is to search a varchar(max)[11] column for a specific set of words and report the hit count. The CHARINDEX function in T-SQL does a pretty good job of this, but it's only part of the coded needed to manage the incoming list of search words and the resulting output rowset.
IMHO When building any CLR executable, you have to balance the performance cost or benefit with the amount of developer productivity you gain or lose. In the process of building this example, I created four versions of the codeeach a bit more sophisticated than the last. This approach should help you get a better understanding of the issues one at a time and not overwhelm you with a completed (albeit somewhat complex) end-product. Each of these examples is driven from a delimited list of words to locate in the BookContent column of the BookText table in the BookSource database (created just for this chapter). These examples[12] can be briefly described as:
Building the Example CLR Stored ProceduresCreating a CLR stored procedure using Visual Studio is not much different than creating a CLR function, so I'll skip down to the stuff that is different. Thankfully, testing is usually pretty easy as welluntil Visual Studio decides that it's quitting time when it seems that nothing will get it to work right. I'll show you how to avoid those situations and when to just reboot and start over. Again, it's always a good idea to keep saving your applications and exit Visual Studio from time to time to let it clean its house of leftover application chad. Let's walk through the process of creating your first example CLR stored procedure.
Using the SqlContext Class to Return ResultsetsNew for the 2.0 Framework, the SqlContext class is used to expose properties and methods that are passed directly to the local SQL Server instance hosting the CLR code you're executingthe local "context". This context is used whenever working with stored procedures or any other CLR code that needs to access the host server. For example, when implementing a CLR trigger (which is simply a form of stored procedure), the SqlContext exposes the pseudo "inserted" and "deleted" tablesjust like the pseudo tables exposed to T-SQL triggers. Let's take a short tour of the SqlContext class and highlight the most interesting points of interest:
I use the SqlContext class to send the example's persisted rowset back to SQL Server as a resultset. Note that you don't need to have an open Connection for the SqlContext properties and methods to work. An open Connection (pointing to the local context) is required only when using ADO.NET to execute a SqlCommand in the local context. Let's continue stepping through phase 2 of the example.
Basically, that's it for the CLR stored procedure code. Testing the procedure is our next challenge. Testing the CLR Stored ProcedureYou'll need a test script to try out the new CLR stored procedure. Visual Studio has created one of these for you (sort of)at least, it's given you a reasonable starting point to build a test script. Figure 13.45 is my test.sql script. This script should be installed in the projectassuming you want to debug the CLR stored procedure with Visual Studio. Nope, don't plan to evaluate performance using this techniqueas discussed earlier, Visual Studio adds a ton of overhead to the process of executing and stepping through the CLR code. Figure 13.45. The test.sql test script for V1 of my CLR stored procedure.
Since I plan to debug using Visual Studio, this test script should be installed in the project using the Project Explorer. Yes, I discussed this earlier when building my first CLR function, so refer back there for more details on setting up a debug script if you skipped that section. When you start the debug process (as described earlier), Visual Studio starts the test script. If you didn't set a breakpoint, the entire test script is executed and the resultsets are displayed in the Visual Studio "Output" (debug) window. The results are not formatted very neatly, but you can tell if the stored procedure threw an exception or returned the right set of data. As you step through the code and reach the ExecuteReader code (line 38), Visual Studio displays a new window that contains the T-SQL about to be executed, along with the value of any parameter supplied. This feature is new for the RTM versionI can't recall seeing it in the Beta or CTP versions...but I could be wrong. Anyway, this is a handy way to determine just exactly what's getting executed. You can also inspect the contents of the input parameter using the new Visual Studio Intellisense feature (as shown in Figure 13.46). Figure 13.46. The T-SQL being executed by the ExecuteReader method.You'll also discover that as you debug your CLR executable, Visual Studio walks you right into the T-SQL functions invoked by the T-SQL code being executed. In this case, you're taken into the source code for the CountWordsInString function; again, you can use Intellisense to see the contents of the T-SQL variablesjust select the variable. No, this version does not support edit and continue. That's a bit of a tougher problem to crack when the code is running on a remote SQL Server. I don't expect that I'll see that feature before we're out of Iraq. Figure 13.47. Debugging with Visual Studio walks you into the underlying functions.Another interesting point that should be mentioned is how SQL Server manages the execution thread of this code. When debugging this routine (well before it was working), I found that stepping into the routines that worked with the resultset returned by the SqlDataReader was interrupted by jumps back into the CountWordsInString T-SQL function. That's because as SQL Server processed the query, it was running separate threads to manage both the "server side" of the query (the function invocation and row-fetch operations) and the "client side" (the CLR stored procedure code). This can be a bit confusing at first and might require that you trap exceptions and data value snapshots as you go. Exploring the Remaining CountWords ExamplesThe second (V2) "count words" example is very similar to the first versionit simply uses a DataTable to persist the intermediate data instead of using a collection. I took this route because I wanted to illustrate how to use the DataTable object's DataView to sort the rowset before passing it back to the client. As I said before, I also wanted to illustrate the use of filling the SqlPipe with the results from a DataReader. Unfortunately, this approach failed, as the DataTable class's CreateDataReader does not extrude a data reader consumable by the SqlPipe method. This means that you can pass only a SqlDataReader to the SqlPipe object. The third (V3) example is worth exploring a bit, as I took an entirely different approach to handling the job of locating words in the target column strings (varchar(max)). Instead of calling a T-SQL routine to find the words (as was done in the baseline T-SQL example), V3 takes on that task itself. The query used to return data to the CLR stored procedure is shown in Figure 13.48. Figure 13.48. Returning the varchar(max) data to process.
The rowset returned from this query would overwhelm the client memory space if I created a DataTable to hold it, but I'm only going to work with a single varchar(max) data column at a timeas quickly as I can. No, this data is not being moved over the Net, so this approach should be dramatically faster than a client/server program that does the same job. You should also save time and CPU cycles by implementing the "search for words in a string" code locally instead of having SQL Server call a T-SQL routine. Sure, one alternative approach would have been to replace the T-SQL CountWordsInString function with an equivalent CLR function. Just to get an idea of what's going on behind the scenes, let's compare the two approaches. First, the T-SQL CountWordsInString function (shown in Figure 13.49). Figure 13.49. The T-SQL CountWordsInString function.
The equivalent CLR CountWordsInString function (called as I out-process the rowset) is shown in Figure 13.50. Note that this routine would have been harder in earlier versions of .NET, as the IndexOf method was introduced in the 2.0 Framework. It's a pretty powerful method, as it permits you to choose a case-sensitive or non-case-sensitive search (along with other options). Figure 13.50. The CLR CountWordsInString function.
Evaluating CLR vs. T-SQL PerformanceAs, with all of these versions, which do you expect to run faster than the T-SQL code? Given the pitiful performance of the temperature-conversion example, perhaps none of them. To test the routines, I set up a number of individual test scripts to run that include dumps of the operational statistics. You'll see what I mean in a minute. Let's take a look at one of the test scriptsthe one used to test and evaluate the T-SQL version. Remember, you can't use Visual Studio to test these routinesit adds too much overhead. The base test script is shown in Figure 13.51. Each example can be called from this templateI simply change the name of the stored procedure to execute. Figure 13.51. Test script to exercise the CountWords examples.
To fairly measure the four examples, I ran this same script using SQL Server Management Studio with the "Show Client Statistics" option enabled, which is exposed as a button on the toolbar shared by the Execute button (as shown in Figure 13.52). Figure 13.52. Enabling client statistics diagram in SQL Server Management Studio.
Once enabled, SQL Server Management Studio captures and persists operational statistics about each query executed in this window. By changing the name of the stored procedure executed for each test, you can to get a rough idea of the overall performance among the various implementations. I ran each version once, which might favor the first or last routine executed because of caching, so I re-ran the T-SQL example again to check the validity of the resultsthey were identical. Sure, I had run the test a dozen times beforehand, so the data was probably all sitting in the data cache, so disk I/O would not be a factor in this evaluation. The results from all four tests (T-SQL and the three CLR implementations) are shown in Figure 13.53. Figure 13.53. The client statistics chart for the CountWords examples.If you look closely, you can see that the bottom-line results are pretty startling. Microsoft was actually rightthe CLR code is faster, a lot faster. The final CLR example shown as Trial 4 in the chart took only 78ms to execute, while the T-SQL version took 640msover eight times faster. Note that the other CLR implementations did not do any better (but not really any worse) until I incorporated my own CountWordsInString function in the CLR stored procedure. |