Implementing Basic User-Defined Type (UDT) CLR Executables

One of the more appealing features of CLR executables is the ability to define custom user-defined datatypes, or UDTs, for short. Like the sirens of Homer's Odyssey, CLR UDTs can be a very attractive way to gain functionality, but at the same time, it can be hard to prevent your development team from sailing up on the rocks. No, I don't recommend putting wax in their ears. While it worked for Odysseus, his crew was too deaf to avoid the Charybdis and Scylla later in the voyage. The challenges of choosing to implement CLR UDTs include:

CLR UDTs can add significant complexity to your database and the applications, tools, and training needed to access and manage them.
Once a table is defined with a CLR UDT, you won't be able to change the UDT without dropping the table column. The current technology does not support any "versioning," so you'll have to perform all data transmogrifications yourself. However, as I'll illustrate, it's possible to create a UDT that can accept data in a number of formats. This means that you'll be able to more easily migrate from one implementation to the next.
While a UDT can be referenced in code not hosted in the SQL Server database, it requires another set of steps. This involves manually adding a reference to the host project and understanding how to reference, format input data, and convert from one format to another. All of these points are illustrated in this section, as well as in subsequent examples later in the chapter, where I use the UDT in an aggregate, in a Windows form, and in a CLR trigger.
Referencing UDT public properties exposed by the CLR code can be problematic, as these are hidden from developers that don't have the source code for the UDT. Without the source code, you won't be able to use Visual Studio or SQL Server Management Studio to explore a UDT. Yes, I expect that a UDT can be explored with a decompiler like Reflector.
You'll discover that you can't SET more than one UDT property at a time in an UPDATE or INSERT statement. This means you'll need to write custom code to make changes to rows containing UDTs.
Some UDT implementations prohibit use of SELECT * to return all rows and all columns because of the way that NULL is handled. Yes, the SQL Server Management Studio and Visual Studio tools bypass this by simply referencing the columns by name in the SELECT.
The Visual Studio 2005 IDE designers and wizards, like the Data Source or TableAdapter Configuration wizards, don't support UDTs. This means you won't be able to auto-generate strongly typed DataSet classes against tables that contain a UDT. However, you can create a VIEW on a table that contains a UDT and exposes the properties individuallythese can be used to generate read-only strongly typed TableAdapter objects.
You'll likely invoke the CLR code each time the UDT column is accessed. This means that the CLR UDT can add a significant performance penalty to your application, depending on its implementation and the complexity.

Okay, given these challenges, why would one want to bother creating a CLR User-Defined type? Well, suppose you have a data structure that lends itself to being contained in a single column. Examples that come immediately to mind include:

A "currency" datatype: In this case, you could store a currency value along with the monetary unit associated with the type. That is, you could store 234.56 currency value along with other properties to indicate the value is in British pounds. To implement this UDT, you could store the value using a high-precision monetary datatype, a currency symbol (£, in this case), the currency type (GBP), and the currency name ("British Pounds"). Once this UDT is implemented, you can aggregate (sum) columns regardless of the currency type by converting to a base type before adding the value to the sum. A currency UDT would also permit you to expose your data anywhere without fear of a localized value setting misinterpreting the intended value. Yes, this could be implemented with separate columns, but this assumes that the applications written to manage these values carefully tracked the currency type. Later in this chapter, I implement a currency UDT described here.
A "point" datatype: Many of the examples Microsoft uses in the documentation illustrate UDTs with a data structure that manages the x/y coordinates of a point. Since these values should be handled as an atomic unit, this is an ideal (albeit expensive) way to manage these values.
A "GPS" or latitude/longitude datatype: When referring to a GPS location in latitude and longitude, you're really dealing with eight values: latitude and longitude in hours, minutes, seconds, and declination. For example, 47° 40' 10" N 122° 3' 26" W is the GPS location for Redmond, Washington. Incidentally, this "degrees, minutes, and seconds" notation is referred to as "Sexagesimal," in case they ask you on Jeopardy. Again, GPS data is an ideal candidate to incorporate into a single atomic unit of information. The individual minutes, degrees, and seconds of latitude and longitude don't have much meaning in of themselvesit's only when they are used to form a complete address that they make sense for the data row to locate a specific point on the earth. This UDT is also implemented and discussed later in this chapter.

So, yes, there are specific cases where CLR UDTs seem to make sense. If you consider that the UDT implementation can embody comprehensive validation logic to ensure that each component meets specific range criteria, you can see how a CLR UDT can help keep data validregardless of the source.

Coding CLR UDTs

As with the other CLR executables I've created so far, you start building a CLR UDT using the same Visual Studio "Database" project type as previously discussed, but in this case, you should choose "User-Defined type" from the "Add New Items" menu. Name your UDT as neededin this case, I call it "typGPS". Our UDT is going to store the latitude and longitude for GPS data, as discussed earlier.

IMHO

I don't really expect this CLR UDT to improve overall performance.

Once you choose the UDT template, Visual Studio generates a block of code that exposes prototypes for all of the required properties and functions, as well as a couple of optional methods that you might choose to implement. This generated template (in compressed form) is shown in Figure 13.54. As you can see, a UDT is defined as a Visual Basic .NET Structure or C# struct that implements INullable.

Figure 13.54. The default CLR UDT template.

Along with the critical functions and properties, the CLR UDT prototypes a pair of methods that you can flesh out if your UDT needs them. For example, if you need to call an initializer, you might want to implement it in one of these methods.

Defining the CLR Public Properties

When it comes time to use your UDT in code, you'll get direct access to the public properties you define. One of the most complex issues I had to figure out was how to make sure these properties are exposed as the desired datatype. As the CLR code is exposed to both SQL Server and .NET Framework applications, transmogrifying between these types can be problematic. I'll point out the extra code and special attributes I had to add to get this UDT working as it should.

One important point: When you instantiate any SqlClient datatype (SqlTypes), its default value is not 0it's NULL. This can be confusing, as without presetting a value, you'll discover that numbers added to a SqlDecimal don't seem to accumulate. That's because NULL + anything = NULL.

You have some flexibility when defining a CLR UDT, or it would appear so. It turns out that if you choose to store fixed-length component values (like integers or floating-point numbers) that can be stored in data structures that don't expand and contract as values change, the task of creating a CLR UDT is a lot easier. However, if you need to store variable-length values such as strings, SqlDecimal, SqlMoney (and others), or XML in a UDT, you'll have to do the serialization yourself and expose appropriate functions (IBinarySerialize.Write and IBinarySerialize.Read) in the CLR UDT code to do the work. Don't get excited. Serialization can be done quite easily. The GPS UDT implemented next uses fixed-length values and "native" serialization. I'll use "custom" serialization for the typICurrencyV2 UDT built later in this chapter.

Constructing the Custom Property Code

At this point, you'll need to define the properties used to hold the component values. There are several important points here that you should not overlook:

I recommend using the Visual Studio Class Diagram tool (shown in Figure 13.55) to help construct the component Properties and Fields. Basically, this tool provides a dialog-based code generator that can make the tedious task of entering all of the component properties easy. One strange side-effect of this tool is that it creates a class diagram and adds it to your projecteach time you click on "View Class Diagram". Before long, you'll have a half-dozen class diagrams in your project. It's best to simply click on the auto-generated class diagram to view its contents.

Figure 13.55. The class diagram for my initial GPS UDT.
Note there are two parts to the CLR Propertythe private "Field" used to store the data and the public Property used to Set and Get the value. When you decide how your CLR UDT is to be defined, you'll need one of each of these for each component. For the UDT, I created eight public properties to hold values for the GPS latitude and longitude values. The UDT template had already created both IsNull and Null public properties.
The Property constructed by the Class Diagram tool is not complete. You'll have to visit each property and manually set the Get and Set code. In this case, these are all coded in exactly the same way (as shown in Figure 13.56), so it should not be hard to populate these Property routines.

Figure 13.56. These properties were partially generated by the Class Diagram tool.
The names you assign to the CLR UDT Properties are the names used to reference the component parts of the UDT in T-SQL. No, none of the Visual Studio or SQL Server GUI "RAD" tools exposes these properties, so if you don't have access to the source code, there is no way to determine these property names later.
Yes, when you instantiate the UDT in an application, Intellisense does expose the public properties and their associated datatypes.
You should try to use SqlTypes in lieu of .NET Framework types, as they don't require transmogrifying as the code navigates between layers. This means better performance and more functionality.
Remember that if you want the UDT public properties to support nullability, you'll need to define them using SqlTypes.NET Framework types cannot be set to NULL.
The order in which the local (private) memory Field objects are defined sets the order in which the values are stored in SQL Server. This might seem like a "who cares" issue, but when it comes time to debug your UDT, you'll want to be able to know which property is which.

Setting the UDT SqlUserDefinedTypeAttribute

The UDT attributes tell the compiler how the UDT is to be formatted and accessed. These attributes are coded within < > symbols ahead of the Function declaration, as shown in Figure 13.57. Basically, there are three types of "formatting" (set by the Format attribute) when defining UDTs that determine how SQL Server stores and retrieves data in the database column: Native (the default), UserDefined, and Unknown, as detailed shortly. The SqlUserDefinedTypeAttribute attribute is also used to define the StructLayout, which determines how the UDT is serialized (saved and restored from binary). The SqlUserDefinedTypeAttribute properties include:

Format: Dictates the serialization format as described earlier. The default is Native, which permits the UDT to store fixed-length value types and implements a very simple serialization formatperhaps too simple for your needs. If your UDT includes reference types (including String), you must specify UserDefined as the Format and implement your own custom serialization routines. Don't be alarmed by this requirement; a serializer/deserializer is not that tough to write, as I illustrate later.
IsByteOrdered: Indicates whether the UDT is byte-ordered (in the same way as SQL Server Byte types are ordered in memory). If True, the UDT can create an index on the UDT column. This attribute determines how SQL Server performs binary comparisons on the data. This setting also determines how your custom serializer constructs the binary version of the UDT before you write it using IBinarySerialize.Write.
IsFixedLength: Indicates whether all instances of the UDT are defined as the same length (a value = to MaxByteSize. This attribute must be set if you specify a UserDefined format. Again, this value is used by your code to write N bytes as you serialize.
MaxByteSize: No UDT can be longer than 8,000 bytes. This is a version-specific requirement that's a function of the size of the SQL Server data page. Having any single column this long is going to be problematic in any number of ways, so if 8,000 bytes is an issue, you need to think about normalization. You don't have to set this when working with Native formatted UDTs.
Name: The name used by Visual Studio tools to reference the CLR UDT. I discuss how to code T-SQL to reference your new UDT later in this section. Once deployed, you'll see this name appear in the datatype chooser dialogs in the Visual Studio and SQL Server tools. The name used by SQL Server is defined by the Structure or struct name.
ValidationName: This string addresses a method in the UDT to validate the UDT instance when deserialized from an untrusted binary source.

Figure 13.57. CLR UDT attributes and Structure declaration.

I'll implement the GPS UDT to illustrate how to code most of these attributes.

The CLR UDT also includes the StructLayout attribute that defines how SQL Server should lay out the UDT structure in memory. This structure permits you to set the LayoutKind, CharSet, Pack, and Size. I won't be illustrating these finer points here. Suffice it to say that there are three options:

Sequential: In this case (the default), objects are laid out sequentially. You can dictate the packing size using the StructLayoutAttribute Pack value.
Auto: In this case, the Framework runtime automatically chooses an "appropriate" memory layout. If you choose Auto, any attempt to expose the UDT outside of managed code throws an exception.
Explicit: In this case, you choose how each element of the UDT is laid out in memory using the FiledOffsetAttribute.

As you can see, you have a great deal of control over how the CLR UDT is configured, serialized, and laid out in memory. If you implement a UDT, I suggest you experiment with these options to find the optimal storage formatit can materially impact storage space, indexing performance (or ability), and fetch/store/search performance.

The Public Structure Declaration

The code shown in Figure 13.57 simply names the UDT structure^[15] and defines the serialization scheme. It's the name used by T-SQL to refer to the UDT when extracting data or defining a SQL Server column using the UDT. Directly after this structure declaration, you'll likely want to declare your local variables that hold the UDT component elements.

^[15] Note that in pre-RTM versions of the 2.0 Framework, the template simply used the Class name and did not define a structure to define the UDT.

The ToString Function

This code is responsible for implementing requests to return the binary UDT data as a string. You decide how the data is to be exposed. For example, if your UDT stores a phone number (country code, area code, prefix, number, extension), you could return a string such as "+011 (425) 555-1212 X123" when the UDT ToString method is called in code. For the GPS UDT example, you store eight parts: latitude and longitude degrees, minutes, seconds, and declination. These are returned as a string in the ToString function shown in Figure 13.58.

Figure 13.58. Constructing a string to display the contents of the GPS UDT.

The UDT ToString function is called when a T-SQL query executes a CONVERT function to retrieve the column, as shown in Figure 13.64, later in this section.

The IsNull Property

This property is exceptionally simple; it just returns the state of the private null property.

The Null Property

The Null property returns a new instance of the CLR UDT and sets the private null property to True, as shown in Figure 13.59.

Figure 13.59. Managing NULL state with UDT properties.

The Parse Function

The Parse function is by far the most complex part of the UDT. It's used to parse a formatted string submitted by a T-SQL query and transmogrify the values to the UDT public properties. Typically, this means your code has to convert from a string and save to another datatype. However, in this example, I simply parse a sexagesimal-formatted latitude and longitude string (such as "47° 40' 10" N 122° 3' 26" W") extracting the individual properties. Note that the string shown here in the book uses "smart quotes". This means that the single quote and double quote are different symbols than the ASCII single and double quote symbols. I incorporated this functionality into the UDT parse routine to make it easier to enter a GPS location and not have to worry about doubling up the single quotes. Yes, this is a variation in the "O'Malley" rule that I discussed when working with ADO.NET Command Parameter objects.

The Parse function must walk through the input string (whose format you define) and set the value of each and every component of the UDT. In the GPS example, the Parse routine is particularly long, as it has to pick eight elements out of the string. It's shown in Figures 13.60 and 13.61.

Figure 13.60. Finding the argument markers in the inbound formatted GPS string.

Figure 13.61. Set the UDT component properties from the string values supplied.

This code creates a number of integer pointers that are set based on the presence of specific markers for each of the eight component pieces of the latitude and longitude string. For example, you look for the degree symbol after the latitude and longitude degree value. If any of these markers are not found, the routine throws an exception and the operation is cancelled. That's an important part of the Parse function. It must reject any input string that does not comply with the specifications that you define. Sure, you can define as many different input string variations as you choose to support. When I implement the typICurrencyV2 UDT, I'll permit several variations of input string. This permits me to pass strings formatted for older versions to be accepted by the latest version.

When the public properties defined for the GPS UDT are defined as SqlByte, I suggest using SqlTypes throughout to make movement of data into and out of the UDT easier. In several cases, I use the SqlByte.Parse function to convert the string representation of a number to the SqlByte value. If one of the Parse functions fails, the CLR throws an exception and returns control to the SQL Server operation attempting to execute the T-SQL.

Yes, I could also add checks to ensure that the values supplied fall within the range of a proper latitude and longitude location. That is, the degrees latitude must be between 0 and 90, and degrees longitude must be between 0 and 180. While not implemented here for simplicity, I'll show how to add a validation routine to a UDT later in this section.

Debugging the CLR UDT

Because of several serious shortcomings in Visual Studio 2005, you'll find that debugging a CLR UDT under construction is kinda tiresomeand dangerous^[16]. I'll try to make it easy for you, but be prepared for an occasional lockup and Visual Studio crashing from time to time. Yes, save your work often. One of the problems you'll encounter is that the IDE always tries to redeploy your UDTeven though it might already be deployed from an earlier test (or an existing version). While SQL Server supports update-in-place for UDTs, Visual Studio does not use or give you the option of using this feature when starting a debug test run. This means you'll need to set up a separate T-SQL script and the means to execute it separate from the Visual Studio IDE. Nope, you can't just set up an "initialize" script and save it as a separate test.sql scriptthat's because the IDE tries to deploy on all scripts.

^[16] I lost an entire morning's work when Visual Studio crashed and lost the original project. Save often and back up your work frequently.

To build, deploy, and debug your CLR UDT, follow these steps:


1.	When you're ready to test your UDT, you'll need two test scriptsone that Visual Studio executes to exercise your UDT and another to initialize the server back to a pre-install state. That is, you'll need to completely remove the UDT and any references to it before you can debug your UDT using Visual Studio. I'll show you how to set these up. While there are many alternatives, I like to use an instance of SQL Server Management Studio to execute the initialization/cleanup T-SQL batch. The one I use to debug the GPS UDT is shown in Figure 13.62. Figure 13.62. The initialization script used to tear down the table and UDT prior to debugging. Make sure you designate the test.sql (or your chosen test script) as the default test script using the project menu. The problem is, Visual Studio always tries to build and deploy just before executing the test.sql test script. While this works the first time (because the CLR UDT does not exist), it fails after that because Visual Studio does not tear down the dependencies on the CLR UDT objectno matter how simple. CLR UDTs can be replaced "in place" like stored procedures and functions, but as I said, Visual Studio does not know how to enable this option. This means that they must be removed and replaced before you try to debug your code. This means you'll have to build a Test.sql script that creates the target database table and inserts rows into that table, as shown in Figure 13.63. Figure 13.63. The Test.sql test script used to create and populate the UDT table. As the complexity of your projects increases, you'll discover that you'll need to tear down every single table, aggregate, function, and stored procedure that references the UDT before you can replace it with Visual Studio. I'm of the opinion that this should be fixedand it might be very soon. Until it is, you might find it easier to simply construct your own deployment scripts that do permit "update in place" (ALTER). The Test.sql script was created by using the SQL Server Management Studio table designer to define the database table after having created the GPS User-Defined type (typGPS) the first time. I then used the Object Explorer to generate a script for the table and write it to the Clipboard. I subsequently pasted that T-SQL into the Test.sql script executed by Visual Studio in the IDE. Mark this script as the default test script using the project right-click menu. Note that the INSERT statement in the test script is fairly unremarkableexcept for the long (sexagesimal-formatted) string that defines a GPS location. Yes, the INSERT statement can pass data to the UDT in a number of ways, including passing a hex value. I'll show you how when I implement the typICurrencyV2 UDT later in this chapter. One of the problems I faced with this approach (of using single and double quotes to indicate minutes and seconds) is that I had to remember to double the single quotes here in T-SQL and double the double quotes in Visual Basic (and C#). Note that I tested with both ASCII and "smart" quotes.

2.	The debug scenario works like this. First, execute the SQL Server Management Studio tear-down T-SQL (as shown in Figure 13.62).
3.	Next, set a breakpoint in your UDT code. I usually break at the top of the Parse function.
4.	Next, in Visual Studio, click the Start Debug button or press F5. This starts the compile, build, and deploy sequence.
5.	Next, Visual Studio starts the Test.sql debug script that's designated as the default debug script. Make sure your script is so designated.
6.	The Test.sql script (shown in Figure 13.63) creates a new table using the just deployed CLR UDT and starts adding rows to it using the INSERT statements.
7.	If the debug process uncovers a bug in your code, you'll have to stop the debugger and start the process again at step 2.
8.	If the UDT works, you're ready to move on to the "select" phase to further test your UDT.

Using SELECT to Query UDT Data

Early in the development of this functionality (during the numerous beta releases), developers had to jump through a lot of hoops to get T-SQL to return data from their UDTs. Along the way, I saw versions that required C++ addressing technique and a number of very strange ways to simply return the data stored in a UDT. The shipping version (thank goodness) makes it easy to get your data back using a SELECT statement.

Figure 13.64 illustrates several ways to use SELECT to return data from a UDT column. The first technique first refers to the "GPS" column and next uses the T-SQL CONVERT function to return the data. The T-SQL CONVERT function calls the ToString function in your UDT so that the data is formatted as your code dictates. If you don't use CONVERT, the data returned is dumped out in hex, as shown in the "GPS" column in the first resultset.

Figure 13.64. Using SELECT to return data from the GPS column.

This hex notation needs a bit more scrutiny, so I created a detailed view of that data (as shown in Figure 13.65). Note that the SqlByte fields are actually stored as 16-bit values with the leading byte set to "01". If the column element is set to NULL, the leading byte is set to "00". Once you understand how the (all fixed-length datatype) UDT is stored, you can more easily debug and tune your code to optimize its use and database space consumption. As it is, I have been able to store eight (nullable) elements in a data structure that's only 17 bytes long. That's not too bad.

Figure 13.65. Data stored in SQL Server for the GPS CLR UDT.

No, you don't have to use the CONVERT function to return the entire fully formatted GPS location value; you can instead choose to address the public UDT properties or simply supply the raw binary (perhaps to impress one of your coworkers). It could not be much simpler. Each UDT component is addressable by name (the name you chose when you defined the UDT properties) using dot notation, as shown in Figure 13.66. Yes, you might want to CONVERT the raw hex data (as done in the example), but that's up to you and the people (or programs) that have to consume the data. Note that public properties exposed by your UDT can have a NULL value (because I used SqlTypes).

Figure 13.66. Using SELECT to address specific components of the UDT.

Tuning Your UDT for Performance

Every line of code contained in your CLR UDT plays a role in how well or quickly it performs. First, consider that SQL Server's "native" datatypes also consume CPU resourcesI take that for granted. However, some native types consume more resources than others, and I also take that into account as I design my databases. For example, I chose tinyint over integer or varchar to save disk space and reduce I/O traffic and memory consumption. CLR UDTs are no different. As you add components to your type, keep this in mind. Use the smallest, lightest datatypes possible to get the job done.

There are a number of things you can do to improve UDT performance, but these can mean compromised functionality. The version of the CLR UDT described earlier (let's call it "version 1") permits the inbound GPS location to be formatted in a human-readable format that's easy to visually scan for missing or out-of-place elements. It also supports NULL components, as I used the SqlTypes (SqlByte) to implement the type.

If you choose a simpler (albeit harder to read) input string, you can use the Visual Basic .NET and C# Split function to quickly convert the delimited string to an array of strings that can be written to the UDT components. I implemented this change, and the code is shown in Figure 13.67. It's simpler and smaller (and faster).

Figure 13.67. Using Split in the Parse function improves performance.

Note that I also changed the UDT to store the CLR public property values in byte objects. The net result of changing to native Framework byte objects means that instead of 34 bytes per column entry, the same data is stored in 18 bytesa hefty savings. However, I did lose the ability to set the public properties to NULL. Yes, you can still set the entire UDT to NULL (assuming you set up the table column to permit NULL values). Since a GPS location does not really make sense if a component is missing, perhaps not supporting NULL components is a good compromise.

Referencing and Updating UDT Values

You can code SELECT statements against the CLR UDT data in a number of ways. However, note that your UDT ToString is called only when using the CONVERT function. If you SELECT a UDT column, as in Figure 13.68 (line 2), without conversion, SQL Server does not call your UDT codeit simply returns the binary representation of the data stored in the database table column. This means that if you expect to reference the UDT public properties or the UDT column value as a whole, you'll need to include code in your project to decode itso any out-of-process application that consumes the UDT binary value must include a reference to your UDT in the project References section. Sure, T-SQL won't have any problem referencing and writing to your database's UDT. However, a Windows Forms, ASP, or Web Service application won't be able to reference it or even see it (in Intellisense) without adding a reference to the UDT structure.

Figure 13.68. Returning data from a CLR UDT using SELECT.

Handling NULL Values

When working with a structure like a UDT that exposes private properties cast as SqlTypes, remember that any one (or all) of these individual properties can be set to NULL. You have to decide if it makes sense that one of the properties has an "unknown" state. Does a GPS coordinate make sense if either the latitude or longitude hours are NULL? Not really. You can get a rough idea of a location given just the hours values without the minutes and seconds, but you still need the declination of both latitude and longitude for the coordinate to make sense. While I didn't go to the extreme of configuring the NULL handling in the GPS UDT in this manner, it's a good exercise for you to try.

If the GPS column value is NULL, depending on your implementation, you won't be able to fetch the data for that row. In this case, you'll have to use the CONVERT routine (Figure 13.68, line 4) that calls your CLR UDT ToString function that interprets NULLs specifically, or, if the type supports it, SQL Server notes the column as NULL. You'll also get an exception "Mutator 'strSymbol' on '<UDT> variable' cannot be called on a null value." if you try to set a public property when the UDT is set to NULL (its default state). Just make sure to initialize the UDT value before trying to set public property values.

Another approach is to set the OnNullCall method attribute. If this is set to False, the UDT returns NULL without evaluating the method when at least one of its input arguments is NULL. If OnNullCall is True (which is the default), the resulting value is determined by evaluating the method, regardless of the argument inputs.

The public properties of the UDT can also be retrieved using dot notation and CONVERT, as shown in the SELECT in Figure 13.68, line 14. When you fetch specific public properties in code, your UDT property Get code is called. Likewise, when you set a value in code, your property Set code is called. Another approach is to invoke the UDT ToString function directly, as shown in line 24. Later in this chapter, I discuss how to reference a UDT from a "normal" (not hosted in SQL Server) application.

Setting UDT Public Properties

While it's possible to use the T-SQL CONVERT function against all of the public properties of a CLR UDT, it's also possible (and tempting) to use the SET statement to set specific public property values. The problem with this approach is that it's expensive, considering that each trip to the UDT code burns CPU resourceseach SET statement makes a round-trip through the SQL Server to CLR UDT interfaces and back. However, there may be places where you feel this works better as when you have an "instantiated" UDT and want to change a single component property.

The other wrinkle here is that there is no "DECLARE X AS NEW Y" in T-SQL. This means you'll face the same object instantiation issues you face when declaring a variable in a CLR language like Visual Basic .NET or C#. You simply won't be able to reference the object without first creating an instance of the object. Because of this limitation, you'll still have to use the CONVERT function to set up the initial CLR UDT "instance" in a T-SQL applicationafter that, you can reuse it and set individual properties using the T-SQL SET statement, as shown in Figure 13.69. Of course, this is not an issue in a CLR language, as you can instantiate an instance of the UDT in code before referencing it.

Figure 13.69. Using the SET operator to set individual CLR UDT component properties.

The UDT component property names are case-sensitive. This means you'll not only have to know the spelling of the property, but you'll also need to observe the same case as the defined CLR code.

In this case, the code invokes the CLR UDT code four timesonce when I use the CONVERT and three more times when I use the SET statement on the value, symbol, and currency component properties.

Tip

You'll also find that you can't execute an UPDATE statement that sets more than one component property of the UDT.

Stepping Through the UDT Code

When you reference your UDT from T-SQL or some other application, the UDT code might be called many (many) timesor not at all. If SQL Server can figure out what to write to the UDT column without calling your code, it will. For example (as illustrated in Figure 13.70), the UDT code is not called until line 20. At that point (when the code sets the value of the @Price variable), the following sequence takes place:

The Parse function is called with "$50:USD". This creates a new instance of the typICurrency data state and sets its values to the symbol, money value, and currency type provided in the input string. These values are persisted in the private variables.
The Write function is called to return the encoded UDT private variable state to SQL Server. In this case, SQL Server simply saves this in the @Price T-SQL local variable.

Figure 13.70. Setting UDT values.

At this point, the UDT is stateless. It does not keep any values between invocations. When line 18 is called, the following sequence takes place:

SQL Server calls the UDT Read functionsending a binary stream containing the current state of the UDT value. Read deserializes the binary stream that is generated from the data state of the @Price variable (typed as typICurrencyV2). Read calls the Parse function to re-populate the private variable state. In this implementation, I'm able to take this shortcut because the serialized data structure is the same as the input string that Parse expects. This might not be the best option, so you'll probably want to code a more sophisticated serializer.
The public property (Price.strSymbol, in this case) Set routine is called. This saves the incoming value to the specific private variable.
SQL Server calls the Write function to capture the current state of the private variables.

Remember that a UDT does not maintain stateit can't. The data it knows how to parse and manage is kept in memory or stored in a column value in the database. A UDT might be called by a multitude of different processes, procedures, aggregates, and other server-side executables. Its entire function is to simply decode the values sent to it either in string form (when initially set) or in binary form when referenced from a query that returns a column value.

Yes, it's also possible to set the value of a UDT directly via binaryif you feel brave. This bypasses the ToString initialization step, and it works something like Figure 13.71.

Figure 13.71. Setting a UDT column value with a hex literal.

Validating Your UDT Data

The process of storing your UDT data into the database is (of course) managed by SQL Server. It takes certain precautions to ensure that the data is valid and meets the serialization criteria, but if there are business rules or other criteria that you wish to impose on the data, it's up to you to implement those tests. The Framework exposes a mechanism for this, and it's hooked up when you first define your UDT. Unfortunately, it's not called very oftenapparently, it's called only when SQL Server has to do a "conversion," as when you convert from bytes or when a UDT comes in from an external source. The validation routine is not called when you call the CONVERT function or execute INSERT or UPDATE. This means you'll have to add it to your own Parse routine to do a last-second validation test.

The ValidateGPSData function tests each component of the current UDT instance and returns False if it does not comply with the rules dictated by accepted latitude and longitude conventions. This routine and every other routine in your UDT code works on an instance of the UDT passed to it from SQL Server as it processes T-SQL statements. This means that several instances of this code could be running at onceeach processing its own instance of an in-progress UDT. Visual Studio and the language compilers make sure that the code conforms to this instance-centric approach.

To "activate" the validation routine, you'll need to set the ValidationMethodName in the UDT attributes, as shown in Figure 13.73.

Figure 13.72. Validating your UDT data structure using a custom function.

Figure 13.73. Activating UDT validation.

I put a call to the ValidateGPSData function just before returning the populated UDT instance in the Parse function, as shown in Figure 13.74. Your test script should include tests for each leg of your validation routine, to ensure that they all work.

Figure 13.74. Testing the UDT structure just before returning it to SQL Server.

Implementing UDT Methods

You might feel the need to implement methods on your new UDTespecially since the Visual Studio UDT template creates a couple examples for you. The question is, do you want to implement additional code that might better be implemented elsewhere? Remember, each change to the UDT requires a complete tear-down of the column (including any data it contains), redeployment of the tested UDT binaries, and repopulation of the table.

I implemented a simple method (as shown in Figure 13.75) to illustrate how to code and invoke UDT methods. Our method simply returns a string that describes the string format used to describe a GPS location.

Figure 13.75. Implementing a simple CLR UDT method.

Calling the method is also simple (sort of). As shown in Figure 13.76, without the TOP clause, it returns a row for each row in the GPSLocationV2 tablewhich is understandable since I didn't include a WHERE clause in the query.

Figure 13.76. Calling the UDT QueryTemplate method.

Setting Custom Method Attributes

Hidden in the documentation, you might stumble across references to additional method attributes that can be required in some cases and simply helpful in others. These attributes are exposed by the SqlMethodAttribute property, and I've already discussed a few of them. For example, when you create a Table-Value Function (TVF), you must set the FillRowMethodName to identify the method used to return rows. The other property settings are shown in Table 13.2.

Table 13.2. SqlMethodAttribute Property Settings
Name	Default	Description
DataAccess		If set, the function accesses user data in local SQL Server instance. This attribute is required for any Function or Stored Procedure that needs to open a connection. This attribute is not available on CLR UDTs or Aggregates.
FillRowMethodName		The name of a method in the same class as the table-valued function (TVF) that is used to return rows used to populate the table.
InvokeIfReceiverIsNull	False	If set, SQL Server invokes the method on a null reference instances. This is another approach to handling special NULL case situations.
IsDeterministic	False	Indicates whether the user-defined function is deterministic. "Deterministic" means that the routine always produces the same result, given the same inputs. For example, the function Len returns a different value each time it's called, but the same value when a 10-byte string is testedit's deterministic.
IsMutator	False	Indicates whether a method on a UDT is a "mutator"a method that changes a component property in the UDT.
IsPrecise	False	Indicates whether the function involves imprecise computations, such as floating point operations. This property determines if computed columns that reference this function can be indexed.
Name		The name under which the function should be registered in SQL Server. Used by the Visual Studio IDE.
OnNullCall		Indicates whether the method on a UDT is called when null reference input arguments are specified in the method invocation.
SystemDataAccess	False	Indicates whether the function requires access to data stored in the system catalogs or virtual system tables of SQL Server.
TableDefinition		A string that represents the table definition of the results, if the method is used as a table-valued function (TVF). It's used by Visual Studio only to automatically register the specified method as a TVF.
TypeId		When implemented in a derived class, gets a unique identifier for this Attribute.

In the next section, I'll introduce SqlFacet, which is a method or property attribute that's used to configure the datatype being used to return data from a function, method, or attribute. I use SqlFacet to ensure that the SqlBinary Precision and Scale properties returned by the UDT value are coerced to the correct settings.