Implementing Advanced CLR
I wanted to get you grounded in some fundamental principles before wading into some of the more advanced concepts involving UDTs. Yes, UDTs can be very code-
and can introduce a number of more complex issues. As I worked with the
UDT in this section, I
a number of
that are not covered in the documentation but are critical to the ability of the UDT to correctly manage the data as intended.
Woven into the fabric of this section are the following topics:
Coding your own CLR UDT serialization routines
: These can permit you to implement versioning on the UDT. This means when your pointy-haired boss decides that it's time to upgrade or tune your UDT, you won't need to throw out existing datayou simply bump the UDT version. To implement this feature, the custom serializer I'm about to show you
how to morph the data so the old data is automatically converted into the new format. This is a feature that existing SQL Server (aliased) User Defined Types could not support.
Building your own deployment scripts
: Because of the limitations in Visual Studio 2005, I found it easier to build a custom T-SQL script that
the deploymentthis way, I don't (always) have to tear down all dependencies as each new version is built.
Accessing your new UDT from other applications
: This section discusses how to set up a Windows Forms application (or any other type of application) that incorporates the UDT into the registered classes visible to the compiler. I'll show you how to pass a UDT from tier to
and in arguments passed to functions,
, subroutines, or local properties.
Using a UDT in strongly typed data structures
: Since the Visual Studio IDE does not directly support UDTs, you'll have to take a few extra steps to permit you to create strongly typed data structures that leverage this technology. I discuss a couple of easy tricks to accomplish this.
For the example in this section, I'm going to start over with another UDT designed to manage international currency values. It's a variation of the code Bill used to
CLR UDTs in the
leading up to SQL Server 2005's RTM:
. The folks at Microsoft seem to think it's a good example, toothey implemented a lightweight version of it for their example in the MSDN help topics, but I'm going to take that implementation up a few notches.
UDT is designed to manage currency (money) values, where values can be in any currency denomination. For example,
can store dollars, euros, British
, or any currency type in the same column. The UDT keeps track of the currency denomination and other helpful values, like the currency symbol, to make selecting and displaying the data easy. I initially implement this UDT using strings and variable-length datatypes, so I'll have to use custom serialization.
Before long, you're going to outgrow default (
) serialization. Keep in mind that if your UDT stores values can be
bool, byte, sbyte, short, ushort, int, uint, long, ulong, float, double, SqlByte, SqlInt16, SqlInt32, SqlInt64, SqlDateTime, SqlSingle, SqlDouble, SqlMoney, or SqlBoolean
, it's best to simply stick with the default serialization. That is, unless you want to implement custom versioning, which
UDT data to be identified by version. This way, you can upgrade from one implementation to anotherautomatically morphing the data as you go. I'll show you how to implement custom versioning later in this section.
If you use default serialization, you should not (cannot) specify
and all of the component
(the fields) of your UDT must be serializable (all of the datatypes mentioned above are serializable). SQL Server also forces you to lay out the items in your UDT structure sequentiallyI saw how to set this order earlier in this chapter. This regimen can seem a bit stifling, and there is a
alternative: user-defined serialization.
Okay, so you want to expose an unsupported type like
or some other variable-length public property value in the UDT. This complicates things a bit, as now you have to
off the autopilot and move your data into and out of SQL Server without benefit of the automatic formatting and serialization. At first, this might feel like your first solo flight in an OH-23 helicopter with 8 hours of trainingbut not nearly as life-
To start with, you'll need to
set the following UDT attributes when you first declare your UDT Structure (or struct):
: This must be set to "UserDefined" when you intend to code your own serializer. This
the compiler (and SQL Server) to look for the
interface in the UDT. Once set, SQL Server calls the
method to retrieve the binary data and the
method to save it back to the data table column.
: This attribute simply
the UDT so it can be managed with Visual Studio.
: This tells SQL Server how you have stored the data and helps SQL Server manage binary comparisons on the UDT
. In other words,
says that the bytes are written in sequencejust as they are when working with managed code. If you want the UDT to support comparison operators, GROUP BY, ORDER BY, and PARTITION BY, you have to set
. In this case, SQL Server uses the disk representation of the instance in binary comparisons. Operators supported include =, !-, <, <=, >, and >. The default is
: You'll probably want to set this attribute to
when using variable-length data types. However, a fixed-length structure that contains variable-lengths
is possiblebut tough to implement.
: When using the
UDT format, you'll have to set the
attribute to the combined
of the UDT components in its serialized form. That is,
tells SQL Server the size of the longest data stream your
routine returns. Since you're in charge of implementing the serializer, this is a value you determinebut it might be a bit tough if your UDT has a number of variable-length components. For example, if you have a single
column that's 10 bytes long, it consumes 22 bytes, 16 bits/character (to implement the Unicode UTF-16 character), and 2 bytes of overhead used while serializing. Just remember to include the entire data length when computing
the data plus the serialization overhead. The upper limit is about 8,000, but you don't really want to save that much data in any single column; considering the maximum size of a data row is about 8,000, having a couple of
all of the space flies in the face of most relational theory.
Building the iTypCurrencyV2 UDT
Once your structure attributes are set (which is probably the last thing you'll set after the UDT is written), you need to add the
annotation, as shown on line 12 of Figure 13.77. This is used to expose the standard interfaces where the
methods are implemented. These are used to accept and return data from and to SQL Server.
Figure 13.77. The attributes and structure declaration for
Note that I'm implementing "version 2" of the UDT
. Version 1 is a good place for you to start exploring how to code a UDT. Version 2 goes quite a bit farther in the implementation to enable many of the more complex features you might need in a UDT.
The UDT attribute and structure declarations for the new UDT are shown in Figure 13.77.
When you specify which interfaces to implement, be sure to let Visual Studio help you with Intellisense. If you do, Visual Studio automatically adds the implementation prototype routines in your code.
The code used in the
UDT is different enough that I'm going to step through it section by section to let you see the approach I use at each phase. One aspect of the user-defined (hand-rolled) serialization is that if you aren't careful, the UDT data written to the server tends to consume more memory, more I/O channel capacity, and more disk space. This overhead can consume more resourcesperhaps more resources than the implementation is worth in increased functionality. I coded the UDT to keep the size of the stored data and the amount of code needed to manage it to a minimum. No, writing a serializer is not hard. It might seem daunting at first, as you have to decide what's being written (in binary) to the table column. I'll show you how to inspect what you've written to the column value so you can tune it for size and functionality.
Declaring and Parsing the Private
variables declared in the UDT define how the UDT data is stored (temporarily) within the UDT. No, they don't have any
on how the data is stored in the database or how it's exposed to consumers (those applications referencing the UDT data). The private variables and the supporting public properties that
them were templated with the Visual Studio Class Diagram tool. In this case, I chose
variables to manage the currency symbol ("$") and the currency type ("USD") values. I could have used a string for the money value as well, but I wanted to
how to handle a more complex type within the UDT. This implementation (which is different from the "version 1" basic
implementation) supports a
property that's used to help morph data from one implementation footprint of the UDT to the
. Version 2 also knows how to recognize version 1 formatting so it can automatically upgrade to version 2 notation.
Remember, I discussed each of these UDT support routines earlier in this chapter.
Coding the iTypCurrencyV2 ToString Function
Note that the UDT
function implementation uses
to return the string representation of the current UDT private variable stateassuming the UDT
property persisted as
is set to
. Again, this output format does not have to match the input format used when you set the UDT value with the CONVERT statement, but it helps. As a matter of fact, the string you return is not stored anywherejust used to display the contents of the UDT when a SELECT is executed or your code invokes the
method directly. Note that the
function does not return the
Figure 13.78. Declaring the private variables and the
The UDT Parse Function
function's job is no different here than when you use default serialization. The primary goal is to convert the string passed from SQL Server and set the private property values. Yes, the
should make sure that these values meet minimum validity requirements, and, no, this routine does not do thatit blindly accepts any currency name, symbol, or type and any value. However, for
I added a few extra lines of code to deal with an unknown (NULL) currency type and alternate versions.
In this implementation, I depend on the CLR
function to parse the string passed to the
function (see Figure 13.79, line 61). This deals with several
issuesin this case, all I need to do is separate the individual UDT values in the inbound string with a
Figure 13.79. The
Coding the IBinarySerialize Read and Write Subroutines
Because the UDT code is now responsible for serializing the instance data, you need to construct subroutines to convert the private variable data to a data stream (
) and reconstitute the private variables from that stream (
). A few random thoughts come to mind here as you define your
Just exactly how you define the saved data stream is up to you, but consider that every byte you write to the stream is saved to the database table column containing the UDT. No, what I implemented in
is not particularly efficient, but I didn't want to confuse the issue by implementing a complex serialization scheme.
It's up to the
routine to decode the stream extruded by the
routine, so if you make the stream complex, the decoding algorithm must be equally complex to
If you serialize complex variables like
, the complexity of the
dramatically and you might also end up saving more data than you need. Consider that a
datatype can consume from 5 to 17 bytes of space, while the value you store might be as small as 3 bytes ("0.0") when persisted as a string.
Each stream that's sent to your UDT for decoding is prefixed with a "header" byte that indicates the NULL state of the UDT binary value. The 1 bit is set if the value is NULL. In this case,
should simply return
in the code).
routine, I save the length of the string to be serialized followed by the stream data. This makes it easy to capture the stream in the
routine where I simply read the length and pass that value to a fixed-length
function. I added a bit of code to write the length as an (8-bit)
instead of an
as I know the length of the data I'm saving won't exceed 255
. This approach saves 24 bits of space per entry.
routine is shown in Figure 13.80.
Figure 13.80. Implementing the
Yes, it's probably best that you design and code the
routine before starting work on the
routine (as shown in Figure 13.81) is passed an
methods. Again, I have to test the header to see if the stream is NULL, in which case I simply do nothing.
Figure 13.81. The UDT Read routine.
Our task in the
routine is to populate an instance of the
and fill in the UDT instance private variables. I started by reading the byte containing the length of the persisted stream and follow that by reading the entire stream into a string. Another approach might have been to use the
methods to directly populate the individual private variables, but this added unwanted complexity. One benefit of this approach is that I didn't have to repeat the code used to parse the inbound stream (I used the same input string format); I simply
a new instance of the
class and called the
routine to flesh it out.
Basically, that's it. As you can see, user-defined serialization is not that complex. I'm now ready to test the new UDT.
Note that all of the routines have their own Try/Catch exception handlers. I found these essential when trying to figure out what part of the UDT code had problems.
Testing the typICurrencyV2 UDT
The testing strategy here is no different than for the earlier UDT examples, but the scripts are a bit different
Ah, can I mention a few more points before moving on? Note that that Visual Studio 2005 Data Source Configuration Wizard does not support tables that contain UDTs. No, this is not the end of the world, but it does mean that you won't be able to create strongly typed
objects against tables containing UDTsyou'll have to use untyped data structures. And no, you can't use the
designer to build an STD
, either. SorryI guess it was a bit much to ask. Of course, you can create SQL Views that expose the UDT properties as columns. In this case, much of the code-generator functionality in Visual Studio can be used, but only to create read-only STD structures.
One other tiresome issue is that the Visual Studio IDE and SQL Server Management Studio do not know how to expose the public properties of a CLR UDT. This means that unless those developers being asked to consume the UDT have access to the source code (as if), they won't have any way (outside of using Reflector
) to see the private properties you define to access the specific component properties of the UDT. I guess you had best document it and keep the documentation in sync with the deployed version.
Figure 13.82. The Visual Studio Data Source Configuration Wizard chokes on UDTs.