Implementing Advanced CLR
User
Defined Types
I wanted to get you grounded in some fundamental principles before wading into some of the more advanced concepts involving UDTs. Yes, UDTs can be very code-
intensive
and can introduce a number of more complex issues. As I worked with the
typICurrency
UDT in this section, I
discovered
a number of
nuances
that are not covered in the documentation but are critical to the ability of the UDT to correctly manage the data as intended.
Woven into the fabric of this section are the following topics:
-
Coding your own CLR UDT serialization routines
: These can permit you to implement versioning on the UDT. This means when your pointy-haired boss decides that it's time to upgrade or tune your UDT, you won't need to throw out existing datayou simply bump the UDT version. To implement this feature, the custom serializer I'm about to show you
knows
how to morph the data so the old data is automatically converted into the new format. This is a feature that existing SQL Server (aliased) User Defined Types could not support.
-
Building your own deployment scripts
: Because of the limitations in Visual Studio 2005, I found it easier to build a custom T-SQL script that
performs
the deploymentthis way, I don't (always) have to tear down all dependencies as each new version is built.
-
Accessing your new UDT from other applications
: This section discusses how to set up a Windows Forms application (or any other type of application) that incorporates the UDT into the registered classes visible to the compiler. I'll show you how to pass a UDT from tier to
tier
and in arguments passed to functions,
methods
, subroutines, or local properties.
-
Using a UDT in strongly typed data structures
: Since the Visual Studio IDE does not directly support UDTs, you'll have to take a few extra steps to permit you to create strongly typed data structures that leverage this technology. I discuss a couple of easy tricks to accomplish this.
For the example in this section, I'm going to start over with another UDT designed to manage international currency values. It's a variation of the code Bill used to
demonstrate
CLR UDTs in the
years
leading up to SQL Server 2005's RTM:
typICurrency
. The folks at Microsoft seem to think it's a good example, toothey implemented a lightweight version of it for their example in the MSDN help topics, but I'm going to take that implementation up a few notches.
The
typICurrency
UDT is designed to manage currency (money) values, where values can be in any currency denomination. For example,
typICurrency
can store dollars, euros, British
pounds
, or any currency type in the same column. The UDT keeps track of the currency denomination and other helpful values, like the currency symbol, to make selecting and displaying the data easy. I initially implement this UDT using strings and variable-length datatypes, so I'll have to use custom serialization.
User-Defined Serialization
Before long, you're going to outgrow default (
Format.Native
) serialization. Keep in mind that if your UDT stores values can be
persisted
as
bool, byte, sbyte, short, ushort, int, uint, long, ulong, float, double, SqlByte, SqlInt16, SqlInt32, SqlInt64, SqlDateTime, SqlSingle, SqlDouble, SqlMoney, or SqlBoolean
, it's best to simply stick with the default serialization. That is, unless you want to implement custom versioning, which
permits
UDT data to be identified by version. This way, you can upgrade from one implementation to anotherautomatically morphing the data as you go. I'll show you how to implement custom versioning later in this section.
If you use default serialization, you should not (cannot) specify
MaxByteSize,
and all of the component
parts
(the fields) of your UDT must be serializable (all of the datatypes mentioned above are serializable). SQL Server also forces you to lay out the items in your UDT structure sequentiallyI saw how to set this order earlier in this chapter. This regimen can seem a bit stifling, and there is a
viable
alternative: user-defined serialization.
Okay, so you want to expose an unsupported type like
SqlBinary
,
SqlString,
or some other variable-length public property value in the UDT. This complicates things a bit, as now you have to
turn
off the autopilot and move your data into and out of SQL Server without benefit of the automatic formatting and serialization. At first, this might feel like your first solo flight in an OH-23 helicopter with 8 hours of trainingbut not nearly as life-
threatening
.
To start with, you'll need to
specifically
set the following UDT attributes when you first declare your UDT Structure (or struct):
-
Format
: This must be set to "UserDefined" when you intend to code your own serializer. This
tells
the compiler (and SQL Server) to look for the
IBinarySerialize
interface in the UDT. Once set, SQL Server calls the
Read
method to retrieve the binary data and the
Write
method to save it back to the data table column.
-
Name
: This attribute simply
names
the UDT so it can be managed with Visual Studio.
-
IsByteOrdered
: This tells SQL Server how you have stored the data and helps SQL Server manage binary comparisons on the UDT
. In other words,
IsByteOrdered=True
says that the bytes are written in sequencejust as they are when working with managed code. If you want the UDT to support comparison operators, GROUP BY, ORDER BY, and PARTITION BY, you have to set
IsByteOrdered
to
True
. In this case, SQL Server uses the disk representation of the instance in binary comparisons. Operators supported include =, !-, <, <=, >, and >. The default is
True
.
-
IsFixedLength
: You'll probably want to set this attribute to
False
when using variable-length data types. However, a fixed-length structure that contains variable-lengths
components
is possiblebut tough to implement.
-
MaxByteSize
: When using the
UserDefined
UDT format, you'll have to set the
MaxByteSize
attribute to the combined
size
of the UDT components in its serialized form. That is,
MaxByteSize
tells SQL Server the size of the longest data stream your
Write
routine returns. Since you're in charge of implementing the serializer, this is a value you determinebut it might be a bit tough if your UDT has a number of variable-length components. For example, if you have a single
System.Char
column that's 10 bytes long, it consumes 22 bytes, 16 bits/character (to implement the Unicode UTF-16 character), and 2 bytes of overhead used while serializing. Just remember to include the entire data length when computing
MaxByteSize
the data plus the serialization overhead. The upper limit is about 8,000, but you don't really want to save that much data in any single column; considering the maximum size of a data row is about 8,000, having a couple of
columns
consume
all of the space flies in the face of most relational theory.
Building the iTypCurrencyV2 UDT
Once your structure attributes are set (which is probably the last thing you'll set after the UDT is written), you need to add the
IBinarySerialize Implements
annotation, as shown on line 12 of Figure 13.77. This is used to expose the standard interfaces where the
Read
and
Write
methods are implemented. These are used to accept and return data from and to SQL Server.
Note that I'm implementing "version 2" of the UDT
typICurrencyV2
. Version 1 is a good place for you to start exploring how to code a UDT. Version 2 goes quite a bit farther in the implementation to enable many of the more complex features you might need in a UDT.
The UDT attribute and structure declarations for the new UDT are shown in Figure 13.77.
When you specify which interfaces to implement, be sure to let Visual Studio help you with Intellisense. If you do, Visual Studio automatically adds the implementation prototype routines in your code.
The code used in the
typICurrencyV2
UDT is different enough that I'm going to step through it section by section to let you see the approach I use at each phase. One aspect of the user-defined (hand-rolled) serialization is that if you aren't careful, the UDT data written to the server tends to consume more memory, more I/O channel capacity, and more disk space. This overhead can consume more resourcesperhaps more resources than the implementation is worth in increased functionality. I coded the UDT to keep the size of the stored data and the amount of code needed to manage it to a minimum. No, writing a serializer is not hard. It might seem daunting at first, as you have to decide what's being written (in binary) to the table column. I'll show you how to inspect what you've written to the column value so you can tune it for size and functionality.
Declaring and Parsing the Private
Variables
The
private
variables declared in the UDT define how the UDT data is stored (temporarily) within the UDT. No, they don't have any
bearing
on how the data is stored in the database or how it's exposed to consumers (those applications referencing the UDT data). The private variables and the supporting public properties that
Set
and
Get
them were templated with the Visual Studio Class Diagram tool. In this case, I chose
SqlString
variables to manage the currency symbol ("$") and the currency type ("USD") values. I could have used a string for the money value as well, but I wanted to
illustrate
how to handle a more complex type within the UDT. This implementation (which is different from the "version 1" basic
typICurrency
implementation) supports a
Version
property that's used to help morph data from one implementation footprint of the UDT to the
next
. Version 2 also knows how to recognize version 1 formatting so it can automatically upgrade to version 2 notation.
Tip
Remember, I discussed each of these UDT support routines earlier in this chapter.
Coding the iTypCurrencyV2 ToString Function
Note that the UDT
ToString
function implementation uses
String.Format
to return the string representation of the current UDT private variable stateassuming the UDT
IsNull
property persisted as
m_Null
is set to
False
. Again, this output format does not have to match the input format used when you set the UDT value with the CONVERT statement, but it helps. As a matter of fact, the string you return is not stored anywherejust used to display the contents of the UDT when a SELECT is executed or your code invokes the
typICurrencyV2.ToString
method directly. Note that the
ToString
function does not return the
Version
property.
The UDT Parse Function
The
Parse
function's job is no different here than when you use default serialization. The primary goal is to convert the string passed from SQL Server and set the private property values. Yes, the
Parse
should make sure that these values meet minimum validity requirements, and, no, this routine does not do thatit blindly accepts any currency name, symbol, or type and any value. However, for
typICurrencyV2,
I added a few extra lines of code to deal with an unknown (NULL) currency type and alternate versions.
In this implementation, I depend on the CLR
Split
function to parse the string passed to the
Parse
function (see Figure 13.79, line 61). This deals with several
string-parsing
issuesin this case, all I need to do is separate the individual UDT values in the inbound string with a
colon
(":").
Coding the IBinarySerialize Read and Write Subroutines
Because the UDT code is now responsible for serializing the instance data, you need to construct subroutines to convert the private variable data to a data stream (
Write
) and reconstitute the private variables from that stream (
Read
). A few random thoughts come to mind here as you define your
Read
and
Write
routines:
-
Just exactly how you define the saved data stream is up to you, but consider that every byte you write to the stream is saved to the database table column containing the UDT. No, what I implemented in
typICurrencyV2
is not particularly efficient, but I didn't want to confuse the issue by implementing a complex serialization scheme.
-
It's up to the
Read
routine to decode the stream extruded by the
Write
routine, so if you make the stream complex, the decoding algorithm must be equally complex to
unscramble
it.
-
If you serialize complex variables like
SqlBinary
or
SqlMoney
, the complexity of the
Write
and
Read
routines
increases
dramatically and you might also end up saving more data than you need. Consider that a
SqlMoney
datatype can consume from 5 to 17 bytes of space, while the value you store might be as small as 3 bytes ("0.0") when persisted as a string.
-
Each stream that's sent to your UDT for decoding is prefixed with a "header" byte that indicates the NULL state of the UDT binary value. The 1 bit is set if the value is NULL. In this case,
Write
should simply return
Null
(as
illustrated
in the code).
-
In the
Write
routine, I save the length of the string to be serialized followed by the stream data. This makes it easy to capture the stream in the
Read
routine where I simply read the length and pass that value to a fixed-length
Read
function. I added a bit of code to write the length as an (8-bit)
byte
instead of an
integer,
as I know the length of the data I'm saving won't exceed 255
characters
. This approach saves 24 bits of space per entry.
The UDT
Write
routine is shown in Figure 13.80.
Yes, it's probably best that you design and code the
Write
routine before starting work on the
Read
routine. The
Read
routine (as shown in Figure 13.81) is passed an
open
BinaryReader
accessible via
Read
methods. Again, I have to test the header to see if the stream is NULL, in which case I simply do nothing.
Our task in the
Read
routine is to populate an instance of the
typICurrency
and fill in the UDT instance private variables. I started by reading the byte containing the length of the persisted stream and follow that by reading the entire stream into a string. Another approach might have been to use the
type-specific
Read
methods to directly populate the individual private variables, but this added unwanted complexity. One benefit of this approach is that I didn't have to repeat the code used to parse the inbound stream (I used the same input string format); I simply
instantiated
a new instance of the
typICurrency
class and called the
Parse
routine to flesh it out.
Basically, that's it. As you can see, user-defined serialization is not that complex. I'm now ready to test the new UDT.
Note that all of the routines have their own Try/Catch exception handlers. I found these essential when trying to figure out what part of the UDT code had problems.
Testing the typICurrencyV2 UDT
The testing strategy here is no different than for the earlier UDT examples, but the scripts are a bit different
.
Ah, can I mention a few more points before moving on? Note that that Visual Studio 2005 Data Source Configuration Wizard does not support tables that contain UDTs. No, this is not the end of the world, but it does mean that you won't be able to create strongly typed
DataTable
or
DataSet
objects against tables containing UDTsyou'll have to use untyped data structures. And no, you can't use the
DataSet
designer to build an STD
, either. SorryI guess it was a bit much to ask. Of course, you can create SQL Views that expose the UDT properties as columns. In this case, much of the code-generator functionality in Visual Studio can be used, but only to create read-only STD structures.
One other tiresome issue is that the Visual Studio IDE and SQL Server Management Studio do not know how to expose the public properties of a CLR UDT. This means that unless those developers being asked to consume the UDT have access to the source code (as if), they won't have any way (outside of using Reflector
) to see the private properties you define to access the specific component properties of the UDT. I guess you had best document it and keep the documentation in sync with the deployed version.
|