The Common Language Runtime


Your SQL Server applications, class libraries, and components live in two realities. The design-time reality is where you write source code, create classes and objects, design applications, and debug and compile your code. The runtime reality is an external environment; for .NET applications, that environment is the Common Language Runtime, better known as the CLR (and referred to as just the runtime environment, or RTE, by the .NET architects), which typically runs on the operating system. In this book we are concerned with how the CLR operates within the operating environment of SQL Server. Figure 11–2 illustrates the relationship of the CLR to the operating system and its various layers.

image from book
Figure 11–2: The CLR and its relationship to SQL Server

The code you write to target the CLR is called managed code. This means that the execution of the code in the runtime environment is managed by the CLR. What exactly the CLR manages is discussed shortly.

When I started programming in Java and Visual Basic in the mid-nineties, I was perplexed by the need to pay so much attention to the runtime environment. It took an effort to gather up all the runtime elements and make sure they were properly installed just to run the simplest application. I was always shocked to have to build a CD just to ship an application that could fit on a quarter of the space of a floppy disk.

As a Delphi programmer I did not need to concern myself with ensuring that a runtime layer on the target operating system could support my application. On the other hand, big Delphi applications produced executables and dynamic-link library (DLL) files that became rather bloated.

When I moved to VB and Java, I found it disturbing that a tiny executable of no more than 100K needed many megabytes of supporting libraries just to run. In the early days of Java, making sure I had the support of the correct VM was a painful chore. Since I was only writing Windows applications, I learned instead to program against the Java components of Internet Explorer to be sure that my Visual J++ apps would work. Testing for IE’s JVM was actually the easiest way to deploy VJ++ apps back in 1997 or 1998.

After a few years, however, it became clear that the target operating systems my clients were running already had the supporting runtime environment I needed. This was more the case with the JVM than for VBRUN, mind you, because just about everyone already had the latest versions of Internet Explorer on their machines. In the new millennium, as long as your operating systems are well patched and service packs are kept up to date by your IT staff, worrying about the runtime for classic apps is a thing of the past. This is how it is with SQL Server. You don’t need to worry about any supporting engine components, as we will soon see.

Microsoft Intermediate Language

When you compile your SQL Server code, it is compiled to an intermediate code that the CLR, and every other .NET development environment, understands. All .NET languages compile to this intermediate language, known as Microsoft Intermediate Language (better known as MSIL, or just IL for convenience in various places in this book). The idea of compiling to an IL is not new. As you know, two popular languages compile to an intermediate language (or level): Java and Smalltalk.

There are many advantages to IL (and several disadvantages we will discuss shortly). For starters, on the pro side, compilation is much quicker because your code does not have to be converted to machine code. Another major advantage of IL is that any .NET language can consume components and class libraries written in any other .NET language, because at the IL level all .NET code is the same.

Note 

MSIL represents a major paradigm shift in the compilation of code for the Windows platform. Gone are the days when vendors touted compiler speeds, robustness of linkers, and so on. Today, thanks to Java and .NET, most of the code we write is first compiled to IL, and as programmers we don’t have to do anything else to take our code down to the machine code level.

Cross-language debugging and profiling are also possible, and so is cross-platform debugging, as long as the CLR is the code management authority end to end. Exceptions caused by code that was originally written in Visual Basic can be handled by a C# application, and vice versa. Specifically, IL provides the following benefits:

  • It provides cross-language integration. This includes cross-language inheritance, which means that you can create a new class by deriving it from a base class written in another language.

  • It facilitates automatic memory management, which is fondly known as garbage collection. Garbage collection manages object lifetimes, rendering reference counting obsolete.

  • It provides for self-describing objects, which means that complex APIs, like those requiring Interface Definition Language (IDL) for COM components, are now unnecessary.

  • It provides for the ability to compile code once and then run it on any CPU and operating system that supports the runtime.

Figure 11–3 shows what happens to your code from the time you write and compile it in Visual Studio to execution.

image from book
Figure 11–3: Follow the IL

Metadata

Once you have built and compiled an application, a class library, or a component, the IL code produced is packaged up with its metadata in an assembly. Assemblies have either an .exe or a .dll extension, depending on whether they are executables or class libraries.

But the code cannot be executed just yet, because before the CLR can compile it to machine code, it first needs to decide how to work with the assembly. The metadata in the IL directs how all the objects in your code are laid out, what gets loaded, how it is stored, and which methods get called; it also contains a whole slew of data on operations, control flow, exception handling, and so on.

The metadata also describes the classes used, the signatures of methods, and the referencing required at runtime (which is what gives you such powerful features as reflection and delegation, with Visual Basic’s AddressOf operator). It also describes the assembly by exposing the following information about the IL code in the assembly:

  • The identity of the assembly (name, version, public key, culture context, and so on)

  • Dependencies, or what other assemblies this assembly depends on

  • Security permissions, which are set by an administrator

  • Visibility of the type

  • The parent of the type, or what it inherits from

  • Type membership (methods, fields, properties, events, and so on)

  • Attributes, which are additional elements used on types and their members at runtime.

All this data is expressed in the metadata and essentially allows the assembly contents to be self-describing to the CLR. Self-describing code makes all the hassles of registration, type libraries, and Interface Definition Language (IDL), as discussed, a thing of the past. But metadata does much more.
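The self-describing idea has a close analog on the JVM, whose class files also carry metadata about every type. Here is a minimal Java sketch, reading a type’s identity, parent, and membership at runtime via reflection; the Employee class is a hypothetical example invented for illustration:

```java
import java.lang.reflect.Method;

class MetadataDemo {
    // A hypothetical type whose metadata we will inspect at runtime.
    static class Employee {
        private final String name;
        public Employee(String name) { this.name = name; }
        public String getName() { return name; }
    }

    public static void main(String[] args) {
        Class<?> type = Employee.class;
        // The type's identity and parent are recorded in its metadata.
        System.out.println("Type:   " + type.getName());
        System.out.println("Parent: " + type.getSuperclass().getName());
        // Its membership (methods, and likewise fields) is self-describing too.
        for (Method m : type.getDeclaredMethods()) {
            System.out.println("Method: " + m.getName());
        }
    }
}
```

No registry, type library, or IDL file is consulted: everything the runtime needs to know about Employee travels inside the compiled file itself, which is exactly the property the CLR’s assembly metadata provides.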

Self-describing files do not need to be identified or registered with the operating system. Because the metadata is packaged inside the executable file itself, the assembly identifies itself. You can also trust a self-describing assembly more than a file that publicizes itself in the registry, because registry entries go stale quickly and their integrity is easily compromised. Registry entries and their implementation counterparts (the DLLs and executables installed on the system) can also easily become separated.

If you intend your classes to be totally language agnostic, they need to conform to the Common Language Specification (CLS) and avoid elements not supported by all CLS-compliant languages. Because many CLS-compliant languages exist today, and many more are on the way, you might want to study the CLS further in the .NET SDK.

Executable Code

Assemblies do not have carte blanche run of the CLR. Code is not always passed directly to the just-in-time (JIT) compiler. First, the IL code may undergo a thorough inspection if the platform administrator deems it necessary. The code is given a verification test carried out according to the wishes of the network administrator, who might have specified that all .NET code on the machine must execute under a certain security policy. The IL code is also checked to make sure nothing malicious has been included. How the checking is carried out is beyond the scope of this book, but we will look at various security settings a little later in the chapter.

The code is also checked to determine whether it is type safe: that it does not try to access memory locations it is restricted from accessing, and that references point to what they are supposed to. Objects have to meet stringent safety checks to ensure that they are properly isolated from one another and do not access each other’s data. In short, if the verification process discovers that the IL code is not what it claims to be, the code is terminated and security exceptions are thrown.
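The JVM enforces the same kind of type safety, and a small Java sketch makes the idea concrete. Arrays are covariant, so the assignment below compiles, but the runtime verifies every store and rejects one that would violate the declared element type:

```java
class TypeSafetyDemo {
    public static void main(String[] args) {
        // Arrays are covariant, so this assignment compiles...
        Object[] boxes = new String[2];
        try {
            // ...but the runtime checks the store and throws,
            // because an Integer is not a String.
            boxes[0] = Integer.valueOf(42);
        } catch (ArrayStoreException e) {
            System.out.println("Unsafe store rejected: " + e);
        }
    }
}
```

In unmanaged code this store would silently corrupt memory; under a managed runtime it is caught and surfaced as an exception, which is the behavior the CLR’s verifier guarantees for IL.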

Managed Execution

The .NET JIT compiler has been engineered to conserve both memory and resources while performing its duties. Through the code inspection process and self-learning, it is able to figure out what code needs to be compiled immediately and what code can be compiled later, when it is needed. This is what we mean by JIT compilation: the code is compiled as soon as we need it.

Applications and services may thus appear slow to start up the first time, but subsequent execution obviates the need to pass the code through the “JIT’er” again. You can also force compilation or precompile code if necessary. But for the most part, or at least until you have a substantial .NET project underway, you will not need to concern yourself with cranking up the JIT compiler or keeping it idling.
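You can watch the warm-up effect in any JIT-compiled runtime. The Java sketch below times the same method cold and after a warm-up loop; the exact numbers depend entirely on your machine and VM, so no particular ratio is guaranteed, but the computed result is identical either way:

```java
class JitWarmup {
    // A small workload the JIT can optimize once it runs hot.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) total += (long) i * i;
        return total;
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        long first = sumOfSquares(1_000_000);
        long cold = System.nanoTime() - t0;

        // Repeated calls give the JIT a chance to compile the hot path.
        for (int i = 0; i < 10_000; i++) sumOfSquares(1_000);

        long t1 = System.nanoTime();
        long later = sumOfSquares(1_000_000);
        long warm = System.nanoTime() - t1;

        System.out.println("first call:  " + cold + " ns");
        System.out.println("warmed call: " + warm + " ns");
        // Only the compilation cost differs; the answer does not.
        System.out.println("same result: " + (first == later));
    }
}
```

This is the “slow the first time” behavior described above: the one-time compilation cost is paid on the first pass, and later calls run the already-compiled machine code.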

During execution, the CLR manages the execution processes that allocate resources and services to the executable code. Such services include memory management, security services, cross-language interop, debugging support, and deployment and versioning.

Managed execution also entails a lot more than reading IL, verification, JIT, and so on. It also describes what the CLR does once it has loaded and executed an application. Three sophisticated operations of the CLR worth noting are side-by-side execution, isolating applications and services into application domains, and garbage collection.

Side-by-Side Execution

The autonomous, independent, self-describing, unique, versioned nature of an assembly allows you to execute multiple versions of the same assembly simultaneously. This phenomenon is known as side-by-side execution. It is not something that has never been done before, but it has never been easy to do, and it could not be done with just any application.

Side-by-side execution has brought about the end of DLL hell, because you no longer have to maintain backward compatibility of libraries and components when new applications and assemblies are installed on a machine. Instead, applications that depend on yesterday’s version of Smee’s component will not break because a new application was installed with today’s version of Smee’s component. And when the various versions of Smee’s component are no longer being used, you can junk them by hitting DELETE. In SQL Server, however, every time you change the code you will need to explicitly re-register the assembly, and then re-register the component you are going to call, such as a function, a stored procedure, or a trigger.

Side-by-side execution is possible because an executable assembly expresses a dependence on a particular assembly (the old Smee component). So as long as the old component is still around, any application that needs it still works. However, versioning on .NET is a little more intelligent than simple version numbers and assemblies that can be gone in a SHIFT-DELETE. Version policy can specifically force an application to upgrade to the new version of Smee’s component.

Note 

Just because you can run applications and assemblies side by side on the same computer, and even in the same process, it doesn’t mean that conflicts won’t crop up. You need good application design and proven patterns of software development to ensure that code is safe and reentrant.

Automatic Memory Management

A boon for developers coding to .NET is the automatic memory management that it provides. This is achieved using a sophisticated memory-management facility called the garbage collector (GC).

Let’s set the scene with an analogy. If you are a single person, you know what a drag it is to schlep the garbage out in the morning. If you are not single, you may also know what a drag it is to be asked to schlep the garbage out in the morning. And if you have kids, you know what it is like to argue with them and still have to take the garbage out yourself.

See yourself in that picture? Programming and managing memory without a GC can be a drag. Now imagine that every morning, the garbage bag simply dissolves and you no longer have to worry about it. This is what the GC does for you. It eliminates the chores of managing memory in programming.

When you no longer need an object and nix the reference variable, when you assign the reference variable to another object, or when something just happens to cut the reference variable off from the object, the object gets lost (it has gone out of scope). This means that you no longer have a way of referencing the object to reuse it.

In VB 6.0 and earlier days, objects that went out of scope, got lost, or simply were not needed anymore had to be explicitly disposed of (remember Terminate events [VB], Destroy or Free [Delphi], or delete [C++]). The problem with manual memory management is that when you have a lot of objects, you sometimes forget to dispose of them, you lose track of how many were created, and so on. Some of these objects never get cleaned up, and you slowly start to leak memory. The .NET GC does not let this happen, because these “lost” objects are removed and the memory they occupied is freed.
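You can observe a “lost” object being reclaimed in any garbage-collected runtime. This Java sketch uses a weak reference, which watches an object without keeping it alive; once the only strong reference is nixed, the collector is free to reclaim the object, and the weak reference goes null:

```java
import java.lang.ref.WeakReference;

class LostObjectDemo {
    public static void main(String[] args) {
        Object payload = new byte[1024];
        // A weak reference lets us watch the object without keeping it alive.
        WeakReference<Object> watcher = new WeakReference<>(payload);

        payload = null; // nix the only strong reference; the object is now "lost"

        // Ask the GC to run and wait until the watcher is cleared.
        // System.gc() is only a request; the runtime decides when to collect.
        for (int i = 0; i < 100 && watcher.get() != null; i++) {
            System.gc();
        }
        System.out.println("collected: " + (watcher.get() == null));
    }
}
```

Note the loop: even with an explicit request, collection happens on the runtime’s schedule, which is precisely the nondeterminism discussed next.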

This, of course, could mean that you can write a heck of a lot of code without having to worry about memory management. However, we need to say “yes, but” and add a big disclaimer: You can write a lot of code and never have to worry about freeing objects, and you will see that in the examples provided in this book. But the notion of never having to worry about memory management again is simply untrue: untrue for the .NET languages and untrue for Java.

To demonstrate, let’s say you create an application with GC that is opening up sockets all over the Internet and about ten threads are running, each in its own little “slice” on the system, activating objects and basically being very, very busy. The algorithms in the application that work the threads need to create objects, work with them, and then dump them (for whatever reason, they cannot be reused by the thread). In this case, chances are that you are going to run out of memory just as quickly as you would in the unmanaged world because the GC cannot clean up after your threads as quickly as you need.

You might think that you could just call Finalize when you are done with each object. But, sorry folks, GC algorithms do not work that way. You see, the finalization of objects in the GC world of managed execution is nondeterministic, which means that you cannot predict exactly when an object will be cleaned out. Objects aren’t removed chronologically, so those that died earlier than others may end up getting removed out of order. GCs do not just stop and rush over to do your bidding. Like kids, they don’t come running immediately when the garbage bag is near bursting.

There is something else you need to think about: garbage collection can itself be a bottleneck. The boon of not having to set objects free has a trade-off: the GC is under the control of the CLR, and when the collector stops to take out the garbage, your threads have to mark time. This means that not only do you have garbage stinking up the place, but your threads get put on hold while the GC’s dumpster pulls up at your back door. You no longer have memory leaks to worry about, but you might have “time leaks” instead.

Before you tear up this book and decide to go into shrimp farming, know this: the CLR allows you some management over the GC. A collection of GC classes and methods is at your disposal. This does not mean that you can force collection or make the cleanup deterministic, but it does mean that you can design your applications and algorithms so that you have some degree of control over resource cleanup.
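The JVM exposes the same limited kind of control, and a short Java sketch shows its flavor: you can query the heap and politely request a collection, but you cannot command one:

```java
class GcControlDemo {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // You can query the heap...
        System.out.println("total heap: " + rt.totalMemory() + " bytes");
        System.out.println("free heap:  " + rt.freeMemory() + " bytes");

        // ...and politely request a collection. It is only a hint:
        // the runtime still decides when collection actually happens.
        System.gc();
        System.out.println("free after gc request: " + rt.freeMemory() + " bytes");
    }
}
```

The .NET GC class plays an analogous role for the CLR: useful for shaping cleanup behavior in your design, but never a way to make collection deterministic.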

Here is something else to consider. Just because managed code is garbage-collected does not mean you can ignore application design and common sense. If you code applications that carelessly lose or nix objects, the GC is not going to work for you. In fact, you should return this book to the store (don’t tear it up, though) and go into shrimp farming. Your patterns and design should use the objects you create until the application or service shuts down, and the objects that have to be removed should be kept to a minimum.

Despite our warnings, the GC is actually very fast. The time you might lose to collection is measured in milliseconds in the life of the average application on a fast machine. In addition, the GC can be deployed on multiprocessor machines, allowing its threads to be allocated to one processor while yours run on another. And because the GC is such an important part of the CLR, you can bet that Microsoft will often send it back to the workshop for tune-ups, oil changes, tire rotations, and so on.




Microsoft SQL Server 2005. The Complete Reference
ISBN: 0072261528
Year: 2006
Pages: 239
