Overview of the Hosting Managers | Customizing the Microsoft .NET Framework Common Language Runtime

Earlier in the chapter, I described the manager architecture, listed each manager and its interfaces, and talked about how the CLR and the host go about obtaining manager implementations. In this section, I take a brief look at each manager to understand how it can be used by an application to customize a running CLR.

Assembly Loading

The CLR has a default, well-defined set of steps it follows to resolve a reference to an assembly. These steps include applying various levels of version policy, searching the global assembly cache (GAC), and looking for the assemblies in subdirectories under the application's root directory. These defaults include the assumption that the desired assembly is stored in a binary file in the file system.

These resolution steps work well for many application scenarios, but there are situations in which a different approach is required. Remember that CLR hosts essentially define a new application model. As such, it's highly likely that different application models will have different requirements for versioning, assembly storage, and assembly retrieval. To that end, the assembly loading manager enables a host to customize completely the assembly loading process. The level of customization that's possible is so extensive that a host can implement its own custom assembly loading mechanism and bypass the CLR defaults altogether if desired.

Specifically, a host can customize the following:

The location from which an assembly is loaded
How (or if) version policy is applied
The format from which an assembly is loaded (assemblies need not be stored in standalone disk files anymore)

Let's take a look at how Microsoft SQL Server 2005 uses the assembly loading manager to get an idea of how these capabilities can be used.

As background, SQL Server 2005 allows user-defined types, procedures, functions, triggers, and so on to be written in managed languages. A few characteristics of the SQL Server 2005 environment point to the need for customized assembly binding:

Assemblies are stored in the database, not in the file system. Managed code that implements user-defined types, procedures, and the like is compiled into assemblies just as you'd expect, but the assembly must be registered in SQL before it can be used. This registration process physically copies the contents of the assembly into the database. This self-contained nature of database applications makes them easy to replicate from server to server.
The assemblies installed in SQL are the exact ones that must be run. SQL Server 2005 applications typically have very strict versioning requirements because of the heavy reliance on persisted data. For example, the return value from a managed user-defined function might be used to build an index used to optimize performance. It is imperative that only the exact assembly that was used to build the index is used when the application is run. If a reference to that assembly were somehow redirected through version policy, the index that was previously stored could become invalid.

To support these requirements, SQL Server 2005 makes extensive use of the assembly loading manager to load assemblies out of the database instead of from the file system and to bypass many of the versioning rules that the CLR follows by default.

It's important to notice, however, that not all assemblies are stored and loaded out of the database by SQL. The assemblies used in a SQL Server 2005 application fall into one of two categories: the assemblies written by customers that define the actual behavior of the application (the add-ins), and the assemblies written by Microsoft that ship as part of the Microsoft .NET Framework. In the SQL case, only the add-ins are stored in the databasethe Microsoft .NET Framework assemblies are installed and loaded out of the global assembly cache.

In fact, it is often the case that a host will want to load only the add-ins in a custom fashion and let the default CLR behavior govern how the Microsoft .NET Framework assemblies are loaded. To support this idea, the assembly loading manager enables the host to pass in a list of assemblies that should be loaded in the normal, default CLR fashion. All other assembly references are directed to the host for resolution.

Those assemblies that the host resolves can be loaded from any location in any format. These assemblies are returned from the host to the CLR in the form of a pointer to an IStream interface. For hosts that implement the assembly loading manager (that is, provide an implementation of IHostAssemblyManager when queried for it through IHostControl::GetHostManager), the process of binding generally works like this:

As the CLR is running code, it often finds references to other assemblies that must be resolved for the program to run properly. These references can be either static in the calling assembly's metadata or dynamic in the form of a call to Assembly.Load or one of the other class library methods used to load assemblies.
The CLR looks to see if the reference is to an assembly that the host has told the CLR to bind to itself (a Microsoft .NET Framework assembly in our SQL example). If so, binding proceeds as normal: version policy is applied, the global assembly cache is searched, and so on.
If the reference is not in the list of CLR-bound assemblies, the CLR calls through the interfaces in the Assembly Manager (IHostAssemblyStore, specifically) to resolve the assembly.
At this point, the host is free to load the assembly in any way and returns an IStream * representing the assembly to the CLR. In the SQL scenario, the assembly is loaded directly from the database.

Figure 2-4 shows the distinction between how add-ins and the Microsoft .NET Framework assemblies are loaded in SQL Server 2005.

Figure 2-4. Assembly loading in the SQL Server 2005 host

Details of how to implement an assembly loading manager to achieve the customizations described here is provided in Chapter 8.

Customizing Failure Behavior

The CLR hosting APIs are built to accommodate a variety of hosts, many of which will have different tolerances for handling failures that occur while running managed code in the process. For example, hosts with largely stateless programming models, such as ASP.NET, can use a process recycling model to reclaim processes deemed unstable. In contrast, hosts such as SQL Server 2005 and the Microsoft Windows shell rely on the process being stable for a logically infinite amount of time.

The CLR supports these different reliability needs through an infrastructure that can keep a single application domain or an entire process consistent in the face of various situations that would typically compromise stability. Examples of these situations include a thread that fails to abort properly (because of a finalizer that loops infinitely, for example) and the inability to allocate a resource such as memory.

In general, the CLR's philosophy is to throw exceptions on resource failures and thread aborts. However, there are cases in which a host might want to override these defaults. For example, consider the case in which a failure to allocate memory occurs in a region of code that might be sharing state across threads. Because such a failure can leave the domain in an inconsistent state, the host might choose to unload the entire domain instead of aborting just the thread from which the failed allocation occurred. Although this action clearly affects all code running in the domain, it guarantees that the rest of the domains remain consistent and the process remains stable. In contrast, a different host might be willing to allow the questionable domain to keep running and instead will stop sending new requests into it and will unload the domain later.

Hosts use the failure policy manager to specify which actions to take in these situations. The failure policy manager enables the host to set timeout values for actions such as aborting a thread or unloading an application domain and to provide policy statements that govern the behavior when a request for a resource cannot be granted or when a given timeout expires. For example, a host can provide policy that causes the CLR to unload an application domain in the face of certain failures to guarantee the continued stability of the process as described in the previous example.

The CLR's infrastructure for supporting scenarios requiring high availability requires that managed code library authors follow a set of programming guidelines aimed at proper resource management. These guidelines, combined with the infrastructure that supports them, are both needed for the CLR to guarantee the stability of a process. Chapter 11 discusses how hosts can customize CLR behavior in the face of failures and also describes the coding guidelines that library authors must follow to enable the CLR's reliability guarantees.

Programming Model Enforcement

The .NET Framework class libraries provide an extensive set of built-in functionality that hosted add-ins can take advantage of. In addition, numerous third-party class libraries exist that provide everything from statistical and math libraries to libraries of new user interface (UI) controls.

However, the full extent of functionality provided by the set of available class libraries might not be appropriate in particular hosting scenarios. For example, displaying user interface in server programs or services is not useful, or allowing add-ins to exit the process cannot be allowed in hosts that require long process lifetimes.

The host protection manager provides the host with a means to block classes, methods, properties, and fields offering a particular category of functionality from being loaded, and therefore used, in the process. A host can choose to prevent the loading of a class or the calling of a method for a number of reasons including reliability and scalability concerns or because the functionality doesn't make sense in that host's environment, as in the examples described earlier.

You might be thinking that host protection sounds a lot like a security feature, and in fact we typically think of disallowing functionality to prevent security exploits. However, host protection is not about security. Instead, it's about blocking functionality that doesn't make sense in a given host's programming model. For example, you might choose to use host protection to prevent add-ins from obtaining synchronization primitives used to coordinate access to a resource from multiple threads because taking such a lock can limit scalability in a server application. The ability to request access to a synchronization primitive is a programming model concern, not a security issue.

When using the host protection manager to disallow certain functionality, hosts indicate which general categories of functionality they're blocking rather than individual classes or members. The classes and members contained in the .NET Framework class libraries are grouped into categories based on the functionality they provide. These categories include the following:

Shared state Library code that exposes a means for add-ins to share state across threads or application domains. The methods in the System.Threading namespace that allow you to manipulate the data slots on a thread, such as Thread.AllocateDataSlot, are examples of methods that can be used to share state across threads.
Synchronization Classes or members that expose a way for add-in to hold locks. The Monitor class in the System.Threading namespace is a good example of a class you can use to hold a lock.
Threading Any functionality that affects the lifetime of a thread in the process. Because it causes a new thread to start running, System.Threading.Thread.Start is an example of a method that affects thread lifetime within a process.
Process management Any code that provides the capability to manipulate a process, whether it be the host's process or any other process on the machine. System.Diagnostics.Process.Start is clearly a method in this category.

Classes and members in the .NET Framework that have functionality belonging to one or more of these categories are marked with a custom attribute called the HostProtectionAttribute that indicates the functionality that is exposed. The host protection manager comes into play by providing an interface (ICLRHostProtectionManager) that hosts use to indicate which categories of functionality they'd like to prevent from being used in the process. The attribute settings in the code and the host protection settings passed in through the host are examined at runtime to determine whether a particular member is allowed to run. If a particular member is marked as being part of the threading category, for example, and the host has indicated that all threading functionality should be blocked, an exception will be thrown instead of the member being called.

Annotating code with the category custom attributes and using the host protection manager to block categories of functionality is described in detail in Chapter 12.

Memory and Garbage Collection

The managers we've looked at so far have allowed the host to customize different aspects of the CLR. Another set of managers has a slightly different flavorthese managers enable a host to integrate its runtime environment deeply with the CLR's execution engine. In a sense, these managers can be considered abstractions over the set of primitives or resources that the CLR typically gets from the operating system (OS) on which it is running. More generally, the COM interfaces that are part of the hosting API can be viewed as an abstraction layer that sits between the CLR and the operating system, as shown in Figure 2-5. Hosts use these interfaces to provide the CLR with primitives to allocate and manage memory, create and manipulate threads, perform synchronization, and so on. When one of these managers is provided by a host, the CLR will use the manager instead of the underlying operating system API to get the resource. By providing implementations that abstract the corresponding operating system concepts, a host can have an extremely detailed level of control over how the CLR behaves in a process. A host can decide when to fail a memory allocation requested by the CLR, it can dictate how managed code gets scheduled within the process, and so on.

Figure 2-5. The hosting APIs as an abstraction over the operating system

The first manager of this sort that we examine is the memory manager. The memory manager consists of three interfaces: IHostMemoryManager, IHostMalloc, and ICLRMemoryNotificationCallback. The methods of these interfaces enable the host to provide abstractions for the following:

Win32 and the standard C runtime memory allocation primitives Providing abstractions over APIs such as VirtualAlloc, VirtualFree, VirtualQuery, malloc, and free allow a host to track and control the memory used by the CLR. A typical use of the memory manager is to restrict the amount of memory the CLR can use within a process and to fail allocations when it makes sense in a host-specific scenario. For example, SQL Server 2005 operates within a configurable amount of memory. Oftentimes, SQL is configured to use all of the physical memory on the machine. To maximize performance, SQL tracks all memory allocations and ensures that paging never occurs. SQL would rather fail a memory allocation than page to disk. To track all allocations made within the process accurately, the SQL host must be able to record all allocations made by the CLR. When the amount of memory used is reaching the preconfigured limit, SQL must start failing memory allocation requests, including those that come from the CLR. The consequence of failing a particular CLR request varies with the point in time in which that request is made. In the least destructive case, the CLR might need to abort the thread on which an allocation is made if it cannot be satisfied. In more severe cases, the current application domain or even the entire process must be unloaded. Each request for additional memory made by the CLR includes an indication of what the consequences of failing that allocation are. This gives the host some room to decide which allocations it can tolerate failing and which it would rather satisfy at the expense of some other alternative for pruning memory.
The low-memory notification available on Microsoft Windows XP and later versions Windows XP provides memory notification events so applications can adjust the amount of memory they use based on the amount of available memory as reported by the operating system. (See the CreateMemoryResourceNotification API in the Platform SDK for background.) The memory management interfaces provided by the CLR hosting API enable a host to provide a similar mechanism that allows a host to notify the CLR of low- (or high-) memory conditions based on a host-specific notion, rather than the default operating system notion. Although the mechanism provided by the operating system is available only on Windows XP and later versions, the notification provided in the hosting API works on all platforms on which the CLR is supported. The CLR takes this notification as a heuristic that garbage collection is necessary. In this way, hosts can use this notification to encourage the CLR to do a collection to free memory so more memory is made available from which to satisfy additional allocation requests.

In addition to the memory manager, the CLR hosting API also provides a garbage collection manager that allows you to monitor and influence how the garbage collector uses memory in the process. Specifically, the garbage collection manager includes interfaces that enable you to determine when collections begin and end and to initiate collections yourself.

We discuss the details of implementing both the memory and garbage collection managers in Chapter 13.

Threading and Synchronization

The most intricate of the managers provided in the hosting APIs are the threading manager and the synchronization manager. Although the managers are defined separately in the API, it's hard to imagine a scenario in which a host would provide an implementation of the threading manager without implementing the synchronization manager as well. These managers work together to enable the host to customize the way managed code gets scheduled to run within a process.

The purpose of these two managers is to enable the host to abstract the notion of a unit of execution. The first two versions of the CLR assumed a world based on physical threads that were preemptively scheduled by the operating system. In .NET Framework 2.0, the threading manager and synchronization manager allow the CLR to run in environments that use cooperatively scheduled fibers instead. The threading manager introduces the term task as this abstract notion of a unit of execution. The host then maps the notion of a task to either a physical operating system thread or a host-scheduled fiber.

The scenarios in which these managers are used extensively are likely to be few, so I don't spend too much time discussing them in this book. However, the subject is interesting if for no other reason than the insight it provides into the inner workings of the CLR.

The set of capabilities provided by the threading manager is quite extensiveenough to model a major portion of an operating system thread API such as Win32, with additional features specifically required by the CLR. These additional features include a means for the CLR to notify the host of times in which thread affinity is required and callbacks into the CLR so it can know when a managed task gets scheduled (or unscheduled), among others.

The general capabilities of the threading manager are as follows:

Task management Starting and stopping tasks as well as standard operations such as join, sleep, alert, and priority adjustment.
Scheduling Notifications to the CLR that a managed task has been moved to or from a runnable state. When a task is scheduled, the CLR is told which physical operating system thread the task is put on.
Thread affinity A means for the CLR to tell the host of specific window during which thread affinity must be maintained. That is, a time during which a task must remain running and must stay on the current thread.
Delayed abort There are windows of time in which the CLR is not in a position to abort a task. The CLR calls the host just before and just after one of these windows.
Locale management Some hosts provide native APIs for users to change or retrieve the current thread locale setting. The managed libraries also provide such APIs (see System.Globalization.CurrentCulture and CurrentUICulture in the Microsoft .NET Framework SDK). In these scenarios, the host and the CLR must inform each other of locale changes so that both sides stay synchronized.
Task pooling Hosts can reuse or pool the CLR-implemented portion of a task to optimize performance.
Enter and leave notifications Hosts are notified each time execution leaves the CLR and each time it returns. These hooks are called whenever managed code issues a PInvoke or Com Interoperability call or when unmanaged code calls into managed code.

One feature that perhaps needs more explanation is the ability to hook calls between managed and unmanaged code. On the surface it might not be obvious how this is related to threading, but it ends up that hosts that implement cooperatively scheduled environments often must change how the thread that is involved in the transition can be scheduled.

Consider the scenario in which an add-in uses PInvoke to call an unmanaged DLL that the host knows nothing about. Because of the information received by implementing the threading and synchronization abstractions, the host can cooperatively schedule tasks running managed code just fine. However, when control leaves that managed code and enters the unmanaged DLL, the host no longer can know what that code is going to do. The unmanaged DLL could include code that takes a lock on a thread and holds it for long periods of time, for example. In this case, managed code should not be cooperatively scheduled on that thread because the host cannot control when it will next get a chance to run. This is where the hooks come in. When a host receives the notification that control is leaving the CLR, it can switch the scheduling mode of that thread from the host-control cooperative scheduling mode to the preemptive scheduling mode provided by the operating system. Said another way, the host gives responsibility for scheduling code on that thread back to the operating system. At some later point in time, the PInvoke call in our sample completes and returns to managed code. At this point, the hook is called again and the host can switch the scheduling mode back to its own cooperatively scheduled state.

I mentioned earlier that the threading manager and synchronization manager are closely related. The preceding example provides some hints as to why. The interfaces in the threading manager provide the means for the host to control many aspects of how managed tasks are run. However, the interfaces in the synchronization manager provide the host with information about how the tasks are actually behaving. Specifically, the synchronization manager provides a number of interfaces the CLR will use to create synchronization primitives (locks) when requested (or needed for internal reasons) during the execution of managed code. Knowing when locks are taken is useful information to have during scheduling. For example, when code blocks on a lock, it's likely a good time to pull that fiber off a thread and schedule another one that's ready to run. Knowing about locks helps a host tune its scheduler for maximum throughput.

There's another scenario in which it's useful for a host to be aware of the locks held by managed tasks: deadlock detection. It's quite possible that a host can be running managed tasks and tasks written in native code simultaneously. In this case, the CLR doesn't have enough information to resolve all deadlocks even if it tried to implement such a feature. Instead, the burden of detecting and resolving deadlocks must be on the host. Making the host aware of managed locks is essential for a complete deadlock detection mechanism.

Primarily for these reasons, the synchronization manager contains interfaces that provide the CLR with implementations of the following:

Critical sections
Events (both manual and auto-reset)
Semaphores
Reader/writer locks
Monitors

We dig into more details of these two managers in Chapter 14.

Other Hosting API Features

We've now covered most of the significant functionality the CLR makes available to hosts through the hosting API. However, a few more features are worth a brief look. These features are discussed in the following sections.

Loading Code Domain Neutral

When assemblies are loaded domain neutral, their jit-compiled code and some internal CLR data structures are shared among all the application domains in the process. The goal of this feature is to reduce the working set. Hosts use the hosting interfaces (specifically, IHostControl) to provide a specific list of assemblies they'd like to have loaded in this fashion. Although domain-neutral loading requires less memory, it does place some additional restrictions on the assembly. Specifically, the code that is generated is slightly slower in some scenarios, and a domain-neutral assembly cannot be unloaded until the process exits. As such, hosts typically do not load all assemblies domain neutral. In practice, the set of assemblies loaded in this way often are the system assembliesadd-ins are almost never loaded domain neutral so they can be dynamically unloaded while the process is running. This is the exact model that hosts such as SQL Server 2005 follow. Domain-neutral code is covered in detail in Chapter 9.

Thread Pool Management

Hosts can provide the CLR with a thread pool by implementing the thread pool manager. The thread pool manager has one interface (IHostThreadPoolManager) and provides all the functionality you'd expect including the capability to queue work items to the thread pool and set the number of threads in the pool. The thread pool manager is described in detail in Chapter 14.

I/O Completion Management

Overlapped I/O can also be abstracted by the host using the I/O completion manager. This manager enables the CLR to initiate asynchronous I/O through the host and receive notifications when it is complete. For more information on the I/O completion manager, see the documentation for the IHostIoCompletionPort and ICLRIoCompletionPort interfaces in the .NET Framework SDK.

Debugging Services Management

The debugging manager provides some basic capabilities that enable a host to customize the way debuggers work when attached to the host's process. For example, hosts can use this manager to cause the debugger to group related debugging tasks together and to load files containing extra debugging information. For more information on the debugging manager, see the ICLRDebugManager documentation in the .NET Framework SDK.

Application Domain Management

Application domains serve two primary purposes as far as a host is concerned. First, hosts use application domains to isolate groups of assemblies within a process. In many cases, application domains provide the same level of isolation for managed code as operating system processes do for unmanaged code. The second common use of application domains is to unload code from a process dynamically. Once an assembly has been loaded into a process, it cannot be unloaded individually. The only way to remove it from memory is to unload the application domain the assembly was in. Application domains are always created in managed code using the System.AppDomain class. However, the hosting interface ICLRRuntimeHost enables you to register an application domain manager^[1] that gets called by the CLR each time an application domain is created. You can use your application domain manager to configure the domains that are created in the process. In addition, ICLRRuntimeHost also includes a method that enables you to cause an application domain to be unloaded from your unmanaged hosting code.

^[1] The term "manager" as used here can be a bit confusing given the context in which we've used it in the rest of the chapter. An application domain manager isn't a "manager" as specifically defined by the CLR hosting interfaces. Instead, it is a managed class that you implement to customize how application domains are used within a process.

Application domains are such a central concept to hosts and other extensible applications that I dedicate two chapters to them. The first chapter (Chapter 5) provides an overview of application domains and provides guidelines to help you use them most effectively. The second chapter (Chapter 6) describes the various ways you can customize application domains to fit your application's requirements most closely.

CLR Event Handling

Hosts can register a callback with the CLR that gets called when various events happen when running managed code. Through this callback, hosts can receive notification when the CLR has been disabled in the process (that is, it can no longer run managed code) or when application domains are unloaded. More details on the CLR event manager are provided in Chapter 5.