Loading Assemblies by Assembly Identity

The .NET Framework class libraries provide several APIs you can use to load an assembly dynamically by assembly identity. Throughout this chapter, I refer to these APIs, as well as the APIs that enable you to load an assembly given a filename, as the assembly loading APIs. The assembly loading APIs that take an assembly identity enable you to specify an assembly's identity in one of two ways. First, you can supply the identity as a string that follows a well-defined format. Second, you can supply an instance of the System.Reflection.AssemblyName class. This class contains properties for the textual name, public key, culture, and version components of the assembly identity. In this section I describe how to use the assembly loading APIs to load add-ins into an extensible application dynamically.

Before I get into the details of how to call these APIs, however, it's worth taking a step back and revisiting the extensible application architecture introduced in Chapter 1. Making the most effective use of the assembly loading APIs involves more than just knowing the details of how to call the APIs. Your extensible application will have a much cleaner design and will perform much better if you think through how your use of application domains relates to assembly loading. This involves understanding your application domain boundaries and taking advantage of the application domain manager infrastructure discussed in Chapter 5. By looking back at the basic architecture, you can see how best to take advantage of the assembly loading APIs without introducing unintended side effects such as assemblies loaded into the wrong application domain.

After I've discussed how the assembly loading APIs fit into the overall architecture of an extensible application, I cover the details involved in calling these APIs. In addition to looking at the APIs themselves, I discuss briefly the CLR's rules for locating assemblies and the impact of partially specified references.

Architecture of an Extensible Application Revisited

In Chapter 1, I introduced the typical architecture of an extensible application. In the last few chapters, I've made this architecture more concrete by describing the role that application domains play in applications that are extensible. Application domains exist for one purpose: as containers for assemblies. The main goal of this architecture is to provide the infrastructure in which to load assemblies dynamically. Let me review this architecture now and highlight the key design points that affect how the add-in assemblies are loaded (see Figure 7-2). The key points include the following:

Multiple application domains are used. Extensible applications typically create multiple application domains to isolate add-ins (or groups of add-ins) from others loaded in the same process. It's important to call the assembly loading APIs from the domain in which an add-in is to be loaded. This results in the cleanest design and the best overall performance.
An application domain manager is created for each domain. In Chapter 5 I introduced the notion of an application domain manager. An application domain manager is a convenient place from which to call the assembly loading APIs because the CLR automatically creates an instance of your application domain manager in each new application domain rather than requiring you to write code to load your domain manager explicitly into each new domain you create.
Add-in assemblies are not loaded in the default application domain. Most extensible applications avoid loading add-in assemblies into the default application domain primarily because the default domain cannot be unloaded without shutting down the entire process. As a result, you typically see just the application domain manager class loaded into the default domain. From there, other domains are created to contain the add-ins.
Communication between application domains is limited to the application domain managers. In Chapter 5 I discuss how to design your application to make the most effective use of application domains. One design goal is to limit communication between application domains as much as possible. This includes two aspects. First, the volume of communication should be limited both in terms of the number of calls made and the amount of data exchanged by those calls. Limiting the volume of communication helps your application perform better because less cross-domain marshaling is required. The other aspect of cross-domain communication that you aim to limit is the number of assemblies involved in calls across application domains. If an assembly is to participate in a call between two application domains, that assembly must be loaded into both application domains. To be loaded into two application domains, an assembly must be deployed in such a way as to be visible to both domains. Furthermore, both domains must be unloaded to remove the assembly from the process completely. In short, the fewer assemblies that are involved in cross-domain communication, the better, from the perspective of both performance and ease of deployment. Because an application domain manager is automatically loaded into each new domain for you, it must already be deployed in such a way that it is visible to multiple application domains. So it's natural to use application domain managers to communicate across application domains.

Figure 7-2. The architecture of an extensible application

As described, the primary goal to keep in mind when designing your assembly loading infrastructure is always to call the assembly loading APIs from the application domain in which you intend the add-in assembly to be loaded. This design is shown in Figure 7-2 by the calls to AppDomain.Load originating in the application domain manager and resulting in the add-in assembly being loaded into the same domain. To get a clear picture of why this design goal is desirable, take a look at how assemblies are represented in the .NET Framework class libraries and how that representation relates to the CLR's infrastructure for calling methods on a class in a different application domain.

System.Reflection.Assembly and CLR Remote Calls

Recall from Chapter 5 that calling a method on a type in another application domain is a remote call. The mechanics for a remote call are different depending on the marshaling characteristics of the type you are calling. Generally speaking, types are considered either marshaled by value or marshaled by reference in CLR remoting terminology. Types that are marshaled by reference are those types that derive from the System.MarshalByRefObject base class. When you call a method on a type derived from MarshalByRefObject in a different application domain, the CLR creates a proxy for that type in the calling application domain. All calls are made through the proxy to the actual type as shown in Figure 7-3.

Figure 7-3. Calling a MarshalByRefObject in a different application domain

In contrast, when a call is made to an object in another application domain that is marshaled by value, a copy of the instance is made in the calling domain. All objects that are marshaled by value must be marked with the [Serializable] custom attribute so the CLR knows how to transfer the object into the new domain. All calls on the type are made to the copy instead of through a proxy to the original as shown in Figure 7-4.

Figure 7-4. Calling a type in a different application domain that is marshaled by value
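The difference between the two marshaling styles can be observed directly. The following is a minimal sketch, assuming the .NET Framework (application domains cannot be created on newer .NET runtimes). The ByRefProbe and ByValueProbe types are hypothetical names introduced only for illustration, and AppDomain.CreateInstanceAndUnwrap is used here simply as a convenient way to create an object in another domain.

```csharp
using System;

// Marshaled by reference: callers in other domains receive a proxy.
public class ByRefProbe : MarshalByRefObject
{
    public string DomainName() { return AppDomain.CurrentDomain.FriendlyName; }
}

// Marshaled by value: callers in other domains receive a copy.
[Serializable]
public class ByValueProbe
{
    public string DomainName() { return AppDomain.CurrentDomain.FriendlyName; }
}

public static class MarshalingDemo
{
    // Returns the domain name reported by each probe, by-reference first.
    public static string[] Run()
    {
        AppDomain other = AppDomain.CreateDomain("Other Domain");
        try
        {
            string assemblyName = typeof(ByRefProbe).Assembly.FullName;

            ByRefProbe byRef = (ByRefProbe)other.CreateInstanceAndUnwrap(
                assemblyName, typeof(ByRefProbe).FullName);
            ByValueProbe byValue = (ByValueProbe)other.CreateInstanceAndUnwrap(
                assemblyName, typeof(ByValueProbe).FullName);

            // The first call crosses into "Other Domain" through the proxy;
            // the second runs against the local copy in the calling domain.
            return new string[] { byRef.DomainName(), byValue.DomainName() };
        }
        finally
        {
            AppDomain.Unload(other);
        }
    }

    public static void Main()
    {
        string[] names = Run();
        Console.WriteLine("by reference executes in: " + names[0]);
        Console.WriteLine("by value executes in:     " + names[1]);
    }
}
```

Note that the assembly defining both probe types ends up loaded in both domains, which is exactly the cost the rest of this section is concerned with.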

At this point, you might be wondering what this discussion about CLR remoting has to do with loading assemblies. It matters because the type used to represent assemblies in the .NET Framework class libraries, System.Reflection.Assembly, is marshaled by value, not by reference. Because instances of System.Reflection.Assembly are copied between application domains, it's easy to end up loading an assembly into an application domain unintentionally. Look at a concrete example to see how easy it is to make this mistake.

In Chapters 5 and 6, we used a CLR host called boatracehost.exe as an example of how to make effective use of application domains in an extensible application. We continue that example in this chapter as we discuss how to use the assembly loading APIs in conjunction with application domains. As described, you can use several APIs in the .NET Framework class libraries to load assemblies dynamically. AppDomain.Load is one of these methods that enables you to load an assembly into an application domain. Let's say for purposes of example that a new boat is entering a race hosted by boatracehost. We'd like to load this add-in into a new application domain, so we use AppDomain.CreateDomain to create the new domain. We then call AppDomain.Load to load the boat add-in in the new domain as shown in the following code:

static void Main(string[] args) {
   AppDomainSetup adSetup = new AppDomainSetup();
   adSetup.ApplicationBase = @"c:\Program Files\BoatRaceHost\Addins";
   AppDomain ad = AppDomain.CreateDomain("Alingi Domain",
                                         null,
                                         adSetup);
   Assembly alingiAssembly = ad.Load("Alingi, Version=5.0.0.0, " +
                                     "PublicKeyToken=5cf360b40180107c, " +
                                     "culture=neutral");
}

The call to AppDomain.Load in the preceding code is a remote call from the default application domain (in which Main is running) to the new domain held in the variable of type AppDomain named ad. AppDomain.Load takes as input the name of the assembly we'd like to load and returns an instance of System.Reflection.Assembly. Because Assembly is marshaled by value, a copy of the instance of the Assembly type is made in the default domain when the call to AppDomain.Load returns as shown in Figure 7-5.

Figure 7-5. An assembly inadvertently loaded into two application domains

Instances of Assembly contain data describing the underlying assembly they represent. For example, given an instance of Assembly, you can determine the assembly's name, the assemblies it depends on, and so on. The underlying assembly represented by an instance of Assembly must be loaded into the application domain where the instance resides. In this example, this means that the Alingi assembly must be loaded into both the default application domain and the Alingi Domain. This side effect of calling AppDomain.Load affects our application design in a few key ways. First, the fact that the add-in assembly has been loaded into our default application domain means we can't unload that assembly from our application without terminating the entire process. This is clearly undesirable from both the perspectives of memory usage and type visibility. Because we can never unload the assembly, we might be stuck dealing with the additional memory it consumes even when we no longer need the assembly within the application. Also, once an assembly is loaded into a given application domain, it can discover all other assemblies in that same domain using the GetAssemblies method on the AppDomain type. Once another assembly is discovered, it can be reflected upon using the types in the System.Reflection namespace. If the code access security policy for that domain isn't configured to disallow it, code in an add-in assembly could even invoke methods on any other assembly loaded in the same domain.
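One simple way to check whether an add-in has leaked into a domain is to list what that domain has loaded. The following is a minimal sketch built on the AppDomain.GetAssemblies method mentioned above; the LoadedAssemblyReport class name is an assumption made for illustration. It reports only on the domain in which it runs.

```csharp
using System;
using System.Reflection;

public static class LoadedAssemblyReport
{
    // Returns the friendly names of all assemblies loaded in the current domain.
    public static string[] NamesInCurrentDomain()
    {
        Assembly[] loaded = AppDomain.CurrentDomain.GetAssemblies();
        string[] names = new string[loaded.Length];
        for (int i = 0; i < loaded.Length; i++)
        {
            names[i] = loaded[i].GetName().Name;
        }
        return names;
    }

    public static void Main()
    {
        Console.WriteLine("Assemblies in domain '{0}':",
                          AppDomain.CurrentDomain.FriendlyName);
        foreach (string name in NamesInCurrentDomain())
        {
            Console.WriteLine("  " + name);
        }
    }
}
```

Running this in the default domain after the earlier call to AppDomain.Load would be one way to confirm that the add-in appears where it was never meant to be.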

The other reason loading an add-in into the default application domain affects our design is that it complicates the deployment of the add-in. Recall from Chapter 6 that each application domain has an ApplicationBase that establishes a root directory in which assemblies for that domain can be deployed. In the preceding code sample, the ApplicationBase for Alingi Domain has been set to c:\program files\boatracehost\addins. By deploying the Alingi assembly to that directory, it is found by the CLR when we call AppDomain.Load. However, because we've also inadvertently added Alingi to the default application domain, the add-in must be deployed to a location where the CLR will find it for that domain as well. This means deploying the add-in to another ApplicationBase or adding it to a global location such as the global assembly cache (GAC). This subtlety often results in unexpected failures to load an assembly. For example, in looking at the previous code, it's obvious that we need to deploy our add-in to c:\program files\boatracehost\addins. However, if we did just that, we'd get a FileNotFoundException telling us that the assembly we're loading cannot be found. When I see these errors, I typically look in the directory in which I expect the assembly to be found, and, seeing it there, I'm at a loss for a few minutes before I realize that the CLR is trying to load my assembly into an application domain I never intended. Because of all this, it is far better to call the assembly loading APIs from within the domain in which you intend the add-in to be loaded.
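When a load fails this way, it helps to record which domain the failing load was attempted in. The sketch below is one possible diagnostic, not part of boatracehost; the LoadDiagnostics class and the TryLoad method are hypothetical names. It catches the FileNotFoundException and reports the current domain's friendly name and base directory alongside the name that could not be found.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class LoadDiagnostics
{
    // Returns null when the load succeeds; otherwise returns a message that
    // names the domain in which the failing load was attempted.
    public static string TryLoad(string assemblyIdentity)
    {
        try
        {
            Assembly.Load(assemblyIdentity);
            return null;
        }
        catch (FileNotFoundException ex)
        {
            AppDomain current = AppDomain.CurrentDomain;
            return string.Format(
                "Could not find '{0}' from domain '{1}' (base directory: {2})",
                ex.FileName, current.FriendlyName, current.BaseDirectory);
        }
    }

    public static void Main()
    {
        // An identity that is not expected to exist anywhere on the machine.
        string message = TryLoad(
            "NoSuchAddIn, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null");
        Console.WriteLine(message ?? "loaded successfully");
    }
}
```

If the reported domain is the default domain rather than the add-in's domain, the assembly is being pulled across a domain boundary in the manner just described.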

Recommendations for Loading Assemblies in Extensible Applications

Most extensible applications leverage the application domain manager concept introduced in Chapter 5 to load add-ins from within the desired application domain. As described, the application domain manager is a natural place from which to initiate assembly loads because the CLR takes care of creating an instance of the application domain manager in each new application domain you create. In leveraging this design, most extensible applications follow a series of steps similar to the following when loading a new add-in to the application:

1.	The extensible application is made aware of the new add-in.
2.	An application domain is chosen in which to load the new add-in.
3.	The application domain manager in the target domain is called to load the add-in.
4.	The application domain manager in the target domain loads the add-in.

These steps are described in the following sections.

Step 1: The Extensible Application Is Made Aware of the New Add-In

The means by which add-ins are introduced to an extensible application are completely up to the application. So there is no general approach to recommend. Instead, I discuss some common examples.

Typically, an extensible application either presents a user interface or provides a configuration system that enables a user to add a new add-in to the application. For example, new managed types, procedures, and so on are added to SQL Server by editing the SQL catalog, whereas some graphical applications include dialog boxes that enable users interactively to specify the add-ins they'd like to load. In other examples, add-ins are specified in code that the application interprets and runs. For example, add-ins are included in client-side Web pages using the <object> tag in a Hypertext Markup Language (HTML) source file.

Step 2: An Application Domain Is Chosen in Which to Load the New Add-In

In Chapter 5 I discuss the criteria to consider when partitioning a process into multiple application domains. These criteria include the need to isolate assemblies from others that are loaded in the same process, to unload code dynamically from a running process, and to limit the amount of communication that occurs between objects loaded in different application domains. When a new add-in is introduced to your extensible application, you must examine the add-in and load it into an application domain that meets your requirements for partitioning. Depending on your scenario, you might load the add-in into an existing application domain, or you might create a new one in which to load the add-in. For example, in Chapter 5 I describe how the Microsoft Internet Explorer host partitions a process into application domains based on Web sites. That is, all controls that are downloaded from the same site are loaded into the same application domain. As a result, when Internet Explorer comes across a reference to a control while parsing a Web page, it looks to see if it has already created an application domain corresponding to the site from which the control originates. If it has, the control is loaded into that domain. If not, a new application domain is created in which to load the control. Your application will likely follow similar logic when deciding how to load a new add-in. Most extensible applications keep an internal data structure that holds the list of application domains in the process along with some descriptive data for each domain that is used to determine the appropriate domain for new add-ins (in the Internet Explorer case, this extra piece of data is the name of a Web site).
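The internal data structure described above can be as simple as a dictionary keyed by the partitioning criterion. The following is a minimal sketch, assuming domains are partitioned by site name as in the Internet Explorer case; DomainRegistry and its members are hypothetical names, and the domain-creation step is passed in as a delegate so the lookup logic stays separate from how domains are configured.

```csharp
using System;
using System.Collections.Generic;

public class DomainRegistry
{
    private readonly Dictionary<string, AppDomain> domainsBySite =
        new Dictionary<string, AppDomain>(StringComparer.OrdinalIgnoreCase);
    private readonly Func<string, AppDomain> createDomain;

    public DomainRegistry(Func<string, AppDomain> createDomain)
    {
        this.createDomain = createDomain;
    }

    public int Count
    {
        get { return domainsBySite.Count; }
    }

    // Returns the domain already associated with the site, creating and
    // recording a new one only the first time the site is seen.
    public AppDomain GetOrCreate(string site)
    {
        AppDomain domain;
        if (!domainsBySite.TryGetValue(site, out domain))
        {
            domain = createDomain(site);
            domainsBySite.Add(site, domain);
        }
        return domain;
    }
}

public static class DomainRegistryDemo
{
    public static void Main()
    {
        // Placeholder factory: a real host on the .NET Framework would call
        // AppDomain.CreateDomain(site, null, setup) here instead.
        DomainRegistry registry = new DomainRegistry(delegate(string site)
        {
            Console.WriteLine("creating domain for " + site);
            return AppDomain.CurrentDomain;
        });

        registry.GetOrCreate("www.example.com");
        registry.GetOrCreate("www.example.com"); // reuses the first entry
        Console.WriteLine("domains tracked: " + registry.Count);
    }
}
```

A production registry would also need to remove entries when domains are unloaded and to guard the dictionary if add-ins can arrive on multiple threads.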

Step 3: The Application Domain Manager in the Target Domain Is Called to Load the Add-In

After you've chosen an application domain in which to load the new add-in, you must transfer control into that target domain so the actual loading of the assembly can take place. As described, calling the assembly loading APIs from within the domain in which you'd like the add-in to run makes for a cleaner design. The easiest way to transition into a different application domain is to call a method on the application domain manager in the target domain. Look at some code from our boatracehost to see how this is done. Recall from Chapter 5 that the application domain manager for boatracehost is implemented in a class called BoatRaceDomainManager. BoatRaceDomainManager implements an interface called IBoatRaceDomainManager, which includes a method called EnterBoat that we'll use to load a new add-in into the application. Here's a portion of BoatRaceDomainManager and the interface it implements:

   public interface IBoatRaceDomainManager
   {
      // loads the boat identified by boatTypeName from the
      // assembly in assemblyName into the application domain
      // in which this instance of the domain manager is
      // running.
      void EnterBoat(string assemblyName, string boatTypeName);
   }

   public class BoatRaceDomainManager : AppDomainManager,
                                        IBoatRaceDomainManager
   {
      // must be public to implement the interface method
      public void EnterBoat(string assemblyName, string boatTypeName)
      {
         // load the boat into this application domain...
      }
   }

The following code uses the BoatRaceDomainManager class to load an assembly into a new application domain:

   AppDomainSetup adSetup = new AppDomainSetup();
   adSetup.ApplicationBase = @"c:\Program Files\BoatRaceHost\Addins";
   AppDomain ad = AppDomain.CreateDomain("Alingi Domain",
                                       null,
                                       adSetup);
   BoatRaceDomainManager adManager = (BoatRaceDomainManager)ad.DomainManager;
   // first argument: the assembly identity; second argument: the boat type name
   adManager.EnterBoat("Alingi, Version=5.0.0.0, " +
                       "PublicKeyToken=5cf360b40180107c, " +
                       "culture=neutral",
                       "AlingiBoat");

In this example, we use the DomainManager property on System.AppDomain to get the instance of BoatRaceDomainManager that the CLR has created for us in the new domain. Given our domain manager instance, we simply call the EnterBoat method to transition into the new application domain.

Step 4: The Application Domain Manager in the Target Domain Loads the Add-In

Once inside the new application domain, using the assembly loading APIs to load the add-in is easy. Just as we used the AppDomain.Load method earlier in the chapter to load an assembly into a different application domain, you can use it now to load an assembly in the domain in which you're running. The application domain manager from boatracehost does just this. The implementation of BoatRaceDomainManager.EnterBoat determines the current application domain using the static CurrentDomain property on System.AppDomain. It then calls the AppDomain.Load method, passing in the name of the assembly to load as shown in the following code:

   public class BoatRaceDomainManager : AppDomainManager,
                                        IBoatRaceDomainManager
   {
      public void EnterBoat(string assemblyName, string boatTypeName)
      {
         // load the assembly containing boat into this
         // application domain...
         Assembly alingiAssembly = AppDomain.CurrentDomain.Load(assemblyName);

         // load the type from the new assembly...
      }
   }
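The listing leaves the "load the type" step open. One way to fill it in, shown here only as a sketch and not as the boatracehost implementation, is to look the type up by name on the loaded assembly and create it with Activator.CreateInstance. SampleBoat and BoatActivator are hypothetical names, and the sketch assumes the boat type has a public parameterless constructor.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for a boat type that an add-in would define.
public class SampleBoat
{
    public override string ToString() { return "SampleBoat ready"; }
}

public static class BoatActivator
{
    // Finds boatTypeName in the given assembly and creates an instance of it.
    public static object CreateBoat(Assembly addInAssembly, string boatTypeName)
    {
        // Passing true makes a missing type raise an exception instead of
        // returning null, so failures surface at the point of the lookup.
        Type boatType = addInAssembly.GetType(boatTypeName, true);
        return Activator.CreateInstance(boatType);
    }

    public static void Main()
    {
        // The current assembly stands in for an add-in assembly that has
        // already been loaded into this domain.
        Assembly loaded = typeof(SampleBoat).Assembly;
        object boat = CreateBoat(loaded, typeof(SampleBoat).FullName);
        Console.WriteLine(boat);
    }
}
```

Because both the lookup and the instantiation happen inside the target domain, no Assembly or Type instance needs to cross a domain boundary.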

Using Assembly.Load and Related Methods

Now that I've shown how best to make use of the assembly loading APIs within your application, let's dig into the details of the APIs themselves. As described, several methods in the .NET Framework provide the ability to load an assembly dynamically given an assembly identity; the AppDomain.Load method used in the previous section is just one such API. In some cases, multiple APIs provide the same functionality and are therefore redundant, but in other cases the APIs offer different capabilities. For example, some APIs enable you to load an assembly into a different application domain, whereas some load an assembly only into the current domain. Following are the methods in the .NET Framework that enable you to load an assembly and brief descriptions of each method's capabilities. Keep in mind that these are the APIs that enable you to load an assembly given an assembly name. Several APIs enable you to load an assembly by providing the name of the file containing the manifest. I cover these APIs later on in the section, "Loading Assemblies by Filename."

AppDomain.Load: This is the only method that enables you to load an assembly into an application domain other than the one in which you're currently running. As discussed earlier in this chapter, it's easy to load an assembly into the current application domain inadvertently if you're not careful.
The overloads for AppDomain.Load are as follows:
```
public Assembly Load(AssemblyName assemblyRef)
public Assembly Load(AssemblyName assemblyRef, Evidence assemblySecurity)
public Assembly Load(String assemblyString)
```
AppDomain.ExecuteAssemblyByName: This is the only method in the group that causes code to be executed when it is called. ExecuteAssemblyByName is used to launch managed executable files programmatically. You provide the identity of the assembly that contains the executable, and the CLR runs its Main method. This method doesn't return until the executable has finished running.
The overloads for AppDomain.ExecuteAssemblyByName are as follows:
```
public int ExecuteAssemblyByName(String assemblyName)
public int ExecuteAssemblyByName(String assemblyName,
   Evidence assemblySecurity)
public int ExecuteAssemblyByName(String assemblyName,
   Evidence assemblySecurity,
   String[] args)
public int ExecuteAssemblyByName(AssemblyName assemblyName,
   Evidence assemblySecurity,
   String[] args)
```
Assembly.Load: This is the most commonly used API for loading an assembly into the current application domain. Because it is static, there's no way to use this method to load an assembly in an application domain other than the one in which you're currently running.
The overloads for Assembly.Load are as follows:
```
public static Assembly Load(String assemblyString)
public static Assembly Load(String assemblyString, Evidence assemblySecurity)
public static Assembly Load(AssemblyName assemblyRef)
public static Assembly Load(AssemblyName assemblyRef,
   Evidence assemblySecurity)
```
Assembly.LoadWithPartialName: This method has been deprecated in .NET Framework 2.0 and will be removed in a future version of the .NET Framework. Assembly.LoadWithPartialName enables you to load a strongly named assembly from the GAC using a partial reference. This was commonly used to implement a "use latest version" policy, whereby the caller would omit a version number from the reference and would load the latest version of the assembly from the GAC that matched the name and public key. Blindly loading the latest version of a shared assembly brings back the world of DLL Hell by exposing you to versioning conflicts between different releases of an assembly. For that reason, this method is being removed from the .NET Framework.
The overloads for Assembly.LoadWithPartialName are as follows:
```
public static Assembly LoadWithPartialName(String partialName)
public static Assembly LoadWithPartialName(String partialName,
   Evidence securityEvidence)
public static Assembly LoadWithPartialName(String partialName,
   Evidence securityEvidence, bool oldBehavior)
```

As you can see from the preceding list, the assembly loading APIs enable you to specify the assembly to load either by supplying its identity as a string or by providing an instance of System.Reflection.AssemblyName. In addition, each API has an overload that lets you associate security evidence with the assembly you are loading. I cover the details of using this parameter in Chapter 10.
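The two ways of identifying an assembly can be compared side by side. The sketch below uses the identity of the runtime's own core library, chosen only because it is guaranteed to be available; LoadByIdentity and its method names are assumptions made for illustration. It loads the same assembly once from a string and once from an AssemblyName built from that string.

```csharp
using System;
using System.Reflection;

public static class LoadByIdentity
{
    // Loads an assembly into the current domain from its string identity.
    public static Assembly LoadFromString(string identity)
    {
        return Assembly.Load(identity);
    }

    // Loads an assembly into the current domain from an AssemblyName
    // instance constructed from the same identity.
    public static Assembly LoadFromName(string identity)
    {
        AssemblyName name = new AssemblyName(identity);
        return Assembly.Load(name);
    }

    public static void Main()
    {
        string identity = typeof(object).Assembly.FullName;

        Assembly fromString = LoadFromString(identity);
        Assembly fromName = LoadFromName(identity);

        Console.WriteLine(fromString.FullName);
        Console.WriteLine("same identity: " +
                          (fromString.FullName == fromName.FullName));
    }
}
```

Both calls are static, so in either form the assembly is loaded into the domain that is running the code.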

Note

It's often the case that the first thing you'd like to do after loading an assembly is create a type from that assembly. To support this scenario, the .NET Framework provides several APIs that enable you to load an assembly and create a type with a single method call. When using these APIs, you pass the name of the type you'd like to create in addition to the name of the assembly you'd like to load. These convenience methods eliminate a lot of boilerplate code you'd find yourself writing over and over again. The methods that enable you to load an assembly and create a type are these:

System.AppDomain.CreateInstance
System.AppDomain.CreateInstanceAndUnwrap
System.Activator.CreateInstance

With respect to assembly loading, these methods work just like the ones in the preceding list, so I don't talk about them explicitly in this chapter. Documentation of these methods can be found in the .NET Framework SDK.

Specifying Assembly Identities as Strings

When specifying an assembly identity by string, you must follow a well-defined format that the CLR understands. This format enables you to specify all four parts of an assembly's name: the friendly name, version number, culture, and information about the public portion of the cryptographic key pair used to give the assembly a strong name. The string form of an assembly is as follows:

"<friendlyName>, Version=<version number>, PublicKeyToken=<publicKeyToken>, Culture=<culture>"

When specifying identities in this format, the <friendlyName> portion of the identity must come first. The PublicKeyToken, Version, and Culture elements can be specified in any order. Strings that follow this format are passed directly to the assembly loading APIs as shown in the following simple example:

    public class BoatRaceDomainManager : AppDomainManager,
                                         IBoatRaceDomainManager
    {
       void EnterAlingi()
       {
          // load the assembly into this application domain...
          Assembly a = Assembly.Load("Alingi, Version=5.0.0.1, " +
                                     "PublicKeyToken=3026a3146c675483, " +
                                     "Culture=neutral");

          // load the type from the new assembly...
       }
    }

As I explained earlier, it is possible to reference an assembly by supplying less than the full identity. I cover the details of how such references work in the section "Partially Specified Assembly References" later in the chapter.

Specifying assembly identities using the string format is generally straightforward as long as the CLR can correctly parse the string you supply. Any extra characters in the string (such as duplicate commas) or misspelled element names will cause a FileLoadException to be raised, and your assembly will not be loaded.

Note

The CLR error checking process when parsing assembly identities in .NET Framework 2.0 is much stricter than it was in previous versions of the CLR. For example, any extra characters or unrecognized element names (such as those caused by misspellings) were simply ignored instead of flagged as errors in previous versions. As always, make sure you thoroughly test your application on all versions of the CLR you intend to support to catch subtle differences like this.

Failures to load an assembly because of errors in parsing the assembly identity are easy to diagnose because the instance of FileLoadException that is thrown contains a specific message and HRESULT. The HRESULT indicating a parsing error is 0x80131047 (this error code is defined as FUSION_E_INVALID_NAME in the file corerror.h from the include directory in the .NET Framework SDK). The message property of the exception will say, "Unknown error -HRESULT 0x80131047."
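A host can test for this condition in code rather than by reading the message. The following is a minimal sketch; IdentityParseCheck and its members are hypothetical names, and the constant simply repeats the HRESULT quoted above. The exact HRESULT reported for a malformed identity may differ between runtime versions, so the sketch prints what it observes instead of assuming a value.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Runtime.InteropServices;

public static class IdentityParseCheck
{
    // The HRESULT quoted in the text (FUSION_E_INVALID_NAME).
    public const int FusionInvalidName = unchecked((int)0x80131047);

    // Returns the HRESULT of the I/O failure raised by Assembly.Load,
    // or 0 if the load succeeded.
    public static int LoadFailureHResult(string identity)
    {
        try
        {
            Assembly.Load(identity);
            return 0;
        }
        catch (IOException ex)
        {
            // FileLoadException and FileNotFoundException both derive
            // from IOException, so either kind of failure is reported.
            return Marshal.GetHRForException(ex);
        }
    }

    public static void Main()
    {
        // The duplicate comma makes this identity malformed.
        int hr = LoadFailureHResult("Alingi,, Version=5.0.0.1");

        Console.WriteLine("HRESULT: 0x{0:X8}", hr);
        Console.WriteLine(hr == FusionInvalidName
            ? "The identity string could not be parsed."
            : "The failure was not reported as a name-parsing error.");
    }
}
```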

In addition to forming the string correctly, it's important that the values you supply for each element are valid. The following points summarize the valid values for friendly name, culture, and version.

Friendly name: Friendly names can contain any characters that are valid for naming files in the file system. Friendly names are not case sensitive.
Version: Assembly version numbers consist of four parts as specified in the following format:
```
Major.minor.build.revision
```
When specifying a version number, it's best to include all four parts because the CLR will make sure that the version number of the loaded assembly exactly matches the version number you specify. You can omit values for some portions of the version number, but doing so results in a partial reference. When resolving an assembly based on a partial version number reference, the CLR matches only those portions of the version number you provide. This looseness in binding semantics can cause you to load an assembly inadvertently. For example, given the following reference:
```
"Alingi, Version=5, PublicKeyToken=3026a3146c675483, Culture=neutral"
```
the CLR makes sure only that the major number of the assembly you load is 5; none of the other portions of the version number are checked. In other words, the first assembly found in the application directory (the global assembly cache is not searched when resolving a partial reference) whose major number is 5 will be loaded regardless of the values for the other three portions of the version number. I cover partial references in more detail later in the chapter.
Culture: Values for the culture element of the assembly name follow a format described by RFC 1766. This format includes both a code for the language and a more specific code for the region. For example, "de-AT" is the culture value for German-Austria, whereas "de-CH" represents German-Switzerland. See the documentation for the System.Globalization.CultureInfo class in the .NET Framework SDK for more details.
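Before passing an identity string to a loading API, it can be useful to see how its elements are interpreted. The sketch below relies on the AssemblyName constructor that accepts a string, which parses the identity without loading anything; IdentityInspector and Describe are hypothetical names introduced for illustration.

```csharp
using System;
using System.Reflection;

public static class IdentityInspector
{
    // Parses an identity string and summarizes the elements it contains.
    public static string Describe(string identity)
    {
        AssemblyName name = new AssemblyName(identity);

        byte[] token = name.GetPublicKeyToken();
        string tokenText = (token == null || token.Length == 0)
            ? "(none)"
            : BitConverter.ToString(token).Replace("-", "").ToLowerInvariant();

        string culture;
        if (name.CultureInfo == null)
            culture = "(unspecified)";
        else if (name.CultureInfo.Name.Length == 0)
            culture = "neutral";
        else
            culture = name.CultureInfo.Name;

        return string.Format("name={0}; version={1}; culture={2}; token={3}",
                             name.Name, name.Version, culture, tokenText);
    }

    public static void Main()
    {
        Console.WriteLine(Describe(
            "Alingi, Version=5.0.0.1, " +
            "PublicKeyToken=3026a3146c675483, Culture=neutral"));
    }
}
```

An element that was omitted from the string shows up as missing in the summary, which makes an accidental partial reference easier to spot.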

The public key token portion of an assembly name requires a bit more explanation. Typing an entire 1024-bit (or larger) cryptographic key when referencing an assembly would be overly cumbersome. To make referencing strong-named assemblies easier, the CLR enables you to provide a shortened form of the key called a public key token. A public key token is an 8-byte value derived by taking a portion of a hash of the entire public key. Fortunately, the .NET Framework SDK provides a tool called the Strong Name utility (sn.exe) so you don't have to be a cryptography wizard to obtain a public key token. The easiest way to obtain the public key token from an assembly is to use the -T option of sn.exe. For example, issuing the following command at a command prompt:

C:\Projects\Alingi> sn -T Alingi.dll

yields the following output:

Microsoft (R) .NET Framework Strong Name Utility Version 2.0.40301.9
Copyright (C) Microsoft Corporation 1998-2004. All rights reserved.
Public key token is 3026a3146c675483

From here, you can paste the public key token value into your source code.
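For readers curious about the derivation, the token can also be computed in code. The sketch below follows the commonly described scheme, which is an assumption here rather than something stated in this chapter: take the SHA-1 hash of the full public key, keep its last 8 bytes, and reverse their order. PublicKeyTokens and its members are hypothetical names.

```csharp
using System;
using System.Reflection;
using System.Security.Cryptography;

public static class PublicKeyTokens
{
    // Derives an 8-byte token from a full public key (assumed scheme:
    // last 8 bytes of the SHA-1 hash, in reverse order).
    public static byte[] FromPublicKey(byte[] publicKey)
    {
        byte[] hash;
        using (SHA1 sha1 = SHA1.Create())
        {
            hash = sha1.ComputeHash(publicKey);
        }

        byte[] token = new byte[8];
        for (int i = 0; i < token.Length; i++)
        {
            token[i] = hash[hash.Length - 1 - i];
        }
        return token;
    }

    // Formats bytes as the lowercase hex used in assembly identity strings.
    public static string ToHex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", "").ToLowerInvariant();
    }

    public static void Main()
    {
        // Any strong-named assembly will do; the one defining System.Uri
        // is used here only because it is always present.
        AssemblyName name = typeof(Uri).Assembly.GetName();

        Console.WriteLine("computed: " + ToHex(FromPublicKey(name.GetPublicKey())));
        Console.WriteLine("reported: " + ToHex(name.GetPublicKeyToken()));
    }
}
```

For a strong-named assembly the two lines are expected to agree. In practice sn.exe remains the simpler route; the code is mainly useful for understanding what the token represents.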

Specifying Assembly Identities Using System.Reflection.AssemblyName

Calling the assembly loading APIs by passing the assembly identity as a string is the most common approach because it's so easy to use. However, as shown earlier, most of the assembly loading APIs also allow you to pass an instance of System.Reflection.AssemblyName to identify the assembly you want to load. AssemblyName has properties and methods that enable you to specify each element of the identity of the assembly you want to load. Table 7-1 describes these members.

Table 7-1. Members of System.Reflection.AssemblyName Used to Load Assemblies
AssemblyName Member	Description
Name	A string used to specify the assembly's friendly name
Version	An instance of System.Version that identifies the version of the assembly you'd like to load
CultureInfo	An instance of System.Globalization.CultureInfo that describes the assembly's culture
SetPublicKey, SetPublicKeyToken	Methods that accept an array of System.Byte holding either the public key or the public key token of the assembly you wish to load

The following example shows how to call the assembly loading APIs by passing an instance of AssemblyName. In this example, I specify a partial reference using the friendly name only by constructing a new instance of AssemblyName, setting its Name property, and passing it to Assembly.Load:

public class BoatRaceDomainManager : AppDomainManager,
                                     IBoatRaceDomainManager
{
    void EnterBoat()
    {
        // load the assembly into this
        // application domain...
        AssemblyName name = new AssemblyName();
        name.Name = "Alingi";

        Assembly a = Assembly.Load(name);
        // load the type from the new assembly
    }
}
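For comparison, a fully specified reference can be built from the same members listed in Table 7-1. The following is a minimal sketch; the version and public key token are sample values chosen for illustration, and the class name is hypothetical.

```csharp
using System;
using System.Globalization;
using System.Reflection;

public static class AssemblyNameDemo
{
    // Builds a reference that supplies all four parts of the identity.
    public static AssemblyName BuildFullReference()
    {
        AssemblyName name = new AssemblyName();
        name.Name = "Alingi";
        name.Version = new Version(5, 0, 0, 0);          // sample version
        name.CultureInfo = CultureInfo.InvariantCulture; // culture-neutral
        // Sample 8-byte public key token, written out byte by byte.
        name.SetPublicKeyToken(new byte[] { 0xae, 0x4c, 0xc5, 0xed, 0xa5, 0x03, 0x27, 0x77 });
        return name;
    }

    public static void Main()
    {
        AssemblyName name = BuildFullReference();
        // FullName renders the identity in the string format accepted by Assembly.Load.
        Console.WriteLine(name.FullName);
        // Passing 'name' to Assembly.Load would ask the CLR to resolve the reference;
        // that call is omitted here because the sample assembly does not exist.
    }
}
```

Printing FullName is a convenient way to confirm that the AssemblyName you built matches the string form you would otherwise have typed by hand.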

How the CLR Locates Assemblies

The CLR follows a consistent, well-defined set of steps to locate the assembly you've specified when calling one of the assembly-loading APIs. These steps are different based on whether you are referencing a strong-named assembly or a weakly named one.

Note

All aspects of the CLR's behavior for loading assemblies can be customized by you as the author of an extensible application. In Chapter 8, I write a CLR host that shows the extent of customization possible.

Understanding the steps the CLR follows to load an assembly is essential when building an extensible application that can work well in a variety of add-in deployment scenarios. Several factors influence both the version and the location of the assembly the CLR loads given your reference. Some factors are aspects of the deployment environment that you can control, such as the base directories for the application domains you create. Other factors, such as the specification of version policy by an administrator or the author of a shared assembly, are beyond your control. Fortunately, great tools are available for you to understand how the CLR locates assemblies and diagnose any problems you might encounter. I cover how to use these tools after describing the steps the CLR uses to locate assemblies.

The factors the CLR considers when resolving an assembly reference include deployment locations and the presence of any version policy or assembly codebase locations as shown in Figure 7-6 and described in the following points.

Figure 7-6. Factors that influence how the CLR locates assemblies

ApplicationBase As described in Chapter 6, an application domain's ApplicationBase establishes a root directory in which the CLR looks for assemblies intended to be private to that domain. You'll almost always want to set this property when creating a new application domain.
Global assembly cache The GAC is a repository for assemblies that are meant to be shared by several applications. The CLR looks in the GAC first when resolving a reference to a strong-named assembly, as I discuss in a bit.
Version policy As described, version binding redirects can be specified in an application configuration file, by the publisher of a strong-named assembly, or by the administrator of the machine. As the creator of an application domain, you can completely control whether application-level version redirects exist: you can turn off such version redirects either by not specifying a ConfigurationFile for your application domain or by setting the DisallowBindingRedirects property described in Chapter 6. However, there is no way for you to control whether binding redirects specified by the machine administrator are applied. As a result, it is possible that the CLR will load a version of an assembly other than the one you specify.
Codebases The same configuration files used to specify version policy can also be used to provide a codebase location at which a given version of an assembly can be found. This is done using the <codeBase> XML element (for more information on using <codeBase> to supply an assembly location, see the .NET Framework SDK documentation). As with version redirects, you can control whether an application configuration file can be used to supply a codebase, but you can't prevent a codebase location from being supplied by an administrator. Therefore, it is possible that the CLR will load a given assembly from a location other than what you expect. Later in the chapter, I show you how to determine where an assembly was loaded from by using the properties and methods of the Assembly class.

How the CLR Locates Assemblies with Weak Names

Weakly named assemblies can be loaded only from an application domain's ApplicationBase or a subdirectory thereof. As a result, the CLR's rules for finding such an assembly are relatively straightforward. The CLR follows two steps when resolving a reference to an assembly with a simple name:

1.	Look for a codebase in the application configuration file.
2.	Probe for the assembly in the ApplicationBase and its subdirectories.

Step 1 rarely applies when loading add-ins into extensible applications. This is primarily because authoring a configuration file to specify a location for an assembly requires up-front knowledge that such an assembly will be loaded. As I mentioned, this isn't the case with extensible applications because the add-ins are typically loaded dynamically. So the only way to use a configuration file to specify a codebase in this case is if somehow the configuration file was shipped along with the add-in and you set the ConfigurationFile property of your application to use it. Although possible, this scenario is unlikely to occur in practice.

Given that step 1 isn't likely to apply, loading weakly named add-ins into extensible applications typically involves looking for the assembly in the ApplicationBase and its subdirectories. This process, termed probing, is described in detail in Chapter 6. Remember, too, that weakly named assemblies are loaded by name only; no other elements of the assembly name, such as the assembly's version, are checked.
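As a rough illustration of probing, the sketch below lists the file locations a probe for a culture-neutral assembly might try. It is a simplification under stated assumptions: private bin paths and culture subdirectories are ignored, the ordering of .dll before .exe is an assumption, and ProbingSketch and CandidatePaths are hypothetical names rather than CLR APIs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ProbingSketch
{
    // Returns candidate file paths for a friendly name, in the assumed probe order:
    // <base>/<name>.dll, <base>/<name>/<name>.dll, then the same pair for .exe.
    public static List<string> CandidatePaths(string applicationBase, string friendlyName)
    {
        List<string> candidates = new List<string>();
        foreach (string extension in new[] { ".dll", ".exe" })
        {
            string fileName = friendlyName + extension;
            candidates.Add(Path.Combine(applicationBase, fileName));
            candidates.Add(Path.Combine(applicationBase, friendlyName, fileName));
        }
        return candidates;
    }

    public static void Main()
    {
        // Hypothetical ApplicationBase used only for display.
        foreach (string path in CandidatePaths("C:/BoatRaceHost", "TeamNZ"))
        {
            Console.WriteLine(path);
        }
    }
}
```

Because only the friendly name drives these paths, any file with a matching name in one of these locations can satisfy the reference.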

Note

It is possible to end up loading a strongly named assembly given a reference that appears to be to a weakly named assembly. This happens if you have a strong-named assembly deployed somewhere in your ApplicationBase directory structure whose friendly name matches the name you are referencing using one of the assembly loading APIs. For example, given the following reference:

Assembly a = Assembly.Load("Alingi");

The CLR will load the first file it finds named alingi.dll, regardless of whether it has a strong name or a weak name. If the assembly it loads has a strong name, the CLR essentially starts over by taking the identity of the assembly it found and treating that identity as a strong-name reference to resolve. In the next section I describe the steps involved in loading a strong-named assembly. A strong-named assembly could get loaded given the preceding reference because that reference is partial: no value is supplied for the public key token. This situation would not occur in scenarios in which the reference is fully specified, such as when an assembly is referenced in an early-bound fashion. If you want to be sure that only a weakly named assembly is loaded in this case, you must specify a null public key token like this:

Assembly a = Assembly.Load("Alingi, PublicKeyToken=null");
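If you use a partial reference and want to know afterward which kind of assembly you ended up with, you can inspect the identity of the loaded assembly. The following is a minimal sketch; HasStrongName is a hypothetical helper, and the core library stands in for a loaded add-in.

```csharp
using System;
using System.Reflection;

public static class StrongNameCheck
{
    // An identity carries a strong name only if it has a non-empty public key token.
    public static bool HasStrongName(AssemblyName name)
    {
        byte[] token = name.GetPublicKeyToken();
        return token != null && token.Length > 0;
    }

    public static void Main()
    {
        // In an extensible application you would pass the result of Assembly.Load
        // here; the core library is used so the sketch runs without an add-in.
        Assembly loaded = typeof(object).Assembly;
        Console.WriteLine(loaded.GetName().Name + " strong-named: " + HasStrongName(loaded.GetName()));
    }
}
```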

How the CLR Locates Assemblies with Strong Names

The process of loading a strongly named assembly is much more involved because of the potential for version policy and the existence of the GAC. The CLR takes the following steps to resolve a reference to a strong-named assembly:

1.	Determine which version of the assembly to load.
2.	Look for the assembly in the GAC.
3.	Look in the configuration files for any codebase locations.
4.	Probe for the assembly in the ApplicationBase and its subdirectories.

When loading a strongly named assembly, by default the CLR loads the version you specify in your reference. However, as described earlier in the chapter, that version can be redirected to another version of the same assembly by one of the three levels of version policy: application, publisher, or administrator. The first step the CLR takes in resolving a reference to a strongly named assembly is to compare the identity specified in the reference to the binding redirect statements in the three version policy files to determine whether an alternative version of the assembly should be loaded.

Next, the CLR looks for the assembly in the GAC. The CLR always prefers to load strong-named assemblies from the GAC, primarily for performance reasons. There are two reasons why loading from the GAC is better for overall system performance. First, if several applications are using the same strong-named assembly, loading the assembly from the same location on disk uses less memory than if each application were to load the same DLL from private locations. When a DLL is loaded from the same location multiple times, the operating system loads the DLL's read-only pages only once and shares them among all instances.

The second reason is related to how an assembly's strong name is verified. Recall that a strong name involves a cryptographic signature. This signature must be verified to guarantee that the assembly hasn't been altered since it was built. Verifying a cryptographic signature involves computing a hash of the entire contents of the file and other mathematically intense operations. As a result, it's best to verify the signature at a time when its cost will be noticed the least (without compromising security, of course). An assembly's strong name is verified during installation into the GAC. The cache is considered secure, so once the assembly has been successfully installed, its signature doesn't have to be reverified. In contrast, because assemblies placed elsewhere in the file system (such as in an ApplicationBase directory) aren't explicitly installed into a secure location, their strong-name signatures must be verified every time the assembly is loaded. By loading from the GAC, the CLR attempts to reduce the number of times these cryptographic signatures must be verified.

If an assembly cannot be found in the GAC, the CLR next looks to see whether a codebase location for the assembly has been provided in any of the configuration files. If such a location is found, the CLR uses it. If not, the CLR probes in the ApplicationBase directory just as it does for simply named assemblies.
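The ordering of the four steps can be summarized with a small in-memory model. This is purely an illustrative sketch and not how the CLR is implemented: every type and member name is hypothetical, and the GAC, policy, and codebase data are plain collections that you populate yourself.

```csharp
using System;
using System.Collections.Generic;

public sealed class StrongNameResolutionModel
{
    // Keys and entries are "name version" strings, for example "Alingi 5.0.0.0".
    public Dictionary<string, string> Redirects = new Dictionary<string, string>();
    public HashSet<string> Gac = new HashSet<string>();
    public Dictionary<string, string> Codebases = new Dictionary<string, string>();
    public HashSet<string> ApplicationBase = new HashSet<string>();

    public string Resolve(string name, string version)
    {
        // Step 1: apply version policy to decide which version to load.
        string redirected;
        if (Redirects.TryGetValue(name + " " + version, out redirected))
        {
            version = redirected;
        }
        string identity = name + " " + version;

        // Step 2: prefer the GAC.
        if (Gac.Contains(identity)) return "GAC: " + identity;

        // Step 3: use a codebase location from configuration, if one exists.
        string location;
        if (Codebases.TryGetValue(identity, out location)) return "Codebase: " + location;

        // Step 4: fall back to probing the ApplicationBase.
        if (ApplicationBase.Contains(identity)) return "ApplicationBase: " + identity;

        return "Not found: " + identity;
    }
}

public static class ResolutionDemo
{
    public static void Main()
    {
        StrongNameResolutionModel model = new StrongNameResolutionModel();
        model.Redirects["Alingi 5.0.0.0"] = "6.0.0.0"; // sample redirect
        model.Gac.Add("Alingi 6.0.0.0");               // sample GAC content
        model.ApplicationBase.Add("Alingi 5.0.0.0");   // private copy of the old version
        Console.WriteLine(model.Resolve("Alingi", "5.0.0.0"));
    }
}
```

With this sample data the private copy of version 5.0.0.0 is never considered, because policy changes the version before any location is searched.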

Using System.Reflection.Assembly to Determine an Assembly's Location on Disk

Once an assembly has been loaded, you can use the CodeBase, Location, and GlobalAssemblyCache properties of the Assembly class to determine information about where the CLR found it.

The Location and CodeBase properties are very similar in that they both provide information about the physical file from which the assembly was loaded. In fact, these two properties have the same value when an assembly is loaded from the local computer's disk into an application domain that does not have shadow copy enabled. In this scenario, these two properties simply give you the name of the physical file on the local disk from which the assembly was loaded. The CodeBase and Location properties differ in two scenarios, however. First, if the assembly was downloaded from an HTTP server, the CodeBase property gives the location of the file on the remote server, whereas the Location property gives the location of the file in the downloaded files cache on the local machine. These properties also have different values if an assembly is loaded into an application domain in which shadow copy is enabled. In this scenario, CodeBase gives you the original location of the file, whereas Location tells you the location to which the file was shadow copied. See Chapter 6 for more information about how to enable shadow copy for the application domains you create.

Note

The Assembly class also has a property called EscapedCodeBase that gives you the same pathname as CodeBase, except the value returned has the original escape characters.

The GlobalAssemblyCache property is a boolean value that tells you whether the CLR loaded the assembly from the GAC.
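The following minimal sketch prints all three properties for an assembly that is already loaded. The class and method names are hypothetical. Note that on newer .NET runtimes (an assumption beyond the .NET Framework setting of this chapter) CodeBase and GlobalAssemblyCache are marked obsolete and produce compiler warnings.

```csharp
using System;
using System.Reflection;

public static class AssemblyLocationDemo
{
    // Collects the location-related properties of a loaded assembly into one string.
    public static string Describe(Assembly assembly)
    {
        return "Location: " + assembly.Location + Environment.NewLine
             + "CodeBase: " + assembly.CodeBase + Environment.NewLine
             + "Loaded from GAC: " + assembly.GlobalAssemblyCache;
    }

    public static void Main()
    {
        // Any loaded assembly works; the core library is always available.
        Console.WriteLine(Describe(typeof(object).Assembly));
    }
}
```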

Using Fuslogvw.exe to Understand How Assemblies Are Located

The .NET Framework SDK includes a tool called the Assembly Binding Log Viewer (fuslogvw.exe) that is great not only for diagnosing errors encountered when loading assemblies, but also for understanding the assembly loading process in general. Fuslogvw.exe works by logging each step the CLR completes when resolving a reference to an assembly. These logs are written to .html files that can be viewed using the fuslogvw.exe user interface. The logging is turned off by default because of the expense involved in generating the log files. You can turn on logging in one of two modes: you can choose to log every attempt to load an assembly or log only those attempts that fail. Logging is enabled using the Settings dialog box from the fuslogvw.exe user interface as shown in Figure 7-7.

Figure 7-7. Enabling logging using fuslogvw.exe

Take a look at how the output generated by fuslogvw.exe helps you understand how the CLR locates assemblies. After turning logging on, I ran boatracehost.exe and had it load an add-in from an assembly called TeamNZ. In this simple example, fuslogvw.exe logged that we attempted to load three assemblies as shown in Figure 7-8.

Figure 7-8. Fuslogvw.exe after running boatracehost.exe

Double-clicking the row labeled TeamNZ displays the log generated while the CLR resolved the reference to that assembly. The log text is as follows:

0| *** Assembly Binder Log Entry (4/2/2004 @ 4:30:15 PM) ***
1| The operation was successful.
2| Bind result: hr = 0x0. The operation completed successfully.
3| Assembly manager loaded from:
   C:\WINDOWS\Microsoft.NET\Framework\v2.0.40301\mscorwks.dll
4| Running under executable
   C:\Program Files\BoatRaceHost\BoatRaceHost\bin\Debug\BoatRaceHost.exe
5| --- A detailed error log follows.
6| === Pre-bind state information ===
7| LOG: DisplayName = TeamNZ  (Partial)
8| LOG: Appbase = file:///C:/Program Files/BoatRaceHost/BoatRaceHost/bin/Debug/
9| LOG: Initial PrivatePath = NULL
10| LOG: Dynamic Base = NULL
11| LOG: Cache Base = NULL
12| LOG: AppName = BoatRaceHost.exe
13| Calling assembly : BoatRaceHost, Version=1.0.1553.29684, Culture=neutral,
    PublicKeyToken=null.
    ===
14| LOG: Attempting application configuration file download.
15| LOG: Download of application configuration file was attempted from
    file:///C:/Program Files/BoatRaceHost/BoatRaceHost/bin/
    Debug/BoatRaceHost.exe.config.
16| LOG: Application configuration file does not exist.
17| LOG: Using machine configuration file from
    C:\WINDOWS\Microsoft.NET\Framework\v2.0.40301\config\machine.config.
18| LOG: Policy not being applied to reference at this time (private, custom,
    partial, or location-based assembly bind).
19| LOG: Attempting download of new URL
    file:///C:/Program Files/BoatRaceHost/BoatRaceHost/bin/Debug/TeamNZ.dll.
20| LOG: Attempting download of new URL
    file:///C:/Program Files/BoatRaceHost/BoatRaceHost/bin/
    Debug/TeamNZ/TeamNZ.dll.
21| LOG: Assembly download was successful. Attempting setup of file:
    C:\Program Files\BoatRaceHost\BoatRaceHost\bin\Debug\TeamNZ\TeamNZ.dll
22| LOG: Entering run-from-source setup phase.
23| LOG: A partially-specified assembly bind succeeded from the application
    directory. Need to re-apply policy.
24| LOG: Policy not being applied to reference at this time (private, custom,
    partial, or location-based assembly bind).

I annotated the log text with line numbers so we can step through this in detail.

Lines 1-2 show whether the attempt to load the assembly succeeded. In error conditions, you can look up the HRESULT in the corerror.h file in the .NET Framework SDK to help determine what went wrong. However, the rest of the log explains the failure in detail.

Line 3 shows the directory from which the CLR was loaded. You can use this to determine which version of the CLR was running when this assembly bind was attempted.

Line 4 displays the name of the executable that initiated the assembly load. In our case, the executable is boatracehost.exe.

Line 7 shows the identity of the assembly we are trying to load. In late-bound cases such as this, this is the assembly identity that was passed to the assembly loading APIs. In addition to providing the identity, line 7 tells you whether the reference is partial or fully specified. This particular reference is partial. It was initiated with a simple call to Assembly.Load such as this:

Assembly a = Assembly.Load("TeamNZ");

Line 8 displays the ApplicationBase directory for the application domain in which the assembly load was initiated.

Lines 9-12 show some of the application domain properties that can affect how assemblies are loaded. These properties are covered in Chapter 6.

Line 13 gives the name of the assembly from which this assembly load was made. This information is useful for debugging in cases in which you might make the same attempt to load an assembly in several places throughout your application.

Lines 14-16 show the CLR attempting to find the configuration file associated with the application domain making the request. As described, this configuration file is consulted both for version policy information and for codebase locations.

Line 17 gives the location of the administrator configuration file. Again, this file can contain either version policy or codebase information.

Line 18 states that version policy is not being applied to this reference. In our case, version policy isn't being applied because we have a partial reference. I discuss the output generated when resolving a fully qualified reference to a strong-named assembly in a bit. Version policy gets applied in that example.

Lines 19-22 show how the CLR probes for the assembly in the ApplicationBase directory. In this example, you can see that the first attempt to find the assembly failed, but the second one succeeded. The statement "Entering run-from-source setup phase" means that the CLR is loading the assembly directly from its location on disk. In contrast, if the assembly were located on an HTTP server, it would have to be downloaded first before it could be loaded.

Lines 23-24 state that the assembly was loaded from the ApplicationBase and that version policy is not being applied. In our case, version policy isn't being applied because the assembly that was found has a weak name. If we had happened to load a strong-named assembly from the ApplicationBase, the CLR would look at the identity of the assembly that was loaded and go back and reapply version policy to determine whether a different version of the assembly should be loaded. If so, it would start the process of finding the assembly over again with the new reference.

You will see two primary differences in the log when you load a strong-named assembly: version policy is applied to the reference, and the CLR looks in the GAC, as shown in the following output from fuslogvw.exe:

0| *** Assembly Binder Log Entry (4/4/2004 @ 12:27:26 PM) ***
1| The operation was successful.
2| Bind result: hr = 0x0. The operation completed successfully.
3| Assembly manager loaded from:
   C:\WINDOWS\Microsoft.NET\Framework\v2.0.40301\mscorwks.dll
4| Running under executable
   C:\Program Files\BoatRaceHost\BoatRaceHost\bin\Debug\BoatRaceHost.exe
5| --- A detailed error log follows.
6| === Pre-bind state information ===
7| LOG: DisplayName = Alingi, Version=5.0.0.0, Culture=neutral,
   PublicKeyToken=ae4cc5eda5032777
   (Fully specified)
8| LOG: Appbase = file:///C:/Program Files/BoatRaceHost/BoatRaceHost/bin/Debug/
9| LOG: Initial PrivatePath = NULL
10| LOG: Dynamic Base = NULL
11| LOG: Cache Base = NULL
12| LOG: AppName = BoatRaceHost.exe
13| Calling assembly : BoatRaceHost, Version=1.0.1555.20566, Culture=neutral,
    PublicKeyToken=null.
    ===
14| LOG: Attempting application configuration file download.
15| LOG: Download of application configuration file was attempted from
    file:///C:/Program Files/BoatRaceHost/BoatRaceHost/bin/
    Debug/BoatRaceHost.exe.config.
16| LOG: Application configuration file does not exist.
17| LOG: Using machine configuration file from
    C:\WINDOWS\Microsoft.NET\Framework\v2.0.40301\config\machine.config.
18| LOG: No redirect found in host configuration file.
19| LOG: Machine configuration policy file redirect found: 5.0.0.0 redirected
    to 6.0.0.0.
20| LOG: Post-policy reference: Alingi, Version=6.0.0.0, Culture=neutral,
    PublicKeyToken=ae4cc5eda5032777
21| LOG: Found assembly by looking in the GAC.

In this example, I used the .NET Framework Configuration tool to specify machine-level version policy to redirect the version of the assembly I'm referencing from 5.0.0.0 to 6.0.0.0. The differences between this assembly load and the previous one are shown in lines 7, 19, 20, and 21.

Line 7 shows that the reference is fully specified. Values are supplied for all four parts of the assembly's name.

Line 19 shows that the CLR found my version policy statement in the machine configuration file.

Line 20 shows my reference after policy has been applied. Notice that the CLR is now looking for version 6.0.0.0 of Alingi.

Line 21 shows that the assembly was found in the GAC.

As you can see, stepping through the logs generated by fuslogvw.exe removes the mystery behind how the CLR locates assemblies. Fuslogvw.exe has several other options I haven't discussed here. See the .NET Framework SDK documentation for more details.

Common Assembly Loading Exceptions

Failures to load assemblies typically show up in your application as one of three types of exceptions:

System.IO.FileNotFoundException The FileNotFoundException is thrown when the assembly you specify in your reference cannot be found by the CLR.

System.IO.FileLoadException As discussed earlier in this chapter, the FileLoadException is thrown when the CLR encounters an error while parsing the assembly name string you passed to one of the assembly loading APIs. This exception is also thrown when the CLR finds an assembly to load, but the assembly it finds doesn't match all of the criteria specified in the reference. This scenario occurs most often when resolving partial references to strong-named assemblies located in the ApplicationBase directory structure. For example, given the following reference:

Assembly a = Assembly.Load("Alingi, PublicKeyToken=45d39a21bc3ff098");

the CLR will load the first file named alingi.dll it finds in the ApplicationBase directory structure. If the assembly it loads has a public key other than the one specified by the PublicKeyToken value in the reference, the CLR will throw a FileLoadException stating that the assembly it found doesn't match the reference.

Note

The GAC is not searched in this case because the reference is partial. I explain more about how the CLR resolves partial references such as this later in the chapter (see "Partially Specified Assembly References").

System.BadImageFormatException If the CLR finds a file to load, but the file is not a managed code assembly, a BadImageFormatException is thrown. This doesn't happen often, but could occur if you have a native code file in your ApplicationBase directory structure with a filename matching that of an assembly you are referencing. More commonly, this exception occurs when loading an assembly by a filename as discussed later in the "Loading Assemblies by Filename" section.

All three exceptions have a string property called FusionLog that contains the text of a log file like those you viewed earlier in the discussion of fuslogvw.exe. In this way, you get the diagnostic information about why your call to the assembly loading APIs failed without having to enable logging using the fuslogvw.exe user interface.
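A minimal sketch of handling these three exceptions and reading FusionLog might look like the following. The helper name TryLoad and the assembly name passed to it are hypothetical; the log text may be empty depending on the runtime and its logging settings.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class LoadDiagnostics
{
    // Attempts a load and reports which of the three exceptions, if any, occurred.
    public static string TryLoad(string assemblyName)
    {
        try
        {
            Assembly a = Assembly.Load(assemblyName);
            return "Loaded: " + a.FullName;
        }
        catch (FileNotFoundException e)
        {
            // The CLR could not find the assembly anywhere it looked.
            return "FileNotFoundException" + FormatLog(e.FusionLog);
        }
        catch (FileLoadException e)
        {
            // The name could not be parsed, or the file found didn't match the reference.
            return "FileLoadException" + FormatLog(e.FusionLog);
        }
        catch (BadImageFormatException e)
        {
            // A file was found, but it is not a managed code assembly.
            return "BadImageFormatException" + FormatLog(e.FusionLog);
        }
    }

    private static string FormatLog(string fusionLog)
    {
        return string.IsNullOrEmpty(fusionLog) ? "" : Environment.NewLine + fusionLog;
    }

    public static void Main()
    {
        // Hypothetical add-in name that is not expected to exist.
        Console.WriteLine(TryLoad("NoSuchAddIn_Hypothetical"));
    }
}
```

Catching the three types separately lets a host report to the user whether an add-in is missing, mismatched, or not a managed assembly at all.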

Partially Specified Assembly References

As described, only the assembly's friendly name is required when you're using late-bound references. Values for the public key token, version, and culture can be omitted. Such partially specified assembly references are convenient to use, especially when your intent is to load weakly named assemblies, regardless of version, from your application directory. To do so, all you need to do is call Assembly.Load with the assembly's friendly name as I've done several times throughout this chapter:

Assembly a = Assembly.Load("TeamNZ");

However, a few complexities might cause you to load an assembly unintentionally. As always, you can use the fuslogvw.exe tool to find out exactly what's going on.

The following points summarize how the CLR treats a partially specified reference:

A partially specified reference always causes the ApplicationBase directory structure to be searched first. Searching never starts with the GAC. However, if a strong-named assembly is found in the ApplicationBase as a result of a partial reference, the CLR opens the file and extracts the strong-named assembly's full identity. That identity then essentially is treated as a fully specified reference in that the CLR follows all the steps described earlier when looking for a strongly named assembly. Specifically, the CLR will evaluate version policy, look back in the GAC, and so on. If no policy is found, and the assembly is not found in the GAC, the file from the ApplicationBase is loaded. This is another example of how the CLR prefers to load an assembly from the GAC if possible.
If a public key token is specified in addition to the assembly's friendly name, the value you specify is checked against any assemblies found in the ApplicationBase directory structure. If the keys don't match, the CLR throws a FileLoadException stating that the identity of the assembly found didn't match the reference.
If a version is specified in addition to the assembly's friendly name, the behavior is different depending on whether the assembly found in the ApplicationBase has a strong name or a weak name. If the assembly has a strong name, the version number in the reference must match that of the assembly that is loaded. If not, a FileLoadException is thrown. If the assembly that is found has a weak name, the version number is not checked; the assembly is loaded regardless of version.
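Before calling the assembly loading APIs, it can be useful to check which parts of a reference string are missing, since that determines which of the rules above apply. The sketch below parses a reference with AssemblyName and reports the omitted parts; ReferenceInspector and MissingParts are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class ReferenceInspector
{
    // Lists the identity parts that a reference string leaves unspecified.
    public static List<string> MissingParts(string reference)
    {
        AssemblyName name = new AssemblyName(reference);
        List<string> missing = new List<string>();
        if (name.Version == null) missing.Add("Version");
        if (name.CultureInfo == null) missing.Add("Culture");
        // A null token means the part was omitted; "PublicKeyToken=null" in the
        // string is an explicit value and is not reported as missing here.
        if (name.GetPublicKeyToken() == null) missing.Add("PublicKeyToken");
        return missing;
    }

    public static void Main()
    {
        string[] references =
        {
            "TeamNZ",
            "Alingi, Version=5.0.0.0, Culture=neutral, PublicKeyToken=ae4cc5eda5032777"
        };
        foreach (string reference in references)
        {
            List<string> missing = MissingParts(reference);
            string status = missing.Count == 0
                ? "fully specified"
                : "partial, missing " + string.Join(", ", missing.ToArray());
            Console.WriteLine(reference + " -> " + status);
        }
    }
}
```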

