Assemblies | Introducing Microsoft .NET (Pro-Developer)

The .NET Framework makes extensive use of assemblies for .NET code, resources, and metadata. All code that the .NET common language runtime executes must reside in an assembly. In addition, all security, namespace resolution, and versioning features work on a per-assembly basis. Since assemblies are used so often and for so many different things, I need to discuss assemblies in some detail.

.NET makes extensive use of a new packaging unit called an assembly.

Concept of an Assembly

An assembly is a logical collection of one or more EXE or DLL files containing an application’s code and resources. An assembly also contains a manifest, which is a metadata description of the code and resources “inside” the assembly. (I’ll explain those quotes in a second.) An assembly can be, and often is, a single file, either an EXE or a DLL, as shown in Figure 2-4.

Figure 2-4: Single-file and multifile assemblies.

When we built the simple example of a time server earlier in this chapter, the DLL that our compiler produced was actually a single-file assembly, and the EXE client application that we built in that example was another one. When you use tools such as Visual Studio .NET, each project will most likely correspond to a single assembly.

Our simple example produced two single-file assemblies.

Although an assembly often resides in a single file, it also can be, and often is, a logical, not a physical, collection of more than one file residing in the same directory, also shown in Figure 2-4. The manifest specifying the files that make up the assembly can reside in one of the code-containing EXEs or DLLs of the assembly, or it can live in a separate EXE or DLL that contains nothing but the manifest. When dealing with a multifile assembly, you must remember that the files are not tied together by the file system in any way. It is entirely up to you to ensure that the files called out in the manifest are actually present when the loader comes looking for them. The only thing that makes them part of the assembly is that they are mentioned in the manifest. In this case, the term assembly, with its connotation of metal parts bolted together, is not the best term. Perhaps “roster” might be a better one. That’s why I put quotes around the term “inside” the assembly a few paragraphs ago. You add and remove files from a multifile assembly using the command line SDK utility program AL.exe, the Microsoft Assembly Linker.

An assembly can also be a logical collection of more than one file.

You can view the manifest of an assembly using the IL Disassembler (ILDASM.exe). Figure 2-5 shows the manifest of our time component. You can see that it lists the external assemblies on which this assembly depends. In this case, we depend on mscorlib.dll, the main .NET common language runtime DLL, and on an assembly called Microsoft.VisualBasic, which contains Visual Basic’s internal functions such as Now. It also lists the assembly names that we provide to the world, in this case, TimeComponent.

You can view an assembly’s manifest with ILDASM.exe.

Figure 2-5: Assembly manifest of our sample time component.

In addition to the code objects exposed by and required by the assembly, the manifest also contains information that describes the assembly itself. For example, it contains the assembly’s version information, expressed in a standardized format described later in this section. It can also describe the culture (fancy name for human language and sublanguage, say, Australian English) for which the assembly is written. In the case of a shared assembly, of which more anon, the manifest also contains a public cryptographic key, which is used to ensure that the assembly can be distinguished from all other assemblies regardless of its filename. You can even add your own custom attributes to the manifest, which the common language runtime will ignore, but which your own applications can read and use. You set manifest attributes with the Assembly Linker mentioned previously or with Visual Studio.

Assemblies and Deployment

The central question in dividing your code among assemblies is whether the code inside the assembly is intended solely for your own application’s use or will be shared with any other application that wants it. Microsoft .NET supports both options, but it requires more footwork in the latter case. In the case of code that you write for your own applications, say, the calculation engine for a complex financial instrument, you’d probably want to make the assembly private. On the other hand, a general utility object that could reasonably be used by many applications—a file compression engine, for example—might be more widely used if you make it shared.

You need to think carefully about whether your assemblies should be private or shared.

Suppose you want your assemblies to be private. The .NET model couldn’t be simpler. In fact, that’s exactly what I did in the simplest example shown previously. You just build a simple DLL assembly, and copy it to the directory of the client assembly that uses it or to a subdirectory of that client. You don’t have to make any entries in the system registry or Active Directory as you had to do when using COM components. None of the code will change unless you change it, so you will never encounter the all-too-familiar situation in which a shared DLL changes versions up or down and your app breaks for no apparent reason.

Assemblies can be private to an application, which simplifies your life in certain cases.

The obvious problem with this approach is the proliferation of assemblies, which was the problem DLLs were originally created to solve back in Windows 1.0. If every application that uses, say, a text box, needs its own copy of the DLL containing it, you’ll have assemblies breeding like bacteria all over your computer. Jeffrey Richter argued (in MSDN Magazine, March 2001) that this isn’t a problem. With 40 gigabyte hard drives selling for under $200 (then; today for $200 you can get 200 GB), everyone can afford all the disk space they need, so most assemblies should be private; that way your application will never break from someone else messing with shared code. That’s like an emergency room doctor saying that the world would be a far better place if people didn’t drink to excess or take illegal drugs. They’re both absolutely right, but neither’s vision is going to happen any time soon in the real world. Richter’s idea is practical for developers, who usually get big, fast PCs, but a customer with a large installed base of two-year-old PCs that it can’t junk or justify upgrading at that point in its budget cycle isn’t going to buy that argument or bigger disks. Fairly soon in your development process, you will need to share an assembly among several applications, and you want the .NET Framework to help you do that.

However, sometimes you want the code in assemblies to be shared.

The .NET Framework allows you to share assemblies by placing them in the global assembly cache (GAC, pronounced like the cartoon exclamation). This is a directory on your machine, currently \winnt\assembly or \windows\assembly, in which all shared assemblies are required to live. You can place assemblies into the cache, view their properties, and remove them from the cache using a .NET Framework SDK command line utility called GACUTIL.exe, which works well when run from scripts and batch files. Most human users will prefer to use the Assembly Cache Viewer, which is a shell extension that installs with the .NET Framework SDK. It automatically snaps into Windows Explorer and provides you with the view of the GAC shown in Figure 2-6.

Shared assemblies live in the global assembly cache, administered by a number of tools.

Figure 2-6: Global assembly cache viewer.

Whenever you share any type of computer file, you run up against the problem of name collisions. Because all .NET shared assemblies have to go in the GAC so that they can be managed, we need some way of definitively providing unique names for all the code files that live there, even if their original file names were the same. This is done with a strong name, otherwise known as a shared name. A strong name uses public key cryptography to transparently produce a name that is guaranteed to be unique among all assemblies in the system. The manifest of a shared assembly contains the public key of a public/private key pair. The combination of the file’s name, version, and an excerpt from this public key is the strong name.

Shared assemblies use public key cryptography to ensure that their names are unique.

Suppose we want to write a shared assembly that lives in the GAC. I’ve switched to Visual Studio .NET for this example, both to demonstrate it and because I find it easier to operate than the command line tools. I’ve written a different .NET component that does the same thing as our simplest time example, except that it adds the version number to its returned time string. Once I build the component, I need to generate and assign a strong name for it, also known as signing the component. Visual Studio .NET can be configured to do this automatically if you provide a file containing the public/private key pair. You generate this file with the SDK command line utility program SN.exe. You tell Visual Studio about the key file by specifying the filename in the AssemblyInfo.vb file in the project, as shown in Listing 2-3. When I build the component, Visual Studio .NET signs it automatically. I then manually put it in the GAC by using Windows Explorer.

This paragraph contains instructions for generating a shared assembly.

Listing 2-3: AssemblyInfo.vb file entry specifying key pair for generating strong name.

<Assembly: AssemblyKeyFileAttribute("..\..\mykeys.snk")>

I’ve also provided a client that uses the shared assembly. I tell Visual Studio to generate a reference to the server DLL by right-clicking on the References folder in Solution Explorer, selecting Add Reference to open the Add Reference dialog box (shown in Figure 2-7), and then clicking Browse and surfing over to the shared assembly file that resides in a standard directory. Visual Studio generates a reference accessing that assembly.

This paragraph contains instructions for writing a client that uses an object from the GAC.

Figure 2-7: Adding a reference to a shared component.

Visual Studio cannot currently (version 2003) add a reference to an assembly in the GAC, although this feature has been proposed for a future release. This limitation arose because in the first version of Visual Studio .NET, the GAC’s design hadn’t yet stabilized by the time the developers needed to design their reference mechanism. (Why they haven’t fixed this in Visual Studio .NET 2003 isn’t clear.) Therefore, unless they’re building client and server together as part of the same project, developers must install two copies of their components, one in a standard directory to compile against and another in the GAC for their clients to run against. Users will require only the latter. When you add a reference to an assembly marked with a strong name, Visual Studio automatically sets the CopyLocal property of the newly added reference to False, thereby telling Visual Studio that you don’t want it to make a local copy. It figures that, since the assembly has a strong name and is therefore able to go into the GAC, you probably want to run with the GAC copy.

As an added benefit of the public key cryptography scheme used for signing shared assemblies, we also gain a check on the integrity of the assembly file. The assembly generator performs a hashing operation on the contents of the files contained in the manifest. It then encrypts the result of this hash using our private key and stores the encrypted result in the manifest. When the loader fetches an assembly from the GAC, it performs the same hashing algorithm on the assembly’s file or files, decrypts the manifest’s stored hash using the public key, and compares the two. If they match, the loader knows that the assembly’s files haven’t been tampered with. This doesn’t get you any real identity checking because you can’t be sure whose public key it really is, but it does guarantee that the assembly hasn’t been tampered with since it was signed.
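The hash, sign, and verify sequence described above can be modeled in a few lines of Python. This is only a sketch of the idea, not the runtime's implementation: the key pair is a toy built from two small known primes, the choice of SHA-1 and the function names (`sign`, `verify`, `digest_as_int`) are assumptions made for illustration, and a real strong name uses a key pair generated by SN.exe.

```python
import hashlib

# Toy key pair, for illustration only (far too small for real use).
# Both factors are known Mersenne primes.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest_as_int(file_contents):
    """Hash the assembly's files in order and reduce the digest modulo N."""
    h = hashlib.sha1()
    for data in file_contents:
        h.update(data)
    return int.from_bytes(h.digest(), "big") % N


def sign(file_contents):
    """Build time: transform the hash with the private key."""
    return pow(digest_as_int(file_contents), D, N)


def verify(file_contents, signature):
    """Load time: recover the stored hash with the public key and compare
    it with a fresh hash of the files actually found on disk."""
    return pow(signature, E, N) == digest_as_int(file_contents)


original_files = [b"compiled code", b"resources"]
stored_signature = sign(original_files)          # goes into the manifest
print(verify(original_files, stored_signature))  # unchanged files
print(verify([b"compiled code", b"patched"], stored_signature))  # tampered
```

Note that `verify` needs only the public values `N` and `E`, which is why the check proves the files are unchanged since signing but says nothing about who did the signing.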

The public/private key algorithm also provides a check on the integrity of the assembly’s files.

Assemblies and Versioning

Dealing with changes to published code has historically been an enormous problem, often known as DLL Hell. Replacing a DLL used by an existing client with a newer version bit you two ways, coming and going. First, the new code sometimes broke existing applications that depended on the original version. As hard as you try to make new code backward compatible with the old, you can never know or test everything that anyone was ever doing with it. It’s especially annoying when you update a DLL and don’t run the now-broken old client until a month later, when it’s very difficult to remember what you might have done that broke it. Second, updates come undone when installing an application copies an older DLL over a newer one that’s already on your computer, thereby breaking an existing client that depended on the newer behavior. It happens all the time, when an installation script says, “Target file xxx exists and is newer than the source. Copy anyway?” and 90 percent of the time the user picks Yes. This one’s especially maddening because someone else’s application caused the problem, but your app’s the one that won’t work, your tech support line is the one that receives expensive calls and bomb threats, and you’d better hope you haven’t sold any copies of the program to the Postal Service. Problems with versions cost an enormous amount of money in lost productivity and debugging time. Also, they keep people from buying upgrades or even trying them because they’re afraid the upgrade will kill something else, and they’re often right.

Versioning of code is an enormous, painful, unsexy problem.

Windows has so far ignored this versioning problem, forcing developers to deal with it piecemeal. There has never been, until .NET, any standardized way for a developer to specify desired versioning behavior and have the operating system enforce it. In .NET, Microsoft seems to have realized that this is a universal problem that can be solved only at an operating system level and has provided a system for managing different versions of code.

.NET finally incorporates some functionality for versioning.

Every assembly contains version information in its manifest. This information consists of a compatibility version, which is a set of four numbers used by the common language runtime loader to enforce the versioning behavior requested by a client. The compatibility version number consists of a major and minor version number, a build number, and a revision number. The development tools that produce an assembly put the version information into the manifest. Visual Studio .NET produces version numbers for its assemblies from values that you set in your project’s AssemblyInfo.vb (or .cs) file, as shown in Listing 2-4. Command line tools require complex switches to specify an assembly’s version. You can see the version number in the IL Disassembler at the bottom of Figure 2-8. You can also see it when you install the assembly in the GAC, as shown previously in Figure 2-6.

Each assembly contains information telling the runtime what version number it represents.

Listing 2-4: AssemblyInfo.vb file showing version of component assembly.

' Version information for an assembly consists of the following
' four values:
'
'      Major Version
'      Minor Version
'      Build Number
'      Revision
'
' You can specify all the values or you can default the Build and
' Revision Numbers by using the '*' as shown below:

<Assembly: AssemblyVersion("2.0.*")>

Figure 2-8: ILDASM showing version of a server component.

The manifest can also contain an informational version, which is a human readable string like “Microsoft .NET 1.1 April 2003.” The informational version is intended for display to human viewers and is ignored by the common language runtime.

As you’ve seen, a client assembly contains the names of the external assemblies on which it depends. It also contains the version numbers of these external assemblies, as you can see in Figure 2-9.

Every client assembly contains information about the versions it was built against.

Figure 2-9: ILDASM showing required version in a client.

When the client runs, the common language runtime looks to find the version that the client needs. The default versioning behavior requires the exact version against which the client was built; otherwise the load will fail. Since the GAC can contain different versions of the same assembly, as shown in Figure 2-6, you don’t have the problem of a new version breaking old clients, or an older version mistakenly replacing a new one. You can keep all the versions that you need in the GAC, and each client assembly will request and receive the one that it has been written and tested against.
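The default behavior can be pictured as a lookup keyed on both name and version. The Python sketch below is a hypothetical model, not the loader's actual algorithm: `ToyGac`, `parse_version`, and the file paths are invented for illustration. It shows why two versions can sit side by side without overwriting each other, and why a request for a version that isn't installed fails instead of silently falling back to a neighbor.

```python
from typing import Dict, Tuple

Version = Tuple[int, int, int, int]  # (major, minor, build, revision)


def parse_version(text: str) -> Version:
    """Parse a four-part version string such as '2.0.1234.5'."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated numbers, got {text!r}")
    major, minor, build, revision = (int(p) for p in parts)
    return (major, minor, build, revision)


class ToyGac:
    """Hypothetical stand-in for the GAC: a store keyed by (name, version)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, Version], str] = {}

    def install(self, name: str, version: str, path: str) -> None:
        # Different versions of the same name get different keys,
        # so installing one never overwrites another.
        self._store[(name, parse_version(version))] = path

    def load(self, name: str, requested: str) -> str:
        # Default policy: the exact requested version, or the load fails.
        key = (name, parse_version(requested))
        if key not in self._store:
            raise LookupError(f"{name} version {requested} is not installed")
        return self._store[key]


gac = ToyGac()
gac.install("SharedAssemblyComponentVB", "1.0.0.0", "v1/component.dll")
gac.install("SharedAssemblyComponentVB", "2.0.0.0", "v2/component.dll")

# A client built against 1.0.0.0 still gets 1.0.0.0 after 2.0.0.0 arrives.
print(gac.load("SharedAssemblyComponentVB", "1.0.0.0"))
```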

By default, a client requires the exact version of the server against which it was built.

Occasionally this exact-match versioning behavior isn’t what you want. You might discover a fatal defect, perhaps a security hole, in the original version of the server DLL, and need to direct the older clients to a new one immediately. Or maybe you find such a bug in the new server and have to roll the new clients back to use the old one. Rather than have to recompile all of your clients against the replacement version, as would be the case with a classic DLL, you can override the system’s default behavior through the use of configuration files.

You can override default versioning behavior by using configuration files.

The most common way to do this is with a publisher policy, which changes the versioning behavior for all clients of a GAC assembly. You set a publisher policy by making entries in the master configuration file machine.config, which holds the .NET administrative settings for your entire machine. Machine.config is an XML-based file, and you might be tempted to go at it with Notepad or your favorite XML editor. I strongly urge you to resist this temptation; wipe out one angle bracket by accident or get the capitalization wrong on just one name and your entire .NET installation may become unusable (shades of the registry, except no one used Notepad on that, at least not for long). Instead, use the .NET Framework Configuration utility mscorcfg.msc, shown in Figure 2-10, which comes with the .NET SDK. This utility allows you to view the GAC, similar to the Windows Explorer add-in I showed in Figure 2-6. In addition, it allows you to configure the behavior of assemblies in the GAC.

A publisher policy changes the versioning behavior of a GAC assembly for all its clients.

Figure 2-10: The .NET Framework Configuration utility.

You set a publisher policy by making your server assembly into a configured assembly, which is a GAC assembly for which a configuration file holds entries that change the assembly’s behavior from default GAC assembly behavior. You do this by right-clicking the Configured Assemblies tree item, selecting Add, and either entering a specific assembly or selecting the assembly you want from the list you’re offered. Once you have made the assembly into a configured assembly, you can then change its behavioral properties by right-clicking on the assembly in the right-hand pane and choosing Properties from the context menu. Figure 2-11 shows the resulting dialog box. You enter one or more binding policies, each of which consists of a set of one or more old versions that the assembly loader will map to exactly one new version. The configuration utility will write these entries into the configuration file in the proper format. When a client creates an object requesting one of the specified older versions, the loader checks the configuration file, detects the publisher policy, and automatically makes the substitution. You can also enter a codebase for an assembly, which tells the loader from where to download a requested version if it isn’t already present on the machine.

You set a publisher policy using the .NET Framework Configuration utility mscorcfg.msc.

Figure 2-11: Setting binding policies.

If you need to redirect all the clients of one object version to a different version instead of leaving the original version on the machine for its original clients, then the machine-wide substitution that I just described is probably what you want. Occasionally, however, you might need to tell one old client to use a newer version of its server without changing the behavior of other old clients. You can do this with an application configuration file, which is an XML-based configuration file that modifies the behavior of a single application. The name of this file is the full name of the application that it configures, including the extension, with the additional extension “.config” tacked onto the end (e.g., SharedAssemblyClientVB.exe.config). This file lives in the client application’s own directory. While masochists can produce it by hand, anyone who values her time will use the configuration utility on the client application itself. You add the client application to the Applications folder in the .NET Framework Configuration window. You must then add assemblies individually to the application’s Configured Assemblies section. You aren’t moving the assemblies anywhere; you are simply adding configuration information to the local application’s configuration file. The settings modify the default behavior of the assembly loader when accessed by this client only, even if the assembly lives in the GAC. When you set the configured assembly’s properties, you’ll see an option to allow you to ignore publisher policies. Selecting this option writes this information into your app configuration file, which will cause the loader to give you the app’s original versioning behavior regardless of publisher policies. You can also specify a different version redirection, pointing your client app to someplace completely different.

Just so you have an idea of what it looks like internally, Listing 2-5 shows the relevant portions of an application configuration file that ignores publisher policies and provides its own version redirection:

An individual application can override a publisher policy’s versioning behavior with its own configuration file.

Listing 2-5: Sample configuration file.

<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <publisherPolicy apply="no" />
      <dependentAssembly>
        <assemblyIdentity name="SharedAssemblyComponentVB"
                          publicKeyToken="496ed8bd1d362eb2" />
        <publisherPolicy apply="no" />
        <bindingRedirect oldVersion="1.0.0.0-1.9.9.9"
                         newVersion="2.0.0.0" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
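To make the effect of the bindingRedirect entry concrete, here is a rough Python model of how a requested version might be mapped through such a file. It is a sketch under stated assumptions, not the loader's real logic: `resolve_version` and `parse_version` are hypothetical helpers, the range is treated as inclusive at both ends with each of the four parts compared numerically, and publisher policy and the public key token are ignored.

```python
import xml.etree.ElementTree as ET

# Namespace used by the assemblyBinding element in Listing 2-5.
NS = {"asm": "urn:schemas-microsoft-com:asm.v1"}

CONFIG_XML = """
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="SharedAssemblyComponentVB"
                          publicKeyToken="496ed8bd1d362eb2" />
        <publisherPolicy apply="no" />
        <bindingRedirect oldVersion="1.0.0.0-1.9.9.9" newVersion="2.0.0.0" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
"""


def parse_version(text):
    """Turn '1.9.9.9' into (1, 9, 9, 9) so versions compare part by part."""
    return tuple(int(p) for p in text.split("."))


def resolve_version(config_xml, assembly_name, requested):
    """Return the version to load for `requested`, applying any redirect."""
    root = ET.fromstring(config_xml)
    path = "runtime/asm:assemblyBinding/asm:dependentAssembly"
    for dep in root.iterfind(path, NS):
        identity = dep.find("asm:assemblyIdentity", NS)
        if identity is None or identity.get("name") != assembly_name:
            continue
        for redirect in dep.iterfind("asm:bindingRedirect", NS):
            # oldVersion is either a single version or a "low-high" range.
            low, _, high = redirect.get("oldVersion").partition("-")
            high = high or low
            wanted = parse_version(requested)
            if parse_version(low) <= wanted <= parse_version(high):
                return redirect.get("newVersion")
    return requested  # no matching redirect: default exact-version request


print(resolve_version(CONFIG_XML, "SharedAssemblyComponentVB", "1.0.0.0"))
print(resolve_version(CONFIG_XML, "SharedAssemblyComponentVB", "3.0.0.0"))
```

Under this model a request for 1.10.0.0 falls outside the range, because 10 is greater than 9 when the minor number is compared numerically, so a range meant to cover every 1.x release would need a larger upper bound.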

Tips from the Trenches

My customers report that they often use the publisher policy to redirect all clients of a server version, but they almost never override the publisher policy with a private configuration file. “But what happens when we change the file but don’t change the version number?” they often ask. One word: Don’t. A new file means a new version. Stick to that.