Section 2.2. Packaging and Deployment: Assemblies | Programming .NET Components, 2nd Edition

2.2. Packaging and Deployment: Assemblies

.NET assemblies were developed to try to improve the ways previous technologies packaged and deployed components. To make the most of .NET assemblies, it's best to first understand the rationale behind them. Understanding the "why" will make the "how" a lot easier.

2.2.1. DLLs and COM Components

Microsoft's first two attempts at component technologies (first, raw DLLs exporting functions, and then later, COM components) used raw executable files for storing binary code. In COM, component developers compiled their source code, usually into a DLL or sometimes into an EXE, and then installed these executables on the customer's machine. Higher-level abstractions or logical attributes shared by all the DLLs had to be managed manually by both the component vendor and the client-side administrator. For example, even though all DLLs in a component-oriented application should be installed and uninstalled as one logical operation, developers had to either write installation programs to repeat the same registration code for every DLL used, or copy them one by one. Most companies didn't invest enough time in developing robust installation programs and procedures, and this in turn resulted in orphaned DLLs bloating the clients' machines after uninstallation. Even worse, after a new version was installed, the application might still try to use the older versions of the DLLs.

Another attribute that should have applied logically to all DLLs that were part of the same application was a version number. Imagine a particular vendor providing a set of interacting components in two DLLs, both labeled version 1.0. When a new version (1.1) of these components became available, the vendor had to manually update the version number of both DLLs to 1.1. A change to the version number of one DLL didn't trigger an automatic change in the other DLL, even though logically both were part of the same deployment unit.

A third logical attribute typically associated with a set of DLLs from the same vendor was their security credentials: what the DLLs were allowed to access, what the DLLs were allowed to share with other applications, and so on. The client application administrator needed to manage the way these components were trusted and had to repeat this process for all the DLLs, even though they shared the same security origin. Client-side developers and system administrators used clumsy tools such as DCOMCFG to manage these attributes in a fragile and error-prone manner.

Why not simply put all the components that logically comprise a single deployment unit into the same DLL? The answer is simple: doing so would result in monolithic applications, sacrificing many of the benefits of component-oriented programming. In contrast, when components are deployed in separate DLLs, the client application has to pay the time penalty for loading a DLL only when it requires its component. Moreover, the memory footprint of the components of an application is kept to a minimum, because only the DLLs actually used are kept in memory. If the client application needs to download the DLLs dynamically, it pays the download latency penalty only for those it requires.

2.2.2. .NET Assemblies

Clearly there's a need to separate the logical attributes shared by a set of components (such as version, security, and deployment) from their physical packaging (the file that actually contains each component), while avoiding the problems of traditional DLLs. The solution is the .NET concept of the assembly: a single deployment, versioning, and security unit. The assembly is the basic packaging unit in .NET. It's called an assembly because it assembles multiple physical files into a single logical unit. An assembly can be a class library (DLL) or a standalone application (EXE) and can contain multiple physical modules, each with multiple components. An assembly usually contains just one file (a single DLL or a single EXE), but it still offers the component developer significant versioning, sharing, and security advantages. These are described later in this book.

Think of an assembly as a logical library: a metafile that can contain more than one physical file (see Figure 2-2).

Figure 2-2. Assemblies as logical packaging units

The physical DLLs in an assembly are also referred to as modules. For example, in Figure 2-2, Assembly A contains a single module, while Assembly B contains two. The multi-module assembly option exists to support two scenarios. The first is a pay-as-you-go approach for assembly download, so that when a client downloads an assembly it can download only the required code modules, in a trickle-down fashion. The second scenario enables multi-language, multi-file assemblies: you can develop each module in a different language and simply link them together. As it turns out, these two scenarios are fairly rare. Assemblies are relatively small, and bandwidth today is cheap and readily available. Also, most teams are homogenous when it comes to their programming language, and for many other practical reasons (such as accountability and management), the team boundary is also the assembly boundary. Because of that, .NET doesn't promote the use of multiple modules unnecessarily. In fact, Visual Studio 2005 won't generate multi-module assemblies. To do that, you have to step outside the visual environment, compile your code using a command-line compiler, and then use the Assembly Linker (AL.exe) command-line utility or the MSBuild engine. AL.exe offers switches you can use to incorporate more than one DLL into your assembly. MSBuild is a rich environment that offers some integration with Visual Studio 2005. See the MSDN Library for more information on using AL.exe and MSBuild.

An assembly can contain as many components as required. All code in the assembly is IL code. An assembly can also contain resources such as icons, pictures, or localized strings.
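To see the assembly-to-module relationship for yourself, you can ask the runtime to list the modules of a loaded assembly. The following is only a minimal sketch: the ModuleLister class and its PrintModules() method are hypothetical names chosen for illustration, and the assembly that defines System.Object is used simply because it is always loaded. For the common single-file assembly you should expect to see exactly one module.

```csharp
using System;
using System.Reflection;

public static class ModuleLister
{
    // Prints the modules that make up the given assembly and returns their count.
    public static int PrintModules(Assembly assembly)
    {
        Module[] modules = assembly.GetModules();
        Console.WriteLine("Assembly: " + assembly.GetName().Name);
        foreach (Module module in modules)
        {
            Console.WriteLine("   Module: " + module.Name);
        }
        return modules.Length;
    }

    public static void Main()
    {
        // Any loaded assembly will do; this one is chosen only because it is always present.
        PrintModules(typeof(object).Assembly);
    }
}
```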

Assemblies and CPU Architectures

In theory, any IL-based assembly can run on any target CPU because of the two-phase compilation process: if the IL has nothing in it that pertains to a specific CPU architecture or machine language, the JIT compiler will generate the machine instructions for the target CPU at runtime. However, in practice, it is possible to write nonportable assemblies. For example, C# lets you explicitly dictate the memory layout of structures. If you use explicit x86 memory layout, your code will not work on Itanium or any other 64-bit machines. In addition, if the assembly imports legacy 32-bit in-process COM objects, it will not work in a native 64-bit process, because a 64-bit process cannot load 32-bit DLLs. For that, your assembly will have to execute in the Win32 emulation environment (known as the WOW, or Windows-on-Windows). However, if you simply load that assembly on a 64-bit Windows machine, it will run in the native 64-bit environment, not the WOW. The only solution for such CPU-specific assemblies is to incorporate the information on the target CPU into the binary executable that contains the assembly. That way, if an assembly that requires the 32-bit WOW emulation is loaded on a 64-bit machine, the loader will launch it in the WOW, where the 32-bit JIT compiler will compile it correctly.

If you develop an assembly that requires a particular CPU architecture, you need to inform Visual Studio 2005 about that CPU so that it can incorporate the information into the binary. In every project in Visual Studio 2005, in the Build tab under the project properties is the Platform Target drop-down list. The default is AnyCPU, but you can select x86, x64, or Itanium. When you specify a particular CPU, you are guaranteed that the assembly will always execute on that CPU architecture (or an emulation of it).
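If you are unsure which environment an assembly actually ended up running in, one simple check is the pointer size of the current process. The sketch below assumes nothing beyond the documented behavior of IntPtr.Size (4 in a 32-bit process, 8 in a 64-bit process); the PlatformCheck class name is hypothetical.

```csharp
using System;

public static class PlatformCheck
{
    // IntPtr.Size is 4 in a 32-bit process and 8 in a 64-bit process.
    public static int PointerBits()
    {
        return IntPtr.Size * 8;
    }

    public static void Main()
    {
        Console.WriteLine("Running as a " + PointerBits() + "-bit process");
    }
}
```

An assembly built with the x86 platform target should report 32 bits even on a 64-bit machine, while an AnyCPU assembly would normally report the native size of the machine it runs on.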

Applications that display a user interface shouldn't store their resources in the same assembly as the code using those resources. For localization, it's better to generate a separate satellite assembly that contains only resources. You can then generate one such satellite resource assembly per locale (culture) and load the resources from the assembly corresponding to the locale of the specific customer site. Visual Studio 2005 automates most of this process when localizing applications.

2.2.3. Assemblies and Visual Studio 2005

.NET components can reside in either EXE- or DLL-based assemblies. An EXE assembly is called an application assembly, and a DLL assembly is called a library assembly. As a component developer, you will usually develop components that reside in library assemblies. Visual Studio 2005 has a dedicated project template called Class Library that you should use as a starting point for a server-side assembly. A Visual Studio 2005 Class Library project generates a single DLL class library assembly.

All the .NET Framework base classes are available in the form of class libraries, and they can be used by component and client application developers.

To add a binary component to a class library, all you have to do is declare a class in one of the project source files using a .NET-compliant language. For existing Class Library projects, Visual Studio provides an Add Class option and an Add New Item dialog window.

To create a new C# library assembly in Visual Studio 2005, select the File > New > Project... menu item. When the New Project dialog window appears, select Visual C# under "Project types," then select Windows. Under Windows, select the Class Library template, as shown in Figure 2-3. Name the library MyClassLibrary in the Name box, and specify a location for the project files in the Location box. If you want the solution files to be in a root directory with the project files underneath it, make sure to check the "Create directory for solution" checkbox, name the solution, and click OK.

These actions create a project named MyClassLibrary, which should be visible in the Solution Explorer window along with a number of files, including one named Class1.cs. Class1.cs defines a single class named Class1 in the default MyClassLibrary namespace. There is no connection between namespaces and assemblies: a single assembly can define multiple namespaces, and multiple assemblies can all contribute components to the same namespace.
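The independence of namespaces and assemblies is easy to observe with types from the .NET Framework itself. The sketch below (the NamespaceVersusAssembly class is a hypothetical name) prints the namespace and the defining assembly of two collection types that share the System.Collections.Generic namespace. Which assemblies are reported depends on the runtime version you use, so no output is shown here; on the runtimes I am aware of, the two types come from different assemblies.

```csharp
using System;
using System.Collections.Generic;

public static class NamespaceVersusAssembly
{
    // Prints the namespace and the defining assembly of a type.
    public static void Describe(Type type)
    {
        Console.WriteLine(type.Name + ": namespace " + type.Namespace +
                          ", assembly " + type.Assembly.GetName().Name);
    }

    public static void Main()
    {
        // Both types live in the same namespace...
        Describe(typeof(List<int>));
        // ...but nothing requires them to live in the same assembly.
        Describe(typeof(LinkedList<int>));
    }
}
```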

To prepare for Example 2-1, rename Class1.cs to MyClass.cs in the Solution Explorer window and modify the code in the MyClass.cs file (comments excluded) to:

     namespace AssemblyDemo
     {
        public class MyClass
        {
           public MyClass()
           {}
           public string GetMessage()
           {
              return "Hello";
           }
        }
     }

Figure 2-3. A Visual Studio 2005 Class Library project

2.2.3.1 Partial types

C# 1.1 requires you to put all the code of a type (a class or a structure) in a single file. C# 2.0 allows you to split the definition and implementation of a class or a struct across multiple files; that is, you can put one part of a class in one file and another part of the class in a different file. To do so, use the reserved word partial. For example, you can put this code in the file MyClassMethods.cs:

     public partial class MyClass
     {
        public void Method1()
        {...}
     }

and this code in the file MyClassFields.cs:

     public partial class MyClass
     {
        public int Number;
     }

In fact, you can split any given class into as many parts as you like. Partial types are a very handy feature: they let you segment machine-generated code and user-edited code, placing them in separate files. For example, Windows Forms 2.0 uses partial classes to separate the machine-generated code from the developer's part of the form code. ASP.NET 2.0 also uses partial classes, but there the machine-generated code is produced only at compile time.

A class (or a struct) can have two kinds of aspects or qualities: accumulative and non-accumulative. The accumulative aspects are things that each part of the class can choose to add, such as interface derivations, properties, indexers, methods, and member variables. The non-accumulative aspects are things that all the parts of a type must agree upon, such as whether the type is a class or a struct, type accessibility (public or internal, discussed later), and the base class. For example, the following code does not compile because not all the parts of MyClass concur on the base class:

     public class MyBase
     {}
     public class SomeOtherClass
     {}
     public partial class MyClass : MyBase
     {}
     //Does not compile
     public partial class MyClass : SomeOtherClass
     {}

When the compiler builds the assembly, it combines the parts of a type from the various files and compiles them into a single type in the IL. The generated IL has no recollection of which part came from which file, just as the IL contains no trace of which file was used to define which type. Also worth noting is that partial types cannot span assemblies, and that a type can refuse to have other parts by omitting the partial qualifier at its definition. Because all the compiler is doing is accumulating parts, a single file can contain multiple parts, even of the same type (although the usefulness of that is somewhat questionable).
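The following sketch shows the accumulation at work. It places two parts of MyClass in a single file (which, as noted, is legal) so that the example is self-contained. Giving Method1() an int return value and a body is an assumption made purely for illustration, and PartialDemo is a hypothetical name.

```csharp
using System;

// First part: the methods (in a real project this could live in MyClassMethods.cs).
public partial class MyClass
{
    public int Method1()
    {
        // Uses a field that is declared in the other part.
        return Number + 1;
    }
}

// Second part: the fields (in a real project this could live in MyClassFields.cs).
public partial class MyClass
{
    public int Number;
}

public static class PartialDemo
{
    public static void Main()
    {
        MyClass obj = new MyClass();
        obj.Number = 41;
        Console.WriteLine(obj.Method1());                      // prints 42

        // The compiled type is a single type with a single public field,
        // with no trace of how many parts the source used.
        Console.WriteLine(typeof(MyClass).GetFields().Length); // prints 1
    }
}
```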

2.2.3.2 Adding a reference

Any client, regardless of the assembly in which it resides (be it a class library or an application assembly), can use the MyClass component, but first the client developer needs to import the definitions of the types and components in the server assembly to the client assembly. This import process is called adding a reference to the server assembly. In the client project, select Project > Add Reference... to bring up the Add Reference dialog box (see Figure 2-4).

Figure 2-4. The Add Reference dialog

The Add Reference dialog allows client developers to add references to assemblies from five sources. The .NET tab lists predefined .NET class library assemblies. The COM tab lists all the registered COM objects on the machine (each COM component can be treated as a .NET component). The Projects tab lets you add a reference to a library or application project already defined in the client solution. The Browse tab lets you browse to a specified location and select the assembly to add. The Recent tab lists the assemblies that have most recently been browsed to and added, accumulated across all solutions. References made via the Projects tab are not listed on the Recent tab.

The Add Reference dialog is misleading. The dialog allows you to add references only to other assemblies, yet it refers to assemblies as components (under the Component Name column). There is no way in .NET to add a reference to an individual component inside an assembly. Adding a reference is strictly an assembly-level operation.

To demonstrate how you can add a reference and use a component in a class library, follow these steps:

Create a new C# Windows Application project.
Add a reference to the MyClassLibrary assembly.
Add a using statement for the AssemblyDemo namespace.
Add a button to the form.
Add an event handler to the button's Click event.
Use the component in the referenced assembly as if it's defined in the client assembly.

The resulting client-side code should look similar to Example 2-1. Notice that although the MyClass component resides in another assembly, it can be referenced as if it were local to the client code.

Example 2-1. Using a component defined in another assembly

     using System;
     using System.Windows.Forms;
     using AssemblyDemo;

     partial class ClientForm : Form
     {
        void OnClicked(object sender,EventArgs e)
        {
           MyClass obj = new MyClass();
           string message = obj.GetMessage();
           MessageBox.Show(message);
        }
        /* Rest of the client code  */
     }

The ClientForm client creates an object of type MyClass using the new operator and retrieves the message string. The client then uses the static method Show( ) of the class MessageBox to display a message box with the message. The MessageBox class is part of the .NET Framework and is defined in the System.Windows.Forms namespace, in the System.Windows.Forms assembly. Note that Example 2-1 includes the using System.Windows.Forms and using AssemblyDemo statements at the beginning of the file. Without these statements, you need to use fully qualified type names (names that include the containing namespace as part of the type declaration).

The important thing about Example 2-1 is the fact that nothing in the client's code indicates that the components it uses come from other assemblies. Once you've added the references, it's as if these components were defined in the client's assembly. As C/C++ programmers will notice, no header, .def, or .lib files are required.

2.2.3.3 The reference path

When you add a reference to an assembly, Visual Studio 2005 remembers that assembly name and location. During compilation, Visual Studio 2005 will use that path to look for an assembly with a matching name and import the type definitions. However, you can override that behavior and provide Visual Studio 2005 with alternative reference locations. To do so, open the project properties and select the Reference Paths pane (see Figure 2-5).

You can add as many folders as needed as additional reference paths. The pane also lets you change the order of the references and update (or edit) a reference. The reference path is an ordered list. Visual Studio 2005 will use the first referenced folder to try to locate as many of the referenced assemblies as possible. If some of the assemblies are not found in the first folder, it will move to the second path and try to locate the missing assemblies, and so on down the list. If an assembly is present in multiple folders, only its first occurrence will be used. If an assembly is not found in any of the referenced folders, Visual Studio 2005 will use the original location specified when the reference was added.
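The search order just described can be sketched in a few lines of code. This is not Visual Studio's actual implementation, only an illustration of the rule: the ReferencePathResolver class, its Resolve() method, and the folder names are all hypothetical, and the file-existence check is passed in as a delegate so that the sketch can run without touching the disk.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ReferencePathResolver
{
    // Searches the ordered reference paths and returns the first match.
    // Falls back to the location recorded when the reference was added.
    public static string Resolve(string assemblyFileName,
                                 IList<string> referencePaths,
                                 string originalLocation,
                                 Predicate<string> fileExists)
    {
        foreach (string folder in referencePaths)
        {
            string candidate = Path.Combine(folder, assemblyFileName);
            if (fileExists(candidate))
            {
                return candidate; // only the first occurrence is used
            }
        }
        return originalLocation;
    }

    public static void Main()
    {
        // Hypothetical folders and file names, used only for illustration.
        List<string> referencePaths = new List<string>();
        referencePaths.Add("FirstFolder");
        referencePaths.Add("SecondFolder");
        string original = Path.Combine("OriginalFolder", "MyClassLibrary.dll");

        // Pretend the assembly exists only in the second folder.
        string onlyCopy = Path.Combine("SecondFolder", "MyClassLibrary.dll");
        string resolved = Resolve("MyClassLibrary.dll", referencePaths, original,
                                  delegate(string path) { return path == onlyCopy; });
        Console.WriteLine(resolved);
    }
}
```

To check real folders, you could pass File.Exists as the last argument instead of the anonymous method.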

The global Namespace

By default, all C# 2.0 namespaces nest in a root namespace called global. For example, this definition of the class MyClass:

     class MyClass
     {}

is conceptually equivalent to this one:

     namespace global
     {
        class MyClass
        {}
     }

because both define the class MyClass in the global namespace.

Whenever you reference a type, either using a fully qualified name or via a using statement, C# 2.0 implicitly starts the name-resolution search at the current enclosing namespace. You can explicitly instruct C# 2.0 to start resolving at the global root by prefixing the name with global and the :: operator. For example, when referencing the type MyClass in the namespace MyNamespace:

     namespace MyNamespace
     {
        class MyClass
        {}
     }
     global::MyNamespace.MyClass obj;

The global namespace qualifier is instrumental in resolving nested namespace conflicts. It is possible to have a nested namespace that has the same name as some other global namespace. In such cases, the compiler will have trouble resolving the namespace reference unless you explicitly instruct it to start resolving at the global root.

Consider the following example:

     namespace MyNamespace
     {
        namespace System
        {
           class MyClass
           {
              public void MyMethod()
              {
                 global::System.Diagnostics.Trace.WriteLine("It Works!");
              }
           }
        }
     }

Without the global qualifier, the call to the trace class would produce a compilation error: when the compiler tries to resolve the reference to the System namespace it would use the immediate containing scope, which, although it contains a System namespace, does not contain the Diagnostics namespace. The global qualifier instructs the compiler how to correctly resolve the conflict.

Figure 2-5. The Reference Paths pane

The reference path is strictly a build-time entity and has no bearing whatsoever on where the assembly will be loaded from at runtime. Runtime assembly resolution is covered in Chapter 5.

2.2.3.4 Aliasing a reference

When adding an assembly reference, it is possible to create a conflict with another type already defined by your application in another assembly it references. For example, consider the assemblies MyApplication.exe and MyLibrary.dll, both defining the class MyClass in the namespace MyNamespace:

     //In MyApplication.exe
     namespace MyNamespace
     {
        public class MyClass
        {...}
     }

     //In MyLibrary.dll
     namespace MyNamespace
     {
        public class MyClass
        {...}
     }

Each definition of MyClass is completely distinct, providing different methods and behaviors. If you add a reference to MyLibrary.dll from within MyApplication.exe, when you try to use the type MyClass like so:

     using MyNamespace;

     MyClass obj = new MyClass();

the compiler will issue an error, because it does not know how to resolve it; that is, it does not know which definition of MyClass is referenced.

C# 2.0 allows you to resolve the conflict by aliasing the assembly reference. By default, all namespaces are rooted in the global namespace (see the sidebar "The global Namespace" if you are not familiar with this term). When you alias an assembly, the namespaces used in that assembly will be resolved under the alias, not under global. To alias an assembly, first add a reference to it in Visual Studio 2005. Next, expand the References folder in the Solution Explorer, and display the properties of the referenced assembly (see Figure 2-6).

Figure 2-6. Aliasing an assembly reference

If you added the reference by browsing to the assembly, the Aliases property will be set explicitly to global. If you added the reference by selecting the assembly from the Projects tab, the Aliases value will be empty (but implicitly global). You can specify multiple aliases, but for addressing most conflicts a single alias will do (unless you also have conflicts with other aliases).

Next, add as the first line of the file the extern alias directive, instructing the compiler to include the types from the alias in the search path. You can now refer to the class MyClass from MyLibrary.dll:

     extern alias MyLibraryAlias;

     MyLibraryAlias::MyNamespace.MyClass obj;
     obj = new MyLibraryAlias::MyNamespace.MyClass();

Note that the extern alias directive must appear before any using directives, and that all types in MyLibrary.dll can only be referred to via the alias, because these types are not imported to the global scope.

Use of aliases and fully qualified namespaces may result in exceedingly long lines. As shorthand, you can also alias the fully qualified name:

     using MyLibrary = MyLibraryAlias::MyNamespace;

     MyLibrary.MyClass obj;
     obj = new MyLibrary.MyClass();

2.2.3.5 The Visual Studio 2005 assembly host

When building an application assembly (either a Windows Forms or a Console application), in addition to the application EXE assembly, Visual Studio 2005 creates an application assembly called <application name>.vshost.exe. That assembly is found in the same folder as your application assembly, both in the Debug and Release folders.

Whenever you're working in a debug session, <application name>.vshost.exe is the process being launched, not your original <application name>.exe. For debugging purposes, Visual Studio 2005 loads your own <application name>.exe into <application name>.vshost.exe and debugs it (hence the name vshost: the process that hosts your application).

<application name>.vshost.exe is in fact an identical copy of the vshost.exe file found under <Program Files>\Microsoft Visual Studio 8\Common7\IDE. All Visual Studio 2005 does is copy that file to your Debug and Release folders and rename it. vshost.exe is a simple application assembly with only a Main( ) method. The Main( ) method interacts with a set of .NET hosting management classes that facilitate debugging capabilities not available when simply launching your application assembly directly and attaching the debugger to it. These features are:

Partial-trust debugging: This enables you to test how your application behaves under reduced security permissions. Partial-trust debugging is covered in Chapter 12.
Shorter startup time: Each time you launched your application in Visual Studio 2003, it had to create a new process and attach the debugger to that process before starting the application. This introduced a noticeable delay. In Visual Studio 2005 the hosting process is kept running between debug sessions, significantly shortening startup time.
Design-time expression evaluation: The Immediate window lets you test code from your application without launching it. This is done by running the code in the readily available, already running <application name>.vshost.exe process.

Note that you can only run <application name>.vshost.exe from within the debugger, and it only works when placed in the same folder as the application it hosts.

You can also turn off the use of the Visual Studio host process by going to the Debug pane of the project properties and unchecking the "Enable the Visual Studio hosting process" checkbox.

2.2.4. Client and Server Assemblies

Visual Studio 2003 only allowed developers to add references to library assemblies. The client of a component in a class library assembly could reside in the same assembly as the component, in a separate class library assembly, or in a separate application assembly. The client of a component in an application assembly, however, could only reside in the same application assembly as the component. This was analogous to the use of classic Windows DLLs, though there was nothing specific in .NET itself that precluded clients from using components in other application assemblies.

Visual Studio 2005 allows developers to add references to both library and application assemblies. This enables you to treat an EXE application assembly as if it were a DLL library assembly. There is no longer the strict distinction between DLL and EXE assemblies, and the lines between them are very much blurred.

Anything you can do with a DLL library assembly, you can do with an EXE application assembly. For example, nothing prevents you from having a logical application comprised of one EXE application assembly with the user interface in it and several other EXE application assemblies referenced by the user-interface assembly, all loaded in the same process (as well as the same app domain, as explained in Chapter 10).

However, the reverse is not true: there are four things that are specific to EXE application assemblies:

You can only directly launch an application assembly (be it a Windows or a Console application). You cannot launch a class library.
Only the application assembly used to launch the process has a say in which CLR version is used. This is discussed at length in Chapter 5.
Partial-trust debugging in Visual Studio 2005 is available only for application assemblies.
ClickOnce publishing and deployment in Visual Studio 2005 is available only for application assemblies.
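Code running in any assembly can find out which application assembly launched the process and which CLR version that process ended up with. The following is a minimal sketch; the EntryAssemblyInfo class is a hypothetical name. Note that Assembly.GetEntryAssembly() can return null when the process was not started from a managed executable, so the sketch guards against that.

```csharp
using System;
using System.Reflection;

public static class EntryAssemblyInfo
{
    // Name of the application assembly that launched the process,
    // or null when there is no managed entry assembly.
    public static string EntryAssemblyName()
    {
        Assembly entry = Assembly.GetEntryAssembly();
        return entry == null ? null : entry.GetName().Name;
    }

    public static void Main()
    {
        Console.WriteLine("Entry assembly:     " + (EntryAssemblyName() ?? "(none)"));
        Console.WriteLine("Executing assembly: " + Assembly.GetExecutingAssembly().GetName().Name);
        // The runtime version is decided for the whole process, not per library.
        Console.WriteLine("Runtime version:    " + Environment.Version);
    }
}
```

If you move the EntryAssemblyInfo class into a class library and call it from an application assembly, the entry assembly reported should be the application, while the executing assembly is the library.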

That said, I still recommend that you put components in library assemblies whenever possible. This will enable the components to be used by different applications with different CLR versioning policies. It will also enable bundling the components with different ClickOnce applications and deploying the components with different security and trust policies (as discussed in Chapter 12).

Figure 2-7 shows one typical topology of a client application assembly using class libraries. If the client is in the same assembly as the component, the client developer can simply declare an instance of the component and use it. However, if the component is in one class library and the client is in another assembly (be it another library assembly or an application assembly), the client developer first needs to add a reference to the library assembly. Once the references have been added, the client assembly can use as many class libraries as required.

Figure 2-7. A typical client and server topology

2.2.5. Managing Component Visibility in Assemblies

A set of interoperating components often includes components that are intended only for private, internal use by other components in the same assembly. These components aren't intended for outside use and should not be shared with your clients.

In .NET there are two kinds of components: internal and public. An internal component can be accessed only by clients inside its own assembly. If client code in one assembly tries to use an internal component from a different assembly, it will not compile. In the case of a multi-module class library, any client in any module can still access the internal component, because both reside in the same assembly. A public component is accessible to clients from inside and outside its assembly.

.NET supports special component-oriented access modifiers. To mark a component as internal, use the C# internal access modifier (Friend in Visual Basic 2005):

     internal class MyClass
     {
        public MyClass()
        {}
        public string GetMessage()
        {
           return "Hello";
        }
     }

If you wish to make the component available to outside clients, use the public access modifier:

     public class MyClass
     {
        public MyClass()
        {}
        public string GetMessage()
        {
           return "Hello";
        }
     }

.NET makes exposing components explicit: if you don't provide any access modifier, the default modifier is internal. The public and internal access modifiers can also be applied to any other types defined in the assembly, such as interfaces. You can mark individual members of a class or a structure as internal, too:

     public class MyClass
     {
        public MyClass()
        {}
        internal string GetMessage()
        {
           return "Hello";
        }
     }

Internal members (even on public types) are accessible only inside the assembly. To outside clients, internal members appear as private members.
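You can verify how the compiler recorded these choices by inspecting the types at runtime. The sketch below uses hypothetical type names (ImplicitlyInternal, ExplicitlyInternal, Exposed, and VisibilityDemo) and assumes all of them are declared at the top level of a source file, as shown.

```csharp
using System;
using System.Reflection;

class ImplicitlyInternal {}          // no modifier: internal by default
internal class ExplicitlyInternal {}

public class Exposed
{
    public string GetMessage() { return "Hello"; }
    internal string GetDetails() { return "Details"; }
}

public static class VisibilityDemo
{
    public static void Main()
    {
        Console.WriteLine(typeof(ImplicitlyInternal).IsNotPublic); // True
        Console.WriteLine(typeof(ExplicitlyInternal).IsNotPublic); // True
        Console.WriteLine(typeof(Exposed).IsPublic);               // True

        // An internal member is not returned by a lookup of public members...
        Console.WriteLine(typeof(Exposed).GetMethod("GetDetails") == null); // True

        // ...and is recorded in the metadata with assembly-level accessibility.
        MethodInfo details = typeof(Exposed).GetMethod(
            "GetDetails", BindingFlags.Instance | BindingFlags.NonPublic);
        Console.WriteLine(details.IsAssembly);                     // True
    }
}
```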

Another form of component usage by outside entities other than object instantiation and method calls is inheritance. Developers may want to allow access to class members to internal clients and external subclasses only. To support this need, .NET adds the protected internal access modifier. For example, consider this class definition:

     public class MyClass
     {
        public MyClass()
        {}
        public string GetMessage()
        {
           return DoWork();
        }
        protected internal string DoWork()
        {
           return "Hello";
        }
     }

To subclasses outside the assembly, the DoWork( ) method appears as a protected method, yet inside the assembly the DoWork( ) method behaves as an internal method.
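Both kinds of access can be shown in a single file, with the caveat that everything in the sketch necessarily resides in one assembly. MySubClass, GetLoudMessage(), and ProtectedInternalDemo are hypothetical names added for illustration; the subclass call is the one that would still compile if MySubClass were moved to another assembly, whereas the direct call in Main() would not.

```csharp
using System;

public class MyClass
{
    public MyClass() {}

    public string GetMessage()
    {
        return DoWork();
    }

    protected internal string DoWork()
    {
        return "Hello";
    }
}

// Allowed through the protected half of the modifier,
// so this subclass could also live in a different assembly.
public class MySubClass : MyClass
{
    public string GetLoudMessage()
    {
        return DoWork().ToUpperInvariant();
    }
}

public static class ProtectedInternalDemo
{
    public static void Main()
    {
        MyClass obj = new MyClass();

        // Allowed through the internal half of the modifier:
        // this client is in the same assembly as MyClass.
        Console.WriteLine(obj.DoWork());                      // prints Hello

        Console.WriteLine(new MySubClass().GetLoudMessage()); // prints HELLO
    }
}
```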

2.2.6. Assembly Metadata

Given that client assemblies add references to component assemblies, and that no source-file-sharing (such as C++ header files) is involved, how does the client-side compiler know what types are in the assembly? How does the compiler know which types are public and which are internal? How does it know what the method signatures are? This classic component-oriented programming problem (the problem of type discovery) is raised by the fact that the client application is trying to use a binary component. The solution .NET introduces is called metadata.

Metadata is a comprehensive, standard, mandatory, and complete way of describing what is in an assembly. Metadata describes what types are available in the assembly (classes, interfaces, enums, structs, etc.) and their containing namespaces, the name of each type, its visibility, its base class, which interfaces it supports, its methods, each method's parameters, and so on. The assembly metadata is generated automatically by the high-level compiler directly from the source files. The compiler embeds the metadata in the physical file containing the IL (either a DLL or an EXE). In the case of a multi-file assembly, every module with IL must contain metadata describing the types in that module. In fact, a CLR-compatible compiler is required to generate metadata, and the metadata must be in a standard format.

Metadata isn't just for compilers, though. .NET makes it possible to read metadata programmatically, using a mechanism called reflection. Reflection is particularly useful from a software-engineering standpoint when used in conjunction with attributes, which provide you with a way to add your own information to the metadata describing the types used to build your application. Both reflection and attributes are addressed in Appendix C.
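As a rough illustration of reading metadata programmatically, the sketch below uses reflection to print a type's visibility, base class, interfaces, and method parameters. IMyInterface, MyComponent, and MetadataDemo are hypothetical names invented for this example, and the Describe( ) helper handles top-level types only.

```csharp
using System;
using System.Reflection;

// Hypothetical sample types whose metadata is inspected below
public interface IMyInterface
{
   void MyMethod(int number);
}

public class MyComponent : IMyInterface
{
   public void MyMethod(int number)
   {}
}

public static class MetadataDemo
{
   // Builds a one-line description of a top-level type from its metadata
   public static string Describe(Type type)
   {
      string kind = type.IsInterface ? "interface" :
                    type.IsEnum ? "enum" :
                    type.IsValueType ? "struct" : "class";
      string visibility = type.IsPublic ? "public" : "internal";
      return visibility + " " + kind + " " + type.FullName;
   }

   public static void Main()
   {
      Type type = typeof(MyComponent);
      Console.WriteLine(Describe(type));
      Console.WriteLine("Base class: " + type.BaseType);
      foreach(Type contract in type.GetInterfaces())
      {
         Console.WriteLine("Implements: " + contract.Name);
      }
      // DeclaredOnly skips the members inherited from System.Object
      BindingFlags flags = BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly;
      foreach(MethodInfo method in type.GetMethods(flags))
      {
         foreach(ParameterInfo parameter in method.GetParameters())
         {
            Console.WriteLine(method.Name + " takes " + parameter.ParameterType.Name + " " + parameter.Name);
         }
      }
   }
}
```

Everything printed here comes from the metadata embedded in the assembly; no source file is consulted at runtime.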

Metadata is pivotal for .NET both as a component technology and as a development platform. For example, .NET uses metadata for remote-call marshaling across execution boundaries. Marshaling involves forwarding calls made by a client in one execution context (such as a process or machine) to another where the objects reside, invoking the calls in the other execution context, and sending the responses back to the client. Marshaling typically uses a proxy, an entity that exposes the same entry points as the object. The proxy is the entity responsible for marshaling the call to the actual object. Because of the metadata's exact and formal description of the object's types and methods, .NET is able to construct proxies automatically to forward the calls. Chapter 10 discusses the link between remoting and metadata.

COM Type Libraries Versus Metadata

COM developers usually provided type libraries to address the type-discovery problem. The COM type library included the definitions of the interfaces the components implemented and a list of the components themselves. Type libraries had many problems, but foremost among them was the low affinity between what type libraries described and what the binaries actually contained. The binaries could contain types not listed in the type libraries, and the type libraries could list components not present in the binaries. The type library was limited in its description of the actual method signatures and often presented a dumbed-down version of the actual interfaces and method parameter semantics. Type libraries could be used to marshal method calls across context and process boundaries, but that in turn imposed some restrictions on the method parameters. For unusual custom types, type-library marshaling was powerless, and developers had to build custom proxy/stub pairs. Finally, type libraries could be embedded in the binaries or handed to the client separately, creating a development and deployment pitfall.

Even with all these shortcomings, though, type libraries provided client developers for the first time with a way to interact with binary components without involving source files. .NET takes the type library concept to a whole new level, because metadata provides all the information that type libraries do with precise type affinity, as well as additional type information, from base classes to custom attributes. Fundamentally, however, they serve the same purpose.

Visual Studio 2005 uses metadata, too. IntelliSense is implemented using reflection. The code editor simply accesses the metadata associated with the type the developer uses and displays the content for autocompletion or type information. Another nifty metadata-based feature in Visual Studio 2005 is Go to Definition, which allows you to get the definition of any type, even those for which you do not have the source files. Right-click on a type name and select Go to Definition from the pop-up context menu. Visual Studio 2005 will create a new file and dump in it the definition of the type (public and protected members only), including XML comments and attributes. This often saves you the trouble of sifting through the help files looking for type information.

You can view the metadata of your assembly with the ILDASM utility.

2.2.7. The Assembly Manifest

Just as metadata describes the types in an assembly, the manifest describes the assembly itself, providing the logical attributes shared by all the modules and all components in the assembly. The manifest contains the assembly name, the version number, the locale, and an optional strong name uniquely identifying the assembly (discussed in Chapter 5). It also contains the security demands to verify for the assembly (discussed in Chapter 12), as well as the names and hashes of all the files that make up the assembly. Under COM, a malicious party (or even a benevolent party, by mistake) could swap an original DLL or EXE file with another and cause damage. In .NET, every manifest contains a cryptographic hash of the different modules in the assembly. When the assembly is loaded, the .NET runtime recalculates the cryptographic hash of the modules at hand. If the hash generated at runtime is different from that found in the manifest, .NET assumes foul play, refuses to load the assembly, and throws an exception.

Like the metadata, the manifest is generated automatically by the high-level compiler directly from the source files of all the modules in the assembly. Unlike metadata, there is no need to duplicate and embed the manifest for every module in the assembly; only one copy of it is embedded in one of the assembly's physical files. Any CLR-compatible compiler must generate a manifest, and the manifest has to be in a standard format.

The manifest is also the way .NET captures information about other referenced assemblies. This information is crucial to ensure version compatibility and to ensure that the assembly gets to interact with the exact trusted set of other assemblies it expects. For every other assembly referenced by this assembly, the manifest contains the name, the public key (if a strong name is available), the version number, and the locale. At runtime, .NET guarantees that only the referenced assemblies are used and that only compatible versions are loaded (Chapter 5 discusses .NET's versioning policy in depth). When strong names are used, the manifest maintains trust between the component vendor and its clients, because only the original vendor could have signed the referenced assembly with that strong name. You can view the manifest of your assembly with the ILDASM utility.
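The same manifest information can also be read at runtime. The following is a minimal sketch, assuming only the standard reflection types; ManifestDemo is a hypothetical name. It prints the identity of the running assembly and then the name, version, locale, and strong-name status recorded for each assembly it references.

```csharp
using System;
using System.Reflection;

public static class ManifestDemo
{
   public static void Main()
   {
      Assembly assembly = Assembly.GetExecutingAssembly();

      // Identity of this assembly, as recorded in its manifest
      AssemblyName identity = assembly.GetName();
      Console.WriteLine(identity.Name + ", version " + identity.Version);

      // One entry per assembly that this assembly references
      foreach(AssemblyName reference in assembly.GetReferencedAssemblies())
      {
         // The public key token is a short hash of the vendor's public key
         byte[] token = reference.GetPublicKeyToken();
         bool hasStrongName = token != null && token.Length > 0;
         // An empty culture name denotes a culture-neutral assembly
         string culture = reference.CultureInfo == null ? "" : reference.CultureInfo.Name;
         Console.WriteLine(reference.Name + ", version " + reference.Version +
                           ", culture '" + culture + "', strong name: " + hasStrongName);
      }
   }
}
```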

You can provide the compiler with information to add to the assembly manifest using special assembly attributes, defined in the System.Runtime.CompilerServices and System.Reflection namespaces. Typically, you provide identity information and security permissions, as explained in the subsequent chapters. Although you can sprinkle these attributes all over the assembly source files, a more structured and maintainable approach is to have a dedicated source file containing only these attributes. The convention is to name this file AssemblyInfo.cs in a C# project, or AssemblyInfo.vb in a Visual Basic 2005 project. In fact, Visual Studio 2005 generates an assembly information file for every new project under the Properties folder in the Solution Explorer. The Visual Studio 2005-generated assembly information file contains an initial set of assembly attributes with default values. Example 2-2 shows a typical set of assembly attributes.

Example 2-2. The assembly information file

     using System.Reflection;
     using System.Runtime.CompilerServices;

     [assembly: AssemblyTitle("MyAssembly")]
     [assembly: AssemblyDescription("Assembly containing my .NET components")]
     [assembly: AssemblyCompany("My Company")]
     [assembly: AssemblyCopyright("Copyright © My Company 2005")]
     [assembly: AssemblyTrademark("MyTrademark")]
     [assembly: AssemblyVersion("1.2.3.4")]
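Assembly attributes such as those in Example 2-2 can be read back at runtime through reflection. The sketch below is illustrative only: AssemblyInfoDemo and GetTrademark( ) are hypothetical names, and a single attribute is declared to keep the listing self-contained (some build systems generate other assembly attributes automatically, and declaring them twice is a compile error).

```csharp
using System;
using System.Reflection;

// Only one attribute is declared here; the value is illustrative
[assembly: AssemblyTrademark("MyTrademark")]

public static class AssemblyInfoDemo
{
   // Returns the trademark string of the assembly, or null if none is declared
   public static string GetTrademark(Assembly assembly)
   {
      AssemblyTrademarkAttribute attribute = (AssemblyTrademarkAttribute)
         Attribute.GetCustomAttribute(assembly, typeof(AssemblyTrademarkAttribute));
      return attribute == null ? null : attribute.Trademark;
   }

   public static void Main()
   {
      Assembly assembly = Assembly.GetExecutingAssembly();
      Console.WriteLine("Trademark: " + GetTrademark(assembly));
      // The version is not stored as a custom attribute; it becomes part
      // of the assembly identity and is read through GetName()
      Console.WriteLine("Version: " + assembly.GetName().Version);
   }
}
```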

2.2.8. Friend Assemblies

An interesting assembly-level attribute introduced by .NET 2.0 is the InternalsVisibleTo attribute, defined in the System.Runtime.CompilerServices namespace. This attribute allows you to expose internal types and methods to clients from another specified assembly. This is also known as declaring a friend assembly. For example, suppose the server assembly MyClassLibrary.dll defines the internal class MyInternalClass as:

     internal class MyInternalClass
     {
        public void MyPublicMethod(  )
        {...}
        internal void MyInternalMethod(  )
        {...}
     }

If you add this line to the AssemblyInfo.cs file of MyClassLibrary.dll:

     [assembly: InternalsVisibleTo("MyClient")]

any client in the assemblies MyClient.dll and MyClient.exe will be able to use MyInternalClass and call its public or internal members. In addition, any subclass in the MyClient assembly will be able to access members marked as protected internal.
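Because the friend declaration is an ordinary assembly attribute, it is recorded in the assembly and can be listed through reflection, which is handy when auditing which assemblies have been granted access. The following is a minimal sketch, assuming a single-file build; FriendDemo and GetFriendAssemblies( ) are hypothetical names, and the empty method bodies are placeholders.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

[assembly: InternalsVisibleTo("MyClient")]

internal class MyInternalClass
{
   public void MyPublicMethod()
   {}
   internal void MyInternalMethod()
   {}
}

public static class FriendDemo
{
   // Returns the names of the assemblies declared as friends of the given assembly
   public static string[] GetFriendAssemblies(Assembly assembly)
   {
      Attribute[] attributes = Attribute.GetCustomAttributes(assembly, typeof(InternalsVisibleToAttribute));
      string[] names = new string[attributes.Length];
      for(int i = 0; i < attributes.Length; i++)
      {
         names[i] = ((InternalsVisibleToAttribute)attributes[i]).AssemblyName;
      }
      return names;
   }

   public static void Main()
   {
      foreach(string name in GetFriendAssemblies(Assembly.GetExecutingAssembly()))
      {
         Console.WriteLine("Friend assembly: " + name);
      }
   }
}
```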

Declaring an assembly as a friend could easily be abused, violating the essential encapsulation of the internals of the assembly and tightly coupling the client to the internals of the server assembly. Friend assemblies are intended for cases where you split an existing assembly into two or more assemblies by moving some of the types to new assemblies. If the relocated types still rely on internal types in the original assembly, declaring a friend assembly is a quick (albeit potentially dirty) way of enabling the move. Another case where a friend assembly is handy is when you want to test internal components but the test client resides in a different assembly.

2.2.9. Composing Assemblies

There are many ways to compose an assembly. The only two rules are:

Every assembly must contain a manifest.
Every assembly module that contains IL must embed in it the corresponding metadata for that IL.

Assemblies can optionally contain resources such as strings or images. Of course, a single-file class library or application assembly contains all these items in one file. A multi-module assembly, on the other hand, has much more latitude in how it is composed. Figure 2-8 shows a few possibilities for composing assemblies.

Figure 2-8. Different assembly compositions

As you can see, you can compose multi-module assemblies in almost any way and use compiler switches to bind all your files together. In practice, I recommend abiding by the following composition rules:

Always store locale-specific resources in a separate satellite assembly, rather than as an embedded resource in the assembly using them. Doing so will greatly simplify localization issues. This is, by the way, the default behavior of Windows Forms.
Avoid multi-file class library assemblies with modules that don't contain IL.
Minimize the code in an application assembly. Focus instead on visual layout, and encapsulate business logic in separate class library assemblies.
Make sure all the components in a class library have the same lifecycle and will always have the same version number and security credentials. If you anticipate the possibility of divergence, split the assembly into two class libraries.

2.2.10. The Assembly Type

.NET provides a class for programmatic representation of an assembly. This is the Assembly class, defined in the System.Reflection namespace. Assembly provides numerous methods for retrieving detailed information about the assembly (location, files, etc.) and the types it contains, as well as methods for creating new instances of types defined in the assembly. You typically access an assembly object using a static method of Assembly. For example, to get the assembly from which the current code is running, use the static GetExecutingAssembly( ) method:

     Assembly assembly = Assembly.GetExecutingAssembly(  );

Using other static methods, you can access the assembly that called your assembly, the assembly in which a specified class is defined, and so on. The Assembly type is most often used with reflection to obtain information about an assembly or to implement certain advanced remote-call scenarios.
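To make these options concrete, here is a minimal sketch of a few Assembly members in use. MyComponent and AssemblyDemo are hypothetical names, and the type is declared outside any namespace, so its full name is simply "MyComponent".

```csharp
using System;
using System.Reflection;

// Hypothetical component type used for illustration
public class MyComponent
{
   public string GetMessage()
   {
      return "Hello";
   }
}

public static class AssemblyDemo
{
   public static void Main()
   {
      // The assembly from which the current code is running
      Assembly current = Assembly.GetExecutingAssembly();

      // The assembly in which a specified class is defined
      Assembly defining = Assembly.GetAssembly(typeof(MyComponent));
      Console.WriteLine(current.Equals(defining));

      // Location of the file containing the manifest
      Console.WriteLine(current.Location);

      // Create an instance from the type's full name;
      // CreateInstance() returns null if the type is not found
      object instance = current.CreateInstance("MyComponent");
      MyComponent component = (MyComponent)instance;
      Console.WriteLine(component.GetMessage());
   }
}
```

Creating an object by name this way is late-bound: a misspelled type name yields null at runtime rather than a compile-time error, so check the result before using it in production code.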