Assemblies | Application Development Using C# and .NET


In .NET, assemblies are components. Assemblies, which may be composed of one or more DLL or EXE files, are the unit of deployment. You do not deploy individual DLLs or EXEs. Security evidence and versioning are based on the assembly. Assemblies contain Microsoft Intermediate Language (MSIL) instructions, resource data, and metadata. Since metadata describes the content of the assembly, the assembly does not require any external description, such as in the system registry. .NET components are much simpler and less error-prone to install and uninstall than traditional COM components, which had extensive registry entries.

A digital signature is required before an assembly can be deployed in the global assembly cache. Digitally signed assemblies provide cryptographically generated verification information that can be used by the CLR to enforce crucial dependency rules when locating and loading assemblies. This is distinct from the security verification that is done to make sure that code is type safe.

The identity of an unsigned assembly is defined simply as a human-readable name, along with a version number. The identity of a digitally signed assembly is defined by a unique cryptographic key pair. Optionally, an assembly's identity may also include a culture code for supporting culturally specific character sets and string formats.
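The parts of an assembly's identity can be read back at runtime through reflection. The following is a minimal sketch rather than anything from the case study: the helper name DescribeIdentity is hypothetical, the assembly inspected is simply whichever one defines System.Object, and the code is written with top-level statements, so it assumes a C# 9 or later compiler.

```csharp
using System;
using System.Reflection;

// Hypothetical helper: formats the identity parts discussed in the text
// (name, version, culture, and the public key token of a signed assembly).
string DescribeIdentity(Assembly assembly)
{
    AssemblyName name = assembly.GetName();

    // An unsigned assembly has no public key token.
    byte[] token = name.GetPublicKeyToken() ?? new byte[0];
    string tokenText = token.Length == 0
        ? "null"
        : BitConverter.ToString(token).Replace("-", "").ToLowerInvariant();

    // An empty culture name means the assembly is culture neutral.
    string culture = string.IsNullOrEmpty(name.CultureName) ? "neutral" : name.CultureName;

    return name.Name + ", Version=" + name.Version
        + ", Culture=" + culture + ", PublicKeyToken=" + tokenText;
}

// Inspect the assembly that defines System.Object; the exact output
// depends on the installed runtime.
Console.WriteLine(DescribeIdentity(typeof(object).Assembly));
```

Passing typeof(SomeType).Assembly for a type from one of your own projects shows whether that assembly currently carries a public key token.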

An assembly's version can be checked, so that the CLR can ensure that the same assembly version with which the client was built and tested is loaded. This eliminates the infamous "DLL Hell" problem, where Windows applications could easily break when an older version was replaced with a newer version (or vice versa). A digitally signed assembly can be used to verify that the assembly contents were not altered after it was digitally signed. Not only will you not accidentally use the wrong version, but you will not be tricked into using a maliciously tampered component that could do serious harm.

Although there is often a one-to-one correspondence between namespace and assembly, an assembly may contain multiple namespaces, and one namespace may be distributed among multiple assemblies. While there is often a one-to-one correspondence between assembly and binary code file (i.e., DLL or EXE), one assembly can span multiple binary code files. An assembly is the unit of deployment; an application is the unit of configuration.

Contents of an Assembly

For our next step of the case study, we split our Hotel Administrator's program into three assemblies. The example CaseStudy directory for this chapter has an AcmeGui application program (EXE), and two component (DLL) assemblies: Customer and Hotel. The code associated with the customer and hotel classes has been moved to the appropriate assemblies. When we discuss configuration later in the chapter, it is the AcmeGui application that will be configured.

We will use the Customer and Hotel assemblies to understand the issues associated with deployment. All public members of the Customer and Hotel assembly will be visible to code outside of their respective assemblies. Members marked as internal can be used only within the assembly.

If you look at Figure 7-1, you will see that the Solution Explorer shows that the AcmeGui project has references to the Customer and Hotel dynamic link libraries. These references enable the compiler to find the Hotel and Customer types used by AcmeGui and then build the application. They do not dictate where the DLLs have to be when the project is deployed; we will explain how this works when we discuss deployment. You will also notice references made to system assemblies such as System.dll. Looking at the properties for the reference will show you where the assembly is located. ^[1]

^[1] Select the assembly in the Solution Explorer, right-mouse click, select Properties in the context menu.

Figure 7-1. AcmeGui's Solution Explorer showing references.

graphics/07fig01.gif

Creating a DLL is simple. Just select "Class Library" from the New Project Wizard in Visual Studio.NET, specify a location and name, and then start coding. To set up a reference to another DLL from your project, you use the Add Reference menu item from the Visual Studio.NET Project menu. Navigate to the DLL you want, select it with the Select button, then click the OK button. ^[2] Every assembly has a manifest that describes the metadata information associated with the assembly. A manifest provides the following information about an assembly:

^[2] It is straightforward to go from the monolithic program we had in the previous chapter to the componentized one we have now.

Create two new Class Library projects in the AcmeGui Solution for Customer and Hotel. In Visual Studio select File New Project. In the dialog box that comes up, select Visual C# projects in the left top pane, then select Class Library in the right top pane. Enter the name of the project (Customer or Hotel) and make sure the Add to Solution radio button is selected.

Remove the appropriate files from the AcmeGui project and add them to the appropriate project. In the Solution Explorer, select the file in the AcmeGui project, right-mouse click, select exclude from project. Then in the Solution Explorer select the appropriate project and right-mouse click, select Add, then Add Existing Item, navigate to the appropriate file and select it, and hit the open button. You can select more than one file at a time.

Build the two component projects by selecting their project name in the Solution Explorer and select the build option for the assembly in the Build menu. Since we no longer have a monolithic application, we have to indicate to the compiler how to resolve references to the Customer and Hotel classes. Select the AcmeGui project in the Solution Explorer, right-mouse click, then select Add Reference. Click on the Projects tab and you should see the Customer and Hotel dlls there. Select them both and then hit the select button. You should see both dynamic link libraries in the bottom list. Then click the OK button. Now when you rebuild the solution, the AcmeGui project will compile and run. You can click on the plus button next to References in any project to see what dependencies it has.

Assembly identity based on name, version, culture, and, optionally, a digital signature
Files that contribute to the assembly contents
Other assemblies on which the assembly is dependent
Permissions required by the assembly

Every assembly created by Visual Studio has a file, AssemblyInfo.cs, containing the following attributes that can be used to set the information associated with an assembly:

 [assembly: AssemblyTitle("")]
 [assembly: AssemblyDescription("")]
 [assembly: AssemblyConfiguration("")]
 [assembly: AssemblyCompany("")]
 [assembly: AssemblyProduct("")]
 [assembly: AssemblyCopyright("")]
 [assembly: AssemblyTrademark("")]
 [assembly: AssemblyCulture("")]
 [assembly: AssemblyVersion("1.0.*")]
 [assembly: AssemblyDelaySign(false)]
 [assembly: AssemblyKeyFile("")]
 [assembly: AssemblyKeyName("")]

To explore how versioning, digital signing, and deployment work, we use the ILDASM tool introduced in Chapter 2 to view the appropriate metadata. Visual Studio.NET installs with ILDASM on the Tools menu. You can also find it in your \Program Files\Microsoft.Net\FrameworkSDK\Bin directory.

Figure 7-2 shows the top level that you see when you open the Customer.dll assembly in ILDASM and double-click on the OI.NetCs.Acme namespace. You see entries for the MANIFEST, the Customers and Customer classes, the ICustomer interface, and the CustomerListItem value type. Clicking on the plus (+) button will expand an entry.

Figure 7-2. Top-level ILDASM view of Customer component.

graphics/07fig02.gif

To view the manifest, double-click the MANIFEST node shown in Figure 7-2; the resulting manifest information is displayed in Figure 7-3. Some of the numbers will vary if you have rebuilt any of the samples, or you have a later version of .NET.

Figure 7-3. ILDASM showing manifest of Customer.dll.

graphics/07fig03.gif

The manifest contains information about the dependencies and contents of the assembly. You can see that the manifest for Customer contains, among others, the following external dependency. ^[3]

^[3] If you have rebuilt any of the components, you will, of course, see different build and revision numbers.

 .assembly extern mscorlib
 {
   .publickeytoken = (B7 7A 5C 56 19 34 E0 89 )
   .ver 1:0:2411:0
 }

The .assembly extern mscorlib metadata statement indicates that the Customer assembly makes use of, and is therefore dependent on, the standard assembly mscorlib.dll, which is required by all managed code. When an assembly makes a reference to another assembly, you will see an .assembly extern metadata statement. If you open AcmeGui in ILDASM and look at the manifest, you will see dependencies on the Customer and Hotel assemblies as well as the System.Drawing assembly.
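The same dependency information that ILDASM displays as .assembly extern entries is also available through reflection. The sketch below is illustrative only: ListReferences is a hypothetical helper, the assembly that defines System.Linq.Enumerable merely stands in for Customer.dll, and the top-level statements assume a C# 9 or later compiler.

```csharp
using System;
using System.Reflection;

// Hypothetical helper: returns one line per referenced assembly, similar in
// spirit to the ".assembly extern" entries in a manifest.
string[] ListReferences(Assembly assembly)
{
    AssemblyName[] references = assembly.GetReferencedAssemblies();
    string[] lines = new string[references.Length];
    for (int i = 0; i < references.Length; i++)
    {
        // Unsigned references carry no public key token.
        byte[] token = references[i].GetPublicKeyToken() ?? new byte[0];
        lines[i] = references[i].Name
            + " ver=" + references[i].Version
            + " publickeytoken=" + BitConverter.ToString(token);
    }
    return lines;
}

// A framework assembly is used as a stand-in; the names and versions printed
// depend on the installed runtime.
foreach (string line in ListReferences(typeof(System.Linq.Enumerable).Assembly))
{
    Console.WriteLine(line);
}
```

Running the same helper against the AcmeGui executable would be expected to list Customer, Hotel, and the system assemblies it references.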

 .assembly extern Customer
 {
   .ver 1:0:592:25677
 }
 .assembly extern Hotel
 {
   .ver 1:0:592:25677
 }
 .assembly extern System.Drawing
 {
   .publickeytoken = (B0 3F 5F 7F 11 D5 0A 3A)
   .ver 1:0:2411:0
 }

The System.Drawing assembly is a shared assembly, which can be seen in the \WINNT\Assembly directory using Windows Explorer. Mscorlib, although it is also a shared assembly, is not deployed in the assembly cache. Microsoft made a single exception here: because mscorlib is so closely tied to the CLR engine (mscorwks ^[4]), it is installed in the appropriate install directory (\WINNT\Microsoft.NET\Framework) for the current .NET version.

^[4] Or mscorsvr.dll for servers.

In the System.Drawing shared assembly, the .publickeytoken = (B0 3F 5F 7F 11 D5 0A 3A) metadata statement provides a public key token, which is the lowest 8 bytes of a hash of the public key that matches the corresponding private key owned by the System.Drawing assembly's author. This public key token cannot actually be used directly to authenticate the identity of the author of System.Drawing. However, the original public key specified in the System.Drawing manifest can be used to mathematically verify that the matching private key was actually used to digitally sign the System.Drawing assembly. Since Microsoft authored System.Drawing.dll, the public key token seen above is Microsoft specific. Of course, the matching private key is a closely guarded corporate secret, and it is believed by most security experts that such a private key is, in practice, virtually impossible to determine from the public key. However, there is no guarantee that some mathematical genius will not find a back door someday!

The .publickeytoken declaration

The .publickeytoken declaration provides only the least significant 8 bytes of the SHA1 hash of the producer's public key (which is 128 bytes), which saves some space but can still be used to verify at runtime that the assembly being loaded comes from the same publisher as the one you compiled against. Alternatively, the .publickey declaration could be used, which provides the full public key. This would take up more space but makes it harder for villains to find a private key that matches the full public key.
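To make the relationship between a public key and its token concrete, the following sketch derives a token from a full public key and compares it with the token the runtime reports. It rests on assumptions not stated in the text: the token bytes are taken to be the last 8 bytes of the SHA1 hash in reverse order, ComputePublicKeyToken is a hypothetical helper, and the key comes from whichever assembly defines System.Object. The top-level statements assume a C# 9 or later compiler.

```csharp
using System;
using System.Reflection;
using System.Security.Cryptography;

// Hypothetical helper: derives an 8-byte token from a full public key.
// Assumption: the token is the last 8 bytes of the SHA1 hash, reversed.
byte[] ComputePublicKeyToken(byte[] keyBlob)
{
    byte[] hash;
    using (SHA1 sha1 = SHA1.Create())
    {
        hash = sha1.ComputeHash(keyBlob);
    }

    byte[] token = new byte[8];
    for (int i = 0; i < token.Length; i++)
    {
        token[i] = hash[hash.Length - 1 - i];
    }
    return token;
}

// Compare the derived token with the one the runtime reports for a signed
// framework assembly.
AssemblyName coreName = typeof(object).Assembly.GetName();
byte[] coreKey = coreName.GetPublicKey() ?? new byte[0];
if (coreKey.Length > 0)
{
    Console.WriteLine("computed: " + BitConverter.ToString(ComputePublicKeyToken(coreKey)));
    Console.WriteLine("reported: " + BitConverter.ToString(coreName.GetPublicKeyToken()));
}
```

Because the token is only a truncated hash, it identifies a key compactly but cannot be used to verify a signature; that requires the full public key.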

As we shall see shortly, the .publickeytoken statement is present in the client assembly's manifest only if the referenced assembly has been digitally signed, and all assemblies intended for shared deployment must be digitally signed. Microsoft has digitally signed the standard .NET assemblies, such as mscorlib.dll and System.Windows.Forms.dll, with private keys belonging to Microsoft. This is why the public key token for many of those shared assemblies, seen in the \WINNT\Assembly directory using Windows Explorer, has the same value repeated. Assemblies authored and digitally signed by other vendors are signed with their own distinct private keys, and they will therefore result in a different public key token in their client assembly's manifests. Later, we will look at how you can create your own private and public key pair and digitally sign your own assemblies for deployment into the global assembly cache.

Nonetheless, while unique, none of these digital keys can identify who the author of a particular module is. A developer of assemblies can use the signcode utility to add a digital certificate that will identify the publisher of the assembly.

The .ver 1:0:2411:0 metadata statement indicates the version of the System.Drawing assembly. While these numbers have no intrinsic meaning, the Microsoft suggested format of this version specification is Major:Minor:Build:Revision. Over time, as new versions of this assembly are released, existing clients that were built to use this version will continue using this version, assuming the conventional meaning of major and minor values. Newer client programs will, of course, be able to access newer versions of this assembly as they become available. The old and new versions can be deployed side-by-side in the global assembly cache and be simultaneously available to old and new client programs.

Note that the version 1:0:2411:0 appearing in the client manifest is the version of the referenced System.Drawing assembly and is unrelated to the "1.0.*" version attribute specified in the AssemblyInfo.cs file in the AcmeGui source code. We will soon look more closely at the four fields that make up a version number, and how assembly versioning works with the suggested format.

Now let us consider the information about the component itself in its manifest.

ILDASM shows the assembly metadata in the Customer manifest:

 .assembly Customer
 {
   .custom instance void
     [mscorlib]System.Reflection.AssemblyKeyNameAttribute::.ctor(string) = ( 01 00 00 00 00 )
   ...
   // --- The following custom attribute is added automatically, do not uncomment -------
   //  .custom instance void
   //    [mscorlib]System.Diagnostics.DebuggableAttribute::.ctor(bool,
   //                                                            bool) = ( 01 00 01 01 00 00 )
   .hash algorithm 0x00008004
   .ver 1:0:592:25677
 }

The .assembly Directive

The .assembly directive declares the manifest and specifies to which assembly the current module belongs. In this example, the .assembly directive specifies the name of the assembly to be Customer. It is this name (combined with the version number and optionally a public key) rather than the name of the DLL or EXE file that is used at runtime to resolve the identity of the assembly. Also note that if the assembly is signed, you will see the .publickey defined within the .assembly directive. It also indicates what custom attributes have been added to the metadata.

The .assembly Customer metadata statement indicates that the assembly name is Customer. Note that this is not the name of a component class within the assembly, but rather the assembly itself. This assembly is not digitally signed, and therefore it does not contain a public key.

In multifile assemblies (discussed in a later section) the manifest stores a hash of each file. The .hash algorithm 0x00008004 metadata statement indicates that SHA1 is the hash algorithm used to produce this hash-code value. Many hash-code algorithms exist. Initially, however, only MD5 (0x00008003) and SHA1 (0x00008004) are supported by .NET.
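The numeric identifiers in a .hash algorithm statement correspond to members of the AssemblyHashAlgorithm enumeration in the System.Configuration.Assemblies namespace. As a small sketch (the helper name DescribeHashAlgorithm is hypothetical, and the top-level statements assume a C# 9 or later compiler), the identifiers can be mapped back to names like this:

```csharp
using System;
using System.Configuration.Assemblies;

// Hypothetical helper: maps the numeric ID from a ".hash algorithm" line to
// the name of the matching AssemblyHashAlgorithm member.
string DescribeHashAlgorithm(int id)
{
    return ((AssemblyHashAlgorithm)id).ToString();
}

Console.WriteLine(DescribeHashAlgorithm(0x00008004)); // SHA1
Console.WriteLine(DescribeHashAlgorithm(0x00008003)); // MD5
```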

Hash Algorithms

A hash algorithm is a mathematical function that takes the original data of arbitrary size as input and generates a hash code, also known as a message digest, which is a fixed-size binary output. An effective hash function is a one-way function that is highly collision free, with a result that is relatively small and fixed in size. Ideally, a hash function is efficient to calculate as well. A one-way function is a function that has no practical inverse, so that you cannot effectively reproduce the original data from the hash-code value. ^[5] The phrase "highly collision free" means that the probability that two distinct original input data samples generate the same hash code is very small, and that it is computationally infeasible to construct two distinct input data samples that result in the same hash-code value. The well-known MD5 and SHA1 hash algorithms are considered to be excellent choices for use in digital signing, and they are both supported by .NET.
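The fixed-size and collision-resistance properties are easy to observe. The sketch below hashes a few strings with SHA1; Sha1Hex is a hypothetical helper, the inputs are arbitrary, and the top-level statements assume a C# 9 or later compiler.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: returns the SHA1 digest of a string as lowercase hex.
string Sha1Hex(string text)
{
    using (SHA1 sha1 = SHA1.Create())
    {
        byte[] digest = sha1.ComputeHash(Encoding.UTF8.GetBytes(text));
        return BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
    }
}

// Inputs that differ in one character produce unrelated digests.
Console.WriteLine(Sha1Hex("abc")); // a9993e364706816aba3e25717850c26c9cd0d89d
Console.WriteLine(Sha1Hex("abd"));

// The digest is always 20 bytes (40 hex characters), whatever the input size.
Console.WriteLine(Sha1Hex(new string('x', 100000)).Length); // 40
```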

^[5] One-way encryption codes are used to store passwords in a passwords database. When you log in, the password you enter is encrypted and compared with what is stored in the database. If they match, you can log in. The password cannot be reconstructed from the encrypted value stored in the passwords database.

Versioning an Assembly

An assembly manifest contains the version of the assembly as well as the version of each of the assemblies that the assembly depends on. The version number of an assembly is composed of four numerical fields: Major, Minor, Build, and Revision. There are no semantics assigned to any of these fields by the CLR. Microsoft does suggest the following convention:

Major: a change to this field indicates major incompatible changes.
Minor: a change to this field indicates minor, but incompatible changes.
Build number: a change to this field indicates a new backward-compatible release.
Revision: a change to this field indicates a backward-compatible emergency bug fix.

None of this is enforced by the CLR. You enforce this convention, or any other convention you choose, by testing assemblies for compatibility and specifying the version policy in a configuration file that we will discuss.

In the metadata for the Customer assembly, the .ver 1:0:592:25677 gives us the assembly's version: Major Version 1, Minor Version 0, Build Number 592, Revision 25677.
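The System.Version class models the same four fields, which makes it convenient for reading and comparing version numbers in code. A brief sketch follows, using the version from the Customer manifest; the helper name DescribeVersion is hypothetical, and the top-level statements assume a C# 9 or later compiler.

```csharp
using System;

// Hypothetical helper: spells out the four fields of a version number.
string DescribeVersion(Version version)
{
    return "Major " + version.Major + ", Minor " + version.Minor
        + ", Build " + version.Build + ", Revision " + version.Revision;
}

// The manifest separates fields with colons; Version.Parse expects dots.
Version customerVersion = Version.Parse("1:0:592:25677".Replace(':', '.'));
Console.WriteLine(DescribeVersion(customerVersion));
// Major 1, Minor 0, Build 592, Revision 25677

// Versions compare field by field, from major down to revision.
Console.WriteLine(customerVersion < new Version(1, 1, 0, 0)); // True
```

The comparison is purely numeric; as the text notes, the CLR attaches no compatibility meaning to the individual fields.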

The version information for the manifest can be defined in the source code using the assembly-level attribute [assembly: AssemblyVersion]. This attribute (as with other global attributes) can appear in a source file after the using directives but before any namespace or class definitions. The AssemblyVersionAttribute class is defined in the System.Reflection namespace. If this attribute is not used, a default version number of 0.0.0.0 is listed in the assembly manifest, which is generally not desirable.

In a project created with the Visual Studio.NET project wizard, the source file AssemblyInfo.cs is automatically generated, with a version of 1.0.*, producing a major version of 1, a minor version of 0, and automatically generated build and revision values. If you change the AssemblyVersionAttribute to, for example, "1.1.0.0", as shown below, the version number displayed in the manifest will be modified accordingly to 1:1:0:0.

 //AssemblyInfo.cs
 ...
 [assembly: AssemblyVersion("1.1.0.0")]

If you specify any version number at all, you must at a minimum specify the major number. If you specify only the major number, the remaining values will default to zero. If you also specify the minor value, you can omit the remaining fields, which will then default to zero, or you can specify an asterisk, which will provide automatically generated values. The asterisk will cause the build value to equal the number of days since January 1, 2000, and the revision value will be set to the number of seconds since midnight, divided by 2. If you specify major, minor, and build values, and specify an asterisk for the revision value, then only the revision is defaulted to the number of seconds since midnight, divided by 2. If all four fields are explicitly specified, then all four values will be reflected in the manifest. The following examples show valid version specifications.

 Specified in source    Result in manifest
 None                   0:0:0:0
 1                      1:0:0:0
 1.1                    1:1:0:0
 1.1.*                  1:1:464:27461
 1.1.43                 1:1:43:0
 1.1.43.*               1:1:43:29832
 1.1.43.52              1:1:43:52

If you use the asterisk, then the revision and possibly the build number will automatically change every time you rebuild the component. You must make an explicit change to the major and minor numbers if you wish to have their values changed.
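The arithmetic behind the asterisk can be reproduced directly. The sketch below applies the rule described above (days since January 1, 2000 for the build, seconds since midnight divided by 2 for the revision) to a given build time. ExpandWildcard is a hypothetical helper, and the timestamp is an assumption chosen so that the result matches the 1.1.* row of the examples; the top-level statements assume a C# 9 or later compiler.

```csharp
using System;

// Hypothetical helper: computes the version that "major.minor.*" would
// produce for a build performed at the given local time.
Version ExpandWildcard(int major, int minor, DateTime buildTime)
{
    // Build: whole days elapsed since January 1, 2000.
    int build = (buildTime.Date - new DateTime(2000, 1, 1)).Days;

    // Revision: seconds since midnight, divided by 2.
    int revision = (int)(buildTime.TimeOfDay.TotalSeconds / 2);

    return new Version(major, minor, build, revision);
}

// April 9, 2001 at 15:15:22 is an assumed build time, not one from the text.
Console.WriteLine(ExpandWildcard(1, 1, new DateTime(2001, 4, 9, 15, 15, 22)));
// 1.1.464.27461
```

Dividing by 2 keeps the revision below 43,200, which fits comfortably within the 16-bit range of a version field.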

Strong Names

Before we can discuss version policy, we have to introduce the idea of a strong name. A strong name is guaranteed to be globally unique for any version of any assembly. Strong names are generated by digitally signing the assembly. This ensures that the strong name not only is unique, but can be generated only by an individual that owns a secret private key.

A strong name is made up of a simple text name, a public key, and a hash code that has been encrypted with the matching private key. The hash code is known as a message digest and the encrypted hash code is known as a digital signature. The digital signature effectively identifies the assembly's author and ensures that the assembly has not been altered. Two assemblies that have the same strong name and version are considered to be identical assemblies. Two assemblies with different strong names are considered to be different. A strong name is also known as a cryptographically strong name, since, unlike a simple text name, a strong name is guaranteed to uniquely identify the assembly based on its contents and its author's private key. A strong name has the following useful properties:

A strong name guarantees uniqueness based on encryption technology.
A strong name establishes a unique namespace based on the use of a private key. ^[6]
A strong name prevents unauthorized personnel from versioning the assembly.
A strong name allows the CLR to find the right version of a shared assembly.

^[6] Do not confuse this namespace with the one used by the compiler to disambiguate class names.

Digital Signatures

Digital signatures are based on public key cryptographic techniques. In the world of cryptography, the two main cryptographic techniques are symmetric ciphers (shared key) and asymmetric ciphers (public key). Symmetric ciphers use one shared secret key for encryption as well as decryption. DES, Triple DES, and RC2 are examples of symmetric-cipher algorithms. Symmetric ciphers can be very efficient and powerful for message privacy between two trusted cooperating individuals, but they are generally unsuitable for digital signatures. Digital signatures are not used for privacy but for identification and authentication. If you shared your symmetric key with everyone who would potentially want to identify or authenticate you, you would inevitably share it with people who would want to impersonate you.

Asymmetric ciphers are used in digital signatures. Asymmetric ciphers, also known as public key ciphers, make use of a public/private key pair. The paired keys are mathematically related and are generated together. It is, however, exceedingly difficult to calculate one key from the other. The public key is typically exposed to everyone who would like to authenticate its owner. On the other hand, the owners keep the matching private signing key secret, so that no one can impersonate them. RSA is an example of a public key cipher system.

Public key cryptography is based on a very interesting mathematical scheme that allows plain text to be encrypted with one key and decrypted only with the matching key. For example, if a public key is used to encrypt the original data (known as plain text), then only the matching private key is capable of decrypting it. Not even the encrypting key can decrypt it! This scenario is useful for sending secret messages to only the individual who knows the private key.
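The secrecy scenario can be sketched with the RSA class from System.Security.Cryptography. This is an illustration under stated assumptions: the key pair is freshly generated rather than loaded from a file, RoundTrip is a hypothetical helper, PKCS #1 padding is an arbitrary choice, and the top-level statements assume a C# 9 or later compiler.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: encrypts with only the public key, then decrypts
// with the matching private key, and returns the recovered text.
string RoundTrip(string plainText)
{
    using (RSA owner = RSA.Create())   // holds the full key pair
    using (RSA sender = RSA.Create())  // will hold the public key only
    {
        // false = export the public parameters without the private key.
        sender.ImportParameters(owner.ExportParameters(false));

        byte[] cipherText = sender.Encrypt(
            Encoding.UTF8.GetBytes(plainText), RSAEncryptionPadding.Pkcs1);

        // Only the owner of the private key can reverse the operation.
        byte[] recovered = owner.Decrypt(cipherText, RSAEncryptionPadding.Pkcs1);
        return Encoding.UTF8.GetString(recovered);
    }
}

Console.WriteLine(RoundTrip("for the key owner's eyes only"));
```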

The opposite scenario is where the individual who owns the private key uses that private key to encrypt the plain text. The resulting cipher text is by no means a secret, since everyone who is interested can obtain the public key to decrypt it. This scenario is useless for secrecy but very effective for authentication purposes. To improve performance, instead of encrypting the original data, a highly characteristic hash code is encrypted instead.

If you use the matching public key to decrypt the encrypted hash code, you can recalculate the hash code on the original data and compare the two values. If they match, you can be certain that the owner of the private key was the digital signer. Of course, the owner of the private key has to make sure to keep the private key secret, otherwise you cannot prove that the data has not been tampered with from the time when it was digitally signed. Figure 7-4 shows how a digital signature works.

Figure 7-4. How a digital signature works.

graphics/07fig04.gif

SHA1 and RSA

To sign the assembly, the producer calculates a SHA1 hash of the assembly (with the bytes reserved for the signature preset to zero) and then encrypts the hash value with the private key using RSA encryption. The public key and the encrypted hash are then stored in the assembly's metadata.
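The hash-then-sign scheme can be tried out with the RSA class, where signing requires the private key and verification needs only the public key. The sketch below is not how the CLR or Sn.exe is implemented. It signs an arbitrary string rather than an assembly, VerifyAfterSigning is a hypothetical helper, and it uses SHA-256 in place of SHA1, an assumption made so that it also runs on platforms that restrict SHA1 signatures. The top-level statements assume a C# 9 or later compiler.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: signs one piece of content with a fresh private key,
// then checks the signature against the received content using only the
// public key. Returns true when the received content is unaltered.
bool VerifyAfterSigning(string signedContent, string receivedContent)
{
    using (RSA signer = RSA.Create())    // holds the private key
    using (RSA verifier = RSA.Create())  // will hold the public key only
    {
        // SignData hashes the content and signs the hash with the private key.
        byte[] signature = signer.SignData(
            Encoding.UTF8.GetBytes(signedContent),
            HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);

        verifier.ImportParameters(signer.ExportParameters(false));

        // VerifyData rehashes the received content and checks it against
        // the signature using the public key.
        return verifier.VerifyData(
            Encoding.UTF8.GetBytes(receivedContent), signature,
            HashAlgorithmName.SHA256, RSASignaturePadding.Pkcs1);
    }
}

Console.WriteLine(VerifyAfterSigning("assembly contents", "assembly contents")); // True
Console.WriteLine(VerifyAfterSigning("assembly contents", "tampered contents")); // False
```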

Digitally Signing an Assembly

The process of digitally signing an assembly involves generating a public/private key pair, calculating a hash code on the assembly, encrypting the hash code with the private key, and writing the encrypted hash code along with the public key into the assembly for all to see. The encrypted hash code and public key together comprise the entire digital signature. The digital signature is written into a reserved area within the assembly that is not included in the hash-code calculation. All these steps are performed with two simple tools: the Strong Name utility (Sn.exe) and the Assembly Linker (Al.exe). To build and digitally sign an assembly, the following steps are performed.

1. Develop and build the component.
2. Generate a public/private key pair.
3. Calculate a hash code on the contents of the assembly.
4. Encrypt the hash code using the private key.
5. Place the encrypted hash code into the manifest.
6. Place the public key into the manifest.

Step 1 is, of course, usually performed using Visual Studio.NET. Steps 2 through 6 are known as digital signing. Step 2 is accomplished using the Strong Name utility Sn.exe. Steps 3 through 6 are accomplished using either Visual Studio.NET or the Assembly Linking utility Al.exe (that's "A-el", not "A-one").

To illustrate this process we will develop a version of our Customer and Hotel assemblies that have strong names. They are located in the SignedCaseStudy directory. We generate key pairs for the assemblies using Sn.exe, known as the Strong Name utility. This tool generates a cryptographically strong name for the assembly. You generate a public/private key pair and place them into a file named KeyPair.snk as shown in the following command (which you can run from the source directory):

 sn -k KeyPair.snk

The resulting KeyPair.snk file is a binary file and is not intended to be human readable. If you are curious, you can write these keys into a comma-delimited text file with the following command, then view it using Notepad.exe. This is not a required step.

 sn -o KeyPair.snk KeyPair.txt

In the example you will find these files in the Customer and Hotel subdirectories.

The next step is to apply the private key to the assembly. For developing and testing it is convenient to do this at compilation time. When you release the assembly, however, you have to use the official private key of the company. For security reasons this key is probably known only to the corporate digital signing authority. The process of creating the strong name cannot be postponed until after the assembly is built, because the public key is part of the assembly's identity. Users of the assembly have to compile against the full identity of the assembly. Delay signing, which splits the process of assigning the strong name into two steps, is designed to solve this problem.

If you just want to apply the digital signature automatically at compile time without delay signing, you simply use the AssemblyKeyFileAttribute, which, in the example, is in the file AssemblyInfo.cs of the Customer project. The KeyPair.snk file generated previously with the Sn.exe tool is specified in the attribute. The file path has to be relative to the project output directory. Because the path contains backslashes, it is written as a verbatim string (with a leading @) so that C# does not treat them as escape sequences. Once the KeyPair.snk file has been added to the AssemblyKeyFileAttribute, the code must be recompiled.

 [assembly: AssemblyKeyFile(@".\Customer\KeyPair.snk")]

Delay signing requires a more complex procedure. When you build the assembly, the public key is supplied to the compiler so that it can be put into the PublicKey field in the assembly's manifest. Space is reserved in the file for the signature, but the signature is not generated. When the actual signature is generated, it is placed in the file with the -R option to the Strong Name utility (Sn.exe).

To indicate to the compiler that you want to use delay signing, you include AssemblyDelaySignAttribute in your source code. You also have to include the public key using the AssemblyKeyFileAttribute.

Assuming you have generated the public/private key pair as described previously, you then use the -p option of the Strong Name utility to obtain just the public key without giving out the still secret private key.

 sn -p KeyPair.snk PublicKey.snk

You then add the following two attributes to AssemblyInfo.cs:

 [assembly: AssemblyDelaySign(true)]
 [assembly: AssemblyKeyFile(@".\PublicKey.snk")]

The assembly still does not have a valid signature. You will not be able to install it into the global assembly cache. You can disable signature verification of a particular assembly by using the -Vr option on the Strong Name utility.

 sn -Vr Customer.dll

Before you ship the assembly you must supply the valid signature. You use the -R option on the Strong Name utility and supply the public/private key pair.

 sn -R Customer.dll KeyPair.snk

However you add the key, if you look at the manifest in ILDASM you will see that the .publickey entry has been added to the assembly's metadata.

The .publickey attribute represents the originator's public key that resides in the KeyPair.snk file. This is the public key that can be used to decrypt the digital signature to retrieve the original hash code. When the assembly is deployed into the global assembly cache, this decrypted hash code is compared with a fresh recalculation of the hash code from the actual assembly contents. This comparison is made to determine if the assembly is legitimate (i.e., identical to the original) or illegitimate (i.e., corrupt or tampered). Of course, when you use Sn.exe, it will produce a different key pair, so the public key, and the public key tokens shown below, will be different in your case.

If you use ILDASM to examine the manifest of the AcmeGui client program, you will see the following:

 .assembly extern Customer
 {
   .publickeytoken = (8B 0E 61 2D 60 BD E0 CA )
   .ver 1:0:0:0
 }
 .assembly extern Hotel
 {
   .publickeytoken = (CF 0B C2 2F 8E 2C 15 22 )
   .ver 1:0:0:0
 }

Now that Customer and Hotel have strong names, references to them have a public key token, which is a hash of the public key that matches the corresponding private key for the assembly. Note that we generated different keys for each assembly. Usually, each company will use the same key pair for all its public components.

Now that we have discussed strong names, we can discuss the two methods of deploying assemblies in .NET, and their associated default version policies. After this discussion we will show how the default policy can be overridden in a configuration file.
