2.7 Intermediate Language

VB is not compiled directly into machine code. It is first compiled to a CPU-independent language called Microsoft Intermediate Language (MSIL, or simply IL). You might think this compilation is a throwback to VB's early years as an interpreted language, but the situation is not so grim. The code is not interpreted; eventually, it is converted to machine code at runtime by a just-in-time (JIT) compiler. This happens during execution, as code is needed. Then it is cached as machine code until the process terminates.

The .NET Framework SDK ships with an IL disassembler called ILDASM, which allows you to view the IL produced by the VB compiler (or any .NET compiler, for that matter). This feature can be very useful if you want to see how something in the .NET class library was implemented or to determine what classes are available in a particular library. The hello.dll assembly can be examined by running the IL Disassembler (ildasm.exe) from the command line:

 C:\>ildasm hello.dll 

As shown in Figure 2-4, ILDASM presents a tree view that allows inspection of the manifest and of the namespaces, classes, and methods contained within the assembly. Example 2-3 contains the entire IL listing for hello.dll, which was produced by selecting File/Dump from the menu.

Figure 2-4. The ILDASM dialog
Example 2-3. The IL dump of hello.dll
 //  Microsoft (R) .NET Framework IL Disassembler.  Version 1.0.3705.0
//  Copyright (C) Microsoft Corporation 1998-2001. All rights reserved.
   
.assembly extern mscorlib
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89 )
  .ver 1:0:3300:0
}
.assembly extern Microsoft.VisualBasic
{
  .publickeytoken = (B0 3F 5F 7F 11 D5 0A 3A )
  .ver 7:0:3300:0
}
.assembly hello
{
  .hash algorithm 0x00008004
  .ver 0:0:0:0
}
.module hello.dll
// MVID: {8A2071A6-F906-43C1-B6DB-CA5058F54BC4}
.imagebase 0x00400000
.subsystem 0x00000002
.file alignment 512
.corflags 0x00000001
// Image base: 0x03090000
//
// ============== CLASS STRUCTURE DECLARATION ==================
//
.namespace Greeting
{
  .class public auto ansi Hello
         extends [mscorlib]System.Object
  {
  } // end of class Hello
   
} // end of namespace Greeting
   
// =============================================================
   
// =============== GLOBAL FIELDS AND METHODS ===================
   
// =============================================================
   
// =============== CLASS MEMBERS DECLARATION ===================
//   note that class flags, 'extends' and 'implements' clauses
//          are provided here for information only
   
.namespace Greeting
{
  .class public auto ansi Hello
         extends [mscorlib]System.Object
  {
    .method public specialname rtspecialname 
            instance void  .ctor( ) cil managed
    {
      // Code size       7 (0x7)
      .maxstack  8
      IL_0000:  ldarg.0
      IL_0001:  call       instance void [mscorlib]System.Object::.ctor( )
      IL_0006:  ret
    } // end of method Hello::.ctor
   
    .method public instance void  Write(string 'value') cil managed
    {
      // Code size       12 (0xc)
      .maxstack  8
      IL_0000:  ldstr      "Hello, {0}!"
      IL_0005:  ldarg.1
      IL_0006:  call       void [mscorlib]System.Console::WriteLine(string,
                                                                    object)
      IL_000b:  ret
    } // end of method Hello::Write
   
  } // end of class Hello
   
// =============================================================
   
} // end of namespace Greeting
   
//*********** DISASSEMBLY COMPLETE ***********************
// WARNING: Created Win32 resource file C:\hello.res 
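Reading the dump is easier next to the VB that produced it. The source for hello.dll is not reproduced in this section, so the following is only a plausible reconstruction inferred from the IL above: a Greeting namespace, a public Hello class, and a Write method that passes its argument to Console.WriteLine with a format string. Treat it as a sketch rather than the original file.

```vb
Imports System

Namespace Greeting

    Public Class Hello

        ' No constructor is written here. The compiler supplies a default
        ' one, which shows up in the dump as the .ctor method that calls
        ' System.Object::.ctor.

        ' Corresponds to Hello::Write in the dump: ldstr pushes the format
        ' string, ldarg.1 pushes the value argument, and call invokes
        ' Console.WriteLine(String, Object).
        Public Sub Write(ByVal value As String)
            Console.WriteLine("Hello, {0}!", value)
        End Sub

    End Class

End Namespace
```

Assuming the file is saved as hello.vb, compiling it with vbc /t:library hello.vb should produce a hello.dll whose disassembly closely resembles Example 2-3, although version numbers and the MVID will differ.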

Do not listen to those who tell you that learning IL is a waste of time. It is simply not true. You should develop at least a basic understanding of the language because it gives you an edge over those who do not know it. After all, every single .NET language compiles to IL. Once you know IL, you will really know .NET.

Some things can be done in VB that can't be done in C#, and vice versa. Other things can be done only in IL. Arrays with arbitrary bounds, for instance, cannot be declared in any .NET language; IL, however, does support them.
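Although VB offers no declaration syntax for such an array, the class library can build one at runtime through Array.CreateInstance, which accepts a list of lengths and a list of lower bounds. The sketch below is only an illustration. The module name BoundsDemo and the helper CreateOffsetArray are hypothetical. The result is typed as System.Array, so elements are reached with GetValue and SetValue rather than with VB's index syntax.

```vb
Imports System

Module BoundsDemo

    ' Hypothetical helper: creates a one-dimensional Integer array of the
    ' given length whose first index is lowerBound instead of 0.
    Function CreateOffsetArray(ByVal length As Integer, ByVal lowerBound As Integer) As Array
        Dim lengths() As Integer = {length}
        Dim lowerBounds() As Integer = {lowerBound}
        Return Array.CreateInstance(GetType(Integer), lengths, lowerBounds)
    End Function

    Sub Main()
        ' Five elements, indexed 2000 through 2004.
        Dim years As Array = CreateOffsetArray(5, 2000)

        Dim i As Integer
        For i = years.GetLowerBound(0) To years.GetUpperBound(0)
            years.SetValue(i * 2, i)   ' store i * 2 at index i
        Next

        Console.WriteLine("Bounds: {0} to {1}", years.GetLowerBound(0), years.GetUpperBound(0))
        Console.WriteLine("years(2003) = {0}", years.GetValue(2003))
    End Sub

End Module
```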

ILDASM

To get a better idea of the functionality provided by System and Microsoft.VisualBasic, examine them with ILDASM. Remember, System is contained primarily in mscorlib.dll, but parts of it reside in System.dll. Microsoft.VisualBasic is located in Microsoft.VisualBasic.dll.

All of these DLLs are located in the .NET Framework directory, which is usually found here: <%windir%>\Microsoft.NET\Framework\(version number)

The following registry script adds an "Open with ILDASM" option to the context menu associated with EXEs and DLLs. This menu is available by right-clicking on these types of files in Explorer. Save the script to a file called ildasm.reg and double-click the file to execute it. If you type in this listing, everything between square brackets [ ] needs to be on the same line.

Before running this script, make sure that the bin directory for the .NET SDK is in your path. The directory should be similar to C:\Program Files\Microsoft Visual Studio .NET\FrameworkSDK\Bin. Adding the bin directory to your path allows you to run all .NET Framework SDK utilities from the command line without having to type the full path. Otherwise, you have to modify the script to include the full path to ildasm.exe:

 REGEDIT4
   
[HKEY_CLASSES_ROOT\.dll]
@="dllfile"
   
[HKEY_CLASSES_ROOT\dllfile\shell\Open With ILDASM\command]
@="ildasm \"%L\""
   
[HKEY_CLASSES_ROOT\.exe]
@="exefile"
   
[HKEY_CLASSES_ROOT\exefile\shell\Open With ILDASM\command]
@="ildasm \"%L\"" 

One way to start learning the language is by disassembling your own programs. Once you see how an assembly is laid out (which we'll go over momentarily), understanding it will be much easier. The .NET Framework SDK ships with two documents: the MSIL Instruction Set specification and the IL Assembly Language Programmer's Reference. Peek at both every once in a while and learn a command here and there. Before you know it, listings like Example 2-3 will become quite readable.

The .NET Framework also supplies an IL assembler called ILASM ( ilasm.exe ). Assuming that Example 2-3 is saved to a file named hello.il , you could compile it back to a DLL like this:

 C:\>ilasm /DLL hello.il 

2.7.1 Assembly Internals

Let's examine the individual elements of Example 2-3 to get a better idea of how an assembly is put together. The first thing to look at is the manifest, which starts at the top of the listing and continues to the beginning of the Greeting namespace block.

The listing begins with two references to the external assemblies that hello.dll needs to run properly, mscorlib.dll and Microsoft.VisualBasic.dll:

 .assembly extern mscorlib
{
  .publickeytoken = (B7 7A 5C 56 19 34 E0 89 )
  .ver 1:0:3300:0
}
.assembly extern Microsoft.VisualBasic
{
  .publickeytoken = (B0 3F 5F 7F 11 D5 0A 3A )
  .ver 7:0:3300:0
}

The VB compiler references these two assemblies automatically; there is nothing you can do to change this fact. Each reference contains a publickeytoken, which holds the low 8 bytes of the SHA1 hash of the originator's public key. The CLR uses this public key token to verify that an external assembly is valid. You can find the public key token of your own assembly by using the .NET Framework Strong Name tool (sn.exe) from the command line:

 C:\>sn -t hello.dll
Microsoft (R) .NET Framework Strong Name Utility  Version 1.0.2914.16
Copyright (C) Microsoft Corp. 1998-2001. All rights reserved.
   
Public key token is f45b0326d39e29a9
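The derivation of a public key token can also be checked in code. The following is a minimal sketch, assuming that "the low 8 bytes" means the last 8 bytes of the SHA1 hash written in reverse order, which is how the token is usually described. The module name TokenDemo and the helpers ComputeToken and ToHex are hypothetical. AssemblyName.GetPublicKey and AssemblyName.GetPublicKeyToken are part of the class library. The sketch inspects the assembly that defines System.Object because it is always loaded and always carries a public key.

```vb
Imports System
Imports System.Reflection
Imports System.Security.Cryptography
Imports System.Text

Module TokenDemo

    ' Hypothetical helper: derives a token from a full public key by hashing
    ' it with SHA1 and taking the last 8 bytes of the hash in reverse order.
    Function ComputeToken(ByVal publicKey As Byte()) As Byte()
        Dim sha As SHA1 = SHA1.Create()
        Dim hash As Byte() = sha.ComputeHash(publicKey)

        Dim token(7) As Byte   ' 8 elements, indexes 0 through 7
        Dim i As Integer
        For i = 0 To 7
            token(i) = hash(hash.Length - 1 - i)
        Next
        Return token
    End Function

    ' Hypothetical helper: formats bytes as lowercase hex, as sn.exe prints them.
    Function ToHex(ByVal bytes As Byte()) As String
        Dim sb As New StringBuilder()
        Dim b As Byte
        For Each b In bytes
            sb.Append(b.ToString("x2"))
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        Dim name As AssemblyName = GetType(Object).Assembly.GetName()
        Console.WriteLine("Reported: " & ToHex(name.GetPublicKeyToken()))
        Console.WriteLine("Computed: " & ToHex(ComputeToken(name.GetPublicKey())))
    End Sub

End Module
```

If the assumption about byte order holds, both lines print the same 16 hex digits.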

In addition to a public key token, all references must have a version number, which is built using four 32-bit integers.

The assembly definition for hello.dll follows the external assembly references:

 .assembly hello
{
  .hash algorithm 0x00008004
  .ver 0:0:0:0
} 

It contains a version number, too, but as you can see, it has not been defined, so it consists of four zeros. You also see the .hash algorithm key here; its value, 0x00008004, denotes SHA1, the algorithm used to generate cryptographic hashes of the assembly's files.

The rest of the manifest consists of the following:

 .module hello.dll
// MVID: {8A2071A6-F906-43C1-B6DB-CA5058F54BC4}
.imagebase 0x00400000
.subsystem 0x00000002
.file alignment 512
.corflags 0x00000001
// Image base: 0x03090000

Modules (in this context, the EXEs or DLLs) are not referred to by filename, but are referenced logically by the runtime. Here, the module key is the same as the filename, but it doesn't necessarily have to be that way. If you examine the manifest for mscorlib.dll, for instance, you will see that the actual name of the module is CommonLanguageRuntimeLibrary.
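You can see the difference between the two names without opening ILDASM. The sketch below uses reflection to print both the filename and the logical name of a module. It assumes that the first module returned by GetModules is the one of interest. The module name ModuleNameDemo and the helper GetLogicalName are hypothetical. Assembly and Module are bracketed because VB also uses both words as keywords.

```vb
Imports System
Imports System.Reflection

Module ModuleNameDemo

    ' Hypothetical helper: returns the logical (scope) name recorded for
    ' the first module of an assembly.
    Function GetLogicalName(ByVal asm As [Assembly]) As String
        Dim modules() As [Module] = asm.GetModules()
        Return modules(0).ScopeName
    End Function

    Sub Main()
        ' The assembly that defines System.Object.
        Dim asm As [Assembly] = GetType(Object).Assembly
        Dim first As [Module] = asm.GetModules()(0)

        Console.WriteLine("File name:    " & first.Name)
        Console.WriteLine("Logical name: " & GetLogicalName(asm))
    End Sub

End Module
```

On the .NET Framework the two lines should differ for mscorlib.dll, as described above. Other runtimes may report other names.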

The imagebase entry contains the preferred address of the image (EXE or DLL) when it is loaded into memory. File alignment is the alignment of the raw data in the image. Neither key is metadata-related. Executables in .NET still use the Windows PE format, and these settings come from the PE header of the file.

The subsystem key contains one of two values: 2 or 3. A 2 means that the program runs in the Windows GUI subsystem, the setting for a program that has a graphical user interface. If the value is a 3, the program is console-based.
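Because these values live in the PE header, they can be read straight from the file. The following is a rough sketch, assuming the standard PE layout: a 4-byte offset to the PE signature at position 0x3C, a 4-byte signature, a 20-byte COFF header, and then the optional header, with the file alignment at offset 36 and the subsystem at offset 68. The module and function names are hypothetical, and the code does no validation, so it should only be pointed at a real EXE or DLL.

```vb
Imports System
Imports System.IO

Module PeHeaderDemo

    ' Moves the reader to a field inside the PE optional header.
    ' fieldOffset is counted from the start of the optional header.
    Private Sub SeekToOptionalHeaderField(ByVal reader As BinaryReader, ByVal fieldOffset As Integer)
        reader.BaseStream.Seek(&H3C, SeekOrigin.Begin)
        Dim peOffset As Integer = reader.ReadInt32()
        ' Skip the 4-byte "PE" signature and the 20-byte COFF header.
        reader.BaseStream.Seek(peOffset + 4 + 20 + fieldOffset, SeekOrigin.Begin)
    End Sub

    ' Returns the subsystem value (2 for GUI, 3 for console).
    Function ReadSubsystem(ByVal fileName As String) As Integer
        Dim fs As New FileStream(fileName, FileMode.Open, FileAccess.Read)
        Try
            Dim reader As New BinaryReader(fs)
            SeekToOptionalHeaderField(reader, 68)
            Return Convert.ToInt32(reader.ReadUInt16())
        Finally
            fs.Close()
        End Try
    End Function

    ' Returns the file alignment in bytes.
    Function ReadFileAlignment(ByVal fileName As String) As Long
        Dim fs As New FileStream(fileName, FileMode.Open, FileAccess.Read)
        Try
            Dim reader As New BinaryReader(fs)
            SeekToOptionalHeaderField(reader, 36)
            Return Convert.ToInt64(reader.ReadUInt32())
        Finally
            fs.Close()
        End Try
    End Function

    Sub Main(ByVal args() As String)
        If args.Length = 0 Then
            Console.WriteLine("Usage: pass the path to an EXE or DLL")
            Return
        End If
        Console.WriteLine("Subsystem:      {0}", ReadSubsystem(args(0)))
        Console.WriteLine("File alignment: {0}", ReadFileAlignment(args(0)))
    End Sub

End Module
```

Run against hello.dll, the values printed should agree with the .subsystem and .file alignment entries in the manifest.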

The corflags key holds the runtime flags stored in the CLR header of the image. The value shown here, 0x00000001, indicates that the image contains only IL code.

Rather than get into the actual code portion of hello.dll , look at Example 2-4, which contains a listing for a standard "Hello, world" application in IL. It is not as jumbled as the IL dump from Example 2-3, and it is pretty readable, even if you don't know the language.

Example 2-4. Simple "Hello, world" in IL
 .assembly extern mscorlib {}
.assembly HelloWorld {}
.class Hello {
  .method static void Main( ) {
    .entrypoint
    ldstr      "Hello, World!"
    call       void [mscorlib]System.Console::WriteLine(string)
    ret
  }
} 

Save it to a file named hello-world.il and compile it:

 C:\>ilasm hello-world.il 