ILDASM and Microsoft Intermediate Language | Debugging Applications for Microsoft® .NET and Microsoft Windows® (Pro-Developer)

When I first started playing with .NET several years ago, I wrote the usual "Hello World!" program and immediately wanted to see how it all worked under the covers. I had quite the shock when I realized that .NET was essentially a whole new development environment! When learning a new environment, I like to get down to the simplest operation and start working my way up so that I can see how it all fits together.

For example, when I was making my transition from MS-DOS to Microsoft Windows 3.0 (wow, am I getting old!), whenever I got a little confused about what was going on, I took a peek at the assembly language the CPU was executing so that I could get a clue. The beautiful thing about assembly language (also known as unambiguous mode) is that it never lies. I was able to continue using this technique until I started moving to .NET, at which point my world became a little topsy-turvy. I lost my assembly-language crutch! I could look at the Intel assembly language in the debuggers, but it didn't help much. I was seeing quite a bit more assembly language that called through allocated addresses and other complicated techniques—the one-to-one mapping present in Microsoft C/C++ Win32 development no longer existed.

However, immediately after writing that first "Hello World!" program, I found the coolest feature in .NET: the Microsoft Intermediate Language Disassembler (ILDASM). Armed with a killer disassembler, I felt I could start tackling this elephant-sized mound of stuff known as .NET a single bite at a time. ILDASM allows you to see the pseudo-assembly language for .NET, called Microsoft intermediate language (MSIL). You or I might never write anything in MSIL, but knowing our way around the assembly language for an environment is the difference between merely using that environment and really learning it. Additionally, although the .NET documentation is excellent, nothing beats seeing exactly how an application is implemented.

Before we jump into ILDASM usage and MSIL, I want to address one common issue that always comes up with .NET and is a source of confusion. Many people have told me that they won't really consider .NET because it's so easy to reverse engineer, or decompile, so there's no protection for their intellectual property. That's a completely accurate assessment of .NET. In exchange for the power of true distributed objects, garbage collection, and ease of development, a .NET binary has to be self-describing. However, the argument is a complete red herring.

The same complaints were made about Java when it first burst on the scene. Lots of folks were aghast that it was so easy to decompile. In fact, a third-party company produced an excellent decompiler for Java binaries that was simply amazing. I remember talking with some of their customers who liked the results produced by this tool so much that they compiled their coworkers' code and decompiled it with the tool because the tool produced better and more readable code than their coworkers did! Even though all those people were extremely worried about protecting their intellectual property in the beginning, that worry sure didn't slow down Java's acceptance by enterprise developers.

For those of you doing Web applications or XML Web services, be aware that customers and users don't have physical access to the binaries and therefore you don't have to be concerned about reverse engineering. However, many of you are doing Windows Forms or console applications and might be a little worried. Starting with Visual Studio .NET 2003, Microsoft is distributing a "community edition" of PreEmptive Solutions' Dotfuscator. As far as I can tell, this version does nothing more than rename your classes and methods, which might be good enough. Be prepared to spend some time with Dotfuscator because its graphical user interface (GUI) is in need of some serious user interface research.

If you're concerned about intellectual property issues in your Windows Forms or console applications, Wise Owl (also known as fellow Wintellectual Brent Rector) has written an outstanding obfuscator, Demeanor for .NET. Demeanor will completely protect your intellectual property. I strongly recommend it because it will eliminate any worries you might have about deploying .NET applications in cases where someone might have physical access to the binaries. You can learn more about Demeanor for .NET by surfing to http://www.wiseowl.com.

Of course, if you are going to use an obfuscator on your .NET code, make sure you have completely debugged and tested the code before obfuscating. As of now, there's no way to match the obfuscated code back to your source, so you're left debugging at the x86 assembly language level.

Getting Started with ILDASM

ILDASM is located in your <Visual Studio .NET Installation Directory>\SDK\v1.1\Bin directory, which might not be in your PATH environment variable by default. If you execute the VSVARS32.BAT file from <Visual Studio .NET Installation Directory>\Common7\Tools, you'll get the appropriate .NET environment variables set up so that you can access any of the .NET utilities from the command line. When you start ILDASM and choose a file to disassemble, the initial view looks similar to Figure 6-9. What you're seeing is the metadata expansion for the module. Figure 6-9 shows all possible icons displayed for the types, but it's not very clear from the figure what all the icons are for and what their textual values are when you save the tree display to a file. I created Table 6-2 to make everything simple to understand.

Figure 6-9: Main ILDASM display

Table 6-2: ILDASM Tree Output Descriptions
Glyph	Text Output	Description
	[MOD] for module heading	Informational directives, class declarations, and manifest information
	[NSP]	Namespace
	[CLS]	Class
	[INT]	Interface
	[ENU]	Enumeration
	[VCL]	Value class
	[MET]	Instance method (private, public, or protected)
	[STM]	Static method
	[FLD]	Instance field (private, public, or protected); also assembly
	[STF]	Static field
	[EVT]	Event
	[PTY]	Property (get and/or set)

If you'd like to see more information and statistics about files you're opening, start ILDASM with the /ADV command line option. This turns on advanced display information and appends three new items to the View menu:

COR Header lets you view the file header information.
Statistics lets you see various statistics concerning size percentages and a breakdown of all metadata in the system.
MetaInfo contains a submenu that lets you select specific information to view. Choose the Show! submenu item (or press Ctrl+M) to see that information, which is displayed in a separate MetaInfo window. If you don't select any of the specific information, when you select Show!, you'll see a raw dump of the metadata.

If you have source code available for a particular module, you'll certainly want to turn on Show Source Lines on the View menu. Alternatively, you can specify the /SOURCE command line option. When you specify source line display, the disassembly shows the source lines as comments above the MSIL generated for them. To see all the command-line options, specify /? on the command line. I use a batch file that contains the following to start instances of ILDASM with the options I want:

ildasm /adv /source %1

To see the actual MSIL for a particular item, simply double-click on that item and another window will pop up. Depending on what you double-clicked, you'll see the disassembly, declaration information, or general information for the item. If the window looks like the one in Figure 6-10, you're ready to start learning MSIL. One nice little feature of ILDASM is that it supports drag-and-drop functionality, so you can easily jump between modules.

Figure 6-10: MSIL for a method

The final trick I want to share about ILDASM is how you can see your C#, J#, Managed Extensions for C++, or Visual Basic .NET code and the MSIL all at the same time. If you've looked at the debugger's Disassembly window when debugging a managed application, you've probably seen the .NET language code and only the Intel assembly language. The reason is that the MSIL is just-in-time (JIT) compiled, so you're only executing the native assembly language, never the MSIL. What makes ILDASM really interesting is that it achieves the Holy Grail of disassemblers: it's a true "round-trip" disassembler!

With a round-trip disassembler, you can disassemble a binary and immediately run it through an assembler to rebuild the application. Since .NET comes with ILASM, the Microsoft Intermediate Language Assembler, you've got everything you need to see your C#/J#/Managed Extensions for C++/Visual Basic .NET code and MSIL all at the same time. This view allows you to see how things fit together. Disassemble the file with the /SOURCE and /OUT= command-line options to ILDASM, specifying an output file name that ends in .IL. Compile the file with ILASM using the /DEBUG option. Now you'll step through the MSIL with Visual Studio .NET's debugger and see the corresponding C#/J#/Managed Extensions for C++/Visual Basic .NET code as comments. If you want to see it all, simply look at the Disassembly window and you'll see how the high-level language is compiled to MSIL, and how the MSIL is JIT compiled to Intel assembly language. Listing 6-2 shows a method disassembled with the original source embedded as comments from the ShowBPs program, included with this book's sample files.

Listing 6-2: Mixed source and MSIL

.method private instance void
        btnConditionalBreaks_Click(object sender,
                                   class [mscorlib]System.EventArgs e)
                                   cil managed
{
  // Code size       139 (0x8b)
  .maxstack  4
  .locals init ([0] int32 i,
           [1] int32 j,
           [2] string[] _Vb_t_array_0,
           [3] class [System.Windows.Forms]
                          System.Windows.Forms.TextBox _Vb_t_ref_0)
//000120:
//000121: Private Sub btnConditionalBreaks_Click _
//                     ( ByVal sender As System.Object, _
//                       ByVal e As System.EventArgs) _
//                             Handles btnConditionalBreaks.Click
  IL_0000:  nop
//000122:         Dim i As Integer = 0
  IL_0001:  ldc.i4.0
  IL_0002:  stloc.0
//000123:         Dim j As Integer = 0
  IL_0003:  ldc.i4.0
  IL_0004:  stloc.1
//000124:
//000125: ' Clearn the output edit control.
//000126: edtOutput.Clear()
  IL_0005:  ldarg.0
  IL_0006:  callvirt   instance class
                  [System.Windows.Forms]System.Windows.Forms.TextBox
                                   ShowBPs.ShowBPsForm::get_edtOutput()
  IL_000b:  callvirt   instance void
                [System.Windows.Forms]System.Windows.Forms.TextBoxBase::Clear()
  IL_0010:  nop
//000127:
//000128: ' Both are on one line to show how BPs can apply to part of a line.
//000129: For i = 1 To 5 : For j = 1 To 5
  IL_0011:  ldc.i4.1
  IL_0012:  stloc.0
  IL_0013:  ldc.i4.1
  IL_0014:  stloc.1
//000130: ' Do the output
//000131: edtOutput.Text += "i = " + i.ToString() + " j = " + _
//                j.ToString() + vbCrLf
  IL_0015:  ldarg.0
  IL_0016:  callvirt   instance class
                [System.Windows.Forms]System.Windows.Forms.TextBox
                         ShowBPs.ShowBPsForm::get_edtOutput()
  IL_001b:  stloc.3
  IL_001c:  ldloc.3
  IL_001d:  ldc.i4.6
  IL_001e:  newarr     [mscorlib]System.String
  IL_0023:  stloc.2
  IL_0024:  ldloc.2
  IL_0025:  ldc.i4.0
  IL_0026:  ldloc.3
  IL_0027:  callvirt   instance string
                 [System.Windows.Forms]System.Windows.Forms.TextBox::get_Text()
  IL_002c:  stelem.ref
  IL_002d:  nop
  IL_002e:  ldloc.2
  IL_002f:  ldc.i4.1
  IL_0030:  ldstr      "i = "
  IL_0035:  stelem.ref
  IL_0036:  nop
  IL_0037:  ldloc.2
  IL_0038:  ldc.i4.2
  IL_0039:  ldloca.s   i
  IL_003b:  call       instance string [mscorlib]System.Int32::ToString()
  IL_0040:  stelem.ref
  IL_0041:  nop
  IL_0042:  ldloc.2
  IL_0043:  ldc.i4.3
  IL_0044:  ldstr      " j = "
  IL_0049:  stelem.ref
  IL_004a:  nop
  IL_004b:  ldloc.2
  IL_004c:  ldc.i4.4
  IL_004d:  ldloca.s   j
  IL_004f:  call       instance string [mscorlib]System.Int32::ToString()
  IL_0054:  stelem.ref
  IL_0055:  nop
  IL_0056:  ldloc.2
  IL_0057:  ldc.i4.5
  IL_0058:  ldstr      "\r\n"
  IL_005d:  stelem.ref
  IL_005e:  nop
  IL_005f:  ldloc.2
  IL_0060:  call       string [mscorlib]System.String::Concat(string[])
  IL_0065:  callvirt   instance void
          [System.Windows.Forms]System.Windows.Forms.TextBox::set_Text(string)
  IL_006a:  nop
//000132: ' For the output to show up.
//000133: edtOutput.Update()
  IL_006b:  ldarg.0
  IL_006c:  callvirt   instance class
                 [System.Windows.Forms]System.Windows.Forms.TextBox
                         ShowBPs.ShowBPsForm::get_edtOutput()
  IL_0071:  callvirt   instance void
                 [System.Windows.Forms]System.Windows.Forms.Control::Update()
  IL_0076:  nop
//000134: Next j
  IL_0077:  nop
  IL_0078:  ldloc.1
  IL_0079:  ldc.i4.1
  IL_007a:  add.ovf
  IL_007b:  stloc.1
  IL_007c:  ldloc.1
  IL_007d:  ldc.i4.5
  IL_007e:  ble.s      IL_0015
//000135: Next i
  IL_0080:  nop
  IL_0081:  ldloc.0
  IL_0082:  ldc.i4.1
  IL_0083:  add.ovf
  IL_0084:  stloc.0
  IL_0085:  ldloc.0
  IL_0086:  ldc.i4.5
  IL_0087:  ble.s      IL_0013
//000136:     End Sub
  IL_0089:  nop
  IL_008a:  ret
} // end of method ShowBPsForm::btnConditionalBreaks_Click
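
The round-trip steps described before Listing 6-2 come down to two command lines. As a hedged sketch, the following Python helper only builds those command lines and prints them; it doesn't run anything. The function name round_trip_commands and the file name ShowBPs.exe are placeholders for this illustration, and the only options used are the /SOURCE, /OUT=, and /DEBUG options mentioned in the text.

```python
from pathlib import PurePath


def round_trip_commands(binary):
    """Build the ILDASM and ILASM command lines for a round trip.

    Returns two argument lists; nothing is executed here.
    """
    # The disassembly goes to a file with the same base name and an .il extension.
    il_file = str(PurePath(binary).with_suffix(".il"))
    disassemble = ["ildasm", "/source", f"/out={il_file}", binary]
    reassemble = ["ilasm", "/debug", il_file]
    return [disassemble, reassemble]


# ShowBPs.exe is a placeholder file name for this sketch.
for command in round_trip_commands("ShowBPs.exe"):
    print(" ".join(command))
```

Work on a copy of the binary, because the reassembled output may overwrite the original file.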

CLR Basics

Before you start grinding through MSIL instructions, I need to explain a little bit about how the CLR works. The CLR is essentially the CPU for MSIL instructions. Whereas traditional CPUs rely on registers and stacks to do everything, the CLR uses only a stack. This means that to add two numbers, the CLR loads both numbers onto the stack and calls an instruction to add them. The instruction removes the two numbers from the stack and puts the result on top of the stack. If you're like me, it sometimes helps to see the actual implementation. To see a system similar to the CLR that's small enough to digest, see Brian Kernighan and Rob Pike's book, The Unix Programming Environment (Prentice Hall, 1984). In it they implement a higher order calculator (hoc), a nontrivial C example of a stack-based machine. If you'd like to see a real CLR implementation, download the Shared Source Common Language Infrastructure (CLI)—that is, "Rotor"—which is the ECMA standard implementation of a cross-platform CLR. It's a ton of code, but you'll see how it all works. You can download the Shared Source CLI from http://msdn.microsoft.com/netframework.

The CLR evaluation stack can hold any type of value in the stack slots. Copying values from memory to the stack is referred to as loading, whereas copying items from the stack to memory is referred to as storing. Unlike the stack on an Intel CPU, the CLR evaluation stack doesn't hold the locals; the locals live in separate memory. Each stack is local to the method doing the work, and the CLR saves it across method invocations. Finally, the stack is also where method return values are placed. Now that I've covered just enough about how the CLR works, I'll move on to the instructions.
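
To make the load, operate, and store cycle concrete, here is a minimal Python sketch of an evaluation stack adding two numbers. It is only a model for illustration: the run function, the tuple-based instruction encoding, and the local_vars list are assumptions of this sketch, not a description of how the CLR is implemented.

```python
# A toy stack-based evaluator. The instruction names mirror MSIL, but the
# tuple encoding and the run() helper are hypothetical.

def run(program, num_locals=0):
    stack = []                      # the evaluation stack
    local_vars = [0] * num_locals   # locals live outside the stack
    for op, *args in program:
        if op == "ldc.i4":          # load a constant onto the stack
            stack.append(args[0])
        elif op == "ldloc":         # load: copy a local onto the stack
            stack.append(local_vars[args[0]])
        elif op == "stloc":         # store: pop the stack into a local
            local_vars[args[0]] = stack.pop()
        elif op == "add":           # pop two values, push their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack, local_vars


program = [
    ("ldc.i4", 2),   # push 2
    ("ldc.i4", 3),   # push 3
    ("add",),        # replace both with 5
    ("stloc", 0),    # pop 5 into local 0
]
stack, local_vars = run(program, num_locals=1)
print(stack, local_vars)  # [] [5]
```

Notice that add never names its operands. Everything it needs is already on the stack, which is why MSIL instructions are so short.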

MSIL, Locals, and Parameters

Before I jump into the heavy gyrations, I thought I'd start out with the simplest program possible, "Hello World!" written in MSIL. That way you can see a proper MSIL program in action so I can start pointing out the various items you'll see as part of ILDASM's output. Listing 6-3 shows the complete "Hello World!" program and is included with this book's sample files as HelloWorld.IL. Even if this is the first time you've seen MSIL, you can easily see what's going on. Anything that starts with a period is a directive for the assembler, ILASM.EXE, and comments are delimited with the standard C# double slashes.

Listing 6-3: HelloWorld.IL

// You need the .assembly for the program to run.
.assembly hello {}

// Declare a "C" like main.
.method static public void main() il managed
{
    // This tells the execution engine (EE) where to start executing.
    // You need one per program. This directive can apply to methods as
    // well.
    .entrypoint

    // This is not needed for ILASM, but ILDASM will always show it so
    // I included it.
    .maxstack 1

    // Push a string onto the stack.
    ldstr  "Hello World from IL!"

    // Call the System.Console.WriteLine method.
    call   void [mscorlib]System.Console::WriteLine(class System.String)

    // Return to the caller. The file will compile if you forget this,
    // but you will cause an exception in the EE.
    ret
}

The important parts of the code in Listing 6-3 are the last three instructions. The ldstr instruction takes care of getting the string onto the stack. Putting items on the stack is loading, so all instructions that start with "ld" are getting items from memory and putting them on the stack. Even though I didn't use storing in the "Hello World!" program, getting items from the stack and putting them into memory is storing, and all those instructions begin with "st". Armed with those two little facts and the help ILDASM gives you by placing the hard-coded strings inline with the disassembly, you can perform a good portion of your reverse engineering.

Now that I've shown you a little bit of MSIL assembly language, it's time to turn to what ILDASM shows you so that you can start seeing how the various constructs fit together.

Getting the parameters and return types in ILDASM is trivial because the disassembly gives them to you when you double-click on a method to view it. The best part is that the disassembly shows the actual parameter names. Class types are shown in [module]namespace.class format. The core primitive types, int, char, and so on, are shown as their specific class type. For example, int is shown as Int32.

Local variable display is very easy to decipher as well. If you have debugging symbols available, the locals display will show the actual names. However, disassembling the system classes will look like the following:

.locals (class [mscorlib]Microsoft.Win32.RegistryKey V_0,
         class System.Object V_1,
         int32 V_2,
         int32 V_3)

The .locals directive and its parentheses delineate the complete list of local variables, separated by commas. Each type is followed by a name in V_# format, where the # is the local variable's number. As you'll see later, the number is used in quite a few instructions. In the previous snippet, [mscorlib] indicates the particular DLL the class comes from.

The Important Instructions

Instead of providing a huge table of instructions, I want to show the most important instructions you'll run into and examples of their use. I'll start with the loading instructions and explain all their options. As I get to the other types of instructions, I'll skip parts that are in common with the load instructions and just show their usage. The instructions I don't cover are quite easy to figure out based on their names.

ldc Load number constant

This instruction pushes a hard-coded number onto the stack. The instruction format is ldc.size[.num], where size is the type of the value: i4 (4-byte integer), i8 (8-byte integer), r4 (4-byte floating point), or r8 (8-byte floating point). When size is i4, the optional num selects a compact encoding: ldc.i4.m1 loads -1, ldc.i4.0 through ldc.i4.8 load the constants 0 through 8, and ldc.i4.s takes a 1-byte operand for values from -128 through 127. The instruction has numerous forms to keep the compiled MSIL small.

ldc.i4.0                   // Load 0 onto the stack using the
                           // special form.
ldc.r8  2.1000000000000001 // Load 2.1000000000000001.
ldc.i4.m1                  // Load -1 onto the stack. This
                           // is the special form.
ldc.i4.s -9                // Load -9 onto the stack
                           // using the short form.
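
If you want to predict which form of ldc.i4 you'll see for a given constant, the choice can be sketched in a few lines of Python. The ldc_i4_form function is a hypothetical helper for this illustration. It assumes the compiler always picks the most compact form, and it prints operands in decimal, whereas ILDASM may show larger values in hexadecimal.

```python
def ldc_i4_form(value):
    """Return the most compact ldc.i4 form for a 4-byte integer constant."""
    if not -2**31 <= value < 2**31:
        raise ValueError("constant does not fit in a 4-byte integer")
    if value == -1:
        return "ldc.i4.m1"            # special form for -1
    if 0 <= value <= 8:
        return f"ldc.i4.{value}"      # special forms for 0 through 8
    if -128 <= value <= 127:
        return f"ldc.i4.s {value}"    # short form, 1-byte operand
    return f"ldc.i4 {value}"          # full form, 4-byte operand


for constant in (0, -1, -9, 5, 100, 1000):
    print(ldc_i4_form(constant))
```

The loop bounds in Listing 6-2 follow the same pattern: the constants 1 and 5 show up as ldc.i4.1 and ldc.i4.5.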

ldarg Load argument

ldarga Load argument address

These instructions load the specified argument, or its address, onto the stack. The argument numbers start at 0. For instance methods, argument 0 is the this pointer, so the first declared parameter is argument 1 rather than 0.

ldarg.2               // Load argument 2 onto the stack. 3 is the
                      // highest number using this form.
ldarg.s 6             // Load argument 6 onto the stack. Argument
                      // numbers 4 and higher use this form.
ldarga.s newSample    // Load newSample's address
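
The off-by-one between instance and static methods trips people up, so here is a small Python sketch of the numbering. The helper names argument_numbers and ldarg_form are made up for this example, and the parameter names come from the btnConditionalBreaks_Click method in Listing 6-2.

```python
def argument_numbers(parameter_names, is_instance):
    """Map each argument name to its MSIL argument number."""
    # Instance methods get the this pointer as hidden argument 0.
    names = (["this"] if is_instance else []) + list(parameter_names)
    return {name: number for number, name in enumerate(names)}


def ldarg_form(number):
    """Pick the ldarg encoding for an argument number."""
    if number < 0:
        raise ValueError("argument numbers start at 0")
    if number <= 3:
        return f"ldarg.{number}"     # numbered form, no operand
    if number <= 255:
        return f"ldarg.s {number}"   # short form, 1-byte operand
    return f"ldarg {number}"         # full form, 2-byte operand


instance_args = argument_numbers(["sender", "e"], is_instance=True)
static_args = argument_numbers(["sender", "e"], is_instance=False)
print(instance_args)                        # {'this': 0, 'sender': 1, 'e': 2}
print(static_args)                          # {'sender': 0, 'e': 1}
print(ldarg_form(instance_args["sender"]))  # ldarg.1
print(ldarg_form(6))                        # ldarg.s 6
```

This is why every ldarg.0 in Listing 6-2 is loading the form itself rather than the sender parameter.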

ldloc Load local variable

ldloca Load local variable address

These instructions load the specified local variable onto the stack. All local variables are specified by the order in which they appear in the locals declaration. The instruction ldloca loads the local variable's address.

ldloc.0        // Load local 0 onto the stack. 3 is the
               // highest number using this form.
ldloc.s V_6    // Load local variable 6 onto the stack. Variable
               // numbers 4 and higher use this form.
ldloca.s V_5   // Load local variable 5's address onto the stack.

ldfld Load object field of a class

ldsfld Load static field of a class

These instructions load an instance field of an object, or a static field of a class, onto the stack. Reading field access in MSIL disassembly is very easy because the complete field reference, including the field's type and owning class, is specified. The instruction ldflda loads the field's address.

// Load the _Originator field from System.Reflection.AssemblyName.
// Notice the type of the field is given as well.
ldfld    unsigned int8[] System.Reflection.AssemblyName::_Originator
// Load the empty string from System.String.
ldsfld   class System.String [mscorlib]System.String::Empty

ldelem Load an element of an array

This instruction loads the specified element onto the stack for single-dimensional, zero-based arrays. The previous two instructions put the array and the index onto the stack (in that order). The ldelem instruction removes the array and the index from the stack and puts the specified element on the top of the stack. A type field follows the ldelem instruction. The most common type field in the compiled base class library is ldelem.ref, which gets the element as an object. Other common types are ldelem.i4 for getting the element as a signed 4-byte integer, and ldelem.i8 to get an 8-byte integer.

.locals (System.String[] V_0, // The [] indicate an array declaration.
         int32 V_1 )          // The index.
...                           // Do work to fill V_0.
ldloc.0                       // Load the array.
ldc.i4.0                      // Load the zero index.
ldelem.ref                    // Get the object at index zero.

ldlen Load the length of an array

This instruction removes the zero-based, single-dimensional array from the stack and pushes the length of the array onto the stack.

// Load the attribute field, which is an array.
ldfld class System.ComponentModel.MemberAttribute[]
   System.ComponentModel.MemberDescriptor::attributes
stloc.1                    // Store the value into local 1
                           // (an array).
ldloc.1                    // Load local 1 onto the stack.
ldlen                      // Get the array length.

starg Store a value in an argument slot

Takes the value off the top of the stack and places it into the specified argument.

starg.s categoryHelp            // Store the top of the stack into
                                // categoryHelp. All starg
                                // instructions use the .s form.

stelem Store an element of an array

The previous three instructions place the zero-based, single-dimensional array; the index; and the value onto the stack (in that order). The stelem instruction removes all three items from the stack and stores the value into the array at that index, converting the value to the array's element type if necessary. Like the ldelem instruction, the type field specifies the element type. The most common form is stelem.ref, which stores an object reference.

.method public hidebysig specialname instance void
        set_MachineName(class System.String 'value') il managed
{
  .maxstack  4
  .locals (class System.String[] V_0)

  ldloc.0                     // Load the array on the stack.
  ldc.i4.1                    // Load the index, the constant 1.
  ldarg.1                     // Load the argument, the string.
  stelem.ref                  // Store the element.

stfld Store into a field of an object

This instruction takes the value off the top of the stack and places it into the object field. As when loading a field, the complete reference is given.

stfld  int32[] System.Diagnostics.CategoryEntry::HelpIndexes

ceq Compare equal

This instruction compares the top two values on the stack. The two items are removed from the stack, and if the values are equal, a 1 is pushed onto the stack; otherwise, a 0 is pushed onto the stack.

ldloc.1                    // Load local 1.
ldc.i4.0                   // Load the constant zero.
ceq                        // Compare the items for equality.

cgt Compare greater than

This instruction also compares the top two values on the stack. The two items are removed, and if the first value pushed is greater than the second value, a 1 is pushed on the stack; otherwise, a 0 is pushed. The cgt instruction can also have the .un modifier applied to indicate the comparison is unsigned or unordered.

// Get the collection count.
call instance int32 System.Diagnostics.
  CounterCreationDataCollection::get_Count()
ldc.i4.0                    // Load the constant zero.
cgt                         // Compare if the count is
                            // greater than zero.

clt Compare less than

This instruction performs identically to cgt except that 1 is pushed when the first value is less than the second value.

// Get the trace switch level.
call instance value class System.Diagnostics.TraceLevel
            System.Diagnostics.TraceSwitch::get_Level()
ldc.i4.1                    // Load the constant 1.
clt                         // Compare if the trace level is
                            // less than one.

br Unconditional branch

This instruction is the goto of MSIL.

br.s IL_008d                // Goto offset into the method.

brfalse Branch on false

brtrue Branch on true

Both instructions look at the value on the top of the stack and branch accordingly. The brtrue instruction branches when the value is nonzero (or a non-null reference), whereas brfalse branches when it is 0 (or null). Both instructions remove the value from the top of the stack.

ldloc.1                       // Load local 1.
brfalse.s  IL_006a            // If zero, branch.
ldloc.2                       // Load local 2.
brtrue.s   IL_006c            // Branch if nonzero.

beq Branch on equal

bge Branch on greater than or equal

bgt Branch on greater than

ble Branch on less than or equal

blt Branch on less than

bne.un Branch on not equal (or unordered)

In each general branching case, the instruction removes the two values at the top of the stack and compares the first value pushed with the second. In all cases, the branch takes the place of a comparison followed by one of the Boolean branches. For example, bgt is equivalent to a cgt instruction followed by a brtrue instruction.
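
The equivalence between bgt and the cgt plus brtrue pair is easy to check with a small model. In this Python sketch, a list stands in for the evaluation stack, and each branch function simply returns True when the branch would be taken; the function shapes are assumptions of the sketch, not real APIs.

```python
def cgt(stack):
    """Pop two values; push 1 if the first pushed is greater, else 0."""
    value2 = stack.pop()   # pushed second (top of the stack)
    value1 = stack.pop()   # pushed first
    stack.append(1 if value1 > value2 else 0)


def brtrue(stack):
    """Pop one value; the branch is taken when it is nonzero."""
    return stack.pop() != 0


def bgt(stack):
    """Pop two values; the branch is taken when the first pushed is greater."""
    value2 = stack.pop()
    value1 = stack.pop()
    return value1 > value2


for first, second in [(5, 3), (3, 5), (4, 4)]:
    fused = bgt([first, second])      # single instruction
    stack = [first, second]
    cgt(stack)                        # comparison...
    split = brtrue(stack)             # ...followed by a Boolean branch
    print(first, second, fused, split)
```

Both columns agree for every pair, and in both cases the stack ends up empty, so a compiler is free to emit whichever form it prefers.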

conv Data conversion

This instruction converts the data on the top of the stack to a new type and leaves the converted value on the top of the stack. The final conversion type follows the conv instruction. For example, conv.u4 converts to an unsigned 4-byte integer. The conv instruction with just the type doesn't throw any exceptions if there is any sort of overflow. If the instruction has .ovf between the conv and the type (for example, conv.ovf.u8), an overflow generates an exception.

ldloc.0                      // Load local zero (an array).
ldlen                        // Get the array length.
conv.i4                      // Convert the array length to a
                             // four byte value.
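
To see what "doesn't throw" means in practice, here is a hedged Python model of conv.i4 and conv.ovf.i4 for integer inputs only. The function names are invented for this sketch, Python's OverflowError stands in for the exception the runtime would raise, and floating-point inputs are deliberately left out.

```python
def conv_i4(value):
    """Model conv.i4 on an integer: keep the low 32 bits, signed."""
    low = value & 0xFFFFFFFF
    return low - 0x100000000 if low >= 0x80000000 else low


def conv_ovf_i4(value):
    """Model conv.ovf.i4 on an integer: fail instead of truncating."""
    if not -0x80000000 <= value <= 0x7FFFFFFF:
        raise OverflowError(f"{value} does not fit in a 4-byte integer")
    return value


print(conv_i4(42))           # 42
print(conv_i4(2**32 + 7))    # 7
print(conv_i4(2**31))        # -2147483648
try:
    conv_ovf_i4(2**31)
except OverflowError as error:
    print("overflow:", error)
```

The plain form silently throws away the high bits, which is why a large positive value can come back negative.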

newarr Create a zero-based, one-dimensional array

This instruction creates a new array of the specified type with the number of elements indicated by the value on the top of the stack. The element count is removed from the stack, and the new array is placed on the top of the stack.

ldc.i4.5                    // Set the number of elements to
                            // create to five.
                            // Create a new array.
newarr System.ComponentModel.MemberAttribute
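
With newarr covered, all four array instructions can be modeled together. This Python sketch uses a list as the evaluation stack and Python lists as arrays; the function shapes are assumptions made for illustration, and the type fields (.ref, .i4, and so on) are ignored. The sequence loosely follows the string array that Listing 6-2 builds.

```python
def newarr(stack):
    """Pop the element count; push a new array of that length."""
    count = stack.pop()
    stack.append([None] * count)


def stelem(stack):
    """Pop the value, the index, and the array; store the value."""
    value = stack.pop()
    index = stack.pop()
    array = stack.pop()
    array[index] = value


def ldelem(stack):
    """Pop the index and the array; push the element."""
    index = stack.pop()
    array = stack.pop()
    stack.append(array[index])


def ldlen(stack):
    """Pop the array; push its length."""
    array = stack.pop()
    stack.append(len(array))


stack = []
stack.append(2)                  # ldc.i4.2
newarr(stack)                    # newarr
parts = stack.pop()              # stloc: keep the array in a local

stack += [parts, 0, "i = "]      # array, index, value (in that order)
stelem(stack)
stack += [parts, 1, " j = "]
stelem(stack)

stack += [parts, 1]              # array, index
ldelem(stack)
element = stack.pop()

stack.append(parts)
ldlen(stack)
length = stack.pop()

print(repr(element), length)     # ' j = ' 2
print(stack)                     # []
```

Each instruction consumes exactly the operands the text describes, so the stack is empty again at the end.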

newobj Create a new object

Creates a new object and calls the object's constructor. All constructor arguments are passed on the stack. If the creation succeeds, the arguments are removed from the stack and the object reference is left on the stack.

.method public hidebysig specialname rtspecialname
     instance void  .ctor(class [mscorlib]System.IO.Stream 'stream',
                          class System.String name) il managed
{
   ldarg.1                    // Load the stream argument.
                              // Create the new class.
   newobj instance void [mscorlib]
          System.IO.StreamWriter::.ctor(class
                                        [mscorlib]System.IO.Stream)

box Convert value type to object reference

This instruction forces a value into an object and leaves the object on the stack when the conversion is done. When boxing, this instruction does the work. You will see the following code a lot when passing parameters:

// Notice the value type INT32 is passed to this method.
.method public hidebysig specialname
        instance void  set_Indent(int32 'value') il managed
{
  ldstr     "Indent"                  // Push the method name.
  ldarga.s  'value'                   // Load the argument address of the
                                      // first parameter.
  box       [mscorlib]System.Int32    // Convert the address into an
                                      // object.
                                      // Load the message.
  ldstr     "The Indent property must be non-negative."
                                      // Create a new ArgumentOutOfRangeException
  newobj    instance void [mscorlib]System.ArgumentOutOfRangeException::
                .ctor(class System.String,
                      class System.Object,
                      class System.String)

unbox Convert boxed value type to its raw form

This instruction returns a managed reference to the value type in the boxed form. The returned reference isn't a copy but rather the actual object state. With C# and Visual Basic .NET compiled code, after an unbox instruction comes the ldind instruction (load value indirect onto the stack) or ldobj (copy value type to the stack).

// Convert the value into a System.Reflection.Emit.LocalToken
unbox System.Reflection.Emit.LocalToken
// Get the value onto the stack
ldobj System.Reflection.Emit.LocalToken

unbox [mscorlib]System.Int16        // Convert the value to an Int16
                                    // object
ldind.i2                            // Put the object's value onto the
                                    // stack.

call Call a method

callvirt Call a method associated at run time with an object

The call instruction calls static and nonvirtual normal methods. Virtual methods and interface methods use the callvirt instruction. Arguments are placed in left-to-right order. Note that this order is the opposite of most calling conventions in the IA32 world. Here is an example of using callvirt:

// Load the parameter.
ldfld class System.String
  System.CodeDOM.Compiler.CompilerResults::pathToAssembly
// Call the virtual method set_CodeBase.
callvirt   instance void [mscorlib]
 System.Reflection.AssemblyName::set_CodeBase  (class System.String)

ldarg.0                    // Load the this pointer, which is
                           // always the first parameter.
ldarg.1                    // Load argument one.
ldnull                     // Load a null value
                           // Call the virtual function
callvirt instance void
   System.Diagnostics.TraceListener::Fail(class System.String,
                                         class System.String)
ret                        // Return to caller.

Other Reverse Engineering Tools

ILDASM is an excellent tool, but I want to mention two other tools that I find invaluable. For both of these tools, the price is right—they are free! The first is Lutz Roeder's .NET Reflector (http://www.aisto.com/roeder/dotnet/), which does everything that ILDASM does and much more. One of .NET Reflector's key features is that you can easily search for types in an assembly. You'd hope everyone would properly document all the custom exceptions they throw, but they don't always do so. With .NET Reflector, select Type Search and, in the Type Search window, above the Type field, type in except. All types that have exception in the name are shown.

At times, it's extremely valuable to see at a glance which methods a particular method calls. In .NET Reflector, in the tree control, highlight the particular method you're interested in and select Call Tree from the View menu. In the Call Tree window, expand any subcalls to see exactly what calls what from that method. It's an outstanding way to see how things fit together.

Finally, .NET Reflector's disassembly view is superior to ILDASM's. After selecting the method you want to view, press the Enter key to make the Disassembler window pop right up. If you're curious about what an instruction does, simply move the cursor over the instruction to make a tool tip pop up with an explanation. Parameter and local types as well as methods called by call instructions are underlined. Simply click on the item, and the main .NET Reflector window will jump to the type or method so that you can examine it.

The second tool I want to mention is called Anakrino, which is a Greek word meaning "to examine" or "judge." Anakrino is a decompiler for .NET that shows the C# or Managed Extensions for C++ code for an assembly. Anakrino was written by Jay Freeman and is downloadable from http://www.saurik.com/net/exemplar/. Unlike .NET Reflector, Anakrino has source code available. Although Anakrino isn't perfect, it's a fantastic way to learn about how the .NET Framework code all fits together. Using Anakrino is self-explanatory, so I won't bother to go into it. One caveat I will mention is that the source code is quite "original" with a huge amount of template usage, so you'll need to make a serious commitment if you want to extend it. A few commercial decompilers that produce better output have been released at the time of this writing, but they're prohibitively expensive, so Anakrino's foibles are perfectly acceptable.