Emitting Code and Metadata | Professional .NET Framework 2.0 (Programmer to Programmer)

Compilers emit code and metadata on your behalf. They do this by parsing your source, optimizing it, and, in the case of managed code, transforming it into its corresponding IL representation. The System.Reflection.Emit namespace gives you a set of APIs with which to do the same. This can be useful for building your own compiler, performing code generation, or simply generating a snippet of code. The last of these can be used, for example, to cache dynamic method invocations so that jitted code is used for method dispatch rather than purely dynamic code paths.

The cornerstone of this feature is the builder APIs. As noted at the beginning of this chapter, these APIs derive from the info APIs and add functionality to emit metadata rather than read it. In general, this feature is self-explanatory once you get started. In this section, we'll briefly take a look at some of the interesting highlights.

Generating Assemblies

In most cases, you'll want to define an assembly, module, set of types, and some methods when generating code dynamically. We'll see later on how you can skip a lot of this and go straight to generating code for a method using the Lightweight Code Generation (LCG) feature. When defining a new assembly, you have to construct a single AssemblyBuilder instance and one ModuleBuilder for each module in the assembly.

You can create the AssemblyBuilder via the AppDomain.DefineDynamicAssembly instance method. This method accepts a name for the assembly, an AssemblyBuilderAccess enumeration value to indicate what you intend to use the assembly for (Run, Save, RunAndSave, or ReflectionOnly), and a set of mostly optional code access security options. Similarly, you must construct each ModuleBuilder using the AssemblyBuilder.DefineDynamicModule method. You pass to this method the name of the module, the filename, and whether you'd like debug symbols emitted (i.e., a PDB):

    AssemblyBuilder ab = AppDomain.CurrentDomain.DefineDynamicAssembly(
        new AssemblyName("foo.exe"), AssemblyBuilderAccess.Save,
        AppDomain.CurrentDomain.Evidence);
    ModuleBuilder mb = ab.DefineDynamicModule(ab.FullName, "foo.exe", false);

This sets up a single assembly, foo.exe, with a single module, foo.exe. We've chosen not to generate a PDB for the time being. Please refer to the "Further Reading" section for more information on Reflection.Emit, including how precisely to generate debugging symbols for your code.

After that, you will use TypeBuilders, MethodBuilders, and so forth, to create the types and methods inside of your dynamic assembly. There are also methods to embed resources (DefineResource and DefineUnmanagedResource), among other less common things. Once you've constructed an assembly entirely, you will want to call Save on the AssemblyBuilder to write it to disk. If you've chosen to generate an assembly for in-memory execution only, this step can be skipped. If you're generating an EXE, you'll want to use the SetEntryPoint method to indicate which method is the assembly's entry point.
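The in-memory case mentioned above can be sketched as follows. This is a minimal illustration rather than code from the book: it assumes the .NET Framework (where AppDomain.DefineDynamicAssembly is available), and the names InMemoryAssemblyDemo, Calculator, and GetAnswer are hypothetical.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class InMemoryAssemblyDemo
{
    // Builds a dynamic assembly that only ever runs in memory, defines
    // Calculator.GetAnswer() in it, and invokes that method via reflection.
    // All type and method names here are illustrative.
    public static int BuildAndRun()
    {
        // Run access: the assembly can be executed but not saved.
        AssemblyBuilder ab = AppDomain.CurrentDomain.DefineDynamicAssembly(
            new AssemblyName("InMemoryDemo"), AssemblyBuilderAccess.Run);

        // A transient module needs only a name, not a filename.
        ModuleBuilder mb = ab.DefineDynamicModule("InMemoryDemo");

        TypeBuilder tb = mb.DefineType("Calculator",
            TypeAttributes.Public | TypeAttributes.Sealed);
        MethodBuilder m = tb.DefineMethod("GetAnswer",
            MethodAttributes.Public | MethodAttributes.Static,
            typeof(int), Type.EmptyTypes);

        ILGenerator ilg = m.GetILGenerator();
        ilg.Emit(OpCodes.Ldc_I4, 42);  // push the constant 42
        ilg.Emit(OpCodes.Ret);         // return it

        // CreateType bakes the type; no Save call is made in this mode.
        Type calculator = tb.CreateType();
        return (int)calculator.GetMethod("GetAnswer").Invoke(null, null);
    }
}
```

Because the assembly was defined with Run access, nothing is written to disk; the baked type is used directly through reflection.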

Building Types

To create new types inside of your dynamic module, you need to create a new TypeBuilder. To do that, call the ModuleBuilder's DefineType method. It accepts a string representing the fully qualified type name (namespace included). Overloads are offered that accept a TypeAttributes flags-style enumeration value (which specifies various flags about a type you might want, such as Abstract, Public, Sealed, and so forth), the parent class, interfaces implemented, and so forth. TypeBuilder offers methods (e.g., SetParent) to change these attributes post construction.

For example, given ab and mb above, we might create a new type as follows:

    TypeBuilder tb = mb.DefineType("Program",
        TypeAttributes.Public | TypeAttributes.Sealed, typeof(object));

This creates a public sealed class that derives from System.Object. That base type would have been inferred had we not supplied it, and you are free to supply a different one, for example a type defined inside the same dynamic assembly.
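Deriving from another type in the same dynamic assembly might look like the sketch below. It assumes the .NET Framework and relies on TypeBuilder supplying a default constructor when none is defined; the names InheritanceDemo, Animal, and Dog are hypothetical.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class InheritanceDemo
{
    // Defines an abstract Animal and a sealed Dog that derives from it,
    // both in the same dynamic module, and returns Dog's base type name.
    public static string BuildAndGetBaseName()
    {
        AssemblyBuilder ab = AppDomain.CurrentDomain.DefineDynamicAssembly(
            new AssemblyName("InheritanceDemo"), AssemblyBuilderAccess.Run);
        ModuleBuilder mb = ab.DefineDynamicModule("InheritanceDemo");

        TypeBuilder animal = mb.DefineType("Animal",
            TypeAttributes.Public | TypeAttributes.Abstract);

        // The parent argument is itself a TypeBuilder from this module.
        TypeBuilder dog = mb.DefineType("Dog",
            TypeAttributes.Public | TypeAttributes.Sealed, animal);

        // Bake the base type first so the derived type can be completed.
        animal.CreateType();
        Type dogType = dog.CreateType();

        return dogType.BaseType.FullName;
    }
}
```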

Once you have a TypeBuilder, you can go about defining fields via the DefineField method:

    tb.DefineField("idCounter", typeof(int),
        FieldAttributes.Private | FieldAttributes.Static);

DefineField returns a FieldBuilder, but there's not much you can do directly with it, because nearly all of its attributes are specified in the DefineField call itself. You can, however, pass it to other builder APIs to reference that field, for example to load the contents of the field in some method's IL.
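To make the last point concrete, here is a small sketch in which the FieldBuilder for idCounter is passed to ILGenerator.Emit so that a method can read and update the field. It assumes the .NET Framework; FieldDemo and NextId are hypothetical names introduced for the illustration.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class FieldDemo
{
    // Emits: static int NextId() { return ++idCounter; }
    // and calls it twice, returning both results.
    public static int[] BuildAndCallTwice()
    {
        AssemblyBuilder ab = AppDomain.CurrentDomain.DefineDynamicAssembly(
            new AssemblyName("FieldDemo"), AssemblyBuilderAccess.Run);
        ModuleBuilder mb = ab.DefineDynamicModule("FieldDemo");
        TypeBuilder tb = mb.DefineType("Program",
            TypeAttributes.Public | TypeAttributes.Sealed, typeof(object));

        FieldBuilder idCounter = tb.DefineField("idCounter", typeof(int),
            FieldAttributes.Private | FieldAttributes.Static);

        MethodBuilder next = tb.DefineMethod("NextId",
            MethodAttributes.Public | MethodAttributes.Static,
            typeof(int), Type.EmptyTypes);

        ILGenerator ilg = next.GetILGenerator();
        ilg.Emit(OpCodes.Ldsfld, idCounter);  // push current idCounter
        ilg.Emit(OpCodes.Ldc_I4_1);           // push 1
        ilg.Emit(OpCodes.Add);                // idCounter + 1
        ilg.Emit(OpCodes.Dup);                // keep a copy to return
        ilg.Emit(OpCodes.Stsfld, idCounter);  // store the new value
        ilg.Emit(OpCodes.Ret);                // return the copy

        MethodInfo mi = tb.CreateType().GetMethod("NextId");
        int first = (int)mi.Invoke(null, null);
        int second = (int)mi.Invoke(null, null);
        return new int[] { first, second };
    }
}
```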

You can likewise create methods using the DefineMethod method. We'll take a look at that momentarily. You can also build properties and constructors using the DefineProperty and DefineConstructor methods, which end up being very much like defining methods. There are other related and similar methods. We won't discuss them explicitly here. When you are finished creating a new type, you must call the CreateType method on it. Otherwise, the program will not save or execute correctly.
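As a rough illustration of DefineConstructor and DefineProperty, the sketch below builds a type with one constructor and one read-only property. It assumes the .NET Framework, and the names MemberDemo, Person, name, and Name are hypothetical.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class MemberDemo
{
    // Emits the equivalent of:
    //   public class Person {
    //       private string name;
    //       public Person(string name) { this.name = name; }
    //       public string Name { get { return name; } }
    //   }
    // then constructs an instance and reads the property back.
    public static string BuildAndRead(string value)
    {
        AssemblyBuilder ab = AppDomain.CurrentDomain.DefineDynamicAssembly(
            new AssemblyName("MemberDemo"), AssemblyBuilderAccess.Run);
        ModuleBuilder mb = ab.DefineDynamicModule("MemberDemo");
        TypeBuilder tb = mb.DefineType("Person", TypeAttributes.Public);
        FieldBuilder nameField = tb.DefineField("name", typeof(string),
            FieldAttributes.Private);

        ConstructorBuilder ctor = tb.DefineConstructor(
            MethodAttributes.Public, CallingConventions.Standard,
            new Type[] { typeof(string) });
        ILGenerator cil = ctor.GetILGenerator();
        cil.Emit(OpCodes.Ldarg_0);  // this
        cil.Emit(OpCodes.Call, typeof(object).GetConstructor(Type.EmptyTypes));
        cil.Emit(OpCodes.Ldarg_0);  // this
        cil.Emit(OpCodes.Ldarg_1);  // the constructor argument
        cil.Emit(OpCodes.Stfld, nameField);
        cil.Emit(OpCodes.Ret);

        // A property is metadata that points at ordinary accessor methods.
        PropertyBuilder prop = tb.DefineProperty("Name",
            PropertyAttributes.None, typeof(string), null);
        MethodBuilder getter = tb.DefineMethod("get_Name",
            MethodAttributes.Public | MethodAttributes.SpecialName |
            MethodAttributes.HideBySig,
            typeof(string), Type.EmptyTypes);
        ILGenerator gil = getter.GetILGenerator();
        gil.Emit(OpCodes.Ldarg_0);
        gil.Emit(OpCodes.Ldfld, nameField);
        gil.Emit(OpCodes.Ret);
        prop.SetGetMethod(getter);

        // The type must be baked with CreateType before it can be used.
        Type person = tb.CreateType();
        object instance = Activator.CreateInstance(person, new object[] { value });
        return (string)person.GetProperty("Name").GetValue(instance, null);
    }
}
```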

Building Methods

Building a method is slightly more complicated than other types of CTS abstractions. This is because you must worry about creating the actual executable IL inside of a method body. As noted already, you obtain a new MethodBuilder using the DefineMethod method on TypeBuilder:

    MethodBuilder m = tb.DefineMethod("MyMethod",
        MethodAttributes.Public | MethodAttributes.Static);

MethodAttributes offers a large number of possible flags, just as with TypeAttributes, FieldAttributes, and the like. You supply the return type and parameter types through a DefineMethod overload (or afterward with SetReturnType and SetParameters); the DefineParameter method on MethodBuilder then lets you give each parameter a name and attributes. Once you've constructed a MethodBuilder with the appropriate attributes, parameters, and return type, you call GetILGenerator to obtain an instance of the ILGenerator type. This is what you'll use to generate the code itself.
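A method with a return type and named parameters could be defined roughly as follows. This sketch assumes the .NET Framework; ParameterDemo, MathOps, Add, left, and right are hypothetical names.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class ParameterDemo
{
    // Emits: public static int Add(int left, int right) { return left + right; }
    // and returns the MethodInfo of the baked method.
    public static MethodInfo BuildAdd()
    {
        AssemblyBuilder ab = AppDomain.CurrentDomain.DefineDynamicAssembly(
            new AssemblyName("ParameterDemo"), AssemblyBuilderAccess.Run);
        ModuleBuilder mb = ab.DefineDynamicModule("ParameterDemo");
        TypeBuilder tb = mb.DefineType("MathOps",
            TypeAttributes.Public | TypeAttributes.Sealed);

        // Return type and parameter types are part of the method signature.
        MethodBuilder add = tb.DefineMethod("Add",
            MethodAttributes.Public | MethodAttributes.Static,
            typeof(int), new Type[] { typeof(int), typeof(int) });

        // Parameter positions are 1-based; position 0 is the return value.
        add.DefineParameter(1, ParameterAttributes.None, "left");
        add.DefineParameter(2, ParameterAttributes.None, "right");

        ILGenerator ilg = add.GetILGenerator();
        ilg.Emit(OpCodes.Ldarg_0);  // left (static method: no 'this')
        ilg.Emit(OpCodes.Ldarg_1);  // right
        ilg.Emit(OpCodes.Add);
        ilg.Emit(OpCodes.Ret);

        return tb.CreateType().GetMethod("Add");
    }
}
```

Calling the result with `BuildAdd().Invoke(null, new object[] { 2, 3 })` yields the boxed sum, and the parameter names are visible through GetParameters.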

The ILGenerator type is actually very simple. It puts a lot of power (and responsibility) into your hands. You need to be quite familiar with IL and the stack transformations for each instruction so that you can generate verifiable and correct code. You emit each IL instruction one by one using the Emit method, passing in an OpCode instance representing the IL instruction being emitted. A large number of Emit overloads exist to facilitate passing in arguments to the instructions. You'll also find methods like DeclareLocal to aid you in creating local variables in the method's activation frame, BeginExceptionBlock and EndExceptionBlock (and similar methods) to help you to construct exception handling code, and DefineLabel and MarkLabel for creating and marking IL offsets with labels (e.g., for control flow logic). The OpCodes class has a set of static fields, each containing the OpCode instance for one available IL instruction.
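The following sketch shows DeclareLocal, DefineLabel, and MarkLabel working together to emit a simple loop. It assumes the .NET Framework; LoopDemo, Loops, and SumTo are hypothetical names.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class LoopDemo
{
    // Emits the equivalent of:
    //   static int SumTo(int n) {
    //       int total = 0;
    //       for (int i = 1; i <= n; i++) total += i;
    //       return total;
    //   }
    public static MethodInfo BuildSumTo()
    {
        AssemblyBuilder ab = AppDomain.CurrentDomain.DefineDynamicAssembly(
            new AssemblyName("LoopDemo"), AssemblyBuilderAccess.Run);
        ModuleBuilder mb = ab.DefineDynamicModule("LoopDemo");
        TypeBuilder tb = mb.DefineType("Loops",
            TypeAttributes.Public | TypeAttributes.Sealed);
        MethodBuilder m = tb.DefineMethod("SumTo",
            MethodAttributes.Public | MethodAttributes.Static,
            typeof(int), new Type[] { typeof(int) });

        ILGenerator ilg = m.GetILGenerator();
        LocalBuilder total = ilg.DeclareLocal(typeof(int));
        LocalBuilder i = ilg.DeclareLocal(typeof(int));
        Label check = ilg.DefineLabel();  // loop condition
        Label body = ilg.DefineLabel();   // loop body

        ilg.Emit(OpCodes.Ldc_I4_0);
        ilg.Emit(OpCodes.Stloc, total);   // total = 0
        ilg.Emit(OpCodes.Ldc_I4_1);
        ilg.Emit(OpCodes.Stloc, i);       // i = 1
        ilg.Emit(OpCodes.Br, check);      // jump to the condition

        ilg.MarkLabel(body);
        ilg.Emit(OpCodes.Ldloc, total);
        ilg.Emit(OpCodes.Ldloc, i);
        ilg.Emit(OpCodes.Add);
        ilg.Emit(OpCodes.Stloc, total);   // total += i
        ilg.Emit(OpCodes.Ldloc, i);
        ilg.Emit(OpCodes.Ldc_I4_1);
        ilg.Emit(OpCodes.Add);
        ilg.Emit(OpCodes.Stloc, i);       // i++

        ilg.MarkLabel(check);
        ilg.Emit(OpCodes.Ldloc, i);
        ilg.Emit(OpCodes.Ldarg_0);
        ilg.Emit(OpCodes.Ble, body);      // loop while i <= n

        ilg.Emit(OpCodes.Ldloc, total);
        ilg.Emit(OpCodes.Ret);

        return tb.CreateType().GetMethod("SumTo");
    }
}
```

Note that every path leaves the evaluation stack empty at each label, which is the kind of stack discipline the paragraph above refers to.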

Unfortunately, a holistic overview of all of this functionality is outside of the scope of this book. Please consult the "Further Reading" section for detailed resources. Here is a sample "Hello, World!" program using Reflection.Emit:

    // Requires: using System; using System.Reflection; using System.Reflection.Emit;

    // Set up our assembly and module builders:
    string outFilename = "foo.exe";
    AssemblyBuilder ab = AppDomain.CurrentDomain.DefineDynamicAssembly(
        new AssemblyName("foo"), AssemblyBuilderAccess.Save,
        AppDomain.CurrentDomain.Evidence);
    ModuleBuilder mb = ab.DefineDynamicModule(ab.FullName, outFilename, false);

    // Create a simple type with one method:
    TypeBuilder tb = mb.DefineType("Program",
        TypeAttributes.Public | TypeAttributes.Sealed, typeof(object));
    MethodBuilder m = tb.DefineMethod("MyMethod",
        MethodAttributes.Public | MethodAttributes.Static);

    // Now emit some very simple "Hello World" code:
    ILGenerator ilg = m.GetILGenerator();
    ilg.Emit(OpCodes.Ldstr, "Hello, World!");
    ilg.Emit(OpCodes.Call,
        typeof(Console).GetMethod("WriteLine", new Type[] { typeof(string) }));
    ilg.Emit(OpCodes.Ret);

    // Lastly, create the type, set our entry point, and save to disk:
    tb.CreateType();
    ab.SetEntryPoint(m);
    ab.Save(outFilename);

We emit three simple instructions in the method. The corresponding textual IL for this method is:

    .method public static void MyMethod() cil managed
    {
        .entrypoint
        .maxstack 1
        ldstr "Hello, World!"
        call  void [mscorlib]System.Console::WriteLine(string)
        ret
    }

Lightweight Code Generation (LCG)

A new feature in 2.0 enables you to generate code without having to set up assemblies, modules, and types beforehand. Furthermore, you can attach a dynamic method to an existing type at runtime. This enables you to access and manipulate the enclosing type's state inside your method's code. Emitting the body of an LCG method works just like emitting the body of a MethodBuilder.

To begin constructing a new LCG method, you just instantiate a new DynamicMethod, supplying some basic information to the constructor: the name of the method, its return type, a Type[] containing the types of its parameters, and either a Module or Type in which the method will live. If you choose a Module, the method is a global method without access to any fields; choosing a Type places the method on that type, enabling you to access enclosing fields. You can also specify MethodAttributes, CallingConventions, and/or whether the emitted method should skip checks for visibility at runtime. To emit code, call GetILGenerator and proceed just as if you had a MethodBuilder. When the body is complete, you run the method by calling CreateDelegate to obtain a delegate bound to it, or by calling Invoke.
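Both hosting options can be sketched briefly. The example below is illustrative only: the delegate types BinaryOp and CounterReader, the Counter class, and the LcgDemo helper are all hypothetical names introduced here.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public delegate int BinaryOp(int x, int y);
public delegate int CounterReader(Counter c);

public sealed class Counter
{
    private int count;
    public void Increment() { count++; }
}

public static class LcgDemo
{
    // Hosted on a Module: behaves like a global static method.
    public static BinaryOp CreateMultiplier()
    {
        DynamicMethod dm = new DynamicMethod("Multiply", typeof(int),
            new Type[] { typeof(int), typeof(int) }, typeof(LcgDemo).Module);
        ILGenerator ilg = dm.GetILGenerator();
        ilg.Emit(OpCodes.Ldarg_0);
        ilg.Emit(OpCodes.Ldarg_1);
        ilg.Emit(OpCodes.Mul);
        ilg.Emit(OpCodes.Ret);
        return (BinaryOp)dm.CreateDelegate(typeof(BinaryOp));
    }

    // Hosted on the Counter type: the emitted code may read Counter's
    // private 'count' field as if it were a member of that class.
    public static CounterReader CreateCounterReader()
    {
        DynamicMethod dm = new DynamicMethod("ReadCount", typeof(int),
            new Type[] { typeof(Counter) }, typeof(Counter));
        FieldInfo countField = typeof(Counter).GetField("count",
            BindingFlags.Instance | BindingFlags.NonPublic);
        ILGenerator ilg = dm.GetILGenerator();
        ilg.Emit(OpCodes.Ldarg_0);            // the Counter instance
        ilg.Emit(OpCodes.Ldfld, countField);  // read its private field
        ilg.Emit(OpCodes.Ret);
        return (CounterReader)dm.CreateDelegate(typeof(CounterReader));
    }
}
```

Holding on to the returned delegate and reusing it is what makes this approach attractive for caching dispatch: the IL is emitted once and later calls go through jitted code.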

One word of caution when using LCG: Debugging is difficult. If you emit incorrect IL, the CLR will throw an InvalidProgramException when you try to execute it. It won't even give you a verifier log to tell you precisely what went wrong. You can use the SOS debugger extensions to traverse pointers and get at the IL, but it's still not easy to run it through the verifier (i.e., peverify.exe). For this situation, I recommend one of two things: (1) you can use the same IL generation on a true Reflection.Emit assembly, save it to disk, and then try to verify it; or (2) search the Internet for LCG debugging utilities. Some of the CLR team members have written useful utilities that plug right into Visual Studio to make this experience simpler.
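One way to follow recommendation (1) is to keep the IL-emitting logic in a single routine that takes an ILGenerator, so the same body can be fed to either a DynamicMethod or a MethodBuilder. The sketch below assumes the .NET Framework (AssemblyBuilderAccess.Save is used), and the names VerifiableEmit, IntFunc, EmitBody, Host, and LcgDebug are hypothetical.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public delegate int IntFunc();

public static class VerifiableEmit
{
    // The single source of truth for the method body.
    public static void EmitBody(ILGenerator ilg)
    {
        ilg.Emit(OpCodes.Ldc_I4, 7);
        ilg.Emit(OpCodes.Ldc_I4, 6);
        ilg.Emit(OpCodes.Mul);
        ilg.Emit(OpCodes.Ret);
    }

    // Production path: lightweight dynamic method.
    public static IntFunc CreateDynamic()
    {
        DynamicMethod dm = new DynamicMethod("Body", typeof(int),
            Type.EmptyTypes, typeof(VerifiableEmit).Module);
        EmitBody(dm.GetILGenerator());
        return (IntFunc)dm.CreateDelegate(typeof(IntFunc));
    }

    // Debugging path: the same IL inside a real assembly saved to disk.
    // fileName must be a simple file name (no directory); the file is
    // written to the current directory.
    public static void SaveForVerification(string fileName)
    {
        AssemblyBuilder ab = AppDomain.CurrentDomain.DefineDynamicAssembly(
            new AssemblyName("LcgDebug"), AssemblyBuilderAccess.Save);
        ModuleBuilder mb = ab.DefineDynamicModule("LcgDebug", fileName, false);
        TypeBuilder tb = mb.DefineType("Host",
            TypeAttributes.Public | TypeAttributes.Sealed);
        MethodBuilder m = tb.DefineMethod("Body",
            MethodAttributes.Public | MethodAttributes.Static,
            typeof(int), Type.EmptyTypes);
        EmitBody(m.GetILGenerator());
        tb.CreateType();
        ab.Save(fileName);
    }
}
```

After calling `VerifiableEmit.SaveForVerification("LcgDebug.dll")`, the saved file can be handed to peverify.exe, which reports verification errors that the DynamicMethod path would surface only as an InvalidProgramException.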