Native Image Generation (NGen) | Professional .NET Framework 2.0 (Programmer to Programmer)

As we've already discussed, the CLR does not interpret IL. It uses a Just in Time (JIT) compiler to compile IL to native code at runtime, inserting hooks into the various CLR services in the process. This process was discussed in Chapter 2. But clearly compiling managed code into native code at runtime has some costs associated with it. For certain classes of applications, startup time is crucial. Client applications are great examples of this, because there's a user sitting at the computer waiting for the application to become responsive. The more time spent jitting code up front, the more time the user must wait. For situations in which this is a problem, the CLR offers ahead-of-time JIT compilation using a technology called NGen.

The ngen.exe utility, located in your Framework directory (i.e., \WINDOWS\Microsoft.NET\Framework\v2.0.50727\), enables you to perform this ahead-of-time compilation on the client machine. The result of this operation is stored in a central location on the machine called the Native Image Cache. The loader knows to look here when loading an assembly or DLL that has a strong name. All of the .NET Framework's assemblies are NGen'd during install of the Framework itself.

In version 2.0 of the Framework, a new NGen Windows Service has been added to take care of queuing and managing NGen compilations in the background. This means your installer can add a request to the NGen queue and then exit. The NGen service will then take care of compilation asynchronously. This can noticeably reduce the install time for your program. Right after installing a new component, however, there is a window of time in which the NGen image might not be ready yet, so the code may still be jitted.

NGen uses the same code generation techniques that the CLR's JIT uses to generate native code. As discussed briefly in the section on the JIT in Chapter 2, the code that is generated is designed to take advantage of the underlying computer architecture. Subtle differences in chip capabilities will make an NGen image unusable across machines. Thus, image generation must occur on the client machine as part of install (or postinstall) rather than being done in a lab before packaging and shipping your program. Thankfully, the CLR notices this at load time and will fall back to runtime JIT if necessary.

If you determine NGen is right for you — performance testing should determine this choice — you'll want to run it against your application's EXE. Doing this will cause NGen to traverse your application's dependencies, generate code for each, and store the image in the Native Image Cache alongside your program's. If any of your program's dependencies is missing a native image, the CLR loader won't be able to load your image and will end up jitting instead.
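One way to check whether native images were actually picked up, rather than the CLR silently falling back to the JIT, is to inspect the modules loaded into the running process. The following is a minimal sketch, not an official diagnostic. It assumes that native image files carry a .ni.dll or .ni.exe suffix and that they appear in the process module list; the NativeImageProbe class and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper; not part of the .NET Framework.
static class NativeImageProbe
{
    // Assumption: NGen names its output files <assembly>.ni.dll or <assembly>.ni.exe.
    public static bool LooksLikeNativeImage(string fileName)
    {
        return fileName.EndsWith(".ni.dll", StringComparison.OrdinalIgnoreCase)
            || fileName.EndsWith(".ni.exe", StringComparison.OrdinalIgnoreCase);
    }

    // Returns the paths of all loaded modules that look like native images.
    public static List<string> LoadedNativeImages()
    {
        List<string> result = new List<string>();
        foreach (ProcessModule module in Process.GetCurrentProcess().Modules)
        {
            if (LooksLikeNativeImage(module.FileName))
                result.Add(module.FileName);
        }
        return result;
    }
}
```

If your own assembly is missing from the list returned by LoadedNativeImages after startup, that suggests its code was jitted instead.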

Managing the Cache (ngen.exe)

The ngen.exe tool has quite a few switches to control behavior. We'll briefly look at the most common activities you'll want to perform. Running ngen.exe /? at the command prompt will show detailed usage information for the tool. The Windows Service that takes care of managing and executing queued activities is called ".NET Runtime Optimization Service v2.0.50727_<Processor>," and can be found in your computer's Administrative Tools\Services menu.

Here is a brief summary of operating NGen:

Install: Running ngen install foo.dll will JIT compile and install the images for foo.dll and its dependencies into the Native Image Cache. If dependencies already exist in the cache, those will be reused instead of regenerating them. You can specify /queue:n at the end of the command, where n is 1, 2, or 3 (e.g., ngen install foo.dll /queue:2). This takes advantage of the Windows Service to queue the activity for background execution instead of executing it immediately. The scheduler will execute tasks in priority order, where 1 is the highest-priority task, and 3 is the lowest-priority task.
Uninstall: To completely remove the image from the Native Image Cache for (say) foo.dll, you can run ngen uninstall foo.dll.
Display: Typing ngen display foo.dll will show you the image status for foo.dll, such as whether it's available or enqueued for generation. Executing ngen display by itself will show a listing of the entire Native Image Cache's contents.
Update: Executing ngen update will update any native images that have been invalidated due to a change in an assembly or one of its dependencies. Specifying /queue at the end, for example ngen update /queue, schedules the activity rather than performing it synchronously.
Controlling background execution: Running ngen queue [pause|continue|status] enables you to manage the queue from the command line by pausing, continuing, or simply enquiring about its status.
Manually executing queued items: You can synchronously perform some or all of the queued work items by invoking ngen executeQueuedItems and optionally passing a priority of 1, 2, or 3. If a priority is supplied, any lesser-priority items are not executed. Otherwise, all items are executed sequentially.

For detailed usage information, please consult the Microsoft .NET Framework SDK.
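An installer can issue the install command described above programmatically. The sketch below shows one possible way to do so from C#, assuming the installer runs with Administrator rights and on the same Framework version that should own the native image. The NGenInstaller class and its methods are hypothetical names. Only the ngen install ... /queue:n syntax comes from the summary above.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Runtime.InteropServices;

// Hypothetical installer helper; the names are illustrative only.
static class NGenInstaller
{
    // Builds the arguments for: ngen install <assembly> /queue:<priority>
    public static string BuildInstallArguments(string assemblyPath, int priority)
    {
        if (priority < 1 || priority > 3)
            throw new ArgumentOutOfRangeException("priority", "Priority must be 1, 2, or 3.");
        return "install \"" + assemblyPath + "\" /queue:" + priority;
    }

    // Queues the assembly for background compilation and returns ngen.exe's exit code.
    public static int QueueInstall(string assemblyPath, int priority)
    {
        // ngen.exe sits in the directory of the currently running Framework version.
        string ngenPath = Path.Combine(RuntimeEnvironment.GetRuntimeDirectory(), "ngen.exe");

        ProcessStartInfo info = new ProcessStartInfo(ngenPath, BuildInstallArguments(assemblyPath, priority));
        info.UseShellExecute = false;
        info.CreateNoWindow = true;

        using (Process process = Process.Start(info))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

Because the request is only queued, QueueInstall returns quickly. The compilation itself happens later in the Windows Service.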

Base Addresses and Fix-Ups

A process on Windows has a large contiguous virtual address space which, on 32-bit systems, is simply a range of numbers from 0x00000000 through 0xffffffff; assuming /3GB is off, user-mode code gets the lower half of that range (0x00000000 through 0x7fffffff). All images get loaded and laid out at a specific address within this address space. Images contain references to memory addresses in order to interoperate with other parts of the image, for example making function calls (e.g., call 0x71cb0000), loading data (e.g., mov ecx,0x71cb00aa), and so on. Such references are emitted as absolute addresses to eliminate the need for address arithmetic at runtime (for example, calculating addresses using offsets relative to a base address), making operations very fast. Furthermore, this practice enables physical page sharing across processes, reducing overall system memory pressure.

To do this, images must request that the loader place them at a specific address in the address space each time they get loaded. They can then make the assumption that this request was granted, burning absolute addresses that are calculated at compile time based on this address. This is called an image's base address. Images that get to load at their preferred base address enjoy the benefits of absolute addressing and code sharing listed above.

Most developers never think about base addresses seriously. The .NET Framework team certainly does. And any team developing robust, large-scale libraries that wants to achieve the best possible startup time should do the same. Consider what happens if you don't specify the base address at all. Another assembly that didn't specify a base address might get loaded first. Your assembly will then try to load at the same address, fail, and have to fix up and relocate any absolute memory addresses based on the actual load address. This is all done at startup time and is called rebasing.
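The arithmetic behind a single fix-up is simple: each embedded absolute address is shifted by the difference between the actual and the preferred base address. The sketch below only illustrates that calculation. The real loader walks the image's relocation data, and the Rebasing class is a hypothetical name.

```csharp
using System;

// Hypothetical illustration of one fix-up; not how the Windows loader is implemented.
static class Rebasing
{
    // Shifts an absolute address compiled against preferredBase so that it is
    // valid for an image that was actually loaded at actualBase.
    public static uint Relocate(uint embeddedAddress, uint preferredBase, uint actualBase)
    {
        uint delta = unchecked(actualBase - preferredBase);
        return unchecked(embeddedAddress + delta);
    }
}
```

For example, an image that preferred 0x71cb0000 but was loaded at 0x72000000 has a delta of 0x00350000, so the embedded address 0x71cb00aa becomes 0x720000aa. Every page touched by such a patch becomes private to the process, which is why rebasing costs both startup time and memory.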

The base address for an image is embedded in the PE file as part of its header. You can specify a preferred base address with the C# compiler using the /baseaddress:<xxx> switch. Each compiler offers its own switch to emit this information in the resulting PE file. For example, ilasm.exe permits you to embed an .imagebase directive in the textual IL to indicate a base address.

Clearly, two assemblies can still ask for the same base address. And if this occurs, your assembly will still have to pay the price for rebasing at startup. Large companies typically use static analysis to identify overlaps between addresses and intelligently level the base addresses to avoid rebasing. The Platform SDK ships with a tool called ReBase.exe that enables you to inspect and modify base addresses for a group of DLLs to be loaded in the same process.
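A simple form of this overlap analysis can be sketched in C#. The code below reads the preferred base address and image size from each PE file's optional header and reports whether two images would collide. It is a minimal sketch under stated assumptions: the header offsets follow the published PE/COFF layout, error handling is minimal, and the BaseAddressCheck class and its method names are hypothetical rather than part of any SDK.

```csharp
using System;
using System.IO;

// Hypothetical helper; not part of the .NET Framework or the Platform SDK.
static class BaseAddressCheck
{
    // Reads ImageBase and SizeOfImage from a PE file's optional header.
    public static void ReadImageRange(string path, out ulong imageBase, out uint sizeOfImage)
    {
        using (FileStream fs = File.OpenRead(path))
        using (BinaryReader reader = new BinaryReader(fs))
        {
            fs.Seek(0x3C, SeekOrigin.Begin);            // e_lfanew: offset of the PE signature
            uint peOffset = reader.ReadUInt32();
            fs.Seek(peOffset, SeekOrigin.Begin);
            if (reader.ReadUInt32() != 0x00004550)       // "PE\0\0"
                throw new BadImageFormatException("Not a PE file: " + path);

            long optionalHeader = peOffset + 4 + 20;     // signature + 20-byte COFF header
            fs.Seek(optionalHeader, SeekOrigin.Begin);
            ushort magic = reader.ReadUInt16();          // 0x10b = PE32, 0x20b = PE32+

            if (magic == 0x20b)
            {
                fs.Seek(optionalHeader + 24, SeekOrigin.Begin);
                imageBase = reader.ReadUInt64();         // 8-byte ImageBase in PE32+
            }
            else
            {
                fs.Seek(optionalHeader + 28, SeekOrigin.Begin);
                imageBase = reader.ReadUInt32();         // 4-byte ImageBase in PE32
            }

            fs.Seek(optionalHeader + 56, SeekOrigin.Begin);
            sizeOfImage = reader.ReadUInt32();
        }
    }

    // Two images collide when their [base, base + size) ranges intersect.
    public static bool Overlaps(ulong baseA, uint sizeA, ulong baseB, uint sizeB)
    {
        return baseA < baseB + sizeB && baseB < baseA + sizeA;
    }
}
```

Running ReadImageRange over every DLL that loads into the same process and testing each pair with Overlaps gives a rough picture of which assemblies would be rebased. The loader also rounds image placement to its allocation granularity, which this sketch ignores.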

Hard Binding

Even in the case of ahead-of-time generated native images, some indirection and back-patching is still necessary. All accesses to dependent code and data structures in other assemblies still go indirectly through the CLR, which looks up the actual virtual addresses and back-patches the references. This is done through very small, hand-tuned stubs of CLR code, but nonetheless adds an extra indirection for the first accesses. A consequence of this is that the CLR must mark pages as writable in order to perform the back-patching, which reduces the amount of sharing and increases the number of private pages in your application. As discussed above, fewer shared pages means more overall memory pressure on the system.

NGen 2.0 offers a feature called hard binding to eliminate this cost. You should only consider hard binding if you've encountered cases where this is a problem based on your targets and measurements. For example, if you've debugged your private page footprint and determined that this is the cause, only then should you turn on hard binding. Turning it on can actually harm the performance of your application, because it bundles more native code together so that absolute virtual addresses can be used instead of stubs. The result is that more code needs to be loaded at startup time. And base addresses with hard-bound code must be chosen carefully; with more code, rebasing is substantially costlier.

To turn on hard binding, you can hint to NGen that you'd like to use it via the DependencyAttribute and DefaultDependencyAttribute, both located in the System.Runtime.CompilerServices namespace. DependencyAttribute is used to specify that an assembly specifically depends on another. For example, if your assembly Foo.dll depends on Bar.dll and Baz.dll, you can mark this using the assembly-wide DependencyAttribute attribute:

 using System.Runtime.CompilerServices;

 [assembly: Dependency("Bar", LoadHint.Always)]
 [assembly: Dependency("Baz", LoadHint.Sometimes)]

 class Foo { /*...*/ }

Alternatively, you may use DefaultDependencyAttribute to specify the default NGen policy for assemblies that depend on the assembly annotated with this attribute. For example, if you have a shared assembly which will be used heavily from all of your applications, you might want to use it:

 using System.Runtime.CompilerServices;

 [assembly: DefaultDependency(LoadHint.Always)]

 class Baz { /*...*/ }

The LoadHint specifies how frequently the dependency will be loaded from the calling assembly. Today, NGen does not turn on hard binding except for assemblies marked LoadHint.Always. In the above example, this means Foo.dll will be hard bound to Bar.dll (because the association is marked as Always). Although Baz.dll has a default of Always (which means assemblies will ordinarily be hard bound to it), Foo.dll overrides this with Sometimes, meaning that Foo.dll will not be hard bound to Baz.dll.
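The override rule just described can be modeled in a few lines. This is only a sketch of the decision as the text states it, not NGen's actual implementation. The Hint enum mirrors the LoadHint member names so that the example is self-contained, and treating an absent or Default hint as "defer to the dependency's default" is an assumption.

```csharp
using System;

// Stand-in for System.Runtime.CompilerServices.LoadHint, to keep the sketch self-contained.
enum Hint { Default, Always, Sometimes }

// Hypothetical model of the hard-binding decision; NGen's real logic is not public API.
static class HardBindingRule
{
    // explicitHint:  hint from [assembly: Dependency(...)] on the referencing assembly,
    //                or Hint.Default if there is none.
    // targetDefault: hint from [assembly: DefaultDependency(...)] on the dependency,
    //                or Hint.Default if there is none.
    public static bool IsHardBound(Hint explicitHint, Hint targetDefault)
    {
        // An explicit hint on the referencing assembly wins over the dependency's default.
        Hint effective = explicitHint != Hint.Default ? explicitHint : targetDefault;

        // Only Always results in hard binding.
        return effective == Hint.Always;
    }
}
```

Applied to the example above, IsHardBound(Hint.Always, Hint.Default) models Foo.dll's reference to Bar.dll and yields true, while IsHardBound(Hint.Sometimes, Hint.Always) models its reference to Baz.dll and yields false.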

String Freezing

Normally, NGen images will create strings on the GC heap using the assembly string table, as is the case with ordinary assemblies. String freezing, however, results in a special string GC segment that contains all of your assembly's strings. These can then be referenced directly by the resulting image, requiring fewer fix-ups and less back-patching at load time. As we've seen above, fewer fix-ups and less back-patching mean that fewer pages are marked as writable, which leads to a smaller number of private pages in your working set.

To apply string freezing, you must mark your assembly with the System.Runtime.CompilerServices.StringFreezingAttribute. It requires no arguments. Note: string freezing is an NGen feature only; applying this attribute to an assembly that gets jitted has no effect.

 using System;
 using System.Runtime.CompilerServices;

 [assembly: StringFreezing]

 class Program { /*... */ }

One downside to turning string freezing on is that an assembly participating in freezing cannot be unloaded from a process. Thus, you should not turn this on for assemblies that are to be loaded and unloaded transiently throughout a program's execution. We discussed domain neutrality and assembly unloading earlier in this chapter, where similar considerations applied.

Benefits and Disadvantages

NGen has the clear advantage that the CLR can execute code directly without requiring a JIT stub to first load and call into mscorjit.dll to generate the code. This can have substantial performance benefits for your application. The time saved when the CLR loads your program from scratch (that is, the cold boot time) is usually not dramatic, because there is still validation and data structure preparation performed by the runtime. But because the native images are loaded into memory more efficiently (assuming no fix-ups) and because code sharing is increased, warm boot time and working set can be substantially improved.

Furthermore, for short-running programs, the cost of runtime JIT compilation can actually dominate the program's execution cost. At the very least, it may give the appearance of a sluggish startup (e.g., the time between a user clicking a shortcut and the WinForms UI showing up). In such cases, NGen can improve the user experience quite dramatically. For longer-running programs, such as ASP.NET web sites, the cost of the JIT is often minimal compared to other startup and application logic. The added management and code unloading complexity associated with using NGen for ASP.NET scenarios means that you should seldom try to use the two in combination.

On the other hand, there are certainly some disadvantages to using NGen, not the least of which is the added complexity to your installation process. Worst of all, running ngen.exe across an entire assembly and its dependencies is certainly not a quick operation. When you install the .NET Framework redistributable package, you'll probably notice a large portion of the time is spent "Generating native images." That's NGen working its magic. In 2.0, this is substantially improved as a result of the new Windows Service that performs compilation in the background.

Invoking ngen.exe for manual or scheduled compilation unfortunately also requires Administrator access on the client's machine. This can be an adoption blocker in its own right. You can detect this in your install script and notify the user that, for optimized execution time, they should run a utility as Administrator to schedule the NGen activity. Images generated by Administrator accounts can still be used by other user accounts.
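The Administrator check itself can be done with the System.Security.Principal types. The sketch below is one possible approach for a managed installer step. The NGenPrerequisites class, its method names, and the wording of the notice are hypothetical.

```csharp
using System;
using System.Security.Principal;

// Hypothetical installer helper; the names and message text are illustrative only.
static class NGenPrerequisites
{
    // Returns true if the current Windows user is a member of the Administrators group.
    public static bool IsAdministrator()
    {
        WindowsIdentity identity = WindowsIdentity.GetCurrent();
        WindowsPrincipal principal = new WindowsPrincipal(identity);
        return principal.IsInRole(WindowsBuiltInRole.Administrator);
    }

    // Returns a notice to show to the user, or null if NGen can be scheduled right away.
    public static string GetNotice(bool isAdministrator)
    {
        if (isAdministrator)
            return null;
        return "For optimized startup time, run the optimization utility as Administrator "
             + "to schedule native image generation.";
    }
}
```

An install step could call GetNotice(IsAdministrator()) and, when the result is non-null, display it and skip the NGen request instead of failing the whole installation.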

NGen images can also get invalidated quite easily. Because NGen makes a lot of optimizations that create cross-assembly interdependencies (for example, cross-assembly inlining, and especially hard binding), the NGen image becomes invalid once a dependency changes. The CLR will notice this inconsistency and fall back to JIT compilation at runtime. In 2.0, invalidations occur less frequently, because the infrastructure has been optimized to prevent them to as great an extent as possible, and the new NGen Windows Service may be used to schedule re-NGen activities in the background whenever an image is invalidated.