Basics of the Common Language Runtime

The .NET common language runtime is only one of many aspects of the .NET concept, but it is the core of .NET. (Note that, for variety’s sake, I’ll sometimes refer to the common language runtime as the runtime.) Rather than giving an overall description of the .NET platform here, then, let’s concentrate on the part of .NET where the action really happens: the common language runtime.

For excellent discussions of the general structure of .NET and its components, see Introducing Microsoft .NET (Microsoft Press, 2001), by David S. Platt, and Inside C# (Microsoft Press, 2001), by Tom Archer.

Simply put, the common language runtime is a run-time environment in which .NET applications run. It provides an operating layer between the .NET applications and the underlying operating system. In principle, the common language runtime is similar to the runtimes of interpreted languages such as GBasic or Smalltalk or to the Java Virtual Machine. But this similarity is only in principle: the common language runtime is not an interpreter.

The .NET applications generated by .NET-oriented compilers (such as Microsoft Visual C# .NET, Microsoft Visual Basic .NET, ILAsm, and many others) are represented in an abstract, intermediate form, independent of the original programming language and of the target machine and its operating system. Because they are represented in this abstract form, .NET applications written in different languages can interoperate very closely, not only on the level of calling each other’s functions but also on the level of class inheritance.

Of course, given the differences in programming languages, a set of rules must be established so that applications can get along nicely with their neighbors. For example, if you write an application in Visual C# .NET and name three items MYITEM, MyItem, and myitem, Visual Basic .NET, which is case-insensitive, will have a hard time differentiating them. Likewise, if you write an application in ILAsm and define a global method, Visual C# .NET will be unable to call the method because it has no concept of global (out-of-class) items.

The set of rules guaranteeing the interoperability of .NET applications is known as the common language specification (CLS), outlined in Partition I of the Common Language Infrastructure standardization proposal of the European Computer Manufacturers Association (ECMA). The CLS restricts the naming conventions, data types, function types, and certain other elements, forming a common denominator for different languages. It is important to remember, however, that the CLS is merely a recommendation and has no bearing whatsoever on common language runtime functionality. If your application is not CLS-compliant, it might be valid in terms of the common language runtime, but you have no guarantee that it will be able to interoperate with other applications on all levels.
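The MYITEM, MyItem, myitem problem can be sketched in a few lines of Python. This is only an illustration of the naming restriction, not a CLS compliance checker: the function name case_collisions is hypothetical, and str.casefold() is assumed here as a rough stand-in for how a case-insensitive language compares identifiers.

```python
from collections import defaultdict


def case_collisions(names):
    """Return groups of identifiers that differ only by letter case."""
    groups = defaultdict(list)
    for name in names:
        # casefold() approximates a case-insensitive identifier comparison.
        groups[name.casefold()].append(name)
    # A group is a collision only if it holds more than one distinct spelling.
    return [spellings for spellings in groups.values() if len(set(spellings)) > 1]


public_names = ["MYITEM", "MyItem", "myitem", "Count"]
print(case_collisions(public_names))  # [['MYITEM', 'MyItem', 'myitem']]
```

A case-sensitive consumer sees three distinct items in the reported group, whereas a case-insensitive one sees a single ambiguous name.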

The abstract intermediate representation of the .NET applications, intended for the common language runtime environment, includes two main components: metadata and managed code. Metadata is a system of descriptors of all structural items of the application—classes, their members and attributes, global items, and so on—and their relationships. This chapter provides some examples of metadata, and later chapters describe all the metadata structures.

The managed code represents the functionality of the application’s methods (functions) encoded in an abstract binary form known as Microsoft intermediate language (MSIL), or common intermediate language (CIL). To simplify things, I’ll refer to this encoding simply as intermediate language (IL). Of course, other intermediate languages exist in the world, but as far as our endeavors are concerned, let’s agree that IL means CIL/MSIL, unless specified otherwise.

The IL code is “managed” by the runtime. Common language runtime management includes, but is not limited to, three major activities: type control, structured exception handling, and garbage collection. Type control involves verification and conversion of item types during execution. Structured exception handling is functionally similar to “unmanaged” structured exception handling (C++-style), but it is performed by the runtime rather than by the operating system. Garbage collection involves automatic identification and disposal of objects no longer in use.

A .NET application, intended for the common language runtime environment, consists of one or more managed executables, each of which carries metadata and (optionally) managed code. Managed code is optional because it is always possible to build a managed executable containing no methods. (Obviously, such an executable can be used only as an auxiliary part of an application.) Managed .NET applications are called assemblies. (This statement is somewhat simplified; for more details about assemblies, application domains, and applications, see Chapter 5.) The managed executables are referred to as modules. You can create single-module assemblies and multimodule assemblies. As illustrated in Figure 1-1, each assembly contains one prime module, which carries the assembly identity information in its metadata.

Figure 1-1 A multimodule .NET assembly.

Figure 1-1 also shows that the two principal components of a managed executable are the metadata and the IL code. The two major common language runtime subsystems dealing with each component are, respectively, the loader and the JIT (just-in-time) compiler.

In brief, the loader reads the metadata and creates in memory an internal representation and layout of the classes and their members. It performs this task on demand, meaning that a class is loaded and laid out only when it is referenced. Classes that are never referenced are never loaded. When loading a class, the loader runs a series of consistency checks of the related metadata.
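The on-demand behavior just described can be modeled with a toy Python class. Everything here is an assumption made for illustration: the ToyLoader name, the dictionary standing in for metadata, the duplicate-field test standing in for the consistency checks, and the slot numbering standing in for class layout. It does not reflect the actual data structures of the runtime.

```python
class ToyLoader:
    """Toy model of on-demand loading: a class is laid out only when first referenced."""

    def __init__(self, metadata):
        # metadata maps a class name to its field names (a stand-in for real metadata).
        self._metadata = metadata
        self._loaded = {}

    def _check(self, name, fields):
        # Stand-in for the loader's consistency checks: reject duplicate field names.
        if len(fields) != len(set(fields)):
            raise ValueError(f"inconsistent metadata for {name}: duplicate field")

    def reference(self, name):
        """Load and lay out the class on first reference; reuse the result afterward."""
        if name not in self._loaded:
            fields = self._metadata[name]
            self._check(name, fields)
            # "Layout": give each field a slot index in declaration order.
            self._loaded[name] = {field: slot for slot, field in enumerate(fields)}
        return self._loaded[name]

    def loaded_classes(self):
        return sorted(self._loaded)


loader = ToyLoader({"Point": ["x", "y"], "Unused": ["a"]})
loader.reference("Point")
print(loader.loaded_classes())  # ['Point']
```

The class named Unused is described in the metadata but never referenced, so the model never loads it or checks it.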

The JIT compiler, relying on the results of the loader’s activity, compiles the methods encoded in IL into the native code of the underlying platform. Because the runtime is not an interpreter, it does not execute the IL code. Instead, the IL code is compiled in memory into the native code, and the native code is executed. The JIT compilation is also done on demand, meaning that a method is compiled only when it is called. The compiled methods stay cached in memory. If memory is limited, however, as in the case of a small computing device such as a handheld PDA or a smart phone, the methods can be discarded if not used. If a method is called again after being discarded, it is recompiled.
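The compile-on-first-call, cache, discard, and recompile cycle can be imitated in Python as a rough analogy. The ToyJit class and its method names are hypothetical, Python expression strings stand in for IL, and the built-in compile() produces Python bytecode rather than native code, so this sketch mirrors only the caching policy, not what the JIT compiler actually emits.

```python
class ToyJit:
    """Toy model of on-demand compilation with a cache that can be discarded."""

    def __init__(self, il_methods):
        # il_methods maps a method name to expression source (a stand-in for IL).
        self._il = il_methods
        self._compiled = {}       # cache of compiled code objects
        self.compile_counts = {}  # how many times each method has been compiled

    def _compile(self, name):
        self.compile_counts[name] = self.compile_counts.get(name, 0) + 1
        return compile(self._il[name], f"<{name}>", "eval")

    def call(self, name, **args):
        # Compile only when the method is called and no cached version exists.
        if name not in self._compiled:
            self._compiled[name] = self._compile(name)
        return eval(self._compiled[name], {"__builtins__": {}}, args)

    def discard(self, name):
        # Models dropping compiled code when memory is limited.
        self._compiled.pop(name, None)


jit = ToyJit({"add": "a + b", "never_called": "a * b"})
print(jit.call("add", a=2, b=3))    # 5
print(jit.call("add", a=10, b=20))  # 30
print(jit.compile_counts)           # {'add': 1}
jit.discard("add")
print(jit.call("add", a=1, b=1))    # 2
print(jit.compile_counts)           # {'add': 2}
```

The second call reuses the cached code, the method that is never called is never compiled, and the call after discard triggers a recompilation.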

The diagram shown in Figure 1-2 illustrates the sequence of creation and execution of a managed .NET application.

Figure 1-2 The creation and execution of a managed .NET application. Arrows with hollow circles at the base indicate data transfer; arrows with black circles represent requests and control messages.

A managed executable can be precompiled from IL to the native code, using the NGEN utility. You can do this when the executable is expected to run repeatedly from a local disk, to save time on just-in-time compilation. This is standard procedure, for example, for managed components of the .NET Framework, which are precompiled during the installation. (Tom Archer refers to this as install-time code generation.) In this case, the precompiled code is saved to the local disk or other storage, and every time the executable is invoked, the precompiled native-code version is used instead of the original IL version. The original file, however, must also be present because the precompiled version does not carry the metadata.

With the roles of the metadata and the IL code established, let’s consider the ways you can use ILAsm to describe them.