Compilers and Language Support


You can write .NET applications using any of the languages supported by .NET: C#, Visual Basic .NET, C++, and J#. As with most modern languages, the code you write must be compiled to machine code. In a departure for Microsoft, however, the compilers for languages supported by .NET do not generate machine instructions for a specific processor. Instead, they generate a pseudomachine code called Microsoft Intermediate Language (MSIL).

The Common Language Runtime

The common language runtime is the execution system that runs code written for .NET. When an application is executed, the runtime uses just-in-time (JIT) compilation to convert the MSIL code into real machine instructions and runs them. If you're familiar with the operation of the Java runtime environment, you already know about this kind of operation because like a Java Virtual Machine (JVM), the common language runtime aims to provide portability. The MSIL code can be transported onto any machine that has the runtime installed.

With .NET, Microsoft has extended the concept underlying the JVM. The deployment and dynamic linking mechanisms used by the common language runtime allow compiled code to be built on one machine and transported to another machine for execution. More on this later. Also, by enabling a number of languages to compile to the same MSIL code and, more important, by standardizing many of the data types used by these languages (for example, a Long in Visual Basic .NET is the same as a long in C#, which is the same as a long in J#, and so on), .NET not only allows applications to be portable across hardware, but it enables the data processed by them to be understood across all languages. All languages running under .NET use the same method-calling conventions, which makes it easy to perform interlanguage method calls. (We'll discuss this in detail later in the chapter.)

When an application is compiled into MSIL, the result is a module. A module can be linked with resources from other modules and DLLs into an executable (EXE) file or another DLL. Although the filename extensions used are still EXE and DLL, the content and format of these files are a little different from native EXEs and DLLs. The runtime uses an extended version of the Portable Executable (PE) file format used by regular Windows executables. The main difference is that runtime PE files contain sections with information about the types defined by your code (or classes, if you're writing J# code, or classes, structs, enums, and attributes if you're writing C# code). They also contain security information and any dependencies on types defined by other modules. This information is required when you link modules.

A Closer Look at MSIL

To understand the runtime, it helps to look closely at MSIL. Let's look at a variation on the familiar "Hello, World" program:

    package Greeting;

    public class Hello
    {
        public static void main(String[] args)
        {
            for (int i = 0; i < 10; i++)
                System.out.println("Hello, World");
        }
    }

If you create a text file called Greeting.jsl using Wordpad, or download it from the book sample files, and then type in this code and compile it, the result will be an executable named Greeting.exe. (The convention is to use the .jsl extension for J# source code.) To compile the program, you can use the J# compiler from the command line:

    vjc Greeting.jsl

Tip

The J# compiler is called vjc. Don't fall into the trap of thinking that just because the C# compiler is called csc, the J# equivalent is jsc. Visual Studio .NET does include a jsc compiler, but it compiles JScript .NET code (Microsoft's JavaScript dialect). If you run it over your J# code, you'll get what look like meaningful errors, because the compiler understands some Java language syntax but not all of it. If you don't realize this difference, you can spend hours telling the compiler that your code is perfectly valid and hoping that it will compile your program if you shout loudly enough!


When you run Greeting.exe, you'll be rewarded with the message "Hello, World" displayed 10 times.

Tip

To run vjc or any of the other command-line tools supplied with the .NET Framework SDK, you must set your environment variables appropriately. Microsoft supplies the batch file Corvars.bat, which you can run from the command line for this purpose. Corvars.bat is located at C:\Program Files\Microsoft Visual Studio.NET\FrameworkSDK\bin.


You can look at the MSIL generated for this executable using the Intermediate Language Disassembler tool, better known as ILDASM. Figure 2-1 shows ILDASM being used to disassemble Greeting.exe.

Figure 2-1. Using ILDASM to disassemble the sample code

The first item you'll notice is labeled Manifest. Every executable program has a manifest. A manifest contains information about the DLL or EXE as well as any dependencies on other DLLs needed to run this executable. Figure 2-2 shows the manifest for the Greeting executable. If you double-click the manifest entry, you'll see another window containing the details. There is much more to the manifest than we'll cover here; for now, we'll just concentrate on the pertinent information.

You can see that the Greeting application requires three other assemblies when it runs: vjscor, mscorlib, and vjslib. For the time being, you can think of an assembly as a DLL containing runtime support code needed to execute your application. The mscorlib assembly contains the core base class libraries needed by every .NET application and is always present. Vjslib and vjscor contain J# runtime and type information and are used only by J# applications.

Figure 2-2. An application manifest

Underneath the reference to vjslib is information about the Greeting application itself. The first piece of information concerns an attribute (DebuggableAttribute). Attributes are additional items of data that can be examined and acted upon by the runtime to modify the way in which the program runs. They typically contain configuration or other declarative information; we'll cover them in more detail in Chapter 3.

Close the Manifest window, return to the main ILDASM window, and expand the Greeting node to see the Hello class. The Greeting node corresponds to the package created in the application. If you expand the Hello class, you'll see its contents as shown in Figure 2-3.

Figure 2-3. The ILDASM window showing the Hello class

Items with a red triangle symbol provide more information. In this case, the information is the pedigree of the Hello class. If you double-click the item marked .class public auto ansi, another window will appear, as shown in Figure 2-4.

Figure 2-4. The definition of the Hello class

Again, there is more information in here than you might care to know, but you should be able to gather that the Hello class is descended from System.Object (as all .NET classes are by default) and that the implementation of System.Object is found in the mscorlib assembly.

If you close this window and return to ILDASM, you'll notice four methods in the Hello class, denoted by pink squares. A square containing an S indicates that the method is static. The method marked .ctor is the constructor for the Hello class. Although you didn't define a constructor yourself, in this situation the semantics of the Java language are such that the compiler will create a default constructor for you automatically. If you double-click the constructor, you'll see the MSIL code generated for you. We won't describe the entire MSIL instruction set here, but you can probably guess that the instruction at address IL_0001 invokes the constructor inherited from System.Object (the parent class of Hello), as shown in Figure 2-5.

Figure 2-5. The default constructor for the Hello class

Close this window and return to the main ILDASM window. All .NET classes inherit a number of methods from System.Object. In some cases, the default implementation provided by System.Object is sufficient for the class, but in other cases you might want to override the default implementation and provide your own code. MemberwiseClone and ToString are two methods that are normally inherited from System.Object. The MemberwiseClone method is used for creating a copy of an object, and ToString generates a printable (string) representation of an object. However, the requirements of J# objects are subtly different from those of objects written in other languages (for reasons of interoperability with the Java programming language), and the J# compiler automatically generates specialized versions of MemberwiseClone and ToString for you. The other methods inherited from System.Object are not affected by the requirements of the Java language.

Double-click the main method to display its MSIL code, as shown in Figure 2-6. The main method is the entry point for the application. You'll see a few directives (the lines that start with a dot, as in .entrypoint) followed by some real MSIL code.

Figure 2-6. The main method of the Hello class

We wrote the J# code for this method earlier. The MSIL code resulting from the compilation is reproduced here:

main (MSIL)
    .method public hidebysig static void main(string[] args) cil managed
    {
      .entrypoint
      // Code size 44 (0x2c)
      .maxstack 2
      .locals init (int32 V_0)
      IL_0000: ldtoken [vjslib]com.ms.vjsharp.lang.ObjectImpl
      IL_0005: call void [mscorlib]System.Runtime.CompilerServices.
                   RuntimeHelpers::RunClassConstructor(valuetype [mscorlib]
                   System.RuntimeTypeHandle)
      IL_000a: ldc.i4.0
      IL_000b: stloc.0
      IL_000c: br.s IL_0021
      IL_000e: ldsfld class [vjslib]java.io.PrintStream
                   [vjslib]java.lang.System::'out'
      IL_0013: ldstr "Hello, World"
      IL_0018: callvirt instance void
                   [vjslib]java.io.PrintStream::println(string)
      IL_001d: ldloc.0
      IL_001e: ldc.i4.1
      IL_001f: add
      IL_0020: stloc.0
      IL_0021: ldloc.0
      IL_0022: ldc.i4.s 10
      IL_0024: blt.s IL_000e
      IL_0026: call void
                   [vjslib]com.ms.vjsharp.util.Utilities::cleanupAfterMainReturns()
      IL_002b: ret
    } // end of method Hello::main

This code needs a little explanation. MSIL is a stack-based language. The .maxstack directive at the start of the method indicates the maximum number of values that will be pushed onto the stack (maximum depth of stack) while the method runs. If this value is exceeded, the program will terminate with an exception for security reasons. The MSIL verification process conducts this check by analyzing the MSIL code as it is JIT-compiled but before it is run.

Local variables are identified by number in MSIL. The .locals directive at the start of the main method indicates that a 32-bit integer value (the variable i created in the for loop) is local variable V_0. If there were more variables, they would be called V_1, V_2, V_3, and so on, in the order in which they were defined in the original code.

The instructions at addresses IL_0000 and IL_0005 perform initialization by running the class constructor for the ObjectImpl class. ObjectImpl is an internal class defined in the vjslib assembly that implements most of the methods normally inherited by Java language objects from the java.lang.Object class. The way in which J# objects are implemented and mapped into the Java language object hierarchy is discussed in more detail in Chapter 3.

The MSIL instruction ldc.i4.0 at address IL_000a pushes the 4-byte constant value 0 onto the top of the stack. The next instruction, stloc.0, pops the value from the top of the stack (containing 0) into local variable 0. The net effect of this action is to set the variable i to 0. The instruction br.s is an unconditional transfer that causes a jump to address IL_0021. At this address, the ldloc.0 instruction copies the value in local variable 0 (variable i) back onto the stack.

The instruction ldc.i4.s 10 at address IL_0022 pushes the 4-byte integer value 10 onto the stack, so the stack now contains two values: 0 (variable i) and 10. The instruction blt.s compares the top two items of the stack and transfers execution to address IL_000e if the penultimate value in the stack is less than the value at the top of the stack. Because i contains 0, this condition is true and execution continues at IL_000e.

The ldsfld instruction at address IL_000e pushes the value of System.out, which is the static variable of type PrintStream used for printing to the console, onto the top of the stack. The definitions of PrintStream and System are held in the vjslib assembly. The next instruction, ldstr "Hello, World", pushes the constant string "Hello, World" onto the stack as well. Once again, the stack contains two items: the System.out static field and the "Hello, World" string.

The callvirt instruction calls an instance method using the information on the stack. In this case, the MSIL instruction invokes the println method and expects a single string parameter (which is currently at the top of the stack). The object below this on the stack, System.out, is the object whose println method is invoked. Both items are popped from the stack, and because the println method does not return a value, nothing is pushed back onto the stack. At this point, the text "Hello, World" appears on the console for the first time.

The instruction at IL_001d, ldloc.0, loads the value of local variable 0 (variable i) onto the stack; this value is still 0. The next instruction, ldc.i4.1, pushes the constant value 1 onto the stack, and the add instruction adds the top two items on the stack and replaces them with the result. The stloc.0 instruction at IL_0020 pops the new value at the top of the stack and places it in variable 0. The local variable i has now been incremented to 1. We are now back at address IL_0021, where the ldloc.0 instruction copies the new value of i back onto the stack, the ldc.i4.s 10 instruction pushes the constant 10 onto the stack, and the blt.s IL_000e instruction jumps back to address IL_000e if i is still less than 10.

When variable i eventually reaches 10, the branch at IL_0024 is not taken. The method cleanupAfterMainReturns, defined in the Utilities class in the vjslib assembly, performs some housekeeping, and then the ret instruction at IL_002b exits the method.
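
To make the walkthrough concrete, the following Java sketch mimics the loop portion of the main method with a toy stack machine. It is only an illustration: the class name ToyStackMachine, the run method, and the string-based instruction encoding are assumptions made for this example, not part of the runtime or of any .NET tool. Only ldc, stloc, ldloc, add, br, blt, and ret are modeled. The println sequence is collapsed into a single made-up "print" step, and the ObjectImpl initialization and the cleanupAfterMainReturns call are left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of the evaluation stack used by the loop in main.
// Hypothetical names throughout; this is not how the runtime is implemented.
class ToyStackMachine {

    // Each instruction is {opcode, operand}; branch operands are indexes into PROGRAM.
    static final String[][] PROGRAM = {
        {"ldc", "0"},    //  0: push constant 0             (IL_000a)
        {"stloc", "0"},  //  1: pop into local 0, i = 0     (IL_000b)
        {"br", "8"},     //  2: jump to the loop test       (IL_000c)
        {"print", ""},   //  3: stands in for ldsfld/ldstr/callvirt
        {"ldloc", "0"},  //  4: push i                      (IL_001d)
        {"ldc", "1"},    //  5: push 1                      (IL_001e)
        {"add", ""},     //  6: pop two values, push sum    (IL_001f)
        {"stloc", "0"},  //  7: pop into local 0, i = i + 1 (IL_0020)
        {"ldloc", "0"},  //  8: push i                      (IL_0021)
        {"ldc", "10"},   //  9: push 10                     (IL_0022)
        {"blt", "3"},    // 10: branch back if i < 10       (IL_0024)
        {"ret", ""}      // 11: leave the method            (IL_002b)
    };

    // Runs the program and returns how many times the "print" step executed.
    static int run(String[][] program) {
        Deque<Integer> stack = new ArrayDeque<>();
        int[] locals = new int[1];
        int printed = 0;
        int pc = 0;
        while (true) {
            String op = program[pc][0];
            String arg = program[pc][1];
            pc++;
            switch (op) {
                case "ldc":   stack.push(Integer.parseInt(arg)); break;
                case "stloc": locals[Integer.parseInt(arg)] = stack.pop(); break;
                case "ldloc": stack.push(locals[Integer.parseInt(arg)]); break;
                case "add":   stack.push(stack.pop() + stack.pop()); break;
                case "br":    pc = Integer.parseInt(arg); break;
                case "blt": {
                    int top = stack.pop();    // pushed last (the constant 10)
                    int below = stack.pop();  // pushed first (the value of i)
                    if (below < top) pc = Integer.parseInt(arg);
                    break;
                }
                case "print": System.out.println("Hello, World"); printed++; break;
                case "ret":   return printed;
                default: throw new IllegalStateException("Unknown opcode: " + op);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("Lines printed: " + run(PROGRAM));
    }
}
```

Running the sketch prints "Hello, World" 10 times, matching the behavior of Greeting.exe. Note that the stack never holds more than two values at once, which is consistent with the .maxstack 2 directive.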

Now you can see how the Java code in the original program is transformed into MSIL. If you want to know more about MSIL, see the document called Partition III CIL.doc that comes with Visual Studio .NET Enterprise Edition (in \Program Files\Microsoft Visual Studio .NET\FrameworkSDK\Tool Developers Guide\docs\Partition III CIL.doc).

MSIL Verification

If you're feeling brave, you're welcome to write MSIL code yourself. Microsoft supplies the MSIL assembler (ILASM.exe) for compiling raw MSIL code files into PE format. However, this option raises a difficult issue. MSIL code is not as easy to write as J#, C#, Visual Basic .NET, or even C++ code. It's fair to assume that if you write in one of these languages and use the appropriate compiler, the resulting MSIL code in the PE file will be valid. If you're handcrafting MSIL code, this assumption is not applicable. For example, you could easily forget to push the correct number of items onto the stack before executing an add instruction, which would result in a nasty problem.
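
The mistake just described, an add with too few operands, is the kind of problem a static stack check can catch before the code runs. The following Java sketch is a deliberately simplified illustration of the idea, not the runtime's verifier: the class StackDepthCheck, its check method, and the four-opcode subset are assumptions made for this example, and it handles only straight-line code with no branches or type checking.

```java
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical stack-depth check for straight-line code.
class StackDepthCheck {

    // Returns null if the sequence is acceptable, otherwise a description of the problem.
    static String check(List<String> opcodes, int maxStack) {
        int depth = 0;
        for (int i = 0; i < opcodes.size(); i++) {
            String op = opcodes.get(i);
            int pops;
            int pushes;
            switch (op) {
                case "ldc":   pops = 0; pushes = 1; break;  // push a constant
                case "ldloc": pops = 0; pushes = 1; break;  // push a local variable
                case "stloc": pops = 1; pushes = 0; break;  // pop into a local variable
                case "add":   pops = 2; pushes = 1; break;  // pop two values, push the sum
                default: return "unknown opcode '" + op + "' at index " + i;
            }
            if (depth < pops) {
                return "stack underflow at index " + i + " (" + op + ")";
            }
            depth = depth - pops + pushes;
            if (depth > maxStack) {
                return "stack depth " + depth + " exceeds maxstack " + maxStack
                        + " at index " + i;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // i = i + 1, as in the loop body: acceptable with a maximum depth of 2.
        System.out.println(check(Arrays.asList("ldloc", "ldc", "add", "stloc"), 2));
        // The second operand was never pushed, so add underflows the stack.
        System.out.println(check(Arrays.asList("ldloc", "add", "stloc"), 2));
    }
}
```

The first call reports no problem (null); the second reports a stack underflow at the add instruction.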

The runtime guarantees that the code it executes will not crash in an uncontrolled manner. To achieve this guarantee, the runtime performs code verification on each assembly as it's compiled from PE format into native code immediately before it is executed. The runtime checks for many problems, including attempts to use uninitialized variables, illegal memory accesses, and assignment of incompatible values to types. These checks eliminate a whole range of common programming errors and ensure that your program is type-safe and not prone to security failures.

For example, an attempt to read a piece of memory that was not directly allocated to the program could allow a devious programmer to gain access to all sorts of private data. Not all errors can be trapped at this time, but the runtime performs numerous checks at run time as well, preventing problems such as "out by one" errors when reading an array (stepping off the end of an array) or invalid type-casts. If these occur, the runtime will throw an exception.
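
The two run-time checks just mentioned can be seen in a short Java sketch. This is an analogy rather than .NET code: it runs on a JVM, where the exceptions are ArrayIndexOutOfBoundsException and ClassCastException, whereas the .NET Framework's own counterparts are IndexOutOfRangeException and InvalidCastException. The class and method names are made up for the example.

```java
// Java analog of the run-time checks described above (hypothetical class name).
class RuntimeChecksDemo {

    // Reads one element past the end of an array; returns the exception's class name.
    static String readPastEnd() {
        int[] values = new int[3];
        try {
            int unused = values[values.length];  // index 3; the last valid index is 2
            return "no exception (" + unused + ")";
        } catch (ArrayIndexOutOfBoundsException e) {
            return e.getClass().getSimpleName();
        }
    }

    // Casts an Integer (held as an Object) to String; returns the exception's class name.
    static String badCast() {
        Object boxed = Integer.valueOf(99);
        try {
            String text = (String) boxed;
            return "no exception (" + text + ")";
        } catch (ClassCastException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(readPastEnd());
        System.out.println(badCast());
    }
}
```

In both cases the faulty operation is stopped by an exception instead of reading memory that does not belong to the array or treating an object as a type it is not.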

If you're writing your own MSIL code, you can use the PEVerify tool, PEVerify.exe (in \Program Files\Microsoft Visual Studio .NET\FrameworkSDK\Bin), to check that your PE file contains valid and verifiable code.

Java Bytecodes

You've seen the .NET code for the "Hello, World" program. Now compare it to the Java bytecodes for the same program generated by the standard Java compiler supplied with Sun Microsystems' Java Development Kit (JDK). This output was generated using the javap tool, which also comes with the JDK.

    Compiled from Hello.java
    public class Greeting.Hello extends java.lang.Object {
        public Greeting.Hello();
        public static void main(java.lang.String[]);
    }

    Method Greeting.Hello()
       0 aload_0
       1 invokespecial #1 <Method java.lang.Object()>
       4 return

    Method void main(java.lang.String[])
       0 iconst_0
       1 istore_1
       2 goto 16
       5 getstatic #2 <Field java.io.PrintStream out>
       8 ldc #3 <String "Hello, World">
      10 invokevirtual #4 <Method void println(java.lang.String)>
      13 iinc 1 1
      16 iload_1
      17 bipush 10
      19 if_icmplt 5
      22 return

We won't go into the Java bytecodes here, but if you look at the main method you should recognize the pattern. The algorithm used by the MSIL code is exactly the same. (The JVM is also stack-based.) If you're interested in learning more, Bill Venners' book Inside the Java 2 Virtual Machine (McGraw-Hill, 1999) is a good place to start.

Compiling MSIL to Native Code

To speed up the execution process, you can compile MSIL executables and DLLs using the Ngen.exe utility. Ngen stands for Native Image Generator. When you run Ngen over an MSIL EXE or DLL, the file is compiled into native code and placed in the native image cache on your computer. Currently, this cache is a folder called NativeImages_<.NETVersion> under \Windows\assembly, where <.NETVersion> is the version of the .NET Framework that you have installed. When you invoke a .NET EXE or DLL, the runtime will check in the native image cache for a compiled version and will use it if it finds one; otherwise, it will load and compile your MSIL using the JIT compiler.

You can remove an image from the native image cache using the /delete parameter with Ngen.

Cross-Language Development

Earlier, we said that the runtime makes data portable across languages and that languages that execute using the runtime obey the same method-calling conventions. What does this mean for developers? For one thing, it eliminates the headaches you used to have in attempting to call a method in a DLL written using Visual C++ from your Visual Basic 6.0 application. Because the data types in C++ and Visual Basic are different (an int in C++ is a different length from an Integer in Visual Basic), you always had to remember to use a Long in Visual Basic instead. Then you had to make sure that the C++ methods used the standard calling convention; otherwise, you could end up with a corrupt stack. You also had to turn off name decoration in C++ so Visual Basic would actually find the names of the methods in the DLL. The final hurdle was using the Declare Function or Declare Sub statements in Visual Basic, making sure that you specified the correct path to the DLL to pretend that the methods you were calling were written in Visual Basic.

So many errors could creep in, and you wouldn't know about them until you tried to run your program. If you were lucky, you'd get a message box saying "Bad calling convention." Of course, more often than not, applications would crash, your computer would freeze, and you'd lose hours of work. (Developers never save anything until it all works!) Once you had the program functioning correctly, someone in the C++ development team would update the DLL (maybe by adding a parameter or two to an existing method or changing its return type) and ship a new version that broke your Visual Basic application again. Thankfully, Windows XP and the .NET Framework provide solutions to these problems.

Combining Visual Basic and J#

Calling Java routines from Visual Basic was a complex process, often involving the use of additional software such as the ActiveX Bridge for JavaBeans. Then along came the common language runtime, and what was complicated became simple. Look at the following sample J# package, CakeUtils.jsl:

    package CakeUtils;

    public class CakeFilling
    {
        public static final short Sponge = 0, Fruit = 1;
    }

    public class CakeShape
    {
        public static final short Square = 0, Round = 1, Hexagonal = 2,
            Other = 3;
    }

    public class CakeInfo
    {
        // Work out how many people a cake of a given size, shape,
        // and filling will feed
        public static short FeedsHowMany(short diameter, short shape,
            short filling)
        {
            double munchSizeFactor = (filling == CakeFilling.Fruit ? 2.5 : 1);
            double deadSpaceFactor;
            switch (shape)
            {
                case CakeShape.Square:
                    deadSpaceFactor = 0;
                    break;
                case CakeShape.Hexagonal:
                    deadSpaceFactor = 0.1;
                    break;
                case CakeShape.Round:
                    deadSpaceFactor = 0.15;
                    break;
                default:
                    deadSpaceFactor = 0.2;
                    break;
            }
            short numConsumers = (short)(diameter * munchSizeFactor *
                (1 - deadSpaceFactor));
            return numConsumers;
        }
    }

For years, John's wife has been baking and decorating cakes for birthdays, weddings, and other occasions. Customers would often ask her, for example, how big a birthday cake they would need to feed 25 people. The answer depends on several factors: primarily the shape of the cake (a square cake has more volume than a round cake and will feed more people) and the type of filling (a fruit cake of a given size feeds more people than a sponge cake of the same size). She tended to base her answers on past experience, but we thought we'd try to add a bit of science to the process and write some code to make the calculation. The result is the FeedsHowMany method in the CakeInfo class shown previously.

The CakeUtils package also defines classes exposing constants for the shape and filling of the cake. If you want to follow along, save the code in a file called CakeUtils.jsl. (You could do all of this in Visual Studio .NET, but it's more instructive to perform the build tasks manually for this example.)

You can compile the package into a DLL from the command line:

    vjc CakeUtils.jsl /target:library

The result is an assembly called CakeUtils.dll.

To test the package, use the following Visual Basic .NET class. Save it in a file called SizeCake.vb in the same folder as your J# source code and DLL.

    Imports CakeUtils, System

    Module SizeCake
        Sub Main()
            Dim numEaters As Short
            ' How many people will a 10" hexagonal sponge cake feed?
            numEaters = CakeInfo.FeedsHowMany(10, CakeShape.Hexagonal, _
                CakeFilling.Sponge)
            Console.WriteLine("This will feed " & numEaters & " people")
        End Sub
    End Module

You should notice a number of things about this Visual Basic application. It uses the CakeShape and CakeFilling classes exposed by the J# package, it calls the FeedsHowMany method, and the result is a short integer. These steps might seem obvious, but what's not obvious from the Visual Basic code is the language that was used to create the CakeInfo class. The code contains no Declare statements. You don't have to worry about how Visual Basic data types map to J# (or vice-versa) or about using the wrong calling convention. In fact, the developer writing the Visual Basic application does not need to know what language was used for the CakeInfo class.

To compile the SizeCake program and link it to use CakeUtils.dll, use the vbc compiler from the following command line:

    vbc SizeCake.vb /reference:CakeUtils.DLL

If you run the SizeCake.exe program, you'll be informed that a 10-inch hexagonal sponge cake will feed nine people. Besides working out how big a cake you need to order from John's wife, this little exercise proves how easy it is to combine code written in different languages when you use .NET.
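
To check the arithmetic behind that answer, here is a stand-alone Java restatement of the FeedsHowMany formula. It is a sketch for experimenting with the numbers, not part of the CakeUtils package: the class CakeArithmetic and its feeds method are hypothetical, and the two factors are passed in directly rather than looked up from the shape and filling constants.

```java
// Stand-alone Java restatement of the FeedsHowMany arithmetic (hypothetical class name).
class CakeArithmetic {

    // Same formula as FeedsHowMany, with the two factors supplied by the caller.
    static short feeds(short diameter, double munchSizeFactor, double deadSpaceFactor) {
        // The (short) cast truncates toward zero; it does not round.
        return (short) (diameter * munchSizeFactor * (1 - deadSpaceFactor));
    }

    public static void main(String[] args) {
        short ten = 10;
        System.out.println("Hexagonal sponge: " + feeds(ten, 1, 0.1));
        System.out.println("Round sponge:     " + feeds(ten, 1, 0.15));
        System.out.println("Square fruit:     " + feeds(ten, 2.5, 0));
    }
}
```

For a 10-inch cake the three cases give 9, 8, and 25. The hexagonal sponge works out as 10 x 1 x (1 - 0.1) = 9. The round sponge comes to 8.5 before the cast, and the cast truncates that to 8. By this formula, a 10-inch square fruit cake answers the original question of feeding 25 people.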

The Common Language Specification

Even though the common language runtime looks wonderful, it can't do everything. Cross-language method calls work because of the way the source code for each language is compiled into MSIL. Microsoft has made the data types as compatible as possible across all the .NET languages for which it has built compilers, including J#. And because all the code is converted to MSIL, the calling convention and name decoration issues evaporate. (MSIL does not use name decoration as we would recognize it.)

However, different languages inevitably have different data types. Compiler writers can map many of them into common formats and sizes, but there will always be types in one language that have no corresponding type in another. For example, Visual Basic has no direct equivalent to a C++ pointer. Naming conventions used by different languages can also be problematic; a valid identifier in one language might not be valid in another. To help address this problem, Microsoft has developed a set of rules for building cross-language classes, called the Common Language Specification (CLS).

The CLS is a large subset of all the runtime types and other features available for writing components that can be used by any of the languages supported by .NET. If you abide by these rules, your classes will be universally accessible. If you break any of the rules, you'll restrict the audience for your code. Bear in mind that the CLS applies only to the exposed elements of your components (public methods, fields, and so on). You can use any language features you like internally.

Common CLS rules

The first rule of the CLS is that global static fields and methods are banned. The CakeFilling class shown earlier is not actually CLS-compliant because it violates the first rule:

    public class CakeFilling
    {
        public static final short Sponge = 0, Fruit = 1;
    }

Nevertheless, you can still use the class from Visual Basic. Nonconformance with the CLS doesn't mean that your class can't be used from other languages; it just means it might not be usable if the language doesn't know how to handle the noncompliant construct.

The CLS restricts the data types you can use. The available primitive types are Byte, Int16, Int32, IntPtr, Int64, Single, Double, Boolean, Decimal, Char, and String. These are the official language-independent names of the types, and they can map to different named types in each of the languages supported by .NET. For example, the Int64 data type of the CLS corresponds to the long data type in J#, and a CLS Single is a J# float. For a method to be CLS-compliant, you must restrict its parameters and return types to CLS-supported types. You can use classes as parameter types and return values if the classes themselves conform to the CLS.

Another important rule concerns identifiers. In the CLS, for two identifiers to be considered distinct, they must vary by more than just their case. This rule is necessary because some languages, such as Java, are case sensitive but others, such as Visual Basic, are not. For example, if you define two methods in J# called MyMethod and Mymethod, Visual Basic will not be able to distinguish between them. Also, for strict CLS compliance, all names in the same scope must be distinct. You cannot have a class and a method that share the same name even if the language used to create the class allows it. Overloading is also an issue. The CLS allows methods to be overloaded based only on the number and types of their parameters; the return type cannot be used to distinguish between overloaded methods. Operator overloading is not supported, and neither are methods that take variable numbers of parameters.
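
The case rule is easy to check mechanically. The following Java sketch flags names in a list that differ from an earlier name only by case. It is an illustration under simplifying assumptions: the class CaseClashCheck and its findClashes method are hypothetical, and comparing lowercased strings is a stand-in for the more precise comparison the CLS itself defines.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper that flags names differing only by case.
class CaseClashCheck {

    // Reports each name that matches an earlier name when case is ignored.
    static List<String> findClashes(List<String> names) {
        Map<String, String> seen = new HashMap<>();  // lowercased name -> first spelling seen
        List<String> clashes = new ArrayList<>();
        for (String name : names) {
            String key = name.toLowerCase(Locale.ROOT);
            String first = seen.get(key);
            if (first == null) {
                seen.put(key, name);
            } else if (!first.equals(name)) {
                clashes.add(name + " clashes with " + first);
            }
        }
        return clashes;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("MyMethod");
        names.add("Mymethod");
        names.add("FeedsHowMany");
        System.out.println(findClashes(names));
    }
}
```

With the names used in the text, the sketch reports that Mymethod clashes with MyMethod, while FeedsHowMany passes.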

A CLS-compliant class cannot inherit from a class that is not CLS-compliant. Arrays are allowed if the elements conform to the CLS, the array has a fixed number of dimensions, and each dimension has a zero lower bound.

We've covered a lot of information, but you've seen just the highlights. Most of the remaining CLS rules don't apply to the Java language, so we won't elaborate further. One final point: A language does not have to implement every feature of the CLS in order for you to use it to write CLS-compliant code. For example, the CLS includes enumerations, which are not part of the Java language, but that does not prevent you from writing components in the Java language that meet the requirements of the CLS.

Memory Management

If you have a C++ background, you're accustomed to watching how your objects use memory. C++ allows programmers to grab large chunks of memory and not release it, making memory management a constant issue. Applications use the new operator to allocate a piece of memory and create an object, but then they forget to delete the object and release the memory when they're done. The essential rule of C++ and similar languages that allow you direct access to memory is "That which you new, you shall also delete." Failure to adhere to this decree means that your application will consume more and more memory, causing the application to run slower and slower until it eventually halts.

A second rule regarding memory access in C++ and similar languages is "Use only that which you created." C++ allows you to create pointers to blocks of memory. A well-behaved application restricts itself to pointing at memory that actually belongs to the address space of the process running the application. More rebellious applications attempt to read and write random pieces of memory anywhere on the computer, sometimes because the programmer failed to point the pointer to a valid location and other times for more insidious reasons. (Some systems have been known to exhibit a security loophole, allowing a developer to read and write memory that is not part of the running process.) Many modern operating systems will trap such memory accesses and terminate the offending application before it can cause damage to the operating system or to other processes running at the same time.

If you're a Java programmer who has never used C++, the previous two paragraphs might not mean anything to you because the Java language doesn't have pointers. Neither does the common language runtime. Instead, both the Java language and the common language runtime have reference types. A reference type is an object whose lifetime and memory are managed by the common language runtime. The common language runtime grabs memory on behalf of applications when those applications create new objects. The common language runtime tracks the use of these objects and releases the memory they use automatically after the last references to them disappear. Neither J# nor the Java language has a delete operator, as C++ does, to indicate when an object can be disposed of. Instead, the runtime used by both languages performs automatic garbage collection. In the case of the common language runtime, the objects are said to be managed.

Reference Types and Value Types in the Common Language Runtime

Reference types in the runtime are heap-based. The heap is the large block of memory that is managed by the common language runtime. When you create a new object, the common language runtime allocates memory from the heap and assigns it to your object. The garbage collector (part of the common language runtime) returns this memory to the heap when it collects your object.

All classes are reference types. The common language runtime also has value types. These are items that are allocated memory on the stack when they're created, and they automatically disappear when they go out of scope; they're not subject to garbage collection. Most of the primitive types in the common language runtime (int, long, float, short, bool, and so on) are value types. For example, in the DoSomething method shown here, int is a value type and the class MyClass is a reference type:

    public void DoSomething()
    {
        int j = 99;
        MyClass myThing = new MyClass();
        // j disappears; the MyClass object will be garbage collected later
    }

When the method runs, the integer j will be created on the stack and the new operator will create an instance of MyClass on the heap. At the end of the method, the variable j will disappear from the stack and its memory will be reused by the next variable that is created on the stack. The reference (myThing) to the MyClass object will disappear, rendering the object inaccessible when the method finishes. However, the object itself will remain in memory until the garbage collector decides to dispose of it.
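
One practical consequence of this distinction shows up in assignment: copying a value gives you an independent copy, whereas copying a reference gives you a second name for the same heap object. The Java sketch below illustrates this with a primitive int and a small class. It is a Java analogy with made-up names (ValueVsReferenceDemo, and a count field added to MyClass for the demonstration), not code from the chapter.

```java
// Java analog of value and reference behavior (hypothetical class names).
class ValueVsReferenceDemo {

    static class MyClass {
        int count;  // field added only so that a change can be observed
    }

    // Copies a primitive and changes the copy; returns the original value.
    static int copyPrimitive() {
        int j = 99;
        int k = j;  // k receives its own copy of the value
        k++;        // changing k leaves j untouched
        return j;
    }

    // Copies a reference and changes the object through the copy;
    // returns what the original variable sees.
    static int copyReference() {
        MyClass myThing = new MyClass();
        MyClass alias = myThing;  // both variables refer to the same heap object
        alias.count = 1;          // the change is visible through myThing too
        return myThing.count;
    }

    public static void main(String[] args) {
        System.out.println(copyPrimitive());
        System.out.println(copyReference());
    }
}
```

The first method returns 99 because j was never modified; the second returns 1 because both references point at one object.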

The fact that the common language runtime grabs chunks of memory on your behalf means that neither the Java language nor the languages supported by .NET need pointers, so they don't have them. (There is one exception: You can use pointers in C++ and C# running under .NET, but this code runs in unmanaged space, which we'll describe later in the chapter.) If you don't have pointers, you cannot point at specific blocks of memory outside the address space of the current process; therefore, no one can try to read them in an attempt to compromise the security of the computer.

To plug the gap totally, the verification process that MSIL code goes through as it is compiled ensures that all variables are initialized before use; using an uninitialized reference won't accidentally damage some random system memory. Also, the type-safety verification checks performed by the common language runtime prevent you from assigning invalid values to references. For example, you cannot assign the value 99 to a reference with the expectation that you can use it to access memory address 99.

Garbage Collection

Some programmers like automatic garbage collection, and others loathe it. On the positive side, garbage collection removes the need for your application to track the objects that it created and make sure they're deleted when no longer required. On the negative side, you can never be quite sure when your objects are reclaimed by the runtime because the garbage collection process is nondeterministic. Essentially, garbage collection happens when the common language runtime decides to do it. Garbage collection is a potentially expensive operation, so the common language runtime does it only when absolutely necessary.

Note

The documentation supplied with the .NET Framework SDK provides details on the algorithms used by the garbage collector (\Program Files\Microsoft Visual Studio .NET\FrameworkSDK\Tool Developers Guide\docs\Partition I Architecture.doc). Be aware that they might change in future versions of the common language runtime.


If you come from a C++ background, the garbage collector will affect how you design your classes. The most important point is that when you use the common language runtime, you cannot necessarily count on destructors running when your objects disappear. Just because you've finished using an object doesn't mean that the garbage collector will dispose of it immediately.
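To see why you cannot rely on the timing, consider the following rough sketch. It is written in Java, not J#, and uses the JVM's WeakReference and System.gc as stand-ins for the runtime's equivalent facilities, which is an assumption made purely for illustration. The class and method names are hypothetical.

```java
import java.lang.ref.WeakReference;

// Illustrative only: the class and method names are hypothetical.
final class CollectionTiming {
    // Returns true if the object can still be reached through the weak
    // reference while a strong reference to it is held.
    static boolean aliveWhileReferenced() {
        Object thing = new Object();
        WeakReference<Object> watcher = new WeakReference<>(thing);
        System.gc(); // only a request; the object is still strongly referenced
        // Comparing against thing keeps the strong reference in use.
        return watcher.get() == thing;
    }

    public static void main(String[] args) {
        System.out.println(aliveWhileReferenced()); // prints true

        Object thing = new Object();
        WeakReference<Object> watcher = new WeakReference<>(thing);
        thing = null; // the last strong reference is gone
        System.gc();  // a hint, not a guarantee
        // Either outcome is legal; the collector decides when to reclaim.
        System.out.println(watcher.get() == null
                ? "collected"
                : "not collected yet");
    }
}
```

The second half of main deliberately makes no promise about its output, which is exactly the point: dropping the last reference makes an object eligible for collection, but nothing more.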

Finalizers

If your objects obtain further resources (file handles or database connections, for example), you can arrange for these resources to be reclaimed when the object is garbage collected by using a finalizer. A finalizer is a method, called finalize, that runs when the garbage collector wants to remove the object from memory but before the object actually disappears, as the following code shows:

    public class MyClass
    {
        protected void finalize()
        {
            // Put finalization code here
        }
    }

Remember, you cannot guarantee when garbage will be collected, so you should not put critical code in a finalizer!

By default, the garbage collector runs as a separate thread inside the common language runtime. The runtime exposes the GC class, which contains static methods that you can use to influence and query the status of the garbage collection process. For example, the method GC.Collect forces garbage collection to start. Bear in mind, though, that all this does is signal to the thread running the garbage collector that it should do some work; it does not wait for that work to be completed. There is also a KeepAlive method you can use to prevent an object from being collected. Use these methods only when you have a really good reason to do so. The garbage collector works best when left to its own devices.

Deterministic Garbage Collection

So far, you haven't seen much difference between the way the JVM and the runtime manage memory. However, the runtime does provide a mechanism for deterministic garbage collection through the IDisposable interface.

The IDisposable interface specifies a single method called Dispose. You call this method to free up any resources used by the object, just as a finalizer would. You should also create a finalizer that calls the Dispose method, so that the resources are still released if a consumer forgets to call Dispose. The Dispose method should include a call to the method GC.SuppressFinalize, which prevents the garbage collector from calling the finalizer later for an object that has already been disposed of. There's a good reason to do this: The resources have already been released, and trying to release them again can cause errors. Skipping the finalizer also speeds up garbage collection because the collector can simply reclaim the object without doing any further finalization.

The following code fragment shows the general shape for a J# class that implements IDisposable :

Note

The structure shown here is a précis of the Dispose pattern documented by Microsoft. For more details, consult the .NET Framework SDK documentation.


    public class CollectedClass implements System.IDisposable
    {
        public void Dispose()
        {
            // Release resources

            // Remove object from garbage collection queue
            System.GC.SuppressFinalize(this);
        }

        protected void finalize()
        {
            Dispose();
        }

        // Other methods
    }

A consumer program that creates instances of a disposable class can call the Dispose method explicitly when the application is done with the object, but the runtime also provides an automatic mechanism that allows the consumer to specify when those objects should be disposed of. This feature is not available for consumers written in J#. However, it's still good practice to implement IDisposable in your J# components because they might be used from other languages.
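Without an automatic mechanism, a consumer has to call the disposal method explicitly, and a try/finally block is one way to make sure that happens even if the work in between throws an exception. The following is a minimal sketch in plain Java syntax. TrackedResource is a hypothetical stand-in for a disposable component, not a .NET class, and its dispose method only imitates Dispose by guarding against a second release.

```java
// Hypothetical stand-in for a disposable component; the names are illustrative.
final class TrackedResource {
    private boolean released = false;
    private int releaseCount = 0;

    // Plays the role of Dispose: safe to call more than once.
    void dispose() {
        if (released) {
            return;       // already released; do nothing
        }
        released = true;
        releaseCount++;   // stands in for closing a handle or connection
    }

    boolean isReleased() {
        return released;
    }

    int getReleaseCount() {
        return releaseCount;
    }
}

final class ExplicitDisposal {
    // Uses the resource and guarantees disposal, even if the work throws.
    static TrackedResource useAndDispose() {
        TrackedResource thing = new TrackedResource();
        try {
            // Work with thing here
        } finally {
            thing.dispose();
        }
        return thing;
    }

    public static void main(String[] args) {
        TrackedResource thing = useAndDispose();
        thing.dispose(); // a second call is harmless
        System.out.println(thing.isReleased());      // prints true
        System.out.println(thing.getReleaseCount()); // prints 1
    }
}
```

The guard flag serves the same purpose as suppressing finalization: the resources are released exactly once, no matter how many times disposal is requested.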

The following example shows some C# code that creates and disposes of an instance of the CollectedClass shown earlier. It employs the using construct of C#.

    public static void Main()
    {
        using (CollectedClass myThing = new CollectedClass())
        {
            // myThing is accessible in this block
        } // myThing is disposed of at this point
    }
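If you know more recent versions of the Java language, the closest analog to the using construct is the try-with-resources statement, which is a Java feature and is not part of J#. The following sketch is offered only as a comparison. ScopedResource is a hypothetical class that records each step in a log so that you can see the order of events.

```java
// Hypothetical resource that logs its lifecycle; the names are illustrative.
final class ScopedResource implements AutoCloseable {
    private final StringBuilder log;

    ScopedResource(StringBuilder log) {
        this.log = log;
        log.append("open;");
    }

    void use() {
        log.append("use;");
    }

    // Called automatically when the try block ends, like Dispose with using.
    @Override
    public void close() {
        log.append("close;");
    }
}

final class ScopedDisposal {
    // Returns the sequence of lifecycle events in the order they occurred.
    static String run() {
        StringBuilder log = new StringBuilder();
        try (ScopedResource thing = new ScopedResource(log)) {
            thing.use(); // thing is accessible in this block
        }                // thing is closed at this point
        log.append("after;");
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints open;use;close;after;
    }
}
```

In both languages the cleanup call happens at a known point, the end of the block, rather than whenever the garbage collector gets around to it.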

Integrating Unmanaged Code into .NET Applications

A lot of code was written before Microsoft created .NET. Some of it still runs and is still in use. Rather than forcing you to throw away perfectly good components and rewrite them for .NET, Microsoft allows you to integrate them into your .NET applications as-is. You have several techniques available to incorporate your old code into .NET. Your choice will depend on the type of component. All of these methods involve calling unmanaged code, that is, code that does not run under the common language runtime. Unmanaged code cannot be verified by the common language runtime and might corrupt the memory used by the process. Applications that execute unmanaged code have to be granted special access rights. Furthermore, method calls from managed to unmanaged code have to be marshaled into and out of the common language runtime, which will hurt performance if it occurs often.

To integrate .NET applications with unmanaged DLLs, you can use the Platform Invoke Service, also known as PInvoke. PInvoke locates a specified DLL and loads it into unmanaged space before invoking the required function in the DLL. The techniques used for specifying the DLL to be loaded and the method to be invoked are reminiscent of those required when you integrated DLLs into Visual Basic 6.0 applications. The major difference is that PInvoke is far more robust than the Visual Basic 6.0 runtime, and a bad method call is unlikely to crash your application, although it might generate exceptions that you can trap. If you're familiar with Visual J++ 6.0, you might have used a similar facility called J/Direct for making method calls into DLLs from Java. J/Direct is still available with J#.

COM is still an important technology, and you can use COM components from managed code by creating Runtime Callable Wrappers (RCWs). These are pieces of code that act as proxies, making COM components appear as if they are ordinary .NET components. The RCW is responsible for locating and loading the COM component into unmanaged space, marshaling and converting managed parameters into unmanaged data, and converting returned data back into managed types. You can also call a .NET component from COM. This involves creating a COM Callable Wrapper (CCW), which acts as a proxy for the .NET component, marshaling COM data into the managed space of the common language runtime.

We'll look at integrating J# with legacy code in more detail in Part IV of this book.



Microsoft Visual J# .NET (Core Reference)
Microsoft Visual J# .NET (Core Reference) (Pro-Developer)
ISBN: 0735615500
EAN: 2147483647
Year: 2002
Pages: 128
