| ImplementationsThe principal rule for Hotdog's compilation strategy is "keep it simple." Rather than exploit the idiosyncratic behavior of a particular VM implementation, Hotdog attempts to generate code that mimics a Java or C# compiler. The JIT compilers, given only a few microseconds to compile a method, are designed to optimize patterns found in common Java and C# programs. An aggressive JIT compiler can inline method calls and field accesses , remove array bounds checks, and optimize method dispatch. As the VMs and JIT compilers will only strive to improve the performance of object-oriented programs, the performance of Hotdog Scheme will, along with that of Java and C#, be carried forward by every improvement in virtual machine technology. Attempts to fool the VM will sacrifice these gains. ClosuresThe heavy use of closures in Scheme makes its implementation critically important. A closure is a data structure containing executable code and its associated environment. Unlike conventional languages, it is not necessarily attached to a name . Instead, it can be used like any other data type: passed as an argument, stored in a variable, or even deallocated when it is no longer needed. In fact, there are no named functions or methods , as are common in conventional languages such as C, Java, and C#. In Scheme, global names can be bound to closures, Boolean values, strings, or anything else. If the name is bound to a closure, then it can be applied like a function. This flexibility allows for particularly powerful programming techniques. Although there is no structure analogous to a closure on either VM, there are a number of possible implementation strategies using the provided object-oriented features. Whereas some implementations attempt clever encodings, Hotdog opts to generate code that can be clearly understood by the JIT compiler. Ideally, the JIT compiler will be able to recognize and optimize Hotdog's simple object-oriented code patterns, producing faster code than is possible with a more aggressive implementation. General ImplementationA closure is implemented as a subclass of a general Closure class, where one (or more) of the apply methods is overridden to supply the closure's body of code (see Listing F.1). The subclass also implements a class constructor method that takes the closure's environment and stores it within the class instance. Every instance of this class is a closure with a different environment. Listing F.1 The Closure class is the parent of all Scheme closures class Closure {   Closure (Environment e) {}   object apply () { error() ;}   object apply (object x1) { error() ;}   object apply (object x1, object x2) { error () ;}   object apply (object x1, object x2, object x3)                { error() ;}   object apply (object x1, object x2, object x3, object x4)                { error (); }   object apply (object x1, object x2, object x3, object x4,                object x5) { error() ;}   object apply (object x1, object x2, object x3, object x4,                object x5, Pair rst) { error() ;} } The number of arguments supported by the closure class is arbitrary. The example Closure class in Listing F.1 shows only five arguments, followed by a Pair argument. If the number of arguments in a Scheme closure exceeds the hard limit set by the Closure class, then Hotdog packages the remaining arguments into a list and passes that list as the final argument. The preamble to the Scheme closure will unpack the list and assign the remaining arguments to local variables before execution of the body begins. It is likely that adding more arguments will improve performance and allow for further optimizations, but the compiler must be able to handle any number of arguments. The Closure class implements every apply method with an error function. If a Scheme closure is applied with the wrong number of arguments, it will not have implemented that apply method; therefore, the runtime environment will call the apply method on the Closure class, which will report the error. This is usually an unrecoverable error in a Scheme program, and execution will fail. EnvironmentsWhen a Scheme closure is allocated, an environment is given to the class constructor. The environment is implemented as an array of objects. As modern VMs can optimize array range checks, array accesses can be quite cheap. However, because all environment values are coerced to the Object type, every use must be preceded by a type coercion. Also, because small values must be boxed, they must be unboxed again before use. As Scheme is latently typed, these costs are usually mitigated by special optimizations and runtime tricks in Scheme virtual machines. Unfortunately, neither object-oriented VM offers any way to reduce this overhead. Variable Argument ListsScheme supports variable argument lists, which, like C's printf function, allow an arbitrary number of arguments to the closure (see Listings F.2 and F.3). Hotdog generates a single private method named exec , with the same number of fixed arguments, plus an extra argument to accept the variable arguments as a list. Then, every apply method with at least the number of required arguments is implemented, each one packaging the extra arguments into a list and sending them to the exec method. Because the caller does not know the signature of the closure it is calling, it can assume the closure will do the right thing with the arguments it sends. Listing F.2 A closure that accepts three or more arguments(lambda (x y z . rst) body ...) Listing F.3 Implementation of a closure with three or more arguments class lambda1 : Closure {   lambda1 (Environment e) {}   object apply (object x1, object x2, object x3) { ... }   object apply (object x1, object x2, object x3, object x4)                { ... }   object apply (object x1, object x2, object x3, object x4,                 object x5) { ... }   object apply (object x1, object x2, object x3, object x4,                 object x5, Pair rst) { ... }   object exec  (object x1, object x2, object x3, Pair x4)                 { body ... } } For many Scheme procedures, this implementation of variable argument lists can be quite costly, as it allocates a list for every call. For example, the arithmetic procedures accept zero or more arguments, which means simply adding two numbers will involve boxing the numbers and calling the "+" procedure. The "+" closure's apply method, which takes two arguments, will then pack both numbers into a list and call the exec method. This method will unpack the list, check the types of both numbers, unbox both, and, finally, add them. As several important Scheme procedures use variable argument lists, this technique can seriously degrade the overall performance of a Scheme program. Case-Lambda Special FormTo provide better support for variable argument lists, Hotdog adds compiler support for a "case-lambda" special form (see Listing F.4). It allows one to define a closure with multiple entry points. In terms of classes, it allows one to write code for each apply method. The VM will quickly dispatch to the correct apply method, where user -written code can handle the arguments immediately. If the case-lambda form uses a variable argument list, then the compilation strategy described earlier applies. Listing F.4 Use of case-lambda form(define + (case-lambda (() 0) ((x) x) ((x y) (%primitive-add x y)) ((x y . rst) (let ((sum (%primitive-add x y))) (if (null? rst) sum (+ sum (car rst) (cdr rst))))))) Compiling the case-lambda form in the presence of the Closure class, which has a fixed number of apply methods, requires special care. If an arm of the case-lambda uses more fixed arguments than the Closure class can provide, then extra dispatching code is generated in the final apply method. For example, if a case-lambda has one arm that takes eight arguments and another arm that takes nine arguments, both will end up calling the last apply method, which takes only five arguments and a list containing the remaining arguments. The compiler must generate code in this apply method that unpacks the list, checks the total number of arguments provided, and executes the correct arm of the case-lambda. Therefore, the last apply method will contain code for all case-lambda arms with more than five arguments. This situation is further complicated by an arm of a case-lambda that accepts more than five fixed arguments and a variable argument list, but the compilation strategy is an obvious extension of that discussed earlier. Small ValuesBecause Scheme is latently typed, every value must be passed in a uniform way. This means, for example, that the parameters to a method must be able to accept all Scheme types. To do so, most Scheme systems pass around a pointer to a heap-allocated value. In many respects, this approach is similar to the general collection classes in Java and C#, which take items of type object . To make a collection of integers, all the integers must be boxed and coerced to the object type before they can be inserted into the collection. When an object is extracted from the collection, the type must be coerced before it can be used. This process results in a significant performance hit. Scheme runtime systems generally solve this performance problem by encoding small values directly in the pointer. Because pointers are word aligned, a valid pointer will have the low bits set to 00. A Scheme runtime system will use those low bits to tag small values. For example, if the low bit is set to 1, then the remaining 31 bits contain an integer. The integer does not need to be heap allocated, type checking is a simple bit check operation, and arithmetic can be done directly on the encoded value. Similarly, Boolean values and characters can be encoded, with additional tags, directly in the pointer. The runtime and garbage collector know to check a pointer before following it into oblivion. Unfortunately, the object-oriented VMs do not allow one to tinker with the object pointers. Each pointer is sacrosanct, because the garbage collector, runtime, and byte-code verifier require it to be valid. As a result, all small values must be boxed before passing them around. This incurs a significant overhead, as illustrated by the general collection classes. In fact, a simple numeric program written in Java or C# with boxed integers will demonstrate a performance gap compared to an unboxed version of the same program. This is a critical performance problem for latently typed languages on an object-oriented VM. This performance gap can be reduced somewhat by preallocating a range of small values. At initialization time, the Hotdog Scheme runtime system will box all ASCII characters, the integers from 1 to 1000, and the Boolean values true and false. At runtime, a special routine is used to box a small value. This routine first checks whether a preallocated value is available and, if so, returns it. If no preallocated value is available, it boxes the integer as it normally would. Avoiding allocation for the most commonly used numbers and characters improves performance noticeably. These values must still be unboxed to be used, which invokes a type check on the object before extracting the value. Therefore, there is still a performance penalty paid for numbers on the object-oriented VMs. Dynamic Type CheckingAs a latently typed language, Scheme remains unconcerned about the types of values being used until the program explicitly attempts to interact with a value. For example, the only time Scheme cares that a value really is a number is when one tries to add two numbers together. Of course, the programmer is often concerned about the types in a program; therefore, a program may explicitly check the type of a value during execution using type predicates. Therefore, all the types will be handled in the same way until a definite action on the type takes place. To layer this notion of types atop the object-oriented VMs, all objects must be handled as object types. This technique, called type erasure [6], replaces all type declarations with the object type. Whenever a type-specific operation is performed, such as reading from an input port, the object must first be cast to the correct type before the operation begins. If the coercion fails, it is a runtime error. The return value for a successful operation must also be coerced to the object type. Although this implementation meets the needs of a latently typed language, it does not match the behavior of statically typed object-oriented languages. Consequently, most object-oriented VMs do not attempt to optimize dynamic type checking. In addition, where Scheme is concerned with the immediate type of an object, object-oriented languages are interested in the type hierarchy. That is, Scheme wants to know whether an object is a closure, whereas the object-oriented languages want to know whether an object is, or is a child of, a closure. Although this is fine if the object really is a closure, if it is not then the VM checks the object against each parent type, all the way back to the root type. Checking whether an object is a child of a type implies additional work that is completely unnecessary for Scheme. Because dynamic type checking occurs continuously in a Scheme program, any overhead is too much overhead.   | 
