12.6 Optimizing Scheme

The simple compilation techniques proposed here are woefully inefficient. Fortunately, by taking a less short-sighted view of the compilation process, performance can be greatly improved.

12.6.1 Static Scoping

In the implementation described in section 12.5, each time a variable is referenced, the binding environment has to find the name of the variable in a hash table. If it fails to find it there, it must check the ancestral binding environments. This search can be expensive, since it involves computing the hash value and a String comparison. It is especially expensive for symbols in the default binding environment like +, which may require several hash table lookups.

A Scheme compiler can take advantage of the fact that Scheme has static scoping. "Static scoping" means that the binding environment is set when a lambda expression is first evaluated instead of when the lambda is applied. Static scoping can be used to replace the expensive search through hash tables with a much cheaper one, since the compiler knows where to find each symbol when the code is compiled.

Earlier we looked at this Scheme program:

 (define add    (lambda (x)        (lambda (y)            (+ x y))) (define add2 (add 2)) (define add99 (add 99))

It creates two different anonymous procedures, anonymous2 and anonymous3, corresponding to the outer and inner lambda expressions. The first is the result of evaluating the outer lambda expression (lambda (x) (lambda (y) (+ x y)). It creates a binding environment that binds x, then evaluates the form (lambda (y) (+ x y)). The binding environment points back to the default binding environment for symbols other than x.

The second anonymous procedure is produced by evaluating the first. It creates a binding environment that binds y, then evaluates the form (+ x y). The ancestor of that binding environment is the one in which the inner lambda expression was evaluated.

Consider the evaluation of the form (add2 3). A new binding environment is created that binds y to 3. The binding environments look like Figure 12.2. In this diagram the relevant binding environments ancestors are shown. Ancestor pointers are represented by curved lines. The current computation is the box labeled "Computing (+ x y)." The environment binds y to 3, and the curved line shows the environment's ancestor pointer. The ancestor of that is the environment of anonymous2, and the ancestor of that is the default binding environment.

Figure 12.2. While evaluating `(add2 3)`

graphics/12fig02.gif

The same situation would hold if add99 were invoked. The new environment would look identical, except the value of x would be different. In both cases, y is the first element in the current environment. You would have to look back one environment for x, three for +. The symbols x and y are each in the first position, and + is in the fourth position.

This information can be used to create a better kind of binding environment, which looks up bindings by position rather than by name. The environment is implemented like this:

 public class Environment {    /** A list of previous binding environments. The first element     * is always this. The second is always the immediate ancestor,     * and so on. The last element is the default binding     * environment.     */    public Environment ancestors[];    /** The values of the symbols, indexed by position */    public Object value[];    /** Values are looked up by position and by environment */    public Object lookup(int position, int ancestor)    {        return ancestors[ancestor].value[position];    } }

The new definition of lookup uses a list of ancestors instead of a single ancestor pointer. This means that instead of having to recursively call lookup on previous ancestors, it can instead jump directly to the correct ancestor. Also, instead of having to look up the position in a hash table using the name of the symbol, it uses only the position of that symbol in the list. This is a simple array lookup, which is much faster than a hash table lookup.

The new definition of Environment requires a change in the way variables are accessed and the way they are bound. The definition of call in anonymous3 using this definition of Environment looks like this:

 .method call([Ljava/lang/Object;)Ljava/lang/Object .var 0 is this Lanonymous3; .var 1 is args Ljava/lang/Object; from initialize to code .var 1 is bindings LEnvironment; from code to end initialize: ; Create a new binding environment new Environment dup aload_0                        ; Get this.env getfield Environment/env LEnvironment; invokespecial Environment/<init>(LEnvironment;)V ; args contains the bindings for the current ; environment. Set the value array of the environment to ; the argument array dup                            ; Dup the array ref aload_1                        ; Store args in value putfield Environment/value [Ljava/lang/Object; astore_1                       ; Location 1 holds the                                ; new environment ; Now comes evaluation of the body. Instead of loading variables ; by name, they are loaded by position. From the diagram, these ; positions are ; Symbol     Position       # of ancestors back ; y             0                    0 ; x             0                    1 ; +             3                    3 ; Get + aload_1                        ; Get the environment iconst_3                       ; Push position (3) iconst_3                       ; Push ancestor (3) invokevirtual Environment/lookup(II)Ljava/lang/Object; iconst_2                       ; Create the arguments to + anewarray java/lang/Object     ; Call this array "args" ; Get x and store it args[0] dup                            ; Dup the array iconst_0                       ; Push position 0 aload_1                        ; Get the environment iconst_0                       ; Push position (0) iconst_1                       ; Push ancestor (1) invokevirtual Environment/lookup(II)Ljava/lang/Object; aastore                        ; Store x in args[0] ; Get y and store it in args[1] dup                            ; Dup the array iconst_1                       ; Push position 1 aload_1                        ; Get the environment iconst_0                       ; Push position (0) iconst_0                       ; Push ancestor (0) invokevirtual Environment/lookup(II)Ljava/lang/Object; aastore                        ; Store y in args[1] ; Now invoke + on x and y and return the result invokevirtual Procedure/call    ([Ljava/lang/Object;)Ljava/lang/Object end: areturn

This way of doing things has turned a hash table lookup into an array reference, which is much faster.

12.6.2 Using Local Variables

The compiled code can be sped up even further by passing variables the way Java does: as local variables. That way, they never need to be put into the environment at all. This saves both the array lookup when the variables are evaluated and the need to create an array for the arguments to the procedure calls.

This is a little more complicated that it sounds, for several reasons. First, it means that each call statement is different. In the previous examples, each instruction to invoke call was identical. The arguments are passed as an array of Objects. Second, it makes defining procedures like + more difficult. As described earlier, Scheme allows methods to take an arbitrary number of arguments, which means that the same definition of + can be used for both (+ 2 3) and (+ 1 2 3 4 5 6). In the Java virtual machine, the number of arguments passed must be the same as the number of parameters the method expects to receive.

These problems can be solved by redefining Procedure:

 class Procedure {    public Object call()        {return call(new Object[0]); }    public Object call(Object a)        {return call(new Object[1] {a }); }    public Object call(Object a, Object b)        {return call(new Object[2] {a, b }); }    public Object call(Object a, Object b, Object c)        {return call(new Object[3] {a, b, c }); }    public Object call(Object[] args)    {        throw new RuntimeException("Wrong number of arguments");    } }

When a lambda expression is evaluated, it produces a new class that subclasses Procedure and overrides the call method with the appropriate number of arguments. A procedure like +, which takes an arbitrary number of arguments, overrides the last definition of call, which takes an array of arguments, just as before.

In the body of a procedure that takes three or fewer arguments, when a symbol is evaluated that is an argument to the current procedure, the procedure uses the aload instruction to push the argument instead of making a call to lookup. The binding environment is used for all other lookups. A new binding environment has to be created only if there is a lambda expression anywhere within the body. This can be determined by examining the code.

For example, let's examine the code for this expression:

 (lambda (x) (* x x))

This code causes a new procedure to be compiled; let's call it anonymous4. The definition of this procedure is

 .class anonymous4 .super Procedure ; Override the definition of call which takes 1 argument .method call(Ljava/lang/Object;)Ljava/lang/Object; ; There is no need to create a new environment, since there ; are no lambdas in the body. ; Compute and return the value of (* x x). First evaluate *, x, ; and x aload_0                                ; Evaluation of * involves getfield anonymous4/env LEnvironment;  ; a lookup call ldc "*" invokevirtual Environment/lookup     (Ljava/lang/String;)Ljava/lang/Object; checkcast Procedure                    ; Make sure it's a                                        ; Procedure aload_1                                ; Push x. It is stored in                                        ; the argument (variable 1) aload_1                                ; Push x again ; Call * with 2 arguments invokevirtual Procedure/call    (Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; areturn .end method

This procedure calls the multiplication procedure with two arguments. The definition of the multiplication procedure must be prepared to handle any number of arguments.

Procedures that take more than three arguments override the last definition of call, as if it took an arbitrary number of parameters. These procedures have to pay the higher price of getting their arguments using lookup instead of from local variables. Since procedures that take more than three arguments are uncommon, this is a price which will not be paid very often.

12.6.3 Inlining

Another nice effect of static scoping is that not only will you find the variables in the same locations each time, you will also find the same values, as long as they have not been redefined.

Suppose this is your program:

 (define pi 3.14) (define area    (lambda (r)        (* pi (square r)))

It defines area as a method that computes the area of a circle with radius r. It would be nice to be able to inline the value of pi and square so that area really executes

 (* 3.14 (* r r))

This code substitutes 3.14 for pi and expands the call to square. This is more efficient, since it substitutes a constant value for a symbol lookup, and it eliminates the overhead of the call to square. However, it can cause problems. Suppose somebody came along and decided that the value of pi was insufficiently precise:

 (set! pi 3.14159)

This code redefines pi. If the value of pi were inlined into area, then the definition of area would now be incorrect. The easiest way to handle the problem is to force the user to explicitly recompile the program whenever set! is used.

12.6.4 Tail Recursion

Inlining is not always possible. For example, consider a definition like this:

 (define factorial (lambda (x value-so-far)    (if (< x 1)        value-so-far        (factorial (- x 1) (* x value-so-far)))    ))

This is a recursive definition of factorial. If the compiler tried to inline factorial, it would find that the expanded definition contained another use of factorial. The compiler could go on forever like that.

There is a way to avoid the overhead of having to call factorial over and over again. The compiler can notice that the last thing the definition does is to call factorial again. The compiler can replace the recursive call with a loop. Compiled into bytecodes, this becomes

 ;; The code to handle the binding of x goes here loop:    ;; Code to evaluate (< x 1) not shown    ifeq success           ; Loop until x < 1    ; If the test fails:    ;; Code to compute (* x value-so-far) and bind it to value-    ;; so-far not shown    ;; Code to compute (- x 1) and bind it to x not shown    goto loop             ; Loop again success:    ;; Code to get the value of value-so-far from the environment    ;; not shown    areturn

This code has the same effect, but it doesn't require a new stack frame to be created for each multiplication. It thus executes much more quickly.

12.6.5 Using Type Information

Although Scheme is a language with very weak typing, it is still sometimes possible to prove things about the type of an expression during compilation. For example, the form

will certainly result in an integer. Under some circumstances, it may be possible to avoid the effort of wrapping the number in an Integer wrapper and use it directly as a JVM int value.

A compiler can recognize that the default binding of the + symbol means adding numbers together. Consider this form:

 (+ 2 3)

In earlier discussions, this form produced rather ungainly code that looked up the symbol +, created two new Integer objects, stored them in an array, and called the procedure. The procedure unwrapped the objects, then created yet another object to hold the result.

A sufficiently intelligent optimizing compiler that practices inlining can recognize that only integer values are used in this form and generate this code instead:

 iconst_2                  ; Push 2 iconst_3                  ; Push 3 iadd

An even better optimizing compiler might recognize that this entire form will evaluate to 5 and optimize the entire thing to

 iconst_5

However, this simple-sounding optimization is not so simple to implement. One complication is in method arguments. In previous examples we used the class Object to represent every argument. Doing this depended on the fact that numbers were wrapped up in Integer objects. If this is no longer true, how are method arguments defined?

Another complication occurs with Scheme lists, which are heterogeneous. That is, a list may contain entities of any type. A JVM array can't hold both ints and Objects.

The simplest answer is that the optimization shown is used only in limited circumstances. Whenever an int value is passed to a method that is expecting any object, it must be wrapped up in an Integer object.

More sophisticated optimizers can recognize when a method applies only to int values. For example, consider this expression:

 (lambda (x) (* x x))

The optimizer can infer that x must be a number because the * operation is applied. This means that instead of defining the call method to take two Objects, it can instead take two ints.

Further complicating matters is the fact that in real Scheme, there are other types of numbers besides integers. The full Scheme specification calls for floating-point numbers, fractions, and arbitrary-length whole numbers. There exist other solutions that an optimizer can use to handle these types as well, using overloaded method definitions, but they are not discussed here.