In this chapter we have seen how to compile small languages, like regular expressions, and bits and pieces of larger languages. These and many other languages can be implemented using the Java virtual machine.
There are a variety of reasons to implement other languages in JVM code. Each language has its own flavor. Each programmer has a favorite language. Different programming
In succeeding chapters, we explore compilers for two languages, Scheme and Prolog. These languages are very different from Java. Scheme is based on the idea of using functions as data. Prolog is based on the idea of using rules to specify program
For an interesting discussion of extending Java to support other language features, visit http://wwwipd.ira.uka.de/~pizza. This site discusses Pizza and GJ, two extended Java languages, which compile into bytecodes.
Chapter 12. Implementing Scheme
Scheme is a language based on Lisp. Scheme features procedures as first-class entities; that is, procedures are treated just like any other kind of value. They may be passed as arguments to other procedures, stored in
12.1 Scheme Concepts
The fundamental units of Scheme programs are numbers (written like Java
A Scheme program is called a form . Here is an example of a form:
(+ 2 3)
This form denotes the list containing three entities: the symbol
, the number
, and the number
. To run a Scheme program, the Scheme
For example, in the list
(+ 2 3)
, the symbol
evaluates to a procedure that adds numbers together, and
There are a few exceptions to the third rule. Some lists have keywords in the first position; these are evaluated differently from other lists. These lists are called special forms .
To create a new procedure in Scheme, a special form called a lambda expression is used. Here is an example of a lambda expression:
(lambda (x) (* x x))
This expression evaluates to a new procedure, in this case, one that squares its parameter by multiplying it by itself. A lambda expression is written with the keyword lambda as the first element. The second element is a list of formal parameters. The third element is called the body of the lambda expression.
When the procedure is applied to a list of actual parameters, a new binding environment is created in which the symbols in the list of formal parameters are bound to the elements of the list of actual parameters. The new binding environment includes the one in which the lambda expression was first evaluated.
( (lambda (x) (* x x)) 5 )
This is a list with two elements. The first element is a lambda expression, which evaluates to a procedure, as described. The second element of the list is the number 5 , which evaluates to itself.
After evaluating the list elements, the evaluator evaluates the whole form by using the first element as a procedure and the second element as the argument to that procedure. To apply the procedure, a new binding environment is created, binding the symbol x to number 5 . The new binding environment also contains the environment in which the lambda expression was created, so any symbol not explicitly defined in the new environment is inherited.
Next, the body of the lambda expression is evaluated with respect to the new binding environment. The symbol * is not defined in the new binding environment, but it is defined in the environment in which the lambda was first evaluated. It evaluates to a procedure that multiplies its arguments together. The symbol x is evaluated, yielding the value 5 , since that's the result of looking up the symbol x in the binding environment.
Finally, the evaluator applies the multiplication procedure to the numbers 5 and 5 , returning 25 . The value of the entire form is 25 .
Another special form is the if special form:
(if (< a b) b a )
This is an expression that returns the maximum of
special form starts by evaluating the second element of the list, called the
. In this case, the test is the expression
(< a b)
. The symbol
evaluates to a procedure that
The third and fourth elements of the if special form are called the true-expression and the false-expression . If the test returns #f , then the value of the if expression is the result of evaluating the false-expression. Otherwise, the value of the if expression is the result of evaluating the true-expression.
Only the true-expression or the false-expression is evaluated, not both. That's why if is a special form rather than a procedure. If it were a procedure, then both the true-expression and the false-expression would be evaluated. If they have side effects, like printing a value, then both sets of side effects would happen, which would be incorrect.
Suppose a is bound to 5 and b is bound to 26 . Consider the expression
(if (< a b) b a)
To evaluate this expression, the evaluator first evaluates the test:
, respectively, and
evaluates to the
The define special form creates new entries in the binding environment. Here's an example:
(define max (lambda (a b) (if (a < b) b a ) ) )
When evaluated, the define special form defines a new symbol called max in the binding environment. The value of this symbol is the result of evaluating the form that is the third element of the list. That form is
(lambda (a b) (if (a < b) b a) )
This code evaluates to a procedure that takes two arguments, returning the value of the first if it is the greater one, the value of the second otherwise. The effect is to create a new function called max that returns the greater of its two arguments. After max has been defined, the expression
(max 5 26)
evaluates to 26 .
Unlike Java, all the parentheses are important; you can't just add new ones wherever you want. For example, suppose you tried to evaluate
((max 5 26))
The evaluator would try to evaluate the first element of the list, (max 5 26) , which evaluates to 26 . Then it would try to apply 26 as a procedure. Since 26 is a number, not a procedure, an error would result.
A symbol may be rebound with the set! special form:
(set! x 10)
This special form rebinds x to 10 in the local binding environment. It should be used only on symbols that already have a binding, either from a lambda expression or from a define . For example,
(define sqr ; Define sqr to square its (lambda (x) ; argument (* x x))) (sqr 10) ; This results in 100 (set! sqr ; Define sqr to double its (lambda (y) ; argument (+ x x))) (sqr 10) ; This results in 20
The first time (sqr 10) is evaluated, the evaluator uses the original definition found in the define , which multiplies the argument times itself. The second time (sqr 10) is evaluated, the symbol sqr has been redefined to be the sum of the argument and itself, so it returns a different value.