| < Free Open Study > |
The
The motivation for let-polymorphism arises from examples like the following. Suppose we define and use a simple function double , which applies its first argument twice in succession to its second:
let double = λ:Nat
Nat. λa:Nat. f(f(a)) in double (λx:Nat. succ (succ x)) 2;
Because we want to apply double to a function of type Nat Nat , we choose type annotations that give it type ( Nat Nat) (Nat Nat) . We can alternatively define double so that it can be used to double a boolean function:
let double = λf:Bool
Bool. λa:Bool. f(f(a)) in double (λx:Bool. x) false;
What we
cannot
do is use the same
double
function with both booleans and
let doubleNat = λf:Nat Nat. λa:Nat. f(f(a)) in let doubleBool = λf:Bool Bool. λa:Bool. f(f(a)) in let a = doubleNat (λx:Nat. succ (succ x)) 1 in let b = doubleBool (λx:Bool. x) false in ...
Even annotating the abstractions in double with a type variable
let double = λf:X
X. λa:X. f(f(a)) in ...
does not help. For example, if we write
let double = λf:X
X. λa:X. f(f(a)) in let a = double (λx:Nat. succ (succ x)) 1 in let b = double (λx:Bool. x) false in ...
then the use of double in the definition of a generates the constraint X X = Nat Nat , while the use of double in the definition of b generates the constraint X X = Bool Bool . These constraints place unsatisfiable demands on X , making the whole program untypable.
What went wrong here? The variable
X
plays two distinct roles in the example. First, it captures the constraint that the first argument to
double
in the calculation of a must be a function whose domain and range types are the same as the type (
Nat
) of the other argument to
double
. Second, it captures the constraint that the arguments to
double
in the calculation of
b
must be similarly
What we'd like is to break this last connection - i.e., to associate a different variable X with each use of double . Fortunately, this is easily accomplished. The first step is to change the ordinary typing rule for let so that, instead of calculating a type for the right-hand side t 1 and then using this as the type of the bound variable x while calculating a type for the body t 2 ,
it instead substitutes t 1 for x in the body, and then typechecks this expanded expression:
We write a constraint-typing rule for let in a similar way:
In essence, what we've done is to change the typing rules for let so that they perform a step of evaluation
before calculating types.
The second step is to rewrite the definition of double using the implicitly annotated lambda-abstractions from §22.6.
let double = λf. λa. f(f(a)) in let a = double (λx:Nat. succ (succ x)) 1 in let b = double (λx:Bool. x) false in ...
The combination of the constraint typing rules for
let
(CT-LETPOLY) and the implicitly annotated lambda-abstraction (CT-ABSINF) gives us exactly what we need: CT-LETPOLY makes two copies of the definition of
double
, and CT-ABSINF
However, this scheme has some flaws that need to be addressed before we can use it in practice. One obvious one is that, if we don't happen to actually use the let -bound variable in the body of the let , then the definition will never actually be typechecked. For example, a program like
let x = <utter garbage> in 5
will pass the typechecker. This can be repaired by adding a
and a corresponding premise to CT-LETPOLY, ensuring that t 1 is well typed.
A related problem is that, if the body of the
let
contains
many
occurrences of the
let
-bound variable, then the whole right-hand side of the
let
-definition will be checked once per occurrence, whether or not it contains any implicitly annotated lambda-abstractions. Since the right-hand side itself can contain
let
-bindings, this typing rule can cause the typechecker to perform an amount of work that is exponential in the
To avoid this re-typechecking, practical
We use the constraint typing rules to calculate a type S 1 and a set C 1 of associated constraints for the right-hand side t 1 .
We use unification to find a most general solution σ to the constraints C 1 and apply σ to S 1 (and) to obtain t 1 s principal type T 1 .
We
generalize
any
One caveat is here that we need to be careful
not
to generalize variables
T
1
that are also mentioned in , since these
λf:X
X. λx:X. let g=f in g(x);
we should not generalize the variable X in the type X X of g , since doing so would allow us to type wrong programs like this one:
(λf:X
X. λx:X. let g=f in g(0)) (λx:Bool. if x then true else false) true;
We extend the context to record the type scheme
"
X
1
...X
n
.T
1
for the bound variable
x
, and start typechecking the body
t
2
. In general, the context now
Each time we encounter an occurrence of x in t 2 , we look up its type scheme " X 1 ...X n .T 1 . We now generate fresh type variables Y 1 ...Y n and use them to instantiate the type scheme, yielding [ X 1 ↦ Y 1 , ..., X n ↦ Y n ] T 1 , which we use as the type of x . [5]
This algorithm is much more efficient than the simplistic approach of substituting away
let
expressions before typechecking. Indeed, decades of experience have shown that in practice it appears "
let f = fun x (x,x) in let f 1 = fun y f (f y) in let f 2 = fun y f 1 (f 1 y) in let f 3 = fun y f 2 (f 2 y) in let f 4 = fun y f 3 (f 3 y) in let f 5 = fun y f 4 (f 4 y) in f 5 (fun z z)
To see why, try entering f , f 1 , etc., one at a time, into the OCaml top-level. See Kfoury, Tiuryn, and Urzyczyn (1994) for further discussion.
A final point worth mentioning is that, in designing full-blown programming languages with let-polymorphism, we need to be a bit careful of the interaction of polymorphism and side-effecting features such as mutable storage
let r = ref (λx. x) in (r:=(λx:Nat. succ x); (!r)true);
Using the algorithm sketched above, we calculate Ref(X X) as the principal type of the right-hand side of the let ; since X appears nowhere else, this type can be generalized to " X.Ref(X X) , and we assign this type scheme to r when we add it to the context. When typechecking the assignment in the second line, we instantiate this type to Ref(Nat Nat) . When typechecking the third line, we instantiate it to Ref(Bool Bool) . But this is unsound, since when the term is evaluated it will end up applying succ to true .
The problem here is that the typing rules have gotten out of sync with the evaluation rules. The typing rules introduced in this section tell us that, when we see a let expression, we should immediately substitute the right-hand side into the body. But the evaluation rules tell us that we may perform this substitution only after the right-hand side has been reduced to a value. The typing rules see two uses of the ref constructor, and analyze them under different assumptions, but at run time only one ref is actually allocated.
We can correct this mismatch in two ways - by adjusting evaluation or typing. In the former case, the evaluation rule for let would become [6]
Under this strategy, the first step in evaluating our dangerous example from above would replace r by its definition, yielding
(ref (λx. x)) := (λx:Nat. succ x) in (!(ref (λx. x))) true;
which is
It is better to change the typing rule to match the evaluation rule. Fortunately, this is easy: we just impose the restriction (often called the value restriction ) that a let -binding can be treated polymorphically - i.e., its free typevariables can be generalized - only if its right-hand side is a syntactic value. This means that, in the dangerous example, the type assigned to r when we add it to the context will be X X , not " X.X X . The constraints imposed by the second line will force X to be Nat , and this will cause the typechecking of the third line to fail, since Nat cannot be unified with Bool .
The value restriction
Implement the algorithm sketched in this section.
[5]
The difference between a lambda-abstraction that is explicitly annotated with a type variable and an un-annotated abstraction for which the constraint generation algorithm creates a variable becomes moot once we introduce generalization and instantiation. Either way, the right-hand side of a
let
is assigned a type involving a variable, which is generalized before being added to the context and
[6] Strictly speaking, we should annotate this rule with a store , as we did in Chapter 13, since we are talking about a language with references:
| < Free Open Study > |