8.3. Syntax-Case Transformers

This section describes a more expressive mechanism for creating transformers, based on syntax-case, a generalized version of syntax-rules. This mechanism permits more complex transformations to be specified, including transformations that "bend" lexical scoping in a controlled manner, allowing a much broader class of syntactic extensions to be defined. Any transformer that may be defined using syntax-rules may be rewritten easily to use syntax-case instead; in fact, syntax-rules itself may be defined as a syntactic extension in terms of syntax-case, as demonstrated within the description of syntax below.

With this mechanism, transformers are procedures of one argument. The argument is a syntax object representing the form to be processed. The return value is a syntax object representing the output form. A syntax object contains contextual information about a form in addition to its structure. This contextual information is used by the expander to maintain lexical scoping.

A syntax object representing an identifier is itself referred to as an identifier; thus, the term identifier may refer either to the syntactic entity (symbol, variable, or keyword) or to the concrete representation of the syntactic entity as a syntax object. It is rarely necessary to distinguish the two uses.

Transformers destructure their input with syntax-case and rebuild their output with syntax. These two forms alone are sufficient for defining many syntactic extensions, including any that can be defined using syntax-rules. They are described below along with a set of additional forms and procedures that provide added functionality.

(`syntax-case` exp (literal ...) clause ...)	syntax
returns: see below

Each literal must be an identifier. Each clause must take one of the following two forms.

 (pattern output-expression) (pattern fender output-expression)

syntax-case patterns may be in any of the forms described in Section 8.2.

Syntax-case first evaluates exp, then attempts to match the resulting value against the pattern from the first clause. This value is usually a syntax object, but it may be any Scheme object. If the value matches the pattern and no fender is present, output-expression is evaluated and its value returned as the value of the syntax-case expression. If the value does not match the pattern, the value is compared against the next clause, and so on. An error is signaled if the value does not match any of the patterns.

If the optional fender is present, it serves as an additional constraint on acceptance of a clause. If the value of the syntax-case exp matches the pattern for a given clause, the corresponding fender is evaluated. If fender evaluates to a true value, the clause is accepted; otherwise, the clause is rejected as if the input had failed to match the pattern. Fenders are logically a part of the matching process, i.e., they specify additional matching constraints beyond the basic structure of an expression.

Pattern variables contained within a clause's pattern are bound to the corresponding pieces of the input value within the clause's fender (if present) and output-expression. Pattern variables occupy the same name space as program variables and keywords; pattern variable bindings created by syntax-case can shadow (and be shadowed by) program variable and keyword bindings as well as other pattern variable bindings. Pattern variables, however, can be referenced only within syntax expressions.

See the examples following the description of syntax.

`(syntax` template`)`	syntax
returns: see below

A syntax expression is like a quote expression except that the values of pattern variables appearing within template are inserted into template, and contextual information associated both with the input and with the template is retained in the output to support lexical scoping. List and vector structures within the template become true lists or vectors (suitable for direct application of list or vector operations, like map or vector-ref) to the extent that the list or vector structures must be copied to insert the values of pattern variables. A syntax template is identical to a syntax-rules template and is treated similarly.

The definition of or below is equivalent to the one given in Section 8.2 except that it employs syntax-case and syntax in place of syntax-rules.

 (define-syntax or   (lambda (x)     (syntax-case x ()       (( ) (syntax #f))       (( e) (syntax e))       (( e1 e2 e3 ...)        (syntax (let ((t e1)) (if t t (or e2 e3 ...))))))))

In this version, the lambda expression that produces the transformer is explicit, as are the syntax forms in the output part of each clause. Any syntax-rules form can be expressed with syntax-case by making the lambda expression and syntax expressions explicit. This observation leads to the following definition of syntax-rules in terms of syntax-case.

 (define-syntax syntax-rules   (lambda (x)     (syntax-case x ()       (( (i ...) ((keyword . pattern) template) ...)        (syntax (lambda (x)                  (syntax-case x (i ...)                    ((dummy . pattern) (syntax template))                    ...)))))))

The unreferenced pattern variable dummy is used in place of each keyword since the first position of each syntax-rules pattern is always ignored.

Since the lambda and syntax expressions are implicit in a syntax-rules form, definitions expressed with syntax-rules are often shorter than the equivalent definitions expressed with syntax-case. The choice of which to use when either suffices is a matter of taste, but many transformers that can be written easily with syntax-case cannot be written easily or at all with syntax-rules (see Section 8.4).

(`identifier?` obj )	procedure
returns: #t if obj is an identifier, #f otherwise

identifier? is often used within fenders to verify that certain subforms of an input form are identifiers, as in the definition of unnamed let below.

 (define-syntax let   (lambda (x)     (define ids?       (lambda (ls)         (or (null? ls)             (and (identifier? (car ls))                  (ids? (cdr ls))))))   (syntax-case x ()     (( ((i v) ...) e1 e2 ...)      (ids? (syntax (i ...)))      (syntax ((lambda (i ...) e1 e2 ...) v ...))))))

Syntactic extensions ordinarily take the form (keyword subform …), but the syntax-case system permits them to take the form of singleton identifiers as well. For example, the keyword pcar in the expression below may be used both as an identifier (in which case it expands into a call to car) or as a structured form (in which case it expands into a call to set-car!).

 (let ((p (cons 0 #f)))   (define-syntax pcar     (lambda (x)       (syntax-case x ()         ( (identifier? x) (syntax (car p)))         (( v) (syntax (set-car! p v))))))   (let ((a pcar))      (pcar 1)      (list a pcar))) ⇒ (0 1)

The fender (identifier? x) is used to recognize the singleton identifier case.

(`free-identifier=?` identifier₁ identifier₂)	procedure
(`bound-identifier=?` identifier₁ identifier₂)	procedure
returns: see below

Symbolic names alone do not distinguish identifiers unless the identifiers are to be used only as symbolic data. The predicates free-identifier=? and bound-identifier=? are used to compare identifiers according to their intended use as free references or bound identifiers in a given context.

free-identifier=? is used to determine whether two identifiers would be equivalent if they were to appear as free identifiers in the output of a transformer. Because identifier references are lexically scoped, this means that (free-identifier=? id₁ id₂) is true if and only if the identifiers id₁ and id₂ refer to the same lexical or top-level binding. (For this comparison, all variables are assumed to have top-level bindings, whether defined yet or not.) Literal identifiers (auxiliary keywords) appearing in syntax-case patterns (such as else in case and cond) are matched with free-identifier=?.

Similarly, bound-identifier=? is used to determine if two identifiers would be equivalent if they were to appear as bound identifiers in the output of a transformer. In other words, if bound-identifier=? returns true for two identifiers, a binding for one will capture references to the other within its scope. In general, two identifiers are bound-identifier=? only if both are present in the original program or both are introduced by the same transformer application (perhaps implicitly|see datum->syntax-object). bound-identifier=? can be used for detecting duplicate identifiers in a binding construct or for other preprocessing of a binding construct that requires detecting instances of the bound identifiers.

The definition below is equivalent to the earlier definition of a simplified version of cond with syntax-rules, except that else is recognized via an explicit call to free-identifier? within a fender rather than via inclusion in the literals list.

 (define-syntax cond   (lambda (x)     (syntax-case x ()       (( (e0 e1 e2 ...))        (and (identifier? (syntax e0))             (free-identifier=? (syntax e0) (syntax else)))        (syntax (begin e1 e2 ...)))       (( (e0 e1 e2 ...)) (syntax (if e0 (begin e1 e2 ...))))       (( (e0 e1 e2 ...) c1 c2 ...)        (syntax (if e0 (begin e1 e2 ...) (cond c1 c2 ...)))))))

With either definition of cond, else is not recognized as an auxiliary keyword if an enclosing lexical binding for else exists. For example,

 (let ((else #f))   (cond (else (write "oops"))))

does not write "oops", since else is bound lexically and is therefore not the same else that appears in the definition of cond.

The following definition of unnamed let uses bound-identifier=? to detect duplicate identifiers.

 (define-syntax let   (lambda (x)     (define ids?       (lambda (ls)         (or (null? ls)             (and (identifier? (car ls))                  (ids? (cdr ls))))))   (define unique-ids?     (lambda (ls)       (or (null? ls)           (and (let notmem? ((x (car ls)) (ls (cdr ls)))                  (or (null? ls)                      (and (not (bound-identifier=? x (car ls)))                           (notmem? x (cdr ls)))))                (unique-ids? (cdr ls))))))   (syntax-case x ()     (( ((i v) ...) e1 e2 ...)      (and (ids? (syntax (i ...)))           (unique-ids? (syntax (i ...))))      (syntax ((lambda (i ...) e1 e2 ...) v ...))))))

With the definition of let above, the expression

 (let ((a 3) (a 4)) (+ a a))

results in a syntax error, whereas

 (let-syntax ((dolet (lambda (x)                       (syntax-case x ()                         (( b)                          (syntax (let ((a 3) (b 4))                                    (+ a b))))))))  (dolet a))

evaluates to 7 since the identifier a introduced by dolet and the identifier a extracted from the input form are not bound-identifier=?. Since both occurrences of a, however, if left as free references, would refer to the same (top-level) binding for a, free-identifier=? would not distinguish them.

Two identifiers that are free-identifier=? may not be bound-identifier=?. An identifier introduced by a transformer may refer to the same enclosing binding as an identifier not introduced by the transformer, but an introduced binding for one will not capture references to the other. On the other hand, identifiers that are bound-identifier=? are free-identifier=?, as long as the identifiers have valid bindings in the context where they are compared.

(`with-syntax` ((pattern val) ...) exp₁ exp₂ ...)	syntax
returns: the value of the last exp_i

It is sometimes useful to construct a transformer's output in separate pieces, then put the pieces together. with-syntax facilitates this by allowing the creation of local pattern bindings.

pattern is identical in form to a syntax-case pattern. The value of each val is computed and destructured according to the corresponding pattern, and pattern variables within the pattern are bound as with syntax-case to appropriate portions of the value within exp₁ exp₂ ….

with-syntax may be defined as a syntactic extension in terms of syntax-case.

 (define-syntax with-syntax   (lambda (x)     (syntax-case x ()       (( ((p e0) ...) e1 e2 ...)        (syntax (syntax-case (list e0 ...) ()                  ((p ...) (begin e1 e2 ...))))))))

The following definitions of full cond and case demonstrate the use of with-syntax to support transformers that employ recursion internally to construct their output.

 (define-syntax cond   (lambda (x)     (syntax-case x ()       (( c1 c2 ...)        (let f ((c1 (syntax c1)) (cmore (syntax (c2 ...))))          (if (null? cmore)              (syntax-case c1 (else =>)                ((else e1 e2 ...) (syntax (begin e1 e2 ...)))                ((e0) (syntax (let ((t e0)) (if t t))))                ((e0 => e1) (syntax (let ((t e0)) (if t (e1 t)))))                ((e0 e1 e2 ...) (syntax (if e0 (begin e1 e2 ...)))))              (with-syntax ((rest (f (car cmore) (cdr cmore))))                (syntax-case c1 (=>)                  ((e0) (syntax (let ((t e0)) (if t t rest))))                  ((e0 => e1) (syntax (let ((t e0)) (if t (e1 t) rest))))                  ((e0 e1 e2 ...)                   (syntax (if e0 (begin e1 e2 ...) rest))))))))))) (define-syntax case   (lambda (x)     (syntax-case x ()       ((_ e c1 c2 ...)        (with-syntax ((body            (let f ((c1 (syntax c1)) (cmore (syntax (c2 ...))))             (if (null? cmore)                 (syntax-case c1 (else)                   ((else e1 e2 ...) (syntax (begin e1 e2 ...)))                   (((k ...) e1 e2 ...)                    (syntax (if (memv t '(k ...)) (begin e1 e2 ...)))))                 (with-syntax ((rest (f (car cmore) (cdr cmore))))                   (syntax-case c1 ()                     (((k ...) e1 e2 ...)                      (syntax (if (memv t '(k ...))                                  (begin e1 e2 ...)                                  rest)))))))))        (syntax (let ((t e)) body)))))))

`(syntax-object->datum` obj`)`	procedure
returns: obj stripped of syntactic information

The procedure syntax-object->datum strips all syntactic information from a syntax object and returns the corresponding Scheme "datum." Identifiers stripped in this manner are converted to their symbolic names, which can then be compared with eq?. Thus, a predicate symbolic-identifier=? might be defined as follows.

 (define symbolic-identifier=?   (lambda (x y)     (eq? (syntax-object->datum x)          (syntax-object->datum y))))

Two identifiers that are free-identifier=? are symbolic-identifier=?; in order to refer to the same binding, two identifiers must have the same name. The converse is not always true, since two identifiers may have the same name but different bindings.

`(datum->syntax-object` template-identifier obj`)`	procedure
returns: a syntax object

datum->syntax-object constructs a syntax object from obj that contains the same contextual information as template-identifier, with the effect that the syntax object behaves as if it were introduced into the code when template-identifier was introduced. The template identifier is often the keyword of an input form, extracted from the form, and the object is often a symbol naming an identifier to be constructed.

datum->syntax-object allows a transformer to "bend" lexical scoping rules by creating implicit identifiers that behave as if they were present in the input form, thus permitting the definition of syntactic extensions that introduce visible bindings for or references to identifiers that do not appear explicitly in the input form. For example, we can define a loop expression that binds the variable break to an escape procedure within the loop body.

 (define-syntax loop   (lambda (x)     (syntax-case x ()       ((k e ...)        (with-syntax ((break (datum->syntax-object (syntax k) 'break)))           (syntax (call-with-current-continuation                     (lambda (break)                       (let f () e ... (f)))))))))) (let ((n 3) (ls '()))   (loop     (if (= n 0) (break ls))     (set! ls (cons 'a ls))     (set! n (- n 1)))) ⇒ (a a a)

Were we to define loop as

 (define-syntax loop   (lambda (x)     (syntax-case x ()       (( e ...)        (syntax (call-with-current-continuation                  (lambda (break)                    (let f () e ... (f)))))))))

the variable break would not be visible in e ….

It is also useful for obj to represent an arbitrary Scheme form, as demonstrated by the following definition of include, an expand-time version of load.

 (define-syntax include   (lambda (x)     (define read-file       (lambda (fn k)         (let ((p (open-input-file fn)))           (let f ((x (read p)))             (if (eof-object? x)                 (begin (close-input-port p) '())                 (cons (datum->syntax-object k x)                       (f (read p))))))))     (syntax-case x ()        ((k filename)         (let ((fn (syntax-object->datum (syntax filename))))           (with-syntax (((exp ...) (read-file fn (syntax k))))             (syntax (begin exp ...))))))))

(include "filename") expands into a begin expression containing the forms found in the file named by "filename". For example, if the file f-def.ss contains the expression (define f (lambda () x)), the expression

 (let ((x "okay"))   (include "f-def.ss")   (f))

evaluates to "okay".

The definition of include uses datum->syntax-object to convert the objects read from the file into syntax objects in the proper lexical context, so that identifier references and definitions within those expressions are scoped where the include form appears.

`(generate-temporaries` list`)`	procedure
returns: a list of distinct generated identifiers

Transformers can introduce a fixed number of identifiers into their output by naming each identifier. In some cases, however, the number of identifiers to be introduced depends upon some characteristic of the input expression. A straightforward definition of letrec, for example, requires as many temporary identifiers as there are binding pairs in the input expression. The procedure generate-temporaries is used to construct lists of temporary identifiers.

list may be any list; its contents are not important. The number of temporaries generated is the number of elements in list. Each temporary is guaranteed to be different from all other identifiers.

A definition of letrec that uses generate-temporaries is shown below.

 (define-syntax letrec   (lambda (x)     (syntax-case x ()       (( ((i v) ...) e1 e2 ...)        (with-syntax (((t ...) (generate-temporaries (syntax (i ...)))))           (syntax (let ((i #f) ...)                     (let ((t v) ...)                       (set! i t) ...                       (let () e1 e2 ...)))))))))

Any transformer that uses generate-temporaries in this fashion can be rewritten to avoid using it, albeit with a loss of clarity. The trick is to use a recursively defined intermediate form that generates one temporary per expansion step and completes the expansion after enough temporaries have been generated. Here is a definition of let-values (see page 115) that uses this technique to support multiple sets of bindings.

 (define-syntax let-values   (syntax-rules ()     (( () f1 f2 ...) (let () f1 f2 ...))     (( ((fmls1 expr1) (fmls2 expr2) ...) f1 f2 ...)      (lvhelp fmls1 () () expr1 ((fmls2 expr2) ...) (f1 f2 ...))))) (define-syntax lvhelp   (syntax-rules ()     (( (x1 . fmls) (x ...) (t ...) e m b)      (lvhelp fmls (x ... x1) (t ... tmp) e m b))     (( () (x ...) (t ...) e m b)      (call-with-values        (lambda () e)        (lambda (t ...)          (let-values m (let ((x t) ...) . b)))))     (( xr (x ...) (t ...) e m b)      (call-with-values        (lambda () e)        (lambda (t ... . tmpr)          (let-values m (let ((x t) ... (xr tmpr)) . b)))))))

The implementation of lvhelp is complicated by the need to evaluate all of the right-hand-side expressions before creating any of the bindings and by the need to support improper formals lists.

A definition of letrec that does not use generate-temporaries is left as an exercise for the reader.