8.2. Syntax-Rules Transformers

The syntax-rules form described in this section permits simple transformers to be specified in a convenient manner. These transformers may be bound to keywords using the mechanisms described in Section 8.1. While it is much less expressive than the mechanism described in Section 8.3, it is sufficient for defining many common syntactic extensions.

(`syntax-rules` (literal ...) clause ...)	syntax
returns: a transformer

Each literal must be an identifier. Each clause takes the form:

 (pattern template)

Each pattern specifies one possible syntax that the input form might take, and the corresponding template specifies how the output should appear in each case.

Patterns consist of list structure, vector structure, identifiers, and constants. Each identifier within a pattern is either a literal, a pattern variable, or an ellipsis. The identifier … is an ellipsis. Any identifier other than … is a literal if it appears in the list of literals (literal …); otherwise, it is a pattern variable. Literals serve as auxiliary keywords, such as else in case and cond expressions. List and vector structure within a pattern specifies the basic structure required of the input, pattern variables specify arbitrary substructure, and literals and constants specify atomic pieces that must match exactly. Ellipses specify repeated occurrences of the subpatterns they follow.

An input form F matches a pattern P if and only if

P is a pattern variable,
P is a literal identifier and F is an identifier with the same binding (see free-identifier=? in Section 8.3),
P is of the form (P₁ … P_n) and F is a list of n elements that match P₁ through P_n,
P is of the form (P₁ … P_n. P_x) and F is a list or improper list of n or more elements whose first n elements match P₁ through P_n and whose nth cdr matches P_x,
P is of the form (P₁ … P_k P_e ellipsis P^m⁺¹ … P_n), where ellipsis is the identifier … and F is a proper list of n elements whose first k elements match P₁ through P_k, whose next m − k elements each match P_e, and whose remaining n − m elements match P_m⁺¹ through P_n,
P is of the form (P₁ … P_k P_e ellipsis P^m⁺¹ … P_n . P_x), where ellipsis is the identifier … and F is a list or improper list of n or more elements whose first k elements match P₁ through P_k, whose next m − k elements each match P_e, whose next n − m elements match P_m⁺¹ through P_n, and whose nth cdr matches P_x,
P is of the form #(P₁ … P_n) and F is a vector of n elements that match P₁ through P_n,
P is of the form #(P₁ … P_k P_e ellipsis P^m⁺¹ … P_n), where ellipsis is the identifier … and F is a vector of n or more elements whose first k elements match P₁ through P_k, whose next m − k elements each match P_e, and whose remaining n − m elements match P_m⁺¹ through P_n, or
P is a pattern datum (any nonlist, nonvector, nonsymbol object) and F is equal to P in the sense of the equal? procedure.

The outermost structure of a syntax-rules pattern must actually be in one of the list-structured forms above, although subpatterns of the pattern may be in any of the above forms. Furthermore, the first element of the outermost pattern is ignored, since it is always assumed to be the keyword naming the syntactic form. (These statements do not apply to syntax-case; see Section 8.3.)

If an input form passed to a syntax-rules transformer matches the pattern for a given clause, the clause is accepted and the form is transformed as specified by the associated template. As this transformation takes place, pattern variables appearing in the pattern are bound to the corresponding input subforms. Pattern variables appearing within a subpattern followed by one or more ellipses may be bound to a set or sets of zero or more input subforms.

A template is a pattern variable, an identifier that is not a pattern variable, a pattern datum, a list of subtemplates (S₁ … S_n), an improper list of subtemplates (S₁ S₂ … S_n . T ), or a vector of subtemplates #(S₁ … S_n). Each subtemplate S_i is either a template or a template followed by one or more ellipses. The final element T of an improper subtemplate list is a template.

Pattern variables appearing within a template are replaced in the output by the input subforms to which they are bound. Pattern data and identifiers that are not pattern variables are inserted directly into the output. List and vector structure within the template remains list and vector structure in the output. A subtemplate followed by an ellipsis expands into zero or more occurrences of the subtemplate. The subtemplate must contain at least one pattern variable from a subpattern followed by an ellipsis. (Otherwise, the expander could not determine how many times the subform should be repeated in the output.) Pattern variables that occur in subpatterns followed by one or more ellipses may occur only in subtemplates that are followed by (at least) as many ellipses. These pattern variables are replaced in the output by the input subforms to which they are bound, distributed as specified. If a pattern variable is followed by more ellipses in the template than in the associated pattern, the input form is replicated as necessary.

A template of the form (… template) is identical to template, except that ellipses within the template have no special meaning. That is, any ellipses contained within template are treated as ordinary identifiers. In particular, the template (… …) produces a single ellipsis, …. This allows syntactic extensions to expand into forms containing ellipses.

The definition of or below demonstrates the use of syntax-rules.

 (define-syntax or   (syntax-rules ()     (( ) #f)     (( e) e)     (( e1 e2 e3 ...)      (let ((t e1)) (if t t (or e2 e3 ...))))))

The input patterns specify that the input must consist of the keyword and zero or more subexpressions. An underscore (−), which is an ordinary pattern variable, is used by convention for the keyword position to remind the programmer and anyone reading the definition that the keyword position never fails to contain the expected keyword and need not be matched. (In fact, as mentioned above, syntax-rules ignores what appears in the keyword position.) If more than one subexpression is present (third clause), the expanded code both tests the value of the first subexpression and returns the value if it is not false. To avoid evaluating the expression twice, the transformer introduces a binding for the temporary variable t.

The expansion algorithm maintains lexical scoping automatically by renaming local identifiers as necessary. Thus, the binding for t introduced by the transformer is visible only within code introduced by the transformer and not within subforms of the input. Similarly, the references to the identifiers let and if are unaffected by any bindings present in the context of the input.

 (let ((if #f))   (let ((t 'okay))     (or if t))) ⇒ okay

This expression is transformed during expansion to the equivalent of the expression below.

 ((lambda (if1)     ((lambda (t1)         ((lambda (t2)            (if t2 t2 t1))          if1))       'okay)) #f) ⇒ okay

In this sample expansion, if1, t1, and t2 represent identifiers to which if and t in the original expression and t in the expansion of or have been renamed.

The definition of a simplified version of cond below (simplified because it requires at least one output expression per clause and does not support the auxiliary keyword =>) demonstrates how auxiliary keywords such as else are recognized in the input to a transformer, via inclusion in the list of literals.

 (define-syntax cond   (syntax-rules (else)     (( (else e1 e2 ...)) (begin e1 e2 ...))     (( (e0 e1 e2 ...)) (if e0 (begin e1 e2 ...)))     (( (e0 e1 e2 ...) c1 c2 ...)      (if e0 (begin e1 e2 ...) (cond c1 c2 ...)))))