10.3 Repetition Matching

	Building Parsers with Java By Steven John Metsker
	Table of Contents

	Chapter 10. Matching Mechanics

Content

The Repetition class earns its name through the way it implements the match() method. A repetition parser takes another parser and matches it repeatedly against each assembly in a set. The parser that a repetition repeats may be a simple terminal, such as the object new Word() . The parser that a repetition repeats may also be an arbitrarily complex composite parser. Figure 10.2 shows that the constructors of the Repetition class require the creator of a Repetition object to specify this subparser.

Figure 10.2. The `Repetition` class. A `Repetition` object represents a repetition of another parser, so the constructors of this class require this parser to be specified.

graphics/10fig02.gif

The idea of a Repetition object is that it produces many matches against a single assembly. The match() method's input parameter, however, is not a single assembly but rather a collection of assemblies in a vector. The responsibility of Repetition.match() is to produce zero or more matches against each assembly in the input vector, gathering all the matches into a final output state, which is another vector.

The Repetition class provides for a preassembler that it applies to every assembly in an input state before matching. Unlike Sequence and Alternation objects, a Repetition object matches every assembly in an input state at least once. If the subparser cannot match the assembly even once, it counts as zero matches and the match succeeds. Because every input state carries through to the output state, it is safe and occasionally useful to apply a preassembler. For example, the code in sjm.examples.pretty uses a preassembler to stack a fence object before matching a repetition and uses a regular assembler to pop down to this fence after matching.

In pseudocode, you can let s be a working state that contains the results of successive applications of the repetition's subparser to the input state. Here is the algorithm for a repetition match:

 // pseudocode  out = in; apply preassembler s = in; // a working state while (!s.isEmpty()) {     s = subparser.matchAndAssemble(s);     out = out + s; }

The actual code must take care that the initial value of out is a deep copy of the input state. If out were a simple clone of the input vector, it would contain the same objects as the input. Then if you applied an assembler to the out vector, you would modify the elements of the input set. Here is the code for Repetition.match() :

 public Vector match(Vector in) {      if (preAssembler != null) {         Enumeration e = in.elements();         while (e.hasMoreElements()) {             preAssembler.workOn((Assembly) e.nextElement());         }     }     Vector out = elementClone(in);     Vector s = in; // a working state     while (!s.isEmpty()) {         s = subparser.matchAndAssemble(s);         add(out, s);     }     return out; }

Note that the code uses matchAndAssemble() rather than match() . The general responsibility of any implementation of match() is for the parser to match each subparser and apply the subparser's assembler, if it has one.

Top

Figure 10.2. The Repetition class. A Repetition object represents a repetition of another parser, so the constructors of this class require this parser to be specified.

Figure 10.2. The `Repetition` class. A `Repetition` object represents a repetition of another parser, so the constructors of this class require this parser to be specified.