2.7 Assemblers

Building Parsers with Java
By Steven John Metsker

Chapter 2. The Elements of a Parser


To this point, this book has emphasized recognizing the strings that make up a language. In practice, you usually want to react to this recognition by doing something. What you do gives meaning, or semantics, to your parser. The range of things you might want to do is unbounded, but a common task is to create the object that a string describes. For example, consider a file that contains the following textual description of types of coffee:

 Brimful, Regular, Kenya, 6.95
 Caress (Smackin), French, Sumatra, 7.95
 Fragrant Delicto, Regular/French, Peru, 9.95
 Havalavajava, Regular, Hawaii, 11.95
 Launch Mi, French, Kenya, 6.95
 Roman Spur (Revit), Italian, Guatemala, 7.95
 Simplicity House, Regular/French, Columbia, 5.95

You can use a parser to verify that each line of this file is an element of a coffee description language. In addition to this verification, in practice you often want to produce objects from such textual data. In this case, you would want your coffee parser to create a collection of coffee objects so that the meaning of the input is a corresponding set of objects. Assemblers make this meaning possible.

2.7.1 Parsers Use Assemblers

Any Parser object can have an associated assembler. When a parser is a composite, it can have its own assembler, and any of its subparsers can have their own assemblers, all the way down to the terminals. Figure 2.12 shows the relationship of parsers and assemblers.

Figure 2.12. The Parser / Assembler relation. A parser can have an assembler, which it uses to work on an assembly after the parser matches against the assembly.


Figure 2.13 shows the Assembler class. After a parser matches successfully against an assembly, it calls its assembler's workOn() method.

Figure 2.13. The Assembler class. The Assembler class is abstract, requiring subclasses to implement the workOn() method.
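
The contract described above can be sketched in a few lines. The following is a minimal, self-contained illustration and not the sjm.parse source: the Assembly here is only a stack, and the Literal parser, its match() method, and the demo() helper are hypothetical stand-ins introduced for this example. It shows an abstract Assembler whose workOn() method a parser calls only after a successful match.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class AssemblerSketch {

    /** Stand-in assembly: only a work stack (hypothetical, simplified). */
    static class Assembly {
        private final Deque<Object> stack = new ArrayDeque<>();
        void push(Object o) { stack.push(o); }
        Object pop() { return stack.pop(); }
        Object peek() { return stack.peek(); }  // null when the stack is empty
    }

    /** Abstract, as in Figure 2.13: subclasses must implement workOn(). */
    static abstract class Assembler {
        abstract void workOn(Assembly a);
    }

    /** Hypothetical terminal that matches exactly one literal word. */
    static class Literal {
        private final String text;
        private Assembler assembler;

        Literal(String text) { this.text = text; }

        Literal setAssembler(Assembler assembler) {
            this.assembler = assembler;
            return this;
        }

        boolean match(String input, Assembly a) {
            if (!text.equals(input)) {
                return false;               // no match, so no assembler call
            }
            a.push(input);                  // the terminal stacks what it matched
            if (assembler != null) {
                assembler.workOn(a);        // the assembler runs after the match
            }
            return true;
        }
    }

    /** Matches input against the literal "kenya" and returns the stack top. */
    public static Object demo(String input) {
        Assembly a = new Assembly();
        Literal p = new Literal("kenya").setAssembler(new Assembler() {
            void workOn(Assembly asm) {
                String s = (String) asm.pop();
                asm.push(s.toUpperCase());  // replace the match with a result
            }
        });
        p.match(input, a);
        return a.peek();
    }

    public static void main(String[] args) {
        System.out.println(demo("kenya"));  // KENYA
        System.out.println(demo("peru"));   // null
    }
}
```

In this sketch the assembler never runs for "peru", because the parser fails to match and so never reaches the workOn() call.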


2.7.2 Assemblers Work On Assemblies

When an assembler's workOn() method is called, the assembler knows that its parser has just completed a match. For example, a parser that recognizes the preceding coffee text will include a parser to match only the country portion of a coffee description as part of building a target Coffee object. The terminal that matches "Sumatra" in the middle of

 Caress (Smackin), French, Sumatra, 7.95 

will place "Sumatra" on the assembly's stack, something that terminals do by default. After this terminal matches, it will ask its assembler to go to work. Its assembler might look like this:

 package sjm.examples.coffee;

 import sjm.parse.*;
 import sjm.parse.tokens.*;

 /**
  * This assembler pops a string and sets the target
  * coffee's country to this string.
  */
 public class CountryAssembler extends Assembler {
     public void workOn(Assembly a) {
         Token t = (Token) a.pop();
         Coffee c = (Coffee) a.getTarget();
         c.setCountry(t.sval().trim());
     }
 }

This class assumes that a parser will call this class's workOn() method in the appropriate context. This includes the assumption that the top of the assembly's stack is, in fact, a token containing the name of a country. The assembler also assumes that the assembly's target object is a Coffee object. In practice, these assumptions are safe, because the parser that calls this assembler knows with certainty that it has just matched a country name as part of an overall task of recognizing a type of coffee.
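
To make these assumptions concrete, here is a self-contained sketch that runs the same three steps against stand-in classes. Nothing here comes from the sjm packages: the Coffee and Assembly classes are simplified, hypothetical versions, a plain String takes the place of the Token, and countryAfterMatch() is a helper invented for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class CountryAssemblerSketch {

    /** Hypothetical target object; only the country field is modeled. */
    static class Coffee {
        private String country;
        void setCountry(String country) { this.country = country; }
        String getCountry() { return country; }
    }

    /** Stand-in assembly: a work stack plus a target object. */
    static class Assembly {
        private final Deque<Object> stack = new ArrayDeque<>();
        private final Object target;

        Assembly(Object target) { this.target = target; }

        void push(Object o) { stack.push(o); }
        Object pop() { return stack.pop(); }
        Object getTarget() { return target; }
    }

    /** Same steps as CountryAssembler, with a String in place of a Token. */
    static void workOn(Assembly a) {
        String s = (String) a.pop();        // assumes a country name is on top
        Coffee c = (Coffee) a.getTarget();  // assumes the target is a Coffee
        c.setCountry(s.trim());
    }

    /** Simulates the country terminal matching, then its assembler working. */
    public static String countryAfterMatch(String matched) {
        Assembly a = new Assembly(new Coffee());
        a.push(matched);                    // what the terminal would stack
        workOn(a);
        return ((Coffee) a.getTarget()).getCountry();
    }

    public static void main(String[] args) {
        System.out.println(countryAfterMatch(" Sumatra "));  // Sumatra
    }
}
```

If either assumption were violated in this sketch, for example by pushing an Integer or by supplying a target that is not a Coffee, the corresponding cast would throw a ClassCastException. That is why the calling context matters.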

2.7.3 Elements Above

The Assembler class includes the static method elementsAbove(), which lets you design a parser that stacks a fence and later retrieves all the elements above the fence. A fence is any kind of marker object. For example, suppose a list parser is matching this list:

 "{Washington Adams Jefferson}" 

The list parser might allow the opening curly brace to go on an assembly's stack while discarding the closing curly brace. This lets the opening brace token act as a fence. After matching the elements in the list, the parser can retrieve them with an assembler that removes all the elements on the stack above the '{' token. The parser's assembler can use the elementsAbove() method, passing it the assembly and the opening brace as the fence to look for. For example:

 package sjm.examples.introduction;

 import sjm.parse.*;
 import sjm.parse.tokens.*;

 /**
  * Show how to use <code>Assembler.elementsAbove()</code>.
  */
 public class ShowElementsAbove {
     public static void main(String args[]) {
         Parser list = new Sequence()
             .add(new Symbol('{'))
             .add(new Repetition(new Word()))
             .add(new Symbol('}').discard());

         list.setAssembler(new Assembler() {
             public void workOn(Assembly a) {
                 Token fence = new Token('{');
                 System.out.println(elementsAbove(a, fence));
             }
         });

         list.bestMatch(
             new TokenAssembly("{Washington Adams Jefferson }"));
     }
 }

This class's main() method creates a list parser that implements this pattern:

 list = '{' Word* '}'; 

In the code, the list parser allows the opening brace to go onto an assembly's stack, and discards the closing brace. The main() method gives list an assembler by using an anonymous subclass of Assembler that retrieves the elements that match after the opening brace.

Note that the opening brace terminal stacks a token and not a brace character or a string. Thus, the fence the assembler looks for is a '{' token. In this example the assembler retrieves the elements above the opening brace and prints them. Running this class prints the vector

 [Jefferson, Adams, Washington] 
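
The reversed order follows from popping a stack. The sketch below shows one plausible way such a method could work; it is an assumption for illustration, not the library's source. It uses a java.util.Deque in place of an Assembly and plain strings in place of tokens, and it consumes the fence itself, which the text above does not confirm for the real method.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class ElementsAboveSketch {

    /**
     * Pops elements until the fence is found (or the stack is empty) and
     * returns them in the order popped, that is, most recent first.
     */
    public static List<Object> elementsAbove(Deque<Object> stack, Object fence) {
        List<Object> items = new ArrayList<>();
        while (!stack.isEmpty()) {
            Object top = stack.pop();
            if (top.equals(fence)) {
                break;                  // this sketch discards the fence
            }
            items.add(top);
        }
        return items;
    }

    /** Stacks a fence and three words, as the list parser would. */
    public static List<Object> demo() {
        Deque<Object> stack = new ArrayDeque<>();
        stack.push("{");                // the fence
        stack.push("Washington");
        stack.push("Adams");
        stack.push("Jefferson");
        return elementsAbove(stack, "{");
    }

    public static void main(String[] args) {
        System.out.println(demo());     // [Jefferson, Adams, Washington]
    }
}
```

Because the fence is compared with equals(), the fence object passed in must be equal to the stacked marker. In the book's example this is why the assembler builds a '{' token instead of passing a character or a string.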

The Logikus parser in Chapter 14, "Parsing a Logic Language," uses the elementsAbove() method to collect the elements of a Logikus list. The Sling parser in Chapter 16, "Parsing an Imperative Language," uses elementsAbove() to collect the statements in the body of a for loop.


ISBN: 0201719622
Year: 2000
Pages: 169