8.5 A Regular Expression Parser


 
Building Parsers with Java
By Steven John Metsker

Chapter 8. Parsing Regular Expressions


The class RegularParser holds a collection of methods that return the subparsers for a regular expression grammar. Figure 8.4 shows this class.

Figure 8.4. The RegularParser class. This class holds a collection of methods that compose into a parser for regular expressions.


To use this class, you can apply RegularParser.start() to match a character assembly. As with the ArithmeticParser class, the RegularParser class provides a value() method, which simplifies using the class. The following example uses the RegularParser class's value() method to match a variety of regular expressions:

    package sjm.examples.regular;

    import sjm.parse.*;
    import sjm.parse.chars.*;

    /**
     * Show how to use the <code>RegularParser</code> class.
     */
    public class ShowRegularParser {

        /*
         * Just a little help for main().
         */
        private static void showMatch(Parser p, String s) {
            System.out.print(p);
            Assembly a = p.completeMatch(new CharacterAssembly(s));
            if (a != null) {
                System.out.print(" matches ");
            } else {
                System.out.print(" does not match ");
            }
            System.out.println(s);
        }

        public static void main(String args[])
            throws RegularExpressionException {

            // a*
            Parser aStar = RegularParser.value("a*");
            showMatch(aStar, "");
            showMatch(aStar, "a");
            showMatch(aStar, "aa");
            showMatch(aStar, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");

            // (a|b)*
            Parser abStar = RegularParser.value("(a|b)*");
            showMatch(abStar, "aabbaabaabba");
            showMatch(abStar, "aabbaabaabbaZ");

            // a few other examples
            showMatch(RegularParser.value("a*a*"), "aaaa");
            showMatch(RegularParser.value("a|bc"), "bc");
            showMatch(RegularParser.value("a|bc|d"), "bc");

            // four letters
            Parser L = new Letter();
            Parser L4 =
                new Sequence("LLLL").add(L).add(L).add(L).add(L);
            showMatch(L4, "java");
            showMatch(L4, "joe");
            showMatch(new Repetition(L), "coffee");
        }
    }

The first example in the main() method is

    // a*
    Parser aStar = RegularParser.value("a*");
    showMatch(aStar, "");
    showMatch(aStar, "a");
    showMatch(aStar, "aa");
    showMatch(aStar, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa");

When this section of code runs, it prints the following:

    a* matches
    a* matches a
    a* matches aa
    a* matches aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

This demonstrates that the parser built from "a*" will match any number of a characters, including none. The next stretch of code is

    // (a|b)*
    Parser abStar = RegularParser.value("(a|b)*");
    showMatch(abStar, "aabbaabaabba");
    showMatch(abStar, "aabbaabaabbaZ");

When this code runs, it prints the following:

    <a|b>* matches aabbaabaabba
    <a|b>* does not match aabbaabaabbaZ

The parser abStar prints itself as <a|b>*. The angle brackets are part of how Alternation and Sequence objects represent themselves as strings: by showing their subparsers in angle brackets. The separator between alternative subparsers is a vertical bar, and the separator between sequence elements is an empty string. The output shows that a parser built from "(a|b)*" matches strings of a and b characters, but not a string that ends with a Z.
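The printing convention described above can be imitated in a few lines. The following is only a minimal sketch: the classes Node, Leaf, Group, and Star are hypothetical stand-ins invented for illustration, not the sjm.parse classes, and they do no matching at all. They show only how a vertical-bar separator for alternatives and an empty separator for sequence elements combine with the angle brackets.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; NOT the sjm.parse classes.
class PrintingSketch {

    abstract static class Node {
        abstract String show();
    }

    // A single character such as "a".
    static class Leaf extends Node {
        private final String text;
        Leaf(String text) { this.text = text; }
        String show() { return text; }
    }

    // A collection of subparsers shown in angle brackets,
    // joined by the given separator.
    static class Group extends Node {
        private final String separator;
        private final List<Node> children = new ArrayList<>();
        Group(String separator) { this.separator = separator; }
        Group add(Node n) { children.add(n); return this; }
        String show() {
            StringBuilder sb = new StringBuilder("<");
            for (int i = 0; i < children.size(); i++) {
                if (i > 0) sb.append(separator);
                sb.append(children.get(i).show());
            }
            return sb.append(">").toString();
        }
    }

    // A repetition, shown as its subparser followed by a star.
    static class Star extends Node {
        private final Node sub;
        Star(Node sub) { this.sub = sub; }
        String show() { return sub.show() + "*"; }
    }

    static Group alternation() { return new Group("|"); } // separator: vertical bar
    static Group sequence()    { return new Group("");  } // separator: empty string

    public static void main(String[] args) {
        Node a = new Leaf("a"), b = new Leaf("b"), c = new Leaf("c");
        System.out.println(new Star(alternation().add(a).add(b)).show());
        System.out.println(alternation().add(a).add(sequence().add(b).add(c)).show());
        System.out.println(sequence().add(new Star(a)).add(new Star(a)).show());
    }
}
```

Under these assumptions the three lines printed are <a|b>*, <a|<bc>>, and <a*a*>.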

The next few examples are

    showMatch(RegularParser.value("a*a*"), "aaaa");
    showMatch(RegularParser.value("a|bc"), "bc");
    showMatch(RegularParser.value("a|bc|d"), "bc");

These examples demonstrate operator precedence: sequencing binds more tightly than alternation, so "a|bc" means a or bc rather than (a|b)c. They print the following:

    <a*a*> matches aaaa
    <a|<bc>> matches bc
    <<a|<bc>>|d> matches bc
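As a cross-check, the same precedence can be observed with the standard java.util.regex package. This is a different engine from RegularParser, swapped in here only because it is readily available; the class name PrecedenceCheck and the helper fullMatch() are made up for this sketch. Pattern.matches() requires the whole input to match, which is comparable in spirit to completeMatch().

```java
import java.util.regex.Pattern;

// Illustration using java.util.regex, not the RegularParser class.
class PrecedenceCheck {

    // True only if the entire input matches the regular expression.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Sequence binds tighter than alternation: a|bc is a or bc.
        System.out.println(fullMatch("a|bc", "bc"));              // true
        System.out.println(fullMatch("a|bc", "ac"));              // false
        // Parentheses are needed to get (a or b) followed by c.
        System.out.println(fullMatch("(a|b)c", "ac"));            // true
        System.out.println(fullMatch("a|bc|d", "bc"));            // true
        System.out.println(fullMatch("a*a*", "aaaa"));            // true
        System.out.println(fullMatch("(a|b)*", "aabbaabaabbaZ")); // false
    }
}
```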

The last set of sample code is

    // four letters
    Parser L = new Letter();
    Parser L4 =
        new Sequence("LLLL").add(L).add(L).add(L).add(L);
    showMatch(L4, "java");
    showMatch(L4, "joe");
    showMatch(new Repetition(L), "coffee");

This code creates the parser L4 to match a sequence of four letters and names this parser "LLLL". When this code runs, it prints the following:

    LLLL matches java
    LLLL does not match joe
    L* matches coffee
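For comparison, the same three checks can be written by hand without any parser classes. The sketch below assumes that a Letter parser accepts exactly the characters for which Character.isLetter() returns true; that is an assumption about the library, and the class FourLetterSketch with its two helper methods is hypothetical.

```java
// Hand-written equivalent of the L4 and L* checks; not part of sjm.parse.
class FourLetterSketch {

    // Mirrors a repetition of letters: true when every character is a
    // letter, which includes the empty string.
    static boolean allLetters(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isLetter(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Mirrors a sequence of exactly four letters.
    static boolean fourLetters(String s) {
        return s.length() == 4 && allLetters(s);
    }

    public static void main(String[] args) {
        System.out.println(fourLetters("java"));   // true
        System.out.println(fourLetters("joe"));    // false: only three letters
        System.out.println(allLetters("coffee"));  // true
    }
}
```

The hand-written version works for this fixed pattern, but each new pattern needs new code, whereas the parser objects above are composed from reusable pieces.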

   
ISBN: 0201719622
Year: 2000
Pages: 169
