6.4 Ensuring Proper Precedence

Building Parsers with Java
By Steven  John  Metsker

Table of Contents
Chapter  6.   Transforming a Grammar


Proper precedence means that some operators in a language should evaluate before others. In regular expression languages, for example, sequence usually has a higher precedence than alternation . In arithmetic, convention gives multiplication and division a higher precedence than addition and subtraction. In Java,

 System.out.println(11 + 5 * 2); 

prints "21" . Java recognizes that multiplication has higher precedence, so it computes 5 times 2 before adding this product to 11 . The idea that multiplication should proceed before addition is a convention and not a law of the universe. The designers of the Smalltalk programming language gave up traditional arithmetic precedence in order to preserve a simple and consistent messaging model. Smalltalk ignores arithmetic operator precedence, so

 11 + 5 * 2 

is equal to 32 and not 21 . As a language designer, you make the rules for your language, but you may confuse the users of your language if you ignore convention. To ensure that your parser will recognize some operators before others, you can declare this precedence in your grammar.

For example, consider an arithmetic subset that allows only the " + " operator and the " * " operator. Let us call this language Midimath. The challenge is to design a grammar that forces multiplication to come first. You can begin writing a grammar for this language by writing something like this:

 expression = term '+' term  term; 

The idea is that you can declare that there is some subparser " term " that recognizes, in particular, any text that contains an " * " sign. That is, you are declaring that you will recognize addition only after any multiplications are done. For example, expression can see "1 * 2 + 3 * 4" only as "term + term" . By declaring in the grammar that expression cannot comprehend an " * ", you are declaring that addition must go last.

One problem with this definition of expression is that the definition does not allow multiple plus signs. To fix that, you write the definition as

 expression = expression '+' term  term; 

This definition allows an unbounded number of plus signs and ensures that addition will associate to the left, but it introduces left recursion. You can repair that using the principles of the preceding section, resulting in:

 expression = term ('+' term)*; 

Now you need to provide a definition of term . A parser for term will match any mathematical expression that contains only numbers and the " * " operator. An initial definition for term is

 term = term '*' Num  Num; 

This definition matches the correct sublanguage and sets the associativity of " * " to the left, but it introduces left recursion. You can transform this definition into

 term = Num ('*' Num)*; 

So a complete grammar for Midimath is as follows :

 expression = term ('+' term)*;  term       = Num ('*' Num)*; 

This grammar is close to a complete language for arithmetic, which is explored in Chapter 7 "Parsing Arithmetic." Before developing a complete arithmetic parser, however, you must be able to handle grammar cycles.


Building Parsers with Java
Building Parsers With Javaв„ў
ISBN: 0201719622
EAN: 2147483647
Year: 2000
Pages: 169

Similar book on Amazon

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net