Parsing an Expression


For use with a recursive-descent parser, think of expressions as recursive data structures—that is, expressions that are defined in terms of themselves. If, for the moment, we assume that expressions can use only +, , *, /, and parentheses, all expressions can be defined by the following rules:

 expression -> term [+ term] [ term] term -> factor [* factor] [/ factor] factor -> variable, number, or (expression)

The square brackets designate an optional element, and the -> means produces. In fact, these rules are usually called the production rules of the expression. Therefore, for the definition of term you could say: “Term produces factor times factor or factor divided by factor.” Notice that the precedence of the operators is implicit in the way an expression is defined.

The expression

 10 + 5 * B

has two terms: 10, and 5 * B. The second term contains two factors: 5 and B. These factors consist of one number and one variable.

On the other hand, the expression

 14 * (7  C)

has two factors: 14 and (7 C). The factors consist of one number and one parenthesized expression. The parenthesized expression contains two terms: one number and one variable.

This process forms the basis for a recursive-descent parser, which is a set of mutually recursive methods that work in a chain-like fashion and implement the production rules. At each appropriate step, the parser performs the specified operations in the algebraically correct sequence. To see how the production rules are used to parse an expression, let’s work through an example using the following expression:

 9/3  (100 + 56)

Here is the sequence that you will follow:

  1. Get the first term, 9/3.

  2. Get each factor and divide the integers. The resulting value is 3.

  3. Get the second term, (100 + 56). At this point, start recursively analyzing the second subexpression.

  4. Get each term and add. The resulting value is 156.

  5. Return from the recursive call and subtract 156 from 3. The answer is 153.

If you are a little confused at this point, don’t worry. This is a fairly complex concept that takes some getting used to. There are two basic things to remember about this recursive view of expressions. First, the precedence of the operators is implicit in the way the production rules are defined. Second, this method of parsing and evaluating expressions is very similar to the way humans evaluate mathematical expressions.

The remainder of this chapter develops two parsers. The first will parse and evaluate floating-point expressions of type double, which consist only of literal values. This parser illustrates the basics of the recursive-descent method of parsing. The second adds the ability to use variables.




C# 2.0(c) The Complete Reference
C# 2.0: The Complete Reference (Complete Reference Series)
ISBN: 0072262095
EAN: 2147483647
Year: 2006
Pages: 300

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net