A Simple Expression Parser


Here is the first version of the parser. It can evaluate expressions that consist solely of literals, operators, and parentheses. Although GetToken( ) can process variables, the parser does nothing with them. Once you understand how this simplified parser works, we will add the ability to handle variables.

 /*    This module contains the recursive-descent    parser that does not use variables. */ using System; // Exception class for parser errors. class ParserException : ApplicationException {   public ParserException(string str) : base(str) { }   public override string ToString() {     return Message;   } } class Parser {   // Enumerate token types.   enum Types { NONE, DELIMITER, VARIABLE, NUMBER };   // Enumerate error types.   enum Errors { SYNTAX, UNBALPARENS, NOEXP, DIVBYZERO };   string exp;    // refers to expression string   int expIdx;    // current index into the expression   string token;  // holds current token   Types tokType; // holds token's type   // Parser entry point.   public double Evaluate(string expstr)   {     double result;     exp = expstr;     expIdx = 0;     try {       GetToken();       if(token == "") {         SyntaxErr(Errors.NOEXP); // no expression present         return 0.0;       }       EvalExp2(out result);       if(token != "") // last token must be null         SyntaxErr(Errors.SYNTAX);       return result;     } catch (ParserException exc) {       // Add other error handling here, as desired.       Console.WriteLine(exc);       return 0.0;     }   }   // Add or subtract two terms.   void EvalExp2(out double result)   {     string op;     double partialResult;     EvalExp3(out result);     while((op = token) == "+" || op == "-") {       GetToken();       EvalExp3(out partialResult);       switch(op) {         case "-":           result = result - partialResult;           break;         case "+":           result = result + partialResult;           break;       }     }   }   // Multiply or divide two factors.   void EvalExp3(out double result)   {     string op;     double partialResult = 0.0;     EvalExp4(out result);     while((op = token) == "*" ||            op == "/" || op == "%") {       GetToken();       EvalExp4(out partialResult);       switch(op) {         case "*":           result = result * partialResult;           break;         case "/":           if(partialResult == 0.0)             SyntaxErr(Errors.DIVBYZERO);           result = result / partialResult;           break;         case "%":           if(partialResult == 0.0)             SyntaxErr(Errors.DIVBYZERO);           result = (int) result % (int) partialResult;           break;       }     }   }   // Process an exponent.   void EvalExp4(out double result)   {     double partialResult, ex;     int t;     EvalExp5(out result);     if(token == "^") {       GetToken();       EvalExp4(out partialResult);       ex = result;       if(partialResult == 0.0) {         result = 1.0;         return;       }       for(t=(int)partialResult-1; t > 0; t--)         result = result * (double)ex;     }   }   // Evaluate a unary + or -.   void EvalExp5(out double result)   {     string op;     op = "";     if((tokType == Types.DELIMITER) &&         token == "+" || token == "-") {       op = token;       GetToken();     }     EvalExp6(out result);     if(op == "-") result = -result;   }   // Process a parenthesized expression.   void EvalExp6(out double result)   {     if((token == "(")) {       GetToken();       EvalExp2(out result);       if(token != ")")         SyntaxErr(Errors.UNBALPARENS);       GetToken();     }     else Atom(out result);   }   // Get the value of a number.   void Atom(out double result)   {     switch(tokType) {       case Types.NUMBER:         try {           result = Double.Parse(token);         } catch (FormatException) {           result = 0.0;           SyntaxErr(Errors.SYNTAX);         }         GetToken();         return;       default:         result = 0.0;         SyntaxErr(Errors.SYNTAX);         break;     }   }   // Handle a syntax error.   void SyntaxErr(Errors error)   {     string[] err = {       "Syntax Error",       "Unbalanced Parentheses",       "No Expression Present",       "Division by Zero"     };     throw new ParserException(err[(int)error]);   }   // Obtain the next token.   void GetToken()   {     tokType = Types.NONE;     token = "";     if(expIdx == exp.Length) return; // at end of expression     // skip over whitespace     while(expIdx < exp.Length &&           Char.IsWhiteSpace(exp[expIdx])) ++expIdx;     // trailing whitespace ends expression     if(expIdx == exp.Length) return;     if(IsDelim(exp[expIdx])) { // is operator       token += exp[expIdx];       expIdx++;       tokType = Types.DELIMITER;     }     else if(Char.IsLetter(exp[expIdx])) { // is variable       while(!IsDelim(exp[expIdx])) {         token += exp[expIdx];         expIdx++;         if(expIdx >= exp.Length) break;       }       tokType = Types.VARIABLE;     }     else if(Char.IsDigit(exp[expIdx])) { // is number       while(!IsDelim(exp[expIdx])) {         token += exp[expIdx];         expIdx++;         if(expIdx >= exp.Length) break;       }       tokType = Types.NUMBER;     }   }   // Return true if c is a delimiter.   bool IsDelim(char c)   {     if((" +-/*%^=()".IndexOf(c) != -1))       return true;     return false;   } }

The parser as it is shown can handle the following operators: +, , *, /, and %. In addition, it can handle integer exponentiation (^) and the unary minus. The parser can also deal with parentheses correctly.

To use the parser, first create an object of type Parser. Then, call Evaluate( ), passing the expression string that you want evaluated as an argument. Evaluate( ) returns the result. The following example demonstrates the parser:

 // Demonstrate the parser. using System; class ParserDemo {   public static void Main()   {     string expr;     Parser p = new Parser();     Console.WriteLine("Enter an empty expression to stop.");     for(;;) {       Console.Write("Enter expression: ");       expr = Console.ReadLine();       if(expr == "") break;       Console.WriteLine("Result: " + p.Evaluate(expr));     }   } }

Here is a sample run:

 Enter an empty expression to stop. Enter expression: 10-2*3 Result: 4 Enter expression: (10-2)*3 Result: 24 Enter expression: 10/3.5 Result: 2.85714285714286

Understanding the Parser

Let’s now take a detailed look at Parser. As mentioned earlier when GetToken( ) was discussed, Parser contains four private fields. The string containing the expression to be evaluated is referred to by exp. This field is set each time Evaluate( ) is called. It is important to remember that the parser evaluates expressions that are contained in standard C# strings. For example, the following strings contain expressions that the parser can evaluate:

 “10  5” “2 * 3.3 / (3.1416 * 3.3)”

The current index into exp is stored in expIdx. When parsing begins execution, expIdx is set to zero. expIdx is incremented as the parser moves through the expression. The token field holds the current token, and tokType contains the token type.

The entry point to the parser is through Evaluate( ), which must be called with a string containing the expression to be analyzed. The methods EvalExp2( ) through EvalExp6( ) along with Atom( ) form the recursive-descent parser. They implement an enhanced set of the expression production rules discussed earlier. The comment at the top of each method describes what function it performs. In the next version of the parser, a method called EvalExp1( ) will also be added.

SyntaxErr( ) handles syntax errors in the expression. The methods GetToken( ) and IsDelim( ) dissect the expression into its component parts, as described earlier. The parser uses GetToken( ) to obtain tokens from the expression, starting at the beginning of the expression and working to the end. Based on the type of token obtained, different actions are taken.

To understand exactly how the parser evaluates an expression, work through the following expression:

 10  3 * 2

When Evaluate( ), the entry point into the parser, is called, it gets the first token. If the token is a null string, the message No Expression Present is displayed and Evaluate( ) returns 0.0. However, in this case, the token contains the number 10. Next, EvalExp2( ) is called. EvalExp2( ) then calls EvalExp3( ), and EvalExp3( ) calls EvalExp4( ), which in turn calls EvalExp5( ). Then EvalExp5( ) determines if the token is a unary plus or minus; in this case, it is not, so EvalExp6( ) is called. At this point EvalExp6( ) either recursively calls EvalExp2( ) (in the case of a parenthesized expression) or Atom( ) to find the value of a number. Since the token is not a left parenthesis, Atom( ) is executed and result is assigned the value 10. Next, another token is retrieved, and the methods begin to return up the chain. Since the token is now the operator , the methods return up to EvalExp2( ).

What happens next is very important. Because the token is , it is saved in op. The parser then gets the next token, which is 3, and the descent down the chain begins again. As before, Atom( ) is entered. The value 3 is returned in result and the token * is read. This causes a return back up the chain to EvalExp3( ), where the final token 2 is read. At this point, the first arithmetic operation occurs—the multiplication of 2 and 3. The result is returned to EvalExp2( ) and the subtraction is performed. The subtraction yields the answer 4. You may want to work through some other examples in order to firmly understand how the parser functions.

If an error occurs while parsing, the SyntaxErr( ) method is called. This method throws a ParserException that describes the error. ParserException is a custom exception defined at the top of the parser file.

This parser would be suitable for use by a simple desktop calculator, as illustrated by the previous program. Before it could be used in a computer language, a database, or a sophisticated calculator, however, it would need the ability to handle variables. This is the subject of the next section.




C# 2.0(c) The Complete Reference
C# 2.0: The Complete Reference (Complete Reference Series)
ISBN: 0072262095
EAN: 2147483647
Year: 2006
Pages: 300

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net