A top-down parser can be implemented by writing a set of recursive procedures to process the input. One procedure will take care of the left-most derivations for each nonterminal while processing the input. Each procedure should also provide for the storing of the input pointer in some local variable so that it can be reset properly when the parser backtracks. This implementation, called a "recursive descent parser," is a top-down parser for the above-described grammar that can be implemented by writing the following set of procedures:
S( ) { if (input =='a' ) { advance( ); if (A( ) != error) if (input =='b') { advance( ); if (input == endmarker) return(success); else return(error); } else return(error); } else return(error); } A( ) { if (input =='c') { advance( ); if (input == 'd') advance( ); } else return(error); } main( ) { Append the endmarker to the string w to be parsed; Set the input pointer to the left most token of w ; if ( S( ) != error) print f ("Successful completion of the parsing"); else printf ("Failure"); }
where advance() is a routine that, when called, advances the input's pointer to the next occurrence of the symbol w .
Caution | In a backtracking parser, the order in which alternatives are tried affects the language accepted by the parser. For example, in the above parser, if a production A ’ c is tried before A ’ cd , then the parser will fail to accept the string w = acdb , because it first expands S , as shown in Figure 4.4. |
The first input symbol matches the left-most leaf; and therefore, the parser will advance the pointer to c and consider the nonterminal A for expansion in order to obtain the tree shown in Figure 4.5.
The second input symbol also matches. Therefore, the parser will advance the pointer to d , the third input symbol, and consider the next leaf, labeled b in Figure 4.5. It finds that there is no match; and therefore, it will backtrack to S (as shown in Figure 4.5 by the thick arrow). But since there is no alternative to S that can be tried, the parser will return failure. Because the point of mismatch is the descendent of a node labeled by S , the parser will backtrack to S . It cannot backtrack to A . Therefore, the parser will not accept the string acdb . Whereas, if the parser tries the alternative A ’ cd first and A ’ c second, then the parser is capable of accepting the string acdb as well as acb because, for the string w = acb , when the parser encounters a mismatch, it is at a node labeled by d , which is a descendent of a node labeled by A . Hence, it will backtrack to A and try A ’ c , and end up in the parse tree for acb . Hence, we conclude that the order in which alternatives are tried in a backtracking parser affect the language accepted by the compiler or parser.
Consider a grammar S ’ aa aSa . If a top-down backtracking parser for this grammar tries S ’ aSa before S ’ aa , show that the parser succeeds on two occurrences of a and four occurrences of a , but not on six occurrences of a .
In the case of two occurrences of a , the parser will first expand S , as shown in Figure 4.6.
The first input symbol matches the left-most leaf. Therefore, the parser will advance the pointer to a second a and consider the nonterminal S for expansion in order to obtain the tree shown in Figure 4.7.
The second input symbol also matches. Therefore, the parser will consider the next leaf labeled S and expand it, as shown in Figure 4.8.
The parser now finds that there is no match. Therefore, it will backtrack to S , as shown by the thick arrow in Figure 4.9. The parser then continues matching and backtracking, as shown in Figures 4.10 through 4.15, until it arrives at the required parse tree, shown in Figure 4.16.
Now, consider a string of four occurrences of a . The parser will first expand S , as shown in Figure 4.17.
The first input symbol matches the left-most leaf. Therefore, the parser will advance the pointer to a second a and consider the nonterminal S for expansion, obtaining the tree shown in Figure 4.18.
The second input symbol also matches. Therefore, the parser will consider the next leaf labeled by S and expand it, as shown in Figure 4.19.
The third input symbol also matches. So, the parser moves on to the next leaf labeled by S and expands it, as shown in Figure 4.20.
The fourth input symbol also matches. Therefore, the next leaf labeled by S is considered . The parser expands it, as shown in Figure 4.21.
Now it finds that there is no match. Therefore, it will backtrack to S (Figure 4.22) and continue backtracking, as shown in Figures 4.23 through 4.30, until the parser finally arrives at the successful generation of a parse tree for aaaa in Figure 4.31.
Now consider a string of six occurrences of a . The parser will first expand S , as shown in Figure 4.32.
The first input symbol matches the left-most leaf. Therefore, the parser will advance the pointer to the second a and consider the nonterminal S for expansion. The tree shown in Figure 4.33 is obtained.
The second input symbol also matches. Therefore, the parser will consider next leaf labeled S and expand it, as shown in Figure 4.34.
The third input symbol also matches, as do the fourth through sixth symbols. In each case, the parser will consider next leaf labeled S and expand it, as shown in Figures 4.35 through 4.38.
Now the parser finds that there is no match. Therefore, it will backtrack to S , as shown by the thick arrow in Figure 4.39.
Since there is no alternate of S that can be tried, the parser will backtrack one more step, as shown in Figure 4.40. This procedure continues (Figures 4.41 through 4.47), until the parser tries the sixth alternate aa (Figure 4.48) and fails to find a match.