In Figure 23.3, we showed the phases involved in taking an input file, breaking it down into tokens (via the lexical analyzer), and then passing these tokens to the parser to generate a parse tree. Let's now amend this figure to illustrate what we've done here (see Figure 23.4).
As we've shown in this chapter, the lexical analyzer is generated by the flex utility from an input file containing the regular expressions that recognize the tokens of the grammar. The parser is generated by the bison utility, again from a grammar file. The two phases are built together into a single image, with the connection between them defined partly by the flex and bison tools and partly by the developer.
The flow between the lexer and parser is partly provided internally but is visible in the grammar definitions (see Figure 23.5). Our main function (provided in the bison grammar file) calls yyparse to perform the parse. This in turn calls yylex to retrieve each token as it's extracted from the input stream. The yylex function returns the type of the token found, with any other data needed by the parser returned in other variables (such as yylval).
We've seen a few of the internal functions and variables provided in the scanner and parser. Table 23.1 provides a list of some of the others that you may encounter in your use of flex and bison.
Name | Type | Description |
---|---|---|
yyparse | Function | Parser function (called by main) |
yyerror | Function | Error function (can be provided by the user) |
yylex | Function | Scanner function (returns tokens, used by yyparse) |
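yyterminate | Macro | Ends scanning; yylex returns 0, which the parser treats as end of input |
yylval | char* / union | Value of the current token |
yytext | char* | Text of the most recently matched token |
yydebug | int | Set to a nonzero value to enable parser tracing (the parser must be built with debugging support) |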
Designing and specifying parsers (and lexers) can be a difficult task, but specifying how the grammar works is necessary even if the parser is to be written by hand. Once the specification is done, generating the parser with bison is trivial compared to writing one by hand, which makes flex and bison very useful tools in our development toolbox.