The building blocks of Logikus programs are structures and variables . A typical structure has a string as its functor and has terms enclosed in parentheses. For now, let's sketch a grammar that will recognize a structure such as starred(jamesCagney, "Mister Roberts", Year) The following grammar gives a first draft for recognizing Logikus structures and variables: structure = functor ('(' commaList(term) ')' Empty); functor = LowercaseWord QuotedString; term = structure variable; variable = UppercaseWord; commaList(p) = p (',' p)*; Note that the functor of a structure cannot be an uppercase word. Logikus shares this feature with Prolog; users need not and cannot declare variables. A string that begins with an uppercase letter is a variable, and a string that begins with a lowercase letter is a functor. A Logikus program is a series of axioms terminated by semicolons. For example, Logikus must recognize the following three axioms: customer("Jasmine Jones", 2093); order(2093, "Brimful", 2); q (Name, Type) :- customer(Name, CustomerNum), order(CustomerNum, Type, Pounds); Here is an initial draft of a grammar for Logikus axioms: axiom = structure (ruleDef Empty); ruleDef = ":-" commaList(structure); The grammar so far defines a fairly comprehensive logic language, but it does not yet provide for all the features of Logikus. In particular, you need to broaden the concept of the composition of a rule. Rules in Logikus are series of conditions, and a normal structure is only one type of condition. Other conditions you wish to allow are lists, comparisons, negations, and evaluations. For example, Logikus must recognize this rule: travel(AltBand, Name) :- for (AltBand, 1000, 20000), city(Name, Alt), >(Alt, AltBand - 1000), <= (Alt, AltBand); Logikus allows any combination of arithmetic operators and parentheses in its expressions. To capture this in the grammar, you can extend the definition of axioms as follows : axiom = structure (ruleDef Empty); structure = functor ('(' commaList(term) ')' Empty); functor = LowercaseWord QuotedString; term = structure variable; variable = UppercaseWord; ruleDef = ":-" commaList(condition); condition = structure evaluation comparison; evaluation = '#' '(' arg ',' arg ')'; comparison = operator '(' arg ',' arg ')'; arg = expression functor; operator = '<' '>' '=' "<=" ">=" "!=" ; expression = phrase ('+' phrase '-' phrase)*; phrase = factor ('*' factor '/' factor)*; factor = '(' expression ')' Num variable; commaList(p) = p (',' p)*; The evaluation rule recognizes any string of the form #(arg, arg) , such as #(LowerPlus, Lower + 1000) . The expressions allowed by evaluations and comparisons draw on the grammar developed in Chapter 7, "Parsing Arithmetic." One difference is that Logikus evaluations and comparisons work with either strings or numbers . For example, >( tyrannosaurus , triceratops ) is a valid Logikus comparison. Another valid condition is the not condition. You can extend the grammar with condition = structure evaluation comparison not; not = "not" structure; Logikus also supports lists. A list is a pair of brackets that may or may not have contents. The nonempty contents of a list are a series of terms separated by commas. A list may follow these terms with a vertical bar and a tail, which must be either another list or a variable. The following grammar rules succinctly state these principles: list = '[' (listContents Empty) ']'; listContents = commaList(term) listTail; listTail = ('' (variable list)) Empty; These rules refer to term , which we said earlier was either a structure or variable. You need to widen this to term = structure Num list variable; You also need to allow anonymous variables, and that means variable = UppercaseWord '_'; Finally, you need to allow a single dot as a valid functor for a structure. This specifically allows the manual construction of lists, such as test(.(A, .(B, .(C, [])))) To allow the dot functor, you extend the functor rule functor = '.' LowercaseWord QuotedString; Here is the complete grammar for a Logikus axiom: axiom = structure (ruleDef Empty); structure = functor ('(' commaList(term) ')' Empty); functor = '.' LowercaseWord QuotedString; term = structure Num list variable; variable = UppercaseWord '_'; ruleDef = ":-" commaList(condition); condition = structure not evaluation comparison list; not = "not" structure; evaluation = '#' '(' arg ',' arg ')'; comparison = operator '(' arg ',' arg ')'; arg = expression functor; operator = '<' '>' '=' "<=" ">=" "!="; expression = phrase ('+' phrase '-' phrase)*; phrase = factor ('*' factor '/' factor)*; factor = '(' expression ')' Num variable; list = '[' (listContents Empty) ']'; listContents = commaList(term) listTail; listTail = ('' (variable list)) Empty; commaList(p) = p (',' p)*; 14.2.1 Comments in Logikus The preceding grammar recognizes Logikus axioms, but it does not allow for comments. In fact, you want to define the Logikus language to allow comments both within and between axioms. You can keep the grammar simple if you allow for an understanding that you will, in practice, rely on a tokenizer to screen out comments. The implementation of Logikus in sjm.examples.logic uses a default Tokenizer object from sjm.parse.tokens . This class by default allows // comments, which comment out characters to the end of a line, and /**/ comments, which comment out their contents. See Section 9.7, "Tokenizer States," for an explanation of how Tokenizer handles comments. 14.2.2 Logikus Programs The grammar developed in Section 14.2 recognizes Logikus axioms but not entire Logikus programs. You could extend the grammar to observe that a program is a semicolon-separated series of axioms, but it is more efficient to use a simple text-processing class to separate the input into axioms and then apply a parser to each axiom. The class TokenStringSource in sjm.parse.tokens parses a Logikus program string into statements. This simplifies the work of constructing the parser, and it reduces the amount of work the parser must perform. Knowing that you need to build a parser for just Logikus axioms, you can generate the parser almost directly from the grammar. The parser, however, needs assemblers to build Structure objects, Variable objects, and the other components that can be composed into a Program object. |