13.1 Introduction to Prolog | Programming for the Javaв„ў Virtual Machine

Here is an example of a Prolog program:

 inside(baltimore, maryland). inside(maryland, united_states). inside(california, united_states). inside(san_francisco, california).

These are rules that tell Prolog that Maryland is inside the United States, that Baltimore is inside Maryland, and so on.

Some terminology: the word before the parentheses is called the predicate, and the list inside the parentheses is called the arguments. In the first rule, the word inside is the predicate, and baltimore and maryland are the arguments. Order is significant: inside(baltimore, maryland) means that Baltimore is inside Maryland, but inside(maryland, baltimore) means that Maryland is inside Baltimore. Since the latter fact would not be true in the real world, it is not included in the program.

13.1.1 Asking Questions

You can use these rules to ask a number of questions. For example, you can ask:

 ?- inside(baltimore, maryland).

The ?- is the prompt from the Prolog system when you ask a question. The Prolog system will respond

Yes

You can also use placeholders called variables, which are written beginning with an upper-case letter. Words written beginning with lower-case letters are called constants. If a position inside a predicate is a variable, then it is called free; if it is a constant, then it is called bound. For example,

 ?- inside(X, united_states).

This query asks for the inside predicate with the first position free and the second position bound. Prolog responds:

 Yes, X=maryland Yes, X=california

Prolog looks through its program, looking for any rules that match the question. For positions that are free, Prolog substitutes the corresponding constant in the rule for the variable in the question. This process is called unification.

After unification, if the rule and the question match exactly, then Prolog responds that it has found an answer to the question. It also prints out a list of substitutions that had to be made to make the rule and the question match.

Prolog found two different answers to this question, Maryland and California. It tried to match against all four rules it knows. For the first rule,

 inside(baltimore, maryland).

Prolog was able to match X with baltimore, but the substituted question inside(baltimore, united_states) did not match the rule inside(baltimore, maryland). Prolog gave up on this rule and tried again. This is called backtracking.

Prolog backtracked to the next rule:

 inside(maryland, united_states).

After substituting maryland for X, Prolog found that the rule exactly matches the question. This is a valid answer to the question, so Prolog printed Yes, followed by the substitution X=maryland, which was necessary to make the question succeed.

Prolog then backtracked again to see if there were any more answers. It did not find an answer with the rule inside(san_francisco, california), but it found another answer with inside(california, united_states). After that, there were no more rules to check, so Prolog stopped.

13.1.2 Binding Patterns

You may ask your question with any combination of variables and constants. The pattern of bound and unbound arguments is called the binding pattern. Using the same program, you could also ask,

 ?- inside(san_francisco, X).

Prolog responds,

 Yes, x = california

You could even ask with all variables in both positions:

 ?- inside(X, Y).

Prolog responds,

 Yes, X=maryland, Y=united_states Yes, X=baltimore, Y=maryland Yes, X=california, Y=united_states Yes, X=san_francisco, Y=california

13.1.3 Implications

If you ask,

 ?- inside(baltimore, united_states).

Prolog responds,

No

Clearly, something a bit smarter is required here, since Baltimore is certainly within the United States. Prolog should be able to figure this out somehow, since Baltimore is inside Maryland and Maryland is within the United States. It would be good to have a rule that states this.

This kind of rule is called an implication. The following rules specify within, an implication that expresses the definition in the previous paragraph:

 within(X, Y) :- inside(X, Y). within(X, Y) :- inside(X, Z), within(Z, Y).

The first rule says that X is within Y if X is inside Y. The second rule says X is within Y if there is some Z such that Z is found inside X and Z is within Y.

The part before the :- is called the head of the rule; the part after the :- is called the body. Prolog matches the query with the head of the rule. If it matches, then it makes the same variable substitutions in the body as in the head. Then it treats the body as a series of questions. If it can answer those questions, then it has found an answer to the original question.

Let's try these rules out on

 ?- within(baltimore, united_states).

Prolog looks up its rules for within. It tries the first rule, substituting baltimore for X and united_states for Y. This succeeds so far, so Prolog tries to fulfill the body of the rule. It substitutes for X and Y, and then Prolog asks,

 ?- inside(baltimore, united_states).

As shown earlier, there is no rule that lets Prolog conclude that baltimore is inside the united_states. So this rule fails.

Prolog backtracks and tries the other rule for within. Again, X unifies with baltimore and Y unifies with united_states. Since this matches the head of the rule, Prolog substitutes these bindings into the body. There is no binding yet for Z.

Now Prolog has to see if it can satisfy the body of the rule. So Prolog asks itself,

 ?- inside(baltimore, Z).

Using the rules for inside, it discovers that this is true if Z is maryland. Then it goes to the second part of the rule, within(Z, Y).

Now Prolog knows that Z is maryland and Y is united_states, so it sets out to see if within(maryland, united_states) is true. It goes back to the first rule for within and asks itself,

 inside(maryland, united_states).

Prolog finds that this is true, which means that within(maryland, united_ states) is also true, which means that within(baltimore, united_states) is true as well. So Prolog prints

Yes

Prolog then backtracks again, trying to find additional ways to prove that Baltimore is within the United States. It finds no more answers.

13.1.4 Binding Patterns and Implications

You can use within using variables as well as constants. If you ask,

 within(san_francisco, X).

Prolog reports,

 Yes, X=california Yes, X=united_states

13.1.5 Facts as Rules

The rule

 inside(baltimore, maryland).

also has a head and a body. The head is inside(baltimore, maryland) and the body is empty. An empty body always succeeds. This fact is equivalent to

 inside(baltimore, maryland) :- true.

where true is a predicate that always succeeds. A rule that always succeeds is called a fact. The Prolog language does not make any distinction between facts and other rules, except for the syntactic sugar that lets you omit :- true from facts. This does not change the language; it only makes it a little easier to read and write.

Some facts may have variables in the head. For example, the fact

 inside(good, X).

means that there is some good in everybody.

13.1.6 More About Unification

When the same variable is used multiple times, two terms unify only when there is a consistent substitution. That is, you must be able to use the same value each time the variable is found.

 foo(b, X, b) foo(Y, a, Y)

These unify when X=a and Y=b.

By contrast, this unification cannot succeed:

 foo(b, X, c) foo(Y, a, Y)

There is no consistent way to substitute for Y in the second term, since in one case it must match b and in the other it must match c.

Two variables always unify with each other. This means that whenever you find a substitution for one, you must substitute it exactly with the other. For example:

 bar(X, a) bar(Y, Y)

To unify these terms, variables X and Y must unify, since they appear in corresponding places. Y and a also unify, for the same reason. Since X and Y are unified and Y and a are unified, X and a must also unify. This means that to unify these two expressions, you must substitute a for X and also substitute a for Y.

Another thing about unification: within a single expression, variable names are meaningful only to the point where they match each other. Between expressions, you can replace one variable name with another, as long as you do it consistently. Suppose you want to unify

 baz(X, Y) baz(Z, X)

These terms unify when the first X unifies with Z and the second X unifies with Y. The two Xs are different, since they appear in different expressions. For this reason, many Prolog implementations internally replace all variable names with meaningless ones beginning with underscores, like this:

 baz(_0, _1) baz(_2, _3)

This is equivalent to the two expressions shown earlier for baz. Within an expression, the same variable name must be used in each case. So the two expressions shown earlier for foo are equivalent to

 foo(b, _4, c) foo(_5, a, _5)

The point is that variable names are meaningless except within a single expression. They may be substituted for other variable names whenever you find it convenient.