10.3 First-Order Reasoning about Probability

There is an obvious first-order extension of the propositional logic ℒ_n^{QU} considered in Section 7.3. The syntax is just a combination of the syntax for first-order logic and that of ℒ_n^{QU}; I omit the formal definition. Call the resulting language ℒ_n^{QU,fo}. ℒ_n^{QU,fo} includes formulas such as ∀x(ℓ₁(P(x)) ≥ 1/2) ⇒ ℓ₂(∃y Q(y)) < 1/3; quantifiers can appear in the scope of likelihood formulas and likelihood formulas can appear in the scope of quantifiers.

Just as in Chapter 7, the likelihood operator ℓᵢ can be interpreted as probability (if all sets are measurable), inner measure, lower probability, belief, or possibility, depending on the semantics. For example, in the case of probability, a relational probability structure has the form (W, D, PR₁, …, PRₙ, π). (Note that, for the same reasons as in the case of knowledge, I am making the common-domain assumption.) Let ℳ_n^{meas,fo} consist of all relational (measurable) probability structures. I leave the straightforward semantic definitions to the reader.

If this were all there was to it, this would be a very short section. However, consider the two statements "The probability that a randomly chosen bird will fly is greater than .9" and "The probability that Tweety (a particular bird) flies is greater than .9." There is no problem dealing with the second statement; it corresponds to the formula ℓ(Flies(Tweety)) > .9. (I am assuming that there is only one agent in the picture, so I omit the subscript on ℓ.) But what about the first statement? What is the formula that should hold at a set of worlds whose probability is greater than .9?

The most obvious candidate is ℓ(∀x(Bird(x) ⇒ Flies(x))) > .9. However, it might very well be the case that in each of the worlds considered possible, there is at least one bird that doesn't fly. Hence, the statement ∀x(Bird(x) ⇒ Flies(x)) holds in none of the worlds (and so has probability 0); thus, ℓ(∀x(Bird(x) ⇒ Flies(x))) > .9 does not capture the first statement. What about ∀x(ℓ(Bird(x) ⇒ Flies(x)) > .9) or, perhaps better, ∀x(ℓ(Flies(x) | Bird(x)) > .9)? This runs into problems if there is a constant, say Opus, that represents an individual, say a penguin, that does not fly and is a rigid designator. Then ℓ(Flies(Opus) | Bird(Opus)) = 0, contradicting both ∀x(ℓ(Flies(x) | Bird(x)) > .9) and ∀x(ℓ(Bird(x) ⇒ Flies(x)) > .9). (It is important here that Opus is a rigid designator. The two statements ∀x(ℓ(Flies(x) | Bird(x)) > .9) and ℓ(Flies(Opus) | Bird(Opus)) = 0 are consistent if Opus is not a rigid designator; see Exercise 10.10.)
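
To see the tension concretely, here is a minimal Python sketch (my own illustration, not the book's semantics) of the possible-worlds reading of ℓ: a few hypothetical worlds, an assumed measure on them, and a rigid Opus that denotes a non-flying bird in every world. Then ℓ(Flies(Tweety)) > .9 can hold while ℓ(Flies(Opus) | Bird(Opus)) = 0, so the universally quantified candidates above cannot be satisfied.

```python
# A minimal sketch (not the book's code): three hypothetical possible worlds,
# an assumed probability measure on them, and a rigid constant Opus denoting
# the same non-flying bird in every world.
worlds = {
    "w1": {"birds": {"Tweety", "Opus"}, "fliers": {"Tweety"}},
    "w2": {"birds": {"Tweety", "Opus"}, "fliers": {"Tweety"}},
    "w3": {"birds": {"Tweety", "Opus"}, "fliers": set()},
}
mu = {"w1": 0.55, "w2": 0.40, "w3": 0.05}    # assumed measure on worlds

def likelihood(event):
    """l(phi): total probability of the worlds in which the event holds."""
    return sum(p for w, p in mu.items() if event(worlds[w]))

def conditional(event, given):
    denom = likelihood(given)
    return likelihood(lambda w: event(w) and given(w)) / denom if denom else None

print(likelihood(lambda w: "Tweety" in w["fliers"]))     # 0.95, so l(Flies(Tweety)) > .9 holds
print(conditional(lambda w: "Opus" in w["fliers"],
                  lambda w: "Opus" in w["birds"]))       # 0.0: l(Flies(Opus) | Bird(Opus)) = 0
```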

There seems to be a fundamental difference between these two statements. The first can be viewed as a statement about what one might expect as the result of performing some experiment or trial in a given situation. It can also be viewed as capturing statistical information about the world, since given some statistical information (say, that 90% of the individuals in a population have property P), then a randomly chosen individual should have probability .9 of having property P. By way of contrast, the second statement captures a degree of belief. The first statement seems to assume only one possible world (the "real" world), and in this world, some probability measure over the set of birds. It is saying that, with probability greater than .9, a bird chosen at random (according to this measure) will fly. The second statement implicitly assumes the existence of a number of possible worlds (in some of which Tweety flies, while in others Tweety doesn't), with some probability over these possibilities. Not surprisingly, the possible-worlds approach is well-suited to handling the second statement, but not the first.

It is not hard to design a language appropriate for statistical reasoning suitable for dealing with the first statement. The language includes terms of the form ||φ||_x, which can be interpreted as "the probability that a randomly chosen x in the domain satisfies φ." This is analogous to terms such as ℓᵢ(φ) in ℒ_n^{QU}. More generally, there can be an arbitrary set of variables in the subscript. To understand the need for this, suppose that the formula Son(x, y) says that x is the son of y. Now consider the three terms ||Son(x, y)||_x, ||Son(x, y)||_y, and ||Son(x, y)||_{x,y}. The first describes the probability that a randomly chosen x is the son of y; the second describes the probability that x is the son of a randomly chosen y; the third describes the probability that a randomly chosen pair (x, y) will have the property that x is the son of y. These three statements are all quite different. By allowing different sets of random variables in the subscript, they can all be expressed in the logic.
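
A small, purely illustrative sketch (the toy domain, Son relation, and uniform measure are my own choices, not the book's) makes the difference between the three subscript sets concrete:

```python
from itertools import product

# Toy domain and Son relation (illustrative): a is the son of b and of c.
domain = ["a", "b", "c", "d"]
son = {("a", "b"), ("a", "c")}              # Son(x, y): x is the son of y
mu = {d: 1 / len(domain) for d in domain}   # uniform measure, for concreteness

def freq_x(y):      # ||Son(x, y)||_x : randomly chosen x, y held fixed
    return sum(mu[x] for x in domain if (x, y) in son)

def freq_y(x):      # ||Son(x, y)||_y : x held fixed, randomly chosen y
    return sum(mu[y] for y in domain if (x, y) in son)

def freq_xy():      # ||Son(x, y)||_{x,y} : randomly chosen pair (x, y)
    return sum(mu[x] * mu[y] for x, y in product(domain, repeat=2) if (x, y) in son)

print(freq_x("b"))   # 0.25  -- one of the four x's is a son of b
print(freq_y("a"))   # 0.5   -- a is the son of two of the four y's
print(freq_xy())     # 0.125 -- 2 of the 16 pairs satisfy Son
```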

More formally, define a statistical likelihood term to have the form ||φ||_X, where φ is a formula and X is a set of variables. A (linear) statistical likelihood formula is one of the form a₁||φ₁||_{X₁} + ⋯ + aₖ||φₖ||_{Xₖ} > b. Formulas are now formed just as in first-order logic, except that linear statistical likelihood formulas are allowed. In this language, the statement "The probability that a randomly chosen bird will fly is greater than .9" can easily be expressed. With some abuse of notation, it is just ||Flies(x) | Bird(x)||_x > .9. (Without the abuse, it would be ||Flies(x) ∧ Bird(x)||_x > .9||Bird(x)||_x or ||Flies(x) ∧ Bird(x)||_x − .9||Bird(x)||_x > 0.)
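
The unabbreviated forms are just linear inequalities over statistical likelihood terms; here is a quick numerical check with made-up values for the two terms (the numbers 0.5 and 0.47 are mine, chosen only for illustration):

```python
# Assumed values: ||Bird(x)||_x = 0.5 and ||Flies(x) & Bird(x)||_x = 0.47.
bird = 0.5
flies_and_bird = 0.47

# Abbreviated conditional form: ||Flies(x) | Bird(x)||_x > .9
print(flies_and_bird / bird > 0.9)          # True (0.94 > 0.9)

# The two unabbreviated linear forms from the text agree with it:
print(flies_and_bird > 0.9 * bird)          # ||Flies & Bird||_x > .9 ||Bird||_x
print(flies_and_bird - 0.9 * bird > 0)      # ||Flies & Bird||_x - .9 ||Bird||_x > 0
```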

Quantifiers can be combined with statistical likelihood formulas. For example, ∀x(||Son(x, y)||_y > .9) says that for every person x, the probability that x is the son of a randomly chosen person y is greater than .9; ∀y(||Son(x, y)||_x > .9) says that for every person y, the probability that a randomly chosen x is the son of y is greater than .9. Let ℒ^{QU,stat} be the language that results from combining the syntax of first-order logic with statistical likelihood formulas.

As with ℓ, statistical likelihood terms can be evaluated with respect to any quantitative representation of uncertainty. For definiteness, I use probability here. A statistical structure is a tuple (𝒜, μ), where 𝒜 is a relational structure and μ is a probability measure on dom(𝒜). To simplify matters, I assume that all subsets of dom(𝒜) are measurable, that dom(𝒜) is finite or countable, and that μ is countably additive. That means that μ is characterized by the probability it assigns to the elements of dom(𝒜). Let ℳ^{meas,stat} consist of all statistical structures of this form.
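
Concretely, a finite statistical structure of this kind amounts to little more than a domain, an interpretation of the relation symbols, and a probability map on the domain. The following sketch is only an illustration of that representation (the class name StatStructure, its fields, and the example individuals are mine, not the book's):

```python
from dataclasses import dataclass

@dataclass
class StatStructure:
    """A finite statistical structure (A, mu) -- a sketch, not the book's code.

    domain    -- the finite domain of the relational structure A
    relations -- interpretation of each relation symbol as a set of tuples
    mu        -- probability of each domain element (defaults to uniform)
    """
    domain: list
    relations: dict
    mu: dict = None

    def __post_init__(self):
        if self.mu is None:                 # default to the uniform measure
            self.mu = {d: 1 / len(self.domain) for d in self.domain}
        assert abs(sum(self.mu.values()) - 1) < 1e-9, "mu must sum to 1"

# Example: four individuals, all of them birds, one of which does not fly.
M = StatStructure(
    domain=["b1", "b2", "b3", "p1"],
    relations={"Bird": {("b1",), ("b2",), ("b3",), ("p1",)},
               "Flies": {("b1",), ("b2",), ("b3",)}},
)
print(M.mu["b1"])   # 0.25 under the default uniform measure
```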

Statistical structures should be contrasted with relational probability structures. In a statistical structure, there are no possible worlds and thus no probability on worlds. There is essentially only one world, and the probability is on the domain. There is only one probability measure, not a different one for each agent. (It would be easy to allow a different probability measure for each agent, but the implicit assumption is that the probability in a statistical structure is objective and does not represent the agent's degree of belief.) An important special subclass of statistical structures (which is the focus of Chapter 11) consists of structures where the domain is finite and the probability measure is uniform (which makes all domain elements equally likely). This interpretation is particularly important for statistical reasoning. In that case, a formula such as ||Flies(x) | Bird(x)||_x > .9 could be interpreted as "more than 90 percent of birds fly."

There are a number of reasons for not insisting that μ be uniform in general. For one thing, there are no uniform probability measures in countably infinite domains where all sets are measurable. (A uniform probability measure in a countably infinite domain would have to assign probability 0 to each individual element in the domain, which means by countable additivity it would have to assign probability 0 to the whole domain.) For another, for representations of uncertainty other than probability, there is not always an obvious analogue of uniform probability measures. (Consider plausibility measures, for example. What would uniformity mean there?) Finally, there are times when a perfectly reasonable way of making choices might not result in all domain elements being equally likely. For example, suppose that there are seven balls, four in one urn and three in another. If an urn is chosen at random and then a ball in the urn is chosen at random, not all the balls are equally likely. The balls in the urn with four balls have probability 1/8 of being chosen; the balls in the urn with three balls have probability 1/6 of being chosen. In any case, there is no additional difficulty in giving semantics to the case that μ is an arbitrary probability measure, so that is what I will do. On the other hand, to understand the intuitions, it is probably best to think in terms of uniform measures.
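
The arithmetic of the urn example can be checked directly (choose the urn, then the ball, and multiply); a quick check with exact fractions:

```python
from fractions import Fraction

# Urn example from the text: choose one of the two urns at random, then a ball
# in that urn at random.
p_big_ball   = Fraction(1, 2) * Fraction(1, 4)   # a ball in the four-ball urn
p_small_ball = Fraction(1, 2) * Fraction(1, 3)   # a ball in the three-ball urn
print(p_big_ball, p_small_ball)                  # 1/8 1/6

# The induced measure on the seven balls is a probability measure, but not uniform.
print(4 * p_big_ball + 3 * p_small_ball)         # 1
print(p_big_ball == Fraction(1, 7))              # False: not the uniform measure
```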

One more construction is needed before giving the semantics. Given a probability measure μ on D, there is a standard construction for defining the product measure μⁿ on the product domain Dⁿ consisting of all n-tuples of elements of D: define μⁿ(d₁, …, dₙ) = μ(d₁) × ⋯ × μ(dₙ). Note that if μ assigns equal probability to every element of D, then μⁿ assigns equal probability to every element of Dⁿ.
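
For a finite domain, the product construction is a short computation; in the sketch below, the measure μ is just an example of my own choosing:

```python
from itertools import product

mu = {"d1": 0.5, "d2": 0.3, "d3": 0.2}   # an example measure on D (assumed)

def product_measure(mu, n):
    """mu^n on D^n: mu^n(e1, ..., en) = mu(e1) * ... * mu(en)."""
    mun = {}
    for tup in product(mu, repeat=n):    # iterate over all n-tuples of elements
        p = 1.0
        for e in tup:
            p *= mu[e]
        mun[tup] = p
    return mun

mu2 = product_measure(mu, 2)
print(mu2[("d1", "d3")])                   # 0.1
print(abs(sum(mu2.values()) - 1) < 1e-9)   # True: mu^2 is again a probability measure
```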

The semantic definitions are identical to those for first-order logic; the only new clause is that for statistical likelihood formulas. Given a statistical structure M = (𝒜, μ), a valuation V, and a statistical likelihood term ||φ||_{x₁,…,xₙ}, define

[||φ||_{x₁,…,xₙ}]_{(M,V)} = μⁿ({(d₁, …, dₙ) ∈ dom(𝒜)ⁿ : (M, V[x₁/d₁, …, xₙ/dₙ]) ⊨ φ}).

That is, [||φ||_{x₁,…,xₙ}]_{(M,V)} is the probability that a randomly chosen tuple (d₁, …, dₙ) (chosen according to μⁿ) satisfies φ. Then define

(M, V) ⊨ a₁||φ₁||_{X₁} + ⋯ + aₖ||φₖ||_{Xₖ} > b iff a₁[||φ₁||_{X₁}]_{(M,V)} + ⋯ + aₖ[||φₖ||_{Xₖ}]_{(M,V)} > b.
Note that the x in ||φ||_x acts in many ways just like the x in ∀x; for example, both bind free occurrences of x in φ, and in both cases the x is a dummy variable. That is, ∀xφ is equivalent to ∀yφ[x/y] and ||φ||_x > b is equivalent to ||φ[x/y]||_y > b if y does not appear in φ (see Exercise 10.11). Indeed, || · ||_x can express some of the general notions of quantification referred to in Section 10.1. For example, with a uniform probability measure and a finite domain, ||φ||_x > 1/2 expresses the fact that at least half the elements in the domain satisfy φ, and thus is equivalent to the formula Hxφ(x) from Section 10.1.
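
The following sketch (again my own illustration, with an assumed toy structure and predicate P) spells out the semantic clause computationally: [||φ||_{x₁,…,xₙ}]_{(M,V)} is obtained by summing μⁿ over the tuples of domain elements that satisfy φ. With a uniform measure it also illustrates the connection to the "half" quantifier Hx of Section 10.1.

```python
from itertools import product

# A toy statistical structure (purely illustrative): a domain, a uniform
# measure, and the interpretation of a unary predicate P.
domain = ["d1", "d2", "d3", "d4"]
mu = {d: 0.25 for d in domain}
P = {"d2", "d3", "d4"}

def stat_term(phi, variables, valuation=None):
    """[||phi||_{x1,...,xn}]_(M,V): the mu^n-probability that a randomly chosen
    tuple of values for the listed variables (other variables fixed by the
    valuation V) satisfies phi; phi is a predicate on valuations (dicts)."""
    base = dict(valuation or {})
    total = 0.0
    for tup in product(domain, repeat=len(variables)):
        weight = 1.0
        for d in tup:
            weight *= mu[d]                     # mu^n of the tuple
        v = dict(base, **dict(zip(variables, tup)))
        if phi(v):
            total += weight
    return total

# With a uniform measure, ||P(x)||_x is the fraction of the domain satisfying P,
# so ||P(x)||_x > 1/2 plays the role of the "half" quantifier Hx of Section 10.1.
print(stat_term(lambda v: v["x"] in P, ["x"]))        # 0.75
print(stat_term(lambda v: v["x"] in P, ["x"]) > 0.5)  # True
```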

Of course, statistical reasoning and reasoning about degrees of belief can be combined, by having a structure with both a probability on the domain and a probability on possible worlds. The details are straightforward, so I omit them here.

What about axioms? First consider reasoning about degrees of belief. It is easy to see that F1–5 are sound, as are QU1–3, QUGen, and Ineq from Section 7.3. They are, however, not complete. In fact, there is no complete axiomatization for the language ℒ_n^{QU,fo} with respect to ℳ_n^{meas,fo} (even if n = 1); the set of formulas in ℒ_n^{QU,fo} valid with respect to ℳ_n^{meas,fo} is not recursively enumerable. Restricting to finite domains does not help (since first-order logic restricted to finite domains is by itself not axiomatizable), nor does restricting to finite sets of worlds. But, as in the case of first-order logic, restricting to bounded domains does help.

Let AX^{prob,fo}_{n,N} consist of the axioms and inference rule of AX^{fo}_N together with those of AX^{prob}_n and one other axiom:

IV. (x ≠ y) ⇒ (ℓᵢ(x ≠ y) = 1).

IV stands for Inequality of Variables. It is easy to see that IV is sound, as is the analogous property for equality, (x = y) ⇒ (ℓᵢ(x = y) = 1), called EV.

EV just follows from the fact that variables are treated as rigid and have the same value in all worlds. EV is provable from the other axioms, so it is not necessary to include it in the axiomatization (Exercise 10.13). In fact, the analogues of IV and EV are both provable in the case of knowledge, which is why they do not appear in the axiomatization of Theorem 10.2.1 (Exercise 10.14).
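
The soundness of IV (and of EV) can be seen concretely: since a valuation assigns each variable the same domain element in every world, the event corresponding to x ≠ y is either the set of all worlds or the empty set. A short sketch, with an assumed set of worlds and measure:

```python
# Variables are rigid: the valuation V is independent of the world, so the
# event "x != y" holds either in every world or in none of them.
worlds = ["w1", "w2", "w3"]
mu = {"w1": 0.2, "w2": 0.5, "w3": 0.3}      # an arbitrary assumed measure on worlds
V = {"x": "d1", "y": "d2"}                  # one valuation, shared by all worlds

event = [w for w in worlds if V["x"] != V["y"]]   # membership does not depend on w
print(sum(mu[w] for w in event))   # 1.0 whenever V(x) != V(y), matching IV; 0.0 otherwise
```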

Theorem 10.3.1


AX^{prob,fo}_{n,N} is a sound and complete axiomatization with respect to structures in ℳ_n^{meas,fo} with a domain of cardinality at most N for the language ℒ_n^{QU,fo}.


Proof Soundness is immediate from the soundness of AX^{fo}_N in relational structures of size at most N, the soundness of AX^{prob}_n in the propositional case, and the validity of IV, proved in Exercise 10.13. Completeness is beyond the scope of this book.

Thus, there is a sense in which the axioms of first-order logic together with those for propositional reasoning about probability capture the essence of first-order reasoning about probability.

Much the same results hold for statistical reasoning. Consider the following axioms and rule of inference, where X ranges over finite sets of variables:

  • PD1. ||φ||_X ≥ 0.

  • PD2. ∀x₁ … ∀xₙ φ ⇒ ||φ||_{x₁,…,xₙ} = 1.

  • PD3. ||φ ∧ ψ||_X + ||φ ∧ ¬ψ||_X = ||φ||_X.

  • PD4. ||φ||_X = ||φ[x/z]||_{X[x/z]}, where x ∈ X and z does not appear in X or φ.

  • PD5. ||φ ∧ ψ||_{X ∪ Y} = ||φ||_X × ||ψ||_Y if none of the free variables of φ is contained in Y, none of the free variables of ψ is contained in X, and X and Y are disjoint.

  • PDGen. From φ ⇔ ψ infer ||φ||_X = ||ψ||_X.

PD1, PD3, and PDGen are the obvious analogues of QU1, QU3, and QUGen, respectively. PD2 is an extension of QU2. PD4 allows renaming of variables bound by "statistical" quantification. As I mentioned earlier, there is an analogous property for first-order logic, namely ∀xφ ⇔ ∀yφ[x/y] (for y not appearing in φ), which follows easily from F2 and F3 (Exercise 10.11). PD5 says that if ψ and φ do not have any free variables in common, then they can be treated as independent. Its validity follows from the use of the product measure in the semantics (Exercise 10.12).
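
A small numerical sanity check of PD3 and PD5 on an assumed toy structure (the domain, measure, and predicates below are mine, for illustration only): additivity and the independence of terms over disjoint variable sets both fall out of the product-measure semantics.

```python
from itertools import product

domain = ["a", "b", "c"]
mu = {"a": 0.5, "b": 0.3, "c": 0.2}          # an arbitrary assumed measure
P = {"a", "b"}                               # interpretations of two unary predicates
Q = {"b", "c"}

def term(phi, variables):
    """[||phi||_X] computed over the product measure; phi is a predicate on valuations."""
    total = 0.0
    for tup in product(domain, repeat=len(variables)):
        v = dict(zip(variables, tup))
        weight = 1.0
        for d in tup:
            weight *= mu[d]
        if phi(v):
            total += weight
    return total

# PD3: ||phi & psi||_X + ||phi & ~psi||_X = ||phi||_X   (here X = {x})
lhs = term(lambda v: v["x"] in P and v["x"] in Q, ["x"]) + \
      term(lambda v: v["x"] in P and v["x"] not in Q, ["x"])
print(abs(lhs - term(lambda v: v["x"] in P, ["x"])) < 1e-9)   # True

# PD5: ||P(x) & Q(y)||_{x,y} = ||P(x)||_x * ||Q(y)||_y   (disjoint variable sets)
both = term(lambda v: v["x"] in P and v["y"] in Q, ["x", "y"])
sep = term(lambda v: v["x"] in P, ["x"]) * term(lambda v: v["y"] in Q, ["y"])
print(abs(both - sep) < 1e-9)   # True
```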

F1–5 continue to be sound for statistical reasoning, except that the notion of substitutability in F2 must be modified to take into account that || · ||_y acts like a quantifier, so that t is not substitutable in φ if the variable y occurs in t and a term of the form || · ||_y occurs in φ.

As in the case of degrees of belief, there is no complete axiomatization for the language ℒ^{QU,stat} with respect to ℳ^{meas,stat}; the set of formulas in ℒ^{QU,stat} valid with respect to ℳ^{meas,stat} is not recursively enumerable. And again, while restricting to structures with finite domains does not help, restricting to bounded domains does. Let AX^{stat}_N consist of the axioms and inference rule of AX^{fo}_N together with PD1–5 and PDGen.

Theorem 10.3.2


AX^{stat}_N is a sound and complete axiomatization with respect to structures in ℳ^{meas,stat} with a domain of cardinality at most N for the language ℒ^{QU,stat}.




