4.1 Probabilistic Independence

What exactly does it mean that two events are independent? Intuitively, it means that they have nothing to do with each other—they are totally unrelated; the occurrence of one has no influence on the other. Suppose that two different coins are tossed. Most people would view the outcomes as independent. The fact that the first coin lands heads should not affect the outcome of the second coin (although it is certainly possible to imagine a complicated setup whereby they are not independent). What about tossing the same coin twice? Is the second toss independent of the first? Most people would agree that it is (although see Example 4.2.1). (Having said that, in practice, after a run of nine heads of a fair coin, many people also believe that the coin is "due" to land tails, although this is incompatible with the coin tosses being independent. If they were independent, then the outcome of the first nine coin tosses would have no effect on the tenth toss.)

In any case, whatever it may mean that two events are "independent", it should be clear that none of the representations of uncertainty considered so far can express the notion of unrelatedness directly. The best they can hope to do is to capture the "footprint" of independence, in a sense that will be made more precise. In this section, I consider this issue in the context of probability. In Section 4.3 I discuss independence for other representations of uncertainty.

Certainly if U and V are independent or unrelated, then learning U should not affect the probability of V and learning V should not affect the probability of U. This suggests that the fact that U and V are probabilistically independent (with respect to probability measure μ) can be expressed as μ(U | V) = μ(U) and μ(V | U) = μ(V). There is a technical problem with this definition. What happens if μ(V) = 0? In that case μ(U | V) is undefined. Similarly, if μ(U) = 0, then μ(V | U) is undefined. (This problem can be avoided by using conditional probability measures. I return to this point later but, for now, I assume that μ is an unconditional probability measure.) It is conventional to say that, in this case, U and V are still independent. This leads to the following formal definition:

Definition 4.1.1

start example

U and V are probabilistically independent (with respect to probability measure μ) if μ(V) ≠ 0 implies μ(U | V) = μ(U) and μ(U) ≠ 0 implies μ(V | U) = μ(V).

end example
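To see the definition in action, it can be checked directly on a finite space. The following sketch (the helper names `prob` and `independent` are mine, not from the text; the measure is represented as a dict from worlds to probabilities and events as sets of worlds) tests both clauses of Definition 4.1.1 on the two-coin example from the start of the section:

```python
# Sketch: checking Definition 4.1.1 on a finite space.
# A measure is a dict mapping worlds to probabilities; events are sets of worlds.

def prob(mu, event):
    """mu(event) for a finite measure mu."""
    return sum(p for w, p in mu.items() if w in event)

def independent(mu, u, v, tol=1e-12):
    """U and V are independent iff mu(V) != 0 implies mu(U | V) = mu(U)
    and mu(U) != 0 implies mu(V | U) = mu(V)."""
    pu, pv = prob(mu, u), prob(mu, v)
    puv = prob(mu, u & v)
    clause1 = pv == 0 or abs(puv / pv - pu) < tol   # mu(U | V) = mu(U)
    clause2 = pu == 0 or abs(puv / pu - pv) < tol   # mu(V | U) = mu(V)
    return clause1 and clause2

# Two fair coin tosses: worlds are (first, second) outcomes, all equally likely.
mu = {(a, b): 0.25 for a in "HT" for b in "HT"}
first_heads = {(a, b) for (a, b) in mu if a == "H"}
second_heads = {(a, b) for (a, b) in mu if b == "H"}
print(independent(mu, first_heads, second_heads))  # True
```

As expected, the outcome of the first toss is independent of the second under the uniform measure.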

Definition 4.1.1 is not the definition of independence that one usually sees in textbooks, which is that U and V are independent if μ(U ∩ V) = μ(U)μ(V), but it turns out to be equivalent to the more standard definition.

Proposition 4.1.2

start example

The following are equivalent:

  (a) μ(U) ≠ 0 implies μ(V | U) = μ(V),

  (b) μ(U ∩ V) = μ(U)μ(V),

  (c) μ(V) ≠ 0 implies μ(U | V) = μ(U).

end example

Proof I show that (a) and (b) are equivalent. First, suppose that (a) holds. If μ(U) = 0, then clearly μ(U ∩ V) = 0 and μ(U)μ(V) = 0, so μ(U ∩ V) = μ(U)μ(V). If μ(U) ≠ 0, then μ(V | U) = μ(U ∩ V)/μ(U), so if μ(V | U) = μ(V), simple algebraic manipulation shows that μ(U ∩ V) = μ(U)μ(V). For the converse, if μ(U ∩ V) = μ(U)μ(V) and μ(U) ≠ 0, then μ(V) = μ(U ∩ V)/μ(U) = μ(V | U). This shows that (a) and (b) are equivalent. A symmetric argument shows that (b) and (c) are equivalent.
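The equivalence can also be spot-checked numerically. A minimal sketch (the helper names are mine; exact rational arithmetic via `fractions.Fraction` avoids floating-point tolerance issues, and zero-weight worlds are allowed so the μ(U) = 0 clauses get exercised):

```python
import random
from fractions import Fraction

def prob(mu, event):
    return sum(mu[w] for w in event)

def clauses(mu, u, v):
    """Truth values of (a), (b), (c) from Proposition 4.1.2, in exact arithmetic."""
    pu, pv, puv = prob(mu, u), prob(mu, v), prob(mu, u & v)
    a = pu == 0 or puv / pu == pv   # mu(U) != 0 implies mu(V | U) = mu(V)
    b = puv == pu * pv              # mu(U ∩ V) = mu(U)mu(V)
    c = pv == 0 or puv / pv == pu   # mu(V) != 0 implies mu(U | V) = mu(U)
    return a, b, c

random.seed(0)
worlds = set(range(6))
for _ in range(1000):
    weights = {w: Fraction(random.randint(0, 5)) for w in worlds}
    total = sum(weights.values())
    if total == 0:
        continue
    mu = {w: x / total for w, x in weights.items()}
    u = {w for w in worlds if random.random() < 0.5}
    v = {w for w in worlds if random.random() < 0.5}
    a, b, c = clauses(mu, u, v)
    assert a == b == c, (mu, u, v)  # the three conditions always agree
print("all random checks passed")
```

Of course, passing random checks is no substitute for the proof above; the sketch just illustrates that (a), (b), and (c) stand or fall together on arbitrary finite measures.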

Note that Proposition 4.1.2 shows that I could have simplified Definition 4.1.1 by just using one of the clauses, say, μ(U) ≠ 0 implies μ(V | U) = μ(V), and omitting the other one. While it is true that one clause could be omitted in the definition of probabilistic independence, this will not necessarily be true for independence with respect to other notions of uncertainty; thus I stick to the more redundant definition.

The convention of declaring U and V independent whenever μ(U) = 0 or μ(V) = 0 leads to some counterintuitive conclusions if μ(U) is in fact 0. For example, if μ(U) = 0, then U is independent of itself. But U is certainly not unrelated to itself. This shows that the definition of probabilistic independence does not completely correspond to the informal intuition of independence as unrelatedness.
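A quick check makes the point concrete (a trivial sketch; the function name is mine):

```python
# Under the standard product criterion mu(U ∩ V) = mu(U)mu(V), an event U
# with mu(U) = 0 comes out independent of itself, since
# mu(U ∩ U) = mu(U) = 0 = mu(U) * mu(U).

def product_independent(p_u, p_v, p_uv):
    """The textbook criterion: mu(U ∩ V) = mu(U)mu(V)."""
    return p_uv == p_u * p_v

# Taking V = U with mu(U) = 0, so that mu(U ∩ U) = mu(U) = 0:
print(product_independent(0.0, 0.0, 0.0))  # True: U is "independent" of itself
```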

To some extent it may appear that this problem can be avoided using conditional probability measures. In that case, the problem of conditioning on a set of probability 0 does not arise. Thus, Definition 4.1.1 can be simplified for conditional probability measures as follows:

Definition 4.1.3

start example

U and V are probabilistically independent (with respect to conditional probability space (W, ℱ, ℱ′, μ)) if V ∈ ℱ′ implies μ(U | V) = μ(U) and U ∈ ℱ′ implies μ(V | U) = μ(V).

end example

Note that Proposition 4.1.2 continues to hold for conditional probability measures (Exercise 4.1). It follows immediately that if both μ(U) ≠ 0 and μ(V) ≠ 0, then U and V are independent iff μ(U ∩ V) = μ(U)μ(V) (Exercise 4.2). Even if μ(U) = 0 or μ(V) = 0, the independence of U and V with respect to the conditional probability measure μ implies that μ(U ∩ V) = μ(U)μ(V) (Exercise 4.2), but the converse does not necessarily hold, as the following example shows:

Example 4.1.4

start example

Consider the conditional probability measure μ_0^s defined in Example 3.2.4. Let U = {w1, w3} and V = {w2, w3}. Recall that w1 is much more likely than w2, which in turn is much more likely than w3. It is not hard to check that μ_0^s(U | V) = 0 and μ_0^s(U) = 1, so U and V are not independent according to Definition 4.1.3. On the other hand, μ_0^s(U)μ_0^s(V) = μ_0^s(U ∩ V) = 0. Moreover, μ_0^s(V) = μ_0^s(V | U) = 0, which shows that both conjuncts of Definition 4.1.3 are necessary; in general, omitting either one results in a different definition of independence.

end example

Essentially, conditional probability measures can be viewed as ignoring information about negligibly small sets when it is not significant. With this viewpoint, the fact that μ_0^s(U)μ_0^s(V) = μ_0^s(U ∩ V) and μ_0^s(V | U) = μ_0^s(V) can be understood as saying that the difference between μ_0^s(U)μ_0^s(V) and μ_0^s(U ∩ V) is negligible, as is the difference between μ_0^s(V | U) and μ_0^s(V). However, it does not follow that the difference between μ_0^s(U | V) and μ_0^s(U) is negligible; indeed, this difference is as large as possible. This interpretation can be made precise by considering the nonstandard probability measure μ_0^ns from which μ_0^s is derived (see Example 3.2.4). Recall that μ_0^ns(w1) = 1 − ε − ε², μ_0^ns(w2) = ε, and μ_0^ns(w3) = ε². Thus, μ_0^ns(V | U) = ε²/(1 − ε) and μ_0^ns(V) = ε + ε². The closest real number to both ε²/(1 − ε) and ε + ε² is 0 (they are both infinitesimals, since ε²/(1 − ε) < 2ε²), which is why μ_0^s(V | U) = μ_0^s(V) = 0. Nevertheless, μ_0^ns(V | U) is much smaller than μ_0^ns(V). This information is ignored by μ_0^s; it treats the difference as negligible, so μ_0^s(V | U) = μ_0^s(V).
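The arithmetic of Example 4.1.4 can be reproduced concretely by letting a small rational stand in for the infinitesimal ε (a sketch only, not a true nonstandard model; variable names are mine):

```python
from fractions import Fraction

eps = Fraction(1, 1000)  # a concrete stand-in for the infinitesimal epsilon

# The nonstandard measure mu_0^ns from Example 3.2.4:
mu_ns = {"w1": 1 - eps - eps**2, "w2": eps, "w3": eps**2}

def prob(mu, event):
    return sum(mu[w] for w in event)

U = {"w1", "w3"}
V = {"w2", "w3"}

p_u = prob(mu_ns, U)        # 1 - eps
p_v = prob(mu_ns, V)        # eps + eps^2
p_uv = prob(mu_ns, U & V)   # eps^2, since U ∩ V = {w3}

print(p_uv / p_u == eps**2 / (1 - eps))   # mu_ns(V | U) = eps^2/(1 - eps): True
print(p_v == eps + eps**2)                # mu_ns(V) = eps + eps^2: True

# Both quantities are tiny, so both "standard parts" round to 0, even though
# mu_ns(V | U) is smaller than mu_ns(V) by a factor of order eps:
print(float(p_uv / p_u), float(p_v))

# By contrast, mu_ns(U | V) = eps/(1 + eps) has standard part 0 while
# mu_0^s(U) = 1, which is why U and V fail to be independent:
print(p_uv / p_v == eps / (1 + eps))      # True
```

Exact rational arithmetic makes the identities hold with equality; the "standard part" step is then just discarding the eps-sized terms.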

Reasoning About Uncertainty
ISBN: 0262582597
Year: 2005
Pages: 140