5.2 Expectation for Other Notions of Likelihood

How should expectation be defined for other representations of uncertainty? I start with sets of probability measures, since the results in this case are fairly straightforward and form the basis for other representations.

5.2.1 Expectation for Sets of Probability Measures

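There are straightforward analogues of lower and upper probability in the context of expectation. If $\mathcal{P}$ is a set of probability measures such that X is measurable with respect to each probability measure $\mu \in \mathcal{P}$, then define $E_{\mathcal{P}}(X) = \{E_\mu(X) : \mu \in \mathcal{P}\}$. $E_{\mathcal{P}}(X)$ is a set of numbers. Define the lower expectation and upper expectation of X with respect to $\mathcal{P}$, denoted $\underline{E}_{\mathcal{P}}(X)$ and $\overline{E}_{\mathcal{P}}(X)$, as the inf and sup of the set $E_{\mathcal{P}}(X)$, respectively. Clearly $\mathcal{P}_*(U) = \underline{E}_{\mathcal{P}}(X_U)$ and $\mathcal{P}^*(U) = \overline{E}_{\mathcal{P}}(X_U)$. The properties of $\underline{E}_{\mathcal{P}}$ and $\overline{E}_{\mathcal{P}}$ are not so different from those of probabilistic expectation.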
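For a finite set of measures on a finite space, the inf and sup in this definition are just a min and a max, so the definition can be checked directly. The following is a minimal sketch; the worlds, the two measures in `P`, and the gamble `X` are made-up illustrations, not taken from the text.

```python
from fractions import Fraction

def expectation(mu, X):
    """E_mu(X): sum over worlds w of mu(w) * X(w)."""
    return sum(mu[w] * X[w] for w in mu)

def lower_expectation(P, X):
    """Lower expectation: the smallest E_mu(X) over the measures in P."""
    return min(expectation(mu, X) for mu in P)

def upper_expectation(P, X):
    """Upper expectation: the largest E_mu(X) over the measures in P."""
    return max(expectation(mu, X) for mu in P)

def indicator(U, worlds):
    """The indicator gamble X_U: 1 on U, 0 elsewhere."""
    return {w: (1 if w in U else 0) for w in worlds}

# Hypothetical example data (an assumption for illustration only).
worlds = ["a", "b", "c"]
P = [
    {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)},
    {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)},
]
X = {"a": 1, "b": 2, "c": 4}

print(lower_expectation(P, X), upper_expectation(P, X))
# Lower probability of {a, b}, recovered as the lower expectation of its indicator.
print(lower_expectation(P, indicator({"a", "b"}, worlds)))
```

Applying `lower_expectation` to an indicator gamble recovers the lower probability of the corresponding set, which is the identity stated at the end of the paragraph above.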

Proposition 5.2.1

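The functions $\underline{E}_{\mathcal{P}}$ and $\overline{E}_{\mathcal{P}}$ have the following properties, for all gambles X and Y.

$\overline{E}_{\mathcal{P}}$ is subadditive: $\overline{E}_{\mathcal{P}}(X + Y) \le \overline{E}_{\mathcal{P}}(X) + \overline{E}_{\mathcal{P}}(Y)$;

$\underline{E}_{\mathcal{P}}$ is superadditive: $\underline{E}_{\mathcal{P}}(X + Y) \ge \underline{E}_{\mathcal{P}}(X) + \underline{E}_{\mathcal{P}}(Y)$.
$\underline{E}_{\mathcal{P}}$ and $\overline{E}_{\mathcal{P}}$ are both positively affinely homogeneous: $\underline{E}_{\mathcal{P}}(aX + \tilde{b}) = a\underline{E}_{\mathcal{P}}(X) + b$ and $\overline{E}_{\mathcal{P}}(aX + \tilde{b}) = a\overline{E}_{\mathcal{P}}(X) + b$ for all $a, b \in \mathbb{R}$ with $a \ge 0$, where $\tilde{b}$ is the constant gamble with value b.
$\underline{E}_{\mathcal{P}}$ and $\overline{E}_{\mathcal{P}}$ are monotone.
$\overline{E}_{\mathcal{P}}(X) = -\underline{E}_{\mathcal{P}}(-X)$.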

Proof See Exercise 5.8.

Superadditivity (resp., subadditivity), positive affine homogeneity, and monotonicity in fact characterize $\underline{E}_{\mathcal{P}}$ (resp., $\overline{E}_{\mathcal{P}}$), although the proof of this fact is beyond the scope of the book.

Theorem 5.2.2

Suppose that E maps gambles measurable with respect to $\mathcal{F}$ to ℝ and is superadditive (resp., subadditive), positively affinely homogeneous, and monotone. Then there is a set $\mathcal{P}$ of probability measures on $\mathcal{F}$ such that $E = \underline{E}_{\mathcal{P}}$ (resp., $E = \overline{E}_{\mathcal{P}}$).

(There is another equivalent characterization of $\underline{E}_{\mathcal{P}}$; see Exercise 5.9.)

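The set $\mathcal{P}$ constructed in Theorem 5.2.2 is not unique. It is not hard to construct sets $\mathcal{P}$ and $\mathcal{P}'$ such that $\mathcal{P} \ne \mathcal{P}'$ but $\underline{E}_{\mathcal{P}} = \underline{E}_{\mathcal{P}'}$ (see Exercise 5.10). However, there is a canonical largest set $\mathcal{P}$ such that $E = \underline{E}_{\mathcal{P}}$; $\mathcal{P}$ consists of all probability measures μ such that $E_\mu(X) \ge E(X)$ for all gambles X.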

There is also an obvious notion of expectation corresponding to $Pl_{\mathcal{P}}$ (as defined in Section 2.8). $E_{Pl_{\mathcal{P}}}$ maps a gamble X to a function $f_X$ from $\mathcal{P}$ to ℝ, where $f_X(\mu) = E_\mu(X)$. This is analogous to $Pl_{\mathcal{P}}$, which maps sets to functions from $\mathcal{P}$ to [0, 1]. Indeed, it should be clear that $E_{Pl_{\mathcal{P}}}(X_U) = Pl_{\mathcal{P}}(U)$, so that the relationship between $E_{Pl_{\mathcal{P}}}$ and $Pl_{\mathcal{P}}$ is essentially the same as that between $E_\mu$ and μ. Not surprisingly, there are immediate analogues of Propositions 5.1.1 and 5.1.2.

Proposition 5.2.3

The function $E_{Pl_{\mathcal{P}}}$ is additive, affinely homogeneous, and monotone.

Proof See Exercise 5.13.

Proposition 5.2.4

Suppose that E maps gambles measurable with respect to $\mathcal{F}$ to functions from I to ℝ and is additive, affinely homogeneous, and monotone. Then there is a (necessarily unique) set $\mathcal{P}$ of probability measures on $\mathcal{F}$ indexed by I such that $E = E_{Pl_{\mathcal{P}}}$.

Proof See Exercise 5.14.

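Note that if $\mathcal{P} \ne \mathcal{P}'$, then $E_{Pl_{\mathcal{P}}} \ne E_{Pl_{\mathcal{P}'}}$. As observed earlier, this is not the case with upper and lower expectation; it is possible that $\mathcal{P} \ne \mathcal{P}'$ yet $\underline{E}_{\mathcal{P}} = \underline{E}_{\mathcal{P}'}$ (and hence $\overline{E}_{\mathcal{P}} = \overline{E}_{\mathcal{P}'}$). Thus, $E_{Pl_{\mathcal{P}}}$ can be viewed as capturing more information about $\mathcal{P}$ than $\underline{E}_{\mathcal{P}}$. On the other hand, $\underline{E}_{\mathcal{P}}$ captures more information than $\mathcal{P}_*$. Since $\mathcal{P}_*(U) = \underline{E}_{\mathcal{P}}(X_U)$, it is immediate that if $\mathcal{P}_* \ne \mathcal{P}'_*$, then $\underline{E}_{\mathcal{P}} \ne \underline{E}_{\mathcal{P}'}$. However, as Example 5.2.10 shows, there are sets $\mathcal{P}$ and $\mathcal{P}'$ of probability measures such that $\mathcal{P}_* = \mathcal{P}'_*$ but $\underline{E}_{\mathcal{P}} \ne \underline{E}_{\mathcal{P}'}$.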

As for probability, there are additional continuity properties for $\underline{E}_{\mathcal{P}}$, $\overline{E}_{\mathcal{P}}$, and $E_{Pl_{\mathcal{P}}}$ if $\mathcal{P}$ consists of countably additive measures. They are the obvious analogues of (5.4) and (5.5).

(See Exercise 5.12.) Again, just as with upper and lower probability, the analogue of (5.6) does not hold for lower expectation, and the analogue of (5.7) does not hold for upper expectation. (Indeed, counterexamples for upper and lower probability can be converted to counterexamples for upper and lower expectation by taking indicator functions.) On the other hand, it is easy to see that the analogue of (5.5) does hold for $E_{Pl_{\mathcal{P}}}$.

Analogues of (5.4) and (5.5) hold for all the other notions of expectation I consider if the underlying representation satisfies the appropriate continuity property. To avoid repetition, I do not mention this again.

5.2.2 Expectation for Belief Functions

There is an obvious way to define a notion of expectation based on belief functions, using the identification of Bel with $(\mathcal{P}_{\mathrm{Bel}})_*$ (see Theorem 2.4.1). Given a belief function Bel, define $E_{\mathrm{Bel}} = \underline{E}_{\mathcal{P}_{\mathrm{Bel}}}$. Similarly, for the corresponding plausibility function Plaus, define $E_{\mathrm{Plaus}} = \overline{E}_{\mathcal{P}_{\mathrm{Bel}}}$.

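This is well defined, but, as with the case of conditional belief, it seems more natural to get a notion of expectation for belief functions that is defined purely in terms of belief functions, without reverting to probability. It turns out that this can be done using the analogue of (5.3). If $\mathcal{V}(X) = \{x_1, \ldots, x_n\}$, with $x_1 < \cdots < x_n$, define

$$E'_{\mathrm{Bel}}(X) = x_1 + (x_2 - x_1)\,\mathrm{Bel}(X > x_1) + \cdots + (x_n - x_{n-1})\,\mathrm{Bel}(X > x_{n-1}). \tag{5.9}$$

An analogous definition holds for plausibility:

$$E'_{\mathrm{Plaus}}(X) = x_1 + (x_2 - x_1)\,\mathrm{Plaus}(X > x_1) + \cdots + (x_n - x_{n-1})\,\mathrm{Plaus}(X > x_{n-1}). \tag{5.10}$$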

Proposition 5.2.5

E_Bel = E′_Bel and E_Plaus = E′_Plaus.

Proof See Exercise 5.15.

Equation (5.9) gives a way of defining expectation for belief and plausibility functions without referring to probability. (Another way of defining expectation for belief functions, using mass functions, is given in Exercise 5.16; another way of defining expected plausibility, using a different variant of (5.2), is given in Exercise 5.17.)

The analogue of (5.2) could, of course, be used to define a notion of expectation for belief functions, but it would not give a very reasonable notion. For example, suppose that W = {a, b} and Bel(a) = Bel(b) = 0. (Of course, Bel({a, b}) = 1.) Consider a gamble X such that X(a) = 1 and X(b) = 2. According to the obvious analogue of (5.1) or (5.2) (which are equivalent in this case), the expected belief of X is 0, since Bel(a) = Bel(b) = 0. However, it is easy to see that E_Bel(X) = 1 and E_Plaus(X) = 2, which seems far more reasonable. The real problem is that (5.2) is most appropriate for plausibility measures that are additive (in the sense defined in Section 2.8; i.e., there is a function ⊕ such that Pl(U ∪ V) = Pl(U) ⊕ Pl(V) for disjoint sets U and V). Indeed, the equivalence of (5.1) and (5.2) depends critically on the fact that probability is additive. As observed in Section 2.8 (see Exercise 2.56), belief functions are not additive. Thus, not surprisingly, using (5.2) does not give reasonable results.
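The two-world example can be checked mechanically. The sketch below assumes that the definition referred to as (5.9) has the layered form x₁ + Σ (x_{j+1} − x_j) Bel(X > x_j), and it represents Bel through a mass function that puts all mass on {a, b}; the function names are illustrative only.

```python
from fractions import Fraction

def bel(mass, U):
    """Bel(U): total mass of the focal sets contained in U."""
    return sum(m for V, m in mass.items() if V <= U)

def plaus(mass, U):
    """Plaus(U): total mass of the focal sets that intersect U."""
    return sum(m for V, m in mass.items() if V & U)

def layered_expectation(capacity, X):
    """x_1 + sum_j (x_{j+1} - x_j) * capacity({w : X(w) > x_j})  (assumed form of (5.9))."""
    values = sorted(set(X.values()))
    total = values[0]
    for lo, hi in zip(values, values[1:]):
        above = frozenset(w for w in X if X[w] > lo)
        total += (hi - lo) * capacity(above)
    return total

def naive_expectation(capacity, X):
    """Analogue of (5.2): sum over values x of x * capacity(X = x)."""
    return sum(
        x * capacity(frozenset(w for w in X if X[w] == x))
        for x in set(X.values())
    )

# All mass on {a, b}, so Bel(a) = Bel(b) = 0 and Bel({a, b}) = 1.
mass = {frozenset({"a", "b"}): Fraction(1)}
X = {"a": 1, "b": 2}

e_bel = layered_expectation(lambda U: bel(mass, U), X)
e_plaus = layered_expectation(lambda U: plaus(mass, U), X)
naive = naive_expectation(lambda U: bel(mass, U), X)
print(e_bel, e_plaus, naive)
```

Under these assumptions the layered form gives 1 for belief and 2 for plausibility, while the naive analogue of (5.2) gives 0, matching the values quoted in the text.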

Since E_Bel can be viewed as a special case of the lower expectation $\underline{E}_{\mathcal{P}}$ (taking $\mathcal{P} = \mathcal{P}_{\mathrm{Bel}}$), it is immediate from Proposition 5.2.1 that E_Bel is superadditive, positively affinely homogeneous, and monotone. (Similar remarks hold for E_Plaus, except that it is subadditive. For ease of exposition, I focus on E_Bel in the remainder of this section, although analogous remarks hold for E_Plaus.) But E_Bel has additional properties. Since it is immediate from the definition that E_Bel(X_U) = Bel(U), the inclusion-exclusion property B3 of belief functions can be expressed in terms of expectation (just by replacing all instances of Bel(V) in B3 by E_Bel(X_V)). Moreover, it does not follow from the other properties, since it does not hold for arbitrary lower probabilities (see Exercise 2.14).

B3 seems like a rather specialized property, since it applies only to indicator functions. There is a more general version of it that also holds for E_Bel. Given gambles X and Y, define the gambles X ∧ Y and X ∨ Y as the minimum and maximum of X and Y, respectively; that is, (X ∧ Y)(w) = min(X(w), Y(w)) and (X ∨ Y)(w) = max(X(w), Y(w)). Consider the following inclusion-exclusion rule for expectation:

$$E\Bigl(\bigvee_{i=1}^{n} X_i\Bigr) \ge \sum_{i=1}^{n} \; \sum_{\{I \subseteq \{1,\ldots,n\} \,:\, |I| = i\}} (-1)^{i+1}\, E\Bigl(\bigwedge_{j \in I} X_j\Bigr). \tag{5.11}$$

Since it is immediate that $X_{U \cup V} = X_U \vee X_V$ and $X_{U \cap V} = X_U \wedge X_V$, (5.11) generalizes B3.

There is yet another property satisfied by expectation based on belief functions. Two gambles X and Y are said to be comonotonic if it is not the case that one increases while the other decreases; that is, there do not exist worlds w, w′ such that X(w) < X(w′) while Y(w) > Y(w′). Equivalently, there do not exist w and w′ such that (X(w) − X(w′))(Y(w) − Y(w′)) < 0.

Example 5.2.6

Suppose that

W ={w₁, w₂, w₃};
X(w₁) = 1, X(w₂) = 3, and X(w₃) = 0;
Y(w₁) = 2, Y(w₂) = 7, and Y(w₃) = 4;
Z(w₁) = 3, Z(w₂) = 5, and Z(w₃) = 3.

Then X and Y are not comonotonic. The reason is that X decreases from w₁ to w₃, while Y increases from w₁ to w₃. On the other hand, X and Z are comonotonic, as are Y and Z.
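The product form of the definition translates directly into a pairwise check over worlds. The following short sketch applies it to the three gambles of Example 5.2.6; the function name `comonotonic` is just an illustrative choice.

```python
from itertools import combinations

def comonotonic(X, Y):
    """True iff no pair of worlds w, v has (X(w) - X(v)) * (Y(w) - Y(v)) < 0."""
    return all(
        (X[w] - X[v]) * (Y[w] - Y[v]) >= 0
        for w, v in combinations(X, 2)
    )

# The gambles of Example 5.2.6.
X = {"w1": 1, "w2": 3, "w3": 0}
Y = {"w1": 2, "w2": 7, "w3": 4}
Z = {"w1": 3, "w2": 5, "w3": 3}

print(comonotonic(X, Y), comonotonic(X, Z), comonotonic(Y, Z))
```

Because the product is symmetric in the two worlds, checking each unordered pair once is enough.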

Consider the following property of comonotonic additivity:

$$\text{If } X \text{ and } Y \text{ are comonotonic, then } E(X + Y) = E(X) + E(Y). \tag{5.12}$$

Proposition 5.2.7

The function E_Bel is superadditive, positively affinely homogeneous, and monotone, and it satisfies (5.11) and (5.12).

Proof The fact that E_Bel is superadditive, positively affinely homogeneous, and monotone follows immediately from Proposition 5.2.1. The fact that it satisfies (5.11) follows from B3 and Proposition 5.2.5 (Exercise 5.18). Proving that it satisfies (5.12) requires a little more work, although it is not that difficult. I leave the details to the reader (Exercise 5.19).

Theorem 5.2.8

Suppose that E maps gambles to ℝ and E is positively affinely homogeneous, is monotone, and satisfies (5.11) and (5.12). Then there is a (necessarily unique) belief function Bel such that E = E_Bel.

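Proof Define Bel(U) = E(X_U). Just as in the case of probability, it follows from positive affine homogeneity and monotonicity that Bel(∅) = 0, Bel(W) = 1, and 0 ≤ Bel(U) ≤ 1 for all U ⊆ W. By (5.11) (specialized to indicator functions), it follows that Bel satisfies B3. Thus, Bel is a belief function. Now if X is a gamble such that $\mathcal{V}(X) = \{x_1, \ldots, x_n\}$ and $x_1 < x_2 < \cdots < x_n$, define

$$X_j = x_1 X_W + (x_2 - x_1) X_{X > x_1} + \cdots + (x_j - x_{j-1}) X_{X > x_{j-1}}$$

for j = 1, …, n. It is not hard to show that $X = X_n$ and that $X_j$ and $(x_{j+1} - x_j) X_{X > x_j}$ are comonotonic, for j = 1, …, n − 1 (Exercise 5.20). Now applying (5.12) repeatedly, it follows that

$$E(X) = E(x_1 X_W) + E\bigl((x_2 - x_1) X_{X > x_1}\bigr) + \cdots + E\bigl((x_n - x_{n-1}) X_{X > x_{n-1}}\bigr).$$

Now applying positive affine homogeneity, it follows that

$$E(X) = x_1 + (x_2 - x_1) E(X_{X > x_1}) + \cdots + (x_n - x_{n-1}) E(X_{X > x_{n-1}}) = x_1 + (x_2 - x_1)\,\mathrm{Bel}(X > x_1) + \cdots + (x_n - x_{n-1})\,\mathrm{Bel}(X > x_{n-1}) = E_{\mathrm{Bel}}(X).$$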

Note that superadditivity was not assumed in the statement of Theorem 5.2.8. Indeed, it is a consequence of Theorem 5.2.8 that superadditivity follows from the other properties. In fact, the full strength of positive affine homogeneity is not needed either in Theorem 5.2.8. It suffices to assume that $E(\tilde{b}) = b$, where $\tilde{b}$ is the constant gamble with value b.

Lemma 5.2.9

Suppose that E is such that (a) $E(\tilde{b}) = b$, (b) E is monotone, and (c) E satisfies (5.12). Then E satisfies positive affine homogeneity.

Proof See Exercise 5.21.

It follows easily from these results that E_Bel is the unique function E mapping gambles to ℝ that is superadditive, positively affinely homogeneous, and monotone, that satisfies (5.11) and (5.12), and that has E(X_U) = Bel(U) for all U ⊆ W. Proposition 5.2.7 shows that E_Bel has these properties. Conversely, if E′ is a function from gambles to ℝ that has these properties, then by Theorem 5.2.8, E′ = E_Bel′ for some belief function Bel′. Since E′(X_U) = Bel′(U) = Bel(U) for all U ⊆ W, it follows that Bel = Bel′, and hence E′ = E_Bel.

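This observation says that Bel and E_Bel contain the same information. Thus, so do $(\mathcal{P}_{\mathrm{Bel}})_*$ and $\underline{E}_{\mathcal{P}_{\mathrm{Bel}}}$ (since $\mathrm{Bel} = (\mathcal{P}_{\mathrm{Bel}})_*$ and $E_{\mathrm{Bel}} = \underline{E}_{\mathcal{P}_{\mathrm{Bel}}}$). However, this is not true for arbitrary sets of probability measures, as the following example shows: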

Example 5.2.10

Let W = {1, 2, 3}. A probability measure μ on W can be characterized by a triple (a₁, a₂, a₃), where μ(i) = a_i. Let $\mathcal{P}$ consist of the three probability measures (0, 3/8, 5/8), (5/8, 0, 3/8), and (3/8, 5/8, 0). It is almost immediate that $\mathcal{P}_*$ is 0 on singleton subsets of W and $\mathcal{P}_* = 3/8$ for doubleton subsets. Let $\mathcal{P}' = \mathcal{P} \cup \{\mu_4\}$, where μ₄ = (5/8, 3/8, 0). It is easy to check that $\mathcal{P}'_* = \mathcal{P}_*$. However, $\underline{E}_{\mathcal{P}} \ne \underline{E}_{\mathcal{P}'}$. In particular, let X be the gamble such that X(1) = 1, X(2) = 2, and X(3) = 3. Then $\underline{E}_{\mathcal{P}}(X) = 13/8$, but $\underline{E}_{\mathcal{P}'}(X) = 11/8$. Thus, although $\underline{E}_{\mathcal{P}}$ and $\underline{E}_{\mathcal{P}'}$ agree on indicator functions, they do not agree on all gambles. In light of the earlier discussion, it should be no surprise that $\mathcal{P}_*$ is not a belief function (Exercise 5.23).
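The numbers in Example 5.2.10 can be verified by brute force over the finitely many measures and subsets. This is a small sketch using exact fractions; the variable names `P` and `P_prime` stand for the two sets of measures in the example, and the helper names are illustrative.

```python
from fractions import Fraction as F
from itertools import combinations

W = (1, 2, 3)
# Each measure is the triple (mu(1), mu(2), mu(3)).
P = [(F(0), F(3, 8), F(5, 8)), (F(5, 8), F(0), F(3, 8)), (F(3, 8), F(5, 8), F(0))]
P_prime = P + [(F(5, 8), F(3, 8), F(0))]  # adds mu_4

def lower_probability(measures, U):
    """Smallest probability that any measure in the set assigns to U."""
    return min(sum(mu[i - 1] for i in U) for mu in measures)

def lower_expectation(measures, X):
    """Smallest expectation of the gamble X over the measures in the set."""
    return min(sum(mu[i - 1] * X[i] for i in W) for mu in measures)

subsets = [U for r in range(len(W) + 1) for U in combinations(W, r)]
X = {1: 1, 2: 2, 3: 3}

same_lower_probability = all(
    lower_probability(P, U) == lower_probability(P_prime, U) for U in subsets
)
print(same_lower_probability)
print(lower_expectation(P, X), lower_expectation(P_prime, X))
```

The lower probabilities coincide on every subset, yet the lower expectations of X differ, which is the point of the example.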

5.2.3 Inner and Outer Expectation

Up to now, I have assumed that all gambles X were measurable, that is, for each $x \in \mathcal{V}(X)$, the set {w : X(w) = x} was in the domain of whatever representation of uncertainty was being used. But what if X is not measurable? In this case, it seems reasonable to consider an analogue of inner and outer measures for expectation.

The naive analogue is just to replace μ in (5.2) with the inner measure $\mu_*$ and the outer measure $\mu^*$, respectively. Let $\underline{E}^?_\mu$ and $\overline{E}^?_\mu$ denote these notions of inner and outer expectation, respectively. As the notation suggests, defining inner and outer expectation in this way can lead to intuitively unreasonable answers. In particular, these functions are not monotone, as the following example shows:

Example 5.2.11

Consider a space W = {w₁, w₂} and the trivial algebra $\mathcal{F} = \{\emptyset, W\}$. Let μ be the unique (trivial) probability measure on $\mathcal{F}$. Suppose that X₁, X₂, and X₃ are gambles such that X₁(w₁) = X₁(w₂) = 1, X₃(w₁) = X₃(w₂) = 2, and X₂(w₁) = 1 and X₂(w₂) = 2. Clearly, X₁ ≤ X₂ ≤ X₃. Moreover, it is immediate from the definitions that $\underline{E}^?_\mu(X_1) = 1$ and $\overline{E}^?_\mu(X_3) = 2$. However, $\underline{E}^?_\mu(X_2) = 0$, since $\mu_*(w_1) = \mu_*(w_2) = 0$, and $\overline{E}^?_\mu(X_2) = 3$, since $\mu^*(w_1) = \mu^*(w_2) = 1$. Thus, neither $\underline{E}^?_\mu$ nor $\overline{E}^?_\mu$ is monotone.
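The failure of monotonicity in Example 5.2.11 is easy to reproduce. The sketch below assumes that (5.2) has the form Σ_x x·μ(X = x), and computes inner and outer measures on the trivial algebra by the usual max-over-subsets and min-over-supersets rules; all function names are illustrative.

```python
from fractions import Fraction

W = frozenset({"w1", "w2"})
algebra = [frozenset(), W]                  # the trivial algebra
mu = {frozenset(): Fraction(0), W: Fraction(1)}

def inner(U):
    """Inner measure: largest measure of a measurable set contained in U."""
    return max(mu[A] for A in algebra if A <= U)

def outer(U):
    """Outer measure: smallest measure of a measurable set containing U."""
    return min(mu[A] for A in algebra if U <= A)

def naive_expectation(measure, X):
    """Assumed form of (5.2): sum over values x of x * measure(X = x)."""
    return sum(
        x * measure(frozenset(w for w in X if X[w] == x))
        for x in set(X.values())
    )

X1 = {"w1": 1, "w2": 1}
X2 = {"w1": 1, "w2": 2}
X3 = {"w1": 2, "w2": 2}

# X1 <= X2, yet the naive inner expectation drops from 1 to 0.
print(naive_expectation(inner, X1), naive_expectation(inner, X2))
# X2 <= X3, yet the naive outer expectation drops from 3 to 2.
print(naive_expectation(outer, X2), naive_expectation(outer, X3))
```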

Note that it is important that $\underline{E}^?_\mu$ and $\overline{E}^?_\mu$ are defined using (5.2), rather than (5.1). If (5.1) were used then, for example, $\underline{E}^?_\mu$ and $\overline{E}^?_\mu$ would be monotone. On the other hand, $\underline{E}^?_\mu(X_1)$ would be 0. Indeed, $\underline{E}^?_\mu(Y)$ would be 0 for every gamble Y. This certainly is not particularly reasonable either!

Since an inner measure is a belief function, the discussion of expectation for belief suggests two other ways of defining inner and outer expectation.

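The first uses sets of probabilities. As in Section 2.3, given a probability measure μ defined on an algebra $\mathcal{F}'$ that is a subalgebra of $\mathcal{F}$, let $\mathcal{P}_\mu$ consist of all the extensions of μ to $\mathcal{F}$. Recall from Theorem 2.3.3 that $\mu_*(U) = (\mathcal{P}_\mu)_*(U)$ and $\mu^*(U) = (\mathcal{P}_\mu)^*(U)$ for all $U \in \mathcal{F}$. Define $\underline{E}_\mu = \underline{E}_{\mathcal{P}_\mu}$ and $\overline{E}_\mu = \overline{E}_{\mathcal{P}_\mu}$.
The second approach uses (5.3); define $\underline{E}'_\mu$ and $\overline{E}'_\mu$ by replacing the μ in (5.3) by $\mu_*$ and $\mu^*$, respectively.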

In light of Proposition 5.2.5, the following should come as no surprise:

Proposition 5.2.12

$\underline{E}_\mu = \underline{E}'_\mu$ and $\overline{E}_\mu = \overline{E}'_\mu$.

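Proof Since it is immediate from the definitions that $\mathcal{P}_\mu$ is $\mathcal{P}_{\mathrm{Bel}}$ for $\mathrm{Bel} = \mu_*$, the fact that $\underline{E}_\mu = \underline{E}'_\mu$ is immediate from Proposition 5.2.5. It is immediate from Proposition 5.2.1(d) and the definition that $\overline{E}_\mu(X) = -\underline{E}_\mu(-X)$. It is easy to check that $\overline{E}'_\mu(X) = -\underline{E}'_\mu(-X)$ (Exercise 5.24). Thus, $\overline{E}_\mu = \overline{E}'_\mu$.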

$\underline{E}_\mu$ has much more reasonable properties than $\underline{E}^?_\mu$. (Since $\overline{E}_\mu(X) = -\underline{E}_\mu(-X)$, the rest of the discussion is given in terms of $\underline{E}_\mu$.) Indeed, since $\mu_*$ is a belief function, $\underline{E}_\mu$ is superadditive, positively affinely homogeneous, and monotone, and it satisfies (5.11) and (5.12). But $\underline{E}_\mu$ has an additional property, since it is determined by a probability measure. If μ is a measure on $\mathcal{F}$, then the lower expectation of a gamble Y can be approximated by the lower expectation of random variables measurable with respect to $\mathcal{F}$.

Lemma 5.2.13

If μ is a probability measure on an algebra $\mathcal{F}$, and X is a gamble measurable with respect to an algebra $\mathcal{F}' \supseteq \mathcal{F}$, then $\underline{E}_\mu(X) = \sup\{E_\mu(Y) : Y \le X, \; Y \text{ is measurable with respect to } \mathcal{F}\}$.

Proof See Exercise 5.25.

To get a characterization of $\underline{E}_\mu$, it is necessary to abstract the property characterized in Lemma 5.2.13. Unfortunately, the abstraction is somewhat ugly. Say that a function E on $\mathcal{F}'$-measurable gambles is determined by $\mathcal{F} \subseteq \mathcal{F}'$ if

for all $\mathcal{F}'$-measurable gambles X, $E(X) = \sup\{E(Y) : Y \le X, \; Y \text{ is measurable with respect to } \mathcal{F}\}$,
E is additive for gambles measurable with respect to $\mathcal{F}$ (so that E(X + Y) = E(X) + E(Y) if X and Y are measurable with respect to $\mathcal{F}$).

Theorem 5.2.14

Suppose that E maps gambles measurable with respect to $\mathcal{F}'$ to ℝ and is positively affinely homogeneous, is monotone, and satisfies (5.11) and (5.12), and there is some $\mathcal{F} \subseteq \mathcal{F}'$ such that E is determined by $\mathcal{F}$. Then there is a unique probability measure μ on $\mathcal{F}$ such that $E = \underline{E}_\mu$.

Proof See Exercise 5.26.

5.2.4 Expectation for Possibility Measures and Ranking Functions

Since a possibility measure can be viewed as a plausibility function, expectation for possibility measures can be defined using (5.10). It follows immediately from Poss3 that the expectation E_Poss defined from a possibility measure Poss in this way satisfies the sup property:

$$E_{\mathrm{Poss}}(X_{U \cup V}) = \max\bigl(E_{\mathrm{Poss}}(X_U), E_{\mathrm{Poss}}(X_V)\bigr). \tag{5.13}$$

Proposition 5.2.15

The function E_Poss is positively affinely homogeneous, is monotone, and satisfies (5.12) and (5.13).

Proof See Exercise 5.27.

I do not know if there is a generalization of (5.13) that can be expressed using arbitrary gambles, not just indicator functions. The obvious generalization—E_Poss(X ∨ Y) = max(E_Poss(X), E_Poss(Y))—is false (Exercise 5.28). In any case, (5.13) is the extra property needed to characterize expectation for possibility.
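Both claims can be illustrated on a two-world space. The sketch below assumes that (5.10) has the layered form x₁ + Σ (x_{j+1} − x_j) Poss(X > x_j) and that (5.13) is the max rule on indicator functions; the possibility values 1 and 1/2 and the gambles X and Y are made up for illustration and are not taken from Exercise 5.28.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical possibility measure, given by its values on single worlds.
poss_of_world = {"a": Fraction(1), "b": Fraction(1, 2)}
worlds = sorted(poss_of_world)

def poss(U):
    """Poss(U): the max of the possibilities of the worlds in U (0 for the empty set)."""
    return max((poss_of_world[w] for w in U), default=Fraction(0))

def e_poss(X):
    """Assumed form of (5.10): x_1 + sum_j (x_{j+1} - x_j) * Poss(X > x_j)."""
    values = sorted(set(X.values()))
    total = Fraction(values[0])
    for lo, hi in zip(values, values[1:]):
        total += (hi - lo) * poss([w for w in X if X[w] > lo])
    return total

def indicator(U):
    return {w: (1 if w in U else 0) for w in worlds}

def join(X, Y):
    """The gamble X v Y: the pointwise maximum."""
    return {w: max(X[w], Y[w]) for w in worlds}

subsets = [set(c) for r in range(len(worlds) + 1) for c in combinations(worlds, r)]
sup_property_holds = all(
    e_poss(indicator(U | V)) == max(e_poss(indicator(U)), e_poss(indicator(V)))
    for U in subsets for V in subsets
)

X = {"a": 0, "b": 2}
Y = {"a": 1, "b": 0}
print(sup_property_holds)
print(e_poss(join(X, Y)), max(e_poss(X), e_poss(Y)))
```

With these particular values the max rule holds on all pairs of indicator functions, while for the gambles X and Y the expectation of X ∨ Y exceeds the larger of the two individual expectations.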

Theorem 5.2.16

Suppose that E is a function on gambles that is positively affinely homogeneous, is monotone, and satisfies (5.12) and (5.13). Then there is a (necessarily unique) possibility measure Poss such that E = E_Poss.

Proof See Exercise 5.29.

Note that, although Poss is a plausibility function, and thus satisfies the analogue of (5.11) with ≥ replaced by ≤ and ∨ switched with ∧, there is no need to state this analogue explicitly; it follows from (5.13). Similarly, subadditivity follows from the other properties. (Since a possibility measure is a plausibility function, not a belief function, the corresponding expectation is subadditive rather than superadditive.)

While this definition of E_Poss makes perfect sense and, as Theorem 5.2.16 shows, has an elegant characterization, it is worth noting that there is somewhat of a mismatch between the use of max in relating Poss(U ∪ V), Poss(U), and Poss(V) (i.e., using max for ⊕) and the use of + in defining expectation. Using max instead of + gives a perfectly reasonable definition of expectation for possibility measures (see Exercise 5.30). However, going one step further and using min for × (as in Section 3.7) does not give a very reasonable notion of expectation (Exercise 5.31).

With ranking functions, yet more conceptual issues arise. Since ranking functions can be viewed as giving order-of-magnitude values of uncertainty, it does not seem appropriate to mix real-valued gambles with integer-valued ranking functions. Rather, it seems more reasonable to restrict to nonnegative integer-valued gambles, where the integer again describes the order of magnitude of the value of the gamble. With this interpretation, the standard move of replacing × and + in probability-related expressions by + and min, respectively, in the context of ranking functions seems reasonable. This leads to the following definition of the expectation of a (nonnegative, integer-valued) gamble X with respect to a ranking function κ:

$$E_\kappa(X) = \min_{x \in \mathcal{V}(X)} \bigl(x + \kappa(X = x)\bigr).$$

It is possible to prove analogues of Propositions 5.1.1 and 5.1.2 for E_κ (replacing × and + by + and min, respectively); I omit the details here (see Exercise 5.32). Note that, with this definition, there is no notion of a negative-valued gamble, so the intuition that negative values can "cancel" positive values when computing expectation does not apply.
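As a rough illustration, the substitution described above can be carried out on (5.2) directly. The sketch assumes the resulting definition is E_κ(X) = min over values x of (x + κ(X = x)), with κ(U) the minimum rank of a world in U and κ(∅) = ∞; the ranks and the gamble below are made up.

```python
import math

# Hypothetical ranking function, given by the ranks of single worlds.
kappa_of_world = {"w1": 0, "w2": 1, "w3": 3}

def kappa(U):
    """kappa(U): the minimum rank of a world in U (infinity for the empty set)."""
    return min((kappa_of_world[w] for w in U), default=math.inf)

def e_kappa(X):
    """Assumed definition: min over values x of X of (x + kappa(X = x))."""
    return min(
        x + kappa([w for w in X if X[w] == x])
        for x in set(X.values())
    )

def e_kappa_by_world(X):
    """Equivalent world-by-world form: min over worlds w of (X(w) + kappa(w))."""
    return min(X[w] + kappa_of_world[w] for w in X)

X = {"w1": 2, "w2": 0, "w3": 0}
print(e_kappa(X), e_kappa_by_world(X))
```

Since κ of a set is the minimum of the ranks of its worlds, grouping worlds by the value of X does not change the overall minimum, which is why the two forms agree.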