Notes


Some of the material in Section 6.1 is taken from Reasoning About Knowledge [Fagin, Halpern, Moses, and Vardi 1995]. This book explores the topic of reasoning about knowledge, and its applications to artificial intelligence, distributed systems, and game theory, in much more detail. Frames were introduced by Lemmon and Scott [Lemmon 1977], who called them "world systems"; the term "frame" is due to Segerberg [1968]. (Epistemic frames are called Aumann structures in [Fagin, Halpern, Moses, and Vardi 1995], in honor of Robert Aumann, an economist who introduced epistemic reasoning to the economics/game theory literature.)

The notions of CONS, SDP, and UNIF were formalized in [Fagin and Halpern 1994], where epistemic probability frames were first considered. There is related work in the economics literature by Monderer and Samet [1989]. The common prior assumption and its implications have been well studied in the economics literature; some significant references include [Aumann 1976; Harsanyi 1968; Morris 1995].

The general framework presented here for ascribing knowledge in multi-agent systems was first used in [Halpern and Moses 1990] and [Rosenschein 1985]. Slight variants of this framework were introduced in a number of other papers [Fischer and Immerman 1986; Halpern and Fagin 1989; Parikh and Ramanujam 1985; Rosenschein and Kaelbling 1986]. The presentation here is based on that of [Fagin, Halpern, Moses, and Vardi 1995, Chapter 4], which in turn is based on [Halpern and Fagin 1989]. The reader is encouraged to consult [Fagin, Halpern, Moses, and Vardi 1995] for further references and a much more detailed discussion of this approach, examples of its use, and a discussion of alternative approaches to representing multi-agent systems.

Markov chains (which are essentially what Markovian systems are) are of great practical and theoretical interest and have been studied in depth; a good introduction to the field is the book by Kemeny and Snell [1960]. An extension of Markov chains allows an agent to take an action in each round. The transition probabilities then depend on the action a as well as the states involved and thus have the form τ(g, a, g′). Intuitively, τ(g, a, g′) is the probability of making the transition from g to g′ if action a is taken. With each action is associated a reward or utility. The problem is then to find an optimal policy or strategy, that is, an optimal choice of action at each state (under various definitions of optimality). This model is called a Markov decision process (MDP); see [Puterman 1994] for an introduction to MDPs.
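
To make the MDP model concrete, here is a minimal value-iteration sketch in Python; the toy states, actions, transition probabilities, and rewards are my own illustrations, not anything from the chapter. It computes an optimal policy under discounted total reward, one common notion of optimality:

states = ["g0", "g1"]
actions = ["stay", "go"]

# tau[(g, a)] maps g' to the transition probability tau(g, a, g').
tau = {
    ("g0", "stay"): {"g0": 0.9, "g1": 0.1},
    ("g0", "go"):   {"g0": 0.2, "g1": 0.8},
    ("g1", "stay"): {"g0": 0.1, "g1": 0.9},
    ("g1", "go"):   {"g0": 0.7, "g1": 0.3},
}

# Each (state, action) pair has an associated reward.
reward = {("g0", "stay"): 0.0, ("g0", "go"): 1.0,
          ("g1", "stay"): 2.0, ("g1", "go"): 0.0}

def q(g, a, V, gamma):
    # Expected immediate reward plus discounted value of the next state.
    return reward[(g, a)] + gamma * sum(p * V[g2] for g2, p in tau[(g, a)].items())

def value_iteration(gamma=0.9, eps=1e-6):
    V = {g: 0.0 for g in states}
    while True:
        V_new = {g: max(q(g, a, V, gamma) for a in actions) for g in states}
        if max(abs(V_new[g] - V[g]) for g in states) < eps:
            # The greedy policy with respect to the converged values is optimal.
            return {g: max(actions, key=lambda a: q(g, a, V_new, gamma)) for g in states}
        V = V_new

print(value_iteration())

Each iteration backs up the expected value of the best action at every state; the policy read off from the converged values is optimal for the discounted criterion.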

The approach for getting a probability assignment from a probability on runs was first discussed in [Halpern and Tuttle 1993]. The formal definition of synchronous systems and of systems where agents have perfect recall is taken from [Halpern and Vardi 1989]. For a discussion of perfect recall in the context of game theory, see [Fudenberg and Tirole 1991]. The definition of perfect recall given here is actually not quite the same as that used in game theory; see [Halpern 1997b] for a discussion of the differences.

The Listener-Teller protocol discussed in Section 6.7.1 is based on a nonprobabilistic version of the protocol discussed in [Fagin, Halpern, Moses, and Vardi 1995]. The importance of the role of the protocol in analyzing the second-ace puzzle was already stressed by Shafer [1985]. Morgan et al. [1991] seem to have been the first to observe in print that the standard analysis of the Monty Hall puzzle (e.g., that given by vos Savant [1990]) depends crucially on the assumption that Monty Hall must open a door in the second round. The analysis of the second-ace puzzle and the Monty Hall puzzle presented here is essentially taken from [Halpern 1998b].
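
To see the point concretely, the following small simulation (my own encoding of the two protocols, not taken from any of the papers cited) contrasts the standard protocol, in which Monty must open an unchosen door hiding a goat, with a variant in which he opens a door only when the contestant's first pick is the car. Conditioned on a door actually being opened, switching wins about 2/3 of the time under the first protocol and never under the second:

import random

def trial(must_open):
    doors = [0, 1, 2]
    car = random.choice(doors)
    pick = random.choice(doors)
    if not must_open and pick != car:
        return None  # In the variant protocol, Monty opens no door here.
    # Monty opens an unchosen door that hides a goat.
    opened = random.choice([d for d in doors if d != pick and d != car])
    switched = next(d for d in doors if d != pick and d != opened)
    return switched == car

def win_rate(must_open, n=100_000):
    results = [trial(must_open) for _ in range(n)]
    results = [r for r in results if r is not None]  # condition on seeing a door opened
    return sum(results) / len(results)

print("Monty must open a door: switching wins", win_rate(True))    # about 2/3
print("Monty opens only if the pick is the car:", win_rate(False))  # 0.0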

As I mentioned earlier, the question of when conditioning is appropriate is highly relevant in the statistical areas of selectively reported data and missing data. Originally studied within these contexts [Rubin 1976; Dawid and Dickey 1977], it was later found to also play a fundamental role in the statistical work on survival analysis [Kleinbaum 1999]. Building on previous approaches, Heitjan and Rubin [1991] presented a necessary and sufficient condition for when conditioning in the "naive space" is appropriate. Nowadays this so-called CAR (Coarsening at Random) condition is an established tool in survival analysis. (See [Gill, van der Laan, and Robins 1997; Nielsen 1998] for overviews.) Theorem 6.8.1 is a slight restatement of the CAR condition. See [Grünwald and Halpern 2002] for further discussion of when conditioning is appropriate.
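
Roughly speaking (this is my paraphrase, not a quotation of Theorem 6.8.1), CAR requires that the probability of making a given coarse observation not depend on which of the compatible fine-grained values actually obtains:

\[
\Pr(\text{observe } U \mid X = x) \;=\; \Pr(\text{observe } U \mid X = x')
\qquad \text{for all } x, x' \in U,
\]

where X is the underlying quantity and U ranges over the possible set-valued observations. When this holds, conditioning the prior on the event U in the naive space agrees with conditioning on the observation in the full space of (value, observation) pairs.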

The deterministic polynomial-time algorithm for testing primality is due to Agrawal, Kayal, and Saxena [2002]. The probabilistic primality-testing algorithms were developed by Rabin [1980] and Solovay and Strassen [1977]. The RSA algorithm was developed by Rivest, Shamir, and Adleman [1978]; their article also gives a brief introduction to public-key cryptography.
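
For concreteness, here is a sketch of the Miller-Rabin test, the test associated with Rabin [1980] (the code itself is mine, not from any of these papers). Any single round can wrongly pass a composite with probability at most 1/4, so k independent rounds drive the error probability below 4^(-k):

import random

def is_probable_prime(n, k=20):
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 = 2^r * d with d odd.
    r, d = 0, n - 1
    while d % 2 == 0:
        r += 1
        d //= 2
    for _ in range(k):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True  # n is prime with probability at least 1 - 4**(-k)

print(is_probable_prime(2**61 - 1))  # True: a Mersenne prime
print(is_probable_prime(2**61 + 1))  # False: divisible by 3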

The need to go beyond SDP systems was first discussed in [Fagin and Halpern 1994]. The notion that different choices of probability assignment correspond to betting against adversaries with different information is discussed in [Halpern and Tuttle 1993], where the betting interpretation discussed after Example 6.9.2 is developed.

The coordinated attack problem was introduced by Gray [1978]. It has become part of the folklore of distributed systems; a formal proof of its impossibility (by induction on the number of messages) is given by Yemini and Cohen [1979]. The problem is discussed in detail in [Fagin, Halpern, Moses, and Vardi 1995; Halpern and Moses 1990].

The idea of factoring out all the nonprobabilistic choices in a system and viewing them as being under the control of some adversary is standard in the distributed computing and theoretical computer science literature (see, e.g., [Rabin 1982; Vardi 1985]); it was first formalized in the context of reasoning about knowledge and probability by Fischer and Zuck [1988].



