… pour juger de ce que l'on doit faire pour obtenir un bien ou pour éviter un mal,
il ne faut pas seulement considérer le bien & le mal en soi,
mais aussi la probabilité qu'il arrive ou n'arrive pas;
& regarder géometriquement la proportion que toutes ces choses ont ensembles …
- Antoine Arnauld & Pierre Nicole's (1662, IV, 16) La logique, ou l'art de penser in the original French
… to judge what one ought to do to obtain a good or avoid an evil,
one must not only consider the good and the evil in itself,
but also the probability that it will or will not happen;
and view geometrically the proportion that all these things have together …
- Jeffrey's (1981, p. 473) translation
Decision Theory
Decision theory is concerned with the reasoning underlying an agent's choices
Let S denote a set of states s0, s1, etc
Let Φ denote a set of actions φ1, φ2, etc
Let O denote a set of outcomes (or act-state pairs): o11 for the act-state pair φ1-s1, o12 for the act-state pair φ1-s2, etc
In a decision problem, agent X is traditionally confronted with n alternative courses of action, where n ∈ ℕ+
X must determine which of these n alternative courses of action is the most appropriate
Suppose that there are 3 alternative courses of action: φ1, φ2, and φ3
Suppose further that each action, relative to the initial state s0, leads to one and only one successor state
P(s1|φ1) = P(o11|φ1) = 1
P(s2|φ2) = P(o22|φ2) = 1
P(s3|φ3) = P(o33|φ3) = 1
The decision problem reduces to a deterministic optimization problem (of the kind often handled with techniques such as linear programming), and agent X will be making decisions under conditions of certainty
Suppose instead that each action is associated with multiple possible outcomes:
φ1 is associated with the outcomes o11 (φ1-s1) and o12 (φ1-s2)
φ2 is associated with the outcomes o23 (φ2-s3) and o24 (φ2-s4)
φ3 is associated with the outcomes o35 (φ3-s5), o36 (φ3-s6), and o37 (φ3-s7)
However, suppose in addition that the conditional probabilities of these outcomes (e.g. P(o11|φ1), P(o12|φ1), P(o23|φ2), etc) are neither known nor estimable
P(o11|φ1) = NA
P(o12|φ1) = NA
P(o23|φ2) = NA
P(o24|φ2) = NA
⋮ ⋮
Agent X will be making decisions under conditions of uncertainty
Conversely, suppose that these conditional probabilities are either known or can be estimated
P(o11|φ1) = p, where p ∈ (0, 1)
P(o12|φ1) = 1 - p
P(o23|φ2) = q, where q ∈ (0, 1)
P(o24|φ2) = 1 - q
⋮ ⋮
Here, agent X will be making decisions under conditions of risk
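The three conditions above (certainty, uncertainty, and risk) can be told apart mechanically once the conditional probabilities are written down. The following is a minimal Python sketch, not a standard API: the dictionary layout, the function name classify, and the sample values 0.3 and 0.6 standing in for p and q are all assumptions made for illustration, with None playing the role of NA.

```python
from typing import Dict, Optional

# Hypothetical representation: each action maps its possible outcomes to
# P(outcome | action), or to None when that probability is neither known
# nor estimable (the "NA" case).
Problem = Dict[str, Dict[str, Optional[float]]]


def classify(problem: Problem) -> str:
    """Return 'certainty', 'uncertainty', or 'risk' for a decision problem."""
    probs = [p for outcomes in problem.values() for p in outcomes.values()]
    if any(p is None for p in probs):
        return "uncertainty"
    # Each action must spread a total probability of 1 over its outcomes.
    for action, outcomes in problem.items():
        total = sum(outcomes.values())
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"probabilities for {action} sum to {total}, not 1")
    # Certainty: every action leads to one and only one outcome.
    if all(len(outcomes) == 1 for outcomes in problem.values()):
        return "certainty"
    return "risk"


certain = {"phi1": {"o11": 1.0}, "phi2": {"o22": 1.0}, "phi3": {"o33": 1.0}}
uncertain = {"phi1": {"o11": None, "o12": None},
             "phi2": {"o23": None, "o24": None}}
risky = {"phi1": {"o11": 0.3, "o12": 0.7},   # p = 0.3 (assumed value)
         "phi2": {"o23": 0.6, "o24": 0.4}}   # q = 0.6 (assumed value)

for name, problem in [("certain", certain), ("uncertain", uncertain), ("risky", risky)]:
    print(name, "->", classify(problem))
```

The sketch treats a single missing probability as enough to place the whole problem under uncertainty, which is one possible reading of the definitions above.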
Decision theory is designed to address individual decision-making under conditions of risk
Decision theory is as much a theory about the agent's beliefs and preferences, and how they cohere, as it is a theory of choice
The 2 fundamental components of decision theory are:
COMPONENT 1: Beliefs
COMPONENT 2: Preferences
Probability values are used to represent degrees of belief, credence, or confidence
Probability values are in turn constrained by the Kolmogorov axioms
Kolmogorov Axioms
AXIOM 1
P(Ω) = 1, where P denotes the probability measure and Ω denotes the sample space
AXIOM 2
P(A) ≥ 0, where A denotes an event (a subset of Ω)
AXIOM 3
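P(A ∪ B) = P(A) + P(B), where A and B denote mutually exclusive events (A ∩ B = ∅)
NOTE: Where A and B are not mutually exclusive, the general addition rule follows:
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
Where A and B are mutually exclusive, P(A ∩ B) = P(∅) = 0 and the rule reduces to AXIOM 3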
RULE 1 (Monotonicity)
If A ⊆ B, then P(A) ≤ P(B), where A and B denote events
Alternatively:
If B ⊇ A, then P(B) ≥ P(A)
RULE 2 (Empty set)
P(∅) = 0
RULE 3 (Complement)
P(Ā) = P(Ω) - P(A) = 1 - P(A)
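The axioms and rules above can be checked exhaustively on a small finite sample space. The following Python sketch assumes a fair six-sided die as the example (the die, the names omega, weights, events, and check_kolmogorov are illustrative and not part of the notes). Because P is built by summing point weights, the checks pass by construction; the aim is only to show what each statement asserts.

```python
from fractions import Fraction
from itertools import chain, combinations

# Assumed example: a fair six-sided die.
omega = frozenset(range(1, 7))
weights = {w: Fraction(1, 6) for w in omega}


def P(event):
    """Probability of an event, i.e. a subset of the sample space omega."""
    return sum((weights[w] for w in event), Fraction(0))


def events(space):
    """All subsets of a finite sample space."""
    items = list(space)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def check_kolmogorov(space):
    """Assert the axioms and derived rules for every event (pair) in space."""
    all_events = events(space)
    assert P(space) == 1                        # Axiom 1
    assert all(P(a) >= 0 for a in all_events)   # Axiom 2
    assert P(frozenset()) == 0                  # Rule 2 (empty set)
    for a in all_events:
        assert P(space - a) == 1 - P(a)         # Rule 3 (complement)
        for b in all_events:
            # Additivity (Axiom 3) in its general inclusion-exclusion form
            assert P(a | b) == P(a) + P(b) - P(a & b)
            if a <= b:
                assert P(a) <= P(b)             # Rule 1 (monotonicity)
    return True


print(check_kolmogorov(omega))
print(P({1, 2}))
```

Exact fractions are used instead of floats so that the equalities can be tested without rounding tolerance.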
apples ≻ lemons: Apples are better than lemons relative to the agent's preferences (or the agent prefers apples to lemons)
apples ∼ lemons: Apples are as good as lemons relative to the agent's preferences (or the agent is indifferent between apples and lemons)
apples ≺ lemons: Lemons are better than apples relative to the agent's preferences (or the agent prefers lemons to apples)
Utility values are used to represent degrees of preference
Utility values are in turn constrained by the von Neumann-Morgenstern axioms
The von Neumann-Morgenstern axioms allow us to assign utility values in a manner that is consistent with the agent's preferences
Von Neumann-Morgenstern Axioms
AXIOM 1 (Completeness)
Where A and B denote distinct outcomes, one of the following preference order relations must hold:
RELATION 1: A ≻ B (the agent prefers A to B);
RELATION 2: A ∼ B (the agent is indifferent between A and B); or
RELATION 3: A ≺ B (the agent prefers B to A)
Alternatively:
For every A and B, either A ≽ B or A ≼ B
AXIOM 2 (Transitivity)
For every A, B, and C, if A ≽ B and B ≽ C, then A ≽ C
AXIOM 3 (Independence of Irrelevant Alternatives)
Suppose that A, B, and C are 3 lotteries
Let t be the probability with which A (or B) is mixed with the 3rd lottery C: t ∈ (0, 1]
A ≽ B if and only if tA + (1 - t)C ≽ tB + (1 - t)C
∴ The preference order relation between A and B (viz. A ≽ B) holds, independently of the presence of C
AXIOM 4 (Continuity)
Relative to the preference order relation A ≽ B ≽ C:
There exists a probability p ∈ [0, 1] such that B ∼ pA + (1 - p)C (the agent is indifferent between B and the lottery pA + (1 - p)C)
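The four axioms can be illustrated on a small example. The following Python sketch assumes that the agent's preferences over 3 outcomes A, B, and C are induced by the hypothetical utility values shown, and that lotteries are compared by expected utility; the numbers and all function names are illustrative assumptions, not part of the axioms themselves.

```python
from itertools import product

# Hypothetical utilities for three outcomes; the numbers are illustrative only.
utility = {"A": 10.0, "B": 4.0, "C": 0.0}


def weakly_prefers(x, y):
    """x ≽ y, here induced by the assumed utility values."""
    return utility[x] >= utility[y]


def is_complete(items, rel):
    """Axiom 1: for every x and y, either x ≽ y or y ≽ x."""
    return all(rel(x, y) or rel(y, x) for x, y in product(items, repeat=2))


def is_transitive(items, rel):
    """Axiom 2: if x ≽ y and y ≽ z, then x ≽ z."""
    return all(
        rel(x, z)
        for x, y, z in product(items, repeat=3)
        if rel(x, y) and rel(y, z)
    )


def expected_utility(lottery):
    """Expected utility of a lottery given as {outcome: probability}."""
    return sum(p * utility[o] for o, p in lottery.items())


def mix(t, x, y):
    """The lottery t*x + (1 - t)*y over two basic outcomes x and y."""
    if x == y:
        return {x: 1.0}
    return {x: t, y: 1.0 - t}


def independence_holds(a, b, c, ts=(0.25, 0.5, 0.75, 1.0)):
    """Axiom 3: a ≽ b iff t*a + (1 - t)*c ≽ t*b + (1 - t)*c, for sampled t."""
    return all(
        weakly_prefers(a, b)
        == (expected_utility(mix(t, a, c)) >= expected_utility(mix(t, b, c)))
        for t in ts
    )


def continuity_probability(a, b, c):
    """Axiom 4: the p with u(b) = p*u(a) + (1 - p)*u(c), assuming u(a) > u(c)."""
    return (utility[b] - utility[c]) / (utility[a] - utility[c])


outcomes = list(utility)
print(is_complete(outcomes, weakly_prefers))
print(is_transitive(outcomes, weakly_prefers))
print(independence_holds("A", "B", "C"))
p = continuity_probability("A", "B", "C")
print(p, expected_utility(mix(p, "A", "C")))
```

Any preference relation induced by real-valued utilities is automatically complete and transitive, so the first two checks are informative only for relations specified directly, e.g. as an explicit set of pairs passed in place of weakly_prefers.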