When one particular species of event has always … been conjoined with another,
we make no longer any scruple of foretelling one upon the appearance of the other,
and of employing that reasoning, which can alone assure us of any matter of fact or existence.
We then call the one object, Cause; the other, Effect.
David Hume (1748, §7), An Enquiry Concerning Human Understanding
Pearl Causal Model 


Graph Theory 


Relative to GRAPH 1:
The parents of X are A and B.
The ancestors of X are D, C, A, and B.
The children of X are Y and Z.
The descendants of X are Y, Z, F, G, and H.
The indegree of X is 2.
The outdegree of X is 2.
The degree of X is 4.
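The terms above can be checked mechanically. The sketch below is only an illustration: GRAPH 1 itself is not reproduced in these notes, so the edge list `EDGES` is a hypothetical graph chosen to agree with the facts listed for X, and the helper names `build_maps` and `reach` are likewise made up for this example.

```python
def build_maps(edges):
    """Build parent and child lookup tables from directed (tail, head) pairs."""
    parents, children = {}, {}
    for tail, head in edges:
        children.setdefault(tail, set()).add(head)
        parents.setdefault(head, set()).add(tail)
    return parents, children


def reach(node, step):
    """Return every node reachable from `node` by repeatedly following `step`."""
    found, stack = set(), list(step.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(step.get(current, ()))
    return found


# Hypothetical edges, assumed only so that the listed facts about X hold.
EDGES = [("D", "A"), ("C", "B"), ("A", "X"), ("B", "X"),
         ("X", "Y"), ("X", "Z"), ("Y", "F"), ("Z", "G"), ("G", "H")]

parents, children = build_maps(EDGES)
ancestors_of_x = reach("X", parents)      # follow edges backwards
descendants_of_x = reach("X", children)   # follow edges forwards
indegree_x = len(parents["X"])
outdegree_x = len(children["X"])
degree_x = indegree_x + outdegree_x
```

Ancestors are found by following edges backwards from X and descendants by following them forwards, so both come from the same traversal with a different lookup table.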


D-separation


The rules for d-separation (Pearl, 1988):

RULE 1 for d-separation ('d' stands for 'directional'): X and Y are d-connected if there is an unblocked path between X and Y.
An unblocked path is a path that can be traced without meeting any colliders.

In EXAMPLE 1:
There is a collider at T.
The path XRST is unblocked.
∴ X and R, X and S, X and T, R and S, R and T, and S and T are d-connected.
The path TUVY is also unblocked.
∴ T and U, T and V, T and Y, U and V, U and Y, and V and Y are also d-connected.
However, X and Y, X and V, S and U, etc. are d-separated: no path can be traced between them without meeting the collider at T.

RULE 2 for d-separation: X and Y are d-connected, conditioned on a set Z of nodes, if there is a collider-free path between X and Y that does not traverse any members of Z.

In EXAMPLE 2: Let Z be the set { R, V }.
According to RULE 2:
X and S are d-separated: the path XRS is blocked by Z.
U and Y are d-separated: the path UVY is blocked by Z.
S and U are d-separated: the path STU is not collider-free.
Although T is not in Z, the path STU is still blocked since T is a collider (RULE 1).

RULE 3 for d-separation: If a collider is a member of the conditioning set Z, or the collider has a descendant in Z, then it no longer blocks any path that passes through it.

In EXAMPLE 3: Let Z be the set { T }.
According to RULE 3:
X and Y, X and V, S and U, etc. are now d-connected, since the collider T is a member of the conditioning set Z (compare with EXAMPLE 1).

In EXAMPLE 4: Let Z be the set { S_{1}, S_{2} }.
According to RULE 3:
The collider at U has a descendant S_{1} in the conditioning set Z, so the link at U is unblocked.
The collider at W has a descendant S_{2} in the conditioning set Z, so the link at W is unblocked.
∴ We now have a collider-free path between X and Y (viz. XUVWY) that does not traverse any members of the conditioning set Z.

There are 3 blocking criteria:
A d-separated (or blocked) path does not transmit association.
A d-connected (or unblocked) path may transmit association.
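The three rules can be tested by program. The sketch below does not trace paths rule by rule; it uses the equivalent ancestral moral graph test (keep only the queried nodes, the conditioning set, and their ancestors; join parents that share a child; drop edge directions; delete the conditioning set; then check whether the two nodes are still connected). The figure for EXAMPLE 1 is not reproduced here, so the edge directions in `EXAMPLE_EDGES` are an assumption that only matches the stated facts (a single collider at T), and the function names are illustrative.

```python
from itertools import combinations


def ancestors_of(nodes, parents):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def d_separated(edges, x, y, z=()):
    """True if x and y are d-separated given the set z (x, y not in z)."""
    z = set(z)
    parents = {}
    for tail, head in edges:
        parents.setdefault(head, set()).add(tail)

    # Step 1: restrict to the ancestral set of x, y, and z.
    keep = ancestors_of({x, y} | z, parents)

    # Step 2: moralize (marry co-parents) and drop directions.
    neighbours = {node: set() for node in keep}
    for child in keep:
        child_parents = parents.get(child, set())
        for parent in child_parents:
            neighbours[parent].add(child)
            neighbours[child].add(parent)
        for p, q in combinations(child_parents, 2):
            neighbours[p].add(q)
            neighbours[q].add(p)

    # Step 3: remove z and look for any remaining path from x to y.
    seen, stack = set(), [x]
    while stack:
        node = stack.pop()
        if node == y:
            return False
        if node in seen or node in z:
            continue
        seen.add(node)
        stack.extend(neighbours[node] - z)
    return True


# Assumed orientation: X -> R -> S -> T <- U <- V -> Y (collider only at T).
EXAMPLE_EDGES = [("X", "R"), ("R", "S"), ("S", "T"),
                 ("U", "T"), ("V", "U"), ("V", "Y")]
```

Under that assumed orientation, `d_separated(EXAMPLE_EDGES, "X", "Y")` holds with an empty conditioning set (RULE 1), while conditioning on `{"T"}` connects X and Y (RULE 3), and conditioning on `{"R", "V"}` separates X from S and U from Y (RULE 2).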

DAGs & Bayesian Networks 


The Pearl Causal Model takes a set of data and produces a directed acyclic graph (DAG).
A DAG is:
A DAG is also called a Bayesian network (BN).
The DAG represents the causal structure of the system.
Conditional dependence relations between variables are represented as edges between vertices in the graph.
The main idea behind the Pearl Causal Model is to find a graph or set of graphs that best explains the data.

There are two primary METHODS for inferring BNs from the data:
METHOD 1: Assign scores to graphs and search over the set of possible graphs, attempting to maximize a particular scoring function. An initial graph is generated and the search space is explored by altering this graph.
METHOD 2: Start with an undirected, fully connected graph and use repeated conditional independence tests to remove and orient edges in the graph. After the edges are removed, the remaining ones are directed from cause to effect.

Dynamic Bayesian networks (DBNs) use BNs to show how variables influence each other across time (Friedman et al., 1998; Murphy, 2002).
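The edge-removal phase of METHOD 2 can be sketched in a few lines. This is a minimal illustration in the spirit of constraint-based algorithms such as PC, not a full procedure: it stops before edge orientation, and instead of a statistical test on data it takes an `independent(a, b, cond)` oracle. The oracle `chain_oracle` below is hypothetical and simply encodes the single independence implied by an assumed chain A → B → C.

```python
from itertools import combinations


def skeleton(variables, independent):
    """Start fully connected; drop an edge once some conditioning set separates its ends."""
    adjacent = {v: set(variables) - {v} for v in variables}
    size = 0  # size of the conditioning sets tried in this round
    while any(len(adjacent[v]) - 1 >= size for v in variables):
        for a in variables:
            for b in list(adjacent[a]):
                # Conditioning sets are drawn from a's other neighbours.
                for cond in combinations(sorted(adjacent[a] - {b}), size):
                    if independent(a, b, frozenset(cond)):
                        adjacent[a].discard(b)
                        adjacent[b].discard(a)
                        break
        size += 1
    return {frozenset((a, b)) for a in adjacent for b in adjacent[a]}


def chain_oracle(a, b, cond):
    """Hypothetical oracle for the chain A -> B -> C: only A and C are independent, given B."""
    return {a, b} == {"A", "C"} and "B" in cond


edges = skeleton(["A", "B", "C"], chain_oracle)
```

With real data, the oracle would be replaced by a conditional independence test, and the surviving undirected edges would then be oriented as METHOD 2 describes.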


Axioms 


AXIOM 1: Causal Markov Condition (CMC)
A directed acyclic graph (DAG) G over V and a probability distribution P(V) satisfy the CMC iff for every W in V, W is independent of its non-effects, given its parents.
(Alternatively) Given any disjoint sets A, B, and C of variables, if A is d-separated from B conditional on C, then A is statistically independent of B given C. See Pearl (1988) for proof.
Formally: W ⫫ V \ (Descendants(W) ∪ Parents(W)) | Parents(W)
In EXAMPLE 5: The CMC entails the following conditional independence relations:
A ⫫ B
D ⫫ { A, B } | C
AXIOM 2: Causal Faithfulness Condition (CFC)
Given any graph, the CMC (or AXIOM 1) determines a set of independence relations.
However, a probability distribution P on a graph G that satisfies the CMC may include other independence relations besides those entailed by the CMC.
Recall the birth control pill EXAMPLE (Hesslow, 1976; Cartwright, 1989): BCP (birth control pills) and T (thrombosis) might be independent in the probability distribution satisfying the CMC for this graph, even though the graph does not entail their independence.
∴ If all and only those conditional independence relations true in P are entailed by the CMC applied to G, then we say that P and G are faithful to one another.
(Alternatively) Given any disjoint sets A, B, and C of variables, if A is statistically independent of B given C, then A is d-separated from B conditional on C.
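A small numerical sketch may make the failure of faithfulness concrete. It assumes the usual reading of the birth control pill example, in which BCP affects T both directly and through pregnancy (BCP → Preg → T and BCP → T). All probabilities below are invented for illustration and are chosen so that the two routes cancel exactly.

```python
from fractions import Fraction as F

# Hypothetical parameters: P(Preg = 1 | BCP), indexed by the value of BCP.
p_preg = {1: F(1, 10), 0: F(1, 2)}

# Hypothetical parameters: P(T = 1 | BCP, Preg), indexed by (BCP, Preg).
# The pill raises the risk within each pregnancy stratum.
p_thrombosis = {(0, 0): F(1, 10), (0, 1): F(3, 10),
                (1, 0): F(8, 45), (1, 1): F(2, 5)}


def p_t_given_bcp(bcp):
    """P(T = 1 | BCP = bcp), obtained by summing over pregnancy."""
    total = F(0)
    for preg in (0, 1):
        weight = p_preg[bcp] if preg == 1 else 1 - p_preg[bcp]
        total += p_thrombosis[(bcp, preg)] * weight
    return total


risk_without_pill = p_t_given_bcp(0)
risk_with_pill = p_t_given_bcp(1)
```

Here the pill raises the risk of thrombosis in both strata, yet it lowers the chance of pregnancy just enough that the overall risk is the same with or without it. BCP and T are therefore independent in P even though they are d-connected in the graph, so this P is not faithful to G.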
AXIOM 3: Causal Sufficiency
A set V of variables is causally sufficient for a population iff in the population every common cause of any 2 or more variables in V is in V.
According to AXIOM 3, there are no hidden common causes.

Intervention & the Do-calculus


EXAMPLE: Here is the DAG for the relationship between temperature, ice cream sales, and crime rates:
X denotes ice cream sales.
Y denotes crime rates.
Z denotes the temperature.
U_{X}, U_{Y}, and U_{Z} denote the error terms for X, Y, and Z (i.e. the effects of exogenous variables not included in the causal model).
Suppose that we intervene on the value of ice cream sales (X).
We might fix X to a low value (e.g. by shutting down all ice cream shops).
When we intervene on X:
According to the do-calculus:
Given 2 disjoint sets of variables X and Y, the causal effect of X on Y is a function from X to the space of probability distributions on Y.
The causal effect of X on Y is denoted as 'P(y | do(x))' in the do-calculus.
For each realization x of X, P(y | do(x)) gives the probability of Y = y induced by deleting from the model all equations corresponding to variables in X and substituting X = x in the remaining equations.
The graph corresponding to the reduced set of equations is the subgraph from which all directed edges entering X have been pruned by surgery.

Where x_{1}, …, x_{n} denote variables in a BN:
P(x_{1}, …, x_{n}) = ∏_{i} P(x_{i} | Parents(x_{i}))

According to Cochran's EXAMPLE (Wainer, 1989): Soil fumigants (X) are being used to increase oat crop yields (Y) by controlling the eelworm population (Z).
X denotes the soil fumigants.
Y denotes the oat crop yields.
Z denotes the eelworm population: Z_{0} denotes last year's eelworm population, Z_{1} denotes the eelworm population before treatment, Z_{2} denotes the eelworm population after treatment, and Z_{3} denotes the eelworm population at the end of the season.
B denotes the population of birds and other predators.

From P(x_{1}, …, x_{n}) = ∏_{i} P(x_{i} | Parents(x_{i})), we get:
P(z_{0}, x, z_{1}, b, z_{2}, z_{3}, y) = P(z_{0}) P(x | z_{0}) P(z_{1} | z_{0}) P(b | z_{0}) × P(z_{2} | x, z_{1}) P(z_{3} | z_{2}, b) P(y | x, z_{2}, z_{3})

With the intervention do(X = x′), the factor P(x | z_{0}) is removed and x′ is substituted for x in the remaining factors:
P(z_{0}, z_{1}, b, z_{2}, z_{3}, y | do(X = x′)) = P(z_{0}) P(z_{1} | z_{0}) P(b | z_{0}) × P(z_{2} | x′, z_{1}) P(z_{3} | z_{2}, b) P(y | x′, z_{2}, z_{3})
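The difference between conditioning on X and intervening on X can be computed directly from the factorization. The sketch below returns to the ice cream example and assumes the usual confounding structure Z → X and Z → Y with no edge from X to Y. All variables are treated as binary, and every number is invented for illustration. The interventional distribution is obtained by dropping the factor for X, exactly as in the truncated product above.

```python
from fractions import Fraction as F

# Hypothetical parameters (1 = hot / high sales / high crime).
p_z1 = F(1, 2)                             # P(Z = 1)
p_x1_given_z = {1: F(9, 10), 0: F(1, 10)}  # P(X = 1 | Z = z)
p_y1_given_z = {1: F(3, 5), 0: F(1, 5)}    # P(Y = 1 | Z = z)


def bernoulli(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1 - p_one


def joint(z, x, y):
    """Observational joint: P(z) P(x | z) P(y | z)."""
    return (bernoulli(p_z1, z)
            * bernoulli(p_x1_given_z[z], x)
            * bernoulli(p_y1_given_z[z], y))


def p_y_given_x(y, x):
    """Observational P(Y = y | X = x), by conditioning on X."""
    numerator = sum(joint(z, x, y) for z in (0, 1))
    denominator = sum(joint(z, x, other) for z in (0, 1) for other in (0, 1))
    return numerator / denominator


def p_y_do_x(y, x):
    """Interventional P(Y = y | do(X = x)): the factor P(x | z) is dropped."""
    # x no longer appears because, in the assumed graph, Y has no X parent.
    return sum(bernoulli(p_z1, z) * bernoulli(p_y1_given_z[z], y) for z in (0, 1))
```

With these numbers, observing high sales raises the probability of high crime from 6/25 to 14/25, whereas setting sales by intervention leaves it at 2/5 either way. The association flows entirely through temperature.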