Causality
Stefan Segers, KU Leuven
Problem: Creating a Causal Bayesian Network
from an empirical Data Set
Method:
IC* Algorithm: determine the relationship
between nodes:
1. Potential Cause
2. Genuine Cause
3. Spurious Association
4. Undefined
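As usually described, IC* assigns these labels only after it has recovered an undirected skeleton and its V-structures from conditional-independence tests. Below is a minimal Python sketch of those first two steps. It assumes a perfect independence oracle in place of statistical tests on data. The function names `skeleton_and_colliders` and `oracle`, and the three-node structure Agility -> Tennis <- Stamina, are illustrative assumptions and are not taken from the poster or from BNT.

```python
from itertools import combinations


def skeleton_and_colliders(nodes, independent):
    """Steps 1 and 2 of an IC-style search (sketch).

    independent(a, b, cond) must return True when a and b are
    independent given the frozenset of variables cond.
    """
    adjacent = {n: set() for n in nodes}
    sepset = {}
    # Step 1: link a and b unless some conditioning set separates them.
    for a, b in combinations(nodes, 2):
        others = [n for n in nodes if n not in (a, b)]
        separator = None
        for size in range(len(others) + 1):
            for cond in combinations(others, size):
                if independent(a, b, frozenset(cond)):
                    separator = frozenset(cond)
                    break
            if separator is not None:
                break
        if separator is None:
            adjacent[a].add(b)
            adjacent[b].add(a)
        else:
            sepset[frozenset((a, b))] = separator
    # Step 2: for non-adjacent a, b with a common neighbour c that is
    # not in their separating set, orient a -> c <- b (a V-structure).
    colliders = set()
    for c in nodes:
        for a, b in combinations(sorted(adjacent[c]), 2):
            if b not in adjacent[a] and c not in sepset[frozenset((a, b))]:
                colliders.add((a, c, b))
    return adjacent, colliders


def oracle(a, b, cond):
    # Hypothetical ground truth: Agility -> Tennis <- Stamina.
    # Agility and Stamina are independent unless Tennis is conditioned on.
    return {a, b} == {"Agility", "Stamina"} and "Tennis" not in cond


nodes = ["Agility", "Stamina", "Tennis"]
adjacent, colliders = skeleton_and_colliders(nodes, oracle)
print(colliders)  # {('Agility', 'Tennis', 'Stamina')}
```

With real data the oracle would be replaced by a statistical test, and the outcome then depends on the chosen significance level.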
Tools:
BNT: Bayes Net Toolbox for Matlab
Results/Conclusions:
• Causal relationships can only be found if
V-structures are naturally present in the
data; one data set can often define
multiple correct causal Bayesian networks
• Correctness of the output depends on
hidden statistical parameters and on the
discretisation of continuous variables
• Determinism (functional relationships) in
the data needs to be accounted for, or causal
relationships will not be found
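The first point can be illustrated with the standard criterion that two DAGs encode the same independencies exactly when they share the same skeleton and the same V-structures. The Python sketch below checks that criterion. The function names are hypothetical, and it assumes both graphs are given as lists of (parent, child) edges over the same set of nodes.

```python
from itertools import combinations


def skeleton(edges):
    """Undirected version of the graph: a set of unordered node pairs."""
    return {frozenset(edge) for edge in edges}


def v_structures(edges):
    """All triples (a, child, b) with a -> child <- b and a, b not linked."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, ps in parents.items():
        for a, b in combinations(sorted(ps), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(edges1, edges2):
    """Same skeleton and same V-structures: indistinguishable from data."""
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))


chain = [("A", "B"), ("B", "C")]            # A -> B -> C
reversed_chain = [("C", "B"), ("B", "A")]   # A <- B <- C
fork = [("B", "A"), ("B", "C")]             # A <- B -> C
collider = [("A", "B"), ("C", "B")]         # A -> B <- C

print(markov_equivalent(chain, reversed_chain))  # True
print(markov_equivalent(chain, fork))            # True
print(markov_equivalent(chain, collider))        # False
```

The chain, the reversed chain and the fork contain no V-structure, so independence tests alone cannot choose between them. Only the collider fixes the direction of its arrows.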
References: Judea Pearl, Kevin Murphy, Richard Scheines
Example:
[Figure: networks over the nodes Agility, Stamina, Tennis, Marathon and Wealth. Panels: "Real Network" and, after sampling, "Network, output IC*".]
Concept: Latent Variables
Does the decline in the number of pirates in the world
cause global warming?
Incorrect: there is only a correlation between the two.
The causal relationship runs through a latent (unobserved)
variable that causes both (“Time”).
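A small simulation can make the pirate example concrete. The Python sketch below uses entirely made-up coefficients and noise levels (assumptions for illustration, not real data) to generate two quantities that both depend on time and not on each other. It then compares their raw correlation with the correlation left once the linear effect of time is removed from both.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def residuals(ys, xs):
    """Residuals of ys after a least-squares straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - my - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable
t_values = [rng.uniform(0.0, 1.0) for _ in range(5000)]
# Hypothetical mechanisms: time lowers pirates and raises temperature.
pirates = [-2.0 * t + rng.gauss(0.0, 0.3) for t in t_values]
warming = [1.5 * t + rng.gauss(0.0, 0.3) for t in t_values]

marginal = pearson(pirates, warming)
partial = pearson(residuals(pirates, t_values), residuals(warming, t_values))
print(f"correlation ignoring time:     {marginal:.2f}")
print(f"correlation controlling for time: {partial:.2f}")
```

Under these assumptions the raw correlation is strongly negative, while the correlation after controlling for time is close to zero. When the common cause is unobserved, this check is not available, which is why IC* reports such links as spurious associations or leaves them undefined.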
Real Data example: American Census
[Figure: network over the variables Southerner, Race, Wage, Gender, Union Member, Occupation, Sector and Marital Status]
(sample of 534 persons, IC* output)