Causality in complex networks

Post on 15-Jul-2015

Causality in Complex Networks

Sebastian Benthall, D-Lab

Why causality?

Causal relations are scientifically interesting because when exposed, they are:

● a reliable mechanism

● that supports intervention

By understanding causality, we can predict and control. This is the interest behind causal knowledge.

Why this talk?

● A “working group” talk - I don’t claim to be an expert

● I’m working through research problems that are challenging to me

● I’m presenting this work in progress both to inform and solicit feedback

● You are welcome to do the same with your work!

Potential outcomes

The Rubin Causal Model (RCM) or potential outcomes framework is ascendant.

Potential outcomes

“The causal effect of a treatment on a single individual or unit of observation is the comparison (e.g., difference) between the value of the outcome if the unit is treated and the value of the outcome if the unit is not treated.” (Angrist, Imbens, and Rubin, 1996)

Potential outcomes

Unit     Controlled outcome Yi(0)    Treated outcome Yi(1)    Causal effect of treatment Yi(1) - Yi(0)
Alice    20                          30                       10
Bob      15                          24                       9
Cathy    10                          17                       7
David    22                          34                       12

Average effect: (10+9+7+12)/4 = 9.5
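The arithmetic in the table can be checked with a short sketch. The dictionary layout and variable names below are illustrative choices, not part of the original slides; the numbers are copied from the table.

```python
# Potential outcomes (Yi(0), Yi(1)) for each unit, copied from the table above.
potential_outcomes = {
    "Alice": (20, 30),
    "Bob": (15, 24),
    "Cathy": (10, 17),
    "David": (22, 34),
}

# Unit-level causal effect: Yi(1) - Yi(0).
unit_effects = {name: y1 - y0 for name, (y0, y1) in potential_outcomes.items()}

# Average effect over all four units.
average_effect = sum(unit_effects.values()) / len(unit_effects)

print(unit_effects)    # {'Alice': 10, 'Bob': 9, 'Cathy': 7, 'David': 12}
print(average_effect)  # 9.5
```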

Potential outcomes

You only ever observe some of these values: for each unit you see either Yi(0) or Yi(1), never both. This has been called the Fundamental Problem of Causal Inference.
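A small sketch of what this means in practice, using the four units from the table. The assignment scheme (exactly two of the four units treated, every such assignment equally likely) is an assumption made for illustration and does not come from the slides. Any single assignment reveals only half of the table, so its difference in means can be far from 9.5; averaged over all equally likely assignments, the difference in means recovers the average effect.

```python
from itertools import combinations

# (Yi(0), Yi(1)) per unit, from the table above. In a real study only one of
# the two values is ever observed for each unit.
potential_outcomes = {
    "Alice": (20, 30),
    "Bob": (15, 24),
    "Cathy": (10, 17),
    "David": (22, 34),
}


def observed_difference(treated):
    """Difference in mean observed outcomes for a given set of treated units."""
    treated_y = [y1 for name, (y0, y1) in potential_outcomes.items() if name in treated]
    control_y = [y0 for name, (y0, y1) in potential_outcomes.items() if name not in treated]
    return sum(treated_y) / len(treated_y) - sum(control_y) / len(control_y)


# One particular assignment: only Alice and Cathy are treated.
one_estimate = observed_difference({"Alice", "Cathy"})

# Average over all six ways of treating exactly two of the four units
# (hypothetical assignment scheme, assumed for this sketch).
assignments = [set(pair) for pair in combinations(potential_outcomes, 2)]
mean_estimate = sum(observed_difference(a) for a in assignments) / len(assignments)

print(one_estimate)   # 5.0
print(mean_estimate)  # 9.5
```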

Potential outcomes

Stable Unit Treatment Value Assumption

"the [potential outcome] observation on one unit should be unaffected by the particular assignment of treatments to the other units"

Potential outcomes

Unit                             Controlled outcome Yi(0)    Treated outcome Yi(1)    Causal effect of treatment Yi(1) - Yi(0)
Alice                            20                          30                       10
Bob, if Alice is not treated     15                          24                       9
Bob, if Alice is treated         18                          29                       11
Cathy                            10                          17                       7
David                            22                          34                       12

Potential outcomes

So for every unit, we have to map out all the variables that can have an effect on the potential outcomes.

[Diagram with nodes: Spouse treated, Unit treated, Outcome]

A great tool for this: Pearl’s causal networks.

Causal networks

Pearl, Causality, http://bayes.cs.ucla.edu/BOOK-2K/

Note

There are differences between Pearl’s and Rubin’s frameworks, but their core concepts are compatible, according to Andrew Gelman (2009): http://andrewgelman.com/2009/07/07/more_on_pearls/

Causal networks

N.B. An interesting thing about causal networks is that the conditional probability distributions can be arbitrarily complex.

Also, variables need not be whole numbers or scalars. A variable could be a matrix.

Complex networks

Now I’m going to talk about networks that are not causal networks.

Sometimes in the literature these are called complex networks.

from http://www.spandidos-publications.com/ijo/43/6/1737

from http://noduslabs.com/radar/types-networks-random-small-world-scale-free/

Complex networks

There are lots of different kinds of networks observed in nature and society.

They can differ substantially in their emergent properties.

Complex networks

def: emergent property

“An emergent property is a property which a collection or complex system has, but which the individual members do not have. A failure to realize that a property is emergent, or supervenient, leads to the fallacy of division.”

Degree Distribution

from http://www.alexeikurakin.org/main/lecture4Ext.html

Assortativity

from http://iopscience.iop.org/1742-5468/2008/03/P03008/figures

How to make a graph

Different processes for generating graphs result in graphs with different properties.

How to make a graph

● Erdős–Rényi (ER) model, G(n,p): create n nodes, then create each possible edge independently with probability p

● Barabási–Albert (BA) model:

○ Begin with a fully connected network of m0 nodes

○ Each new node is connected to m (< m0) existing nodes, chosen with probability proportional to the degree ki of each node i
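The two recipes above can be sketched in plain Python. This is a minimal illustration using only the standard library, not a reference implementation: the function names, the edge-set representation, and the rejection loop used to pick m distinct targets are my own choices, and the sketch also accepts m = m0, which is slightly more permissive than the condition on the slide.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, rng):
    """G(n, p): n nodes, each possible edge included independently with probability p."""
    return {(i, j) for i, j in combinations(range(n), 2) if rng.random() < p}


def barabasi_albert(n, m, m0, rng):
    """Grow a graph from a fully connected seed of m0 nodes.

    Each new node attaches to m distinct existing nodes, drawn with
    probability proportional to their current degree.
    """
    if m0 < 2 or not 1 <= m <= m0:
        raise ValueError("need m0 >= 2 and 1 <= m <= m0")
    edges = set(combinations(range(m0), 2))
    # Each node appears in this list once per incident edge, so a uniform
    # draw from the list is a degree-proportional draw over nodes.
    endpoints = [v for edge in edges for v in edge]
    for new in range(m0, n):
        targets = set()
        while len(targets) < m:  # redraw until m distinct targets are found
            targets.add(rng.choice(endpoints))
        for t in targets:
            edges.add((t, new))
            endpoints.extend((t, new))
    return edges


def degrees(n, edges):
    """Degree of every node, as a list indexed by node id."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg


rng = random.Random(0)
er_edges = erdos_renyi(200, 0.02, rng)
ba_edges = barabasi_albert(200, 2, 3, rng)
print(max(degrees(200, er_edges)), max(degrees(200, ba_edges)))
```

With similar edge counts, the BA graph will typically show a much larger maximum degree than the ER graph, which is the kind of difference in emergent properties the following slides rely on.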

How to predict a graph

The distribution of degree of an Erdős–Rényi graph is binomial.

The distribution of degree of a Barabási–Albert (BA) graph is scale-free/power-law.
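For reference, the two degree distributions can be written out as follows. The symbols k (degree), n, p, and m follow the model definitions earlier in the talk; the BA expression is the standard large-n limiting form, stated here as background rather than taken from the slides.

```latex
% ER: each node has n-1 potential neighbours, each present with probability p
P_{\mathrm{ER}}(k) = \binom{n-1}{k} \, p^{k} (1-p)^{\,n-1-k}

% BA: limiting degree distribution for large n, valid for k >= m
P_{\mathrm{BA}}(k) = \frac{2m(m+1)}{k(k+1)(k+2)} \sim k^{-3}
```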

Bayes Theorem

Recall Bayes Theorem:

P(H|D) ∝ P(D|H) P(H)

If we can show a difference in the likelihood of data under different hypotheses, we can learn something.
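One way to make this concrete is the odds form of Bayes' theorem for two competing hypotheses. The labels H_1 and H_2 are introduced here for illustration; the normalizing constant P(D) cancels in the ratio.

```latex
\frac{P(H_1 \mid D)}{P(H_2 \mid D)}
  = \frac{P(D \mid H_1)}{P(D \mid H_2)} \cdot \frac{P(H_1)}{P(H_2)}
```

If the likelihood ratio on the right differs from 1, the data shift our relative belief in the two hypotheses away from the prior odds.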

What process created this graph?

Histogram of observed node degrees d

What process created this graph?

We can in principle compare the likelihoods P(d|ER) and P(d|BA).
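A rough sketch of such a comparison follows, under several simplifying assumptions that are mine and not from the slides: the node degrees are treated as independent draws, the ER degrees follow Binomial(n - 1, p), the BA degrees follow the large-n limiting law P(k) = 2m(m+1) / (k(k+1)(k+2)), and the parameters p and m are plugged in from the data rather than integrated over. The degree sequence itself is made up for illustration.

```python
import math


def log_lik_er(degree_seq, p):
    """Log-likelihood of the degrees under G(n, p), each degree ~ Binomial(n - 1, p)."""
    n = len(degree_seq)
    total = 0.0
    for k in degree_seq:
        total += (math.log(math.comb(n - 1, k))
                  + k * math.log(p)
                  + (n - 1 - k) * math.log(1.0 - p))
    return total


def log_lik_ba(degree_seq, m):
    """Log-likelihood under the limiting BA law P(k) = 2m(m+1) / (k(k+1)(k+2)), k >= m."""
    total = 0.0
    for k in degree_seq:
        if k < m:
            return float("-inf")  # the limiting law puts no mass below m
        total += math.log(2 * m * (m + 1)) - math.log(k * (k + 1) * (k + 2))
    return total


# Hypothetical heavy-tailed degree sequence, invented for this example.
observed = [2] * 50 + [3] * 20 + [4] * 10 + [6] * 5 + [10] * 3 + [25] * 2

n = len(observed)
p_hat = sum(observed) / (n * (n - 1))  # observed edges / possible edges
m_hat = min(observed)                  # smallest observed degree

ll_er = log_lik_er(observed, p_hat)
ll_ba = log_lik_ba(observed, m_hat)
print(ll_er, ll_ba, ll_ba - ll_er)
```

For this made-up sequence the BA log-likelihood is higher, mostly because the binomial assigns almost no probability to the two degree-25 nodes. A full treatment would need the joint distribution of the degrees rather than the independence shortcut used here.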

What process created this graph?

● We can in principle statistically distinguish hypotheses about generative processes based on emergent properties.

● Is this a causal inference?

Ways to model this

[Diagram: Process → Graph]

Ways to model this

All the logic of graph generation is buried in the conditional probability function P(B|A).

[Diagram: A (Process) → B (Graph)]

“Logical causation” allows no intervention!

Ways to model this

The Barabási–Albert (BA) model suggests interpreting the graph as changing over time.

You can model time series data like this:

B0 → B1 → B2 → B3 → B4

Ways to model this

Now we can model a stable transition function within the graph and external causes.

[Diagram: chain B0 → B1 → B2 → B3 → B4, plus an external cause C]

Ways to model this

In other words, endogenous process + exogenous shocks.
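The "endogenous process + exogenous shocks" picture can be sketched as a toy simulation. Everything specific here is an assumption made for illustration: the endogenous transition is one preferential-attachment step per time unit, the shock C adds a few edges between uniformly chosen nodes at a single time step (so the result is a multigraph), and the tracked properties are edge count and maximum degree.

```python
import random
from collections import Counter


def simulate(steps, shock_at=None, shock_edges=0, seed=0):
    """Return [(edge count, max degree)] for B0 .. B_steps."""
    rng = random.Random(seed)
    endpoints = [0, 1]  # B0: a single edge between nodes 0 and 1
    n = 2               # number of nodes so far
    history = [(1, 1)]
    for t in range(1, steps + 1):
        # Endogenous transition: one new node attaches to an existing node
        # chosen with probability proportional to degree.
        target = rng.choice(endpoints)
        endpoints.extend((target, n))
        n += 1
        # Exogenous shock C: extra edges between uniformly chosen nodes,
        # applied at one time step only.
        if t == shock_at:
            for _ in range(shock_edges):
                a, b = rng.sample(range(n), 2)
                endpoints.extend((a, b))
        edge_count = len(endpoints) // 2
        max_degree = max(Counter(endpoints).values())
        history.append((edge_count, max_degree))
    return history


baseline = simulate(10)
shocked = simulate(10, shock_at=5, shock_edges=3)
print(baseline)
print(shocked)
```

Running the same seed with and without the shock gives a simulated pair of "controlled" and "treated" trajectories that agree up to the step before the shock. With real data only one of the two is available, which is where natural experiments come in.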

Tools in the toolbox

If we want to understand the effect of a kind of exogenous shock, and we know the endogenous process, then we can look for natural experiments that expose the treated outcome.

Tools in the toolbox

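To learn the endogenous process, compute the likelihood of the observed emergent properties under each candidate process.

[Diagram: chain B0 → B1 → B2 → B3 → B4, with an emergent property E attached to each step]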

Problems

● The vastness of the hypothesis space of graph generation processes

● Related: how do you choose a prior?

Problems

● Computing the likelihood of a graph’s emergent properties given a particular generation process is

○ tricky math

○ maybe computationally hard

Dissatisfactions

● We have continued to bury much of the mechanism of interest in the conditional probability function.

● Suppose we want to cash this out as a finer-grained mechanism that supports finer interventions

● Can we think at multiple levels of abstraction at once?

Thanks! Contact me at: sb at ischool dot berkeley dot edu