+ All Categories
Home > Documents > Tetrad project .

Tetrad project .

Date post: 20-Jan-2016
Category:
Upload: dulcie-penelope-barnett
View: 236 times
Download: 0 times
Share this document with a friend
43
Tetrad project http://www.phil.cmu.edu/projects/te trad/current.html
Transcript
Page 1: Tetrad project   .

Tetrad project

http://www.phil.cmu.edu/projects/tetrad/current.html

Page 2: Tetrad project   .

Causal Models in the Cognitive Sciences

Page 3: Tetrad project   .

Two uses

Causal graphical models used in: Practice/methodology of cognitive science

Focus on neuroimaging, but lots of other uses

Framework for expressing human causal knowledge Are human causal representations “just” these

causal graphical models? Also (but not today): Are other cognitive

representations “just” graphical models (perhaps causal, perhaps not)?

Page 4: Tetrad project   .

Learning from neuroimaging Given neuroimaging data, what is the

causal structure inside the brain? Ignoring differences in timescale,

challenges in inverting the hemodynamic response curve, etc.

??

Page 5: Tetrad project   .

Learning from neuroimaging Big challenge: people likely have (slightly)

different causal structures in their brains ⇒ Full dataset is really from a mixed

population! ⇒ “Normal” causal search falls apart

Idea: perhaps the differences are mostly in parameters, not graphs Note that “no edge” ≡ “parameter = 0”

Page 6: Tetrad project   .

IMaGES algorithm

Given data from individuals D1, …, Dn, the score for graph G is computed by: Compute ML estimate of parameters for Di Use that ML estimate to get BIC for Di Score for G is the average BIC over all

datasets:

Do GES-style search over graphs (i.e., greedy edge addition, then greedy edge removal)

Page 7: Tetrad project   .

IMaGES application

Standard causal search: IMaGES:

Page 8: Tetrad project   .

Causal cognition

Causal inference: learning causal structure from a sequence of cases (observations or interventions)

Causal perception: learning causal connections through “direct” perception

Causal reasoning: using prior causal knowledge to predict, explain, control your world

Page 9: Tetrad project   .

Descriptive theories (in 2000) Paradigmatic causal inference situation:

A set of binary potential causes: C1, …, Cn

A known binary effect: E Minimal role for prior beliefs

Observational data about variable values Possible formats include: sequential, list, or

summary

Page 10: Tetrad project   .

Descriptive theories (in 2000) Goal of theories: model (mean) “strength

ratings” as a function of the observed cases) Or a series of (mean) ratings

Two theory-types: Dynamical vs. Long-run Dynamical predict belief change after single cases Long-run predict stable beliefs after “enough time” Similar to algorithmic vs. computational distinction

Page 11: Tetrad project   .

Dynamical theories (in 2000) Rescorla-Wagner (and variants)

Associative strength for each cue (to the effect) Causal version: associative strengths are causal

Schematic form of R-W:

ΔVi = RateParams × (Outcome – Prediction) That is, use error-correction to update the

associative strengths after each observed case Variant R-W models explain phenomena such as

backwards blocking by changing the prediction function

Page 12: Tetrad project   .

Long-run theories (in 2000)

In the long-run, causal strength judgments should be proportional to the: Conditional contrast (Conditional ΔP

theory):ΔPC.{X} = P (E |C & X ) – P (E |~C & X )

Causal strength estimate (Power PC):pC = ΔPC.F / [1 – P (E |~C & F)]

where F is a “focal set” of relevant events

Page 13: Tetrad project   .

Dynamical & long-run theories In the long-run, Rescorla-Wagner (and

variants) “converges” to conditional ΔP I.e., R-W is a dynamical version of conditional

ΔP Simple modification of the error-correction

equation converges to power PC

Primary debate (in 2000): which family of theories correctly describes causal learning?

Page 14: Tetrad project   .

Parameter estimation

Connect causal modelsand descriptive theories: B is a constant background cause Limited correlations allowed between C1, …, Cn

Additional restriction: Assume we have: P(E) = f(wC1, …, wCn, wB), or

more precisely: P(E | C1, …, Cn) = f(wC1, …, wCn, wB, C1, …, Cn)

C1

EwC

1

wC

n

Cn

wB

B…

Page 15: Tetrad project   .

Parameter estimation

Essentially every descriptive theory estimates the w-parameters in this causal Bayes net Different descriptive theories result from

different functional forms for P(E)

And all of the research on the descriptive theories implies that people can estimate parameters in this “simple” causal structure

Page 16: Tetrad project   .

Learning causal structure?

Additional queries: From a “rational analysis” point-of-view:

Can people learn structure from interventions? Or from patterns of correlations?

From a “process model” point-of-view: Is there a psychologically plausible process

model of causal graphical model structure learning?

Page 17: Tetrad project   .

Stick-Ball machine

Kushnir, T., Gopnik, A., Schulz, L., & Danks, D. 2003. Inferring hidden causes. Proceedings of the 25th Annual Meeting of the Cognitive Science Society.

Page 18: Tetrad project   .

Stick-Ball machine

Kushnir, T., Gopnik, A., Schulz, L., & Danks, D. 2003. Inferring hidden causes. Proceedings of the 25th Annual Meeting of the Cognitive Science Society.

Page 19: Tetrad project   .

Stick-Ball machine

Kushnir, T., Gopnik, A., Schulz, L., & Danks, D. 2003. Inferring hidden causes. Proceedings of the 25th Annual Meeting of the Cognitive Science Society.

Page 20: Tetrad project   .

Stick-Ball machine

Kushnir, T., Gopnik, A., Schulz, L., & Danks, D. 2003. Inferring hidden causes. Proceedings of the 25th Annual Meeting of the Cognitive Science Society.

Page 21: Tetrad project   .

Stick-Ball machine

Kushnir, T., Gopnik, A., Schulz, L., & Danks, D. 2003. Inferring hidden causes. Proceedings of the 25th Annual Meeting of the Cognitive Science Society.

Page 22: Tetrad project   .

Experimental conditions

Kushnir, T., Gopnik, A., Schulz, L., & Danks, D. 2003. Inferring hidden causes. Proceedings of the 25th Annual Meeting of the Cognitive Science Society.

Two conditions with “identical” statistics Intervention case

A & B move together four times Intervene on A twice, B doesn’t move Intervene on B twice, A doesn’t move

Pointing control A & B move together four times A moves twice (point at it after), B doesn’t

move B moves twice (point at it after), A doesn’t

move

Page 23: Tetrad project   .

Experimental logic

Kushnir, T., Gopnik, A., Schulz, L., & Danks, D. 2003. Inferring hidden causes. Proceedings of the 25th Annual Meeting of the Cognitive Science Society.

For causal models (& close-to-determinism):

Intervention case Pointing control

Observation Intervention

A B Correlated B moves after A

A B Correlated A moves after B

A U B Correlated Neither moves

U1A BU2Uncorrelated Neither moves

Page 24: Tetrad project   .

Experimental logic

Non-CGM causal inference theories make no prediction for this case, as there is no cause-effect division And on plausible variants that do predict, they

predict no difference between the conditions

Page 25: Tetrad project   .

Inference from interventions

Kushnir, T., Gopnik, A., Schulz, L., & Danks, D. 2003. Inferring hidden causes. Proceedings of the 25th Annual Meeting of the Cognitive Science Society.

Response percentages in each condition:

p<.001: each condition is different from chance

p<.01: conditions are different from each other

Intervention Case

Pointing Control

A causes B 0 0

B causes A 0 4

Common cause 67 17

Separate mechanisms 33 79

Page 26: Tetrad project   .

Other learning from interventions Learning from interventions

Gopnik, et al. (2004); Griffiths, et al. (2004); Sobel, et al. (2004); Steyvers, et al. (2003)

And many more since 2005 Planning/predicting your own

interventions Gopnik, et al. (2004); Steyvers, et al.

(2003); Waldmann & Hagmayer (2005) And many more since 2005

Page 27: Tetrad project   .

Learning from correlations

Lots of evidence that people (and even rats!) can extract causal structure from observed correlations And those structures are well-modeled as

causal graphical models

⇒ Lots of empirical evidence that we act “as if” we are learning (approx. rationally) causal DAGs

Page 28: Tetrad project   .

5 10 15 20 25 30 35 40

-50

0

50

Trials

Mean judgment

Positive contingent High P(E) non-contingent Low P(E) non-contingent Negative contingent

Developing a process model

Process of causal inference is under-studied To date, very few systematic studies

Ex: Shanks (1995)

Page 29: Tetrad project   .

Developing a process model

Features of observed data Slow convergence Pre-asymptotic “bump”

General considerations People have memory/computation bounds Error-correction models (e.g., Rescorla-

Wagner; dynamic power PC) work well for simple cases

Page 30: Tetrad project   .

Bayesian structure learning

Three possible causal structures:

Asymptotic prediction: Strength rating (wC) ∝

hhhh

CCCC dwDhPDhwPww,,

1

00

|,|

B C

E

h+

OR + +

B C

E

h0

+

B C

E

h–

AND + –

Computed using Bayesian updating!

Page 31: Tetrad project   .

Bayesian dynamic learning

When presented with a sequence of data, After each datapoint, update the structure

and parameter probability distributions (in the standard Bayesian manner)

Then use those posteriors as the prior distribution for the next datapoint

Repeat ad infinitum

Danks, D., Griffiths, T. L., & Tenenbaum, J. B. 2003. Dynamical causal learning. In Advances in Neural Information Processing Systems 15.

Page 32: Tetrad project   .

Bayesian dynamic learning

Bayesian learning on the Shanks (1995) data Assume effects rarely occur without the

occurrence of an observed cause

5 10 15 20 25 30 35 40

- 50

0

50

Danks, D., Griffiths, T. L., & Tenenbaum, J. B. 2003. Dynamical causal learning. In Advances in Neural Information Processing Systems 15.

Page 33: Tetrad project   .

Side-by-side comparison

Shanks (1995): Bayesian:

5 10 15 20 25 30 35 40

-50

0

50

Trials

Mean judgment

Positive contingent High P(E) non-contingent Low P(E) non-contingent Negative contingent

5 10 15 20 25 30 35 40

-50

0

50

Danks, D., Griffiths, T. L., & Tenenbaum, J. B. 2003. Dynamical causal learning. In Advances in Neural Information Processing Systems 15.

Page 34: Tetrad project   .

Bayesian learning as process model Challenges:

All of the terms in the Bayesian updating equation are quite computationally intensive

Number of hypotheses under consideration, and information needs, grow exponentially with the number of potential causes

No clear way to incorporate inference to unobserved causes

Page 35: Tetrad project   .

An alternate possibility

Constraint-based structure learning:Given a set of independencies, determine the causal Bayes nets that predict exactly those statistical relationships Range of algorithms for a range of

assumptions

Idea: Use associationist models to make the necessary independence judgments

Danks, D. 2004. Constraint-based human causal learning. In Proceedings of the 6th International Conference on Cognitive Modeling (ICCM-2004).

Page 36: Tetrad project   .

An alternate possibility

Wellen, S., & Danks, D. (2012). Learning causal structure through local prediction-error learning. In N. Miyake, D. Peebles, & R. P. Cooper (Eds.), Proceedings of the 34th annual conference of the cognitive science society (pp. 2529-2534). Austin, TX: Cognitive Science Society.

Page 37: Tetrad project   .

Lingering problem

Pos connection for 1st 20 cases, then Neg connection

Page 38: Tetrad project   .

Lingering problem

Pos connection for 1st 20 cases, then Neg connection

Page 39: Tetrad project   .

Lingering problem

Pos connection for 1st 20 cases, then Neg connection

Page 40: Tetrad project   .

Causal inference summary

Very large literature over past 15 years showing that our causal knowledge (from causal inference) is structured like a causal DAG And we learn (approx.) the right ones from data But we aren’t quite sure how we do it

And we do appropriate causal reasoning given that causal knowledge As long as we’re clear about what the knowledge

is!

Page 41: Tetrad project   .

Causal perception

Paradigmatic case: “launching effect”

Similar perceptions/experiences for other causal events (e.g., “exploding”, “dragging”, etc.) Including social causal events (e.g.,

“fleeing”)

Page 42: Tetrad project   .

Causal perception

Driven by fine-grained spatiotemporal details, including broader context

Page 43: Tetrad project   .

Causal perception vs. inference Behavioral evidence that they are different

Both in responses & phenomenology

Neuroimaging evidence that they are different Different brain regions “light up” in the different

types of experiments

Theoretical evidence that they are different “Best models” of the output representations

differ


Recommended