A1
New developments in structural equation modeling
Rex B Kline
Concordia University
Montréal
Set A: SCM UNL Methodology Workshop
A2
A3
A4
A5
Topics
o Graph theory
o Mediation:
Design
Conditional
Causal
A6
Topics
o Graph theory:
Pearl’s SCM
Causal reasoning
Causal estimation
A7
Topics
o Mediation:
Design requirements
Conditional process modeling
Cause × mediator (SCM)
A8
Graph theory
o Pearl, J. (2009a). Causal inference in statistics: An overview. Statistics Surveys, 3, 96–146.
o Pearl, J. (2009b). Causality: Models, reasoning, and inference (2nd ed.). New York: Cambridge University Press.
o Pearl, J. (2012). The causal foundations of structural equation modeling. In R. H. Hoyle (Ed.), Handbook of structural equation modeling (pp. 68–91). New York: Guilford Press.
A9
Graph theory
o Bollen, K. A., & Pearl, J. (2013). Eight myths about causality and structural equation models. In S. L. Morgan (Ed.), Handbook of causal analysis for social research (pp. 301–328). New York: Springer.
o Cole, S. R., Platt, R. W., Schisterman, E. F., Chu, H., Westreich, D., Richardson, D., & Poole, C. (2010). Illustrating bias due to conditioning on a collider. International Journal of Epidemiology, 39, 417–420.
o Elwert, F. (2013). Graphical causal models. In S. L. Morgan (Ed.), Handbook of causal analysis for social research (pp. 245–273). New York, NY: Springer.
A10
Graph theory
o Elwert, F. (2014). Endogenous selection bias: The problem of conditioning on a collider variable. Annual Review of Sociology, 40, 31–53.
o Glymour, M. M. (2006). Using causal diagrams to understand common problems in social epidemiology. In M. Oakes & J. Kaufman (Eds.), Methods in social epidemiology (pp. 387–422). San Francisco: Jossey-Bass.
o Hayduk, L., Cummings, G., Stratkotter, R., Nimmo, M., Grygoryev, K., Dosman, D., … Boadu, K. (2003). Pearl’s d-separation: One more step into causal thinking. Structural Equation Modeling, 10, 289–311.
A11
Graph theory
o Kenny, D. A. (2014). Mediation. Retrieved from http://davidakenny.net/cm/mediate.htm#CI
o Shipley, B. (2000). A new inferential test for path models based on directed acyclic graphs. Structural Equation Modeling, 7, 206–218.
o Spector, P. E., & Brannick, M. T. (2011). Methodological urban legends: The misuse of statistical control variables. Organizational Research Methods, 14, 287–305.
A12
Graph theory
o Knüppel, S., & Stang, A. (2010). DAG Program: Identifying
minimal sufficient adjustment sets. Epidemiology, 21, 159. http://epi.dife.de/dag/
o Porter, K., Poole, D., Kisynski, J., Sueda, S., & Knoll, B., Mackworth, A., … Hoos, H., Gorniak, P., & Conati, C. (1999–2009). Belief and Decision Network Tool (Version 5.1.10) (computer software). http://aispace.org/bayes/
o Textor, J., Hardt, J., & Knüppel, S. (2011). DAGitty: A graphical tool for analyzing causal diagrams. Epidemiology, 22(5), 745. http://www.dagitty.net/
A13
Mediation
o Design:
Cheong, J., & MacKinnon, D. P. (2012). Mediational/indirect effects in structural equation modeling. In R. H. Hoyle (Ed.), Handbook of structural equation modeling (pp. 417–435). New York: Guilford Press.
Little, T. D. (2013). Longitudinal structural equation modeling. New York, NY: Guilford.
MacKinnon, D. P. (2011). Integrating mediators and moderators in research design. Research on Social Work Practice, 21, 675–681.
A14
Mediation
o Design:
Maxwell, S. E., & Cole, D. A. (2007). Bias in cross-sectional analyses of longitudinal mediation. Psychological Methods, 12, 23–44.
Selig, J. P., & Preacher, K. J. (2009). Mediation models for longitudinal data in developmental research. Research in Human Development, 6, 144–164.
Wu, A. D., & Zumbo, B. D. (2008). Understanding and using mediators and moderators. Social Indicators Research, 87, 367–392.
A15
Mediation
o Conditional:
Edwards, J. R., & Lambert, L. S. (2007). Methods for integrating moderation and mediation: A general analytical framework using moderated path analysis. Psychological Methods, 12, 1–22.
Hayes, A. F. (2013). Introduction to mediation, moderation, and conditional process analysis: A regression-based approach. New York: Guilford Press.
A16
Mediation
o Conditional:
Hayes, A. F., & Preacher, K. J. (2013). Conditional process modeling: Using structural equation modeling to examine contingent causal processes. In G. R. Hancock & R. O. Mueller (Eds.), Structural equation modeling: A second course (pp. 219–266). Charlotte: IAP.
Preacher, K. J., Rucker, D. D., & Hayes, A. F. (2007). Addressing moderated mediation hypotheses: Theory, methods, and prescriptions. Multivariate Behavioral Research, 42, 185–227.
A17
Mediation
o Causal:
Bullock, J. G., Green, D. P., & Ha, S. E. (2010). Yes, but what’s the mechanism? (Don’t expect an easy answer). Journal of Personality and Social Psychology, 98, 550–558.
Imai, K., Keele, L., & Yamamoto, T. (2010). Identification, inference and sensitivity analysis for causal mediation effects. Statistical Science, 25, 51–71.
A18
Mediation
o Causal:
Lange, T., Vansteelandt, S., & Bekaert, M. (2012). A simple unified approach for estimating natural direct and indirect effects. American Journal of Epidemiology, 176, 190–195.
Pearl, J. (2014). Interpretation and identification of causal mediation. Psychological Methods. Advance online publication. http://dx.doi.org/10.1037/a0036434
A19
Mediation
o Causal:
Petersen, M. L., Sinisi, S. E., & van der Laan, M. J. (2006). Estimation of direct causal effects. Epidemiology, 17, 276–284.
Valeri, L., & VanderWeele, T. J. (2013). Mediation analysis allowing for exposure–mediator interactions and causal interpretation: Theoretical assumptions and implementation with SAS and SPSS macros. Psychological Methods, 18, 137–150.
A20
Mediation
o Hicks, R., & Tingley, D. H. (2012). MEDIATION: Stata module
for causal mediation analysis and sensitivity analysis [computer software].
http://EconPapers.repec.org/RePEc:boc:bocode:s457294
o Muthén, B. O. (2011). Applications of causally defined direct and indirect effects in mediation analysis using SEM in Mplus.
https://www.statmodel.com/download/causalmediation.pdf
o Tingley, D., Yamamoto, T., Hirose, K., Keele, L., & Imai, K. (2014). Package ‘mediation’ [computer software].
http://cran.r-project.org/web/packages/mediation/
A21
Intro to SCM
o Unifies:
Parametric & nonparametric
SEM and potential outcomes
Data, graphical analysis
A22
Intro to SCM
o Alternative to path analysis
o SEM program not needed
o But not latent variable models
A23
Intro to SCM
o Hayduk, L. A., & Littvay, L. (2012). Should researchers use single indicators, best indicators, or multiple indicators in structural equation models? BMC Medical Research Methodology, 12(159). http://www.biomedcentral.com/1471-2288/12/159
A24
Intro to SCM
o Multiple indicators
o Some weak
o Best indicator is better
A25
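[Figure: two structural regression models with factors A and C, indicators X1, X2 (errors EX1, EX2) and Y3, Y4 (errors EY3, EY4), and disturbance DC.]
(a) Single endogenous indicator: Y1 with disturbance DY1.
(b) rYY = .70 for Y1: factor B with disturbance DB and single indicator Y1; error term EY1 with variance fixed to .30 sY².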
A26
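[Figure: single-indicator factors A (indicator X1, error E1), B (indicator X2, error E2), and C (indicator Y, error EY, disturbance DY). Error variances are fixed to (1 − r11) s1², (1 − r22) s2², and (1 − rYY) sY².]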
A27
Intro to SCM
o Bayesian networks
o Graph structure
o Dependence relations
A28
Intro to SCM
o Hypotheses as graphs
o Directly analyze graph (no data)
o Computer tools
A29
Intro to SCM
o Nonparametric causal models
o Ideas (no operationalization)
o Study planning
A30
Intro to SCM
o Directed acyclic graph (DAG)
o Directed cyclic graph (DCG)
o Recursive, nonrecursive
A31
Intro to SCM
o Causal effects identified?
o If no, what should be measured?
o If yes, how many estimates?
A32
Intro to SCM
o Identified model not required
o Estimate what you can
o Acknowledge info. gap for rest
A33
Intro to SCM
o Regression analysis
o Causal model
o Covariate selection
A34
Intro to SCM
o Bring on the data
o Model predictions
o Conditional independences
A35
Intro to SCM
o All testable hypotheses
o Vanishing partial correlations
o Vanishing tetrads
A36
SCM vocabulary
o Nodes, vertices (variables)
o Arcs, edges, links (paths)
o Adjacent (→), nonadjacent
A37
SCM vocabulary
o Parents, ancestors
o Children, descendants
o Path is any sequence of edges
A38
SCM vocabulary
o Directed path (causal)
o Undirected path (noncausal)
A39
SCM vocabulary
o Open (unblocked) path
o Closed (blocked) path
A40
SCM vocabulary
o Front-door path (causal)
o Back-door path (biasing)
A41
SCM vocabulary
o Estimate causal effect
o Block all open biasing paths
o Do not open any blocked path
A42
SCM vocabulary
o Ways to block or open paths:
Covariates
Sampling
A43
Basic graphs
o Chain:
X → W → Y
A44
Basic graphs
o Conditional independence:
X → W → Y
X ⊥ Y | W
A45
Basic graphs
o Fork:
X ← W → Y
A46
Basic graphs
o Conditional independence:
X ← W → Y
X ⊥ Y | W
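A minimal simulation sketch of the chain and fork slides. The linear equations, the coefficients (0.6, 0.5), and the sample size are illustrative assumptions, not values from the workshop. Controlling for W is done here by correlating residuals from regressions on W.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000  # assumed sample size, large enough to make sampling error small


def residual(v, w):
    """Part of v that remains after a linear regression of v on w."""
    slope = np.cov(v, w)[0, 1] / np.var(w, ddof=1)
    return v - v.mean() - slope * (w - w.mean())


def corr(a, b):
    return float(np.corrcoef(a, b)[0, 1])


# Chain: X -> W -> Y (assumed coefficients)
x_c = rng.normal(size=n)
w_c = 0.6 * x_c + rng.normal(size=n)
y_c = 0.5 * w_c + rng.normal(size=n)

# Fork: X <- W -> Y (assumed coefficients)
w_f = rng.normal(size=n)
x_f = 0.6 * w_f + rng.normal(size=n)
y_f = 0.5 * w_f + rng.normal(size=n)

chain_marginal = corr(x_c, y_c)
chain_partial = corr(residual(x_c, w_c), residual(y_c, w_c))
fork_marginal = corr(x_f, y_f)
fork_partial = corr(residual(x_f, w_f), residual(y_f, w_f))

print(f"chain: r_XY = {chain_marginal:.3f}, r_XY.W = {chain_partial:.3f}")
print(f"fork:  r_XY = {fork_marginal:.3f}, r_XY.W = {fork_partial:.3f}")
```

In both graphs X and Y are associated marginally, and the association should be close to zero once W is controlled.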
A47
Basic graphs
o Inverted fork (collider):
X → W ← Y
X ⊥ Y
A48
Basic graphs
o Special role of colliders:
Controlling for a collider (or descendant) opens a blocked path
A49
Basic graphs
o Special role of colliders:
Controlling for a common outcome induces a spurious association between unrelated causes
A50
Basic graphs
o Special role of colliders:
Controlling for a common outcome adds a spurious component to related causes
A51
Basic graphs
o Inverted fork (collider):
X → W ← Y
X ⊥ Y
X ⊥̸ Y | W
A52
Basic graphs
o Control for a collider (statistical):
rXY = 0 rYW = .40 rXW = .30
rXY·W = −.14
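The value on this slide follows from the standard first-order partial correlation formula. A short sketch (the function name `partial_r` is only illustrative):

```python
from math import sqrt


def partial_r(r_xy, r_xw, r_yw):
    """First-order partial correlation of X and Y controlling for W."""
    return (r_xy - r_xw * r_yw) / sqrt((1 - r_xw**2) * (1 - r_yw**2))


# Correlations from the slide: X and Y are unrelated, W is their common outcome.
r_xy_given_w = partial_r(r_xy=0.0, r_xw=0.30, r_yw=0.40)
print(round(r_xy_given_w, 2))
```

Although rXY = 0, controlling for the collider W yields a nonzero (negative) partial correlation.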
A53
Basic graphs
o Control for a collider (sampling):
Speed → Fatalities ← Alcohol
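A hedged simulation sketch of controlling for a collider through sampling. Everything numeric here is an assumption made for illustration: speed and alcohol are generated as independent standardized scores, and a crash is counted as fatal when a noisy sum of the two exceeds an arbitrary threshold.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000  # assumed number of crashes

# Hypothetical standardized scores, generated independently of each other.
speed = rng.normal(size=n)
alcohol = rng.normal(size=n)

# Assumed collider: fatality depends on both causes plus noise.
fatal = (speed + alcohol + rng.normal(size=n)) > 1.5

r_all = float(np.corrcoef(speed, alcohol)[0, 1])
r_fatal = float(np.corrcoef(speed[fatal], alcohol[fatal])[0, 1])

print(f"all crashes:        r = {r_all:.3f}")
print(f"fatal crashes only: r = {r_fatal:.3f}")
```

Restricting the sample to fatal crashes conditions on the common outcome, so the two independent causes appear negatively related in that subsample.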
A54
Basic graphs
o Descendant of a collider:
X → W ← Y
W → A
X ⊥̸ Y | A
A55
Covariates
o Achen, C. H. (2005). Let’s put garbage-can regressions and garbage-can probits where they belong. Conflict Management and Peace Science, 22, 327–339.
A56
Covariates
o Regression assumes:
No causal effects between predictors
Single equation
A57
Basic graphs
o Overcontrol bias:
X → W → Y
X ⊥ Y | W
Y on (X, W), BX = 0
A58
Basic graphs
o Endogenous selection bias:
X → W ← Y
X ⊥ Y
Y on (X, W), BX ≠ 0
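A simulation sketch of the two biases above, using ordinary least squares. The coefficients and sample size are assumptions for illustration. In the chain, adding W to the regression removes the real effect of X (overcontrol); in the collider graph, adding W creates an effect of X that does not exist (endogenous selection).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000  # assumed sample size


def ols(y, *predictors):
    """Least-squares slopes of y on the predictors (intercept fitted, then dropped)."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs[1:]


# Overcontrol bias, chain X -> W -> Y; assumed total effect of X is 0.6 * 0.5 = 0.30.
x = rng.normal(size=n)
w = 0.6 * x + rng.normal(size=n)
y = 0.5 * w + rng.normal(size=n)
b_total = ols(y, x)[0]               # Y on X
b_overcontrolled = ols(y, x, w)[0]   # Y on (X, W)

# Endogenous selection bias, collider X -> W <- Y; X has no effect on Y.
x2 = rng.normal(size=n)
y2 = rng.normal(size=n)
w2 = 0.5 * x2 + 0.5 * y2 + rng.normal(size=n)
b_unadjusted = ols(y2, x2)[0]        # Y on X
b_collider = ols(y2, x2, w2)[0]      # Y on (X, W)

print(f"chain:    B_X = {b_total:.3f} alone, {b_overcontrolled:.3f} with W")
print(f"collider: B_X = {b_unadjusted:.3f} alone, {b_collider:.3f} with W")
```

Under these assumed coefficients the population value of B_X in the collider regression is −.20, even though X and Y were generated independently.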
A59
d-Separation
o Conditional independences
o Testable implications
o Basis for identification
A60
d-Separation
o Z d-separates X, Y if
1. Z closes all open paths
2. Z opens no blocked path
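One way to check d-separation without data is the ancestral moral graph method, sketched below in plain Python. This is an illustrative implementation, not the algorithm used by any particular tool mentioned in this workshop; the function name and edge-list format are assumptions. It is applied to the four-variable chain X → A → B → Y and to the graph X → A ← B → Y.

```python
from itertools import combinations


def d_separated(edges, x, y, z=()):
    """True if the set z d-separates x and y in the DAG given as (parent, child) pairs."""
    z = set(z)
    parents = {}
    for p, c in edges:
        parents.setdefault(c, set()).add(p)
        parents.setdefault(p, set())

    # 1. Keep only x, y, z and their ancestors.
    keep, stack = set(), [x, y, *z]
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents[node])

    # 2. Moralize: link each child to its parents and link co-parents to each other.
    nbrs = {node: set() for node in keep}
    for child in keep:
        for p in parents[child]:
            nbrs[p].add(child)
            nbrs[child].add(p)
        for p, q in combinations(parents[child], 2):
            nbrs[p].add(q)
            nbrs[q].add(p)

    # 3. Remove z and test whether y can still be reached from x.
    seen, stack = set(), [x]
    while stack:
        node = stack.pop()
        if node == y:
            return False
        if node in seen or node in z:
            continue
        seen.add(node)
        stack.extend(nbrs[node] - z)
    return True


chain = [("X", "A"), ("A", "B"), ("B", "Y")]
collider = [("X", "A"), ("B", "A"), ("B", "Y")]

print(d_separated(chain, "X", "B", {"A"}))     # X ⊥ B | A
print(d_separated(chain, "X", "Y"))            # open causal path, not separated
print(d_separated(collider, "X", "B"))         # blocked by the collider A
print(d_separated(collider, "X", "B", {"A"}))  # controlling for A opens the path
```

Step 2 is what captures the collider rule: parents of a common child are linked only when that child, or one of its descendants, is in the conditioning set.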
A61
X → A → B → Y
X ⊥ B | A
A62
X → A → B → Y
A ⊥ Y | B
A63
X → A → B → Y
X ⊥ Y | A
X ⊥ Y | B
X ⊥ Y | (A, B)
A64
A65
X → A ← B → Y
X ⊥ B
X ⊥̸ B | A
A66
X → A ← B → Y
A ⊥ Y | B
A67
X → A ← B → Y
X ⊥ Y
X ⊥ Y | B
X ⊥̸ Y | A
A68
X ⊥̸ B | A
A69
A70
Identification
o Graphical criteria
o Sufficient (deconfounding) set
o Removes all noncausal aspects
A71
Identification
o Back-door criterion (total effects)
o Closes biasing (back-door) paths
o Leaves only causal
A72
Causal
X E Y
[Figure: causal graph with nodes A, C, D, E, X, and Y.]
A73
Back-door
X C Y
X A D Y
X A D E Y
[Figure: causal graph with nodes A, C, D, E, X, and Y.]
A74
[Figure: causal graph with nodes A, C, D, E, X, and Y.]
X C Y
A75
[Figure: causal graph with nodes A, C, D, E, X, and Y.]
X A D Y
A76
[Figure: causal graph with nodes A, C, D, E, X, and Y.]
X A D E Y
A77
Back-door
X C Y
X A D Y
X A D E Y
Y on (X, A, C)
A78
Back-door
X C Y
X A D Y
X A D E Y
Y on (X, C, D)
A79
[Figure: causal graph with nodes A, C, D, E, X, and Y.]
Y on (X, C, D)
Y on (X, A, C)
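A simulation sketch of the two regressions above. The arrow directions are not recoverable from the slide text, so the graph used here is an assumption chosen to match the listed paths: X → E → Y (causal), X ← C → Y, X ← A → D → Y, and X ← A → D → E → Y (back-door). All coefficients are likewise illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000  # assumed sample size


def slope_of_x(y, x, *covariates):
    """Coefficient of x in a least-squares regression of y on x and the covariates."""
    design = np.column_stack([np.ones(len(y)), x, *covariates])
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(coefs[1])


# Assumed structural equations for the assumed graph.
a = rng.normal(size=n)
c = rng.normal(size=n)
d = 0.5 * a + rng.normal(size=n)
x = 0.5 * a + 0.5 * c + rng.normal(size=n)
e = 0.6 * x + 0.4 * d + rng.normal(size=n)
y = 0.5 * e + 0.3 * d + 0.4 * c + rng.normal(size=n)

true_total = 0.6 * 0.5               # total effect of X on Y through E
b_unadjusted = slope_of_x(y, x)      # Y on X: back-door paths left open
b_ac = slope_of_x(y, x, a, c)        # Y on (X, A, C)
b_cd = slope_of_x(y, x, c, d)        # Y on (X, C, D)

print(f"true total effect: {true_total:.2f}")
print(f"Y on X:            {b_unadjusted:.3f}")
print(f"Y on (X, A, C):    {b_ac:.3f}")
print(f"Y on (X, C, D):    {b_cd:.3f}")
```

Under these assumptions both sufficient sets should give about the same estimate of the total effect, while the unadjusted regression is biased upward. E is left out of both sets because it lies on the causal path.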
A80
A81
Identification
o Single-door criterion (direct)
o Recursive, linear, continuous
o Multiple estimates
A82
Identification
o Single-door criterion (D → Y)
o Delete →
o Sufficient set d-separates
A83
A84
[Figure: two graphs with nodes X, A, B, C, and Y; the second graph also includes U1 and U2.]
A85
A86
d-Separation
X → A → B → Y
(X, B) (X, Y) (A, Y)
3. Z closes all open paths
A87
Identification
o Merchant, A. T., & Pitiphat, W. (2002). Directed acyclic graphs (DAGs): An aid to assess confounding in dental research. Community Dentistry and Oral Epidemiology, 30, 399–404.
o Fleischer, N. L., & Diez Roux, A. V. (2008). Using directed acyclic graphs to guide analyses of neighbourhood health effects: An introduction. Journal of Epidemiology & Community Health, 62, 842–846.
A88
Analysis
o Roth, D. L., Wiebe, D. J., Fillingim, R. B., & Shay, K. A. (1989). Life events, fitness, hardiness, and health: A simultaneous analysis of proposed stress-resistance effects. Journal of Personality and Social Psychology, 57, 136–142.
A89
Figure 8.5. A recursive path model of health factors.
[Path diagram: Exercise, Hardiness, Fitness, Stress, and Illness, with disturbances DFi, DSt, and DIl.]
A90
Variable          1        2        3        4        5
1. Exercise       —
2. Hardiness    −.03       —
3. Fitness       .39      .07       —
4. Stress       −.05     −.23     −.13       —
5. Illness      −.08     −.16     −.29      .34       —
M              40.90      0.0    67.10     4.80   716.70
SD             66.50    38.00    18.40    33.50    62.48
Note. These data (correlations, means, and variances) are from D. Roth et al. (1989); N = 373.
A91
Independence             Conditioning set       Partial correlation
Exercise ⊥ Stress        Hardiness              −.058
Exercise ⊥ Illness       Fitness, Stress         .039
Hardiness ⊥ Fitness      Exercise                .089
Hardiness ⊥ Illness      Fitness, Stress        −.081
Fitness ⊥ Stress         Exercise, Hardiness    −.103
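The partial correlations in this table can be recomputed from the Roth et al. (1989) correlations reported earlier. The sketch below uses the inverse of the relevant correlation submatrix, which is one standard route; the helper name `partial_corr` is illustrative.

```python
import numpy as np

names = ["Exercise", "Hardiness", "Fitness", "Stress", "Illness"]

# Lower triangle of the correlation matrix from the slide (N = 373).
lower = [
    [1.0],
    [-0.03, 1.0],
    [0.39, 0.07, 1.0],
    [-0.05, -0.23, -0.13, 1.0],
    [-0.08, -0.16, -0.29, 0.34, 1.0],
]
R = np.zeros((5, 5))
for i, row in enumerate(lower):
    for j, r in enumerate(row):
        R[i, j] = R[j, i] = r


def partial_corr(x, y, given):
    """Partial correlation of x and y controlling for the variables in `given`."""
    idx = [names.index(v) for v in (x, y, *given)]
    precision = np.linalg.inv(R[np.ix_(idx, idx)])
    return float(-precision[0, 1] / np.sqrt(precision[0, 0] * precision[1, 1]))


implied = [
    ("Exercise", "Stress", ["Hardiness"]),
    ("Exercise", "Illness", ["Fitness", "Stress"]),
    ("Hardiness", "Fitness", ["Exercise"]),
    ("Hardiness", "Illness", ["Fitness", "Stress"]),
    ("Fitness", "Stress", ["Exercise", "Hardiness"]),
]
results = {(x, y): partial_corr(x, y, z) for x, y, z in implied}
for (x, y), value in results.items():
    print(f"{x} ⊥ {y}: {value:.3f}")
```

Each value is a local fit test for one implied conditional independence: values near zero are consistent with the graph.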
A92
Minimally sufficient set
Direct effect         ∅                     Exercise               Hardiness              Stress                 Fitness
Exercise → Fitness    .108 (.013) .390      —                      —                      —                      —
Hardiness → Stress    −.203 (.045) −.230    —                      —                      —                      —
Fitness → Illness     —                     −1.036 (.183) −.305    −.951 (.168) −.280     −.849 (.162) −.250     —
Stress → Illness      —                     .628 (.091) .337       .597 (.093) .320       —                      .574 (.089) .307
Note. Estimates are reported as unstandardized (standard error) standardized; ∅, empty set. Values in boldface control for all the parents of each outcome.
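A sketch of how the point estimates for the two direct effects on Illness can be recovered from the summary statistics alone, assuming a linear regression of the outcome on the cause plus one covariate from the sufficient set. Standard errors are not reproduced here, and the helper names are illustrative.

```python
# Correlations and standard deviations from the Roth et al. (1989) slide.
r = {
    ("Exercise", "Fitness"): 0.39,
    ("Exercise", "Illness"): -0.08,
    ("Fitness", "Stress"): -0.13,
    ("Fitness", "Illness"): -0.29,
    ("Stress", "Illness"): 0.34,
}
sd = {"Exercise": 66.50, "Fitness": 18.40, "Stress": 33.50, "Illness": 62.48}


def corr(a, b):
    """Look up a correlation regardless of the order of the two names."""
    return r[(a, b)] if (a, b) in r else r[(b, a)]


def single_door(cause, outcome, covariate):
    """Return (unstandardized, standardized) slope of cause, adjusting for covariate."""
    r_co = corr(cause, outcome)
    r_cz = corr(cause, covariate)
    r_zo = corr(covariate, outcome)
    beta = (r_co - r_cz * r_zo) / (1 - r_cz**2)
    return beta * sd[outcome] / sd[cause], beta


fit_given_stress = single_door("Fitness", "Illness", "Stress")
fit_given_exercise = single_door("Fitness", "Illness", "Exercise")
stress_given_fitness = single_door("Stress", "Illness", "Fitness")

for label, (b, beta) in [
    ("Fitness → Illness | Stress", fit_given_stress),
    ("Fitness → Illness | Exercise", fit_given_exercise),
    ("Stress → Illness | Fitness", stress_given_fitness),
]:
    print(f"{label}: b = {b:.3f}, beta = {beta:.3f}")
```

Different sufficient sets give different numerical estimates of the same direct effect, which is the point of the "multiple estimates" bullet on the single-door slide.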
A93
Extensions
o Locate instruments
o Models → counterfactuals
o Potential outcomes (PO)
A94
Strengths
o Unifying model (SEM, PO)
o Supports reasoning, planning
o Local fit, not global
A95
Limitations
o Classical measurement
o No global fit
o Few software tools
A96