Bayesian Structural Equations Modeling
M’hamed (Hamy) Temkit
Division of Biostatistics, Mayo Clinic, Arizona
Applied Statistics Seminar, November 17, 2016
Outline
Introduction to SEM
Covariance Analysis
SEM Estimation (GLS vs MLE)
CFA
The General Model of SEM
lavaan
Bayesian Paradigm
Bayesian SEM
Bayesian CFA
blavaan
Conclusion
Motivation
Two Paradigms
Covariance Analysis
Σ = Σ(θ)
Bayesian Inference
p(θ | y) ∝ p(y | θ) p(θ)
Brief SEM Terminology
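[Path diagram. Measurement model: the exogenous latent variables ξ1, ξ2, ξ3 are measured by (X1, X2), (X3, X4), (X5, X6) with loadings λx11, λx21, λx32, λx42, λx53, λx63 and measurement errors δ; the endogenous latent variables η1, η2 are measured by (y1, y2), (y3, y4) with loadings λy11, λy21, λy32, λy42 and measurement errors ε1 to ε4. Structural model: path β21 between the endogenous latent variables, paths γ11, γ12, γ22, γ23 from the exogenous to the endogenous latent variables, and covariances ϕ21, ϕ31, ϕ32 among the exogenous latent variables.]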
Background
Factor analysis (Spearman, 1904)
Path analysis (Sewall Wright, 1918, 1921, 1934, 1960)
Confirmatory factor analysis (CFA) (Jöreskog, 1969)
General SEM (Jöreskog, 1973; Wiley, 1973)
LISREL model (Wiley, 1973; Jöreskog, 1977)
Generalized least squares (Browne, 1974, 1982, 1984)
Relevant Reading References
Structural Equations With Latent Variables (Bollen, 1989)
Structural Equation Modeling With AMOS (Byrne)
Latent Curve Models (Bollen and Curran, 2006)
Structural Equation Modeling: A Bayesian Approach (Sik-Yum Lee, 2007)
Structural Equation Modeling: A Multidisciplinary Journal
First Principle: Linear Regression
Linear Regression: The Machinery
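Regression line:
\[ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = 1, \dots, n \]
Ordinary least squares (OLS):
\[ \min_{\beta_0, \beta_1} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 \]
If the errors are iid with $\varepsilon_i \sim N(0, \sigma^2)$, maximum likelihood (ML):
\[ \max_{\beta_0, \beta_1, \sigma^2} \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(y_i - \beta_0 - \beta_1 x_i)^2}{2\sigma^2} \right) = \max_{\beta_0, \beta_1, \sigma^2} \, (2\pi\sigma^2)^{-n/2} \exp\left( -\frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 \right) \]
Sampling distribution of the estimator:
\[ \hat{\beta} \sim N\left(\beta, \sigma^2 (X'X)^{-1}\right) \]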
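The equivalence of OLS and Gaussian ML for the regression coefficients can be checked numerically. The following is a minimal sketch in base R; the simulated data, the true coefficients, and all object names are illustrative assumptions, not part of the original slides.

```r
# Simulate data from y = b0 + b1 * x + e (values chosen only for illustration)
set.seed(1)
n <- 200
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5)

# OLS via the normal equations: beta_hat = (X'X)^{-1} X'y
X <- cbind(1, x)
beta_ols <- drop(solve(crossprod(X), crossprod(X, y)))

# ML: minimise the negative Gaussian log-likelihood over (b0, b1, log sigma)
negloglik <- function(par) {
  mu <- par[1] + par[2] * x
  -sum(dnorm(y, mean = mu, sd = exp(par[3]), log = TRUE))
}
fit_ml <- optim(c(0, 0, 0), negloglik, method = "BFGS")
beta_ml <- fit_ml$par[1:2]

# Estimated covariance of beta_hat: sigma^2 (X'X)^{-1}
sigma2_hat <- sum((y - X %*% beta_ols)^2) / (n - 2)
vcov_ols <- sigma2_hat * solve(crossprod(X))

print(rbind(ols = beta_ols, ml = beta_ml))
```

The two rows agree up to the optimiser's tolerance, because maximising the Gaussian likelihood in the coefficients is the same problem as minimising the sum of squares.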
Limitations of Regression (Linear Models)
Oversimplified view of the phenomenon
Ignores measurement error (covariates are treated as fixed)
Does not handle simultaneous equations in general (e.g. mediation)
Lacks the flexibility needed to fit SEM models
What is SEM
A melding of factor analysis and path (regression) analysis into one comprehensive statistical methodology
Simultaneous equation modeling
Does the model-implied covariance matrix match the observed covariance matrix?
The degree to which they match represents the goodness of fit
Estimation (graph)
[Path diagram of the fitted model: latent variables Eps, Tlr, Eng, and Rng with indicators x1 to x8, annotated with the estimated loadings, variances, and path coefficients.]
Estimation (equations)
Measurement model:
x1 = a1 + epistemology + e1
x2 = a2 + b2 epistemology + e2
x3 = a3 + tolerance + e3
x4 = a4 + b4 tolerance + e4
x5 = a5 + engagement + e5
x6 = a6 + b6 engagement + e6
x7 = a7 + range + e7
x8 = a8 + b8 range + e8
Structural model:
tolerance = a9 + b9 epistemology + e9
range = a10 + b10 tolerance + b11 engagement + e10
cov(epistemology, engagement) ≠ 0
Estimation: objective function
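Sample covariance matrix:
\[
S = \begin{pmatrix}
s_{11} & s_{12} & \cdots & s_{18} \\
s_{12} & s_{22} & \cdots & s_{28} \\
\vdots & \vdots & \ddots & \vdots \\
s_{18} & s_{28} & \cdots & s_{88}
\end{pmatrix},
\qquad
s_{jk} = \frac{1}{n} \sum_{i=1}^{n} (x_{ji} - \bar{x}_j)(x_{ki} - \bar{x}_k)
\]
Model-implied covariance matrix:
\[
\Sigma(\theta) = \operatorname{cov}(x_1, x_2, \dots, x_8) = \begin{pmatrix}
\operatorname{var}(x_1) & \operatorname{cov}(x_1, x_2) & \cdots & \operatorname{cov}(x_1, x_8) \\
\operatorname{cov}(x_1, x_2) & \operatorname{var}(x_2) & \cdots & \operatorname{cov}(x_2, x_8) \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{cov}(x_1, x_8) & \operatorname{cov}(x_2, x_8) & \cdots & \operatorname{var}(x_8)
\end{pmatrix}
\]
\[ S \approx \Sigma(\theta) \]
Basically, minimize a discrepancy function $f(\Sigma(\theta), S)$ over θ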
Generalized Least Squares (GLS)
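\[ x_1, \dots, x_n \overset{iid}{\sim} N(0, \Sigma(\theta_0)), \quad x_i \in \mathbb{R}^p \]
\[ \operatorname{vec} S \xrightarrow{L} N(\operatorname{vec} \Sigma(\theta_0), C) \]
\[ G(\theta) = \tfrac{1}{2} \operatorname{tr}\left\{ \left[ (S - \Sigma(\theta)) V \right]^2 \right\}, \quad V > 0 \]
\[ \hat{\theta} \xrightarrow{L} N(\theta_0, D(\theta_0)) \]
\[ n\, G(\hat{\theta}) \xrightarrow{L} \chi^2_{p^* - q}, \qquad p^* = \frac{p(p+1)}{2}, \quad q = \text{number of free parameters} \]
\[ H_0: \Sigma = \Sigma(\theta) \quad \text{vs} \quad H_a: \Sigma \neq \Sigma(\theta) \]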
Maximum Likelihood (ML)
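\[ x_1, \dots, x_n \overset{iid}{\sim} N(\mu_0, \Sigma(\theta_0)), \quad x_i \in \mathbb{R}^p \]
\[ (n - 1) S \sim W_p(R_0, \rho_0), \quad \text{with } R_0 = \Sigma(\theta_0) \text{ and } \rho_0 = n - 1 \]
\[ F(\theta) = \log\det \Sigma(\theta) + \operatorname{tr}\left( S\, \Sigma(\theta)^{-1} \right) - \log\det S - p \]
\[ \hat{\theta}_{ML} \xrightarrow{L} N(\theta_0, C_2(\theta_0)) \]
\[ n\, F(\hat{\theta}_{ML}) \xrightarrow{L} \chi^2_{p^* - q} \]
\[ H_0: \Sigma = \Sigma(\theta) \quad \text{vs} \quad H_a: \Sigma \neq \Sigma(\theta) \]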
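The GLS and ML discrepancy functions can be written directly in base R. This is a minimal sketch: the function names `f_ml` and `f_gls`, the default weight V = S^{-1}, and the two toy matrices are assumptions made for illustration only.

```r
# ML discrepancy: F = log|Sigma| + tr(S Sigma^{-1}) - log|S| - p
f_ml <- function(S, Sigma) {
  p <- nrow(S)
  as.numeric(determinant(Sigma)$modulus) + sum(diag(S %*% solve(Sigma))) -
    as.numeric(determinant(S)$modulus) - p
}

# GLS discrepancy: G = 0.5 * tr{[(S - Sigma) V]^2}; V = S^{-1} is assumed by default
f_gls <- function(S, Sigma, V = solve(S)) {
  R <- (S - Sigma) %*% V
  0.5 * sum(diag(R %*% R))
}

# Toy 2 x 2 example (numbers are made up)
S     <- matrix(c(1.0, 0.5, 0.5, 2.0), 2, 2)
Sigma <- matrix(c(1.1, 0.4, 0.4, 1.9), 2, 2)

print(c(ml_perfect  = f_ml(S, S),
        gls_perfect = f_gls(S, S),
        ml          = f_ml(S, Sigma),
        gls         = f_gls(S, Sigma)))
```

Both functions are zero when the implied matrix reproduces S exactly and positive otherwise, which is what makes n times the minimised value usable as a test statistic.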
SEM Modeling
Model (diagram)
Identifiability (q ≤ p(p + 1)/2; check the identifiability rules in Bollen, page 238)
Constraints (e.g. a loading fixed to 1 for each latent variable)
EDA (distributions, correlations, outliers, etc.)
Estimation
Fit indices (SMR, residuals)
Diagnostics (residuals, outliers, etc.)
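As a worked instance of the counting rule, the sketch below tallies the free parameters of the three-factor CFA fitted to the Holzinger-Swineford data later in this deck (nine indicators, one loading per factor fixed to 1). The variable names are assumptions; the test statistic 85.306 is taken from the lavaan output shown in the deck.

```r
p      <- 9                   # observed variables x1..x9
p_star <- p * (p + 1) / 2     # distinct variances and covariances in S

# Free parameters: 6 loadings (3 are fixed to 1), 9 residual variances,
# 3 factor variances, 3 factor covariances
q  <- 6 + 9 + 3 + 3
df <- p_star - q              # degrees of freedom of the chi-square test

# p-value for the reported ML test statistic
p_value <- pchisq(85.306, df = df, lower.tail = FALSE)

print(c(p_star = p_star, q = q, df = df))
print(p_value)
```

The count gives 45 moments, 21 parameters, and 24 degrees of freedom, matching the degrees of freedom reported by lavaan for this model.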
Measurement model (CFA)
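\[ x_i = \Lambda \xi_i + \varepsilon_i, \quad i = 1, \dots, n \]
\[ \xi \sim N(0, \Phi) \quad \text{(latent variables)} \]
\[ \varepsilon \sim N(0, \Psi_\varepsilon), \quad \Psi_\varepsilon \text{ diagonal} \]
ξ and ε are uncorrelated
\[ \Sigma = \Lambda \Phi \Lambda^t + \Psi_\varepsilon \]
Λ, Φ, Ψε are the parameters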
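To make the implied covariance Σ = ΛΦΛ^t + Ψε concrete, the sketch below assembles it in base R from the rounded ML estimates reported in this deck's lavaan output for the Holzinger-Swineford CFA. The object names are assumptions, and the result is only as precise as the three-decimal estimates.

```r
# Loadings: each indicator loads on exactly one factor
Lambda <- matrix(0, 9, 3,
                 dimnames = list(paste0("x", 1:9), c("visual", "textual", "speed")))
Lambda[1:3, 1] <- c(1, 0.554, 0.729)
Lambda[4:6, 2] <- c(1, 1.113, 0.926)
Lambda[7:9, 3] <- c(1, 1.180, 1.082)

# Factor variances (diagonal) and covariances (off-diagonal)
Phi <- matrix(c(0.809, 0.408, 0.262,
                0.408, 0.979, 0.173,
                0.262, 0.173, 0.384), 3, 3, byrow = TRUE)

# Diagonal matrix of residual variances
Psi <- diag(c(0.549, 1.134, 0.844, 0.371, 0.446, 0.356, 0.799, 0.488, 0.566))

# Model-implied covariance matrix
Sigma <- Lambda %*% Phi %*% t(Lambda) + Psi
print(round(Sigma[1:4, 1:4], 3))
```

For example, the implied variance of x1 is the visual factor variance plus the x1 residual variance, and the implied covariance of x1 and x4 is the visual-textual factor covariance, because both loadings are fixed to 1.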
CFA Example (graph)
[Path diagram of the three-factor CFA: vsl measured by x1 to x3, txt by x4 to x6, and spd by x7 to x9, annotated with the estimated loadings, residual variances, factor variances, and factor covariances reported in the output.]
CFA (loadings and latents)
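\[
\xi = \begin{pmatrix} \text{vsl} \\ \text{txt} \\ \text{spd} \end{pmatrix},
\qquad
\Lambda = \begin{pmatrix}
1 & 0 & 0 \\
\lambda_{21} & 0 & 0 \\
\lambda_{31} & 0 & 0 \\
0 & 1 & 0 \\
0 & \lambda_{52} & 0 \\
0 & \lambda_{62} & 0 \\
0 & 0 & 1 \\
0 & 0 & \lambda_{83} \\
0 & 0 & \lambda_{93}
\end{pmatrix}
\]
But also remember the variances and covariances: Φ and Ψε are parameters too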
CFA using lavaan (R)
library(lavaan)    # model fitting
library(semPlot)   # path diagrams

# Specify the three-factor measurement model
HS.model <- "
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
"

# Fit by maximum likelihood to the Holzinger-Swineford data shipped with lavaan
fit.HS <- sem(HS.model, data = HolzingerSwineford1939)
summary(fit.HS)

# Draw the path diagram, labelled with the parameter estimates
semPaths(fit.HS, intercepts = FALSE, whatLabels = "est",
         residuals = TRUE, exoCov = TRUE)
CFA Example (output)
> summary(fit.HS)
lavaan (0.5-22) converged normally after 35 iterations

  Number of observations                           301

  Estimator                                         ML
  Minimum Function Test Statistic               85.306
  Degrees of freedom                                24
  P-value (Chi-square)                           0.000

Parameter Estimates:

  Information                                 Expected
  Standard Errors                             Standard

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)
  visual =~
    x1                1.000
    x2                0.554    0.100    5.554    0.000
    x3                0.729    0.109    6.685    0.000
  textual =~
    x4                1.000
    x5                1.113    0.065   17.014    0.000
    x6                0.926    0.055   16.703    0.000
  speed =~
    x7                1.000
    x8                1.180    0.165    7.152    0.000
    x9                1.082    0.151    7.155    0.000

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)
  visual ~~
    textual           0.408    0.074    5.552    0.000
    speed             0.262    0.056    4.660    0.000
  textual ~~
    speed             0.173    0.049    3.518    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .x1                0.549    0.114    4.833    0.000
   .x2                1.134    0.102   11.146    0.000
   .x3                0.844    0.091    9.317    0.000
   .x4                0.371    0.048    7.779    0.000
   .x5                0.446    0.058    7.642    0.000
   .x6                0.356    0.043    8.277    0.000
   .x7                0.799    0.081    9.823    0.000
   .x8                0.488    0.074    6.573    0.000
   .x9                0.566    0.071    8.003    0.000
    visual            0.809    0.145    5.564    0.000
    textual           0.979    0.112    8.737    0.000
    speed             0.384    0.086    4.451    0.000
Structural model (SEM)
\[ \eta = B\eta + \Gamma\xi + \zeta \]
\[ y = \Lambda_y \eta + \varepsilon \]
\[ x = \Lambda_x \xi + \delta \]
B, Γ, Λy, Λx, Φ, Ψ, Θε, Θδ are the parameters
SEM Example (graph)
[Path diagram of the fitted SEM: i60 measured by x1 to x3, d60 by y1 to y4, and d65 by y5 to y8, annotated with the estimated loadings, regression coefficients, and residual covariances reported in the output.]
SEM Example (some equations)
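\[
\begin{pmatrix} d60 \\ d65 \end{pmatrix}
=
\begin{pmatrix} 0 & 0 \\ \beta_{21} & 0 \end{pmatrix}
\begin{pmatrix} d60 \\ d65 \end{pmatrix}
+
\begin{pmatrix} \gamma_{11} \\ \gamma_{21} \end{pmatrix}
\begin{pmatrix} i60 \end{pmatrix}
+
\begin{pmatrix} \zeta_1 \\ \zeta_2 \end{pmatrix}
\]
\[
\Sigma(\theta) =
\begin{pmatrix}
\Sigma_{yy}(\theta) & \Sigma_{yx}(\theta) \\
\Sigma_{xy}(\theta) & \Sigma_{xx}(\theta)
\end{pmatrix}
\]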
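For reference, the blocks of the implied covariance matrix follow from the general model by solving the structural equation for η. The derivation below is the standard LISREL result, stated under the usual assumptions that I − B is invertible and that ξ, ζ, ε, and δ are mutually uncorrelated; it is a sketch added for clarity rather than content from the original slides.

```latex
% Reduced form of the structural model
\eta = (I - B)^{-1} (\Gamma \xi + \zeta)

% Blocks of the implied covariance matrix
\Sigma_{yy}(\theta) = \Lambda_y (I - B)^{-1} \left( \Gamma \Phi \Gamma^t + \Psi \right)
                      \left[ (I - B)^{-1} \right]^t \Lambda_y^t + \Theta_\varepsilon

\Sigma_{yx}(\theta) = \Lambda_y (I - B)^{-1} \Gamma \Phi \Lambda_x^t,
\qquad
\Sigma_{xy}(\theta) = \Sigma_{yx}(\theta)^t

\Sigma_{xx}(\theta) = \Lambda_x \Phi \Lambda_x^t + \Theta_\delta
```

Here Φ is the covariance of ξ, Ψ the covariance of ζ, and Θε, Θδ the covariances of the measurement errors, matching the parameter list of the general model.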
SEM Example (R code)
library(lavaan)
library(semPlot)

# Specify the model
model <- '
  # latent variables
  ind60 =~ x1 + x2 + x3
  dem60 =~ y1 + y2 + y3 + y4
  dem65 =~ y5 + y6 + y7 + y8
  # regressions
  dem60 ~ ind60
  dem65 ~ ind60 + dem60
  # residual covariances
  y1 ~~ y5
  y2 ~~ y4 + y6
  y3 ~~ y7
  y4 ~~ y8
  y6 ~~ y8
'

# Fit to the PoliticalDemocracy data shipped with lavaan
fit <- sem(model, data = PoliticalDemocracy)
summary(fit)

# Path diagram labelled with the parameter estimates
semPaths(fit, intercepts = FALSE, whatLabels = "est",
         residuals = FALSE, exoCov = FALSE)
SEM Example (output)
> summary(fit)
lavaan (0.5-22) converged normally after 68 iterations

  Number of observations                            75

  Estimator                                         ML
  Minimum Function Test Statistic               38.125
  Degrees of freedom                                35
  P-value (Chi-square)                           0.329

Parameter Estimates:

  Information                                 Expected
  Standard Errors                             Standard

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)
  ind60 =~
    x1                1.000
    x2                2.180    0.139   15.742    0.000
    x3                1.819    0.152   11.967    0.000
  dem60 =~
    y1                1.000
    y2                1.257    0.182    6.889    0.000
    y3                1.058    0.151    6.987    0.000
    y4                1.265    0.145    8.722    0.000
  dem65 =~
    y5                1.000
    y6                1.186    0.169    7.024    0.000
    y7                1.280    0.160    8.002    0.000
    y8                1.266    0.158    8.007    0.000

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)
  dem60 ~
    ind60             1.483    0.399    3.715    0.000
  dem65 ~
    ind60             0.572    0.221    2.586    0.010
    dem60             0.837    0.098    8.514    0.000
Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)
 .y1 ~~
   .y5                0.624    0.358    1.741    0.082
 .y2 ~~
   .y4                1.313    0.702    1.871    0.061
   .y6                2.153    0.734    2.934    0.003
 .y3 ~~
   .y7                0.795    0.608    1.308    0.191
 .y4 ~~
   .y8                0.348    0.442    0.787    0.431
 .y6 ~~
   .y8                1.356    0.568    2.386    0.017

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .x1                0.082    0.019    4.184    0.000
   .x2                0.120    0.070    1.718    0.086
   .x3                0.467    0.090    5.177    0.000
   .y1                1.891    0.444    4.256    0.000
   .y2                7.373    1.374    5.366    0.000
   .y3                5.067    0.952    5.324    0.000
   .y4                3.148    0.739    4.261    0.000
   .y5                2.351    0.480    4.895    0.000
   .y6                4.954    0.914    5.419    0.000
   .y7                3.431    0.713    4.814    0.000
   .y8                3.254    0.695    4.685    0.000
    ind60             0.448    0.087    5.173    0.000
   .dem60             3.956    0.921    4.295    0.000
   .dem65             0.172    0.215    0.803    0.422
Why Bayesian
Flexibility to use prior knowledge (priors)
Works with small sample sizes, where the asymptotic ML results are unreliable
Bayes factors and flexibility in comparing models
Latent scores (factors) are easy to produce
blavaan (open-source R package)
WinBUGS (freely available software)
Bayesian References
A Bayesian approach to confirmatory factor analysis (Lee, 1980)
Evaluation of the Bayesian and maximum likelihood approaches in analyzing structural equation models with small sample sizes (Lee and Song, 2004)
Structural Equation Modeling: A Bayesian Approach (Lee, 2007)
Basic and Advanced Bayesian Structural Equation Modeling: With Applications in the Medical and Behavioral Sciences (Song and Lee, 2012)
Bayesian estimation
\[ \log p(\theta \mid Y, M) = \log p(Y \mid \theta, M) + \log p(\theta) + \text{constant} \]
M: an arbitrary SEM model
Y: observed data set of raw observations, sample size n
θ: random vector of parameters in M
Conjugate priors
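\[ p(y \mid \theta) = \binom{n}{y} \theta^{y} (1 - \theta)^{n - y}, \quad \theta \in (0, 1) \]
\[ p(\theta) \propto \theta^{\alpha - 1} (1 - \theta)^{\beta - 1}, \quad \theta \sim \text{Beta}(\alpha, \beta) \]
\[ p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta) \propto \theta^{y} (1 - \theta)^{n - y}\, \theta^{\alpha - 1} (1 - \theta)^{\beta - 1} \propto \theta^{y + \alpha - 1} (1 - \theta)^{n - y + \beta - 1} \]
so that $\theta \mid y \sim \text{Beta}(y + \alpha,\; n - y + \beta)$
The prior p(θ) and the posterior p(θ | y) have the same distributional form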
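The Beta-binomial update can be verified numerically. The sketch below, in base R, compares the closed-form posterior with a brute-force normalisation of likelihood times prior on a grid; the hyperparameters and data values are illustrative assumptions.

```r
# Prior Beta(alpha, beta) and data: y successes in n trials (illustrative values)
alpha <- 2; beta <- 2
n <- 20; y <- 14

# Conjugate result: the posterior is Beta(y + alpha, n - y + beta)
post_a <- y + alpha
post_b <- n - y + beta

# Numerical check: normalise likelihood x prior on a grid of midpoints
h <- 0.001
theta <- seq(h / 2, 1 - h / 2, by = h)
unnorm <- dbinom(y, n, theta) * dbeta(theta, alpha, beta)
grid_post <- unnorm / sum(unnorm * h)
max_abs_diff <- max(abs(grid_post - dbeta(theta, post_a, post_b)))

# Posterior summaries from the closed form
post_mean <- post_a / (post_a + post_b)
cred_int  <- qbeta(c(0.025, 0.975), post_a, post_b)   # 95% credible interval

print(c(post_a = post_a, post_b = post_b, post_mean = post_mean))
print(max_abs_diff)
```

The grid-based density and the Beta density agree to numerical precision, which is the practical payoff of conjugacy: no sampling or numerical integration is needed for this posterior.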
Measurement model (CFA) Bayesian approach
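\[ y_i = \Lambda w_i + \varepsilon_i, \quad i = 1, \dots, n, \quad y_i \in \mathbb{R}^p \]
\[ w_i \sim N(0, \Phi), \quad w_i \in \mathbb{R}^q \]
\[ \varepsilon_i \sim N(0, \Psi_\varepsilon), \quad \Psi_\varepsilon \text{ diagonal with elements } \psi_{\varepsilon k}, \; k = 1, \dots, p \]
$w_i$ and $\varepsilon_i$ are independent
Λ, Φ, Ψε are the parameters
Let $\Lambda_k^t$ be the kth row of Λ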
Measurement model (CFA) priors
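The conjugate priors on the parameters are:
\[ \psi_{\varepsilon k} \sim \text{IGamma}(\alpha^*_{0\varepsilon k}, \beta^*_{0\varepsilon k}) \]
\[ [\Lambda_k \mid \psi_{\varepsilon k}] \sim N(\Lambda_{0k}, \psi_{\varepsilon k} H_{0yk}) \]
\[ \Phi \sim IW_q(R^*_0, \rho_0), \quad R^*_0 \text{ positive definite} \]
The difficulty is choosing the hyperparameters, which determine whether the priors are informative or noninformative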
Measurement model (CFA) Gibbs Sampling (MCMC)
Let $Y = (y_1, \dots, y_n)$ be the observed data matrix
$\Omega = (w_1, \dots, w_n)$ is the matrix of the latent variables
$(Y, \Omega)$ is the complete data set (augmented data)
$P(\Lambda, \Phi, \Psi_\varepsilon \mid Y)$: the posterior is intractable
$P(\Lambda, \Phi, \Psi_\varepsilon \mid \Omega, Y)$: usually standard
$P(\Omega \mid \Lambda, \Phi, \Psi_\varepsilon, Y)$: can also be derived from model M
Measurement model (CFA) Gibbs Sampling
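The Gibbs sampling algorithm makes it possible to sample from $P(\Lambda, \Phi, \Psi_\varepsilon, \Omega \mid Y)$
At the (j + 1)th iteration, given $\Omega^{(j)}, \Lambda^{(j)}, \Phi^{(j)}, \Psi_\varepsilon^{(j)}$:
Generate $\Omega^{(j+1)} \sim P(\Omega \mid \Lambda^{(j)}, \Phi^{(j)}, \Psi_\varepsilon^{(j)}, Y)$
Generate $\Psi_\varepsilon^{(j+1)} \sim P(\Psi_\varepsilon \mid \Omega^{(j+1)}, \Lambda^{(j)}, \Phi^{(j)}, Y)$
Generate $\Phi^{(j+1)} \sim P(\Phi \mid \Omega^{(j+1)}, \Lambda^{(j)}, \Psi_\varepsilon^{(j+1)}, Y)$
Generate $\Lambda^{(j+1)} \sim P(\Lambda \mid \Omega^{(j+1)}, \Phi^{(j+1)}, \Psi_\varepsilon^{(j+1)}, Y)$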
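The four sampling steps can be sketched for the simplest case, a one-factor model with three indicators, in base R. This is an illustration under stated assumptions, not the sampler used by any package: the data are simulated, the first loading is fixed to 1, the loadings get independent normal priors (a simplification of the conjugate prior that scales with ψ), and Φ is a scalar with an inverse-gamma prior (the scalar analogue of the inverse Wishart). All names and hyperparameter values are hypothetical.

```r
set.seed(123)
# Simulated data from a one-factor model (true values are illustrative)
n <- 300; p <- 3
lambda_true <- c(1, 0.8, 1.2)
w_true <- rnorm(n)                                   # factor variance 1
Y <- outer(w_true, lambda_true) + matrix(rnorm(n * p, sd = sqrt(0.5)), n, p)

# Assumed hyperparameters
a0 <- 2; b0 <- 1       # inverse-gamma prior for each psi_k and for phi
l0 <- 0; h0 <- 100     # independent N(l0, h0) prior for each free loading

n_iter <- 2000; burn <- 500
lambda <- rep(1, p); psi <- rep(1, p); phi <- 1      # starting values
draws <- matrix(NA_real_, n_iter, 6,
                dimnames = list(NULL, c("l2", "l3", "psi1", "psi2", "psi3", "phi")))

for (j in seq_len(n_iter)) {
  # 1. Omega | Lambda, Phi, Psi, Y: independent normals, one per observation
  v <- 1 / (1 / phi + sum(lambda^2 / psi))
  m <- v * drop(Y %*% (lambda / psi))
  w <- rnorm(n, mean = m, sd = sqrt(v))

  # 2. Psi | Omega, Lambda, Y: inverse gamma for each residual variance
  res <- Y - outer(w, lambda)
  psi <- 1 / rgamma(p, shape = a0 + n / 2, rate = b0 + colSums(res^2) / 2)

  # 3. Phi | Omega: inverse gamma for the factor variance
  phi <- 1 / rgamma(1, shape = a0 + n / 2, rate = b0 + sum(w^2) / 2)

  # 4. Lambda | Omega, Psi, Y: normal for each free loading (lambda_1 stays 1)
  for (k in 2:p) {
    prec <- 1 / h0 + sum(w^2) / psi[k]
    post_m <- (l0 / h0 + sum(w * Y[, k]) / psi[k]) / prec
    lambda[k] <- rnorm(1, mean = post_m, sd = sqrt(1 / prec))
  }
  draws[j, ] <- c(lambda[2:3], psi, phi)
}

# Posterior means after discarding the burn-in draws
post_mean <- colMeans(draws[-seq_len(burn), ])
print(round(post_mean, 2))
```

Each step conditions on the most recent values of the other blocks, so the sampled latent scores `w` come for free at every iteration, which is the data augmentation idea from the previous slide.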
Measurement model (CFA) Posterior Parameters Estimates
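\[ \theta^{(t)} = (\Lambda^{(t)}, \Phi^{(t)}, \Psi_\varepsilon^{(t)}), \quad t = 1, \dots, T^* \]
\[ \hat{\theta} = \frac{1}{T^*} \sum_{t=1}^{T^*} \theta^{(t)} \]
\[ \widehat{\operatorname{var}}(\theta \mid Y) = \frac{1}{T^* - 1} \sum_{t=1}^{T^*} (\theta^{(t)} - \hat{\theta})(\theta^{(t)} - \hat{\theta})^t \]
along with 95% credible intervals from the posterior quantiles $Q_{0.025}$ and $Q_{0.975}$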
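Given a matrix of retained MCMC draws, these summaries take a few lines of base R. In the sketch below the draws are simulated stand-ins so that the code runs on its own; the column names and distributions are assumptions, not output from a fitted model.

```r
set.seed(42)
# Stand-in for MCMC output: T* retained draws of a 2-parameter vector
T_star <- 5000
draws <- cbind(lambda = rnorm(T_star, mean = 0.8, sd = 0.05),
               psi    = 1 / rgamma(T_star, shape = 50, rate = 25))

# Posterior mean: average of the draws
theta_hat <- colMeans(draws)

# Posterior covariance: centred cross-product divided by T* - 1
centered <- sweep(draws, 2, theta_hat)
var_hat  <- crossprod(centered) / (T_star - 1)

# 95% credible intervals from the 0.025 and 0.975 quantiles
ci_95 <- apply(draws, 2, quantile, probs = c(0.025, 0.975))

print(theta_hat)
print(var_hat)
print(ci_95)
```

The covariance computed this way is the same quantity returned by `cov(draws)`; it is written out here only to mirror the formula.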
Bayesian CFA Example using blavaan
library(blavaan)

# Specify the model
bHS.model <- "
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
  # intercepts
  x1 ~ 0
  x2 ~ 0
  x3 ~ 0
  x4 ~ 0
  x5 ~ 0
  x6 ~ 0
  x7 ~ 0
  x8 ~ 0
  x9 ~ 0
"

# Fit by MCMC and summarise the posterior
bfit.HS <- bsem(bHS.model, data = HolzingerSwineford1939)
summary(bfit.HS)
fitMeasures(bfit.HS, fit.measures = "all", baseline.model = NULL)
Bayesian CFA Example (output)
blavaan (0.2-2) results of 10000 samples after 5000 adapt+burnin iterations

  Number of observations                           301
  Number of missing patterns                         1

  Statistic                                 MargLogLik         PPP
  Value                                      -4481.087       0.000

Parameter Estimates:

Latent Variables:
                   Estimate  Post.SD  HPD.025  HPD.975     PSRF  Prior
  visual =~
    x1                1.000
    x2                1.221    0.018    1.186    1.255    1.000  dnorm(0,1e-2)
    x3                0.463    0.012    0.438    0.487    1.000  dnorm(0,1e-2)
  textual =~
    x4                1.000
    x5                1.404    0.020    1.365    1.445    1.004  dnorm(0,1e-2)
    x6                0.731    0.016      0.7    0.761    1.001  dnorm(0,1e-2)
  speed =~
    x7                1.000
    x8                1.320    0.020     1.28    1.357    1.002  dnorm(0,1e-2)
    x9                1.286    0.019     1.25    1.325    1.002  dnorm(0,1e-2)
Covariances:
                   Estimate  Post.SD  HPD.025  HPD.975     PSRF  Prior
  visual ~~
    textual          15.500    1.321   12.998    18.14    1.000  dwish(iden,4)
    speed            20.910    1.764   17.576   24.439    1.000  dwish(iden,4)
  textual ~~
    speed            13.003    1.118     10.9   15.259    1.000  dwish(iden,4)

Intercepts:
                   Estimate  Post.SD  HPD.025  HPD.975     PSRF  Prior
   .x1                0.000
   .x2                0.000
   .x3                0.000
   .x4                0.000
   .x5                0.000
   .x6                0.000
   .x7                0.000
   .x8                0.000
   .x9                0.000
    visual            0.000
    textual           0.000
    speed             0.000
Variances:
                   Estimate  Post.SD  HPD.025  HPD.975     PSRF  Prior
   .x1                0.716    0.088    0.547    0.891    1.001  dgamma(1,.5)
   .x2                1.219    0.138     0.96      1.5    1.000  dgamma(1,.5)
   .x3                0.993    0.086    0.832    1.164    1.000  dgamma(1,.5)
   .x4                0.449    0.053    0.346    0.552    1.001  dgamma(1,.5)
   .x5                0.314    0.069    0.184    0.452    1.002  dgamma(1,.5)
   .x6                0.509    0.048    0.417    0.604    1.000  dgamma(1,.5)
   .x7                0.877    0.084    0.717    1.045    1.000  dgamma(1,.5)
   .x8                0.567    0.077    0.417     0.72    1.000  dgamma(1,.5)
   .x9                0.478    0.068    0.347     0.61    1.000  dgamma(1,.5)
    visual           24.998    2.118   20.929   29.176    1.000  dwish(iden,4)
    textual          10.256    0.882    8.518   11.953    1.001  dwish(iden,4)
    speed            17.812    1.539   14.813   20.859    1.001  dwish(iden,4)

> fitMeasures(bfit.HS, fit.measures = "all", baseline.model = NULL)
      npar       logl        ppp        bic        dic      p_dic       waic
    21.000  -4398.287      0.000   8916.354   8837.747     20.586   8838.364
    p_waic      looic      p_loo margloglik
    20.848   8838.391     20.861  -4481.087
Conclusions
The frequentist SEM approach is based on MLE
The Bayesian approach, with data augmentation and MCMC methods, is flexible for analyzing SEMs
The Bayesian approach may be used when prior knowledge is available or when the sample size is small
Some open problems (power, optimal designs, GSEM, etc...)
THANK YOU!