The Measurement and Analysis of Complex Traits
Everything you didn’t want to know about measuring behavioral and psychological constructs
Leuven Workshop August 2008
Overview
• SEM factor model basics
• Group differences: - practical
• Relative merits of factor scores & sum scores
• Test for normal distribution of factor
• Alternatives to the factor model
• Extensions for multivariate linkage & association
Structural Equation Model basics
• Two kinds of relationships
– Linear regression X -> Y: single-headed arrow
– Unspecified covariance X <-> Y: double-headed arrow
• Four kinds of variable
– Squares: observed variables
– Circles: latent (unobserved) variables
– Triangles: constant (zero variance), for specifying means
– Diamonds: observed variables used as moderators (on paths)
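Single Factor Model
[Path diagram: latent factor F, variance fixed at 1.00, with loadings l1, l2, l3, ..., lm on observed items S1, S2, S3, ..., Sm; each item has its own residual (e1, e2, e3, ...).]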
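Factor Model with Means
[Path diagram: the single factor model (factor F1, variance 1.00, loadings l1, l2, l3, ..., lm on items S1, S2, S3, ..., Sm, residuals e1, e2, e3, ...) extended with a constant that supplies the factor means (mF1, mF2) and the item means (mS1, mS2, mS3, ..., mSm). Other labels in the diagram: F2, MF, B8.]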
Factor model essentials
• Diagram translates directly to algebraic formulae
• Factor typically assumed to be normally distributed: SEM
• Error variance is typically assumed to be normal as well
• May be applied to binary or ordinal data– Threshold model
What is the best way to measure factors?
• Use a sum score
• Use a factor score
• Use neither - model-fit
Factor Score Estimation
• Formulae for continuous case
– Thomson 1951 (regression method)
– $C = LL' + V$
– $f = (I + J)^{-1} L' V^{-1} x$
– where $J = L' V^{-1} L$
Factor Score Estimation
• Formulae for continuous case
– Bartlett 1938
– $C = LL' + V$
– $f_b = J^{-1} L' V^{-1} x$
– where $J = L' V^{-1} L$
• Neither is suitable for ordinal data
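The two continuous-case formulae can be compared numerically. The following is a minimal sketch, assuming a single factor with unit variance, mean-centred items and a diagonal V, so that J reduces to a scalar. The function name and the example values are illustrative and do not come from the workshop materials.

```python
def factor_scores(loadings, residual_vars, x):
    """Return (regression, bartlett) factor score estimates for one subject.

    Assumes one factor and diagonal V, so J = L'V^-1 L is a scalar.
    """
    # J = L' V^-1 L
    j = sum(l * l / v for l, v in zip(loadings, residual_vars))
    # L' V^-1 x
    lvx = sum(l * xi / v for l, v, xi in zip(loadings, residual_vars, x))
    regression = lvx / (1.0 + j)  # f = (I + J)^-1 L' V^-1 x
    bartlett = lvx / j            # f_b = J^-1 L' V^-1 x
    return regression, bartlett


# Illustrative values only (hypothetical, not taken from the slides)
loadings = [0.8, 0.7, 0.6]
residual_vars = [1.0 - l * l for l in loadings]  # standardized items
x = [1.0, 0.5, -0.2]
reg, bart = factor_scores(loadings, residual_vars, x)
print(reg, bart)
```

Under these assumptions the regression estimate equals the Bartlett estimate multiplied by J / (1 + J), so it is always shrunk toward zero.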
Estimate factor score by ML
[Path diagram: factor F1 with loadings l1, ..., l6 on items M1 to M6. We want the ML estimate of the individual's score on F1.]
ML Factor Score Estimation
• Marginal approach
• $L(f, x) = L(f)\,L(x|f)$ (1)
• $L(f) = \mathrm{pdf}(f)$
• $L(x|f) = \mathrm{pdf}(x^*)$
• $x^* \sim N(Lf, V)$, i.e. mean $Lf$ and residual covariance $V$
• Maximize (1) with respect to f
• Repeat for all subjects in sample
– Works for ordinal data too!
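A minimal sketch of this maximization for binary items, assuming a standard normal factor, standardized liabilities (residual variance 1 - l²) and a simple grid search in place of a proper optimizer. The function names, grid settings and example values are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def log_joint(f, responses, loadings, thresholds):
    """log L(f) + log L(x|f) for binary items under the threshold model."""
    total = -0.5 * f * f  # log N(0,1) density of f, up to a constant
    for y, l, t in zip(responses, loadings, thresholds):
        p = norm_cdf((l * f - t) / math.sqrt(1.0 - l * l))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


def ml_factor_score(responses, loadings, thresholds,
                    lo=-6.0, hi=6.0, steps=1201):
    """Grid-search the value of f that maximizes the joint likelihood."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return max(grid,
               key=lambda f: log_joint(f, responses, loadings, thresholds))


# Illustrative parameters only (hypothetical)
loadings = [0.6, 0.7, 0.8]
thresholds = [-0.5, 0.0, 0.5]
print(ml_factor_score([1, 1, 1], loadings, thresholds))
print(ml_factor_score([0, 0, 0], loadings, thresholds))
```

Repeating the call for each subject's response vector gives one score per subject. The factor density term keeps the estimate finite even for all-yes or all-no response patterns.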
Multifactorial Threshold Model
Normal distribution of liability x. ‘Yes’ when liability x > t
[Figure: normal density of liability x (horizontal axis -4 to 4, vertical axis 0 to 0.5) with threshold t marked.]
Item Response Theory - Factor model equivalence
• Normal Ogive IRT Model
• Normal Theory Threshold Factor Model
• Takane & DeLeeuw (1987 Psychometrika)
– Same fit
– Can transform parameters from one to the other
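The parameter transformation can be sketched as follows, assuming a factor with unit variance and liabilities with unit total variance, so that P(yes | f) = Φ((l·f − t) / √(1 − l²)) = Φ(a(f − b)). The function names are illustrative.

```python
import math


def factor_to_irt(loading, threshold):
    """Threshold factor model (l, t) -> normal ogive IRT (a, b)."""
    a = loading / math.sqrt(1.0 - loading ** 2)  # discrimination
    b = threshold / loading                      # difficulty
    return a, b


def irt_to_factor(a, b):
    """Normal ogive IRT (a, b) -> threshold factor model (l, t)."""
    loading = a / math.sqrt(1.0 + a ** 2)
    threshold = b * loading
    return loading, threshold


a, b = factor_to_irt(0.8, 0.45)
print(a, b)
print(irt_to_factor(a, b))
```

The round trip recovers the original loading and threshold, which is the sense in which the two parameterizations carry the same information.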
Item Response Probability
Example item response probability shown in white
[Figure: normal liability distribution (horizontal axis -4 to 4, density axis 0 to 0.5) overlaid with an item response probability curve on a 0 to 1 axis.]
Do groups differ on a measure?
• Observed
– Function of observed categorical variable (sex)
– Function of observed continuous variable (age)
• Latent
– Function of unobserved variable
– Usually categorical
– Estimate of class membership probability
• Has statistical issues with LRT
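Practical: Find the Difference(s)
[Two path diagrams of a one-factor model for Item 1, Item 2 and Item 3, each with loadings l1 to l3, residuals r1 to r3, item means mean1 to mean3 and a factor mean (Mean). In the first diagram the factor variance is fixed at 1.00; in the second it is labeled Variance.]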
Sequence of MNI testing
1. Model effects of covariates on factor mean & variance
2. Model effects of covariates on factor loadings & thresholds
• Does 1 beat 2?
– Yes: measurement invariance. Use sum* or ML scores
– No: identify which loadings & thresholds are non-invariant, then go to 3
3. Revise scale, or under MNI compute ML factor scores using covariates
* If factor loadings equal
Continuous Age as a Moderator in the Factor Model
[Path diagram: one-factor model for Stims, Tranq and MJ with loadings l1 to l3, residuals r1 to r3 and item means mean1 to mean3. Age enters as a moderator on the paths labeled b, d, j, k and v.]
• b - Factor variance
• d - Factor mean
• j - Factor loadings
• k - Item means
• v - Item variances
[dk and bj confound]
What is the best way to measure and model variation in my trait?
• Behavioral / psychological characteristics usually Likert
– Might use ipsative?
• What if measurement invariance does not hold?
– How do we judge:
  • Development
  • GxE interaction
  • Sex limitation
• Start simple: Finding group differences in mean
Simulation Study (MK)
• Generate true factor score f ~ N(0,1)
• Generate item errors ej ~ N(0,1)
• Obtain vector of j item scores sj = L*fj + ej
• Repeat N times to obtain sample
• Compute sum score
• Estimate factor score by ML
Two measures of performance
• Reliability
• Validity
Simulation parameters
• 10 binary item scale
• Thresholds: [-1.8 -1.35 -0.9 -0.45 0.0 0.45 0.9 1.35 1.8]
• Factor loadings: [.30 .80 .43 .74 .55 .68 .36 .61 .49]
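The data-generating steps of the simulation can be sketched with the thresholds and loadings listed above. This is a rough illustration, not the script used in the study: it assumes an item is scored 1 when its continuous score exceeds its threshold, it uses only the values shown on the slide, and the function names and seed are hypothetical.

```python
import random

# Values as listed on the slide
THRESHOLDS = [-1.8, -1.35, -0.9, -0.45, 0.0, 0.45, 0.9, 1.35, 1.8]
LOADINGS = [0.30, 0.80, 0.43, 0.74, 0.55, 0.68, 0.36, 0.61, 0.49]


def simulate(n, loadings, thresholds, seed=1):
    """Generate n true factor scores and the matching binary item vectors."""
    rng = random.Random(seed)
    factors, items = [], []
    for _ in range(n):
        f = rng.gauss(0.0, 1.0)  # true factor score f ~ N(0,1)
        # continuous score l*f + e with e ~ N(0,1), dichotomized at threshold
        row = [1 if l * f + rng.gauss(0.0, 1.0) > t else 0
               for l, t in zip(loadings, thresholds)]
        factors.append(f)
        items.append(row)
    return factors, items


def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


factors, items = simulate(2000, LOADINGS, THRESHOLDS)
sum_scores = [sum(row) for row in items]
print(correlation(factors, sum_scores))
```

The printed correlation between the true factor score and the sum score is one possible way to summarize how well the sum score recovers the factor; the slides' own performance measures are not reproduced here.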
Mess up measurement parameters
• Randomly reorder thresholds
• Randomly reorder factor loadings
• Blend reordered estimates with originals 0% - 100% ‘doses’
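One way to read the ‘dose’ manipulation is as a linear blend between the original and the randomly reordered parameter vector. The slides do not state the blending rule, so the linear form below is an assumption, and the function name is hypothetical.

```python
import random


def blend(original, dose, rng):
    """Mix a parameter vector with a random reordering of itself.

    dose = 0.0 returns the original values; dose = 1.0 returns the
    reordered values. The linear mixing rule is an assumption.
    """
    shuffled = original[:]
    rng.shuffle(shuffled)
    return [(1.0 - dose) * o + dose * s for o, s in zip(original, shuffled)]


rng = random.Random(2008)
thresholds = [-1.8, -1.35, -0.9, -0.45, 0.0, 0.45, 0.9, 1.35, 1.8]
for dose in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(dose, blend(thresholds, dose, rng))
```

Applying the blended thresholds and loadings to one group but not the other gives a graded amount of measurement non-invariance between groups.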
Measurement non-invariance
• Which works better: ML or Sum score?
• Three tests:
– SEM: likelihood ratio test of difference in latent factor mean
– ML factor score t-test
– Sum score t-test
MNI figures
More Factors: Common Pathway Model
More Factors: Independent Pathway Model
Independent pathway model is submodel of 3 factor common pathway model
Example: Fat MZT MZA
Results of fitting twin model
ML A, C, E or P Factor Scores
• Compute joint likelihood of data and factor scores
– p(FS, Items) = p(Items|FS) * p(FS)
– Works for non-normal FS distribution
• Step 1: Estimate parameters of (CP/IP) (moderated) factor model
• Step 2: Maximize likelihood of factor scores for each (family’s) vector of observed scores
– Plug in estimates from Step 1
Business end of FS script
The guts of it
! Residuals only
Shell script to FS everyone
Central Limit Theorem
Additive effects of many small factors
[Figure: four histograms of phenotype frequencies]
• 1 Gene: 3 Genotypes, 3 Phenotypes
• 2 Genes: 9 Genotypes, 5 Phenotypes
• 3 Genes: 27 Genotypes, 7 Phenotypes
• 4 Genes: 81 Genotypes, 9 Phenotypes
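The genotype and phenotype counts can be reproduced by enumeration. This sketch assumes biallelic loci with equal, purely additive effects, coding each locus as 0, 1 or 2 increasing alleles; it counts genotype classes and does not weight them by population frequency. The function name is illustrative.

```python
from collections import Counter
from itertools import product


def phenotype_distribution(n_genes):
    """Number of genotype classes giving each additive phenotype value."""
    counts = Counter(sum(g) for g in product((0, 1, 2), repeat=n_genes))
    return dict(sorted(counts.items()))


for n in (1, 2, 3, 4):
    dist = phenotype_distribution(n)
    # genes, genotypes (3^n), distinct phenotypes (2n + 1)
    print(n, sum(dist.values()), len(dist))
```

As the number of genes grows, the counts pile up around the middle phenotype values, which is the bell shape the slide illustrates.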
Measurement artifacts
• Few binary items
• Most items rarely endorsed (floor effect)
• Most items usually endorsed (ceiling effect)
• Items more sensitive at some parts of distribution
• Non-linear models of item-trait relationship
Assessing the distribution of latent trait
• Schmitt et al 2006 MBR method
• N-variate binary item data have $2^N$ possible patterns
• Normal theory factor model predicts pattern frequencies
– E.g., high factor loadings but different thresholds
– 0 0 0 0
– 0 0 0 1 but 0 0 1 0 would be uncommon
– 0 0 1 1
– 0 1 1 1
– 1 1 1 1
[Figure: thresholds of items 1 to 4]
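The predicted pattern frequencies can be illustrated numerically. This sketch assumes a standard normal factor, standardized liabilities and a plain equally spaced grid in place of Gaussian quadrature; the loadings and thresholds are made-up values chosen so that item 4 is the easiest to endorse.

```python
import math
from itertools import product


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def pattern_probabilities(loadings, thresholds,
                          n_points=201, lo=-6.0, hi=6.0):
    """Probability of every binary response pattern under a normal factor."""
    step = (hi - lo) / (n_points - 1)
    grid = [lo + i * step for i in range(n_points)]
    weights = [math.exp(-0.5 * f * f) for f in grid]  # N(0,1) shape
    total = sum(weights)
    weights = [w / total for w in weights]            # normalize to sum to 1
    probs = {}
    for pattern in product((0, 1), repeat=len(loadings)):
        p_pattern = 0.0
        for f, w in zip(grid, weights):
            p = w
            for y, l, t in zip(pattern, loadings, thresholds):
                p_yes = norm_cdf((l * f - t) / math.sqrt(1.0 - l * l))
                p *= p_yes if y == 1 else 1.0 - p_yes
            p_pattern += p
        probs[pattern] = p_pattern
    return probs


# Hypothetical values: high loadings, thresholds decreasing from item 1 to 4
probs = pattern_probabilities([0.9] * 4, [1.5, 0.5, -0.5, -1.5])
for pattern in sorted(probs, key=probs.get, reverse=True)[:5]:
    print(pattern, round(probs[pattern], 4))
```

Comparing these predicted frequencies with the observed pattern counts is the basis for judging whether the normal factor assumption fits.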
Latent Trait (Factor) Model
[Path diagram: factor F1 with loadings l1, ..., l6 (discrimination) on items M1 to M6, each with a threshold (difficulty).]
Use Gaussian quadrature weights to integrate over factor; then relax constraints on weights
Difference in model fit: LRT ~ $\chi^2$
Chi-squared test for non-normality performs well
Detecting latent heterogeneity
[Figure: scatterplot of S1 against S2 for 2 classes, with the class means Mean S1|c1, Mean S1|c2, Mean S2|c1 and Mean S2|c2 marked.]
[Figure: scatterplot of the same 2 classes with closer means.]
Latent heterogeneity: Factors or classes?
[Figure: scatterplot of S1 against S2 for the 2 classes.]
Latent Profile Model
[Path diagram: two classes with class membership probabilities p (Class 1) and 1-p (Class 2). Within class c, the items S1|c, S2|c, S3|c, ..., Sp|c each have a class-specific mean (m1|c, m2|c, m3|c, ..., mp|c) and residual (e1|c, e2|c, e3|c, ...).]
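Given estimated class means and residual variances, the class membership probability for one person follows from Bayes' rule. The sketch below assumes two classes, normally distributed residuals and items that are independent within class; all names and values are illustrative.

```python
import math


def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def class1_posterior(scores, p, means1, vars1, means2, vars2):
    """Posterior probability of Class 1 given one person's item scores."""
    like1 = p          # prior probability of Class 1
    like2 = 1.0 - p    # prior probability of Class 2
    for x, m1, v1, m2, v2 in zip(scores, means1, vars1, means2, vars2):
        like1 *= normal_pdf(x, m1, v1)
        like2 *= normal_pdf(x, m2, v2)
    return like1 / (like1 + like2)


# Hypothetical class parameters for three items
means1, vars1 = [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]
means2, vars2 = [2.0, 2.0, 2.0], [1.0, 1.0, 1.0]
print(class1_posterior([0.1, -0.3, 0.4], 0.5, means1, vars1, means2, vars2))
```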
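Factor Mixture Model
[Path diagram: the single factor model (factor F with variance 1.00, loadings l1, l2, l3, ..., lm on items S1, S2, S3, ..., Sm, residuals e1, e2, e3, ...) repeated within each of two classes, with class membership probabilities p (Class 1) and 1-p (Class 2).]
NB means omitted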
Classes or Traits? A Simulation Study
• Generate data under:
– Latent class models
– Latent trait models
– Factor mixture models
• Fit above 3 models to find best-fitting model
– Vary number of factors
– Vary number of classes
• See Lubke & Neale Multiv Behav Res (2007 & In press)
What to do about conditional data
• Two things
– Different base rates of “Stem” item
– Different correlation between Stem and “Probe” items
• Use data collected from relatives
Data from Relatives: Likely failure of conditional independence
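[Path diagram: for each of two relatives, a factor (F1, F2) with loadings l1, ..., l6 on a STEM item and probe items P2 to P6; the two factors correlate, R > 0.]
Series of bivariate integrals
$$\prod_{j=1}^{m/2} \left( \int_{t_{1,i-1}}^{t_{1,i}} \int_{t_{2,i-1}}^{t_{2,i}} \phi(x_1, x_2)\, dx_1\, dx_2 \right)_j$$
[Figure: bivariate normal density surface, both axes from -3 to 3.]
• Can work with p-variate integration, best if p<m
• “Generalized MML” built into Mx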
Dependence 1
Did your use of it cause you physical problems or make you depressed or very nervous?
Consequence: physical & psychological
[Figure: curves on a 0 to 1 vertical axis over a horizontal axis from -4 to 4, one each for cannabis, cocaine, stimulants, sedatives, opioids and hallucinogens.]
Extensions to More Complex Applications
• Endophenotypes
• Linkage Analysis
• Association Analysis
Basic Linkage (QTL) Model
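• Q: QTL Additive Genetic
• F: Family Environment
• E: Random Environment
• 3 estimated parameters: q, f and e
• Every sibship may have different model
• $\hat{\pi} = p(\mathrm{IBD}=2) + 0.5\, p(\mathrm{IBD}=1)$
[Path diagram: phenotypes P1 and P2 of a sib pair, each with paths q, f and e from its own Q, F and E (all with variance 1); F1 and F2 are connected by a path of 1, and Q1 and Q2 by Pihat.]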
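The Pihat definition and its role in the model can be sketched as below. The expected covariance matrix is read off the path diagram by standard tracing rules, assuming unit-variance Q, F and E, a correlation of 1 between F1 and F2 and a correlation of Pihat between Q1 and Q2; the function names and numbers are illustrative.

```python
def pihat(p_ibd2, p_ibd1):
    """Estimated proportion of alleles shared IBD at the locus."""
    return p_ibd2 + 0.5 * p_ibd1


def expected_sib_covariance(q, f, e, pi_hat):
    """2 x 2 expected covariance matrix of P1 and P2 for one sib pair."""
    variance = q * q + f * f + e * e
    covariance = pi_hat * q * q + f * f
    return [[variance, covariance], [covariance, variance]]


# Hypothetical IBD probabilities for one pair: p(IBD=2), p(IBD=1)
pi = pihat(0.25, 0.50)
print(pi)
print(expected_sib_covariance(0.5, 0.6, 0.7, pi))
```

Because Pihat varies from pair to pair, each sibship gets its own expected covariance matrix, which is why every sibship may have a different model.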
Measurement Linkage (QTL) Model
• Q: QTL Additive Genetic
• F: Family Environment
• E: Random Environment
• 3 estimated parameters: q, f and e
• Every sibship may have different model
• $\hat{\pi} = p(\mathrm{IBD}=2) + 0.5\, p(\mathrm{IBD}=1)$
[Path diagram: the basic linkage model (paths q, f and e, with Pihat between Q1 and Q2) combined with a measurement model in which items M1 to M6 are linked by loadings l1, ..., l6 for each sib.]
Fulker Association Model
Multilevel model for the means
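[Path diagram: phenotypes LDL1 and LDL2 of a sib pair with a constant M for the means (m). Genotype scores Geno1 and Geno2 (G1, G2) are combined into S and D through fixed paths of ±0.50 and ±1.00, which feed B and W with effects b and w on the means. Other labels in the diagram: RR, C, 0.75.]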
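The between/within idea behind this means model can be sketched for a single sib pair. This is a simplified reading, assuming the between component is the pair's average genotype score and the within component is each sib's deviation from that average; it is not a transcription of the diagram's exact parameterization, and the function name is hypothetical.

```python
def fulker_means(g1, g2, m, b, w):
    """Expected phenotype means for two sibs under a between/within model."""
    between = 0.5 * (g1 + g2)   # pair average of the genotype scores
    within1 = g1 - between      # equals 0.5 * (g1 - g2)
    within2 = g2 - between      # equals -0.5 * (g1 - g2)
    mean1 = m + b * between + w * within1
    mean2 = m + b * between + w * within2
    return mean1, mean2


# Hypothetical genotype scores and effects
print(fulker_means(1.0, -1.0, m=3.0, b=0.2, w=0.3))
```

When b equals w the model reduces to an ordinary regression of the phenotype on the individual genotype score; a difference between b and w is the usual sign of population stratification.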
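Measurement Fulker Association Model (SM)
[Path diagram: the Fulker association model (constant M with means m; genotype scores Geno1 and Geno2 as G1 and G2, combined into S and D through fixed paths of ±0.50 and ±1.00, feeding B and W with effects b and w) combined with a measurement model in which factors F1 and F2 are each measured by items M1 to M6 through loadings l1, ..., l6.]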
Multivariate Linkage & Association Analyses
• Computationally burdensome
• Distribution of test statistics questionable
• Permutation testing possible
– Even heavier burden
• Potential to refine both assessment and genetic models
• Lots of long & wide datasets on the way
– Dense repeated measures EMA
– fMRI
– Need to improve software! Open source Mx