1
Variable selection for factor analysis and
structural equation models
Yutaka Kano & Akira Harada
Osaka University
International Symposium on Structural Equation Modeling, at Chicago, Dec. 13-15, 2000
2
SEM has come to Japan
3
SEM in Japan: Japanese Books
• Toyoda (1992). CSA with SAS
• Toyoda, et al. (1992). Exploring Causality: An Introduction to CSA
• Kano (1997). CSA with Amos, Eqs and Lisrel
• Toyoda (1998). SEM: Introductory Course
• Toyoda (editor, 1998). SEM: Case Studies
• Yamamoto and Onodera (editor, 1999). CSA with Amos
• Toyoda (2000). SEM: Advanced Course
4
SEM in Japan: Tutorial Seminars (organized by academic societies)
• Behaviormetric Society of Japan
  • 1995, 1998, 2000
• Japan Statistical Society
  • 1999
• Japan Psychological Association
  • 1998
• Japanese Association of Educational Psychology
  • 1999
5
SEM in my class (graduate course)
1. What can SEM do?
  • Path analysis, CFA, Multiple indicator analysis
2. How to create a program file
3. How to read an output file
  • Fit index, standardization, decomposition of effects
6
4. CFA and model modification
  • Hypotheses on loadings
  • Analysis of MTMM matrix
  • LM and Wald tests
  • MIMIC model
5. Extended models
  • Mean structure model
  • Multi-sample analysis
  • Multi-sample analysis with mean structure
  • Model with binary independent variables
7
6. Other useful models
  • Analysis of experimental data with SEM
    • ANOVA, ANCOVA, MANOVA, Latent mean analysis
  • Longitudinal data and 3-mode data analysis
    • Latent curve model
    • Additive model, direct-product model, PARAFAC
7. Other topics
  • EFA versus CFA
  • Cautionary notes on causal analysis
  • Improper solution
  • Variable selection
8. Software
  • LISREL, EQS, AMOS, CALIS, SEPATH, etc.
8
Variable selection in factor analysis
Exploratory analysis
• SEFA (Stepwise variable selection in EFA)
  • http://koko15.hus.osaka-u.ac.jp/~harada/sefa2001/stepwise/
Confirmatory analysis
• SCoFA (Stepwise Confirmatory FA)
  • http://koko16.hus.osaka-u.ac.jp/~harada/scofa/input.html
9
Input Data
What SEFA or SCoFA needs:
• correlation matrix
• sample size
• the number of variables
• the number of factors
• and Internet!!
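The four inputs can be assembled from raw scores before opening the web form. The following is a minimal sketch, assuming the raw data are available as rows of observations; the helper name `correlation_matrix` and the toy numbers are hypothetical and are not part of SEFA or SCoFA.

```python
import math

def correlation_matrix(rows):
    """Pearson correlation matrix of data given as a list of observation rows."""
    n = len(rows)
    p = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    # Center every column at its mean.
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]
    # Sums of squares and cross-products of the centered columns.
    sscp = [[sum(c[i] * c[j] for c in centered) for j in range(p)]
            for i in range(p)]
    sd = [math.sqrt(sscp[j][j]) for j in range(p)]
    return [[sscp[i][j] / (sd[i] * sd[j]) for j in range(p)] for i in range(p)]

# Made-up toy data for illustration only: 5 observations on 3 variables.
data = [
    [1.0, 2.0, 0.5],
    [2.0, 4.1, 0.1],
    [3.0, 5.9, 0.9],
    [4.0, 8.2, 0.3],
    [5.0, 9.8, 0.7],
]

R = correlation_matrix(data)   # correlation matrix
sample_size = len(data)        # n
num_variables = len(data[0])   # p
num_factors = 1                # k, chosen by the analyst
```

Only the correlation matrix and the three counts are needed, so the raw data never have to leave the analyst's machine.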
10
Illustration
Data
• 24 Psychological variables
  • p=24, n=145, k=4
• Jöreskog (1978, Psychometrika)
  • Analyzed it with EFA and CFA
  • EFA: Chi-square=227.14, P-value=0.021
  • CFA: Chi-square=301.83, P-value=0.001
11
Web page for input
12
Web page for input
13
24 Psychological variables: Exploratory analysis
14
15
Predicted Chi-Squares in EFA
[Figure: predicted chi-square for each of the 24 one-variable-deleted models, X1 to X24; vertical axis from 175 to 220]
Critical value: $\chi^2_{167}(0.05) = 198.154$
16
17
24 Psychological variables: Confirmatory analysis
18
Specify factor loading matrix
19
Original Model (p=24)
$\chi^2_{231}(0.05) = 267.45$
20
Predicted P-values in CFA
[Figure: predicted P-value for each of the 24 one-variable-deleted models, X1 to X24; vertical axis from 0.0000 to 0.0120]
21
X3-deleted Model (p=23)
22
X3, X11-deleted Model (p=22)
23
Final results
EFA
• Chi-square=227.14 (df=186), P-value=0.021
• Delete X11
• Chi-square=190.01 (df=167), P-value=0.107
CFA
• Chi-square=301.83 (df=231), P-value=0.001
• Delete X3, X11
• Chi-square=220.17 (df=189), P-value=0.060
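The degrees of freedom and P-values reported here can be cross-checked by hand. The sketch below assumes the standard degrees-of-freedom count for a k-factor EFA model, df = ((p - k)^2 - (p + k)) / 2, and swaps in the Wilson-Hilferty normal approximation for the exact chi-square tail probability so that only the standard library is needed. The function names are hypothetical, and the CFA degrees of freedom are taken from the slide rather than derived.

```python
import math

def efa_df(p, k):
    """Degrees of freedom of a k-factor EFA model for p observed variables."""
    return ((p - k) ** 2 - (p + k)) // 2

def chi2_sf_approx(x, df):
    """Approximate upper-tail chi-square probability (Wilson-Hilferty)."""
    mean = 1.0 - 2.0 / (9.0 * df)
    sd = math.sqrt(2.0 / (9.0 * df))
    z = ((x / df) ** (1.0 / 3.0) - mean) / sd
    # Upper tail of the standard normal distribution at z.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

results = {
    "EFA, p=24": (227.14, efa_df(24, 4)),
    "EFA, X11 deleted": (190.01, efa_df(23, 4)),
    "CFA, p=24": (301.83, 231),              # df as reported on the slide
    "CFA, X3 and X11 deleted": (220.17, 189),  # df as reported on the slide
}

for label, (chi2, df) in results.items():
    print(f"{label}: chi-square={chi2} (df={df}), "
          f"approx. P-value={chi2_sf_approx(chi2, df):.3f}")
```

For p = 23 and k = 4 the formula gives 167 degrees of freedom. An exact tail function from a statistics library would serve equally well in place of the approximation.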
24
Theory of SEFA and SCoFA
Obtain estimates for a current model
Construct the predicted chi-square for each one-variable-deleted model using the estimates, without tedious iterations
We will take a sort of LM approach
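The control flow of one backward pass can be sketched as follows. This is only an outline under stated assumptions: `predict_chi2` is a hypothetical callable standing in for the LM-type prediction, `critical_value` is a hypothetical callable returning the cutoff for a model with a given number of variables, and the automatic stopping rule is an assumption (the web programs print the predicted values and leave the decision to the user). The toy predictor uses made-up numbers.

```python
def backward_step(variables, predict_chi2):
    """Return (variable, predicted chi-square) for the best single deletion.

    predict_chi2(variables, candidate) is a hypothetical callable giving the
    predicted chi-square of the current model with `candidate` removed.
    """
    scores = {v: predict_chi2(variables, v) for v in variables}
    best = min(scores, key=scores.get)
    return best, scores[best]

def backward_select(variables, predict_chi2, critical_value):
    """Delete one variable at a time until the predicted fit is acceptable."""
    current = list(variables)
    deleted = []
    while len(current) > 1:
        best, chi2 = backward_step(current, predict_chi2)
        current.remove(best)
        deleted.append(best)
        # In practice the reduced model would be re-estimated here before
        # predictions for the next step are computed.
        if chi2 <= critical_value(len(current)):
            break
    return current, deleted

# Made-up predictor for illustration: pretends X2 is the misfitting variable.
def toy_predict(variables, candidate):
    return 5.0 if candidate == "X2" else 20.0 + len(variables)

kept, dropped = backward_select(
    ["X1", "X2", "X3", "X4"], toy_predict, lambda p: 10.0
)
```

Each step needs only one fit of the current model plus one cheap prediction per candidate, which is the point of avoiding iterations for every one-variable-deleted model.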
25
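Known quantities and goal
Current Model and Statistics
MODEL: $V(\mathbf{X}) = \Sigma(\theta)$
MLE: $\hat\theta$
STATISTICS: $T_0$: $H_0: V(\mathbf{X}) = \Sigma(\theta)$ vs $A: V(\mathbf{X})$ is saturated
What we want is
$T_2$: $H_2: V(\mathbf{X}_2) = \Sigma_{22}(\theta)$ vs $A: V(\mathbf{X}_2)$ is saturated
where
$\mathbf{X} = [X_1, X_2, \ldots, X_p]'$: observed vector (in a current model)
$X_1$: possibly inconsistent variable to be examined
$\mathbf{X}_2$: $\mathbf{X}$ with $X_1$ removed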
26
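Basic idea
New test statistics to be used:
$T_0$: $H_0: V(\mathbf{X}) = \Sigma(\theta)$ vs $A: V(\mathbf{X})$ is saturated
$T_2$: $H_2: V(\mathbf{X}_2) = \Sigma_{22}(\theta)$ vs $A: V(\mathbf{X}_2)$ is saturated
$T_{2'}$: $H_{2'}: V(\mathbf{X}) = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22}(\theta) \end{bmatrix}$ vs $A: V(\mathbf{X})$ is saturated
$T_{02'}$: $H_0: V(\mathbf{X}) = \Sigma(\theta)$ vs $H_{2'}: V(\mathbf{X}) = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22}(\theta) \end{bmatrix}$
$$T_2 = T_{2'} = T_0 - (T_0 - T_{2'}) \overset{a}{=} T_0 - T_{02'}$$
We construct $T_{02'}$ as an LM test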
27
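Final formula for $T_2$
$$T_2 = T_0 - T_{02'} = n\, v(S_{22} - \Sigma_{22}(\hat\theta))' \left[ N_{22}^{-1} - N_{22}^{-1} \Delta (\Delta' N_{22}^{-1} \Delta)^{-1} \Delta' N_{22}^{-1} \right] v(S_{22} - \Sigma_{22}(\hat\theta))$$
where $\Delta = \partial v(\Sigma_{22}(\theta)) / \partial \theta'$ is evaluated at $\hat\theta$ and $N_{22}$ is the weight matrix for $v(S_{22})$
Note: This is Browne's (1982) statistic of goodness-of-fit using general estimates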
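A statistic of Browne's type is a quadratic form in the residual vector, $n\, r' [W^{-1} - W^{-1} D (D' W^{-1} D)^{-1} D' W^{-1}] r$, so it can be evaluated with a few linear solves and no iterative fitting. The sketch below assumes NumPy is available; the function name and the argument names `residual`, `weight`, and `jacobian` are hypothetical, and the toy inputs are made up to show the algebra rather than taken from any factor model.

```python
import numpy as np

def browne_statistic(residual, weight, jacobian, n):
    """Evaluate n * r' [W^-1 - W^-1 D (D' W^-1 D)^-1 D' W^-1] r.

    residual: vector r of discrepancies between sample and fitted moments
    weight:   symmetric positive definite weight matrix W
    jacobian: matrix D of derivatives of the fitted moments w.r.t. parameters
    n:        multiplier (sample size)
    """
    r = np.asarray(residual, dtype=float)
    W = np.asarray(weight, dtype=float)
    D = np.asarray(jacobian, dtype=float)
    Winv_r = np.linalg.solve(W, r)   # W^-1 r
    Winv_D = np.linalg.solve(W, D)   # W^-1 D
    q = D.T @ Winv_r                 # D' W^-1 r
    # Portion of r' W^-1 r explained by the column space of D.
    explained = q @ np.linalg.solve(D.T @ Winv_D, q)
    return float(n * (r @ Winv_r - explained))

# Made-up toy inputs: identity weight and a single parameter direction.
W = np.eye(3)
D = np.array([[1.0], [0.0], [0.0]])
T = browne_statistic([1.0, 2.0, 0.0], W, D, n=10)
```

With these toy inputs the first residual component is absorbed by the parameter direction, leaving only the second component to contribute to the statistic.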
28
Summary 1
We introduced goodness-of-fit as a criterion for variable selection in factor analysis
You can easily access the programs on the internet
• SEFA (Stepwise variable selection in EFA)
  • http://koko15.hus.osaka-u.ac.jp/~harada/sefa2001/stepwise/
• SCoFA (Stepwise Confirmatory FA)
  • http://koko16.hus.osaka-u.ac.jp/~harada/scofa/input.html
29
Summary 2
They print predicted values of fit indices for each one-variable-deleted model [one-variable-added models]
• Chi-square, GFI, AGFI, CFI, IFI, RMSEA
They will be useful in many situations, including scale construction
High communality variables can be inconsistent
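Several of the listed indices are simple functions of the chi-square, so a predicted chi-square yields a predicted index directly. As one example, the sketch below uses a common RMSEA point-estimate formula, sqrt(max(chi-square - df, 0) / (df (n - 1))). Whether SEFA and SCoFA use exactly this convention is not stated in the slides (some programs use n in place of n - 1), and the function name is hypothetical. The inputs are the reported chi-squares from the illustration with n = 145.

```python
import math

def rmsea(chi2, df, n):
    """RMSEA point estimate, using n - 1 as the sample-size multiplier."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

n = 145
efa_full = rmsea(227.14, 186, n)      # EFA, all 24 variables
cfa_full = rmsea(301.83, 231, n)      # CFA, all 24 variables
cfa_reduced = rmsea(220.17, 189, n)   # CFA, X3 and X11 deleted

print(f"EFA p=24: {efa_full:.3f}")
print(f"CFA p=24: {cfa_full:.3f}")
print(f"CFA p=22: {cfa_reduced:.3f}")
```

Because the index falls as the chi-square approaches its degrees of freedom, ranking candidate deletions by predicted RMSEA and by predicted P-value will often, though not always, point to the same variables.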
30
References for variable selection
Kano, Y. (in press). Variable selection for structural models. Journal of Statistical Planning and Inference.
Kano, Y. and Harada, A. (2000). Stepwise variable selection in factor analysis. Psychometrika, 65, 7-22.
Kano, Y. and Ihara, M. (1994). Identification of inconsistent variates in factor analysis. Psychometrika, 59, 5-20.