
7/9/10


Alan C. Acock
University Distinguished Professor of Family Studies & Knudson Chair for Family Research & Policy
Oregon State University

College of Health and Human Sciences Summer Workshop Series

July 2010

Introduction to Mplus

A brief history
• LISREL (Jöreskog and Sörbom) was developed in the late 1960s and released commercially in the early 1970s
• Originally relied on entering 8 matrices specifying all the parameters that were being estimated or fixed at a certain value
• Today it has a graphic interface that generates the commands from a path diagram
• Extremely capable alternative to Mplus

Alan C. Acock, July, 2010 Introduction to MPlus 1


A brief history
• EQS (Bentler) was developed much later and replaced the matrices with a separate equation written out for each relationship
• It now has a nice “Diagrammer”


A brief history
• AMOS (now an SPSS product) was developed based on a graphic interface
• It has been the slowest to introduce new capabilities



A brief history--Mplus
• Version 6: April 2010
• Version 5: November 2007
• Version 4: February 2006
• Version 3: March 2004
• Version 2: February 2001
• Version 1: November 1998


A brief history--Mplus
• Very rapid development
• Late development allowed a non-graphic interface to be highly efficient
• Destroys the idea that a picture is worth 1000 words: develop statistical applications, not drawing
• You need a separate drawing program, but this is best for publication quality
    ◦ OmniGraffle (Mac) was used for most figures here
    ◦ Office Visio or OpenOffice Draw (PC)



A Brief History--Mplus
• Bengt Muthén is the statistician
• Linda Muthén is the language/interface/business
• Several people have contributed programming
• Economy of scale idea is reversed
    ◦ Microsoft has 40,000 programmers, so it takes a long time to make a useful change
    ◦ Mplus has a couple of programmers, so it rapidly adds features
• Many new features are added between versions

Buying Mplus
• Greatly reduced student prices
• There are three modules (they apparently learned this module idea from SPSS). You probably want all three
• There is an annual maintenance fee, and this lets you
    ◦ Get “free” support
    ◦ Get “free” updates
• I started with Mplus 3.0 and have only paid the annual maintenance fee ($175) since then


Resources for Learning
• Barbara Byrne. (2010). Structural Equation Modeling with Mplus: Basic Concepts, Applications, and Programming. (Was to be available July 1.)
• www.statmodel.com
    ◦ Large, 752-page User’s Manual as a PDF file
    ◦ Short courses on video
        ▪ There are 8 of these; each is one day long
        ▪ Download handouts to follow the videos


Resources for Learning
• www.ats.ucla.edu/stat/seminars/
    ◦ UCLA has several online examples and videos
• We will use files from the Mplus manual for many of our examples. These typically involve simulated data. Sometimes we will assign hypothetical variable names to make these somewhat realistic
• Brown, Timothy. (2006). Confirmatory Factor Analysis for Applied Research. N.Y.: Guilford
• Kline, Rex. (2010). Principles and Practice of Structural Equation Modeling (3rd ed.). N.Y.: Guilford



The Mplus Interface


The Mplus Interface
• The window shown above is the input window
• You write Mplus programs in this window to read the data to be analyzed and to specify your model of interest
• You then save your Mplus program and select Run Mplus from the Mplus menu to submit your program to the Mplus engine for processing
• File ► Open
    ◦ The example files are located at c:\Program Files\Mplus\Mplus Examples\User’s Guide Examples\


Mplus Command Structure
• TITLE: (optional unless you want to know what the file is intended to do)
• DATA: (required)
• VARIABLE: (required)
• DEFINE: (some data transformations are available)
• SAVEDATA: (used for specialized applications)
• ANALYSIS: (for special analyses such as EFA)
• MODEL: (a series of equations)
• OUTPUT: (many options are available)
• MONTECARLO: (used for simulations, power analysis)

Mplus Commands
• The TITLE command allows you to specify a title
    ◦ This can go on for many lines and usually should
    ◦ Everything is a title until a command name appears at the start of a new line
    ◦ I like to put the file name as the first line of a title
• The DATA command specifies where Mplus will locate the data and the format of the data. Mplus will read the following file formats:
    ◦ tab-delimited text,
    ◦ space-delimited text, and
    ◦ comma-delimited text
• The input data file may contain records in free field format or fixed format
• If you are using data stored in another form (e.g., Stata, SAS, SPSS, or Excel), you will need to convert it to one of the formats with which Mplus can work


Here is the Data file


Data
• We have labeled missing values with a -9. It is easiest to pick one value that will work for all variables; it can be any number or a dot
• Notice we have one observation, case 13, that has a missing value on all variables
• The data happen to be in a fixed format
• Could be a comma-delimited csv file from Excel
• Variable labels and value labels are missing
• If you have the data in Stata you can use stata2mplus to set things up for you


Data
    stata2mplus using example, replace
• This creates two files:
    ◦ classnsfh.inp, which will run a basic analysis in Mplus, and
    ◦ classnsfh.dat, a comma-delimited ASCII file that Mplus can read, with all missing values coded/recoded as -9999.

1,1,1,1,2,1,1,1,1,1,1,1,2,1,1

3,2,2,3,2,3,2,2,2,2,2,2,2,2,2

. . .

1,1,1,-9999,1,2,2,1,2,2,2,2,2,1,1

3,3,3,3,3,3,3,3,3,3,3,3,3,3,3

2,2,1,2,1,2,2,1,1,2,1,-9999,1,1,1

Note: it is recommended to make a separate folder for each project. Save the data and Mplus programs in that folder.
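For readers whose data are not in Stata, the conversion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is a CSV export with a header row, the helper name `to_mplus_rows` and the variable names are hypothetical, and -9999 is used as the missing code to match the example above.

```python
import csv
import io

def to_mplus_rows(csv_text, missing_code="-9999"):
    """Return (variable_names, data_lines) for a CSV that has a header row.

    Empty cells and "." are replaced by missing_code, and the header row
    is dropped because the Mplus data file holds numbers only.
    """
    reader = csv.reader(io.StringIO(csv_text))
    names = next(reader)
    lines = []
    for row in reader:
        cells = [c.strip() for c in row]
        cells = [missing_code if c in ("", ".") else c for c in cells]
        lines.append(",".join(cells))
    return names, lines

# Hypothetical three-variable extract with two missing cells.
example = "say1,say2,say3\n1,2,1\n3,,2\n.,1,1\n"
names, lines = to_mplus_rows(example)
print("NAMES ARE " + " ".join(names) + ";")
print("MISSING ARE ALL (-9999);")
print("\n".join(lines))
```

The two printed option lines would go under the VARIABLE command, and the data lines would be saved as the .dat file in the project folder.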

Mplus commands
• The VARIABLE command names variables
    ◦ These must be in the identical order to the way Stata/SAS/SPSS wrote the data file (listing them in a different order is a common mistake)
    ◦ Mplus variable names may not have more than 8 characters. Change variable names to be 8 characters or fewer or you will get error messages.
• The ANALYSIS command tells Mplus what type of analysis to perform. It is often not needed
    ◦ Many analysis options are available.
    ◦ Some of these, such as Type = EFA, make additional commands unnecessary.


Mplus Rules
• All commands (Title, Data, etc.) begin on a new line
• All command names must be followed by a colon
    ◦ For example, Title: (the keyword becomes blue)
• Semicolons separate command options, similar to SAS
• The record length can be no longer than 80 columns
• Variables can contain upper and/or lower case letters
    ◦ Only variable names are case sensitive: SAY1 ≠ say1 ≠ Say1 ≠ SaY1
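Two of these rules (the 80-column record length and the 8-character limit on variable names) are easy to check before running a program. The following Python sketch is one way to do it; the function name `check_mplus_input` and the sample program are hypothetical, and the limits are simply the ones stated on these slides.

```python
def check_mplus_input(text, names, max_cols=80, max_name_len=8):
    """Return (long_line_numbers, long_names) for an input file and its variable names."""
    long_lines = [i for i, line in enumerate(text.splitlines(), start=1)
                  if len(line) > max_cols]
    long_names = [n for n in names if len(n) > max_name_len]
    return long_lines, long_names

# Hypothetical program whose third line runs past column 80.
program = "\n".join([
    "TITLE: check.inp",
    "DATA: FILE IS example.dat;",
    "VARIABLE: NAMES ARE " + " ".join("y%d" % i for i in range(1, 31)) + ";",
])
long_lines, long_names = check_mplus_input(program, ["satisfaction", "say1"])
print("Lines over 80 columns:", long_lines)
print("Names over 8 characters:", long_names)
```

A long NAMES list can be wrapped across several lines, since an option ends only at its semicolon.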

Default Assumptions


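[Figure: path diagram of a two-factor model. F1 is measured by x1-x3 and F2 by x4-x6, with error terms e1-e6; one loading for each factor is fixed at 1]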


Default Assumptions
• Mplus assumes that you either have no missing values or are using full information maximum likelihood estimation and assuming missing values are missing at random (MAR)
• Parameters such as loadings can be fixed
    ◦ Many loadings are fixed at 0.0 in CFA models because the item should not load on the factor.
    ◦ There is no path from F1 to X4 in our figure
• Fixed parameters can be “freed,” meaning you will estimate them
    ◦ We could add a path from F1 to X4 or
    ◦ Let E1 be correlated with E4
• Fixed parameters are required to stay at a specified value, such as 0.0 or 1.0

Default Assumptions
• All free parameters are put into a vector, and iterations change the values of these free parameters until the model’s fit is optimal
• Unless we tell it otherwise, Mplus will fix the first indicator’s loading at 1.0 as the reference indicator (except for EFA).
    ◦ For example, the loadings of x1 on F1 and of x4 on F2 are fixed at 1.0 by default.
    ◦ One way to change the reference indicator is to reorder the variables, e.g., F1 by x2 x1 x3 makes x2 the reference indicator
• It is good to pick a strong indicator as the reference indicator. You do not get a significance test for the reference indicator


Exploratory Factor Analysis


[Figure: path diagram of an exploratory factor analysis with factors F1 and F2, indicators y1-y6, and error terms e1-e6]

EFA with Continuous Variables


EFA with Continuous Variables

TITLE:    efa1.inp
          This is an example of an exploratory
          factor analysis with continuous
          factor indicators
DATA:     FILE IS "c:\Mplus Examples\efa4.dat";
VARIABLE: NAMES ARE y1-y12;
ANALYSIS: TYPE = EFA 1 4;
          !ESTIMATOR = ml;
          !ROTATION = Geomin;
OUTPUT:   sampstat;


EFA with Continuous Variables
• The TYPE = EFA 1 4 option tells Mplus to perform exploratory factor analysis
• The 1 and 4 following the EFA specification tell Mplus to generate all factor solutions from 1 through 4 factors
• The ESTIMATOR = ml option has Mplus use the maximum likelihood estimator to perform the factor analysis (default)
    ◦ This provides a chi-square goodness of fit test that the number of hypothesized factors is sufficient to account for the correlations among the 12 variables in the analysis
    ◦ The ESTIMATOR line has an exclamation mark in front of it, which makes it green. Anything green is a comment and is ignored by the program. This subcommand is not necessary because maximum likelihood estimation is the default

EFA with continuous variables
• Mplus uses the geomin rotation, which is oblique, as its default. More traditional rotations such as varimax are available. See help for a listing of options
• We do not need a MODEL command: TYPE = EFA 1 4 takes care of this.
• ML assumes multivariate normality. If you have reason to believe that this assumption has not been met and your sample is reasonably large (e.g., n ≥ 200), you may substitute mlm or mlmv in place of ml on the ESTIMATOR = line
    ◦ The mlm option provides a mean-adjusted chi-square model test statistic, whereas
    ◦ the mlmv option produces a mean- and variance-adjusted chi-square test of model fit.
• SEM users who are familiar with Bentler's EQS software program should also note that the mlm chi-square test and standard errors are equivalent to those produced by EQS in its ML:ROBUST method


EFA results
• Sample correlations
• Root Mean Square Error of Approximation (RMSEA)
• Chi-square test of the one, two, three, and four factor models, which is sensitive to
    ◦ Sample size (such that large samples often return statistically significant chi-square values)
    ◦ Non-normality in the input variables
• Standard errors and z-tests for loadings and correlations of factors

EFA results
• How many factors?
    ◦ Model 1 chi-square (54 degrees of freedom) = 1052.089; p < .001
    ◦ Model 2 chi-square (43 degrees of freedom) = 723.022; p < .001
    ◦ Model 3 chi-square (33 degrees of freedom) = 341.268; p < .001
    ◦ Model 4 chi-square (24 degrees of freedom) = 25.799; p not significant
• Is model 4 better than model 3?
    ◦ Model 3 chi-square (33 degrees of freedom) = 341.268
    ◦ Model 4 chi-square (24 degrees of freedom) = 25.799
    ◦ Difference chi-square (9 degrees of freedom) = 315.469; p < .001
• In Stata:
    . display 1-chi2(df,chi-square)
    . display 1-chi2(9,315.469)
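The same arithmetic can be reproduced without Stata. The Python sketch below uses only the standard library; the function names are hypothetical, the tail probability uses the closed-form series that holds for whole-number degrees of freedom, and the degrees-of-freedom formula is the usual one for an unrestricted EFA with p indicators and m factors, which reproduces the 54, 43, 33, and 24 reported above for 12 indicators.

```python
import math

def chi2_upper_tail(x, df):
    """P(X > x) for a chi-square variable with a positive integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= y / i
            total += term
        return math.exp(-y) * total
    # Odd df: start from the 1-df result and add the half-integer terms.
    total = math.erfc(math.sqrt(y))
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-y) * y ** (i - 0.5) / math.gamma(i + 0.5)
    return total

def efa_df(p, m):
    """Degrees of freedom for an EFA with p indicators and m factors."""
    return ((p - m) ** 2 - (p + m)) // 2

dfs = [efa_df(12, m) for m in range(1, 5)]
diff_chi2 = 341.268 - 25.799            # model 3 minus model 4
diff_df = efa_df(12, 3) - efa_df(12, 4)
p_diff = chi2_upper_tail(diff_chi2, diff_df)
p_model4 = chi2_upper_tail(25.799, 24)
print(dfs, round(diff_chi2, 3), diff_df, p_diff < 0.001, p_model4 > 0.05)
```

Stata's chi2() returns the cumulative probability, so 1-chi2(9,315.469) is the same upper-tail quantity that chi2_upper_tail computes here.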


EFA Categorical Variables
• For the purposes of illustration, suppose that you recode each variable into a replacement variable where values at the median or below are assigned a categorical value of 1.00 and all values above the median are assigned a value of 2.00.
• For categorical variables, Mplus automatically recodes the lowest value to zero with subsequent values increasing in units of 1.00.
• While the four underlying latent factors remain continuous, the 12 categorical observed variables' response values are now ordered dichotomous categories.
• You may use the program that appeared in the initial exploratory factor analysis example, with the following modifications and the new data file that contains the categorical variables, ex4.2.dat, as shown below.
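The recoding described above can be sketched in Python. This is an illustration only: the scores are made up, the function names are hypothetical, and the second function merely imitates the described recoding of the lowest category to zero.

```python
import statistics

def median_split(values):
    """Code values at or below the median as 1 and values above it as 2."""
    med = statistics.median(values)
    return [1 if v <= med else 2 for v in values]

def recode_from_zero(codes):
    """Map the sorted distinct codes to 0, 1, 2, ... (lowest category becomes 0)."""
    mapping = {code: i for i, code in enumerate(sorted(set(codes)))}
    return [mapping[c] for c in codes]

scores = [3, 1, 4, 1, 5, 9, 2, 6]   # hypothetical continuous scores
split = median_split(scores)
zero_based = recode_from_zero(split)
print(split)
print(zero_based)
```

The first list is what would be written to the data file; the second shows how the 1/2 codes would be treated internally as 0/1.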

EFA Categorical Variables
• There are two estimators.
    ◦ WLSMV (Weighted Least Squares, mean and variance adjustment) is very fast and reasonably good
        ▪ You should use this for initial runs; it is the default
        ▪ Running this on a server used by many students, it ran in 1 second
    ◦ MLR (Robust Maximum Likelihood). This is painfully slow, even for a simple and well-behaved example like the one we will estimate.
        ▪ Save this until you are almost done
        ▪ Use this when you need to test for the number of factors
        ▪ This took 18 minutes to run.
• Under the Analysis section you need to specify this estimator as shown below.


EFA Categorical Program

TITLE:    ex4.2.inp
          This is an example of an exploratory
          factor analysis with categorical factor
          indicators. It uses weighted least squares
          estimation. It computes tetrachoric
          correlations and does the factor analysis
          on them. The RMSEA and chi-square
          values are reported.
DATA:     FILE IS ex4.2.dat;
VARIABLE: NAMES ARE u1-u12;
          CATEGORICAL ARE u1-u12;
ANALYSIS: TYPE = EFA 1 4;
          !ESTIMATOR = MLR;   ! remove the leading ! to use MLR instead of the default WLSMV
          PROCESSORS = 4;

EFA Categorical Interpretation
• Run and interpret
    ◦ Univariate proportions for each variable
• Review 4 factor solution
    ◦ Chi-square is a bit problematic (Difftest requires CFA)
    ◦ CFI
    ◦ RMSEA
    ◦ SRMR
    ◦ GEOMIN (correlated solution)


EFA Categorical Interpretation
• Review 4 factor solution
    ◦ Factor correlations
    ◦ Residual variances
    ◦ Standard errors
    ◦ Est./S.E. = z-test

Comparison of Continuous and Categorical Solutions



EFA at 2 waves with factor loading invariance and correlated residuals
• Does a two-factor solution for measures of internal and external locus of control change after some life event?
• Use a large, longitudinal national survey to find a subsample of people who experienced some event such as divorce and who have their internal and external locus of control measured at a wave preceding and a wave following divorce
• The figure might look like the following, where items 1-3 and 7-9 measure internal locus of control and items 4-6 and 10-12 measure external locus of control






TITLE:    ex5.26.inp
          this is an example of an EFA
          at two time points
          with factor loading
          invariance and correlated
          residuals across time
DATA:     FILE IS ex5.26.dat;
VARIABLE: NAMES ARE y1-y12;
MODEL:    f1-f2 BY y1-y6 (*t1 1);
          f3-f4 BY y7-y12 (*t2 1);
          y1-y6 PWITH y7-y12;
OUTPUT:   TECH1 STANDARDIZED;



• The unstandardized loadings are equal, but the standardized loadings are not
• When comparing different people, groups, or waves, you want unstandardized coefficients
• Standardized coefficients are $\beta = B \cdot SD_x / SD_y$
• So β depends on the relationship, B, and the relative standard deviations
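A small numeric sketch shows why equal unstandardized coefficients can yield unequal standardized ones. It takes the usual definition, β = B × SD(x) / SD(y); the function name and all numbers are hypothetical.

```python
def standardized_beta(b, sd_x, sd_y):
    """Standardized coefficient from an unstandardized B and the two SDs."""
    return b * sd_x / sd_y

# Same unstandardized B at both waves, but x is more variable at wave 2.
beta_wave1 = standardized_beta(0.5, sd_x=2.0, sd_y=4.0)
beta_wave2 = standardized_beta(0.5, sd_x=3.0, sd_y=4.0)
print(beta_wave1, beta_wave2)
```

The relationship B is identical at both waves; only the relative standard deviations differ, which is why the unstandardized values are the ones to compare.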

• The f1-f2 by y1-y6 (*t1 1); statement means that factors 1 and 2 are measured by y1-y6
• The (*t1 1) means that f1 and f2 are a set of factors labeled t1, and the 1 is used to make the loadings invariant with any other set of factors that has a 1
• The f3-f4 by y7-y12 (*t2 1); statement means that f3 and f4 are a set of factors called t2
• The 1 after t2 means that the loadings must be the same as for t1, since that set also had a 1
• The y1-y6 pwith y7-y12; statement allows the paired errors to be correlated. It is equivalent to
      y1 with y7; y2 with y8; y3 with y9; y4 with y10; y5 with y11; y6 with y12;
• There is no name for the error terms, so y1 with y7 is not really y1 with y7 but e1 with e7. This is the only way to do it without naming the error terms; drawing the figure first helps.
• The tech1 option gives a series of LISREL-type matrices showing all the parameters being estimated
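The pairing done by PWITH can be made explicit with a short Python sketch that expands the two lists and pairs them position by position. The helper names are hypothetical, and the sketch only handles simple ranges of the form stem-number to stem-number.

```python
import re

def expand_range(spec):
    """Expand a list shorthand such as 'y1-y6' into ['y1', ..., 'y6']."""
    m = re.fullmatch(r"([A-Za-z_]+)(\d+)-([A-Za-z_]+)(\d+)", spec)
    if m is None or m.group(1) != m.group(3):
        raise ValueError("expected a range like y1-y6")
    stem, first, last = m.group(1), int(m.group(2)), int(m.group(4))
    return ["%s%d" % (stem, i) for i in range(first, last + 1)]

def expand_pwith(left, right):
    """Pair the two lists position by position, as PWITH does."""
    left_vars, right_vars = expand_range(left), expand_range(right)
    if len(left_vars) != len(right_vars):
        raise ValueError("PWITH needs the same number of variables on both sides")
    return ["%s with %s;" % pair for pair in zip(left_vars, right_vars)]

statements = expand_pwith("y1-y6", "y7-y12")
print(" ".join(statements))
```

By contrast, a plain WITH between the two lists would correlate every variable on the left with every variable on the right rather than only the matched pairs.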


Cautions when doing EFA
• Although one or more of the observed variables may be categorical, all latent variables in the model are continuous
• The analysis specification and the interpretation of the output, e.g., loadings and factor correlations, are the same whether one, a subset, or all observed variables are categorical
• Categorical observed variables may be dichotomous or ordered categorical (outcomes of more than two levels), but nominal-level observed variables with more than two categories may not be used in the analysis as outcome variables using this strategy
• Sample size requirements are more stringent than for continuous variables; typically you want a minimum of 200 cases (preferably more) to perform any analysis with categorical outcome variables

Cautions when doing EFA
• Mplus provides z-tests for all loadings and correlations
• The manual illustrates applications with censored and count variables
• Can apply mixture models with continuous indicators
    ◦ Are there subgroups that have different results?
          VARIABLE: NAMES = y1-y12;
                    CLASSES = c(2);
          ANALYSIS: TYPE = MIXTURE EFA 1 4;


Confirmatory Factor Analysis (CFA)
• What if you had an a priori hypothesis that the visual perception (Y1), cubes (Y2), and lozenges (Y3) variables belonged to a single factor, Visual,
• whereas the paragraph (Y4), sentence (Y5), and word meaning (Y6) variables belonged to a second factor, Cognitive?
• F1 will not have a loading on Y4, Y5, or Y6, and F2 will not have a loading on Y1, Y2, or Y3.
• The correlation of Y1 with Y4 is the loading of F1 on Y1 × the correlation of F1 and F2 × the loading of F2 on Y4
• The diagram shown below illustrates the model visually

Confirmatory Factor Analysis (CFA)


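[Figure: path diagram of a two-factor CFA. F1 is measured by y1-y3 and F2 by y4-y6, with error terms e1-e6; the first loading for each factor is fixed at 1]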


CFA means study the correlations


Study Correlation matrix
• Compare this to the following
    ◦ Y2 and Y1 have a different pattern with Y4-Y6. The single correlation between F1 and F2 could not handle this
• The fit will not be very good
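Why a single factor correlation cannot handle different patterns can be seen numerically. In the Python sketch below, all loadings and the factor correlation are hypothetical standardized values. Under the two-factor CFA, every cross-factor correlation is loading × factor correlation × loading, so the ratio of Y1's to Y2's correlation with any of Y4-Y6 is forced to be the same.

```python
loadings = {"y1": 0.8, "y2": 0.6, "y3": 0.7,   # hypothetical standardized loadings on F1
            "y4": 0.9, "y5": 0.5, "y6": 0.7}   # hypothetical standardized loadings on F2
factor_of = {"y1": "F1", "y2": "F1", "y3": "F1",
             "y4": "F2", "y5": "F2", "y6": "F2"}
phi = 0.5   # hypothetical correlation between F1 and F2

def implied_corr(a, b):
    """Model-implied correlation between two different indicators."""
    r = loadings[a] * loadings[b]
    if factor_of[a] != factor_of[b]:
        r *= phi          # cross-factor pairs pass through the factor correlation
    return r

ratios = [implied_corr("y1", y) / implied_corr("y2", y) for y in ("y4", "y5", "y6")]
print(implied_corr("y1", "y4"), ratios)
```

If the sample correlations of Y1 and Y2 with Y4-Y6 do not show this proportional pattern, the model cannot reproduce them and the fit suffers.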


Mplus CFA program

TITLE:    ex5.1.inp
          This program runs a CFA
DATA:     FILE IS ex5.1.dat;
VARIABLE: NAMES ARE y1-y6;
MODEL:    f1 BY y1-y3;
          f2 BY y4-y6;
OUTPUT:   sampstat stdyx mod(3.84);

CFA program
• You must define what parameters are estimated
    ◦ All other parameters are assumed to be fixed.
    ◦ Fixed parameters are either zero or some value you set.
• The default for analysis is FIML
    ◦ To do listwise deletion we would specify this in the DATA command
        ▪ Listwise = on;
        ▪ Put it under DATA:
• The MODEL command allows you to specify the parameters of your model
    ◦ Use the BY keyword to define the latent variables
    ◦ The latent variable name appears on the left-hand side of the BY keyword, whereas the measured variables appear on the right-hand side
    ◦ Mplus will fix the loading for the first indicator at 1.0 unless you tell it otherwise


CFA program
• WITH keyword: correlating the F1 latent factor with the F2 latent factor is a default
    ◦ BY means "measured by"
    ◦ WITH means "correlated with"
• We do not need F1 with F2 because that is the default. If we wanted to see how the model did with this correlation fixed at zero, we would add the line F1 with F2@0;
• The OUTPUT: command contains an added keyword, standardized or stdyx. This option instructs Mplus to output standardized parameter estimate values in addition to the default unstandardized values

Why is one loading fixed at 1.0?
• The default fixes the unstandardized loading of the first item after BY at 1.0
• This has to do with model identification
    ◦ In exploratory factor analysis the variance of the factor (latent variable) is fixed at 1.0 by the program. Given this, the program estimates the loadings
• With CFA, you need to set a variance for the latent variable because the size of the loadings is scaled from the size of the variance
• Setting the variance of the latent variable (factor) at 1.0 solves this problem with EFA and is an option with CFA. But Mplus suggests a more general approach in which you fix one of the loadings of each latent variable (factor) at 1.0 with CFA


Fixed loading more general than fixed variance
• Comparing groups: one group might be more variable than another
    ◦ We might find that girls not only have higher verbal skills than boys, but that they are also more homogeneous in these skills.
    ◦ An intervention that not only improves the mean outcome, but does so in a way that makes the distribution more homogeneous, is preferred
• In some cases we are interested in the variances of the latent variables as an important topic, and we could not study that if we fixed the variance at 1.0
• Regardless of which item you pick to fix the loading at 1.0, the standardized solution will always be the same because that solution rescales the variance of the latent variable to be 1.0, and the fully standardized solution also rescales the variance of each indicator to be 1.0
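The invariance of the standardized solution can be checked with a little arithmetic. The Python sketch below uses hypothetical numbers for a one-factor model with three indicators. It computes fully standardized loadings twice: once with the factor variance fixed at 1.0, and once for the same model rescaled so that the first loading is 1.0.

```python
import math

def standardized_loadings(lams, psi, thetas):
    """Fully standardized loadings for a one-factor model.

    lams: unstandardized loadings, psi: factor variance,
    thetas: residual variances of the indicators.
    """
    out = []
    for lam, theta in zip(lams, thetas):
        var_y = lam ** 2 * psi + theta      # model-implied indicator variance
        out.append(lam * math.sqrt(psi) / math.sqrt(var_y))
    return out

thetas = [0.36, 0.64, 0.51]                      # hypothetical residual variances
fixed_variance = standardized_loadings([0.8, 0.6, 0.7], 1.0, thetas)
# Same model rescaled so the first loading is 1.0: loadings divide by 0.8
# and the factor variance becomes 0.8 ** 2.
marker = standardized_loadings([1.0, 0.75, 0.875], 0.64, thetas)
print(fixed_variance)
print(marker)
```

Both parameterizations imply the same indicator variances, so the standardized loadings agree; only the unstandardized metric changes.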

Run & Interpret Selected Output
• Estimator by default is ML
• sampstat gives us means, covariance matrix, and correlation matrix. Good to compare to what you had in Stata, SAS, etc.
• Fit statistics: chi-square, baseline chi-square, CFI, information criteria, RMSEA, and SRMR
• Unstandardized solution (loadings, z-tests, p’s)
    ◦ F2 with F1
    ◦ Residual variances
• Standardized on all variables: STDYX
    ◦ The z-tests are different
• Modification indices


A Figure of CFA results


Second Order CFA
• Conceptually, some attitudes or ideologies are generalizations
    ◦ Liberalism may explain more specific forms of liberalism such as economic liberalism, social liberalism, etc.
    ◦ Alienation may explain value isolation, powerlessness, normlessness, etc.
• Such examples are second order factor analysis, where a highly general second order factor explains the relationship between several first order factors
• Any correlation between the first order factors is because they have a common cause, that being the second order factor


Second Order Factor Analysis


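[Figure: path diagram of a second order factor model. First order factors F1-F4 are measured by y1-y3, y4-y6, y7-y9, and y10-y12, with error terms e1-e12; the second order factor F5 is measured by F1-F4]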

2nd Order Factor Analysis
• Just as we would like to have at least three indicators for a single latent variable, we would like to have at least three first order factors
• The first indicator of each first order factor has its loading fixed at 1.0
• The first of the first order factors has its loading on the second order factor fixed at 1.0
• The first order factors are uncorrelated, i.e., their correlations are explained by the second order factor


2nd Order CFA

TITLE:    ex5.6.inp
          This is an example of a second
          order factor analysis
DATA:     File is ex5.6.dat;
VARIABLE: Names are y1-y12;
MODEL:    f1 by y1-y3;
          f2 by y4-y6;
          f3 by y7-y9;
          f4 by y10-y12;
          f5 by f1-f4;
OUTPUT:   sampstat stdyx mod(3.84);

2nd Order CFA Interpretation
• Correlation matrix has significant correlations of indicators of different factors, e.g., y1-y3 with y4-y12
• Standardized solution
• R-square for latent variables F1-F4
• These models make sense conceptually, but are rarely a reasonable fit


Equality Constraints
• Are items truly interchangeable? Alpha assumes that all items are equally salient to the concept being measured. That is, you weight each item equally with a 1.0 weight. CFA can test and extend this:
    ◦ Tau equivalence: all loadings are constrained to be equal
        ▪ Compare the fit of this model to a model in which they are unconstrained
    ◦ Parallel equivalence: tau equivalence plus all error variances are equal
        ▪ Very hard to achieve, and often we can proceed without this condition

Equality Constraints--Marital Satisfaction of Husbands and Wives
• Lack of tau equivalence
    ◦ Women may weight emotional support more than men
    ◦ Men may weight sexual satisfaction more than women
• If tau equivalence holds, the latent variable has the same meaning in both groups
    ◦ Without this equivalence we are comparing apples and oranges. Why compare means if the concept has a different meaning for each group?
    ◦ Men may be more satisfied than women


Equality Constraints—Marital Satisfaction of Husbands and Wives


Equality Constraints--A little algebra
• In regression we can write: $M_y = a + b M_x$
• If we examine the figure, we see that for each observed variable, which we will call X: $M_x = \tau_x + \lambda_x \kappa$
    ◦ Where $\tau_x$ is the intercept, $\lambda_x$ is the matrix of loadings, and $\kappa$ is the mean of the latent variable
• This adds 10 parameters we need to estimate: 8 intercepts and 2 latent variable means


Equality Constraints--Identification
• This adds 10 parameters we need to estimate: 8 intercepts and 2 latent variable means.
    ◦ Include the means along with the covariance matrix
    ◦ Make some additional restrictions
    ◦ We could fix one intercept at each wave at zero
    ◦ Now we have added 8 means and 8 new parameters to estimate (6 intercepts and 2 latent variable means)
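The counting argument can be written out as a short Python sketch. The function names and the latent mean of 1.5 are hypothetical; the counts are the ones given above (8 indicators, 2 latent variables, one intercept fixed at zero for each latent variable).

```python
def mean_structure_count(n_indicators, n_factors, fixed_intercepts=0):
    """Compare the observed means with the mean-structure parameters."""
    observed_means = n_indicators
    free_intercepts = n_indicators - fixed_intercepts
    new_parameters = free_intercepts + n_factors   # intercepts plus latent means
    return observed_means, new_parameters

def implied_mean(tau, lam, kappa):
    """Model-implied mean of an indicator: tau + lambda * kappa."""
    return tau + lam * kappa

unrestricted = mean_structure_count(8, 2)                    # 8 means, 10 parameters
restricted = mean_structure_count(8, 2, fixed_intercepts=2)  # 8 means, 8 parameters
reference_mean = implied_mean(0.0, 1.0, 1.5)   # hypothetical latent mean of 1.5
print(unrestricted, restricted, reference_mean)
```

With its intercept at 0 and its loading at 1.0, the reference indicator's implied mean equals the latent mean, which is what gives the latent mean its metric.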

Equality Constraints--Data
• We can enter the means and covariances or enter the means, SDs, and correlations

1.500 1.320 1.450 1.410 6.600 6.420 6.560 6.310
1.940 2.030 2.050 1.990 2.610 2.660 2.590 2.550
1.000
0.736 1.000
0.731 0.648 1.000
0.771 0.694 0.700 1.000
0.685 0.512 0.496 0.508 1.000
0.481 0.638 0.431 0.449 0.726 1.000
0.485 0.442 0.635 0.456 0.743 0.672 1.000
0.508 0.469 0.453 0.627 0.759 0.689 0.695 1.000


Equality Constraints--Process
• We estimate four models, each of which includes estimating means
• The first model estimates the means, imposing the same form for the model at both waves or for both wives and husbands
    ◦ This model doesn’t make a lot of sense. With unequal loadings, the meaning of satisfaction changes, with some indicators becoming more salient and others less salient
    ◦ This could be interesting as, for example, sexual satisfaction may be less central and emotional support may be more satisfying in more mature marriages or for wives

Equality Constraints—Program for Form


Note: an observed variable name without () or [] refers to its error term


Equality Constraints--Process
• Mplus puts square brackets [] around intercepts and means, defined by context
    ◦ [A1@0]; [A2@0]; fix the first intercepts at zero
    ◦ [SATIS1*]; [SATIS2*]; make the latent means free
    ◦ [B1 B2] (4); Assigning the same number to both B1 and B2 makes them equal. Since these are observed variables, this refers to their intercepts being equal
• A1 A2 (7); etc. make the error terms equal as in parallel equivalence, but we have not forced the loadings to be equal
• Next, we add the restriction that loadings are equal

Equality Constraints—Equal Loadings


Note: the A1 and A2 loadings are fixed at 1.0 by Mplus; B1 and B2 share the label (1), etc.


Equality Constraints—Equal Intercepts


Note: [B1 B2] (4); makes the intercepts equal. [] for a latent variable is a mean; [] for an observed variable is an intercept

Equality Constraints—Equal Errors


Note: A1 A2 (7) means these errors are equal. There is no name for errors, so an observed variable name without () or [] actually refers to the variable's error


Summarizing Results


Selected Results



Selected Results


Path Analysis



Path Analysis: Program


Selected result
• Interpret fit
• Interpret standardized result


Selected result


Path model: Categorical, Censored, nominal outcome variables
• Some continuous variables are censored either above or below
    ◦ Marital satisfaction on a 1-7 scale has a clump at 7 who say they are very satisfied, but there is unobserved variance among this group, with some much more satisfied than others
• Some categorical variables have a binary set of options (divorced, not divorced)
• Some nominal categorical variables have three or more options (not employed for pay, employed part-time, employed full-time)


Path model: Categorical, Censored, nominal outcome variables
• In the following model,
    ◦ y1 is a continuous variable censored from above
    ◦ u1 is a binary categorical variable
        ▪ Odds ratio is odds of being in highest category
    ◦ u2 is a nominal categorical variable with 3 options: 0, 1, and 2
        ▪ “Odds ratio” is relative risk ratio of being in category 0 versus category 2
        ▪ “Odds ratio” is relative risk ratio of being in category 1 versus category 2
• We cannot estimate a standardized solution or indirect effects
• We use maximum likelihood robust as our estimator

Y1 censored above, u1 binary, & u2 has 3-categories


[Figure: path diagram relating x1, x2, and x3 to the outcomes y1, u1, and u2]


Y1 censored above, u1 binary, u2 3-categories—Program


Note: a binary outcome is declared as categorical; an outcome with more than 2 unordered options is declared as nominal. For a variable censored from above, use (a) to indicate this; a variable censored from below would use (b)

Selected Results



Interpretations for categorical/nominal outcomes
• The odds of being in the highest category on U1 are 2.803 times as great for a unit increase in X1, i.e., the odds of being in the highest category on U1 are 180.3% greater for each unit increase in X1
• The odds of being in the highest category on U1 are only .549 as great for a unit increase in X2, i.e., the odds of being in the highest category are 45.1% lower for each unit increase in X2
• The relative risk of being in category #1 of U2, rather than category #3, is 1.569 times as great for each unit increase in Y1
• The relative risk of being in category #2 of U2, rather than category #3, is 5.644 times as great for each unit increase in X2
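The percentages quoted above follow from the ratios by simple arithmetic: subtract 1 and multiply by 100. A minimal Python sketch, with a hypothetical function name:

```python
def percent_change_in_odds(odds_ratio):
    """Percent change in the odds (or relative risk) for a one-unit increase."""
    return (odds_ratio - 1.0) * 100.0

change_x1 = percent_change_in_odds(2.803)   # odds ratio for X1 on U1
change_x2 = percent_change_in_odds(0.549)   # odds ratio for X2 on U1
print(round(change_x1, 1), round(change_x2, 1))
```

A ratio above 1 gives a positive percentage (greater odds) and a ratio below 1 gives a negative one (lower odds); the same conversion applies to the relative risk ratios for U2.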

Putting it Together—SEM Model



SEM Program


Selected results
• Review fit statistics
• Review unstandardized solution
• Review stdyx solution
• Review R-square for latent variables
• Review total, direct, and indirect effects


Selected results
• Review modification indices
    ◦ We could reduce chi-square, which now is chi-square(50) = 53.492, by about 5.265 if we allowed the error term for Y5 to be correlated with the error term for Y3
    ◦ The correlation of the two errors would be about -.132. Does this make sense?
    ◦ We would do these one at a time
    ◦ Say Y5 and Y3 are pen and pencil tests and all the others are face to face interviews
    ◦ The new chi-square would be approximately chi-square(49) = 53.492 - 5.265 = 48.227. A reduction in chi-square of 5.265 with one degree of freedom is statistically significant (it exceeds the 3.84 critical value), but there is not much need to improve on CFI = .997 and RMSEA = .012
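The modification index arithmetic can be checked directly. This Python sketch uses only the standard library; the function name is hypothetical, and the p-value formula is the exact one for a chi-square with 1 degree of freedom.

```python
import math

def chi2_1df_p(x):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

model_chi2, model_df = 53.492, 50
mod_index = 5.265
new_chi2 = model_chi2 - mod_index     # approximate chi-square after freeing one parameter
new_df = model_df - 1
p_value = chi2_1df_p(mod_index)
print(round(new_chi2, 3), new_df, round(p_value, 3))
```

The reduction clears the 3.84 threshold requested by mod(3.84), with a p-value of roughly .02, so the decision to free the parameter rests mainly on whether the correlated errors make substantive sense.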

SEM using EFA
• We can use modification indices to add parameters one at a time to improve the measurement model
    ◦ We will not get the optimum measurement model
    ◦ Some indicators may have a very small loading, and this will provide a better fit than fixing them at zero
    ◦ The added loadings can make the interpretation of the latent variables confounded


SEM using EFA for two latent variables


SEM using EFA
• Ideally we probably want the loadings of Y1-Y3 on F2 to be very weak and the loadings of Y1-Y3 on F1 to be very strong
• Weak loadings may be more realistic than asserting that the loadings are exactly 0.0
• May avoid correlating some error terms
• Here is the program


SEM using EFA—Program


Key things to remember
• BY means measured by. It is a loading of an indicator on a latent variable
• ON is a structural path between two variables (a direct effect)
• WITH means correlated with, e.g., F1 with F2 or a1 with b1
• Y ind X1 X2; requests the indirect effect of X2 on Y through X1
• [variable] is the mean if the variable is a latent variable
• [variable] is an intercept if the variable is observed
• A bare variable name, e.g., var1 with var2, refers to the error in var1 and var2. Errors are not given names in the program, so this is the only way to show them