Session A:
Basics of Structural Equation Modeling and
The Mplus Computer Program
Kevin Grimm
University of California, Davis
June 9, 2008
Outline
• Basics of Path Diagrams and Path Analysis
– Regression and Structural Regression
– Structural Expectations
  • Covariance Expectations
  • Mean Expectations
– The Common Factor Model
• 5 Steps of SEM
• The Mplus Computer Program
1. Basics of Path Diagrams and Path Analysis
Various Definitions
• Formal statistical statement about the relations among chosen variables
• Hypothesized pattern of (linear) relationships among a set of variables
• Collection of statistical techniques that allow the examination of a set of relationships between one or more IVs (continuous or discrete) and one or more DVs (continuous or discrete)
Some Advantages of SEM
• Explicit representation of theory (no default model)
• Representation of complex multivariate theories involving latent entities is possible
• Direct and indirect effects can be teased apart
• Analysis of multiple groups
• Variables can be outcomes (DV) and predictors (IV)
• Missing Data
Some Disadvantages of SEM
• Need for strong substantive theory regarding the relationships among variables
• Sample Size
• Need for some basic understanding of statistics
• Temptation to let the method drive the theory (“when the tail wags the dog”)
• Equivalent models
• Easily invites unwarranted statements of causality
Special Cases of SEM
• General linear model (t-test, regression, multiple
regression, ANOVA, etc.)
• Path model
• Confirmatory factor analysis
• Latent variable path model
• Latent growth curve analysis
Some SEM Software
• LISREL
• Mplus
• AMOS
• SAS Proc Calis
• Mx
• COSAN
• Sepath
• LisComp
• Systat Ramona
Underlying Principles
• Because SEM concerns relationships among variables, the emphasis is mainly on moment matrices (i.e., the structure of the data)
• The hypothesized model attempts to explain the structure of the data, with parsimony and accuracy
• Some statistical statement is needed about the match between the observed data and the hypothesized model (e.g. Fit Statistics)
Some SEM Terminology
• Manifest variables: Observed (i.e., measured) variables defining the structure we wish to model
• Latent variables: Unobserved (i.e., unmeasured) variables implied by the covariance among two or more manifest variables
• Specification: Exercise of formally stating a model
Some SEM Terminology (2)
• Association: Non- (or bi-) directional (reciprocal)
relation between 2 variables
• Direct effect: (uni-) Directional (non-reciprocal)
relation between 2 variables (IV and DV)
• Indirect effect: effect of an IV on a DV through one
or more intervening or mediating variables
• Total effect: sum of direct and indirect effects of an
IV on a DV
SEM Path Diagrams: A Key
Squares = Observed Variables (e.g., Y)
Circles = Latent Variables (e.g., f, U)
Double-Headed Arrows = Variances/Covariances (Association)
Single-Headed Arrows = Regressions (Direct Effect)
Triangle = Assigned variable = Constant (=1.0) for modeling Means
Regression & Structural Regression
$Y_n = \beta_0 1_n + \beta_1 X_n + e_n$
$\beta_0$ = intercept, predicted value of Y when predictors (X) are zero
$\beta_1$ = slope coefficient, predicted amount of change in Y for a 1 unit change in X
e = residual, part of Y not predicted by X, uncorrelated with X
Variance of Y is decomposed into variance explained by X and unexplained variance (e)
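Structural Regression
[Path diagram: X → Y with slope $\beta_1$; residual e → Y; constant 1 → Y with intercept $\beta_0$; mean and variance of X; variance of e]
$Y_n = \beta_0 1_n + \beta_1 X_n + e_n$
Structural Regression (Short-hand)
[Path diagram: short-hand version of the same model, drawn without a separate residual node]
$Y_n = \beta_0 1_n + \beta_1 X_n + e_n$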
Structural Expectations
• Every Structural Model has a set of Structural Expectations.
– Variance/Covariance Expectations
– Mean Expectations
• Used to estimate parameters
• Difference between Structural Expectations and Observed Statistics (Covariance/Mean) is the model misfit
• Expectations can be calculated based on a path diagram or computed algebraically.
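Calculating Covariance Expectations
[Path diagram: X1 → X2 with weight $\beta_1$; Y1 → Y2 with weight $\beta_2$; X1 ↔ Y1 with covariance $\sigma_{X1,Y1}$; variance terms $\sigma^2_{X1}$, $\sigma^2_{X2}$, $\sigma^2_{Y1}$, $\sigma^2_{Y2}$]
Covariance Expectations
Variance Expectations (Select)
$E[X1, X1] = \sigma^2_{X1}$
$E[X2, X2] = \beta_1 \sigma^2_{X1} \beta_1 + \sigma^2_{X2}$
Covariance Expectations (Select)
$E[X1, X2] = \beta_1 \sigma^2_{X1}$
$E[X1, Y1] = \sigma_{X1,Y1}$
$E[X2, Y2] = \beta_1 \sigma_{X1,Y1} \beta_2$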
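Calculating Mean Expectations
[Path diagram: the same model, with the constant 1 pointing to X1 (mean $\mu_{X1}$) and to Y1 (mean $\mu_{Y1}$); X1 → X2 with weight $\beta_1$; Y1 → Y2 with weight $\beta_2$]
Mean Expectations (Select)
$E[X1] = \mu_{X1}$
$E[X2] = \beta_1 \mu_{X1}$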
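[Path diagram: X → Y with slope $b_1$; residual e → Y; constant 1 → Y with intercept $b_0$; mean $\mu_X$ and variance $\sigma^2_X$ of X; variance $\sigma^2_e$ of e]
Expected Variances
$E[XX] = \sigma^2_X$
$E[YY] = b_1 \sigma^2_X b_1 + \sigma^2_e$
Expected Covariance
$E[XY] = b_1 \sigma^2_X$
Expected Means
$E[X] = \mu_X$
$E[Y] = b_0 + b_1 \mu_X$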
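The expectations on this slide can be checked numerically. The following is a minimal Python sketch; the function name and the parameter values are hypothetical and chosen only for illustration, they are not estimates from these slides.

```python
def regression_expectations(b0, b1, mu_x, var_x, var_e):
    """Model-implied means and (co)variances for Y = b0 + b1*X + e,
    assuming e is uncorrelated with X."""
    return {
        "E[X]": mu_x,
        "E[Y]": b0 + b1 * mu_x,
        "E[XX]": var_x,
        "E[XY]": b1 * var_x,
        "E[YY]": b1 * var_x * b1 + var_e,
    }

# Hypothetical parameter values, for illustration only.
expected = regression_expectations(b0=2.0, b1=0.5, mu_x=10.0, var_x=4.0, var_e=3.0)
for name, value in expected.items():
    print(f"{name} = {value}")
```

Comparing such expectations with the observed means and covariances is what defines the misfit of the model.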
Confirmatory Factor Analysis
Confirmatory Factor Analysis (CFA)
• Used to study how well a hypothesized structure fits a sample of measurements
• Hypothesis-driven
– Explicitly test a priori hypotheses (theory) about the structures that underlie the data
  • Number of, characteristics of, and interrelations among underlying factors
– Specify a common measurement base for comparisons across groups/occasions (factorial invariance)
Confirmatory Factor Analysis (CFA)
• Testing an a priori hypothesis about the structures in the data
– Requires specific expectations regarding
  • The number of factors
  • Which variables reflect given factors
  • How the factors are related to one another
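One Factor Model
[Path diagram: common factor f with variance $\phi^2$ and loadings $\lambda_1$, $\lambda_2$, $\lambda_3$ on Y1, Y2, Y3; unique factors U1, U2, U3 with variances $\psi_1^2$, $\psi_2^2$, $\psi_3^2$]
Path Tracing Rules
Covariance Expectations for the Single Common Factor (rows and columns ordered Y1, Y2, Y3; lower triangle shown)
$$\Sigma = \begin{bmatrix} \lambda_1 \phi^2 \lambda_1 + \psi_1^2 & & \\ \lambda_2 \phi^2 \lambda_1 & \lambda_2 \phi^2 \lambda_2 + \psi_2^2 & \\ \lambda_3 \phi^2 \lambda_1 & \lambda_3 \phi^2 \lambda_2 & \lambda_3 \phi^2 \lambda_3 + \psi_3^2 \end{bmatrix}$$
Note: The main diagonal variances include the unique variances, $\psi_j^2$, but the off-diagonal covariances do not. All terms include the variance of the common factor, $\phi^2$.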
Structural Expectations: Identification Constraints
• Additional constraints are often needed to obtain a unique set of estimates. Because the latent variable typically has no meaningful scale, we fix its variance: $E\{f, f'\} = \phi^2 = 1.0$
• After this scaling, the covariance of each pair of variables is simply the product of the pair of loadings: $E\{\sigma_{12}\} = \lambda_1 \lambda_2$
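One Factor Model
[Path diagram: common factor f with loadings $\lambda_1$, $\lambda_2$, $\lambda_3$ on Y1, Y2, Y3; unique factors U1, U2, U3 with variances $\psi_1^2$, $\psi_2^2$, $\psi_3^2$]
Covariance Expectations for the Single Common Factor with $\phi^2 = 1$ (rows and columns ordered Y1, Y2, Y3; lower triangle shown)
$$\Sigma = \begin{bmatrix} \lambda_1^2 + \psi_1^2 & & \\ \lambda_2 \lambda_1 & \lambda_2^2 + \psi_2^2 & \\ \lambda_3 \lambda_1 & \lambda_3 \lambda_2 & \lambda_3^2 + \psi_3^2 \end{bmatrix}$$
Note: Each covariance expectation, $\sigma_{ij}$, follows a simple pattern determined by the product of the respective loadings, $\lambda_i \lambda_j$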
Numerical Expectations for a Population Common Factor with $\phi^2 = 1$ and $\lambda = [.8, .7, .6]$
      Y1                 Y2                 Y3
Y1    .8² + .36 = 1.0
Y2    .7*.8 = .56        .7² + .51 = 1.0
Y3    .6*.8 = .48        .6*.7 = .42        .6² + .64 = 1.0
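The numerical example above can be reproduced in a few lines. This is a minimal Python sketch, assuming the loadings [.8, .7, .6], a factor variance of 1, and unique variances of .36, .51 and .64 as read from the slide; the function name is hypothetical.

```python
def implied_covariance(loadings, unique_variances, factor_variance=1.0):
    """Covariance matrix implied by a single common factor:
    loading_i * factor_variance * loading_j, plus the unique variance on the diagonal."""
    p = len(loadings)
    return [
        [
            loadings[i] * factor_variance * loadings[j]
            + (unique_variances[i] if i == j else 0.0)
            for j in range(p)
        ]
        for i in range(p)
    ]

sigma = implied_covariance([0.8, 0.7, 0.6], [0.36, 0.51, 0.64])
for row in sigma:
    print("  ".join(f"{value:.2f}" for value in row))
```

Each off-diagonal entry is just the product of two loadings, which is the pattern the slide describes.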
Structural Equation Modeling
• Covariance expectations from a single common factor model are a simple product of the loadings
• The extension to multiple factors is straightforward
• However, as models become more complicated (read realistic) the structural expectations become more complex as well
• Structural Equation Modeling (covariance analysis) is simply the method by which we test our expectations against our data
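Introducing Means into the CFA Model: Consider the Common Factor Model
[Path diagram: common factor f with loadings $\lambda_1$, $\lambda_2$, $\lambda_3$ on Y1, Y2, Y3; unique factors uy1, uy2, uy3]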
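Introducing Means into the CFA Model: Observed Variable Means
[Path diagram: the common factor model with the constant 1 pointing to Y1, Y2, Y3 with weights $\mu_{Y1}$, $\mu_{Y2}$, $\mu_{Y3}$]
Path Tracing Rules
Structural Expectations: Means (Observed Variable Means)
$E[Y1] = \mu_{Y1}$, $E[Y2] = \mu_{Y2}$, $E[Y3] = \mu_{Y3}$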
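Introducing Means into the CFA Model: Latent Variable Mean
[Path diagram: the common factor model with the constant 1 pointing to the factor f with weight $\mu_f$]
Path Tracing Rules
Structural Expectations: Means (Latent Variable Mean)
$E[Y1] = \lambda_1 \mu_f$, $E[Y2] = \lambda_2 \mu_f$, $E[Y3] = \lambda_3 \mu_f$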
5 Steps in SEM Analysis
5 Steps in SEM Analyses
• SEM is often viewed as an advanced and novel form of analysis, but this approach is not new:
1. Theory-Data: form some basic ideas of merging theory and data
2. Specification: form explicit hypotheses using regression and factor analysis concepts to form “structural” restrictions
3. Estimation: use specialized computer software to estimate coefficients, standard errors and various statistical indicators
4. Evaluation: compare alternative structural restrictions in a series of statistical tests
5. Re-evaluation & Extension: “exploring” new ideas/models
Step 1: Theory-Data
• “The purpose of statistical procedures is to assist in establishing the plausibility of a theoretical model…” (Cooley, 1978)
• SEM is a general statistical framework that allows researchers to be explicit about theory and how it might be reflected in one’s data
• Statistical models are where theories and data “collide”
– However, in their assumptions, statistical models invoke a particular notion of reality that may or may not match one’s theoretical ideas
Step 1: Theory-Data
• SEM is a “confirmatory” framework for testing a priori hypotheses about the structures in the data
• Requires specific expectations regarding
– One’s theory
– How one’s theory may be reflected in one’s data
  • Selection of persons
  • Selection of variables
  • Selection of occasions
Step 2: Model Specification
• There are many ways to use SEM programs (e.g., Mplus, AMOS, LISREL, Mx) to produce the same result
• Specify a set of expectations that match the theory to be tested
– Path Diagram, Matrix, or Multiple Equation specifications are functionally equivalent
Step 3: Parameter Estimation
• A series of computational steps is taken, each successive step reducing the value of the fit function, which is a weighted distance between the observations and the expectations.
• At each step in the minimization process, a vector of model parameter estimates is updated so that the model reproduces the observed covariance matrix as closely as possible. When these parameter estimates no longer improve the fit over the previous estimates, the process is said to have converged.
• The estimates at convergence are the parameter estimates.
Example of a Fit Function
$F_{ML} = \ln|\Sigma| - \ln|S| + \mathrm{tr}(S\Sigma^{-1}) - p$,
where S is the sample covariance matrix, $\Sigma$ is the estimated covariance matrix based on the assumed model, and p is the number of observed variables.
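To make the fit function concrete, here is a minimal Python sketch of the maximum likelihood discrepancy, $\ln|\Sigma| - \ln|S| + \mathrm{tr}(S\Sigma^{-1}) - p$, restricted to 2 x 2 matrices so that it needs only the standard library. The helper names are hypothetical. The sample covariance matrix is the one for the age 4 and age 7 ADHD scores reported later in these slides, and the "independence" model matrix is an assumption introduced only for contrast.

```python
import math

def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a 2 x 2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def f_ml(s, sigma):
    """ML discrepancy between sample matrix s and model-implied matrix sigma (2 x 2 only)."""
    p = 2
    sigma_inv = inv2(sigma)
    trace = sum(s[i][k] * sigma_inv[k][i] for i in range(p) for k in range(p))
    return math.log(det2(sigma)) - math.log(det2(s)) + trace - p

s = [[87.857, 56.808], [56.808, 100.823]]

# A model that reproduces S exactly has zero misfit.
fit_saturated = f_ml(s, s)

# Assumed contrast model: the two variables are uncorrelated.
sigma_independence = [[87.857, 0.0], [0.0, 100.823]]
fit_independence = f_ml(s, sigma_independence)

print(round(fit_saturated, 6), round(fit_independence, 4))
```

An estimation routine would search over the model parameters for the implied matrix that makes this value as small as possible.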
Step 4: Evaluating Fit of SEM
• Relative Fit:
– A variety of structural models are typically fitted to the same observations (data)
  • Nested models
• Fit Indices:
– Lots of “fit” indices are available:
  • simple residuals, standard errors, and likelihood ratio and chi-square tests
– Lawley & Maxwell, 1971; Browne, 1985; Browne & Cudeck, 1993
Step 4: Evaluating Fit of SEM
• Fit Indices (cont):
– If we calculate the parameters based on the principles of maximum likelihood estimation (MLE) we obtain a likelihood ratio statistic ($L^2$) of “misfit”
– Under standard assumptions (e.g., normality of residuals) $L^2$ follows a $\chi^2$ distribution with $df = N_s - N_p$, where $N_s$ is the number of sample statistics being fitted and $N_p$ is the number of estimated parameters
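The degrees of freedom count can be sketched in Python as follows. The function names are hypothetical, and the parameter counts in the examples are assumptions about typical specifications (for instance, six loadings plus six unique variances for a one-factor model of six variables with the factor variance fixed at 1).

```python
def n_sample_statistics(n_vars, with_means=False):
    """Number of distinct variances and covariances, plus the means if requested."""
    n_stats = n_vars * (n_vars + 1) // 2
    if with_means:
        n_stats += n_vars
    return n_stats

def degrees_of_freedom(n_vars, n_params, with_means=False):
    """df = number of sample statistics minus number of estimated parameters."""
    return n_sample_statistics(n_vars, with_means) - n_params

# Two variables with means, five parameters: a saturated model.
df_saturated = degrees_of_freedom(2, 5, with_means=True)
# Six variables, covariances only: one factor (12 parameters) vs. two correlated factors (13).
df_one_factor = degrees_of_freedom(6, 12)
df_two_factor = degrees_of_freedom(6, 13)
print(df_saturated, df_one_factor, df_two_factor)
```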
Step 4: Evaluating Fit of SEM
• “Testing” Fit:
– We use $L^2$ type tests to ask
  • “Should we reject the hypothesis of this model?”
– Often this gets rephrased ... :
  • “Are the observed data consistent with the hypothetical model?”
  • “Is the model plausible?”
  • “Does the model fit?”
Step 4: Evaluating Fit of SEM
• Probability of Close Fit:
– Probability models based on normal distribution theory are available
  • e.g., p(perfect fit) and p(close fit)
– These same statistical analyses can be used even when models are “complex”
  • e.g., when latent variables, multiple groups, or incomplete data are the focus of study
– Major Bonus of SEM
Step 4: Evaluating Model Fit
• How good is the model?
– How well does the model represent the data?
– How well does the model represent the theory?
• Fit to the data
– Measures of how well the estimated covariance matrix derived from the model matches the observed covariance matrix
• Fit to the theory
– Subjective interpretation
Confirmatory Hypothesis Tests
• When restrictions are stated in advance of the estimation, the hypothesis is clear and we can use statistical probability models
• These models have “degrees-of-freedom” and are rejectable
– Examine overall fit, standard errors and residuals
• We do not conclude the model fits the data, but we can conclude the model does not fit the data, or that one model fits better than another
– Relative fit: We need to examine most restrictions via the comparison of at least two alternative models
Model Fit Statistics
• $\chi^2$ (or -2LL)
– df = degrees of freedom
– Null hypothesis
  • Estimated covariance matrix = Observed covariance matrix
– (sensitive to sample size)
• RMSEA
– Range: 0.00 to 1.00; lower values indicate better fit
– M. Browne’s rule of thumb: RMSEA < .05 indicates good fit
• CFI (Comparative Fit Index)
• NFI (Normed Fit Index)
• TLI (Tucker-Lewis Index)
– Range: 0.00 to 1.00+; higher values indicate better fit
Nested Hypothesis Tests
• Two alternative models may be nested
• Parameters are said to be “nested” when they are included in one model (M0) and then can be “removed” to form the alternative model (M1)
• The hierarchy of restrictions makes it easy to use statistical probability tests
• Under typical assumptions the difference between two “nested” models can be evaluated using a chi-square test
Relative Fit of Nested Models
• $\chi^2$ difference tests (for nested models)
– $[\chi^2_{Model B} - \chi^2_{Model A}] / [df_B - df_A]$: the difference in $\chi^2$ is evaluated on the difference in degrees of freedom
• Information criteria for non-nested model comparisons (using same data)
– AIC (Akaike Information Criterion)
– BIC (Bayesian Information Criterion)
  • Lower values are better
  • Should be used in conjunction with judgments about the theoretical interpretation of the models
Evaluating Relative Fit
• Evaluate Fit for Model A
• Add restrictions to construct Model B
• Evaluate Fit for Model B
• Evaluate difference in fit = $\Delta\chi^2 / \Delta df$
– Does the restricted (parsimonious) model fit significantly worse than the less restrictive (more complex) model, or is this complexity needed?
Relative Fit of Nested Models
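[Path diagrams of two nested models for Y1 to Y6, each with unique factors U1 to U6 and factor variances fixed (=1.0)]
• One-factor model (f1): loadings -.56, -.50, -.54 (Y1 to Y3) and .76, .92, .87 (Y4 to Y6); unique variances .68, .74, .70, .41, .14, .23; $\chi^2$ = 55, df = 9, RMSEA = .224
• Two-factor model (f1: Y1 to Y3; f2: Y4 to Y6): loadings .86, .72, .65 and .75, .95, .87; unique variances .26, .48, .57, .42, .10, .24; factor correlation -.63; $\chi^2$ = 11, df = 8, RMSEA = .053
Model Comparison: $\Delta\chi^2 / \Delta df$ = 44/1, p < .05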
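The model comparison can be sketched in Python. The survival function below is a standard closed-form series for a chi-square variate with integer degrees of freedom, written with only the standard library; the function name is hypothetical, and in practice a statistics package would normally supply this probability.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi-square with integer df > x)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite series in sqrt(x).
    root = math.sqrt(x)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(root / math.sqrt(2.0)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total

# Fit of the two nested models from the slide.
chi2_restricted, df_restricted = 55.0, 9   # one-factor model
chi2_general, df_general = 11.0, 8         # two-factor model

delta_chi2 = chi2_restricted - chi2_general
delta_df = df_restricted - df_general
p_value = chi2_sf(delta_chi2, delta_df)
print(delta_chi2, delta_df, p_value < 0.05)
```

With a difference of 44 on 1 degree of freedom the restricted model fits significantly worse, in line with the p < .05 reported on the slide.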
Step 4: Evaluating Model Fit
• How good is the model?
– How well does the model represent the data?
– How well does the model represent the theory?
• Examine the relative fit of multiple models
– Reject those models that fit relatively worse
– Carry forward those models that fit relatively well
Step 5: Re-evaluation & Extension
• Moving from “Confirmation” to “Exploration”
• A philosophical debate
– In most confirmatory analyses some results suggest alterations of the original concepts (specifying a different “theory”)
– Often the model is “modified” because the original model does not fit
– An “exploratory” phase begins
• Note: If there is a lack of prior directional hypotheses, probability models based on normal distribution theory are no longer available
A General Latent Variable Framework & Analysis Software
Muthén & Muthén, 1998-2006
Mplus Language I
TITLE: Factor Model;
DATA: FILE = wisc3raw.dat;
VARIABLE: NAMES = id verb1 verb2 verb4 verb6
            perfo1 perfo2 perfo4 perfo6
            info1 comp1 simi1 voca1
            info6 comp6 simi6 voca6
            momed grad constant;
          USEVARIABLES = info1 comp1 simi1 voca1;
ANALYSIS: TYPE = MEANSTRUCTURE;
Squares = observed variables: USEVAR = var1 var2 …;
Circles = latent variables (e.g., f, U)
Double-Headed Arrows = Covariances: var1 WITH var2;
Single-Headed Arrows = Regressions: var2 ON var1; or factor BY var1 var2 var3;
Double-Headed Arrows = Variances: var1*; or factor@1;
Triangle = Assigned variable = Constant (=1.0) for modeling Means: [var1];
Mplus Language II
ANALYSIS: TYPE = MEANSTRUCTURE;
  ! TYPE = BASIC; gives sample statistics
  ! TYPE = NOMEANSTRUCTURE; allows for model without means
MODEL:
  ! Factor Loadings
  verb1 BY info1@1 comp1 simi1 voca1;
  ! Factor Variance
  verb1*1;
  ! Mean of Factor
  [verb1@0];
  ! Means of Observed
  [info1 comp1 simi1 voca1];
  ! Variances of Observed
  info1 comp1 simi1 voca1;
OUTPUT: SAMPSTAT STANDARDIZED;
Session B:
Alternative Structural Equation Models for Change over
Two-Occasions
Kevin Grimm
University of California, Davis
June 9, 2008
Overview
1. Practical Preliminaries
2. Two-occasion longitudinal data
3. Type-A auto-regression models
4. Type-D difference score models
5. Combining alternative models
6. Summary & Discussion
Practical Preliminaries
• Data set formatting
• Preliminary data examination
– Correlations (covariances) over time
– Means over time
• Longitudinal plots
– Examine for Shapes, Outliers, Possible time bases, etc.
Longitudinal Data Formats
• Two common data formats
– Single-record per person (wide form data)
  • all the data associated with one person appear in a single record
    id   adhd2   adhd4   adhd5   adhd7
    1    16.00   .       .       .
    2    16.00   9.00    5.00    3.00
    3    26.00   29.00   .       6.00
    4    9.00    20.00   22.00   21.00
    5    16.00   13.00   16.00   15.00
– Multiple-record per person (long form data, person-period data, relational data)
  • the data associated with one person appear in multiple records indexed by id and time variables
    id   age   read
    1    2     16.00
    1    4     .
    1    5     .
    1    7     .
    2    2     16.00
    2    4     9.00
    2    5     6.00
    2    7     3.00
    3    2     26.00
    ...
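Moving between the two layouts is a mechanical reshaping step. The following is a minimal Python sketch using only built-in data structures; the function name and the output column name `score` are hypothetical, and the two records are the first two rows of the wide example above, with `None` standing in for the missing-value dot.

```python
wide = [
    {"id": 1, "adhd2": 16.0, "adhd4": None, "adhd5": None, "adhd7": None},
    {"id": 2, "adhd2": 16.0, "adhd4": 9.0, "adhd5": 5.0, "adhd7": 3.0},
]

def wide_to_long(records, prefix="adhd"):
    """Turn one record per person into one record per person per occasion.

    The age is read from the digits that follow the prefix in each column name."""
    long_rows = []
    for record in records:
        for name, value in record.items():
            if name.startswith(prefix):
                age = int(name[len(prefix):])
                long_rows.append({"id": record["id"], "age": age, "score": value})
    return long_rows

long_rows = wide_to_long(wide)
for row in long_rows:
    print(row["id"], row["age"], row["score"])
```

The wide layout is what the correlation and SEM scripts in these slides expect, while the long layout is what the plotting scripts expect.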
Preliminary Examination of Longitudinal Data
• Sample Statistics
– Correlations (covariances) over time
  • Stability coefficients
– Means over time
• Plots of intraindividual change over time
Sample Statistics in SAS/SPSS
*Examining Correlations Across Time;
PROC CORR DATA=wiscraw;
  VAR adhd2 adhd4 adhd5 adhd7;
RUN;
*Examining Means Across Time;
PROC MEANS DATA=wiscraw;
  VAR adhd2 adhd4 adhd5 adhd7;
RUN;
*Correlations over time.
CORRELATIONS
  /VARIABLES= adhd2 adhd4 adhd5 adhd7
  /MISSING=PAIRWISE.
*Means over time.
DESCRIPTIVES VARIABLES= adhd2 adhd4 adhd5 adhd7
  /STATISTICS=MEAN STDDEV MIN MAX.
*Note: Data must be in single-record per person (wide) format
Sample Statistics in Mplus
TITLE: Descriptive Sample Stats;
DATA: FILE = adhd_uncg.dat;
VARIABLE: NAMES = id
            girl minority ses2yr dadhp7 doddp7
            adhd_tot2 adhd_tot4 adhd_tot5 adhd_tot7
            adhd_in2 adhd_in4 adhd_in5 adhd_in7
            adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
          MISSING = .;
          USEVAR = adhd_tot2 - adhd_tot7;
ANALYSIS: TYPE = BASIC;
OUTPUT: SAMPSTAT;
*Note: Data must be in single-record per person (wide) format
Sample Statistics: Longitudinal Correlations & Means
         adhd2   adhd4   adhd5   adhd7
adhd2    1.00
adhd4    0.57    1.00
adhd5    0.48    0.69    1.00
adhd7    0.49    0.60    0.76    1.00
Means    14.58   13.78   12.45   12.11
N        394     376     324     328
Plot of Intraindividual Change in SAS
PROC GPLOT DATA = temp_long (where = (new_id < 50));
  SYMBOL1 REPEAT=500 I=join V=dot H=.5 W=2 C=black;
  AXIS1
    LABEL = (A=90 F=SWISSX H=1.3 'ADHD Total Score')
    ORDER = (0 to 60 by 10)
    MINOR = none
    OFFSET = (2);
  AXIS2
    LABEL = (F=SWISSX H=1.3 'Age')
    ORDER = (2 to 7 by 1)
    MINOR = none
    OFFSET = (2);
  PLOT adhd_t * age = id / NOLEGEND VAXIS=AXIS1 HAXIS=AXIS2;
RUN;
*Note: Data must be in multiple-record per person (long) format
Plot of Intraindividual Change in SPSS
igraph
  /x1=var(age) type=scale
  /y=var(adhd_t) type=scale
  /style=var(id)
  /line(mode) key=off style=line interpolate=straight
  /scalerange=var(adhd_t) min=0 max=60.
*Note: Data must be in multiple-record per person (long) format
Longitudinal Plot of ADHD Total Score (N = 50)
2. Two-Occasion Longitudinal Data
Two-Occasion Data Are Valuable
• Two occasions are the first case of longitudinal data collections
• There are several special properties of repeated measures data
• Most analyses deal with basic questions of representing change over time
• Different problems seem to suggest different models and methods of analysis
Example of Two-Occasion Data
• Data from the RIGHT (Research Investigating Growth and Health Trajectories) Track Research Project
• Focus on the development and developmental trajectories of early disruptive behavior
• Study participants
– N=431 children
– Measured at 2, 4, 5 & 7 years of age
• In this illustration the Attention Deficit Hyperactivity Total Scores from the age 4 & age 7 assessments are used
Summary Statistics from the RIGHT Track Study
The CORR Procedure
2 Variables: adhd_to4 adhd_to7

Simple Statistics
Variable   N     Mean    Std Dev   Sum    Minimum   Maximum
adhd_to4   376   13.78   9.29      5182   0         51.00
adhd_to7   328   12.11   10.13     3973   0         54.00

Pearson Correlation Coefficients
Prob > |r| under H0: Rho=0
Number of Observations
            adhd_to4   adhd_to7
adhd_to4    1.00000    0.60359
                       <.0001
            376        314
adhd_to7    0.60359    1.00000
            <.0001
            314        328
Univariate Histograms of ADHD: Age 4 and Age 7
Quotes from Bachman et al. (2002, p. 31)
There are two approaches to the prediction of change in such analyses, and we use both.
The first approach involves computing a change score by subtracting the “before” measure from the “after” measure, and then using the change score as the “dependent variable.”
The second approach uses the “after” measure as the “dependent variable” and include the “before” measure as one of the “predictors” (i.e., as a covariate).
In either case one could say that the earlier score is being “controlled,” but the means of controlling differ and the results of the analysis also can differ --- sometimes in important ways.
Bivariate Scatterplot: X = Age 4 versus Y = Age 7
Plotting Individual “Trajectories” (n=50)
SEM and Two-Occasion Data Analyses
We can use regular regression or ANOVA programs to get reasonable answers to some change questions.
Alternatively, we can use any standard SEM computer program and any available options for input and output.
SEMs are based on: (a) an algebraic model, (b) a corresponding path diagram, (c) input of the model to an SEM program, and (d) examination of the expectations generated.
Empirical models can be fit using any technique for means and covariances or raw data (e.g., 2 in M+).
SEMs provide a framework for dealing with more than two alternative models, including those based on latent variables and those requiring incomplete data analyses.
3. Type-A Auto-Regressive Models for
Repeated Measures
Most Common Linear Regression Models
A linear model is expressed for n = 1 to N as
$Y[2]_n = \beta_0 + \beta_1 Y[1]_n + e_n$
$\beta_0$ is the intercept term -- the predicted score of Y[2] when Y[1]=0
$\beta_1$ is the coefficient term -- the change in the predicted score of Y[2] for a one unit change in Y[1]
e is the residual score -- an unobserved and random score which is uncorrelated with Y[1] but forms part of the variance of Y[2].
The ratio of the variance of e to the variance of Y[2] ($\sigma_e^2 / \sigma_y^2 = 1 - R^2$) can be a useful index of forecast efficiency.
In our notation, Greek letters are used for estimated parameters.
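Typical Autoregression Path Model for Two Repeated Measures
[Path diagram: Y[1] → Y[2] with slope $\beta_1$; residual e → Y[2]; constant 1 → Y[2] with intercept $\beta_0$; mean $\mu_{Y1}$ and variance $\sigma^2_{Y1}$ of Y[1]; residual variance $\sigma^2_e$]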
SAS/SPSS Autoregression Input Script
• SAS
*Type A - Autoregressive Model for Two Occasion;
PROC REG DATA = adhd;
  MODEL adhd_to7 = adhd_to4 / STB;
RUN;
• SPSS
*Autoregressive Model.
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT adhd_to7
  /METHOD=ENTER adhd_to4 .
Auto-Regression Results (SAS)
Dependent Variable: adhd_to7
Number of Observations Read                  431
Number of Observations Used                  314
Number of Observations with Missing Values   117

Analysis of Variance
Source            DF    Sum of Squares   Mean Square   F Value   Pr > F
Model             1     11534            11534         178.81    <.0001
Error             312   20125            64.50274
Corrected Total   313   31659

Root MSE         8.03136    R-Square   0.3643
Dependent Mean   12.06524   Adj R-Sq   0.3623
Coeff Var        66.56613

Parameter Estimates
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|   Standardized Estimate
Intercept   1    3.43388              0.78871          4.35      <.0001     0
adhd_to4    1    0.64659              0.04835          13.37     <.0001     0.60359
Autoregressive Model in Mplus
TITLE: ADHD – Autoregressive Model;
DATA: FILE = adhd_uncg_wide.dat;
      LISTWISE=ON;
VARIABLE: NAMES = id
            girl minority ses2yr dadhp7 doddp7
            adhd_to2 adhd_to4 adhd_to5 adhd_to7
            adhd_in2 adhd_in4 adhd_in5 adhd_in7
            adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
          MISSING = .;
          USEVAR = adhd_to4 adhd_to7;
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL:
  adhd_to7 ON adhd_to4;
  adhd_to4 adhd_to7;
  [adhd_to4 adhd_to7];
OUTPUT: SAMPSTAT;
Mplus Output
SUMMARY OF ANALYSIS
Number of groups                         1
Number of observations                   314
Number of dependent variables            1
Number of independent variables          1
Number of continuous latent variables    0
Observed dependent variables
  Continuous
    ADHD_TO7
Observed independent variables
    ADHD_TO4

SAMPLE STATISTICS
Means
            ADHD_TO7   ADHD_TO4
            12.065     13.349
Covariances
            ADHD_TO7   ADHD_TO4
ADHD_TO7    100.823
ADHD_TO4    56.808     87.857
Correlations
            ADHD_TO7   ADHD_TO4
ADHD_TO7    1.000
ADHD_TO4    0.604      1.000
Mplus Output
TESTS OF MODEL FIT
Chi-Square Test of Model Fit
  Value                        0.000
  Degrees of Freedom           0
  P-Value                      0.0000
CFI/TLI
  CFI                          1.000
  TLI                          1.000
Loglikelihood
  H0 Value                     -2246.950
  H1 Value                     -2246.950
Information Criteria
  Number of Free Parameters    5
  Akaike (AIC)                 4503.900
  Bayesian (BIC)               4522.647
  Sample-Size Adjusted BIC     4506.789
    (n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
  Estimate                     0.000
  90 Percent C.I.              0.000  0.000
  Probability RMSEA <= .05     0.000
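The information criteria in this output can be reproduced from the H0 loglikelihood, the number of free parameters, and the sample size. This is a minimal Python sketch using the standard AIC and BIC definitions and the n* = (n + 2) / 24 adjustment printed in the output; the function name is hypothetical, and small discrepancies in the last digit are expected because the loglikelihood is rounded.

```python
import math

def information_criteria(loglik, n_params, n_obs):
    """AIC, BIC and sample-size adjusted BIC from a loglikelihood value."""
    minus_two_ll = -2.0 * loglik
    n_star = (n_obs + 2) / 24.0
    return {
        "AIC": minus_two_ll + 2 * n_params,
        "BIC": minus_two_ll + n_params * math.log(n_obs),
        "adjusted BIC": minus_two_ll + n_params * math.log(n_star),
    }

criteria = information_criteria(loglik=-2246.950, n_params=5, n_obs=314)
for name, value in criteria.items():
    print(f"{name}: {value:.3f}")
```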
Mplus Output
                     Estimate   S.E.    Est./S.E.   Two-Tailed P-Value
ADHD_TO7 ON
  ADHD_TO4           0.647      0.048   13.415      0.000
Means
  ADHD_TO4           13.349     0.529   25.237      0.000
Intercepts
  ADHD_TO7           3.434      0.786   4.368       0.000
Variances
  ADHD_TO4           87.854     7.011   12.530      0.000
Residual Variances
  ADHD_TO7           64.089     5.115   12.530      0.000
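Because this model is saturated, its estimates follow directly from the sample means and covariances printed in the Mplus output. The following is a minimal Python sketch of that arithmetic; the variable names are hypothetical, and the results match the reported estimates only up to rounding of the printed statistics.

```python
# Sample statistics for the N = 314 complete cases (from SAMPLE STATISTICS).
mean_y1, mean_y2 = 13.349, 12.065          # age 4 and age 7 means
var_y1, var_y2 = 87.857, 100.823           # age 4 and age 7 variances
cov_12 = 56.808                            # covariance between occasions

slope = cov_12 / var_y1                    # regression of age 7 on age 4
intercept = mean_y2 - slope * mean_y1
residual_variance = var_y2 - slope * cov_12
r_squared = slope * cov_12 / var_y2

print(f"slope = {slope:.3f}")
print(f"intercept = {intercept:.3f}")
print(f"residual variance = {residual_variance:.2f}")
print(f"R-square = {r_squared:.4f}")
```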
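Results from Autoregression Model
[Path diagram: ADHD[4] → ADHD[7] with residual e and the constant 1; fixed paths are labeled =1 and an estimate is marked with an asterisk]
Note: Fully saturated model, so $\chi^2$ = 0 with df = 0. An asterisk indicates t = p/se(p) > 2.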
Graphic Result of the Autoregression
4. Type-D Latent Difference Score
Models for Repeated Measures
Statistical Features of Difference Scores
• Using the same repeated measures scores, the difference scores have useful properties for the means, such as
$\mu_{[2]} = \mu_{[1]} + \mu_D$, so $\mu_D = \mu_{[2]} - \mu_{[1]}$
• And also for the variances and covariances
$\sigma_2^2 = \sigma_1^2 + \sigma_D^2 + 2\sigma_{1D}$, so $\sigma_D^2 = \sigma_1^2 + \sigma_2^2 - 2\sigma_{12}$ and $\sigma_{1D} = \sigma_{12} - \sigma_1^2$
• The statistics of the difference scores are a transformation of the statistics in the measured variables, and this is useful.
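As a worked check of these identities, the following minimal Python sketch applies them to the age 4 and age 7 sample statistics reported in the Mplus output earlier in these slides. The variable names are hypothetical.

```python
# Age 4 (occasion 1) and age 7 (occasion 2) sample statistics, N = 314.
mean_1, mean_2 = 13.349, 12.065
var_1, var_2 = 87.857, 100.823
cov_12 = 56.808

mean_d = mean_2 - mean_1                 # mean of the difference score
var_d = var_1 + var_2 - 2.0 * cov_12     # variance of the difference score
cov_1d = cov_12 - var_1                  # covariance of initial score and difference

# Reassemble the occasion 2 variance from the difference-score statistics.
var_2_check = var_1 + var_d + 2.0 * cov_1d

print(f"mean difference = {mean_d:.3f}")
print(f"variance of difference = {var_d:.3f}")
print(f"covariance with initial score = {cov_1d:.3f}")
```

The mean difference of about -1.28 and the covariance of about -31.05 agree, up to rounding, with the difference score results reported later in the deck.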
Classic Critique of Difference Scores
• There have been major critics of the use of difference scores (e.g., Cronbach & Furby, 1970).
• A typical model for a set of observed repeated measures is
$Y[1]_n = y_n + e[1]_n$ and $Y[2]_n = y_n + e[2]_n$
where y = an unobserved “true score” for both occasions, and e[t] = an unobserved random error that is independent over each occasion.
• In this model the true score remains the same and all changes are based on the random noise.
Classic Critique of Difference Scores
• If the previous model holds then the simple difference score can be rewritten as
$D_n = Y[2]_n - Y[1]_n$
$= (y_n + e[2]_n) - (y_n + e[1]_n)$
$= (y_n - y_n) + (e[2]_n - e[1]_n)$
$= e[2]_n - e[1]_n$
• So the variance of the difference score is entirely based on the differences in the random error scores.
• This also implies that the reliability of the difference score is zero.
• For these and other reasons the use of the simple difference score has been frowned upon in much of developmental research.
Classic Resolution of Difference Critique
• Other researchers (e.g., Nesselroade, 1972, 1974) defined a model for a set of observed repeated measures as
$Y[1]_n = y[1]_n + e[1]_n$ and $Y[2]_n = y[2]_n + e[2]_n$
where y[1] = an unobserved “true score” for the first occasion, y[2] = an unobserved “true score” for the second occasion, and e[t] = an unobserved random error.
• It is also possible to redefine this model as a gain and write
$Y[1]_n = y[1]_n + e[1]_n$ and $Y[2]_n = y[1]_n + \Delta y_n + e[2]_n$
where y[1] = an unobserved “true score” for both occasions, $\Delta y$ = an unobserved “true change” at the second occasion, and e[t] = an unobserved random error.
Classic Resolution of Difference Critique
• However, if this model holds then the difference score
$D_n = Y[2]_n - Y[1]_n$
$= (y[2]_n + e[2]_n) - (y[1]_n + e[1]_n)$
$= (y[2]_n - y[1]_n) + (e[2]_n - e[1]_n)$
$= \Delta y_n + (e[2]_n - e[1]_n)$
$= \Delta y_n + \Delta e_n$
• So the variance of the difference score is partly based on the differences in the random error scores but also partly on the gain in the true score. The relative size of the true-score gain determines the variance and reliability of the difference.
• This implies the difference score may be a very good way to consider measuring change, and researchers should consider them carefully, especially latent scores without accumulated errors.
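The dependence of reliability on the true-score gain can be illustrated with a short calculation. This is a minimal Python sketch under added assumptions that are not stated on the slide: the two errors have equal variance and are independent of each other and of the true change. The function name and the numeric values are hypothetical.

```python
def difference_reliability(var_true_change, var_error):
    """Share of difference-score variance that is true change.

    Assumes equal, independent error variances at both occasions, so that
    var(D) = var(true change) + 2 * var(error)."""
    var_difference = var_true_change + 2.0 * var_error
    return var_true_change / var_difference

# No true change: the difference is pure error, so reliability is zero.
reliability_no_change = difference_reliability(var_true_change=0.0, var_error=1.0)
# Substantial true change relative to error: reliability is well above zero.
reliability_with_change = difference_reliability(var_true_change=4.0, var_error=1.0)
print(reliability_no_change, round(reliability_with_change, 3))
```

The first case corresponds to the classic critique, the second to the resolution in which individuals differ in how much they truly change.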
Models with Difference Scores
• Using the same repeated measures scores, we can write the alternative difference score for any person n as
$Y[2]_n = Y[1]_n + \Delta y_n$
where $\Delta y$ is an implied or latent difference score.
• This can be verified simply by rewriting the model as
$\Delta y_n = Y[2]_n - Y[1]_n$
• We can also add features of the means and covariances as
$\Delta y_n = \mu_d + \Delta y_n^*$, $E\{\Delta y^* \Delta y^*\} = \sigma_\Delta^2$, and $E\{Y[1]^* \Delta y^*\} = \sigma_{1\Delta}$
Difference Score Model in SAS/SPSS
*Type D - Difference Score Model for Two Occasion;
DATA diff;
  SET adhd;
  delta = adhd_to7 - adhd_to4;
  constant = 1;
RUN;
PROC REG DATA = diff;
  MODEL delta = constant / NOINT;
RUN;
*Computing Difference Score & Difference Score Model.
COMPUTE delta = adhd_to7 - adhd_to4 .
COMPUTE constant = 1 .
EXECUTE .
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /ORIGIN
  /DEPENDENT delta
  /METHOD=ENTER constant .
SAS/SPSS Output
Dependent Variable: delta
Number of Observations Read                  431
Number of Observations Used                  314
Number of Observations with Missing Values   117

Analysis of Variance
Source              DF    Sum of Squares   Mean Square   F Value   Pr > F
Model               1     517.46712        517.46712     6.87      0.0092
Error               313   23570            75.30475
Uncorrected Total   314   24088

Root MSE         8.67783      R-Square   0.0215
Dependent Mean   -1.28374     Adj R-Sq   0.0184
Coeff Var        -675.98120

Parameter Estimates
Variable   DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
constant   1    -1.28374             0.48972          -2.62     0.0092
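Path Diagram of the Latent Difference Score Model
[Path diagram: Y[1] → Y[2] with weight fixed at 1; latent difference $\Delta y$ → Y[2] with weight fixed at 1; constant 1 pointing to Y[1] and $\Delta y$ for their means; covariance between Y[1] and $\Delta y$]
Note: The $\Delta y$ is an unobserved variable whose moments ($\mu_d$, $\sigma_\Delta^2$, $\sigma_{1\Delta}$) are implied by the formula $Y[2] = Y[1] + \Delta y$.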
Mplus Latent Difference Score
TITLE: ADHD – Difference Model;
DATA: FILE = adhd_uncg_wide.dat;
      LISTWISE=ON;
VARIABLE: NAMES = id
            girl minority ses2yr dadhp7 doddp7
            adhd_to2 adhd_to4 adhd_to5 adhd_to7
            adhd_in2 adhd_in4 adhd_in5 adhd_in7
            adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
          MISSING = .;
          USEVAR = adhd_to4 adhd_to7;
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL:
  adhd_to7 ON adhd_to4@1;
  adhd_to4 adhd_to7@;
  [adhd_to4 adhd_to7@0];
  delta BY adhd_to7@1;
  delta;
  [delta];
OUTPUT: SAMPSTAT;
Mplus Output
SUMMARY OF ANALYSIS
Number of groups                         1
Number of observations                   314
Number of dependent variables            1
Number of independent variables          1
Number of continuous latent variables    1
Observed dependent variables
  Continuous
    ADHD_TO7
Observed independent variables
    ADHD_TO4
Continuous latent variables
    DELTA
Mplus Output
Chi-Square Test of Model Fit
  Value                        0.000
  Degrees of Freedom           0
  P-Value                      0.0000
CFI/TLI
  CFI                          1.000
  TLI                          1.000
Loglikelihood
  H0 Value                     -2246.950
  H1 Value                     -2246.950
Information Criteria
  Number of Free Parameters    5
  Akaike (AIC)                 4503.900
  Bayesian (BIC)               4522.647
  Sample-Size Adjusted BIC     4506.789
    (n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
  Estimate                     0.000
  90 Percent C.I.              0.000  0.000
  Probability RMSEA <= .05     0.000
Mplus Output
MODEL RESULTS
                     Estimate   S.E.    Est./S.E.   Two-Tailed P-Value
DELTA BY
  ADHD_TO7           1.000      0.000   999.000     999.000
ADHD_TO7 ON
  ADHD_TO4           1.000      0.000   999.000     999.000
ADHD_TO4 WITH
  DELTA              -31.048    4.906   -6.328      0.000
Means
  ADHD_TO4           13.349     0.529   25.236      0.000
  DELTA              -1.284     0.489   -2.626      0.009
Intercepts
  ADHD_TO7           0.000      0.000   999.000     999.000
Variances
  ADHD_TO4           87.855     7.011   12.530      0.000
  DELTA              24.491     5.991   4.088       0.000
Residual Variances
  ADHD_TO7           50.573     0.000   999.000     999.000
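Results of the Latent Difference Score Model
[Path diagram: Y[1] → Y[2] and $\Delta y$ → Y[2] with weights fixed at 1, and the constant 1 for the means]
Note: Fully saturated model, so $\chi^2$ = 0 with df = 0. An asterisk indicates t = p/se(p) > 2.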
Summary of Latent Difference Models
• The use of the latent difference score (LDS) in SEM is based on the same statistical information as the two-occasion data or the calculated difference score model.
• This means that variations in the parameters of each change model can be evaluated in the same way -- by the difference in goodness-of-fit tests.
• However, by avoiding the direct calculation of the difference score we can now consider models where we attempt a model-based separation of the errors of measurement from the systematic change.
• The previous point will be emphasized in the next set of models when we measure: (a) more variables, (b) more time points, or (c) more of both.
5. Combining Features of Alternative Repeated
Measures Models
Interpreting Change From Autoregression
Assume scores over time for multiple variables have been fitted using this form of regression over time
$Y[2]_n = \beta_0 + \beta_1 Y[1]_n + e_n$
1. Rewrite this expression as a residual change
$(Y[2]_n - \beta_1 Y[1]_n) = \beta_0 + e_n$
2. Or rewrite this expression as a direct change
$(Y[2]_n - Y[1]_n) = \beta_0 + \beta_1 Y[1]_n + e_n - Y[1]_n$
$= \beta_0 + (\beta_1 - 1) Y[1]_n + e_n$
$= \alpha_0 + \alpha_1 Y[1]_n + z_n$
3. Or rewrite this expression as a historical change
$Y[2]_n = \beta_0 + \beta_1 Y[1]_n + e[2]_n$
BUT if $Y[1]_n = \beta_0 + \beta_1 Y[0]_n + e[1]_n$ then
$(Y[2]_n - Y[1]_n) = \beta_1 (Y[1]_n - Y[0]_n) + (e[2]_n - e[1]_n)$
Models with Latent Difference Scores
• Assume we can write the alternative difference score for any person n as
$Y[2]_n = Y[1]_n + \Delta y_n$
where $\Delta y$ is an implied or latent difference score.
• So suppose we consider features of a dual change prediction system with
$\Delta y_n = \alpha_0 + \alpha_1 Y[1]_n + z_n$
• This model of $\Delta y$ makes it easy to see that it is possible to plot the latent difference scores. To relate the difference score coefficients to the auto-regression coefficients we write
$\alpha_0 = \beta_0$ and $\alpha_1 = (\beta_1 - 1)$.
• This is a non-trivial resolution of the fundamental question of the alternative models.
A path diagram for the prediction of the Latent Difference Score
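[Path diagram: Y[1] predicts the latent difference score Δy, with residual z; Y[2] is formed from Y[1] and Δy through paths fixed at 1]
Note: The Δy is an unobserved variable whose moments $(\mu_\Delta, \sigma_\Delta^2, \sigma_{\Delta 1})$ are implied by the formulas $Y[2] = Y[1] + \Delta y$ AND $\Delta y = \alpha_0 + \alpha_1 Y[1] + z$.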
Mplus Latent Difference Score
TITLE: ADHD – Difference Model;
DATA: FILE = adhd_uncg_wide.dat;
  LISTWISE = ON;
VARIABLE: NAMES = id girl minority ses2yr dadhp7 doddp7
    adhd_to2 adhd_to4 adhd_to5 adhd_to7
    adhd_in2 adhd_in4 adhd_in5 adhd_in7
    adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to4 adhd_to7;
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL:
  adhd_to7 ON adhd_to4@1;
  adhd_to4 adhd_to7@;
  [adhd_to4 adhd_to7@0];
  delta BY adhd_to7@1;
  delta;
  [delta];
  delta ON adhd_to4; !Prediction of Change
OUTPUT: SAMPSTAT;
Mplus Output
Chi-Square Test of Model Fit
    Value                               0.000
    Degrees of Freedom                      0
    P-Value                            0.0000
CFI/TLI
    CFI                                 1.000
    TLI                                 1.000
Loglikelihood
    H0 Value                        -2246.950
    H1 Value                        -2246.950
Information Criteria
    Number of Free Parameters               5
    Akaike (AIC)                     4503.900
    Bayesian (BIC)                   4522.647
    Sample-Size Adjusted BIC         4506.789
      (n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
    Estimate                            0.000
    90 Percent C.I.                     0.000  0.000
    Probability RMSEA <= .05            0.000
Mplus Output
MODEL RESULTS
                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value
DELTA BY
    ADHD_TO7           1.000      0.000    999.000    999.000
DELTA ON
    ADHD_TO4          -0.353      0.048     -7.332      0.000
ADHD_TO7 ON
    ADHD_TO4           1.000      0.000    999.000    999.000
Means
    ADHD_TO4          13.349      0.529     25.236      0.000
Intercepts
    ADHD_TO7           0.000      0.000    999.000    999.000
    DELTA              3.434      0.786      4.368      0.000
Variances
    ADHD_TO4          87.857      7.012     12.530      0.000
Residual Variances
    ADHD_TO7          50.573      0.000    999.000    999.000
    DELTA             13.519      5.115      2.643      0.008
Autoregression vs. Change Estimates
1. We previously fit the auto-regression over time with MLE of
$Y[2]_n = \beta_0 + \beta_1 Y[1]_n + e_n = 3.43 + 0.65\, Y[1]_n + e_n$
2. We now use the same data and rewrite this expression as a direct change with MLE of
$(Y[2]_n - Y[1]_n) = \beta_0 + (\beta_1 - 1) Y[1]_n + z_n = \alpha_0 + \alpha_1 Y[1]_n + z_n$
$= 3.43 - 0.35\, Y[1]_n + z_n$
3. The residual variance ($\sigma_e^2$) is the same in both models, but the explained variance differs: in the auto-regression model the residual is compared to the variance at time 2 ($\sigma_2^2$), whereas in the latent difference model it is compared to the variance of the difference ($\sigma_\Delta^2$).
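The relation between the two parameterizations can be checked numerically. The following is a minimal sketch, assuming the ML sample moments reported in this deck's Mplus SAMPSTAT output for ADHD at ages 4 and 7 (N = 314); the variable names are illustrative only.

```python
# Sample moments copied from the Mplus SAMPSTAT output (ML estimates, N = 314).
var_y1 = 87.857      # variance of ADHD at age 4
var_y2 = 100.823     # variance of ADHD at age 7
cov_12 = 56.808      # covariance between age 4 and age 7
mean_y1, mean_y2 = 13.349, 12.065

# Auto-regression: Y[2] = beta0 + beta1 * Y[1] + e
beta1 = cov_12 / var_y1
beta0 = mean_y2 - beta1 * mean_y1
resid_var = var_y2 - beta1 ** 2 * var_y1

# Direct change: (Y[2] - Y[1]) = alpha0 + alpha1 * Y[1] + z
alpha0 = beta0
alpha1 = beta1 - 1.0

# Same residual variance, but a different reference variance in each model.
var_diff = var_y2 + var_y1 - 2.0 * cov_12
r2_autoreg = 1.0 - resid_var / var_y2
r2_change = 1.0 - resid_var / var_diff

print(f"beta0 = {beta0:.3f}, beta1 = {beta1:.3f}, alpha1 = {alpha1:.3f}")
print(f"residual variance = {resid_var:.2f}")
print(f"R2 (auto-regression) = {r2_autoreg:.3f}, R2 (change) = {r2_change:.3f}")
```

The slope and intercept agree with the reported estimates up to rounding, and the two R² values differ only because the same residual variance is divided by different total variances.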
Testing Hypotheses with Difference Scores
• Hypothesis 1: No change has occurred
• Hypothesis 2: ADHD at age 4 is not predictive of change in ADHD from age 4 to age 7
No Change
TITLE: ADHD – Difference Model;
DATA: FILE = adhd_uncg_wide.dat;
  LISTWISE = ON;
VARIABLE: NAMES = id girl minority ses2yr dadhp7 doddp7
    adhd_to2 adhd_to4 adhd_to5 adhd_to7
    adhd_in2 adhd_in4 adhd_in5 adhd_in7
    adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to4 adhd_to7;
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL:
  adhd_to7 ON adhd_to4@1;
  adhd_to4 adhd_to7@;
  [adhd_to4 adhd_to7@0];
  delta BY adhd_to7@1;
  delta;
  [delta@0]; !No Average Change
  delta WITH adhd_to4;
OUTPUT: SAMPSTAT;
Mplus Output
TESTS OF MODEL FIT
Chi-Square Test of Model Fit
    Value                               6.819
    Degrees of Freedom                      1
    P-Value                            0.0090
CFI/TLI
    CFI                                 0.959
    TLI                                 0.959
Loglikelihood
    H0 Value                        -2250.360
    H1 Value                        -2246.950
Information Criteria
    Number of Free Parameters               4
    Akaike (AIC)                     4508.719
    Bayesian (BIC)                   4523.717
    Sample-Size Adjusted BIC         4511.030
      (n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
    Estimate                            0.136
    90 Percent C.I.                     0.054  0.240
    Probability RMSEA <= .05            0.042
No Prediction of Change
TITLE: ADHD – Difference Model;
DATA: FILE = adhd_uncg_wide.dat;
  LISTWISE = ON;
VARIABLE: NAMES = id girl minority ses2yr dadhp7 doddp7
    adhd_to2 adhd_to4 adhd_to5 adhd_to7
    adhd_in2 adhd_in4 adhd_in5 adhd_in7
    adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to4 adhd_to7;
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL:
  adhd_to7 ON adhd_to4@1;
  adhd_to4 adhd_to7@;
  [adhd_to4 adhd_to7@0];
  delta BY adhd_to7@1;
  delta;
  [delta];
  delta ON adhd_to4@0; !No Prediction of Change
OUTPUT: SAMPSTAT;
Mplus Output
TESTS OF MODEL FIT
Chi-Square Test of Model Fit
    Value                              49.623
    Degrees of Freedom                      1
    P-Value                            0.0000
CFI/TLI
    CFI                                 0.656
    TLI                                 0.656
Loglikelihood
    H0 Value                        -2271.762
    H1 Value                        -2246.950
Information Criteria
    Number of Free Parameters               4
    Akaike (AIC)                     4551.523
    Bayesian (BIC)                   4566.521
    Sample-Size Adjusted BIC         4553.834
      (n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
    Estimate                            0.394
    90 Percent C.I.                     0.305  0.490
    Probability RMSEA <= .05            0.000
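Both hypothesis tests are one-degree-of-freedom likelihood-ratio comparisons against the saturated latent difference model. As a rough sketch (the helper name `lrt_1df` is hypothetical, and only the Python standard library is assumed), the chi-square values can be recovered from the reported loglikelihoods:

```python
import math

def lrt_1df(loglik_restricted, loglik_full):
    """Likelihood-ratio chi-square and p-value for a single (1 df) restriction."""
    chi_sq = 2.0 * (loglik_full - loglik_restricted)
    # For 1 df, the chi-square upper-tail probability equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_sq / 2.0))
    return chi_sq, p_value

loglik_saturated = -2246.950       # H1 value (unrestricted model)
loglik_no_change = -2250.360       # H0 value with [delta@0]
loglik_no_prediction = -2271.762   # H0 value with delta ON adhd_to4@0

chi_nc, p_nc = lrt_1df(loglik_no_change, loglik_saturated)
chi_np, p_np = lrt_1df(loglik_no_prediction, loglik_saturated)

print(f"No change:        chi-square = {chi_nc:.3f}, p = {p_nc:.4f}")
print(f"No prediction:    chi-square = {chi_np:.3f}, p = {p_np:.2e}")
```

Small discrepancies in the last digit relative to the Mplus output come from the loglikelihoods being printed to three decimals.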
6. Summary & Discussion
Summary of Two Occasion Models
• Given any two-occasion repeated measures data, we can write many alternative structural change models.
• It is not easy to distinguish these models by goodness-of-fit tests, because some can be exactly identified, rotated, and fit as well as one another.
• Our interpretations of change from these models are fundamentally restricted by these initial choices, so choose carefully and match the problem at hand.
• Measurement error can directly lower the determination of individual differences in changes.
• Note: The two models discussed here, auto-regression and difference scores, can be distinguished when t > 3 repeated measures are available.
Alternative Indices of Change
Assuming A[t] = the index at a particular time, we can calculate:
1. The Difference score: $A[2] - A[1]$
2. The Ratio 1 score: $A[2] / A[1]$
3. The Change Ratio 1 score: $\{A[2] - A[1]\} / A[1]$
4. The Change Ratio 2 score: $\{A[2] - A[1]\} / A[2]$
5. The Residual Change Ratio: $A[2] - \{\beta_0 + \beta_1 A[1]\}$, with $\beta_0$ and $\beta_1$ to be determined from the data
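The five indices listed above can be sketched for a single person as follows. The scores (10 and 12) are made-up illustrative values, the function name `change_indices` is hypothetical, and the regression weights are borrowed from the auto-regression estimates reported elsewhere in these slides.

```python
def change_indices(a1, a2, b0, b1):
    """Return the five change indices for one person's scores A[1] and A[2]."""
    return {
        "difference": a2 - a1,
        "ratio_1": a2 / a1,
        "change_ratio_1": (a2 - a1) / a1,
        "change_ratio_2": (a2 - a1) / a2,
        # Residual change: observed A[2] minus the value predicted from A[1].
        "residual_change": a2 - (b0 + b1 * a1),
    }

# Hypothetical scores; b0 and b1 taken from the fitted auto-regression.
example = change_indices(a1=10.0, a2=12.0, b0=3.434, b1=0.647)
for name, value in example.items():
    print(f"{name:>16}: {value:.3f}")
```

Note that the two ratio-based change scores differ only in their denominator, so they can rank people differently whenever A[1] and A[2] are far apart.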
References on Change Issues
• Re-considering Lord’s Paradox? Holland, P.W. & Rubin, D.B. (1983). In H. Wainer & S. Messick (Eds.), Principles of modern psychological measurement (pp. 3–25). Hillsdale, NJ: Erlbaum. (Also see Laird, N., 1983, Am. Stat., 37, 329-330.)
• Additional problems due to Regression to the Mean? (Nesselroade et al., 1981)
• The problems of un-reliability of difference scores? (Burr and Nesselroade, 1990)
• Alternative change measures in Pretest-Posttest? (Bonate, 2000, “Analysis of Pretest-Posttest Designs”)
• Issues in Significance Testing? Harlow, L.L., Mulaik, S.A. & Steiger, J.H. (1997). What if there were no significance tests? Hillsdale, NJ: Erlbaum.
References on SEM
• Baltes, P.B., Dittmann-Kohli, F. & Kliegl, R. (1986). Reserve capacity of the elderly in aging-sensitive tests of fluid intelligence: Replication and extension. Psychology and Aging, 1 (2), 172-177.
• Joreskog, K.G. & Sorbom, D. (1979). Advances in factor analysis and structural equation models. J. Magidson (Ed.). Cambridge, MA: Abt Books.
• Loehlin, J.C. (1998). Latent Variable Models: An Introduction to Factor, Path, and Structural Analysis (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
• McArdle, J.J. (1996). Current directions in structural factor analysis. Current Directions in Psychological Science, 5 (1), 11-18.
• McDonald, R.P. (1985). Factor Analysis and Related Methods. Hillsdale, NJ: Lawrence Erlbaum Associates.
Session C:
Group Differences over Two Occasions
Kevin Grimm
University of California, Davis
June 9, 2008
Overview
1. Group Differences in Longitudinal Data
2. Traditional ANCOVA in Two Occasions
3. ANOVA Difference Score Regression
4. Multiple Group Structural Equation Modeling
5. Latent Class Mixture Modeling
6. Summary & Discussion
1. Group Differences in Longitudinal Data
Group differences in change
• The alternative models of change are often considered in the context of a group difference
  • Differences between groups can be considered for any feature of the model (e.g., means, regressions, etc.)
• If assignment to groups is based on random selection, then the inferences about the source of the subsequent changes are clear: the impact is due to the assignment
  • The SEM approach can add power to the model
• If assignment to groups is pre-existing or based on non-random selection (e.g., self selection), then the inferences about the source of the changes are often ambiguous: the changes may be due to other features related to the initial selection
  • The SEM approach does not counteract the selection mechanism, but it can help clarify some of the ambiguity.
Traditional concept of group differences on the distribution of a variable
[Figure: distributions of a variable for Group A and Group B]
Group information on ADHD Data
• Gender
  • Coded 0 for male (n = 145), 1 for female (n = 169)
• Questions related to group information
  • Do males/females tend to display more symptoms of ADHD?
  • Are there more/less between-person differences in ADHD for males/females?
  • Do males/females change more in their symptoms of ADHD?
  • Are males/females more variable in their patterns of change in ADHD?
Descriptive Information in ADHD by Gender

Statistic                  Overall (N=314)   Males (nm=145)   Females (nf=169)
ADHD Age 4 Mean (SD)       13.35 (9.39)      14.60 (9.03)     12.27 (9.58)
ADHD Age 7 Mean (SD)       12.07 (10.06)     13.68 (9.95)     10.68 (9.97)
Age 4/Age 7 Correlation    .60               .49              .68
A note on Lord’s paradox
• Lord, F.M. (1967). A paradox in the interpretation of group comparisons. Psychological Bulletin, 68, 304-305.
• “A large university is interested in investigating the effects on the students of the diet provided in the university dining halls….Various types of data are gathered. In particular the weight of each student at the time of his arrival in September and his weight in the following June are recorded”
• Lord suggested that Analyst A used ANCOVA to remove the pre-existing differences, and Analyst B used Repeated Measures ANOVA to examine changes
• The paradox: Analyst A obtained significant group differences but Analyst B did not, and then Lord simply ended the paper…
2. Traditional ANCOVA in SEM
Multiple Regression ANCOVA
• A linear model is expressed (for n = 1 to N) as
$Y[2]_n = b_0 + b_1 Y[1]_n + b_2 G_n + e_n$
where G is a “binary” variable
• If G is coded in dummy (0,1) form we can write
$Y[2]_n\,[G_n = 0] = b_0 + b_1 Y[1]_n + b_2 \cdot 0 + e_n$
$Y[2]_n\,[G_n = 1] = b_0 + b_1 Y[1]_n + b_2 \cdot 1 + e_n$
so b0 is the intercept for the group coded 0, b1 is the slope (the same in both groups in this model), and b2 is the change in the intercept for the group coded 1
An Auto-Regression model with Group Differences
[Path diagram: Y[1] and G predict Y[2], with residual e; G and Y[1] are allowed to covary]
Autoregression Input Script (SAS)
TITLE1 'Structural Equation Models of Change';
TITLE2 'Auto-Regression and Difference Score Models of Change with Group Information';

*Type A with Group - Traditional ANCOVA Model with Repeated Measures;
DATA adhd2;
  SET adhd;
  adhd_by_girl = adhd_to4 * girl;
RUN;

PROC REG DATA = adhd2;
  ModA1: MODEL adhd_to7 = adhd_to4 / STB;
  ModA2: MODEL adhd_to7 = adhd_to4 girl / STB;
  ModA3: MODEL adhd_to7 = adhd_to4 girl adhd_by_girl / STB;
RUN;
Autoregression Input Script (SPSS)
*Computing Interaction between gender and ADHD at age 4.
COMPUTE adhd_by_girl = adhd_to4 * girl .
EXECUTE .

*Autoregressive Model with Group Information.
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT adhd_to7
  /METHOD=ENTER adhd_to4
  /METHOD=ENTER girl adhd_to4
  /METHOD=ENTER girl adhd_to4 adhd_by_girl .
SAS Autoregression
The REG Procedure

Number of Observations Read                    431
Number of Observations Used                    314
Number of Observations with Missing Values     117

Analysis of Variance
                            Sum of        Mean
Source             DF      Squares      Square   F Value   Pr > F
Model               1        11534       11534    178.81   <.0001
Error             312        20125    64.50274
Corrected Total   313        31659

Root MSE           8.03136    R-Square   0.3643
Dependent Mean    12.06524    Adj R-Sq   0.3623
Coeff Var         66.56613

Parameter Estimates
                  Parameter   Standard                        Standardized
Variable     DF    Estimate      Error   t Value   Pr > |t|       Estimate
Intercept     1     3.43388    0.78871      4.35     <.0001              0
adhd_to4      1     0.64659    0.04835     13.37     <.0001        0.60359
Autoregression with Intercept Difference

Analysis of Variance
                            Sum of         Mean
Source             DF      Squares       Square   F Value   Pr > F
Model               2        11708   5854.06296     91.26   <.0001
Error             311        19950     64.14934
Corrected Total   313        31659

Root MSE           8.00933    R-Square   0.3698
Dependent Mean    12.06524    Adj R-Sq   0.3658
Coeff Var         66.38352

Parameter Estimates
                  Parameter   Standard                        Standardized
Variable     DF    Estimate      Error   t Value   Pr > |t|       Estimate
Intercept     1     4.37737    0.97266      4.50     <.0001              0
adhd_to4      1     0.63666    0.04860     13.10     <.0001        0.59431
girl          1    -1.50657    0.91369     -1.65     0.1002       -0.07480
Mplus Autoregression & Group Input
VARIABLE: NAMES = id girl minority ses2yr dadhp7 doddp7
    adhd_to2 adhd_to4 adhd_to5 adhd_to7
    adhd_in2 adhd_in4 adhd_in5 adhd_in7
    adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to4 adhd_to7 girl;
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL:
  adhd_to7 ON adhd_to4;
  adhd_to4 adhd_to7;
  [adhd_to4 adhd_to7];
  adhd_to7 ON girl;
  [girl*.5] (M_girl);
  girl (V_girl);
  girl WITH adhd_to4;
MODEL CONSTRAINT:
  v_girl = M_girl * (1 - M_girl);
OUTPUT: SAMPSTAT TECH1;
Mplus output
Means
              ADHD_TO7    ADHD_TO4    GIRL
 1              12.065      13.349    0.538

Covariances
              ADHD_TO7    ADHD_TO4    GIRL
ADHD_TO7       100.823
ADHD_TO4        56.808      87.857
GIRL            -0.743      -0.579    0.249

Correlations
              ADHD_TO7    ADHD_TO4    GIRL
ADHD_TO7         1.000
ADHD_TO4         0.604       1.000
GIRL            -0.148      -0.124    1.000

TESTS OF MODEL FIT
Chi-Square Test of Model Fit
    Value                               0.000
    Degrees of Freedom                      1
    P-Value                            1.0000
CFI/TLI
    CFI                                 1.000
    TLI                                 1.014
Loglikelihood
    H0 Value                        -2470.129
    H1 Value                        -2470.129
Information Criteria
    Number of Free Parameters               8
    Akaike (AIC)                     4956.259
    Bayesian (BIC)                   4986.254
    Sample-Size Adjusted BIC         4960.880
      (n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
    Estimate                            0.000
    90 Percent C.I.                     0.000  0.000
    Probability RMSEA <= .05            1.000
Mplus output
MODEL RESULTS
                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value
ADHD_TO7 ON
    ADHD_TO4           0.637      0.048     13.164      0.000
    GIRL              -1.507      0.909     -1.657      0.098
GIRL WITH
    ADHD_TO4          -0.579      0.262     -2.214      0.027
Means
    ADHD_TO4          13.349      0.529     25.239      0.000
    GIRL               0.538      0.028     19.242      0.000
Intercepts
    ADHD_TO7           4.377      0.968      4.522      0.000
Variances
    ADHD_TO4          87.857      7.011     12.531      0.000
    GIRL               0.249      0.002    116.258      0.000
Residual Variances
    ADHD_TO7          63.536      5.071     12.530      0.000
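To see what the intercept-difference model implies, the estimates above can be plugged into the ANCOVA equation. This is a small illustrative sketch (the function name `predicted_age7` is hypothetical); it also checks the MODEL CONSTRAINT that ties the variance of a binary variable to its mean.

```python
# Estimates copied from the Mplus MODEL RESULTS above.
b0 = 4.377    # intercept for the group coded 0 (males)
b1 = 0.637    # common slope on ADHD at age 4
b2 = -1.507   # intercept shift for the group coded 1 (females)
mean_age4 = 13.349
mean_girl = 0.538

def predicted_age7(adhd_age4, girl):
    """Predicted ADHD at age 7 from ADHD at age 4 and the 0/1 group code."""
    return b0 + b1 * adhd_age4 + b2 * girl

pred_male = predicted_age7(mean_age4, 0)
pred_female = predicted_age7(mean_age4, 1)

# Variance of a binary (0/1) variable implied by its mean: p * (1 - p).
var_girl = mean_girl * (1.0 - mean_girl)

print(f"Predicted age-7 score at the mean age-4 score: "
      f"males {pred_male:.2f}, females {pred_female:.2f}")
print(f"Implied variance of girl: {var_girl:.3f}")
```

Because the slope is common to both groups, the two predicted lines are parallel and differ by b2 at every age-4 score.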
Mplus output Autoregression Model with Group Difference
[Path diagram: Y[1] and G predict Y[2], with residual e[2]]
Predicted Regression Lines
[Plot: predicted regression lines for Males and Females]
Autoregression Model Group and Pre-Existing Initial Differences
[Path diagram: G predicts both Y[1] and Y[2]; Y[1] predicts Y[2]; residuals e[1] and e[2]]
Two Occasion ANCOVA with Interaction
• The model for multiple groups can be written as
$Y[2]_n = b_0 + b_1 Y[1]_n + b_2 G_n + b_3 (Y[1]_n \cdot G_n) + e_n$
where G is a “binary” variable (“yes or no”) and the “product” variable is created as $(Y[1]_n \cdot G_n)$
• If G is coded in dummy (0,1) form we can write
$Y[2]_n\,[G_n = 0] = b_0 + b_1 Y[1]_n + b_2 \cdot 0 + b_3 \cdot 0 + e_n$
$Y[2]_n\,[G_n = 1] = b_0 + b_1 Y[1]_n + b_2 \cdot 1 + b_3 Y[1]_n \cdot 1 + e_n$
b0 is the intercept for the group coded 0
b1 is the slope for the group coded 0
b2 is the change in the intercept for the group coded 1
b3 is the change in the slope for the group coded 1
Autoregression Model with Group Differences and Interaction
[Path diagram: Y[1], G, and the product G*Y[1] predict Y[2], with residual e; the three predictors are allowed to covary]
Autoregression with Int+Slope Differences

Analysis of Variance
                            Sum of         Mean
Source             DF      Squares       Square   F Value   Pr > F
Model               3        11897   3965.51217     62.21   <.0001
Error             310        19762     63.74850
Corrected Total   313        31659

Root MSE           7.98427    R-Square   0.3758
Dependent Mean    12.06524    Adj R-Sq   0.3697
Coeff Var         66.17580

Parameter Estimates
                     Parameter   Standard                        Standardized
Variable        DF    Estimate      Error   t Value   Pr > |t|       Estimate
Intercept        1     5.76989    1.26343      4.57     <.0001              0
adhd_to4         1     0.54130    0.07364      7.35     <.0001        0.50530
girl             1    -3.79182    1.61139     -2.35     0.0192       -0.18826
adhd_by_girl     1     0.16810    0.09778      1.72     0.0866        0.15570
Mplus Autoregression & Interaction
VARIABLE: NAMES = id girl minority ses2yr dadhp7 doddp7
    adhd_to2 adhd_to4 adhd_to5 adhd_to7
    adhd_in2 adhd_in4 adhd_in5 adhd_in7
    adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to4 adhd_to7 girl g_by_adhd;
DEFINE:
  g_by_adhd = girl * adhd_to4;
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL:
  adhd_to7 ON adhd_to4;
  adhd_to4 adhd_to7;
  [adhd_to4 adhd_to7];
  adhd_to7 ON girl;
  [girl*.5] (M_girl);
  girl (V_girl);
  girl WITH adhd_to4;
  !New Additions
  adhd_to7 ON g_by_adhd;
  [g_by_adhd*.5];
  g_by_adhd;
  g_by_adhd WITH girl adhd_to4;
MODEL CONSTRAINT:
  v_girl = M_girl * (1 - M_girl);
OUTPUT: SAMPSTAT TECH1;
Mplus Output
MODEL RESULTS
                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value
ADHD_TO7 ON
    ADHD_TO4           0.541      0.073      7.398      0.000
    GIRL              -3.792      1.601     -2.368      0.018
    G_BY_ADHD          0.168      0.097      1.730      0.084
GIRL WITH
    ADHD_TO4          -0.579      0.262     -2.214      0.027
G_BY_ADH WITH
    GIRL               3.050      0.199     15.342      0.000
    ADHD_TO4          41.960      5.431      7.726      0.000
Means
    ADHD_TO4          13.349      0.529     25.239      0.000
    GIRL               0.538      0.028     19.243      0.000
    G_BY_ADHD          6.605      0.524     12.616      0.000
Intercepts
    ADHD_TO7           5.770      1.255      4.596      0.000
Variances
    ADHD_TO4          87.857      7.011     12.531      0.000
    GIRL               0.249      0.002    116.254      0.000
    G_BY_ADHD         86.504      6.232     13.880      0.000
Residual Variances
    ADHD_TO7          62.938      5.023     12.530      0.000
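With the interaction included, each group has its own intercept and slope. The sketch below, which uses only the estimates printed above, works out the group-specific regression lines; the crossing point is simple algebra on those two lines and is added here purely for illustration.

```python
# Estimates copied from the interaction model above.
b0 = 5.770    # intercept, group coded 0 (males)
b1 = 0.541    # slope, group coded 0 (males)
b2 = -3.792   # change in intercept for the group coded 1 (females)
b3 = 0.168    # change in slope for the group coded 1 (females)

male_intercept, male_slope = b0, b1
female_intercept, female_slope = b0 + b2, b1 + b3

# The lines cross where the group difference b2 + b3 * Y[1] equals zero.
crossing_age4_score = -b2 / b3

print(f"Males:   Y[2] = {male_intercept:.3f} + {male_slope:.3f} * Y[1]")
print(f"Females: Y[2] = {female_intercept:.3f} + {female_slope:.3f} * Y[1]")
print(f"Lines cross at an age-4 score of about {crossing_age4_score:.1f}")
```

The female intercept and slope obtained this way are the same values that the multiple-group model with free intercepts and slopes reports directly for the female group.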
Predicted Regression Lines
[Plot: predicted regression lines for Males and Females]
3. Repeated Measures ANOVA and
Difference Score Regression
Difference Score ANOVA / ANCOVA
• A difference model is expressed (for n = 1 to N) as
$Y[2]_n = Y[1]_n + \Delta y_n$ and $\Delta y_n = \mu_\Delta + \Delta y^*_n$
now we add G as a “binary” variable (“yes or no”)
$\Delta y_n = g_0 + g_1 G_n + d_n$
g0 is the intercept (average difference) for the group coded 0, and g1 is the change in the intercept for the group coded 1
• We can add an interaction here simply by writing
$\Delta y_n = g_0 + g_1 G_n + g_2 \{Y[1]_n G_n\} + d_n$
so g2 is the difference in the slope for the group coded 1
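A Latent Difference Score Model with Group Differences
[Path diagram: Y[1] and G predict the latent difference score Δy, with residual d; Y[2] is formed from Y[1] and Δy through paths fixed at 1; G and Y[1] are allowed to covary]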
SAS/SPSS Input for Difference Score Model
PROC REG DATA = diff;
  ModD0a: MODEL delta = / STB;
  ModD1a: MODEL delta = adhd_to4 / STB;
  ModD1b: MODEL delta = girl / STB;
  ModD2: MODEL delta = adhd_to4 girl / STB;
  ModD3: MODEL delta = adhd_to4 girl adhd_by_girl / STB;
RUN;

*Difference Model with Group Information.
REGRESSION
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT delta
  /METHOD=ENTER adhd_to4
  /METHOD=ENTER girl adhd_to4
  /METHOD=ENTER girl adhd_to4
  /METHOD=ENTER girl adhd_to4 adhd_by_girl .
Side-by-Side Distributions of Differences
0 = Boys 1 = Girls
Group Difference Score - SAS Output
Dependent Variable: delta

Number of Observations Read                    431
Number of Observations Used                    314
Number of Observations with Missing Values     117

Analysis of Variance
                             Sum of         Mean
Source             DF       Squares       Square   F Value   Pr > F
Model               2    3619.94111   1809.97056     28.21   <.0001
Error             311         19950     64.14934
Corrected Total   313         23570

Root MSE           8.00933    R-Square   0.1536
Dependent Mean    -1.28374    Adj R-Sq   0.1481
Coeff Var       -623.90654

Parameter Estimates
                  Parameter   Standard                        Standardized
Variable     DF    Estimate      Error   t Value   Pr > |t|       Estimate
Intercept     1     4.37737    0.97266      4.50     <.0001              0
adhd_to4      1    -0.36334    0.04860     -7.48     <.0001       -0.39308
girl          1    -1.50657    0.91369     -1.65     0.1002       -0.08669
Mplus Latent Difference Score with Group Information
VARIABLE: NAMES = id girl minority ses2yr dadhp7 doddp7
    adhd_to2 adhd_to4 adhd_to5 adhd_to7
    adhd_in2 adhd_in4 adhd_in5 adhd_in7
    adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to4 adhd_to7 girl;
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL:
  adhd_to7 ON adhd_to4@1;
  adhd_to4 adhd_to7@;
  [adhd_to4 adhd_to7@0];
  delta BY adhd_to7@1;
  delta;
  [delta];
  delta ON adhd_to4 girl;
OUTPUT: SAMPSTAT;
Mplus Output
MODEL RESULTS
                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value
DELTA BY
    ADHD_TO7           1.000      0.000    999.000    999.000
DELTA ON
    ADHD_TO4          -0.363      0.048     -7.513      0.000
    GIRL              -1.507      0.909     -1.657      0.098
ADHD_TO7 ON
    ADHD_TO4           1.000      0.000    999.000    999.000
GIRL WITH
    ADHD_TO4          -0.579      0.262     -2.214      0.027
Means
    ADHD_TO4          13.349      0.525     25.433      0.000
Intercepts
    ADHD_TO7           0.000      0.000    999.000    999.000
    DELTA              4.377      0.968      4.522      0.000
Variances
    ADHD_TO4          87.857      7.011     12.531      0.000
Residual Variances
    ADHD_TO7          50.573      0.000    999.000    999.000
    DELTA             12.963      5.071      2.556      0.011
Results of Latent Difference Score Model with Group Differences
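[Path diagram with estimates: mean of Y[1] = 13.35*; variance of Y[1] = 87.85*; intercept of Δy = 4.37*; Y[1] → Δy = -.36*; G → Δy = -1.51; mean of G = .54; variance of G = .25; covariance of G with Y[1] = -.58*; residual variance of Δy = 12.96*; paths from Y[1] and Δy to Y[2] fixed at 1]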
Predicted Change in ADHD by Groups
[Plot: predicted change in ADHD for Males and Females]
Latent Difference Score Results By Group
• The latent difference model implies
$\mu[2] - \mu[1] = b_0 + b_1 \mu[1] + b_2 G_n$
since G is a “binary” variable, then for two groups (with $\mu[1] = 13.35$)
$\mu[2] - \mu[1] \,\{|G=0\} = 4.38 - .36\,\mu[1] - 1.51 \cdot 0 = -.43$
$\mu[2] - \mu[1] \,\{|G=1\} = 4.38 - .36\,\mu[1] - 1.51 \cdot 1 = -1.94$
• The latent difference model implies
$\mu[2] = \mu[1] + b_1 \mu[1] + b_0 + b_2 G_n$
since G is a “binary” variable, then for two groups
$\mu[2] \,\{|G=0\} = 13.35 + 4.38 - .36\,\mu[1] - 1.51 \cdot 0 = 12.92$
$\mu[2] \,\{|G=1\} = 13.35 + 4.38 - .36\,\mu[1] - 1.51 \cdot 1 = 11.41$
Difference Score, Group & Interaction
SAS output
Dependent Variable: delta

Number of Observations Read                    431
Number of Observations Used                    314
Number of Observations with Missing Values     117

Analysis of Variance
                             Sum of         Mean
Source             DF       Squares       Square   F Value   Pr > F
Model               3    3808.35172   1269.45057     19.91   <.0001
Error             310         19762     63.74850
Corrected Total   313         23570

Root MSE           7.98427    R-Square   0.1616
Dependent Mean    -1.28374    Adj R-Sq   0.1535
Coeff Var       -621.95421

Parameter Estimates
                     Parameter   Standard                        Standardized
Variable        DF    Estimate      Error   t Value   Pr > |t|       Estimate
Intercept        1     5.76989    1.26343      4.57     <.0001              0
adhd_to4         1    -0.45870    0.07364     -6.23     <.0001       -0.49624
girl             1    -3.79182    1.61139     -2.35     0.0192       -0.21819
adhd_by_girl     1     0.16810    0.09778      1.72     0.0866        0.18045
Latent Difference Score, Group & Interaction
Mplus
VARIABLE: NAMES = id girl minority ses2yr dadhp7 doddp7
    adhd_to2 adhd_to4 adhd_to5 adhd_to7
    adhd_in2 adhd_in4 adhd_in5 adhd_in7
    adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to4 adhd_to7 girl g_by_adhd;
DEFINE:
  g_by_adhd = girl * adhd_to4;
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL:
  adhd_to7 ON adhd_to4@1;
  adhd_to4 adhd_to7@;
  [adhd_to4 adhd_to7@0];
  delta BY adhd_to7@1;
  delta;
  [delta];
  delta ON adhd_to4 girl g_by_adhd;
OUTPUT: SAMPSTAT TECH1;
Mplus Output
                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value
DELTA BY
    ADHD_TO7           1.000      0.000    999.000    999.000
DELTA ON
    ADHD_TO4          -0.459      0.073     -6.268      0.000
    GIRL              -3.792      1.601     -2.368      0.018
    G_BY_ADHD          0.168      0.097      1.730      0.084
ADHD_TO7 ON
    ADHD_TO4           1.000      0.000    999.000    999.000
GIRL WITH
    ADHD_TO4          -0.579      0.172     -3.366      0.001
G_BY_ADH WITH
    ADHD_TO4          41.961      3.211     13.066      0.000
Means
    ADHD_TO4          13.349      0.345     38.660      0.000
Intercepts
    ADHD_TO7           0.000      0.000    999.000    999.000
    DELTA              5.770      1.255      4.596      0.000
Variances
    ADHD_TO4          87.859      5.742     15.300      0.000
Residual Variances
    ADHD_TO7          50.573      0.000    999.000    999.000
    DELTA             12.370      5.024      2.462      0.014
Predicted Change in ADHD by Groups
[Plot: predicted change in ADHD for Males and Females]
Group Latent Difference Score Results
• The latent difference model implies
$\mu[2] - \mu[1] = b_0 + b_1 \mu[1] + b_2 G_n + b_3 \mu[1] G_n$
since G is a “binary” variable, then for two groups
$\mu[2] - \mu[1] \,\{|G=0\} = 5.77 - .46 \cdot 13.35 = -.37$
$\mu[2] - \mu[1] \,\{|G=1\} = 5.77 - .46 \cdot 13.35 - 3.79 + .17 \cdot 13.35 = -1.89$
• The latent difference model implies
$\mu[2] = \mu[1] + b_1 \mu[1] + b_0 + b_2 G_n + b_3 \mu[1] G_n$
since G is a “binary” variable, then for two groups
$\mu[2] \,\{|G=0\} = 13.35 + 5.77 - .46 \cdot 13.35 = 12.98$
$\mu[2] \,\{|G=1\} = 13.35 + 5.77 - .46 \cdot 13.35 - 3.79 + .17 \cdot 13.35 = 11.46$
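The arithmetic on this slide can be reproduced in a few lines. This sketch uses the rounded coefficients exactly as they appear on the slide and, like the slide, evaluates both groups at the overall age-4 mean of 13.35; the function name `implied_change` is hypothetical.

```python
# Rounded coefficients as printed on the slide.
b0, b1, b2, b3 = 5.77, -0.46, -3.79, 0.17
mu1 = 13.35  # overall mean of ADHD at age 4

def implied_change(girl):
    """Model-implied mean change mu[2] - mu[1] for a 0/1 group code."""
    return b0 + b1 * mu1 + b2 * girl + b3 * mu1 * girl

change_male = implied_change(0)
change_female = implied_change(1)
mu2_male = mu1 + change_male
mu2_female = mu1 + change_female

print(f"Implied change: males {change_male:.2f}, females {change_female:.2f}")
print(f"Implied age-7 mean: males {mu2_male:.2f}, females {mu2_female:.2f}")
```

Using the unrounded Mplus estimates, or each group's own age-4 mean, would shift these values slightly.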
4: Representing Group Information in
Multiple Groups (MG) SEM
Multiple Groups (MG) in SEM
In the typical path models the group (G) variable is included to account for systematic variation in the patterns of effects based on conditional means:
$Y[2]_n = b_0 + b_1 Y[1]_n + b_g G_n + e_n$
In the Multiple-Group models we add information about groups by writing alternatives based on allowing any model parameter to be conditional upon group membership:
$Y[2]_n^{(g)} = b_0^{(g)} + b_1^{(g)} Y[1]_n^{(g)} + e_n^{(g)}$
We can even ask the general question “Does the pattern of effects change over G measured groups?”
Multiple-Groups (MG) SEM research
• A common research question: Are the model parameters equal across groups of persons?
• If (a) there are not many groups (G < 10), (b) they are nominal categories, and (c) independent, then we can examine these questions using a multiple group approach (from Joreskog, 1971; Sorbom, 1974, 1978)
• The initial data are scaled, transformed, checked for outliers, and then split up into G independent groups of subjects. Some kind of SEM is used to examine a model both within and between groups
Multiple Group Autoregression Model
[Path diagrams: the autoregression model drawn separately for Males (m) and Females (f), each with its own set of parameters]
Multiple Group Approaches
• Top-Down
– Begin with Invariance Model (All parameters equal across groups)
– Gradually relax equality constraints in a logical manner
– Similar to Regression Approach Outline
• Bottom-Up
– Begin with No Equality Constraints (All parameters separately estimated for each group)
– Gradually impose equality constraints in a logical manner until loss in fit is deemed unacceptable
Multiple Group Invariance using Mplus
TITLE: ADHD – Autoregressive Model - Multiple Group - Invariance;
DATA: FILE = adhd_uncg_wide.dat;
  LISTWISE = ON;
VARIABLE: NAMES = id girl minority ses2yr dadhp7 doddp7
    adhd_to2 adhd_to4 adhd_to5 adhd_to7
    adhd_in2 adhd_in4 adhd_in5 adhd_in7
    adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to4 adhd_to7;
  GROUPING = girl (0 = male, 1 = female);
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL:
  adhd_to7 ON adhd_to4 (b1);
  adhd_to4 (V_4)
  adhd_to7 (V_7);
  [adhd_to4] (M_4);
  [adhd_to7] (b0);
OUTPUT: SAMPSTAT;
Mplus Output
SUMMARY OF ANALYSIS
Number of groups                    2
Number of observations
    Group MALE                    145
    Group FEMALE                  169

SAMPLE STATISTICS FOR MALE
Means
              ADHD_TO7    ADHD_TO4
 1              13.675      14.604
Covariances
              ADHD_TO7    ADHD_TO4
ADHD_TO7        98.372
ADHD_TO4        43.882      81.067
Correlations
              ADHD_TO7    ADHD_TO4
ADHD_TO7         1.000
ADHD_TO4         0.491       1.000

SAMPLE STATISTICS FOR FEMALE
Means
              ADHD_TO7    ADHD_TO4
 1              10.684      12.272
Covariances
              ADHD_TO7    ADHD_TO4
ADHD_TO7        98.796
ADHD_TO4        64.678      91.173
Correlations
              ADHD_TO7    ADHD_TO4
ADHD_TO7         1.000
ADHD_TO4         0.681       1.000
Mplus Output
TESTS OF MODEL FIT
Chi-Square Test of Model Fit
    Value                              15.743
    Degrees of Freedom                      5
    P-Value                            0.0076
Chi-Square Contributions From Each Group
    MALE                                8.296
    FEMALE                              7.447
CFI/TLI
    CFI                                 0.925
    TLI                                 0.970
Loglikelihood
    H0 Value                        -2246.950
    H1 Value                        -2239.079
Information Criteria
    Number of Free Parameters               5
    Akaike (AIC)                     4503.900
    Bayesian (BIC)                   4522.647
    Sample-Size Adjusted BIC         4506.789
      (n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
    Estimate                            0.117
    90 Percent C.I.                     0.055  0.184
Mplus Output
MODEL RESULTS
                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value
Group MALE
ADHD_TO7 ON
    ADHD_TO4           0.647      0.048     13.415      0.000
Means
    ADHD_TO4          13.349      0.529     25.237      0.000
Intercepts
    ADHD_TO7           3.434      0.786      4.368      0.000
Variances
    ADHD_TO4          87.855      7.011     12.530      0.000
Residual Variances
    ADHD_TO7          64.090      5.115     12.530      0.000

Group FEMALE
ADHD_TO7 ON
    ADHD_TO4           0.647      0.048     13.415      0.000
Means
    ADHD_TO4          13.349      0.529     25.237      0.000
Intercepts
    ADHD_TO7           3.434      0.786      4.368      0.000
Variances
    ADHD_TO4          87.855      7.011     12.530      0.000
Residual Variances
    ADHD_TO7          64.090      5.115     12.530      0.000
Multiple Group - Partial Invariance
TITLE: ADHD – Autoregressive Model - Multiple Group - Intercept;
DATA: FILE = adhd_uncg_wide.dat;
  LISTWISE = ON;
VARIABLE: NAMES = id girl minority ses2yr dadhp7 doddp7
    adhd_to2 adhd_to4 adhd_to5 adhd_to7
    adhd_in2 adhd_in4 adhd_in5 adhd_in7
    adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to4 adhd_to7;
  GROUPING = girl (0 = male, 1 = female);
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL:
  adhd_to7 ON adhd_to4 (b1);
  adhd_to4 (V_4)
  adhd_to7 (V_7);
  [adhd_to4] (M_4);
  [adhd_to7] (b0);
MODEL male:
  [adhd_to7];
MODEL female:
  [adhd_to7];
OUTPUT: SAMPSTAT;
Mplus Output
TESTS OF MODEL FIT
Chi-Square Test of Model Fit
    Value                              13.010
    Degrees of Freedom                      4
    P-Value                            0.0112
Chi-Square Contributions From Each Group
    MALE                                6.716
    FEMALE                              6.294
CFI/TLI
    CFI                                 0.937
    TLI                                 0.969
Loglikelihood
    H0 Value                        -2245.584
    H1 Value                        -2239.079
Information Criteria
    Number of Free Parameters               6
    Akaike (AIC)                     4503.167
    Bayesian (BIC)                   4525.663
    Sample-Size Adjusted BIC         4506.633
      (n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
    Estimate                            0.120
    90 Percent C.I.                     0.051  0.195
Mplus Output
MODEL RESULTS
                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value
Group MALE
ADHD_TO7 ON
    ADHD_TO4           0.637      0.048     13.164      0.000
Means
    ADHD_TO4          13.349      0.529     25.236      0.000
Intercepts
    ADHD_TO7           4.377      0.968      4.522      0.000
Variances
    ADHD_TO4          87.858      7.012     12.530      0.000
Residual Variances
    ADHD_TO7          63.536      5.071     12.530      0.000

Group FEMALE
ADHD_TO7 ON
    ADHD_TO4           0.637      0.048     13.164      0.000
Means
    ADHD_TO4          13.349      0.529     25.236      0.000
Intercepts
    ADHD_TO7           2.871      0.853      3.364      0.001
Variances
    ADHD_TO4          87.858      7.012     12.530      0.000
Residual Variances
    ADHD_TO7          63.536      5.071     12.530      0.000
Multiple Group - Partial Invariance
TITLE: ADHD – Autoregressive Model - Multiple Group – Intercept/Slope;
DATA: FILE = adhd_uncg_wide.dat;
  LISTWISE = ON;
VARIABLE: NAMES = id girl minority ses2yr dadhp7 doddp7
    adhd_to2 adhd_to4 adhd_to5 adhd_to7
    adhd_in2 adhd_in4 adhd_in5 adhd_in7
    adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to4 adhd_to7;
  GROUPING = girl (0 = male, 1 = female);
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL:
  adhd_to7 ON adhd_to4 (b1);
  adhd_to4 (V_4)
  adhd_to7 (V_7);
  [adhd_to4] (M_4);
  [adhd_to7] (b0);
MODEL male:
  [adhd_to7];
  adhd_to7 ON adhd_to4;
MODEL female:
  [adhd_to7];
  adhd_to7 ON adhd_to4;
OUTPUT: SAMPSTAT;
Mplus Output
TESTS OF MODEL FIT
Chi-Square Test of Model Fit
    Value                              10.030
    Degrees of Freedom                      3
    P-Value                            0.0183
Chi-Square Contributions From Each Group
    MALE                                5.282
    FEMALE                              4.749
CFI/TLI
    CFI                                 0.951
    TLI                                 0.967
Loglikelihood
    H0 Value                        -2244.094
    H1 Value                        -2239.079
Information Criteria
    Number of Free Parameters               7
    Akaike (AIC)                     4502.188
    Bayesian (BIC)                   4528.433
    Sample-Size Adjusted BIC         4506.231
      (n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
    Estimate                            0.122
    90 Percent C.I.                     0.044  0.209
Mplus Output
MODEL RESULTS
                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value
Group MALE
ADHD_TO7 ON
    ADHD_TO4           0.541      0.073      7.398      0.000
Means
    ADHD_TO4          13.349      0.529     25.236      0.000
Intercepts
    ADHD_TO7           5.770      1.255      4.596      0.000
Variances
    ADHD_TO4          87.858      7.012     12.530      0.000
Residual Variances
    ADHD_TO7          62.936      5.023     12.530      0.000

Group FEMALE
ADHD_TO7 ON
    ADHD_TO4           0.709      0.064     11.100      0.000
Means
    ADHD_TO4          13.349      0.529     25.236      0.000
Intercepts
    ADHD_TO7           1.978      0.994      1.990      0.047
Variances
    ADHD_TO4          87.858      7.012     12.530      0.000
Residual Variances
    ADHD_TO7          62.936      5.023     12.530      0.000
Multiple Group - Partial Invariance
TITLE: Autoregressive - Multiple Group – Intercept/Slope/Residual Variance;
DATA: FILE = adhd_uncg_wide.dat;
  LISTWISE = ON;
VARIABLE: NAMES = id girl minority ses2yr dadhp7 doddp7
    adhd_to2 adhd_to4 adhd_to5 adhd_to7
    adhd_in2 adhd_in4 adhd_in5 adhd_in7
    adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to4 adhd_to7;
  GROUPING = girl (0 = male, 1 = female);
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL:
  adhd_to7 ON adhd_to4 (b1);
  adhd_to4 (V_4)
  adhd_to7 (V_7);
  [adhd_to4] (M_4);
  [adhd_to7] (b0);
MODEL male:
  [adhd_to7];
  adhd_to7 ON adhd_to4;
  adhd_to7;
MODEL female:
  [adhd_to7];
  adhd_to7 ON adhd_to4;
  adhd_to7;
OUTPUT: SAMPSTAT;
Mplus Output
TESTS OF MODEL FIT
Chi-Square Test of Model Fit
    Value                               5.402
    Degrees of Freedom                      2
    P-Value                            0.0671
Chi-Square Contributions From Each Group
    MALE                                3.055
    FEMALE                              2.347
CFI/TLI
    CFI                                 0.976
    TLI                                 0.976
Loglikelihood
    H0 Value                        -2241.780
    H1 Value                        -2239.079
Information Criteria
    Number of Free Parameters               8
    Akaike (AIC)                     4499.560
    Bayesian (BIC)                   4529.555
    Sample-Size Adjusted BIC         4504.181
      (n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
    Estimate                            0.104
    90 Percent C.I.                     0.000  0.214
Mplus Output
MODEL RESULTS
                                                    Two-Tailed
                    Estimate       S.E.  Est./S.E.    P-Value
Group MALE
ADHD_TO7 ON
    ADHD_TO4           0.541      0.080      6.794      0.000
Means
    ADHD_TO4          13.349      0.529     25.236      0.000
Intercepts
    ADHD_TO7           5.770      1.367      4.221      0.000
Variances
    ADHD_TO4          87.857      7.012     12.530      0.000
Residual Variances
    ADHD_TO7          74.619      8.764      8.515      0.000

Group FEMALE
ADHD_TO7 ON
    ADHD_TO4           0.709      0.059     12.105      0.000
Means
    ADHD_TO4          13.349      0.529     25.236      0.000
Intercepts
    ADHD_TO7           1.978      0.911      2.171      0.030
Variances
    ADHD_TO4          87.857      7.012     12.530      0.000
Residual Variances
    ADHD_TO7          52.915      5.757      9.192      0.000
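The four multiple-group runs form a nested sequence, so each relaxation can be evaluated with a one-degree-of-freedom chi-square difference test. The following is a rough sketch using only the fit values reported above and the Python standard library; the tuple layout and helper name `p_value_1df` are illustrative choices, not part of the original analysis.

```python
import math

# (label, chi-square, df, free parameters) for the four multiple-group runs.
models = [
    ("full invariance", 15.743, 5, 5),
    ("free intercept", 13.010, 4, 6),
    ("free intercept and slope", 10.030, 3, 7),
    ("free intercept, slope, residual variance", 5.402, 2, 8),
]

def p_value_1df(chi_sq):
    """Upper-tail probability of a chi-square with 1 df."""
    return math.erfc(math.sqrt(chi_sq / 2.0))

# Two groups, each with 2 means and 3 (co)variances, give 10 sample moments.
n_moments = 2 * (2 + 3)

steps = []
for previous, current in zip(models, models[1:]):
    delta_chi = previous[1] - current[1]
    delta_df = previous[2] - current[2]
    steps.append((current[0], delta_chi, delta_df, p_value_1df(delta_chi)))

for label, delta_chi, delta_df, p in steps:
    print(f"{label}: delta chi-square = {delta_chi:.3f} "
          f"on {delta_df} df, p = {p:.3f}")
```

In this sequence only the last step, freeing the residual variance, gives a difference that reaches the conventional .05 level, which is in line with the visibly different residual variances in the final output.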
Multiple Group & Difference Models
• The same set of models can also be estimated with the Latent Difference Score Model for Repeated Measures, as opposed to the Autoregressive Model
• These models are not presented
• Mplus input and output files for these models are contained in the programming section
Multiple Group Latent Difference Score Model
[Path diagrams: the latent difference score model drawn separately for a Trained group (t) and a Control group (c), each with its own set of parameters]
Alternative MG-SEM fitting strategies
• Can start with a totally invariant “metric” model (the same path model and the same parameter values) and make progressive relaxations for improvement in fit. This is easy.
• Can examine alternatives with a similar “configuration” (the same path model with possibly different parameter values) and add restrictions to make improvements in fit. This is harder.
• In applied research a broad multi-parameter comparison is usually done, i.e., Are all regressions or loadings (one-headed arrows) equal? This is reasonable.
Sequences of MG model goodness-of-fit
Restricting all parameters to be invariant over groups, we necessarily achieve the overall model with the minimum likelihood and the minimum number of parameters:
$L^2 = N \cdot f\{[m - \mu], [C - \Sigma]\}$ and $P = P[\mu + \Sigma]$.
Restricting some parameters to be invariant over groups, we necessarily achieve less than the maximum likelihood with less than the maximum number of parameters:
$L^2 > L^2_{(a)} + L^2_{(b)}$ but $P < P[\mu^{(a)} + \Sigma^{(a)} + \mu^{(b)} + \Sigma^{(b)}]$.
Assuming no parameters are “identical” or “invariant over groups”, we necessarily achieve the maximum likelihood but use the most parameters:
$L^2 = L^2_{(a)} + L^2_{(b)}$ but $P = P[\mu^{(a)} + \Sigma^{(a)} + \mu^{(b)} + \Sigma^{(b)}]$.
Multiple Group SEM benefits
• The multiple group analysis permits a clear and direct test of critical hypotheses, even if these include multiple parameters.
• MG-SEM results can often be approximated by standard techniques, but the approximations are often not very good.
• MG-SEM models are often easier to understand and set up than the more complex approximations, especially for interactions.
• MG-SEM analyses are relatively new but are rapidly being adopted in all substantive areas.
5. Heterogeneity Using a Latent Group
Mixture Model
Latent-Class Models for Group Differences
Begin with the same two-occasion models.
Add information about groups by writing a mixture model (after Muthen & Muthen) for c = 1 to C classes:
Class-mixture model: $Y[2]_n^{(c)} = \pi^{(c)} \{ \beta_0^{(c)} + \beta_1^{(c)} Y[1]_n^{(c)} + e_n^{(c)} \}$
where for each person $\sum_c \pi_n^{(c)} = 1$.
Key questions:
(1) Is there any empirical evidence for multiple latent classes?
(2) Do parameters change over the latent classes?
(3) Which individuals are likely to be in what class?
(4) What other features are associated with the class membership?
Group Differences Related to a Latent Variable
[Figure: two latent groups, labeled Class A and Class B]
Mplus - Two Class LDS Model
TITLE: ADHD – Difference Model - Mixture Model;
DATA: FILE = adhd_uncg_wide.dat;
  LISTWISE = ON;
VARIABLE: NAMES = id
  girl minority ses2yr dadhp7 doddp7
  adhd_to2 adhd_to4 adhd_to5 adhd_to7
  adhd_in2 adhd_in4 adhd_in5 adhd_in7
  adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to4 adhd_to7;
  CLASSES = c(2);
ANALYSIS: TYPE = MIXTURE; ESTIMATOR = ML; STARTS = 20 2;
MODEL:
  %OVERALL%
  adhd_to7 ON adhd_to4@1;
  adhd_to4 (V_1);
  adhd_to7@0;
  [adhd_to4] (M_4);
  [adhd_to7@0];
  !Difference Score;
  delta BY adhd_to7@1;
  [delta] (M_Delta);
  delta (V_delta);
  delta WITH adhd_to4 (Cor);
  %C#1%
  [delta*]; [adhd_to4];
  %C#2%
  [delta*]; [adhd_to4];
OUTPUT: SAMPSTAT;
Mplus Output
TESTS OF MODEL FIT
Loglikelihood
  H0 Value  -2203.236
Information Criteria
  Number of Free Parameters  8
  Akaike (AIC)  4422.472
  Bayesian (BIC)  4452.467
  Sample-Size Adjusted BIC  4427.094
    (n* = (n + 2) / 24)
Note: Difference Score Model
  Loglikelihood  -2246.950
  BIC  4522.647
  AIC  4503.900
  Parameters  5
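The information criteria in this output can be reproduced from the loglikelihood and the number of free parameters. The following Python sketch is illustrative only: the function name is hypothetical, and the sample size N = 314 is an assumption taken from the class counts reported for this model (293 + 21).

```python
import math

def information_criteria(loglik, n_params, n_obs):
    """Return (AIC, BIC, sample-size adjusted BIC) for a fitted model."""
    deviance = -2.0 * loglik
    aic = deviance + 2 * n_params
    bic = deviance + n_params * math.log(n_obs)
    # The adjusted BIC replaces n with n* = (n + 2) / 24.
    abic = deviance + n_params * math.log((n_obs + 2) / 24)
    return aic, bic, abic

# Assumption: N = 314, the sum of the most-likely-class counts (293 + 21).
N_OBS = 314
two_class = information_criteria(-2203.236, 8, N_OBS)
one_class = information_criteria(-2246.950, 5, N_OBS)
print([round(v, 3) for v in two_class])
print([round(v, 3) for v in one_class])
```

Under that assumed N, the computed values agree with the AIC, BIC, and adjusted BIC printed by Mplus for both models to within rounding.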
Mplus Output
FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASSES BASED ON THE ESTIMATED MODEL
Latent Classes
  1  289.72755  0.92270
  2   24.27245  0.07730
CLASSIFICATION QUALITY
  Entropy  0.932
CLASSIFICATION OF INDIVIDUALS BASED ON THEIR MOST LIKELY LATENT CLASS MEMBERSHIP
Class Counts and Proportions
Latent Classes
  1  293  0.93312
  2   21  0.06688
Average Latent Class Probabilities for Most Likely Latent Class Membership (Row) by Latent Class (Column)
         1      2
  1  0.984  0.016
  2  0.063  0.937
Mplus Output
MODEL RESULTS
                 Estimate    S.E.  Est./S.E.  Two-Tailed P-Value
Latent Class 1
 DELTA WITH
   ADHD_TO4       -42.079   4.555     -9.238   0.000
 Means
   ADHD_TO4        12.639   0.544     23.226   0.000
   DELTA           -2.586   0.465     -5.561   0.000
 Variances
   ADHD_TO4        81.847   6.676     12.259   0.000
   DELTA           54.824   4.821     11.372   0.000
Latent Class 2
 DELTA WITH
   ADHD_TO4       -42.079   4.555     -9.238   0.000
 Means
   ADHD_TO4        21.819   2.170     10.055   0.000
   DELTA           14.260   2.074      6.876   0.000
 Variances
   ADHD_TO4        81.847   6.676     12.259   0.000
   DELTA           54.824   4.821     11.372   0.000
Categorical Latent Variables
 Means
   C#1              2.480   0.289      8.586   0.000
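The C#1 mean is on a logit scale relative to the last class. As a rough check, it can be converted back to the estimated class proportions. This is a small sketch; the helper name is hypothetical, and it assumes the usual multinomial-logit parameterization with the last class as reference.

```python
import math

def class_proportions(logit_means):
    """Convert class logits (last class is the reference, logit 0)
    into class proportions via the multinomial logit."""
    logits = list(logit_means) + [0.0]   # append the reference class
    weights = [math.exp(v) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

props = class_proportions([2.480])       # C#1 mean from the model results
print([round(p, 4) for p in props])
```

The result is close to the model-based proportions in the class count table (0.92270 and 0.07730).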
Mplus - Two Class LDS Model (Mean/Var)
TITLE: ADHD – Difference Model - Mixture Model;
DATA: FILE = adhd_uncg_wide.dat;
  LISTWISE = ON;
VARIABLE: NAMES = id
  girl minority ses2yr dadhp7 doddp7
  adhd_to2 adhd_to4 adhd_to5 adhd_to7
  adhd_in2 adhd_in4 adhd_in5 adhd_in7
  adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to4 adhd_to7;
  CLASSES = c(2);
ANALYSIS: TYPE = MIXTURE; ESTIMATOR = ML; STARTS = 20 2;
MODEL:
  %OVERALL%
  adhd_to7 ON adhd_to4@1;
  adhd_to4 (V_1);
  adhd_to7@0;
  [adhd_to4] (M_4);
  [adhd_to7@0];
  !Difference Score;
  delta BY adhd_to7@1;
  [delta] (M_Delta);
  delta (V_delta);
  delta WITH adhd_to4 (Cor);
  %C#1%
  [delta*]; [adhd_to4];
  delta adhd_to4;
  delta WITH adhd_to4;
  %C#2%
  [delta*]; [adhd_to4];
  delta adhd_to4;
  delta WITH adhd_to4;
OUTPUT: SAMPSTAT;
Mplus Output
TESTS OF MODEL FIT
Loglikelihood
  H0 Value  -2152.832
Information Criteria
  Number of Free Parameters  11
  Akaike (AIC)  4327.664
  Bayesian (BIC)  4368.908
  Sample-Size Adjusted BIC  4334.019
    (n* = (n + 2) / 24)
Note: 2-Class Means Model
  Loglikelihood  -2203.236
  BIC  4452.467
  AIC  4422.472
  Parameters  8
Note: Difference Score Model
  Loglikelihood  -2246.950
  BIC  4522.647
  AIC  4503.900
  Parameters  5
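One way to read the three sets of fit values side by side is to rank the models by BIC. The sketch below only tabulates the values reported on these slides; the labels are descriptive names chosen here, not Mplus terms.

```python
# (label, loglikelihood, free parameters, BIC) as reported on the slides
models = [
    ("Difference score (1 class)", -2246.950, 5, 4522.647),
    ("2-class, means differ", -2203.236, 8, 4452.467),
    ("2-class, means and (co)variances differ", -2152.832, 11, 4368.908),
]

best_bic = min(m[3] for m in models)
for label, loglik, n_par, bic in models:
    # dBIC is the distance from the lowest (preferred) BIC.
    print(f"{label:42s} params={n_par:2d}  BIC={bic:9.3f}  dBIC={bic - best_bic:8.3f}")

best = min(models, key=lambda m: m[3])[0]
print("Lowest BIC:", best)
```

A lower BIC is only a descriptive guide here; it does not settle the question of how many classes there are.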
Mplus Output
FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASSES BASED ON THE ESTIMATED MODEL
Latent Classes
  1  111.26242  0.35434
  2  202.73758  0.64566
CLASSIFICATION QUALITY
  Entropy  0.659
CLASSIFICATION OF INDIVIDUALS BASED ON THEIR MOST LIKELY LATENT CLASS MEMBERSHIP
Class Counts and Proportions
Latent Classes
  1   99  0.31529
  2  215  0.68471
Average Latent Class Probabilities for Most Likely Latent Class Membership (Row) by Latent Class (Column)
         1      2
  1  0.919  0.081
  2  0.095  0.905
Mplus Output
MODEL RESULTS
Estimate S.E. Est./S.E. Two-Tailed P-Value
Latent Class 1
DELTA WITH
ADHD_TO4 -75.758 16.011 -4.731 0.000
Means
ADHD_TO4 21.364 1.301 16.425 0.000
DELTA -0.552 1.297 -0.426 0.670
Variances
ADHD_TO4 103.439 15.231 6.791 0.000
DELTA 171.130 26.855 6.372 0.000
Latent Class 2
DELTA WITH
ADHD_TO4 -11.498 2.330 -4.936 0.000
Means
ADHD_TO4 8.950 0.487 18.370 0.000
DELTA -1.685 0.394 -4.275 0.000
Variances
ADHD_TO4 24.704 3.511 7.036 0.000
DELTA 21.889 2.869 7.629 0.000
Categorical Latent Variables
Means
C#1 -0.600 0.203 -2.950 0.003
Mplus Model with Predictor of Latent Class
TITLE: ADHD – Difference Model - Mixture Model with Predictor;
DATA: FILE = adhd_uncg_wide.dat;
  LISTWISE = ON;
VARIABLE: NAMES = id
  girl minority ses2yr dadhp7 doddp7
  adhd_to2 adhd_to4 adhd_to5 adhd_to7
  adhd_in2 adhd_in4 adhd_in5 adhd_in7
  adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to4 adhd_to7 girl;
  CLASSES = c(2);
ANALYSIS: TYPE = MIXTURE;
  ESTIMATOR = ML;
MODEL:
  %OVERALL%
  adhd_to7 ON adhd_to4@1;
  adhd_to4 (V_1);
  adhd_to7@0;
  [adhd_to4] (M_4);
  [adhd_to7@0];
  !Difference Score;
  delta BY adhd_to7@1;
  [delta] (M_Delta);
  delta (V_delta);
  delta WITH adhd_to4 (Cor);
  c#1 ON girl;
  %C#1%
  [adhd_to4]; [delta*];
  adhd_to4 delta;
  delta WITH adhd_to4;
  %C#2%
  [adhd_to4]; [delta*];
  adhd_to4 delta;
  delta WITH adhd_to4;
Mplus Output
FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASSES
BASED ON THE ESTIMATED MODEL
Latent
Classes
1 198.72610 0.63289
2 115.27390 0.36711
CLASSIFICATION QUALITY
Entropy 0.662
CLASSIFICATION OF INDIVIDUALS BASED ON THEIR MOST LIKELY LATENT CLASS MEMBERSHIP
Class Counts and Proportions
Latent
Classes
1 209 0.66561
2 105 0.33439
Average Latent Class Probabilities for Most Likely Latent Class Membership (Row) by Latent Class (Column)
1 2
1 0.908 0.092
2 0.086 0.914
Mplus Output
                 Estimate    S.E.  Est./S.E.  Two-Tailed P-Value
Latent Class 1
 DELTA WITH
   ADHD_TO4       -11.398   2.258     -5.049   0.000
 Means
   ADHD_TO4         8.834   0.473     18.669   0.000
   DELTA           -1.708   0.393     -4.349   0.000
 Variances
   ADHD_TO4        24.019   3.270      7.345   0.000
   DELTA           21.486   2.777      7.736   0.000
Latent Class 2
 DELTA WITH
   ADHD_TO4       -73.917  15.326     -4.823   0.000
 Means
   ADHD_TO4        21.132   1.252     16.882   0.000
   DELTA           -0.553   1.253     -0.441   0.659
 Variances
   ADHD_TO4       102.188  14.600      6.999   0.000
   DELTA          166.588  25.460      6.543   0.000
Categorical Latent Variables
 C#1 ON
   GIRL             0.884   0.289      3.058   0.002
 Intercepts
   C#1              0.094   0.255      0.369   0.712
LOGISTIC REGRESSION ODDS RATIO RESULTS
Categorical Latent Variables
 C#1 ON
   GIRL             2.420
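The odds ratio is the exponentiated logistic coefficient, and the same coefficients give model-implied probabilities of Class 1 membership. The sketch below assumes that girl is coded 1 for girls and 0 for boys; that coding is not stated on the slide.

```python
import math

def inv_logit(x):
    """Logistic function: maps a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

intercept, slope_girl = 0.094, 0.884     # C#1 intercept and C#1 ON GIRL

odds_ratio = math.exp(slope_girl)
# Assumption: girl = 0 for boys and girl = 1 for girls.
p_class1_boys = inv_logit(intercept)
p_class1_girls = inv_logit(intercept + slope_girl)

print(round(odds_ratio, 2), round(p_class1_boys, 2), round(p_class1_girls, 2))
```

The odds ratio computed from the rounded coefficient is about 2.42, matching the reported 2.420 up to rounding of the printed estimate.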
Current issues for Latent-Class models
Several key questions are very hard to answer:
1. Is there empirical evidence for multiple latent classes? But the models compared by the standard $L^2$ tests are not necessarily nested!
2. Do the parameters change over the latent classes? But change-in-fit $L^2$ tests are not available!
3. Which individuals are likely to be in what class? But this can change while the fit stays the same!
4. What other features are associated with the class membership? But this can be an endless search!
6. Summary & Discussion
Using Regression in SEM
• In theory, regression model concepts are an essential part of all structural equation models, and vice versa.
• The simultaneous analysis of individual and group differences is a major feature of data analysis in psychological and behavioral research.
• In practice, there remain a variety of problems with using new SEM programs: small groups, incomplete data, and latent classes are still being explored.
• Group comparisons do not allow “causal inference” without good experimental design.
Resolving Lord’s paradox?
• Lord (1967) raised the paradox that two correct but different ways to analyze the same data (i.e., ANCOVA vs. Repeated Measures ANOVA) could yield different answers in terms of the significance of group differences.
• Lord himself never resolved this paradox, but other researchers (e.g., Wainer, Laird, etc.) showed how the different model assumptions led to the different conclusions, so everyone was correct.
• These results showed there was not a single correct model for the analysis of group differences in change.
• These results also showed that it may be most reasonable to consider a dual change model when the type of change is not clear in advance.
Benefits of the SEM approach
• Multiple groups are a mainstay of psychological research
• Organizes the mix of theory and methods
• Exposes many weaknesses of quantitative methods
• Emphasis on invariance and replication
• Consistent with models of other sciences
• Benefits increasing with new computations
Limitations of the MG-SEM approach
• Multiple group SEM requires relatively large sample sizes in each sub-group for powerful comparisons
• With smaller sizes, or complex models, the SEM computer programs do not always work well.
• The multiple group comparisons do not allow causal inference without good experimental design
• The effective use of the new programs for incomplete data and latent classes is still being explored.
References on Group/Mixture Modeling
Bonate, P. (2000). Analysis of Pretest-Posttest Designs. Basic, NY.
Holland, P.W. & Rubin, D.B. (1983). Re-considering Lord’s Paradox? In H. Wainer & S. Messick (Eds.), Principals of modern psychological measurement (pp. 3–25). Hillsdale, NJ: Erlbaum.
McArdle, J.J. (2006). Dynamic Structural Equation Modeling in Longitudinal Experimental Studies. In Kees van Montfort, Han Oud & Alberto Satorra (Editors), Longitudinal models in the behavioural and related sciences. Mahwah, NJ: Erlbaum.
Muthen, B. & Muthen, L. K. (2000). Integrating person-centered and variable-centered analysis: Growth mixture modeling with latent trajectory classes. Alcoholism:Clinical and Experimental Research, 24, 882-891.
Muthen, B. O. (2001). Latent variable mixture modeling. In G. A. Marcoulides & R. E. Schumacker (eds.), New Developments and Techniques in Structural Equation Modeling (pp. 1-33). Mahwah, NJ: Lawrence Erlbaum Associates.
Sörbom, D. (1978). An alternative to the methodology for analysis of covariance. Psychometrika, 43, 381-396.
Wainer, H. (1991). Adjusting for differential base rates: Lord’s paradox again. Psychological Bulletin, 109 (1), 147-151.
Session D: Alternate Models for Change over
Multiple Waves
Kevin Grimm
University of California, Davis
June 9, 2008
Overview
1. Multi-Wave longitudinal data
2. Auto-Regressive/Time-Series models
3. Latent Growth Curve Models
4. Summary & Discussion
1. Multi-Wave Longitudinal data
Special Features of Multi-Wave Data
Multi-wave longitudinal data share a few common features:
1. Some of the same entities (people, classes, colleges, countries) are repeatedly observed.
2. The procedures of measurement and scaling of observations are known, and sometimes the measures are repeated.
3. The timing or ordered basis of the measurement of the observations is known.
Example Longitudinal ADHD Data
• N=272 children aged 2 to 7 (Complete Data)
• Assessments at ages 2, 4, 5, & 7
• ADHD Total Score
Measure of ADHD
• ADHD Rating Scale-IV (ADHD RS-IV; DuPaul et al., 1998)
  – Commonly used in clinical research, including research with very young children (Shelton et al., 2000)
  – 18 items
  – 4-point scale
    • 0 (Never) to 3 (Very Often)
  – Nine Inattention symptoms & nine Hyperactivity symptoms
  – IN score (ranging from 0-27)
  – HI score (ranging from 0-27)
  – Total AD/HD score (ranging from 0-54)
The CORR Procedure

Simple Statistics
Variable     N       Mean    Std Dev    Sum   Minimum    Maximum
adhd_to2   394   14.57559    7.84043   5743         0   41.00000
adhd_to4   376   13.78305    9.29271   5182         0   51.00000
adhd_to5   324   12.45603    9.26226   4036         0   54.00000
adhd_to7   328   12.11428   10.13014   3973         0   54.00000

Pearson Correlation Coefficients
Prob > |r| under H0: Rho=0
Number of Observations
            adhd_to2   adhd_to4   adhd_to5   adhd_to7
adhd_to2     1.00000    0.56980    0.48636    0.49483
                         <.0001     <.0001     <.0001
                 394        348        304        303
adhd_to4     0.56980    1.00000    0.68772    0.60359
              <.0001                <.0001     <.0001
                 348        376        315        314
adhd_to5     0.48636    0.68772    1.00000    0.75648
              <.0001     <.0001                <.0001
                 304        315        324        288
adhd_to7     0.49483    0.60359    0.75648    1.00000
              <.0001     <.0001     <.0001
                 303        314        288        328
Summary statistics for the ADHD Total Score
[Figure: Bivariate Scatterplots]
[Figure: Individual trajectories on the ADHD Total Score (N=50)]
Modeling Multi-Wave Data
• There are many classical models for the analysis of multivariate repeated measures
• These structural equation models combine factor analysis, time-series, and MANOVA, and are still widely available
• Subtle differences in the choice of models define the nature of developmental process and change, i.e., the dynamic systems
2: Auto-Regressive Time-Series Models
A “Markov Simplex” Time Series
• We are interested in individual deviations around the mean, so we detrend the data Y[t]:
$Y^*[t]_n = Y[t]_n - M_{y[t]}$
• Next we can fit a “Markov-Chain” time-series model
$Y^*[t]_n = \beta[t]\, Y^*[t-1]_n + e[t]_n$
• In more general terms
$Y^*[t]_n = \beta[t]\, Y^*[t-1]_n + \beta[t-1]\, Y^*[t-2]_n + e[t]_n$
A “First-Order” Markov Simplex model with time-based effects
[Path diagram: Y[1], Y[2], Y[3], Y[4] with residuals e[2], e[3], e[4] and their variances]
A “Second-Order” Markov Simplex model with time-based effects
[Path diagram: Y[1], Y[2], Y[3], Y[4] with residuals e[2], e[3], e[4] and their variances]
A “Fully Recursive” Markov chain model with time-based effects
[Path diagram: Y[1], Y[2], Y[4], Y[6] with residuals e[2], e[4], e[6] and their variances]
“Markov Simplex” Hypotheses
• Some questions are about the auto-regression
$\beta[t] = 0$?  $\beta[t-1] = 0$?  $\beta[t] = \beta[t-1] = \beta[t-2]$?
• These questions can be formed about common factors
$f[t]_n = \beta[t]\, f[t-1]_n + u[t]_n$
• In this context, only the covariances are meaningful, and a series of models is fit to the over-time covariances to examine a series of hypotheses
• These can be fitted using standard regression or using MLE-SEM programs
Markov Simplex Model & ADHD Data
[Path diagram: Y[2], Y[4], Y[5], Y[7] in a first-order chain with three autoregressive paths and residuals e[4], e[5], e[7]]
ADHD Markov Model (Mplus)
TITLE: ADHD – Markov Model;
DATA: FILE = adhd_uncg_wide.dat;
  LISTWISE = ON;
VARIABLE: NAMES = id
  girl minority ses2yr dadhp7 doddp7
  adhd_to2 adhd_to4 adhd_to5 adhd_to7
  adhd_in2 adhd_in4 adhd_in5 adhd_in7
  adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to2 adhd_to4 adhd_to5 adhd_to7;
ANALYSIS: MODEL = NOMEANSTRUCTURE;
  INFORMATION = EXPECTED;
MODEL:
  adhd_to7 ON adhd_to5;
  adhd_to5 ON adhd_to4;
  adhd_to4 ON adhd_to2;
  adhd_to2 adhd_to4 adhd_to5 adhd_to7;
OUTPUT: SAMPSTAT;
Mplus Output
TESTS OF MODEL FIT
Chi-Square Test of Model Fit
  Value  23.519
  Degrees of Freedom  3
  P-Value  0.0000
CFI/TLI
  CFI  0.961
  TLI  0.921
Loglikelihood
  H0 Value  -3688.229
  H1 Value  -3676.469
Information Criteria
  Number of Free Parameters  7
  Akaike (AIC)  7390.457
  Bayesian (BIC)  7415.698
  Sample-Size Adjusted BIC  7393.503
    (n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
  Estimate  0.159
  90 Percent C.I.  0.103  0.221
  Probability RMSEA <= .05  0.001
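Two of the fit values in this output can be recovered by hand: the chi-square is twice the difference between the H1 and H0 loglikelihoods, and the RMSEA point estimate follows from the chi-square, its degrees of freedom, and the sample size. This is a sketch with hypothetical function names. It uses N = 272 (the complete-data sample) and the form of RMSEA that divides by N; some texts divide by N - 1, which gives the same three-decimal value here.

```python
import math

def chi_square_from_loglik(loglik_h0, loglik_h1):
    """Likelihood-ratio chi-square of the fitted (H0) model against
    the unrestricted (H1) model."""
    return 2.0 * (loglik_h1 - loglik_h0)

def rmsea(chi_square, df, n_obs):
    """RMSEA point estimate (assumed form: division by N, not N - 1)."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * n_obs))

chi2 = chi_square_from_loglik(-3688.229, -3676.469)
fit = rmsea(chi2, df=3, n_obs=272)
print(round(chi2, 2), round(fit, 3))
```

The same two functions can be applied to the loglikelihoods of the later models in this session to check their reported chi-square and RMSEA values.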
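Markov Simplex Model for ADHD Data
Constraint on Autoregressive Parameter
[Path diagram: yearly chain Y[2], Y[3], Y[4], Y[5], Y[6], Y[7], where Y[3] and Y[6] are unobserved placeholders; the autoregressive paths share a common β and the residuals e[3] to e[7] share a common variance]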
TITLE: ADHD – Markov Model;
DATA: FILE = adhd_uncg_wide.dat;
  LISTWISE = ON;
VARIABLE: NAMES = id
  girl minority ses2yr dadhp7 doddp7
  adhd_to2 adhd_to4 adhd_to5 adhd_to7
  adhd_in2 adhd_in4 adhd_in5 adhd_in7
  adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to2 adhd_to4 adhd_to5 adhd_to7;
ANALYSIS: MODEL = NOMEANSTRUCTURE;
  INFORMATION = EXPECTED;
MODEL:
  !Creating Phantom Variables for Placeholders;
  adhd_to6 BY adhd_to2@0;
  adhd_to3 BY adhd_to2@0;
  !Autoregressive Parameters;
  adhd_to7 ON adhd_to6 (beta);
  adhd_to6 ON adhd_to5 (beta);
  adhd_to5 ON adhd_to4 (beta);
  adhd_to4 ON adhd_to3 (beta);
  adhd_to3 ON adhd_to2 (beta);
  !Variances;
  adhd_to2;
  adhd_to3 (V_e);
  adhd_to4 (V_e);
  adhd_to5 (V_e);
  adhd_to6 (V_e);
  adhd_to7 (V_e);
OUTPUT: SAMPSTAT;
ADHD Markov Model (Mplus)
TESTS OF MODEL FIT
Chi-Square Test of Model Fit
  Value  69.459
  Degrees of Freedom  7
  P-Value  0.0000
CFI/TLI
  CFI  0.880
  TLI  0.897
Loglikelihood
  H0 Value  -3711.198
  H1 Value  -3676.469
Information Criteria
  Number of Free Parameters  3
  Akaike (AIC)  7428.397
  Bayesian (BIC)  7439.214
  Sample-Size Adjusted BIC  7429.702
    (n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
  Estimate  0.181
  90 Percent C.I.  0.144  0.221
  Probability RMSEA <= .05
Mplus Output
MODEL RESULTS
                 Estimate    S.E.  Est./S.E.  Two-Tailed P-Value
ADHD_TO6 BY
   ADHD_TO2         0.000   0.000    999.000   999.000
ADHD_TO3 BY
   ADHD_TO2         0.000   0.000    999.000   999.000
ADHD_TO6 ON
   ADHD_TO5         0.794   0.020     39.563   0.000
ADHD_TO3 ON
   ADHD_TO2         0.794   0.020     39.563   0.000
ADHD_TO7 ON
   ADHD_TO6         0.794   0.020     39.563   0.000
ADHD_TO4 ON
   ADHD_TO3         0.794   0.020     39.563   0.000
ADHD_TO5 ON
   ADHD_TO4         0.794   0.020     39.563   0.000
Residual Variances
   ADHD_TO2        63.521   5.447     11.662   0.000
   ADHD_TO4        36.688   1.878     19.533   0.000
   ADHD_TO5        36.688   1.878     19.533   0.000
   ADHD_TO7        36.688   1.878     19.533   0.000
   ADHD_TO6        36.688   1.878     19.533   0.000
   ADHD_TO3        36.688   1.878     19.533   0.000
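With a common yearly coefficient and a common residual variance, the three estimates above imply every variance and covariance among the four observed ages. The following sketch shows how they follow from the first-order chain; the function name is hypothetical, and the rule cov(Y[a], Y[b]) = β^(b-a) · var(Y[a]) is the standard result for a first-order autoregression.

```python
def simplex_implied_cov(var_first, beta, resid_var, ages):
    """Implied covariances for a first-order chain with one step per year,
    a common coefficient, and a common residual variance."""
    first, last = ages[0], ages[-1]
    variance = {first: var_first}
    for age in range(first + 1, last + 1):
        # var(Y[t]) = beta^2 * var(Y[t-1]) + residual variance
        variance[age] = beta ** 2 * variance[age - 1] + resid_var
    cov = {}
    for i, a in enumerate(ages):
        for b in ages[i:]:
            # A lag of (b - a) years passes through (b - a) paths.
            cov[(a, b)] = beta ** (b - a) * variance[a]
    return cov

cov = simplex_implied_cov(63.521, 0.794, 36.688, ages=[2, 4, 5, 7])
for (a, b), value in sorted(cov.items()):
    print(f"cov(Y[{a}], Y[{b}]) = {value:8.3f}")
```

Comparing these implied values with the sample covariances printed by SAMPSTAT gives a rough picture of where the constrained model misfits.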
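Mplus Output
Quasi-Simplex Model for ADHD Data
Constraints on Autoregressive Parameter
[Path diagram: observed Y[2], Y[4], Y[5], Y[7], each with error e and a common error variance, measuring latent scores f[2], f[4], f[5], f[7]; the latent chain f[2] to f[7] runs yearly through placeholders f[3] and f[6] with a common autoregressive coefficient and a common latent residual variance]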
MODEL:
  !Creating Phantom Variables for Placeholders;
  l_adhd6 BY adhd_to2@0;
  l_adhd3 BY adhd_to2@0;
  !Creating "True Scores";
  l_adhd2 BY adhd_to2@1;
  l_adhd4 BY adhd_to4@1;
  l_adhd5 BY adhd_to5@1;
  l_adhd7 BY adhd_to7@1;
  !Autoregressive Parameters for "True Scores";
  l_adhd7 ON l_adhd6 (beta);
  l_adhd6 ON l_adhd5 (beta);
  l_adhd5 ON l_adhd4 (beta);
  l_adhd4 ON l_adhd3 (beta);
  l_adhd3 ON l_adhd2 (beta);
  !Residual Variances;
  adhd_to2 (V_e)
  adhd_to4 (V_e)
  adhd_to5 (V_e)
  adhd_to7 (V_e);
  l_adhd2
  l_adhd3 (V_f)
  l_adhd4 (V_f)
  l_adhd5 (V_f)
  l_adhd6 (V_f)
  l_adhd7 (V_f);
OUTPUT: SAMPSTAT;
Quasi-Simplex Model for ADHD Data (Mplus)
Mplus Output
TESTS OF MODEL FIT
Chi-Square Test of Model Fit
  Value  26.435
  Degrees of Freedom  6
  P-Value  0.0002
CFI/TLI
  CFI  0.961
  TLI  0.961
Loglikelihood
  H0 Value  -3689.687
  H1 Value  -3676.469
Information Criteria
  Number of Free Parameters  4
  Akaike (AIC)  7387.373
  Bayesian (BIC)  7401.796
  Sample-Size Adjusted BIC  7389.114
    (n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
  Estimate  0.112
  90 Percent C.I.  0.071  0.157
  Probability RMSEA <= .05  0.009
Mplus Output
MODEL RESULTS
                 Estimate    S.E.  Est./S.E.  Two-Tailed P-Value
L_ADHD7 ON
   L_ADHD6          0.961   0.026     36.933   0.000
L_ADHD6 ON
   L_ADHD5          0.961   0.026     36.933   0.000
L_ADHD5 ON
   L_ADHD4          0.961   0.026     36.933   0.000
L_ADHD4 ON
   L_ADHD3          0.961   0.026     36.933   0.000
L_ADHD3 ON
   L_ADHD2          0.961   0.026     36.933   0.000
Variances
   L_ADHD2         44.284   5.596      7.913   0.000
Residual Variances
   ADHD_TO2        18.945   2.638      7.181   0.000
   ADHD_TO4        18.945   2.638      7.181   0.000
   ADHD_TO5        18.945   2.638      7.181   0.000
   ADHD_TO7        18.945   2.638      7.181   0.000
   L_ADHD6         11.822   2.672      4.425   0.000
   L_ADHD3         11.822   2.672      4.425   0.000
   L_ADHD4         11.822   2.672      4.425   0.000
   L_ADHD5         11.822   2.672      4.425   0.000
   L_ADHD7         11.822   2.672      4.425   0.000
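Alternative Classic Time-Series Model
AR(1) + MA(1) = ARIMA for ADHD Data
[Path diagram: yearly chain Y[2] to Y[7] with unobserved placeholders y[3] and y[6]; residuals e[3] to e[7] have a common variance, and each residual also predicts the next occasion (moving average)]
ARMA(1,1) (Mplus)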
MODEL:
  !Creating Phantom Variables for Placeholders;
  adhd_to3 BY adhd_to2@0;
  adhd_to6 BY adhd_to2@0;
  !Autoregressive Parameters;
  adhd_to7 ON adhd_to6 (beta);
  adhd_to6 ON adhd_to5 (beta);
  adhd_to5 ON adhd_to4 (beta);
  adhd_to4 ON adhd_to3 (beta);
  adhd_to3 ON adhd_to2 (beta);
  !Creating Latent Variables;
  l_adhd3 BY adhd_to3@1;
  l_adhd4 BY adhd_to4@1;
  l_adhd5 BY adhd_to5@1;
  l_adhd6 BY adhd_to6@1;
  l_adhd7 BY adhd_to7@1;
  !Moving Average;
  adhd_to7 ON l_adhd6 (Ma1);
  adhd_to6 ON l_adhd5 (Ma1);
  adhd_to5 ON l_adhd4 (Ma1);
  !adhd_to4 ON l_adhd3 (Ma1);
  !Variances;
  adhd_to2 adhd_to4@0 adhd_to5@0 adhd_to7@0;
  l_adhd3*10 (V_f)
  l_adhd4*10 (V_f)
  l_adhd5*10 (V_f)
  l_adhd6*10 (V_f)
  l_adhd7*10 (V_f);
  !Unwanted Correlations;
  l_adhd3 WITH l_adhd4-l_adhd7@0;
  l_adhd4 WITH l_adhd5-l_adhd7@0;
  l_adhd5 WITH l_adhd6-l_adhd7@0;
  l_adhd6 WITH l_adhd7@0;
  adhd_to2 WITH l_adhd3-l_adhd7@0;
  !Additional Variances;
  adhd_to3@0;
  adhd_to6@0;
OUTPUT: SAMPSTAT;
Mplus Output
TESTS OF MODEL FIT
Chi-Square Test of Model Fit
  Value  37.722
  Degrees of Freedom  6
  P-Value  0.0000
CFI/TLI
  CFI  0.939
  TLI  0.939
Loglikelihood
  H0 Value  -3695.330
  H1 Value  -3676.469
Information Criteria
  Number of Free Parameters  4
  Akaike (AIC)  7398.660
  Bayesian (BIC)  7413.084
  Sample-Size Adjusted BIC  7400.401
    (n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
  Estimate  0.139
  90 Percent C.I.  0.099  0.184
  Probability RMSEA <= .05  0.000
MODEL RESULTS
                 Estimate    S.E.  Est./S.E.  Two-Tailed P-Value
ADHD_TO6 ON
   L_ADHD5         -0.368   0.063     -5.859   0.000
ADHD_TO6 ON
   ADHD_TO5         0.884   0.018     49.527   0.000
ADHD_TO3 ON
   ADHD_TO2         0.884   0.018     49.527   0.000
ADHD_TO7 ON
   ADHD_TO6         0.884   0.018     49.527   0.000
   L_ADHD6         -0.368   0.063     -5.859   0.000
ADHD_TO5 ON
   L_ADHD4         -0.368   0.063     -5.859   0.000
ADHD_TO4 ON
   ADHD_TO3         0.884   0.018     49.527   0.000
ADHD_TO5 ON
   ADHD_TO4         0.884   0.018     49.527   0.000
Variances
   L_ADHD3         36.502   1.844     19.798   0.000
   L_ADHD4         36.502   1.844     19.798   0.000
   L_ADHD5         36.502   1.844     19.798   0.000
   L_ADHD6         36.502   1.844     19.798   0.000
   L_ADHD7         36.502   1.844     19.798   0.000
Residual Variances
   ADHD_TO2        63.521   5.447     11.662   0.000
Mplus Output
Concepts From Time Series Models
• There are many ways to use simplex concepts to model longitudinal covariances or correlations.
• These models are typically fit to much longer time series, with T>100, so this example with T=4 (or T=6) is intended for illustration only.
• The results for the autoregressive (Markov simplex) models show some time-series prediction, but a simple prediction series is not likely to account for the data.
• Group differences in simplex model forms or parameters are useful, but may be awkward to incorporate with continuous covariates.
• Most critically, these models suggest that tests of covariances can be useful in examining different hypotheses about change from the pattern of individual differences.
3: Latent Growth Curve
SEM Basis of Latent Curve Models
1. We start with a “first level” model of random effects
$Y[t]_n = i_n + A[t]\, s_n + e[t]_n$
where the $i_n$ are latent scores representing an individual’s initial level or intercept; the A[t] are group “basis” parameters representing some form of timing; the $s_n$ are latent slopes for the individual change over time; and the e[t] are errors of measurement.
2. There is also a second level of “fixed effects”
$i_n = \mu_0 + d_{in}$
$s_n = \mu_1 + d_{sn}$
so the level and slope scores have means ($\mu_j$) and residuals ($d_j$), and the residuals have variance components ($\sigma_j^2$).
3. Fitted directly to overcome problems of “gain scores”
Summary of Latent Curve expectations
• This model leads to restrictive hypotheses about both the means and the covariances
• The structure of the means over time is
$E\{Y[t], 1\} = E\{Y[t]\} = \mu_t = \mu_i + \mu_s A[t]$
where $\mu_i$ and $\mu_s$ are the means of the level and slope scores
• The structure of the variance over time is
$E\{Y[t]-\mu_t, Y[t]-\mu_t\} = \sigma_t^2 = \sigma_i^2 + \{\sigma_s^2 A[t]^2 + 2\sigma_{is} A[t]\} + \sigma_e^2$
• The structure of the covariance over time is
$E\{Y[t]-\mu_t, Y[t+k]-\mu_{t+k}\} = \sigma_{t,t+k} = \sigma_i^2 + A[t]\,\sigma_s^2\,A[t+k] + \sigma_{is}\,(A[t] + A[t+k])$
• The basis coefficients A[t] are not fixed, but can be, and this permits the formation of many alternative and restrictive hypotheses (to be discussed).
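The mean and covariance expectations can be evaluated for any set of parameter values. As an illustrative sketch (the function name is hypothetical), the code below plugs in the linear growth estimates reported in the Mplus output later in this session, with basis A[t] = 0, .4, .6, 1. The covariance term uses σ_is(A[t] + A[t+k]), which reduces to 2σ_is A[t] on the diagonal.

```python
def latent_curve_moments(basis, mu_i, mu_s, var_i, var_s, cov_is, var_e):
    """Model-implied means and covariance matrix of a latent curve model."""
    means = [mu_i + mu_s * a for a in basis]
    n = len(basis)
    cov = [[0.0] * n for _ in range(n)]
    for r, a in enumerate(basis):
        for c, b in enumerate(basis):
            cov[r][c] = var_i + a * b * var_s + (a + b) * cov_is
            if r == c:
                cov[r][c] += var_e   # error variance only on the diagonal
    return means, cov

means, cov = latent_curve_moments(
    basis=[0.0, 0.4, 0.6, 1.0],
    mu_i=14.016, mu_s=-2.352,
    var_i=39.249, var_s=31.225, cov_is=3.467, var_e=28.262,
)
print([round(m, 3) for m in means])
for row in cov:
    print([round(v, 3) for v in row])
```

These implied moments can be set against the sample means and covariances from SAMPSTAT to see where a given growth model fits well or poorly.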
Growth Curve Analysis
– Latent Growth Curve (LGC)
– Hierarchical Linear Modeling (HLM)
  • Not necessarily linear, though
– Mixed-Effects Models for Longitudinal Data
– Multi-level models of change (MLMC)
• Class of methods used to study change
  – explicit testing of hypotheses regarding the structure of longitudinal data
Latent Growth Curve Analysis
• Origins:
  • Rao (1958), Tucker (1958), Meredith & Tisak (1984, 1990)
• Extensions:
  • Browne & DuToit (1991), McArdle (1988, 2001), McArdle & Epstein (1987)
• Overviews:
  • McArdle & Nesselroade (2003), Singer & Willett (2003)
Key Questions from Latent Growth Curve Analysis
• How does the construct change over time/age?
• Are there interindividual differences in the level and/or rate of change over time/age?
• How is the level of the construct related to the rate of change?
• Extensions
  – What interindividual characteristics relate to interindividual differences in the level and/or rate of change?
  – How are changes in one variable associated with changes in another variable?
  – Are there time-dependent relationships in the development of two or more variables?
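[Path diagram: latent growth curve model for Y1 to Y4 with level g0 (loadings fixed at 1) and slope g1 (loadings A[t]), their means, variances, and covariance, and a common residual variance]
$Y[t]_n = 1 \cdot g_{0n} + A[t] \cdot g_{1n} + e[t]_n$
$g_{0n} = \mu_0 + d_{0n}$
$g_{1n} = \mu_1 + d_{1n}$
[Figure: score plotted against time for three individuals (n=1, n=2, n=3), marking the intercept and slope]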
Programs for Growth Curve Analysis
• Mixed-Effects/Multilevel Programs
  – SAS PROC MIXED/NLMIXED
  – SPSS MIXED
  – R (lme/nlme)
  – Stata
  – MLwiN
  – HLM
• Structural Modeling Programs
  – Mplus
  – AMOS
  – LISREL
  – Mx
Longitudinal Plot of ADHD Total Score(N = 50)
Basic Growth Curve Models
• Level Only
$Y[t]_n = 1 \cdot g_{0n} + e[t]_n$
$g_{0n} = \mu_0 + d_{0n}$
• Linear Growth (A[t] = 0, 2, 3, 5 or A[t] = 0, .4, .6, 1)
$Y[t]_n = 1 \cdot g_{0n} + A[t] \cdot g_{1n} + e[t]_n$
$g_{0n} = \mu_0 + d_{0n}$
$g_{1n} = \mu_1 + d_{1n}$
• Latent Growth (same equations, with some of the basis coefficients A[t] estimated)
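Level Only Model
$Y[t]_n = 1 \cdot g_{0n} + e[t]_n$
$g_{0n} = \mu_0 + d_{0n}$
[Path diagram: level g0 with mean μ0 and variance σ0², loadings fixed at 1 on Y[2], Y[4], Y[5], Y[7]; residuals e[2], e[4], e[5], e[7] with a common variance σe²]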
TITLE: ADHD – Level Only Model;
DATA: FILE = adhd_uncg_wide.dat;
VARIABLE: NAMES = id
  girl minority ses2yr dadhp7 doddp7
  adhd_to2 adhd_to4 adhd_to5 adhd_to7
  adhd_in2 adhd_in4 adhd_in5 adhd_in7
  adhd_hy2 adhd_hy4 adhd_hy5 adhd_hy7;
  MISSING = .;
  USEVAR = adhd_to2 adhd_to4 adhd_to5 adhd_to7;
ANALYSIS: TYPE = MEANSTRUCTURE;
MODEL:
  g0 BY adhd_to2@1 adhd_to4@1 adhd_to5@1 adhd_to7@1;
  [g0];
  g0;
  [adhd_to2-adhd_to7@0];
  adhd_to2-adhd_to7 (resid1);
Mplus Script for Level Only Growth Model
SUMMARY OF ANALYSIS
Number of observations  272
SAMPLE STATISTICS
Means
            ADHD_TO2  ADHD_TO4  ADHD_TO5  ADHD_TO7
              14.192    13.076    12.162    11.928
Covariances
            ADHD_TO2  ADHD_TO4  ADHD_TO5  ADHD_TO7
 ADHD_TO2     63.755
 ADHD_TO4     41.119    86.966
 ADHD_TO5     34.992    59.138    86.297
 ADHD_TO7     38.444    53.650    68.907    95.279
Correlations
            ADHD_TO2  ADHD_TO4  ADHD_TO5  ADHD_TO7
 ADHD_TO2      1.000
 ADHD_TO4      0.552     1.000
 ADHD_TO5      0.472     0.683     1.000
 ADHD_TO7      0.493     0.589     0.760     1.000
Mplus Output for Level Only Growth Model
TESTS OF MODEL FIT
Chi-Square Test of Model Fit
  Value  106.813
  Degrees of Freedom  11
  P-Value  0.0000
CFI/TLI
  CFI  0.816
  TLI  0.900
Loglikelihood
  H0 Value  -3729.876
  H1 Value  -3676.469
Information Criteria
  Number of Free Parameters  3
  Akaike (AIC)  7465.751
  Bayesian (BIC)  7476.569
  Sample-Size Adjusted BIC  7467.056
    (n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
  Estimate  0.179
  90 Percent C.I.  0.149  0.211
  Probability RMSEA <= .05  0.000
Mplus Output for Level Only Growth Model
MODEL RESULTS
                 Estimates    S.E.  Est./S.E.
G0 BY
   ADHD_TO2         1.000   0.000      0.000
   ADHD_TO4         1.000   0.000      0.000
   ADHD_TO5         1.000   0.000      0.000
   ADHD_TO7         1.000   0.000      0.000
Means
   G0              12.840   0.460     27.904
Intercepts
   ADHD_TO2         0.000   0.000      0.000
   ADHD_TO4         0.000   0.000      0.000
   ADHD_TO5         0.000   0.000      0.000
   ADHD_TO7         0.000   0.000      0.000
Variances
   G0              48.929   4.957      9.871
Residual Variances
   ADHD_TO2        34.633   1.715     20.199
   ADHD_TO4        34.633   1.715     20.199
   ADHD_TO5        34.633   1.715     20.199
   ADHD_TO7        34.633   1.715     20.199
Mplus Output for Level Only Growth Model
Level Only Growth Model
[Path diagram with estimates: level mean μ0 = 12.8, level variance σ0² = 48.9, and residual variance σe² = 34.6 at each of Y[2], Y[4], Y[5], Y[7]]
[Figure: (A) Predicted Plot for Level Only Growth Model and (B) Longitudinal Residual Plot]
Table of Results
Parameter                          Level    Linear   Latent
Fixed Effects
  Level Mean (μ0)                  12.8
  Slope Mean (μ1)                  --
  Basis Coefficients               --
Random Effects
  Level Variance (σ0²)             48.9
  Slope Variance (σ1²)             --
  Level/Slope Covariance (σ01)     --
  Residual Variance (σe²)          34.6
Fit Statistics
  χ²/df                            107/11
  Δχ²/Δdf                          --
  RMSEA (εa)                       0.18
SAS/SPSS Script for No Growth Model
TITLE1 'No Growth Model - Variance in Intercept';
PROC MIXED DATA=adhd_long NOCLPRINT COVTEST;
  CLASS id;
  MODEL adhd = / SOLUTION DDFM=BW NOTEST CHISQ OUTP=nogrowthpred;
  RANDOM INTERCEPT / SUBJECT=id TYPE=UN;
RUN;

*Intercept only - No Growth Model.
Mixed adhd
  /print=solution
  /method=ml
  /fixed=intercept
  /random intercept | subject(id).
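Linear Growth Model
$Y[t]_n = 1 \cdot g_{0n} + A[t] \cdot g_{1n} + e[t]_n$
$A[t] = 0, .4, .6, 1$
$g_{0n} = \mu_0 + d_{0n}$
$g_{1n} = \mu_1 + d_{1n}$
[Path diagram: level g0 (loadings fixed at 1) and slope g1 (loadings 0, .4, .6, 1) on Y[2], Y[4], Y[5], Y[7], with level and slope means, variances, and covariance; residuals e[2], e[4], e[5], e[7] with a common variance σe²]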
MODEL:
  g0 BY
    adhd_to2@1
    adhd_to4@1
    adhd_to5@1
    adhd_to7@1;
  [g0];
  g0;
  !Linear Slope;
  g1 BY
    adhd_to2@0
    adhd_to4@.4
    adhd_to5@.6
    adhd_to7@1;
  [g1];
  g1;
  g0 WITH g1;
  [adhd_to2-adhd_to7@0];
  adhd_to2-adhd_to7 (resid1);
Mplus Script for Linear Growth Curve
Chi-Square Test of Model Fit
  Value  43.525
  Degrees of Freedom  8
  P-Value  0.0000
CFI/TLI
  CFI  0.932
  TLI  0.949
Loglikelihood
  H0 Value  -3698.231
  H1 Value  -3676.469
Information Criteria
  Number of Free Parameters  6
  Akaike (AIC)  7408.463
  Bayesian (BIC)  7430.098
  Sample-Size Adjusted BIC  7411.073
    (n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
  Estimate  0.128
  90 Percent C.I.  0.092  0.166
  Probability RMSEA <= .05  0.000
Mplus Output for Linear Growth Curve
                 Estimates    S.E.  Est./S.E.
G0 BY
   ADHD_TO2         1.000   0.000      0.000
   ADHD_TO4         1.000   0.000      0.000
   ADHD_TO5         1.000   0.000      0.000
   ADHD_TO7         1.000   0.000      0.000
G1 BY
   ADHD_TO2         0.000   0.000      0.000
   ADHD_TO4         0.400   0.000      0.000
   ADHD_TO5         0.600   0.000      0.000
   ADHD_TO7         1.000   0.000      0.000
G0 WITH
   G1               3.467   4.861      0.713
Means
   G0              14.016   0.469     29.866
   G1              -2.352   0.561     -4.193
Variances
   G0              39.249   5.287      7.424
   G1              31.225   8.044      3.882
Residual Variances
   ADHD_TO2        28.262   1.714     16.492
   ADHD_TO4        28.262   1.714     16.492
   ADHD_TO5        28.262   1.714     16.492
   ADHD_TO7        28.262   1.714     16.492
Mplus Output for Linear Growth Curve
Linear Growth Model Results
[Path diagram with estimates: level mean μ0 = 14.0 and slope mean μ1 = -2.4; level variance σ0² = 39.2, slope variance σ1² = 31.2, level/slope covariance σ01 = 3.5; residual variance σe² = 28.3 at each occasion; slope loadings 0, .4, .6, 1]
[Figure: (A) Predicted Plot for Linear Growth Model and (B) Longitudinal Residual Plot]
Table of Results
Parameter                          Level    Linear                 Latent
Fixed Effects
  Level Mean (μ0)                  12.8*    14.0*
  Slope Mean (μ1)                  --       -2.4*
  Basis Coefficients               --       =0, =.4, =.6, =1
Random Effects
  Level Variance (σ0²)             48.9*    39.2*
  Slope Variance (σ1²)             --       31.2*
  Level/Slope Covariance (σ01)     --       3.5
  Residual Variance (σe²)          34.6*    28.3*
Fit Statistics
  χ²/df                            107/11   44/8
  Δχ²/Δdf                          --
  RMSEA (εa)                       0.18     0.13
SAS/SPSS Script for Linear Growth Model
TITLE1 'Linear Growth Model';
PROC MIXED DATA=adhd_long NOCLPRINT COVTEST METHOD=ML;
  CLASS id;
  MODEL adhd = age / SOLUTION DDFM=BW NOTEST CHISQ OUTP=linearpred;
  RANDOM INTERCEPT age / SUBJECT=id TYPE=UN GCORR;
RUN;

*Linear Growth Model.
Mixed adhd with age
  /print=solution
  /method=ml
  /fixed=intercept age
  /random intercept age | subject(id) covtype(un).
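Latent Basis Growth Model
$Y[t]_n = 1 \cdot g_{0n} + A[t] \cdot g_{1n} + e[t]_n$
$A[t] = 0, \alpha_4, \alpha_5, 1$
$g_{0n} = \mu_0 + d_{0n}$
$g_{1n} = \mu_1 + d_{1n}$
[Path diagram: level g0 (loadings fixed at 1) and slope g1 (loadings 0, α4, α5, 1) on Y[2], Y[4], Y[5], Y[7], with level and slope means, variances, and covariance; residuals e[2], e[4], e[5], e[7] with a common variance σe²]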
MODEL:
  g0 BY
    adhd_to2@1
    adhd_to4@1
    adhd_to5@1
    adhd_to7@1;
  [g0];
  g0;
  !Latent Basis Slope (ages 4 and 5 estimated, starting at .4 and .6);
  g1 BY
    adhd_to2@0
    adhd_to4*.4
    adhd_to5*.6
    adhd_to7@1;
  [g1];
  g1;
  g0 WITH g1;
  [adhd_to2-adhd_to7@0];
  adhd_to2-adhd_to7 (resid1);
Mplus Script for Latent Basis Growth Curve
Mplus Output for Latent Basis Model
TESTS OF MODEL FIT
Chi-Square Test of Model Fit
          Value                             28.090
          Degrees of Freedom                     6
          P-Value                           0.0001

CFI/TLI
          CFI                                0.958
          TLI                                0.958

Loglikelihood
          H0 Value                       -3690.514
          H1 Value                       -3676.469

Information Criteria
          Number of Free Parameters              8
          Akaike (AIC)                    7397.028
          Bayesian (BIC)                  7425.874
          Sample-Size Adjusted BIC        7400.509
            (n* = (n + 2) / 24)

RMSEA (Root Mean Square Error Of Approximation)
          Estimate                           0.116
          90 Percent C.I.                    0.075  0.161
          Probability RMSEA <= .05           0.006

MODEL RESULTS
                    Estimates     S.E.  Est./S.E.
 G0       BY
    ADHD_TO2            1.000    0.000      0.000
    ADHD_TO4            1.000    0.000      0.000
    ADHD_TO5            1.000    0.000      0.000
    ADHD_TO7            1.000    0.000      0.000

 G1       BY
    ADHD_TO2            0.000    0.000      0.000
    ADHD_TO4            0.528    0.084      6.326
    ADHD_TO5            0.927    0.093      9.958
    ADHD_TO7            1.000    0.000      0.000

 G0       WITH
    G1                 -1.359    4.831     -0.281

 Means
    G0                 14.215    0.485     29.310
    G1                 -2.241    0.534     -4.199

 Variances
    G0                 40.346    5.575      7.237
    G1                 32.440    7.477      4.339

 Residual Variances
    ADHD_TO2           26.753    1.622     16.492
    ADHD_TO4           26.753    1.622     16.492
    ADHD_TO5           26.753    1.622     16.492
    ADHD_TO7           26.753    1.622     16.492
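The fit indices in this output can be recomputed from the two loglikelihoods, which is a useful check when comparing models by hand. The sketch below assumes the usual definitions (chi-square as twice the H1 minus H0 loglikelihood difference, RMSEA as sqrt(max(χ² - df, 0) / (df · n))). The sample size is not printed in this part of the output; n = 272 is an assumption, inferred because it reproduces the reported BIC. The function name `fit_indices` is hypothetical.

```python
import math


def fit_indices(ll_h0, ll_h1, n_params, df, n):
    """Recompute common fit indices from the H0 and H1 loglikelihoods."""
    deviance = -2.0 * ll_h0             # -2 * loglikelihood of the fitted (H0) model
    chi_square = 2.0 * (ll_h1 - ll_h0)  # likelihood-ratio test against the saturated (H1) model
    return {
        "chi_square": chi_square,
        "aic": deviance + 2.0 * n_params,
        "bic": deviance + n_params * math.log(n),
        "adj_bic": deviance + n_params * math.log((n + 2.0) / 24.0),
        "rmsea": math.sqrt(max(chi_square - df, 0.0) / (df * n)),
    }


# Loglikelihoods, parameter count, and df are taken from the output above.
# n = 272 is an assumption inferred from the reported BIC, not a printed value.
indices = fit_indices(ll_h0=-3690.514, ll_h1=-3676.469, n_params=8, df=6, n=272)
for name, value in indices.items():
    print(f"{name:>10s} = {value:10.3f}")
```

Under these assumptions the recomputed values agree with the printed chi-square, AIC, BIC, sample-size adjusted BIC, and RMSEA to the reported precision.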
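Latent Basis Growth Model Results
Fixed effects: μ0 = 14.2, μ1 = -2.2
Basis coefficients: A[t] = [0, .53, .93, 1]
Random effects: σ0² = 40.3, σ1² = 32.4, σ01 = -1.4
Residual variance: σe² = 26.8
[Path diagram: latent basis growth model for Y[2], Y[4], Y[5], Y[7] with level g0, slope g1, and residuals e[2], e[4], e[5], e[7], labeled with the estimates above]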
[Figure: (A) Predicted plot for the latent growth model; (B) longitudinal residual plot]
Table of Results
Parameter                        Level    Linear              Latent
Fixed Effects
  Level Mean (μ0)                12.8*    14.0*               14.2*
  Slope Mean (μ1)                --       -2.4*               -2.2*
  Basis Coefficients             --       =0, =.4, =.6, =1    =0, .53, .93, =1
Random Effects
  Level Variance (σ0²)           48.9*    39.2*               40.3*
  Slope Variance (σ1²)           --       31.2*               32.4*
  Level/Slope Covariance (σ01)   --       3.5                 -1.4
  Residual Variance (σe²)        34.6*    28.3*               26.8*
Fit Statistics
  χ²/df                          107/11   44/8                28/6
  Δχ²/Δdf                        --       63/3                16/2
  RMSEA (εa)                     0.18     0.13                0.12
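The Δχ²/Δdf row compares each model with the more restricted one to its left. A minimal sketch of those difference tests follows, using the rounded χ² values from the table. The helper `chi2_sf` is hypothetical; it uses the closed-form upper-tail expressions for integer degrees of freedom so that no statistics library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2) * sum_r x^(r - 1/2) / (1*3*...*(2r-1))
    total = 0.0
    term = math.sqrt(x)
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


# (model, chi-square, df) as rounded in the table of results.
models = [("Level", 107, 11), ("Linear", 44, 8), ("Latent", 28, 6)]

comparisons = []
for (name_a, chi_a, df_a), (name_b, chi_b, df_b) in zip(models, models[1:]):
    d_chi, d_df = chi_a - chi_b, df_a - df_b
    comparisons.append((name_a, name_b, d_chi, d_df, chi2_sf(d_chi, d_df)))

for name_a, name_b, d_chi, d_df, p_value in comparisons:
    print(f"{name_a} vs {name_b}: delta chi-square = {d_chi}, delta df = {d_df}, p = {p_value:.2e}")
```

Both differences (63 on 3 df and 16 on 2 df) have very small tail probabilities, which matches the pattern of improving fit from the level model to the linear model to the latent basis model.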
4. Summary & Discussion
Re-centering and Rescaling
• Versions of the same growth model can be obtained by modifying A[t] by a constant k.
• For example, for a linear growth curve, A[t] = [0,1,2,3,4,5] or A[t] = [-2,-1,0,1,2,3]
• This recentering moves the “intercept” to the third occasion, and parameter estimates are altered, but fit remains the same
• It is also possible to rescale by A[t] = [(t-k1)/k2]; with k1=1 and k2=5 (for t = 1, ..., 6), A[t] = [0,1/5,2/5,3/5,4/5,5/5] = [0,.2,.4,.6,.8,1]
• This rescaling alters the parameter estimates (namely the slope parameters), but the fit remains the same.
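The claim that re-centering and rescaling change the estimates but not the fit can be checked numerically: if the transformed basis is A* = (A - k1)/k2, then the level becomes g0 + k1·g1 and the slope becomes k2·g1, and the model-implied means and covariances are unchanged. The sketch below illustrates this; the parameter values are hypothetical and the function names are not from the slides.

```python
def implied_moments(basis, mu0, mu1, var0, var1, cov01, var_e):
    """Model-implied means and covariance matrix of a linear growth model."""
    means = [mu0 + a * mu1 for a in basis]
    covs = [
        [var0 + (a + b) * cov01 + a * b * var1 + (var_e if i == j else 0.0)
         for j, b in enumerate(basis)]
        for i, a in enumerate(basis)
    ]
    return means, covs


def reparameterize(params, k1, k2):
    """Parameters implied for the transformed basis A* = (A - k1) / k2."""
    mu0, mu1, var0, var1, cov01, var_e = params
    return (
        mu0 + k1 * mu1,                          # new level mean
        k2 * mu1,                                # new slope mean
        var0 + 2 * k1 * cov01 + k1 ** 2 * var1,  # new level variance
        k2 ** 2 * var1,                          # new slope variance
        k2 * (cov01 + k1 * var1),                # new level/slope covariance
        var_e,                                   # residual variance is unchanged
    )


def max_abs_diff(moments_a, moments_b):
    """Largest absolute difference between two sets of implied moments."""
    (means_a, covs_a), (means_b, covs_b) = moments_a, moments_b
    diffs = [abs(x - y) for x, y in zip(means_a, means_b)]
    diffs += [abs(x - y) for row_a, row_b in zip(covs_a, covs_b) for x, y in zip(row_a, row_b)]
    return max(diffs)


base_basis = [0, 1, 2, 3, 4, 5]
base_params = (10.0, 2.0, 9.0, 1.0, 0.5, 4.0)  # hypothetical values, for illustration only
base = implied_moments(base_basis, *base_params)

results = {}
for label, k1, k2 in [("recentered", 2, 1), ("rescaled", 0, 5)]:
    new_basis = [(a - k1) / k2 for a in base_basis]
    new_params = reparameterize(base_params, k1, k2)
    diff = max_abs_diff(base, implied_moments(new_basis, *new_params))
    results[label] = (new_basis, new_params, diff)
    print(label, new_basis, new_params, f"max difference in implied moments = {diff:.1e}")
```

Because the implied moments are identical, any likelihood-based fit statistic is identical as well; only the interpretation of the level and slope parameters changes.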
Does the Single Latent Curve model fit?
• The systematic inclusion of change components can be directly indexed by the improvement in goodness-of-fit.
• The growth model includes several components
$$ Y[t]_n = \{ i_n + A[t] \cdot s_n \} + e[t]_n \quad \text{or} \quad Y[t]_n = y[t]_n + e[t]_n $$
  where i_n and s_n are the individual level and slope, and y[t] is now a latent form of a “predicted score.”
• The relative size of the error variance may be compared to the total variance of the scores using standard reliability as
$$ \eta^2_{y[t]} = \frac{\sigma^2_{Y[t]} - \sigma^2_e}{\sigma^2_{Y[t]}} $$
  but these indices are not the same at each time, so we may consider a curve of reliability.
• The model likelihood can be decomposed into separate fits for separate individuals (L²n), so outliers can be considered.
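A reliability curve of this kind can be computed directly from the latent basis estimates in the Mplus output: the latent (predicted-score) variance at each occasion is σ0² + 2·A[t]·σ01 + A[t]²·σ1², and the total variance adds the residual variance. The sketch below is an illustration under that reading; the occasion labels 2, 4, 5, 7 are taken from the variable names, and the function name is hypothetical.

```python
# Latent basis estimates from the Mplus MODEL RESULTS.
basis = {2: 0.0, 4: 0.528, 5: 0.927, 7: 1.0}
var_g0, var_g1, cov_g0g1, var_e = 40.346, 32.440, -1.359, 26.753


def reliability(a):
    """(total variance - error variance) / total variance at basis value a."""
    latent_var = var_g0 + 2.0 * a * cov_g0g1 + a * a * var_g1
    total_var = latent_var + var_e
    return (total_var - var_e) / total_var


reliability_curve = {occasion: reliability(a) for occasion, a in basis.items()}
for occasion, value in reliability_curve.items():
    print(f"occasion {occasion}: reliability = {value:.3f}")
```

With these estimates the index differs across occasions even though the residual variance is constrained equal, because the latent variance depends on A[t].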
Pitfalls in Interpretation
• The shape of the curve is defined by the relative size of the basis coefficients A[t]
• The scaling of the basis (A[t] = [(t-k1)/k2]) determines the size (and sign) of the latent means and covariances, so these parameters are always relative to the chosen scaling (k1 and k2)
• The mean (μg1) and variance (σ²g1) of the slope are invariant to placement of the intercept, but the slope variance does not directly represent the “change in variance” because the covariance of level and slope (σy0y1) must be considered as well
• Tests of individual differences in change should not simply be based on the slope variance but must also include the level-slope covariance
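The role of the level-slope covariance can be made explicit with a short worked equation. Writing the level variance, slope variance, and their covariance as σ0², σ1², and σ01 (the notation of the results table), the variance of the latent score at basis value A[t] and the change in variance between A[t] = 0 and A[t] = 1 are as follows; the numerical lines plug in the rounded estimates from the results table.

```latex
\begin{aligned}
\mathrm{Var}(y[t]) &= \sigma_0^2 + 2\,A[t]\,\sigma_{01} + A[t]^2\,\sigma_1^2 \\
\mathrm{Var}(y \mid A[t]=1) - \mathrm{Var}(y \mid A[t]=0) &= \sigma_1^2 + 2\,\sigma_{01} \\
\text{Linear model:}\quad 31.2 + 2(3.5) &= 38.2 \\
\text{Latent basis model:}\quad 32.4 + 2(-1.4) &= 29.6
\end{aligned}
```

In both cases the change in variance differs from the slope variance alone, which is why the covariance has to enter any statement about changing individual differences.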
Pitfalls in Interpretation
• The correlation of the initial level and slope always describes some features of the growth plot, but its size and sign are dependent on the placement of the intercept (i.e., the 0 as defined in A[t]; Rogosa & Willett, 1985)
• Unless this initial position (A[t]=0) has some specific meaning in substantive terms common for all persons (e.g., birth, surgery, etc.), this correlation (ρ01) does not reflect a “law of initial values”
• The simple conversion of this correlation to a regression of slope on level is not always a key substantive feature; take care in interpreting such “slope on level” regressions
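The dependence of the level-slope correlation on intercept placement can be illustrated with the linear model estimates from the results table. If the zero point of the basis is moved to position k, the level becomes g0 + k·g1, so its covariance with the slope becomes σ01 + k·σ1². The sketch below assumes that reparameterization; the function name is hypothetical.

```python
import math

# Rounded linear growth model estimates from the table of results.
var_level, var_slope, cov_level_slope = 39.2, 31.2, 3.5


def level_slope_correlation(k):
    """Correlation between the slope and the level defined at basis position A[t] = k."""
    cov_k = cov_level_slope + k * var_slope
    var_level_k = var_level + 2.0 * k * cov_level_slope + k ** 2 * var_slope
    return cov_k / math.sqrt(var_level_k * var_slope)


# Basis position at which the level-slope covariance is exactly zero.
zero_point = -cov_level_slope / var_slope

correlations = {k: level_slope_correlation(k) for k in (zero_point, 0.0, 0.4, 0.6, 1.0)}
for k, r in correlations.items():
    print(f"intercept at A[t] = {k:6.3f}: correlation = {r:6.3f}")
```

The same fitted model yields a correlation near zero, a small positive correlation, or a large positive correlation depending only on where the intercept is placed, and it turns negative for placements below the zero point.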
Extensions
• Growth Models with Extension Variables (McArdle & Epstein, 1987)
• Multiple Group Growth Models (McArdle, 1989)
• Multivariate Growth Models (McArdle, 1988)
  – Correlated Growth
• Structured Nonlinear Growth Models (Browne & du Toit, 1991; Browne, 1993)
  – Basis follows a structured nonlinear equation (e.g. exponential)
• Latent Difference Score Growth Models (McArdle, 2001; McArdle & Hamagami, 2001; Hamagami & McArdle, 2001)
  – Dynamic Growth
• Growth Mixture Models (Muthén & Shedden, 1999; Muthén & Muthén, 2000)
  – Clustering based on developmental trends
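Growth Model with Extension Variable
$$
\begin{aligned}
Y[t]_n &= g_{0n} \cdot 1 + A[t] \cdot g_{1n} + e[t]_n \\
g_{0n} &= \beta_{00} + \beta_{10} \cdot X_n + g^*_{0n} \\
g_{1n} &= \beta_{01} + \beta_{11} \cdot X_n + g^*_{1n}
\end{aligned}
$$
[Path diagram: growth model for Y[1] to Y[4] with basis A[t], in which the extension variable X predicts the level g0 and the slope g1; g0* and g1* are the residual level and slope, and e[1] to e[4] are the residuals]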
Multiple Group Growth Model
[Path diagram: the same growth model for Y[1] to Y[4] drawn for Group 1 through Group G, with group-specific parameters marked by superscripts (1), ..., (g): latent means μ0 and μ1, variances σ0² and σ1², covariance σ01, and residual variance σe²]
Multivariate Growth
[Path diagram: two growth models estimated together. Y[1] to Y[4] have level g0, slope g1, and residuals ye[1] to ye[4]; X[1] to X[4] have level h0, slope h1, and residuals xe[1] to xe[4]. The levels and slopes covary within and across variables (g0,g1; h0,h1; g0,h0; g0,h1; g1,h0; g1,h1)]
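Structured Growth Curves
$$
\begin{aligned}
Y[t]_n &= g_{0n} \cdot 1 + A[t] \cdot g_{1n} + e[t]_n \\
g_{0n} &= \mu_0 + d_{0n} \\
g_{1n} &= \mu_1 + d_{1n}
\end{aligned}
$$
where A[t] is a structured nonlinear (e.g. exponential) function of time t.
[Path diagram: growth model for Y[1] to Y[4] with level g0, slope g1 with basis A[t], and residuals e[1] to e[4]]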
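Latent Difference Score – Dual Change Model
$$ \Delta y[t]_n = 1 \cdot g_{1n} + \beta_y \cdot y[t-1]_n $$
[Path diagram: latent difference score model for Y[1] to Y[4] with latent scores y[1] to y[4], latent changes Δy[2] to Δy[4], level g0, constant change component g1, proportional change coefficient βy, and residuals ey[1] to ey[4]]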
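To see what a dual change model implies for one individual's latent trajectory, the recursion can be iterated directly. The sketch below assumes the standard dual change form, Δy[t] = α·g1 + β·y[t-1] with y[t] = y[t-1] + Δy[t] and y[1] = g0, and leaves out the residuals. All numerical values are hypothetical, α is set to 1 as is commonly done, and the function name is not from the slides.

```python
def dual_change_trajectory(g0, g1, beta, n_occasions, alpha=1.0):
    """Latent scores y[1..T] implied by a dual change score model (no residuals)."""
    scores = [g0]                                 # y[1] is the initial level
    for _ in range(n_occasions - 1):
        previous = scores[-1]
        change = alpha * g1 + beta * previous     # latent difference score
        scores.append(previous + change)          # y[t] = y[t-1] + change
    return scores


# Hypothetical individual: level 10, constant change 8, proportional change -0.5.
trajectory = dual_change_trajectory(g0=10.0, g1=8.0, beta=-0.5, n_occasions=4)
asymptote = -1.0 * 8.0 / -0.5  # value at which the change becomes zero (-alpha * g1 / beta)
print(trajectory, asymptote)
```

With a negative proportional change coefficient the changes shrink over time and the trajectory levels off toward -α·g1/β, which is the nonlinear shape this model is typically used to capture.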
[Path diagram: bivariate latent difference score (dual change) model for X[1] to X[4] and Y[1] to Y[4], with latent scores x[t] and y[t], latent changes Δx[t] and Δy[t], levels h0 and g0, slopes h1 and g1, proportional change coefficients within x and within y, coupling paths from x to y and from y to x, and covariances among the levels and slopes]
[Path diagram: growth mixture model for Y[1] to Y[5] with level g0, slope g1, residuals ey[1] to ey[5], class-specific latent means μ0(k) and μ1(k), and a latent class variable C]
References
Browne, M. W. (1993). Structured latent curve analysis. In C. M. Cuadras & C. R. Rao (Eds.), Multivariate analysis: Future directions 2 (pp. 171-197). Amsterdam: Elsevier Science.
Browne, M. W. & du Toit, S. (1991). Models for learning data. In L. M. Collins & J. L. Horn (Eds.), Best methods for the analysis of change: Recent advances, unanswered questions, future directions (pp. 47-68). Washington, DC: American Psychological Association.
Grimm, K. J. (2007). Multivariate longitudinal methods for studying developmental relationships between depression and academic achievement. International Journal of Behavioral Development, 31, 328-339.
Grimm, K. J. & Ram, N. (2008). Nonlinear growth models in Mplus & SAS. Manuscript submitted for publication.
Hamagami, F. & McArdle, J. J. (2001). Advanced studies of individual differences linear dynamic models for longitudinal data analysis. In G. A. Marcoulides & R. E. Schumacker (Eds.), New developments and techniques in structural equation modeling (pp. 203-246). Mahwah, NJ: Erlbaum.
McArdle, J. J. (1986). Latent variable growth within behavior genetic models. Behavior Genetics, 16, 163-200.
McArdle, J. J. (1988). Dynamic but structural equation modeling of repeated measures data. In J. R. Nesselroade & R. B. Cattell (Eds.), Handbook of multivariate experimental psychology (Vol. 2, pp. 561-614). New York: Plenum.
McArdle, J. J. (1989). A structural modeling experiment with multiple growth functions. In R. Kanfer, P. L. Ackerman, & R. Cudeck (Eds.), Abilities, motivation, and methodology: The Minnesota Symposium on Learning and Individual Differences (pp. 71-117). Hillsdale, NJ: Erlbaum.
McArdle, J. J. (2001). A latent difference score approach to longitudinal dynamic structural analyses. In R. Cudeck, S. du Toit, & D. Sörbom (Eds.), Structural equation modeling: Present and future (pp. 342-380). Lincolnwood, IL: Scientific Software International.
McArdle, J. J. & Epstein, D. (1987). Latent growth curves within developmental structural equation models. Child Development, 58, 110-133.
McArdle, J. J., & Hamagami, F. (2001). Latent difference score structural models for linear dynamic analyses with incomplete longitudinal data. In L. Collins & A. Sayer (Eds.), New methods for the analysis of change (pp. 139-175). Washington, DC: American Psychological Association.
McArdle, J. J. & Nesselroade, J. R. (2003). Growth curve analysis in contemporary psychological research. In J. A. Schinka & W. F. Velicer (Eds.), Handbook of psychology: Research methods in psychology (Vol. 2, pp. 447-480). New York, NY: John Wiley & Sons.
Meredith, W. & Tisak, J. (1990). Latent curve analysis. Psychometrika, 55, 107-122.
Muthén, B. O. & Muthén, L. K. (2000). Integrating person-centered and variable-centered analysis: Growth mixture modeling with latent trajectory classes. Alcoholism: Clinical and Experimental Research, 24, 882-891.
Muthén, B. & Shedden, K. (1999). Finite mixture modeling with mixture outcomes using the EM algorithm. Biometrics, 55, 463-469.
Ram, N., & Grimm, K. J. (2007). Using simple and complex growth models to articulate developmental change: Matching method to theory. International Journal of Behavioral Development, 31, 303-316.
Rao, C. R. (1958). Some statistical methods for the comparison of growth curves. Biometrics, 14, 1-17.
Rogosa, D. R. & Willett, J. B. (1985). Understanding correlates of change by modeling individual differences in growth. Psychometrika, 50, 203-228.
Singer, J. D. & Willett, J. B. (2003). Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence. New York: Oxford University Press.
Tucker, L. R. (1958). Determination of parameters of a functional relation by factor analysis. Psychometrika, 23, 19-23.
Acknowledgments
Collaborators
• Jack McArdle
• John Nesselroade
• Nilam Ram
• Fumiaki Hamagami
Institute of Education Sciences
Longitudinal Research Institute
Jefferson Psychometric Laboratory
Center for Developmental & Health Research Methodology