Structural Equation Modelling for small samples
Michel Tenenhaus HEC School of Management (GRECHEC),
1 rue de la Libération, Jouy-en-Josas, France [[email protected]]
Abstract
Two complementary schools have come to the fore in the field of Structural Equation Modelling
(SEM): covariance-based SEM and component-based SEM.
The first approach developed around Karl Jöreskog. It can be considered as a generalisation of both
principal component analysis and factor analysis to the case of several data tables connected by
causal links.
The second approach developed around Herman Wold under the name "PLS" (Partial Least
Squares). More recently, Hwang and Takane (2004) proposed a new method named Generalized Structural Component Analysis. This second approach is a generalisation of principal
component analysis (PCA) to the case of several data tables connected by causal links.
Covariance-based SEM is usually used with an objective of model validation and needs a large
sample (what counts as large varies from one author to another: more than 100 subjects, and
preferably more than 200 subjects, are often mentioned). Component-based SEM is mainly used for
score computation and can be carried out on very small samples. A study based on 6 subjects has
been published by Tenenhaus, Pagès, Ambroisine & Guinot (2005) and will be used in this paper.
In 1996, Roderick McDonald published a paper in which he showed how to carry out a PCA using
the ULS (Unweighted Least Squares) criterion in the covariance-based SEM approach. He
concluded from this that he could in fact use the covariance-based SEM approach to obtain results
similar to those of the PLS approach, but with a precise optimisation criterion in place of an
algorithm whose properties are not well known.
In this research, we will explore the use of ULS-SEM and PLS on small samples. First experiments
have already shown that score computation and bootstrap validation are quite insensitive to the
choice of method. We will also study the very important contribution of these methods to multi-
block analysis.
Key words: Multi-block analysis, PLS path modelling, Structural Equation Modelling, Unweighted
Least Squares
Introduction
Compared to covariance-based SEM, PLS suffers from several handicaps: (1) the diffusion of PLS
path modelling software is much more limited than that of covariance-based SEM software, (2) the
PLS algorithm is more a heuristic than an algorithm with well-known properties, and (3) the
possibility of imposing value or equality constraints on path coefficients, which is easily managed in
covariance-based SEM, does not exist in PLS. Of course, PLS also has some advantages over
covariance-based SEM (that is why PLS exists), and we can list some of them: systematic
convergence of the algorithm due to its simplicity, the ability to manage data with a small number
of individuals and a large number of variables, the practical meaning of the latent variable estimates,
and a general framework for multi-block analysis.
It is often mentioned that PLS is to covariance-based SEM as PCA is to factor analysis. But the
situation changed considerably when Roderick McDonald showed in his seminal 1996 paper that he
could easily carry out a PCA with covariance-based SEM software by using the ULS (Unweighted
Least Squares) criterion and cancelling the measurement error variances. Furthermore, the
estimation of the latent variables proposed by McDonald is similar to using PLS mode A and the
SEM scheme (i.e. using the "theoretical" latent variables as inner LV estimates). Thus, it became
possible to use covariance-based SEM software to mimic PLS.
In the first section of this paper, we recall how to use the ULS criterion for covariance-based
SEM, together with the PLS way of estimating latent variables, in order to mimic PLS path
modelling. The second section is devoted to showing how to carry out a PCA with covariance-based
SEM software and to discussing the interest of this approach for taking parameter constraints into
account and for bootstrapping. Multi-block analysis is presented in the third section as a
confirmatory factor analysis.
We have used AMOS 6.0 (Arbuckle, 2005) and XLSTAT-PLSPM, a module of the XLSTAT
software (XLSTAT, 2007), on practical examples to illustrate the paper. Listing the pluses and
minuses of ULS-SEM and PLS finally concludes the paper.
I. Using ULS and PLS estimation methods for structural equation modelling
We describe in this section the use of the ULS estimation method for the SEM parameter
estimates and that of the PLS estimation method for computing the LV values.
We first recall the structural equation model, following Bollen (1989). A structural
equation model consists of two models: the latent variable model and the measurement model.
The latent variable model
Let η be a column vector consisting of m endogenous (dependent) centred latent variables, and ξ a
column vector consisting of k exogenous (independent) centred latent variables. The structural
model connecting the vector η to the vectors η and ξ is written as

(1) η = Bη + Γξ + ζ

where B is a zero-diagonal m × m matrix of regression coefficients, Γ an m × k matrix of regression
coefficients and ζ a centred random vector of dimension m.
The measurement model
Each latent (unobservable) variable is described by a set of manifest (observable) variables. The
column vector y_j of the centred manifest variables linked to the dependent latent variable η_j can be
written as a function of η_j through a simple regression with the usual hypotheses:

(2) y_j = λ_j^y η_j + ε_j

The column vector y, obtained by concatenation of the y_j's, is written as

(3) y = Λ_y η + ε

where Λ_y = ⊕_{j=1}^m λ_j^y is the direct sum of λ_1^y, ..., λ_m^y and ε is a column vector obtained by
concatenation of the ε_j's. It may be recalled that the direct sum of a set of matrices A_1, A_2, ..., A_m
is a block-diagonal matrix in which the diagonal blocks are formed by the matrices A_1, A_2, ..., A_m.
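As a small illustration (with hypothetical loading vectors, not values from the paper), the direct sum can be assembled in a few lines of numpy:

```python
import numpy as np

def direct_sum(*blocks):
    """Direct sum of matrices: a block-diagonal matrix whose diagonal
    blocks are the given matrices and whose off-diagonal blocks are zero."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

# Hypothetical loading vectors for two latent variables with 2 and 3 MVs
lam1 = np.array([[0.9], [0.8]])
lam2 = np.array([[0.7], [0.6], [0.5]])
Lambda_y = direct_sum(lam1, lam2)   # shape (5, 2): one column per LV
```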
Similarly, the column vector x of the centred manifest variables linked to the independent latent
variables is written as a function of ξ:

(4) x = Λ_x ξ + δ

Adding the usual hypothesis that the matrix I − B is non-singular, equation (1) can also be written as

(5) η = (I − B)^{-1}(Γξ + ζ)

and consequently (3) becomes

(6) y = Λ_y (I − B)^{-1}(Γξ + ζ) + ε
Factorisation of the manifest variable covariance matrix
Let Φ = Cov(ξ) = E(ξξ'), Ψ = Cov(ζ) = E(ζζ'), Θ_ε = Cov(ε) = E(εε') and Θ_δ = Cov(δ) = E(δδ').
Suppose that the random vectors ξ, ζ, ε and δ are independent of each other and that the covariance
matrices Ψ, Θ_ε, Θ_δ of the error terms are diagonal. Then we get:

Σ_xx = Λ_x Φ Λ_x' + Θ_δ
Σ_yy = Λ_y (I − B)^{-1} [ΓΦΓ' + Ψ] [(I − B)^{-1}]' Λ_y' + Θ_ε
Σ_xy = Λ_x Φ Γ' [(I − B)^{-1}]' Λ_y'

From which we finally obtain:

(7) Σ = [ Σ_yy  Σ_yx ]
        [ Σ_xy  Σ_xx ]

with Σ_yy = Λ_y (I − B)^{-1}[ΓΦΓ' + Ψ][(I − B)^{-1}]' Λ_y' + Θ_ε, Σ_yx = Λ_y (I − B)^{-1} ΓΦ Λ_x',
Σ_xy = Σ_yx' and Σ_xx = Λ_x Φ Λ_x' + Θ_δ.
Let θ = {Λ_x, Λ_y, B, Γ, Φ, Ψ, Θ_ε, Θ_δ} be the set of parameters of the model and Σ(θ) the matrix (7).
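As a sketch, the block matrix (7) can be assembled numerically from the model parameters; all matrices below are hypothetical toy values, not estimates from the paper:

```python
import numpy as np

def implied_covariance(Lx, Ly, B, G, Phi, Psi, Th_eps, Th_del):
    """Assemble the implied MV covariance matrix Sigma(theta) of equation (7)."""
    m = B.shape[0]
    Binv = np.linalg.inv(np.eye(m) - B)            # (I - B)^{-1}
    C_eta = Binv @ (G @ Phi @ G.T + Psi) @ Binv.T  # covariance of eta
    Syy = Ly @ C_eta @ Ly.T + Th_eps
    Sxx = Lx @ Phi @ Lx.T + Th_del
    Sxy = Lx @ Phi @ G.T @ Binv.T @ Ly.T
    return np.block([[Syy, Sxy.T], [Sxy, Sxx]])

# Hypothetical model: 1 endogenous LV with 2 MVs, 1 exogenous LV with 2 MVs
Lx = np.array([[1.0], [0.8]])
Ly = np.array([[1.0], [0.9]])
B = np.zeros((1, 1))
G = np.array([[0.5]])
Phi = np.array([[1.0]])
Psi = np.array([[0.75]])
Sigma = implied_covariance(Lx, Ly, B, G, Phi, Psi,
                           np.zeros((2, 2)), np.zeros((2, 2)))
```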
Model estimation using the ULS method
Let S be the empirical covariance matrix of the MVs. The object is to seek the set of parameters
θ̂ = {Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, Θ̂_ε, Θ̂_δ} minimizing the criterion

(8) ||S − Σ(θ̂)||²

The aim is therefore to seek a factorisation of the empirical covariance matrix S as a function of the
parameters of the structural model. In SEM software, the covariance matrix estimates
Θ̂_ε = Cov(ε) and Θ̂_δ = Cov(δ) of the residual terms are computed in such a way that the diagonal of
the reconstruction error matrix E = S − Σ(θ̂) is null, even when this yields negative variances
(Heywood case).
Let us denote by σ̂_ii the i-th diagonal term of Σ(Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, 0, 0) and by θ̂_ii the i-th
diagonal term of Θ̂_ε ⊕ Θ̂_δ. From the formula

(9) s_ii = σ̂_ii + θ̂_ii

we may conclude that σ̂_ii is the part of the variance s_ii of the i-th MV explained by its LV (except in
a Heywood case) and θ̂_ii is the estimate of the variance of the measurement error relative to this
MV. As all the error terms e_ii = s_ii − (σ̂_ii + θ̂_ii) are null, this method is not oriented towards the
search for parameters explaining the MV variances. It is in fact oriented towards the reconstruction
of the covariances between the MVs, variances excluded.
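A minimal numerical sketch of the ULS idea for a single-factor model, fitted here with plain gradient descent on the off-diagonal residuals rather than with a SEM package (the covariance matrix below is a hypothetical toy example):

```python
import numpy as np

def uls_one_factor(S, n_iter=5000, lr=0.01):
    """Gradient-descent sketch of the ULS criterion (8) for a one-factor
    model S ~ l l' + Theta. The diagonal Theta absorbs the diagonal
    residuals (formula (9)), so only the off-diagonal covariances
    constrain the loadings l."""
    p = S.shape[0]
    l = 0.7 * np.sqrt(np.diag(S))          # crude starting point
    mask = 1.0 - np.eye(p)                 # ignore the diagonal
    for _ in range(n_iter):
        R = (S - np.outer(l, l)) * mask    # off-diagonal residuals
        l = l + lr * (R @ l)               # descent step (constants folded into lr)
    theta = np.diag(S) - l ** 2            # error variances via formula (9)
    return l, theta

# Hypothetical covariance matrix exactly generated by loadings (.9, .8, .7)
true_l = np.array([0.9, 0.8, 0.7])
S = np.outer(true_l, true_l) + np.diag(1.0 - true_l ** 2)
l, theta = uls_one_factor(S)
```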
The McDonald approach for parameter estimation
In his 1996 paper, McDonald proposes to estimate the model parameters subject to the constraints
that all the θ̂_ii are null. The object is to seek the parameters Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂ minimizing the
criterion

(10) ||S − Σ(Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, 0, 0)||²

The estimates of the variances of the residual terms ε and δ are integrated in the diagonal terms of
the reconstruction error matrix E = S − Σ(Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, 0, 0). This method is therefore oriented
towards the reconstruction of the full MV covariance matrix, variances included. In a second step,
the final estimates Θ̂_ε and Θ̂_δ of the variances of the residual terms ε and δ are obtained by using
formula (9) again.
Goodness of Fit
The quality of the fit can be measured by the GFI (Goodness of Fit Index) criterion of Jöreskog &
Sörbom, defined by the formula

(11) GFI = 1 − ||S − Σ(Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, Θ̂_ε, Θ̂_δ)||² / ||S||²

i.e. the proportion of ||S||² explained by the model. By convention, the model under study is
acceptable when the GFI is greater than 0.90.
The quantity ||S − Σ(Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, Θ̂_ε, Θ̂_δ)||² can be deduced from the CMIN criterion given in
AMOS:

(12) ||S − Σ(Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, Θ̂_ε, Θ̂_δ)||² = 2/(N − 1) × CMIN

where N is the number of cases.
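A small numerical illustration with toy matrices, assuming AMOS's ULS CMIN equals (N − 1)/2 × ||S − Σ̂||²:

```python
import numpy as np

def gfi(S, Sigma_hat):
    """Exact GFI of formula (11): share of ||S||^2 reproduced by the model."""
    return 1.0 - np.sum((S - Sigma_hat) ** 2) / np.sum(S ** 2)

def gfi_from_cmin(S, cmin, n_cases):
    """GFI recovered from CMIN, assuming CMIN = (N-1)/2 * ||S - Sigma_hat||^2."""
    return 1.0 - (2.0 * cmin / (n_cases - 1)) / np.sum(S ** 2)

# Hypothetical 2x2 empirical and implied covariance matrices
S = np.array([[1.0, 0.6], [0.6, 1.0]])
Sigma_hat = np.array([[1.0, 0.5], [0.5, 1.0]])
cmin = (6 - 1) / 2 * np.sum((S - Sigma_hat) ** 2)  # what AMOS would report for N = 6
```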
In practical applications of the McDonald approach, the difference between the GFI given by AMOS
and the exact GFI computed with formula (11) will be small:

(13) GFI = 1 − ||S − Σ(Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, Θ̂_ε, Θ̂_δ)||² / ||S||²
         = 1 − [ ||S − Σ(Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, 0, 0)||² − Σ_i θ̂_ii² ] / ||S||²

Using the McDonald approach, the GFI given by AMOS is equal to

(14) GFI = 1 − ||S − Σ(Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, 0, 0)||² / ||S||²

and Σ_i θ̂_ii² / ||S||² is usually small. Furthermore, the exact GFI will always be larger than the GFI
given by AMOS.
Evaluation of the latent variables
After having estimated the parameters of the model, we now turn to the problem of evaluating the
latent variables. Three approaches can be distinguished: the traditional SEM approach, the
"McDonald" approach, and the "Fornell" approach. As is usual in the PLS approach, we now
designate a manifest variable by the letter x and a latent variable by the letter ξ, regardless
of whether they are of the dependent or independent type. The total number of latent variables is
n = k + m and the number of manifest variables related to the latent variable ξ_j is p_j.
The traditional SEM approach
To construct an estimate ξ̂_j of ξ_j, one proceeds by multiple regression of ξ_j on the whole set of
the centred manifest variables x_11 − x̄_11, ..., x_{np_n} − x̄_{np_n}. In other words, if one denotes by Σ̂_xx the
implied (i.e. predicted by the structural model) covariance matrix between the manifest variables,
and by Σ̂_{xξ_j} the vector of the implied covariances between the manifest variables x and the latent
variable ξ_j, one obtains an expression of ξ̂_j as a function of the whole set of manifest variables:

(15) ξ̂_j = X Σ̂_xx^{-1} Σ̂_{xξ_j}

where X = [x_11 − x̄_11, ..., x_{np_n} − x̄_{np_n}]. This method is not really usable, as it is more natural to
estimate a latent variable solely as a function of its own manifest variables.
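Formula (15) is an ordinary multiple-regression predictor; a minimal numpy sketch with hypothetical implied matrices:

```python
import numpy as np

def sem_lv_scores(X, Sigma_xx_hat, Sigma_xxi_hat):
    """Traditional SEM estimate (15): multiple regression of xi_j on ALL
    centred manifest variables.
    X            : (cases x MVs) centred data matrix
    Sigma_xx_hat : implied covariance matrix of the MVs
    Sigma_xxi_hat: implied covariances between the MVs and xi_j"""
    return X @ np.linalg.solve(Sigma_xx_hat, Sigma_xxi_hat)

# Hypothetical setup: 6 cases, 4 centred MVs, implied matrices from some fit
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))
X = X - X.mean(axis=0)
Sigma_xx = 0.5 * np.eye(4) + 0.5          # a positive-definite implied matrix
Sigma_xxi = np.array([0.8, 0.7, 0.3, 0.2])
xi_hat = sem_lv_scores(X, Sigma_xx, Sigma_xxi)
```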
summarize the manifest variables of the block. This relationship may also be reflective: each
manifest variable is then a reflection of a latent variable existing a priori, a theoretical concept one
would try to outline with measures. The formative mode does not require the blocks to be one-
dimensional, while that is compulsory for the reflective mode. Here, we are more in a formative
mode for the physico-chemical and sensorial blocks and reflective mode by construction for the
hedonic block. The two modes are indicated by the direction of the arrows in Figure 1.
With regard to the PLS algorithm, it is recommended that the method for calculating the outer
estimates of the latent variables be selected according to the type of relationship between the
manifest variables and their latent variables: Mode A for the reflective type and Mode B for the
formative type (Wold, 1985). The low number of products has obliged us to use Mode A to calculate
the outer estimates of the latent variables (although the relationship between the manifest and latent
variables is formative for the physico-chemical and sensorial blocks).
The ULS-SEM approach presented here is clearly oriented towards the reflective mode. Therefore
this orange juice example will be analyzed with the ULS-SEM and PLS approaches using the
reflective mode for the three blocks. Concerning the physico-chemical and sensorial blocks, the
direction of the arrows connecting the MVs to their LVs shown in Figure 1 should thus be reversed.
Figure 1: Theoretical model of relationships between the hedonic, physico-chemical and sensorial
data
[Path diagram: the physico-chemical block (Glucose, Fructose, Saccharose, Sweetening power,
pH before processing, pH after centrifugation, Titer, Citric acid, Vitamin C) forms ξ1; the sensorial
block (Smell intensity, Odor typicity, Pulp, Taste intensity, Acidity, Bitterness, Sweetness) forms ξ2;
the hedonic block (Judge 2, Judge 3, ..., Judge 96) forms ξ3.]
1. Use of ULS-SEM
We now use the ULS-SEM approach on the orange juice data. Following McDonald, the
measurement error variances are set to 0. The results are given in Figure 2 and in Table 2.
All manifest variables have been standardized. The value 1 has been given to the path coefficients
related to the manifest variables pH before centrifugation, Sweetness and Judge2.
Table 2: Outputs of AMOS 6.0
Table 2.1: Regression Weights (non significant weights in bold):
Parameter Estimate Lower (90%) Upper (90%) P
SENSORIAL
Table 2.2: Variances
Parameter Estimate Lower Upper P
PHYSICO-CHEMICAL .921 .429 1.120 .020
d1 .298 .028 .364 .010
d2 .034 .000 .044 .177
Table 2.3: Squared Multiple Correlations
Parameter Estimate Lower Upper P
SENSORIAL .655 .529 .951 .010
HEDONIC .946 .919 1.000 .020
Table 2.4: Model Fit Summary
Model NPAR CMIN
Default model 42 105.613
Model RMR GFI AGFI PGFI
Default model .175 .904 .898 .855
Comment: The GFI is equal to .904 and suggests that the model is acceptable.
Figure 3: Loading plot for the PCA of judges
The main objective of component-based SEM is the construction of scores. Following the
McDonald approach, we use the path coefficients given in Figure 2 and in Table 2.1. We obtain the
following constructs:

For the Physico-chemical block
Score(Physico-chemical) ∝ -.765*Glucose - .764*Fructose + .890*Saccharose
+ .219*(Sweetening power) + 1*(pH before centrifugation) + .998*(pH after
centrifugation) - .869*Titer - .877*(Citric acid) - .064*(Vitamin C)
where all the variables (score and manifest variables) are standardized.

For the Sensorial block
Score(Sensorial) ∝ .244*(Smell intensity) + .935*(Odor typicity)
+ .657*Pulp - .565*(Taste intensity) - .946*Acidity - .974*Bitterness
+ 1*Sweetness
with the same standardization as for the previous block.

For the Hedonic block
Score(Hedonic) = 1*Judge2 + .956*Judge3 + … + .821*Judge96
with the same standardization as for the previous blocks.
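Each construct above is simply a weighted sum of the standardized manifest variables of its block, rescaled to unit variance; a sketch of the computation (the data below are hypothetical placeholders, the weights are those quoted for the sensorial block):

```python
import numpy as np

def block_score(Z, weights):
    """Block score: weighted sum of the standardized MVs of the block,
    restandardized (scores are defined up to a positive constant,
    hence the proportionality signs in the text)."""
    s = Z @ weights
    return (s - s.mean()) / s.std()

# Hypothetical standardized data for the 7 sensorial MVs of the 6 juices
rng = np.random.default_rng(1)
Z = rng.standard_normal((6, 7))
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
w_sensorial = np.array([0.244, 0.935, 0.657, -0.565, -0.946, -0.974, 1.0])
score = block_score(Z, w_sensorial)
```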
The latent variable scores are given in Table 2.5 and their correlations in Table 2.6. The correlations
between these scores and the manifest variables are given in Table 2.7.
Table 2.5: ULS-SEM latent variable scores
Physico-chemical Sensorial Hedonic
Pampryl r.t. -0.72 -1.26 -1.10
Tropicana r.t. 1.05 0.43 0.66
Fruivita refr. 0.81 0.87 1.17
Joker r.t. -1.54 -0.77 -0.84
Tropicana refr. 0.56 1.27 0.85
Pampryl refr. -0.16 -0.53 -0.74
Table 2.6: ULS-SEM latent variable score correlation matrix
Physico-chemical Sensorial Hedonic
Physico-chemical 1.000 .810 .867
Sensorial .810 1.000 .961
Hedonic .867 .961 1.000
Table 2.7: Correlations between the ULS-SEM LV scores and the MV's
Physico-chemical Sensorial Hedonic
Glucose -0.898 -0.585 -0.673
Fructose -0.898 -0.575 -0.673
Saccharose 0.926 0.755 0.817
Sweetening power 0.078 0.288 0.242
pH before centrifugation 0.950 0.896 0.947
pH after centrifugation 0.939 0.904 0.946
Titer -0.973 -0.735 -0.765
Citric acid -0.977 -0.740 -0.774
Vitamin C -0.195 -0.040 -0.001
Smell intensity 0.229 0.410 0.174
Odor typicity 0.806 0.976 0.893
Pulp 0.558 0.704 0.625
Taste intensity -0.401 -0.646 -0.552
Acidity -0.745 -0.927 -0.950
Bitterness -0.775 -0.951 -0.976
Sweetness 0.871 0.967 0.979
Judge2 0.640 0.928 0.887
Judge3 0.647 0.756 0.877
Judge6 0.656 0.662 0.794
Judge11 0.872 0.785 0.919
Judge12 0.718 0.929 0.823
Judge25 0.971 0.817 0.864
Judge30 0.742 0.518 0.637
Judge31 0.343 0.693 0.712
Judge35 0.771 0.936 0.926
Judge48 0.460 0.837 0.834
Judge52 0.791 0.840 0.944
Judge55 0.504 0.878 0.863
Judge59 0.534 0.592 0.458
Judge60 0.870 0.854 0.924
Judge63 0.343 0.693 0.712
Judge68 0.909 0.670 0.666
Judge77 0.734 0.473 0.396
Judge79 0.718 0.929 0.823
Judge84 0.953 0.934 0.941
Judge86 0.453 0.685 0.762
Judge91 0.827 0.845 0.927
Judge92 0.724 0.419 0.595
Judge96 0.554 0.679 0.744
The estimate of the hedonic score, shown in Table 2.5, enables us to classify the products by order of
preference:
Fruivita refr. > Tropicana refr. > Tropicana r.t. > Pampryl refr. > Joker r.t. > Pampryl r.t.
Using the significant regression weights of Table 2.1 and the correlations given in Table 2.7, we may
conclude that the physico-chemical score is correlated negatively with the fructose, glucose, titer and
citric acid characteristics and positively with the saccharose, pH before and after centrifugation
characteristics. The sensorial score is correlated positively with odor typicity and sweetness and
negatively with acidity and bitterness.
The hedonic score related to the homogeneous group of judges is correlated positively with the
physico-chemical (.867) and sensorial scores (.961). Consequently, this group of judges likes
products with odor typicity and sweetness (Fruivita refr., Tropicana r.t., Tropicana refr.) and rejects
products with an acidic and bitter nature (Joker r.t., Pampryl refr., Pampryl r.t.). This result is
verified in Table 3.
Table 3: Sensorial characteristics of the products ranked according to the hedonic score
Product           sweetness  odor typicity  acidity  bitterness  hedonic score
_____________________________________________________________________
Fruivita refr.       3.4        2.88         2.42      1.76         1.17
Tropicana refr.      3.3        3.02         2.33      1.97         0.85
Tropicana r.t.       3.3        2.82         2.55      2.08         0.66
---------------------------------------------------------------------
Pampryl refr.        2.9        2.73         3.31      2.63        -0.74
Joker r.t.           2.8        2.59         3.05      2.56        -0.84
Pampryl r.t.         2.6        2.53         3.15      2.97        -1.10
2. Use of PLS Path modeling
For estimating the parameters of the model, we have used the module XLSTAT-PLSPM of the
XLSTAT software (XLSTAT, 2007). The variables have all been standardized. To calculate the
inner estimates of the latent variables, we have used the centroid scheme recommended by Herman
Wold (1985).
Table 4 contains the output of this modelling of the orange juice data with comments. Figure 4
includes the regression coefficients between the latent variables of the model shown in Figure 1 and
the correlation coefficients between the manifest and latent variables.
Coefficient validation
Although the various methods used on these orange juice data give robust and stable results
(the same items appear to be significant in Tenenhaus, Pagès, Ambroisine & Guinot (2005) and in the
present paper), we may think that a bootstrap validation carried out on only 6 cases cannot be very
reliable. One reason is the following: in this example, the data structure comes from the opposition
between the two groups {Fruivita refr., Tropicana r.t., Tropicana refr.} on one side and {Pampryl
refr., Joker r.t., Pampryl r.t.} on the other side. If one of these groups of products is not selected in
the bootstrap sampling, then the correlations between the latent variables disappear.
Perhaps non-representative samples should be eliminated.
The bootstrap has been based on 200 samples, and 90% confidence intervals have been requested.
Results of the bootstrap validation for the inner model are shown in Table 3.7. The confidence intervals
indicate which regression coefficients are significant. We can also look at the usual Student t statistic
related to the regression coefficients. By convention, a coefficient is significant if the absolute value
of t is larger than 2. In this specific example, both methods give the same results. The relationship
between the hedonic data and the physico-chemical data is not significant (t = 1.522), while that
between the hedonic data and the sensorial data is (t = 3.546). There is also a significant connection
between the physico-chemical and the sensorial data (t = 2.864).
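The bootstrap procedure described above can be sketched as a generic percentile bootstrap (the data and the statistic below are placeholders, not the actual orange juice scores):

```python
import numpy as np

def bootstrap_ci(data, statistic, n_boot=200, level=0.90, seed=0):
    """Percentile bootstrap confidence interval for a statistic computed
    on resampled rows of `data` (cases drawn with replacement)."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    stats = np.array([statistic(data[rng.integers(0, n, size=n)])
                      for _ in range(n_boot)])
    alpha = (1.0 - level) / 2.0
    return np.quantile(stats, [alpha, 1.0 - alpha])

# Placeholder data: one hypothetical LV score column for the 6 cases
rng = np.random.default_rng(2)
scores = rng.standard_normal((6, 1))
lo, hi = bootstrap_ci(scores, lambda d: d[:, 0].mean())
```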
Figure 4: XLSTAT-PLSPM software output for the orange juice data
However, the strong correlation between the hedonic data and the physico-chemical data suggests
that a PLS regression of the hedonic score on the physico-chemical and sensorial scores should be
carried out. This PLS regression (with one component) leads to the following equation:
Hedonic score = 0.49*(Physico-Chemical score) + 0.53*(Sensorial score)
with an R² = 0.948, to be compared with R² = 0.960 in the model shown in Figure 4.
Bootstrap validation for the PLS regression yields the same significant regression coefficients as for
OLS regression (see Table 4). If the PLS regression is validated by jackknife on the observed latent
variables, both coefficients are now significant (Table 4 and Figure 5).
Figure 5: XLSTAT-PLSPM software output: Validation of the PLS regression of the hedonic score
on the physico-chemical and sensorial scores
[Bar chart: standardized coefficients, with 95% confidence intervals, of the Physico-chemical and
Sensorial variables in the regression for the Hedonic score.]
Table 3: XLSTAT-PLSPM outputs for the orange juice example
Table 3.1: Block dimensionality
Latent variable Dimensions Critical value Eigenvalues
Physico-chemical 9 1.800 6.213
1.410
1.046
0.317
0.013
Sensorial 7 1.400 4.744
1.333
0.820
0.084
0.019
Hedonic 23 4.600 14.655
3.663
2.199
1.837
0.646
Comment: The critical value is equal to the average eigenvalue. In this example the number of
eigenvalues is equal to 5 as the number of observations (6) is smaller than the number of variables
and because the variables are centered. Each block can be considered as unidimensional.
Table 3.2: Checking block dimensionality (larger correlation per row is in bold )
Variables/Factors correlations (Physico-chemical):
F1 F2 F3 F4 F5
Glucose 0.914 0.388 -0.057 0.109 -0.013
Fructose 0.913 0.378 -0.083 0.121 -0.050
Saccharose -0.912 0.261 0.286 -0.127 0.049
Sweetening power -0.035 0.947 0.319 -0.020 0.017
pH before centrifugation -0.945 0.019 -0.026 0.325 0.006
pH after centrifugation -0.933 0.071 -0.069 0.346 -0.006
Titer 0.974 -0.144 0.070 0.150 0.062
Citric acid. 0.978 -0.136 0.049 0.144 0.052
Vitamin C 0.212 -0.328 0.916 0.080 -0.035
Variables/Factors correlations (Sensorial):
F1 F2 F3 F4 F5
Smell intensity 0.460 0.754 -0.468 0.008 0.004
Odor typicity 0.985 0.134 -0.058 0.077 0.041
Pulp 0.722 0.617 0.298 -0.096 -0.031
Taste intensity -0.650 0.429 0.626 0.005 0.048
Acidity -0.913 0.348 -0.021 0.205 -0.057
Bitterness -0.935 0.188 -0.285 -0.028 0.093
Sweetness 0.955 -0.159 0.187 0.161 0.048
Variables/Factors correlations (Hedonic):
F1 F2 F3 F4 F5
judge2 0.894 -0.218 0.307 -0.203 0.132
judge3 0.890 -0.318 -0.310 -0.018 0.104
judge6 0.798 -0.039 -0.166 0.522 -0.247
judge11 0.919 0.051 -0.177 0.278 0.212
judge12 0.814 0.221 0.213 -0.429 -0.243
judge25 0.849 0.422 -0.177 -0.166 0.205
judge30 0.625 0.399 -0.631 -0.058 -0.221
judge31 0.733 -0.565 0.321 0.179 0.090
judge35 0.925 0.035 0.329 0.179 -0.060
judge48 0.852 -0.474 0.207 -0.039 -0.074
judge52 0.948 -0.049 -0.286 0.068 -0.115
judge55 0.878 -0.398 0.131 -0.164 -0.166
judge59 0.438 0.475 0.740 0.176 -0.062
judge60 0.922 0.051 -0.149 -0.154 0.317
judge63 0.733 -0.565 0.321 0.179 0.090
judge68 0.638 0.763 -0.063 -0.072 -0.041
judge77 0.363 0.862 0.235 -0.169 0.203
judge79 0.814 0.221 0.213 -0.429 -0.243
judge84 0.928 0.352 0.115 0.032 0.012
judge86 0.778 -0.410 -0.372 -0.229 -0.187
judge91 0.927 0.043 0.069 0.364 0.043
judge92 0.585 0.348 -0.282 0.676 0.000
judge96 0.755 -0.285 -0.331 -0.430 0.231
Table 3.3: Model validation
Goodness of fit index:
GoF GoF (Bootstrap) Standard error Critical ratio (CR)
Absolute 0.731 0.732 0.049 14.943
Relative 0.823 0.801 0.048 17.146
Outer model 0.911 0.852 0.039 23.286
Inner model 0.903 0.940 0.048 18.815
Lower bound (90%)  Upper bound (90%)  Minimum  1st Quartile  Median  3rd Quartile  Maximum
Absolute 0.645 0.799 0.522 0.711 0.738 0.762 0.821
Relative 0.707 0.847 0.525 0.790 0.811 0.824 0.855
Outer model 0.784 0.893 0.707 0.816 0.863 0.865 0.911
Inner model 0.885 0.999 0.669 0.921 0.941 0.966 1.000
Comment: Number of bootstrap samples = 200. Level of the confidence intervals: 90%
Absolute Goodness-of-Fit

GoF = sqrt{ [ (1/J) Σ_j Σ_h Cor²(x_jh, ξ_j) ] × [ (1/Nb of endogenous LV) Σ_{endogenous ξ_j} R²(ξ_j; ξ_i explaining ξ_j) ] }

where J = Σ_j p_j.

Relative Goodness-of-Fit

GoF_relative = sqrt{ (1/J) Σ_j (1/λ_j) Σ_{h=1}^{p_j} Cor²(x_jh, ξ_j) } × sqrt{ (1/Nb of endogenous LV) Σ_{endogenous ξ_j} (1/ρ_j²) R²(ξ_j; ξ_i explaining ξ_j) }

where the first factor is the outer model contribution, the second the inner model contribution, and:
- λ_j is the first eigenvalue computed from the PCA of block j
- ρ_j is the first canonical correlation between the dependent block j and the
concatenation of all the blocks i explaining the dependent block j.
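A sketch of the absolute GoF computation, reading it as the square root of (average MV communality) × (average R² of the endogenous LVs); all numbers below are hypothetical:

```python
import numpy as np

def absolute_gof(communalities_by_block, r2_endogenous):
    """Absolute GoF sketch: sqrt of (mean communality over all MVs)
    times (mean R^2 over the endogenous LVs)."""
    comm = np.concatenate(communalities_by_block)
    return float(np.sqrt(comm.mean() * np.mean(r2_endogenous)))

# Hypothetical communalities for three blocks and R^2 for two endogenous LVs
comm_blocks = [np.array([0.79, 0.87, 0.90]),
               np.array([0.95, 0.86, 0.94]),
               np.array([0.77, 0.74, 0.84])]
gof = absolute_gof(comm_blocks, [0.66, 0.95])
```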
Table 3.4: Latent Variable validation
Cross-loadings (Monofactorial manifest variables):
Physico-chemical Sensorial Hedonic
Glucose -0.889 -0.584 -0.689
Fructose -0.889 -0.574 -0.689
Saccharose 0.931 0.758 0.832
Sweetening power 0.099 0.294 0.242
pH before centrifugation 0.952 0.896 0.955
pH after centrifugation 0.942 0.905 0.954
Titer -0.972 -0.738 -0.789
Citric acid. -0.977 -0.743 -0.798
Vitamin C -0.194 -0.045 -0.023
Smell intensity 0.236 0.411 0.199
Odor typicity 0.814 0.977 0.904
Pulp 0.574 0.709 0.637
Taste intensity -0.397 -0.639 -0.549
Acidity -0.751 -0.925 -0.942
Bitterness -0.784 -0.952 -0.972
Sweetness 0.877 0.968 0.982
judge2 0.646 0.925 0.880
judge3 0.654 0.755 0.860
judge6 0.665 0.667 0.787
judge11 0.873 0.785 0.916
judge12 0.729 0.930 0.834
judge25 0.972 0.817 0.879
judge30 0.750 0.524 0.648
judge31 0.349 0.690 0.689
judge35 0.777 0.936 0.926
judge48 0.470 0.835 0.815
judge52 0.801 0.843 0.938
judge55 0.517 0.876 0.847
judge59 0.533 0.593 0.479
judge60 0.872 0.853 0.924
judge63 0.349 0.690 0.689
judge68 0.910 0.673 0.695
judge77 0.727 0.474 0.432
judge79 0.729 0.930 0.834
judge84 0.957 0.935 0.953
judge86 0.467 0.685 0.742
judge91 0.831 0.846 0.925
judge92 0.724 0.424 0.602
judge96 0.559 0.677 0.731
Comment:
- Sweetening power and Vitamin C are not correlated to their block.
- Smell intensity is not correlated to its own block.
- Judges 59 and 77 are weakly correlated to their block.
Table 3.5: Latent variable weights (non-significant weights are in bold)

Latent variable  Manifest variables  Outer weight  Outer weight (Bootstrap)  Standard error  Critical ratio (CR)  Lower bound (90%)  Upper bound (90%)
Glucose -0.124 -0.113 0.050 -2.491 -0.147 -0.057
Fructose -0.123 -0.111 0.049 -2.498 -0.144 -0.057
Saccharose 0.154 0.140 0.041 3.720 0.096 0.180
Sweetening power 0.052 0.038 0.093 0.560 -0.115 0.151
pH before centrifugation 0.180 0.159 0.021 8.460 0.121 0.184
pH after centrifugation 0.180 0.159 0.020 9.073 0.121 0.185
Titer -0.148 -0.137 0.017 -8.808 -0.169 -0.111
Citric acid. -0.150 -0.139 0.016 -9.127 -0.171 -0.113
Physico-chemical
Vitamin C -0.007 -0.010 0.085 -0.077 -0.126 0.130
Smell intensity 0.052 0.033 0.099 0.527 -0.136 0.166
Odor typicity 0.206 0.190 0.039 5.255 0.143 0.241
Pulp 0.145 0.114 0.082 1.759 -0.065 0.207
Taste intensity -0.113 -0.107 0.080 -1.406 -0.196 0.069
Acidity -0.203 -0.190 0.045 -4.472 -0.227 -0.143
Bitterness -0.210 -0.197 0.034 -6.136 -0.240 -0.143
Sensorial
Sweetness 0.223 0.208 0.036 6.148 0.143 0.267
judge2 0.059 0.056 0.012 4.740 0.043 0.064
judge3 0.053 0.051 0.015 3.435 0.034 0.065
judge6 0.050 0.047 0.018 2.749 0.023 0.071
judge11 0.062 0.059 0.019 3.268 0.048 0.080
judge12 0.062 0.058 0.013 4.605 0.037 0.083
judge25 0.067 0.063 0.013 5.247 0.051 0.078
judge30 0.048 0.040 0.020 2.364 0.000 0.065
judge31 0.039 0.036 0.023 1.695 0.000 0.066
judge35 0.064 0.062 0.012 5.530 0.050 0.075
judge48 0.049 0.047 0.016 3.047 0.020 0.064
judge52 0.062 0.061 0.012 5.320 0.049 0.082
judge55 0.052 0.049 0.014 3.717 0.029 0.061
judge59 0.042 0.036 0.026 1.642 -0.021 0.066
judge60 0.065 0.060 0.016 4.065 0.047 0.074
judge63 0.039 0.036 0.023 1.695 0.000 0.066
judge68 0.059 0.051 0.018 3.305 0.000 0.072
judge77 0.045 0.037 0.027 1.649 -0.021 0.065
judge79 0.062 0.058 0.013 4.605 0.037 0.083
judge84 0.071 0.066 0.013 5.420 0.055 0.084
judge86 0.043 0.039 0.020 2.168 -0.007 0.057
judge91 0.063 0.059 0.018 3.499 0.050 0.074
judge92 0.043 0.038 0.028 1.554 -0.007 0.081
Hedonic
judge96 0.046 0.044 0.020 2.345 0.013 0.066
Comment:
- Sweetening power and Vitamin C are not significant in the physico-chemical block.
- Smell intensity, Pulp and Taste intensity are not significant in the sensorial block.
- Judges 59, 77, 86 and 92 are not significant in the hedonic block.
Table 3.6: Correlations between MV and LV
Correlations:
Latent variable  Manifest variables  Standardized loadings  Communalities  Redundancies  Standardized loadings (Bootstrap)  Standard error
Glucose -0.889 0.790 -0.850 0.226
Fructose -0.889 0.790 -0.847 0.227
Saccharose 0.931 0.867 0.876 0.268
Sweetening power 0.099 0.010 0.101 0.591
pH before centrifugation 0.952 0.906 0.968 0.078
pH after centrifugation 0.942 0.887 0.964 0.073
Titer -0.972 0.946 -0.950 0.066
Citric acid. -0.977 0.954 -0.956 0.063
Physico-chemical
Vitamin C -0.194 0.038 -0.203 0.435
Smell intensity 0.411 0.169 0.113 0.285 0.497
Odor typicity 0.977 0.954 0.641 0.940 0.110
Pulp 0.709 0.503 0.338 0.612 0.382
Taste intensity -0.639 0.408 0.274 -0.589 0.404
Acidity -0.925 0.856 0.575 -0.915 0.206
Bitterness -0.952 0.907 0.609 -0.949 0.087
Sensorial
Sweetness 0.968 0.936 0.629 0.967 0.069
judge2 0.880 0.774 0.743 0.859 0.182
judge3 0.860 0.740 0.710 0.828 0.236
judge6 0.787 0.619 0.594 0.736 0.254
judge11 0.916 0.840 0.806 0.885 0.230
judge12 0.834 0.695 0.667 0.852 0.115
judge25 0.879 0.773 0.742 0.896 0.159
judge30 0.648 0.419 0.403 0.570 0.284
judge31 0.689 0.475 0.456 0.619 0.340
judge35 0.926 0.858 0.824 0.923 0.147
judge48 0.815 0.664 0.638 0.785 0.233
judge52 0.938 0.879 0.844 0.936 0.130
judge55 0.847 0.717 0.688 0.809 0.215
judge59 0.479 0.230 0.221 0.447 0.406
judge60 0.924 0.853 0.820 0.893 0.214
judge63 0.689 0.475 0.456 0.619 0.340
judge68 0.695 0.483 0.464 0.685 0.258
judge77 0.432 0.186 0.179 0.426 0.404 judge79 0.834 0.695 0.667 0.852 0.115
judge84 0.953 0.909 0.873 0.943 0.137
judge86 0.742 0.551 0.529 0.670 0.308
judge91 0.925 0.856 0.822 0.895 0.215
judge92 0.602 0.363 0.348 0.531 0.411
Hedonic
judge96 0.731 0.535 0.514 0.709 0.281
Comments:
- Standardized loading = correlation
- Communality = squared correlation
- Redundancy = Communality * R²(Dep. LV; Explanatory related LVs)
Table 3.6: Correlations between MV and LV (continued )
Correlations:

Manifest variables: Critical ratio (CR) | Lower bound (90%) | Upper bound (90%)

Physico-chemical
Glucose -3.938 -0.995 -0.569
Fructose -3.913 -0.998 -0.589
Saccharose 3.476 0.729 0.996
Sweetening power 0.167 -0.860 0.950
pH before centrifugation 12.199 0.925 1.000
pH after centrifugation 12.966 0.899 1.000
Titer -14.729 -1.000 -0.860
Citric acid -15.413 -1.000 -0.872
Vitamin C -0.445 -0.970 0.656

Sensorial
Smell intensity 0.827 -0.684 0.930
Odor typicity 8.912 0.763 0.999
Pulp 1.856 -0.174 0.998
Taste intensity -1.580 -0.998 0.203
Acidity -4.495 -1.000 -0.752
Bitterness -10.967 -1.000 -0.897
Sweetness 13.953 0.940 1.000

Hedonic
judge2 4.832 0.639 0.981
judge3 3.642 0.469 0.999
judge6 3.102 0.347 0.982
judge11 3.993 0.773 0.994
judge12 7.258 0.648 0.999
judge25 5.541 0.742 0.997
judge30 2.277 0.000 0.936
judge31 2.030 0.000 0.991
judge35 6.293 0.825 0.997
judge48 3.493 0.408 0.997
judge52 7.234 0.858 0.998
judge55 3.942 0.425 0.996
judge59 1.179 -0.426 0.948
judge60 4.315 0.716 0.997
judge63 2.030 0.000 0.991
judge68 2.694 0.000 0.997
judge77 1.069 -0.426 0.896
judge79 7.258 0.648 0.999
judge84 6.934 0.911 0.998
judge86 2.411 -0.093 0.992
judge91 4.301 0.783 0.986
judge92 1.465 -0.192 0.982
judge96 2.602 0.173 0.997
Comments (identical with those for the weights):
- Sweetening power and Vitamin C are not significant in block Physico-chemical.
- Smell intensity, Pulp and Taste intensity are not significant in block Sensorial.
- Judges 59, 77, 86 and 92 are not significant in block Hedonic.
Table 3.7: Inner model
R² (Sensorial):

R² | R² (Bootstrap) | Standard error | Critical ratio (CR) | Lower bound (90%) | Upper bound (90%)
0.672 0.791 0.157 4.276 0.588 0.999
Path coefficients (Sensorial):

Latent variable | Value | Standard error | t | Pr > |t| | Value (Bootstrap)
Physico-chemical 0.820 0.286 2.864 0.046 0.835

Latent variable | Standard error (Bootstrap) | Critical ratio (CR) | Lower bound (90%) | Upper bound (90%)
Physico-chemical 0.308 2.660 0.757 0.994
Comment:
The usual Student t test and the bootstrap approach give the same results here.
R² (Hedonic):

R² | R² (Bootstrap) | Standard error | Critical ratio (CR) | Lower bound (90%) | Upper bound (90%)
0.960 0.986 0.017 58.017 0.947 1.000
Path coefficients (Hedonic):

Latent variable | Value | Standard error | t | Pr > |t|
Physico-chemical 0.306 0.201 1.522 0.225
Sensorial 0.713 0.201 3.546 0.038

Path coefficients (Hedonic):

Latent variable | Value (Bootstrap) | Standard error (Bootstrap) | Critical ratio (CR) | Lower bound (90%) | Upper bound (90%)
Physico-chemical 0.331 0.698 0.438 -0.642 1.000
Sensorial 0.651 0.674 1.058 0.000 1.397
Comment:
The usual Student t test and the bootstrap approach give the same results here. But the non-significance of the Physico-chemical path may also be due to a multicollinearity problem. PLS regression can be used for estimating the structural regression equations; it is presented in Table 4.
Table 3.8: Impact and contribution of the variables to Hedonic
Impact and contribution of the variables to Hedonic:

 | Sensorial | Physico-chemical
Correlation 0.964 0.891
Path coefficient 0.713 0.306
Correlation * path coefficient 0.688 0.273
Contribution to R² (%) 71.612 28.388
Cumulative % 71.612 100.000
Figure: Impact and contribution of the variables to Hedonic (bar chart of the path coefficients, left axis, and of the cumulative contribution to R² in %, right axis, for Sensorial and Physico-chemical).
Comments:
- R²(Y; X_1, ..., X_k) = Σ_j Cor(Y, X_j) * β̂_j
- When all the terms Cor(Y, X_j) * β̂_j are positive, it makes sense to compute the relative contribution of each explanatory variable X_j to the R square.
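This decomposition can be checked on the Hedonic block with the figures of Tables 3.7 and 3.8 (a quick sketch; the numbers are simply copied from the tables above):

```python
# Decomposition R² = Σ_j Cor(Y, X_j) * beta_j for the Hedonic block,
# using the correlations and path coefficients of Tables 3.7 and 3.8
cors  = {"Sensorial": 0.964, "Physico-chemical": 0.891}
betas = {"Sensorial": 0.713, "Physico-chemical": 0.306}

terms = {k: cors[k] * betas[k] for k in cors}          # both terms are positive
r2 = sum(terms.values())                               # recovers R² = 0.960
shares = {k: 100 * v / r2 for k, v in terms.items()}   # contributions to R² (%)
```

Since both terms are positive, the relative contributions (about 71.6% and 28.4%) are meaningful and match Table 3.8 up to rounding.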
Table 3.9: Model assessment
Latent variable | Type | Mean | R² | Adjusted R² | Weighted mean communalities (AVE) | Mean redundancies
Physico-chemical Exogenous 0.000 0.687 (AVE; no R² or redundancies for an exogenous LV)
Sensorial Endogenous 0.000 0.672 0.672 0.676 0.454
Hedonic Endogenous 0.000 0.960 0.950 0.634 0.609
Mean 0.816 0.654 0.532 (mean R², mean AVE, mean redundancies)
Comments:
- The weighted mean takes into account the number of MVs in each block.
- (Absolute GoF)² = (Mean R²) * (Weighted mean communalities)
Table 3.10: Correlation between the latent variables
Correlations (Latent variable):
Physico-chemical Sensorial Hedonic
Physico-chemical 1.000 0.820 0.891
Sensorial 0.820 1.000 0.964
Hedonic 0.891 0.964 1.000
Table 3.11: Direct, indirect and total effects
Direct effects (Latent variable):
Physico-chemical Sensorial Hedonic
Physico-chemical
Sensorial 0.820
Hedonic 0.306 0.713
Comment :
- Sensorial = .820*Physico-chemical
- Hedonic = .306*Physico-chemical + .713*Sensorial
Table 3.11: Direct, indirect and total effects (continued )
Indirect effects (Latent variable):

Physico-chemical Sensorial Hedonic
Physico-chemical
Sensorial 0.000
Hedonic 0.585 0.000
Comment:
- Hedonic = .306*Physico-chemical + .713*.820*Physico-chemical
- Indirect effect of Physico-chemical on Hedonic = .713*.820 = 0.585
Total effects (Latent variable):
Physico-chemical Sensorial Hedonic
Physico-chemical
Sensorial 0.820
Hedonic 0.891 0.713
Comment :
Hedonic = .306*Physico-chemical + .713*.820*Physico-chemical
= .891*Physico-chemical
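The additivity of the effects can be verified numerically from the path coefficients above (a small sketch; the values are copied from Table 3.11):

```python
# Effects of Physico-chemical on Hedonic (values from Table 3.11)
direct   = 0.306                # direct path: Hedonic <- Physico-chemical
indirect = 0.820 * 0.713        # through Sensorial: .820 * .713 = 0.585
total    = direct + indirect    # recovers the total effect 0.891
```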
Table 3.12: Discriminant validity
Discriminant validity (Squared correlations < AVE) :
Physico-chemical Sensorial Hedonic
Physico-chemical 1 0.672 0.793
Sensorial 0.672 1 0.930
Hedonic 0.793 0.930 1
Mean Communalities (AVE) 0.687 0.676 0.634
Comment: Because of the non-significant MVs, the AVE is too small for the three LVs: the squared correlations between the LVs exceed the AVEs, so the discriminant validity criterion is not met.
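The failure of the criterion can be checked directly from the figures of Table 3.12 (a sketch; `valid` flags whether each LV's AVE exceeds its squared correlations with the other LVs):

```python
import numpy as np

# Squared correlations between LVs and the AVEs of Table 3.12
sq_corr = np.array([[1.000, 0.672, 0.793],
                    [0.672, 1.000, 0.930],
                    [0.793, 0.930, 1.000]])
ave = np.array([0.687, 0.676, 0.634])   # Physico-chemical, Sensorial, Hedonic

# Fornell-Larcker criterion: each AVE should exceed the LV's squared
# correlations with the other LVs; here it fails for all three LVs
valid = [bool((ave[i] > np.delete(sq_corr[i], i)).all()) for i in range(3)]
```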
Table 3.13: Latent variable score
Summary statistics / Latent variable scores:

Variable | Observations | Minimum | Maximum | Mean | Std. deviation
Physico-chemical 6 -1.680 1.120 0.000 1.000
Sensorial 6 -1.381 1.378 0.000 1.000
Hedonic 6 -1.203 1.253 0.000 1.000
Latent variable scores :
Physico-chemical Sensorial Hedonic
pampryl r. t. -0.810 -1.381 -1.203
tropicana r. t. 1.120 0.462 0.742
fruvita refr. 0.917 0.964 1.253
joker r. t. -1.680 -0.852 -0.991
tropicana refr. 0.630 1.378 0.946
pampryl refr. -0.176 -0.570 -0.747
Table 4: PLS regression of Hedonic score on Physico-chemical and sensorial scores
Goodness of fit statistics (Variable Hedonic):
R² 0.948
Bootstrap validation
Path coefficients (Hedonic):

Latent variable | Value | Value (Bootstrap) | Standard error (Bootstrap) | Critical ratio (CR) | Lower bound (90%) | Upper bound (90%)
Physico-chemical 0.490 0.267 0.408 1.201 -0.422 0.893
Sensorial 0.531 0.744 0.402 1.320 0.103 1.397
Jack-knife validation on the observed latent variables
Standardized coefficients (Variable Hedonic):

Variable | Coefficient | Std. deviation | Lower bound (95%) | Upper bound (95%)
Physico-chemical 0.490 0.021 0.449 0.531
Sensorial 0.531 0.022 0.488 0.573
3. Comparison between PLS, ULS-SEM and PCA
Comparison between weights
When we compare the weight confidence intervals computed with PLS (Table 3.5) with those coming from ULS-SEM (Table 2.1), we find that both methods yield the same non-significant weights, with only one exception: Judge 86 (non-significant for PLS and significant for ULS-SEM). These weights are compared in Figure 6.
Figure 6: Comparison between the PLS and ULS-SEM weights
Comparison between PLS and ULS-SEM scores
The scores coming from PLS and ULS-SEM are compared in Figure 7. They are highly correlated.
This confirms our previous findings and a general remark of Noonan and Wold (1982) on the fact
that the final outer LV estimates depend very little on the selected scheme of calculation of the inner
LV estimates.
Figure 7: Comparison between the PLS and ULS-SEM scores
Comparison between the PLS and ULS-SEM scores and the block principal components
The correlations of the PLS and ULS-SEM scores with the block principal components are given in Table 5.
Table 5: Correlation between the PLS and ULS-SEM scores and the block principal components
ULS-SEM scores PLS scores
Physico-chemical 1st PC .999 .997
Sensorial 1st PC .998 .998
Hedonic 1st PC .999 .997
We may conclude that ULS-SEM, PLS and principal component analysis give practically the same scores on this orange juice example.
II. Exploratory factor analysis, ULS-SEM and PCA
If the structural model is limited to one standardized latent variable (or common factor) ξ described by a vector x composed of p centred manifest variables, one gets the decomposition

(20) x = λ_x ξ + δ
It is usual to add the following hypotheses:

(21) E(δ) = 0, Cov(δ) = E(δδ') = Θ_δ is diagonal, Cov(ξ, δ) = 0
Under these hypotheses, the covariance matrix Σ of the random vector x is written as

(22) Σ = E(xx') = λ_x λ_x' + Θ_δ
The parameters λ_x and Θ_δ in model (22) can now be estimated using the ULS method. This means searching for the parameters λ̂_x and Θ̂_δ minimizing the criterion

(23) ‖S − (λ̂_x λ̂_x' + Θ̂_δ)‖²
where S is the matrix of empirical covariances. To remove the indetermination on the global sign of the vector λ̂_x (if λ̂_x is a solution, then −λ̂_x is also a solution), the solution can be chosen to make the sum of the coordinates positive. This is the option chosen in AMOS 6.0.
The advantage of the ULS method over the other more frequently used GLS (Generalized Least
Squares) or ML (Maximum Likelihood) methods lies in its ability to function with a singular
covariance matrix S, particularly in situations where the number of observations is less than the
number of variables.
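As a rough illustration of how the ULS criterion (23) can be minimised without inverting S, here is an iterated principal-factor sketch in Python. This is not the algorithm used by AMOS, and the function name `uls_one_factor` is ours; it alternates a rank-1 eigen-fit of S − Θ̂ with a diagonal update of Θ̂:

```python
import numpy as np

def uls_one_factor(S, n_iter=1000):
    """Iterated principal-factor fit of the one-factor model
    S ~ lam lam' + diag(theta), aimed at the ULS criterion (23)
    ||S - (lam lam' + Theta)||^2 with Theta diagonal."""
    theta = np.zeros(len(S))                      # start from Theta = 0
    for _ in range(n_iter):
        vals, vecs = np.linalg.eigh(S - np.diag(theta))
        lam = np.sqrt(max(vals[-1], 0.0)) * vecs[:, -1]   # rank-1 eigen-fit
        if lam.sum() < 0:
            lam = -lam                            # sign convention: positive sum
        theta = np.clip(np.diag(S - np.outer(lam, lam)), 0.0, None)
    return lam, theta

# Sanity check on a covariance matrix built from known parameters
true_lam = np.array([0.9, 0.8, 0.7, 0.6])
true_theta = 1.0 - true_lam**2                    # standardized MVs
S = np.outer(true_lam, true_lam) + np.diag(true_theta)
lam_hat, theta_hat = uls_one_factor(S)
```

On a covariance matrix that satisfies the one-factor model exactly, the iteration recovers the generating loadings and specificities; note that it runs even when S is singular, since no matrix inversion is involved.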
The quality of the fit is measured by the GFI written here as

(24) GFI = 1 − ‖S − (λ̂_x λ̂_x' + Θ̂_δ)‖² / ‖S‖²
Principal component analysis (PCA) is found again if one imposes the additional condition

(25) Θ̂_δ = 0

In this case, one seeks to minimise the criterion

(26) ‖S − λ̂_x λ̂_x'‖²

The vector λ̂_x is now equal to √λ₁ u₁, where u₁ is the normed eigenvector of the covariance matrix S associated with the largest eigenvalue λ₁.
For each MV x_j, the explained variance (or communality) is therefore σ̂_jj = λ₁ u²_1j. The residual variance (or specificity) θ_j is then estimated by θ̂_j = s_jj − λ₁ u²_1j.
The quality of the fit can still be measured by the GFI:

(27) GFI = 1 − (‖S − λ̂_x λ̂_x'‖² − Σ_j (s_jj − σ̂_jj)²) / ‖S‖²
The square of the norm of S is equal to the sum of the squared eigenvalues λ_h of S. In PCA, ‖S − λ̂_x λ̂_x'‖² is equal to the sum of the squares of the p−1 last eigenvalues of S. Consequently, in PCA one obtains

(28) GFI = (λ₁² + Σ_j (s_jj − λ₁ u²_1j)²) / Σ_{h=1}^p λ_h²
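The PCA solution and the modified GFI of (28) take only a few lines to compute (a sketch; `pca_uls` is our name, not a library function):

```python
import numpy as np

def pca_uls(S):
    """Rank-1 ULS fit with the PCA constraint Theta = 0 (criterion (26)):
    lam = sqrt(lambda_1) * u_1, plus the modified GFI of equation (28)."""
    vals, vecs = np.linalg.eigh(S)                 # eigenvalues in ascending order
    l1, u1 = vals[-1], vecs[:, -1]
    if u1.sum() < 0:
        u1 = -u1                                   # sign convention: positive sum
    lam = np.sqrt(l1) * u1
    spec = np.diag(S) - lam**2                     # specificities s_jj - l1*u1j^2
    gfi = (l1**2 + np.sum(spec**2)) / np.sum(vals**2)
    return lam, spec, gfi

# Toy 3x3 correlation matrix with equal off-diagonal correlations 0.5:
# eigenvalues are 2, 0.5, 0.5, so lam_j = sqrt(2/3) and spec_j = 1/3
S = np.full((3, 3), 0.5) + 0.5 * np.eye(3)
lam, spec, gfi = pca_uls(S)
```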
Moreover, SEM software packages allow the computation of confidence intervals for the parameters by bootstrapping. They also allow criterion (26) to be minimised while imposing value constraints or equality constraints on the coordinates of the vector λ̂_x. We can continue to use criterion (27) to measure the quality of the model.
Link between ULS-SEM, Factor Analysis, PLS and Principal Component Analysis
A central point in PLS path modelling concerns the relation between the MV’s related to one LV and
this LV.
Reflective mode
The reflective mode is common to PLS and SEM. In this mode, each MV is related to its LV by a
simple regression:
(29) x_j = λ_j ξ + δ_j
This model corresponds to the usual one-dimension factor analysis (FA) model. Minimization of criterion (23) allows the estimation of the parameters of this model. As the diagonal terms of the residual matrix S − (λ̂_x λ̂_x' + Θ̂_δ) are automatically null, the path coefficients λ_j are computed with the objective of reconstructing the off-diagonal terms of the covariance matrix. The average variance extracted (AVE), defined by Σ_j σ̂_jj / Σ_j s_jj, measures the summary power of the LV. It is not the first objective in this approach; it is an a posteriori value of the model.
In a one-block situation, it is natural to estimate the LV ξ using the first principal component of the MVs. The minimization of criterion (26) yields this solution. Furthermore, the diagonal terms of the residual S − λ̂_x λ̂_x' are now taken into account in the minimization: the path coefficients λ_j are computed with the objective of reconstructing the whole covariance matrix, diagonal included. The AVE still measures the summary power of the LV, but it is now part of the objective. Consequently, in the ULS-SEM context, PCA can be obtained by considering the FA model (22) and cancelling, in a first step, the residual measurement variances.
In PLS path modelling software, the one-block situation has been implemented. In this situation, the outer estimate of the block LV is also taken as the inner estimate. Therefore, Mode A yields the following equation:

(30) ξ̂ ∝ Σ_j Cov(x_j, ξ̂) x_j

The PLS algorithm will converge to the first principal component of the block of MVs, solution of equation (30).
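The fixed-point iteration behind equation (30) can be sketched as a power-type algorithm on the data matrix (an illustrative implementation, not the exact code of any PLS package):

```python
import numpy as np

def pls_one_block(X, n_iter=200):
    """One-block PLS, Mode A: alternate outer weights w_j = cov(x_j, xi)
    and outer estimate xi = X w, standardizing xi at each step.  The
    fixed point of equation (30) is the first principal component."""
    X = X - X.mean(axis=0)                        # centred MVs
    xi = X[:, 0].copy()                           # arbitrary starting LV estimate
    for _ in range(n_iter):
        w = X.T @ xi / len(X)                     # covariances of the MVs with xi
        xi = X @ w
        xi = xi / xi.std()                        # standardized LV estimate
    return w, xi

# Compare with the first principal component on simulated data
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
X[:, 1] += X[:, 0]                                # create a dominant first component
w, xi = pls_one_block(X)
Xc = X - X.mean(axis=0)
_, vecs = np.linalg.eigh(np.cov(Xc.T))
pc1 = Xc @ vecs[:, -1]                            # first principal component scores
```

Up to sign, the converged LV estimate and the first principal component coincide, which is the Noonan-Wold observation that the PLS outer estimates hardly depend on the inner scheme in this degenerate one-block case.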
Formative mode

The formative mode is easy to implement in PLS. In this mode, each LV is related to its MVs by a multiple regression:

(31) ξ = Σ_j β_j x_j + δ

But in a one-block situation, it is an indeterminate problem.
Conclusion
The residual sum of squares (RESS), defined by Σ_{i<j} (s_ij − σ̂_ij)², is smaller for FA than for PCA. On the other hand, the AVE is larger for PCA than for FA.
Example 2
We use data on the cubic capacity, power output, speed, weight, width and length of 24 car models in production in 2004, given in Tenenhaus (2007). We compare FA and PCA on these data with respect to the RESS and AVE criteria. The analyses are carried out on standardized variables. The correlation matrix is given in Table 6.
Table 6: Car example: Correlation matrix
Capacity Power Speed Weight Width Length
Capacity 1 0.954 0.885 0.692 0.706 0.664
Power 1 0.934 0.529 0.730 0.527
Speed 1 0.466 0.619 0.578
Weight 1 0.477 0.795
Width 1 0.591
Length 1
The path models for one-dimension FA and PCA are given in Figure 8. The common factor is denoted F1. The implied covariance matrices and the residual matrices produced by AMOS are given in Table 7.
Figure 8: Path models for FA and PCA (AMOS output). In the FA model the loadings of F1 on Capacity, Power, Speed, Weight, Width and Length are .99, .93, .87, .68, .74 and .73, with error variances .02, .14, .25, .53, .45 and .47; in the PCA model the loadings are .96, .92, .89, .76, .80 and .80, with all error variances set to .00.
Table 7: Car example: Implied covariance matrices and residuals produced by AMOS
Implied correlations (FA):
Capacity Power Speed Weight Width Length
Capacity 1 .918 .860 .678 .737 .722
Power 1 .804 .633 .689 .674
Speed 1 .593 .645 .632
Weight 1 .508 .498
Width 1 .541
Length 1

Residuals (FA):
Capacity Power Speed Weight Width Length
Capacity 0 0.036 0.025 0.014 -0.031 -0.058
Power 0 0.130 -0.104 0.041 -0.147
Speed 0 -0.127 -0.026 -0.054
Weight 0 -0.031 0.297
Width 0 0.050
Length 0

Implied correlations (PCA):
Capacity Power Speed Weight Width Length
Capacity .926 .889 .853 .738 .771 .765
Power .853 .818 .699 .740 .734
Speed .785 .671 .710 .705
Weight .573 .606 .602
Width .642 .637
Length .632

Residuals (PCA):
Capacity Power Speed Weight Width Length
Capacity 0.074 0.065 0.032 -0.046 -0.065 -0.101
Power 0.147 0.116 -0.170 -0.010 -0.207
Speed 0.215 -0.205 -0.091 -0.127
Weight 0.427 -0.129 0.193
Width 0.358 -0.046
Length 0.368
The comparison between FA and PCA results is shown in Table 8.
Table 8: Comparison between FA and PCA approaches
RESS AVE GFI
FA .169 .690 .983
PCA .230 .735 .978
For PCA, the GFI produced by AMOS has to be modified according to formula (27). The usual PCA of the standardized data results in the following eigenvalues: 4.4113, .8534, .4357, .2359, .0514 and .0124. The quality of the approximation of S by λ₁ u₁ u₁' + Θ̂_δ is therefore measured by the following value of the GFI:

(32) GFI = (λ₁² + Σ_j (s_jj − λ₁ u²_1j)²) / Σ_h λ_h² = (19.459 + .519) / 20.436 = .978
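As a check, the eigenvalues and the modified GFI of (32) can be recomputed from the printed correlation matrix of Table 6 (a sketch; tiny discrepancies may come from the rounded correlations):

```python
import numpy as np

# Correlation matrix of the six car variables (Table 6), symmetrized
r = np.array([
    [1.000, 0.954, 0.885, 0.692, 0.706, 0.664],
    [0.954, 1.000, 0.934, 0.529, 0.730, 0.527],
    [0.885, 0.934, 1.000, 0.466, 0.619, 0.578],
    [0.692, 0.529, 0.466, 1.000, 0.477, 0.795],
    [0.706, 0.730, 0.619, 0.477, 1.000, 0.591],
    [0.664, 0.527, 0.578, 0.795, 0.591, 1.000]])

vals, vecs = np.linalg.eigh(r)             # eigenvalues in ascending order
l1, u1 = vals[-1], vecs[:, -1]             # largest eigenvalue ~ 4.4113
spec = 1.0 - l1 * u1**2                    # diagonal residuals s_jj - l1*u1j^2
gfi = (l1**2 + np.sum(spec**2)) / np.sum(vals**2)   # modified GFI, ~ .978
```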
We then used AMOS 6.0 to carry out a first-order PCA of these standardized data under the
hypothesis of equality of weights for the engine variables "cubic capacity, power, speed" and
similarly equality of weight for the passenger compartment variables "weight, width, length".
Figure 9 shows the results of this estimation and Table 9 the 90% bootstrap confidence intervals.
The bootstrap intervals contain values greater than 1 because the bootstrap samples no longer consist
of standardized variables.
Figure 9: PCA under constraints on the "Auto 2004" data (AMOS 6.0 output). The loadings of F1 are .92 for the engine variables (Capacity, Power, Speed) and .78 for the passenger compartment variables (Weight, Width, Length), with all error variances set to .00.
Table 9: PCA under constraints for the "Auto 2004" data ( AMOS 6.0 output )
Estimation and bootstrap confidence interval for the coordinates of λ̂_x

Parameter Estimate Inf (90%) Sup (90%)
Capacity - F1 .924 .542 1.195
Power - F1 .924 .542 1.195
Speed - F1 .924 .542 1.195
Weight - F1 .784 .555 1.003
Width - F1 .784 .555 1.003
Length - F1 .784 .555 1.003
The GFI for the model with constraints has the following value provided by AMOS:

(33) GFI* = 1 − ‖S − λ̂_x λ̂_x'‖² / ‖S‖² = .9505
Using the modified formula yields:

(34) GFI = GFI* + Σ_j (s_jj − λ̂²_xj)² / ‖S‖² = .9505 + .509/20.436 = .975

The very slight reduction of the GFI (.975 vs .978) means that one can accept the model with constraints.
In this example, we obtain the component ξ̂ as the "McDonald" estimation of the factor ξ, calculated as follows:

ξ̂ ∝ .924 (capacity* + power* + speed*) + .784 (weight* + length* + width*)

where the asterisk means that the variable is standardized.
III. Confirmatory factor analysis, ULS-SEM and analysis of multi-block data
We assume now that the random column vector x breaks down into J blocks of random vectors x_j = (x_j1, ..., x_jp_j)'. A specific model with one standardized latent variable (and the usual hypotheses) is constructed for each block x_j:

(35) x_j = λ_j ξ_j + δ_j, j = 1, ..., J
This model is similar to model (4) with Λ_x = ⊕_{j=1}^J λ_j. For each block j we have

(36) Σ_xj = λ_j λ_j' + Θ_δj
and for two blocks j and k we get

(37) Σ_{xj xk} = φ_jk λ_j λ_k'

where φ_jk = Cor(ξ_j, ξ_k).
Decomposition (7) thus becomes

(38) Σ = (⊕_j λ_j) Φ (⊕_j λ_j)' + Θ_δ
The parameters λ₁, ..., λ_J, Φ and Θ_δ in model (38) can now be estimated by using the ULS method. This means seeking the parameters λ̂₁, ..., λ̂_J, Φ̂ and Θ̂_δ minimizing the criterion

(39) ‖S − [(⊕_j λ̂_j) Φ̂ (⊕_j λ̂_j)' + Θ̂_δ]‖²
Adding constraint (25) gives a new criterion to be minimized:

(40) ‖S − (⊕_j λ̂_j) Φ̂ (⊕_j λ̂_j)'‖²
This results in a new factorisation of the covariance matrix allowing an estimation to be made of
both the loadings and also the correlations between the factors. The quality of the fit is still
measured by the GFI criterion.
Example 3
In this example we are going to study data about wine tasting described in detail in Pagès, Asselin,
Morlat & Robichet (1987).
Description of the data
A collection of 21 red wines of the Bourgueil, Chinon and Saumur appellations is described by a set of 27 taste variables divided into 4 blocks:
X1 = Smell at rest
Rest1 = smell intensity at rest, Rest2 = aromatic quality at rest, Rest3 = fruity note at rest, Rest4 = floral note at rest, Rest5 = spicy note at rest

X2 = View
View1 = visual intensity, View2 = shading (from orange to purple), View3 = surface impression

X3 = Smell after shaking
Shaking1 = smell intensity, Shaking2 = smell quality, Shaking3 = fruity note, Shaking4 = floral note, Shaking5 = spicy note, Shaking6 = vegetable note, Shaking7 = phenolic note, Shaking8 = aromatic intensity in mouth, Shaking9 = aromatic persistence in mouth, Shaking10 = aromatic quality in mouth

X4 = Tasting
Tasting1 = intensity of attack, Tasting2 = acidity, Tasting3 = astringency, Tasting4 = alcohol, Tasting5 = balance (acidity, astringency, alcohol), Tasting6 = mellowness, Tasting7 = bitterness, Tasting8 = ending intensity in mouth, Tasting9 = harmony
These data have already been analysed using PLS in Tenenhaus & Esposito Vinzi (2005) and in
Tenenhaus & Hanafi (2007). We present here the ULS-SEM solution on the standardized variables
with cancellation of the residual measurement variances. First of all, we present the PCA for each
separate block in Table 10.
Table 10: Principal component analysis of each block for the "Wine" data
Smell at rest: Component 1 | Component 2
Smell intensity at rest .741 .551
Aromatic quality at rest .915 -.144
Fruity note at rest .854 -.191
Floral note at rest .345 -.537
Spicy note at rest .077 .933

View: Component 1 | Component 2
Visual intensity .986 -.146
Shading (from orange to purple) .983 -.163
Surface impression .947 .320

Smell after shaking: Component 1 | Component 2
Smell intensity .472 .743
Smell quality .881 -.180
Fruity note .819 -.176
Floral note .328 -.500
Spicy note .089 .746
Vegetable note -.635 .593
Phenolic note .370 .633
Aromatic intensity in mouth .895 .277
Aromatic persistence in mouth .888 .307
Aromatic quality in mouth .882 -.372

Tasting: Component 1 | Component 2
Intensity of attack .937 .082
Acidity -.257 .691
Astringency .775 .427
Alcohol .774 .378
Balance (acidity, astringency, alcohol) .844 -.423
Mellowness .901 -.380
Bitterness .377 .760
Ending intensity in mouth .967 .117
Harmony .958 -.233
Use of ULS-SEM for the analysis of multi-block data
All the variables are standardized: S = R. The correlation matrix R is now approximated using criterion (40), with the aid of the following factorisation formula:

R ≈ (⊕_{j=1}^4 λ_j) Φ (⊕_{j=1}^4 λ_j)'

where Φ = (φ_jk) is the correlation matrix of the four factors (with φ_jj = 1). The diagonal blocks of the approximation are λ_j λ_j' and the off-diagonal blocks are φ_jk λ_j λ_k' (j ≠ k).
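The block-structured implied matrix of criterion (40) can be assembled as follows (a sketch; `implied_corr` is our helper name, not an AMOS or PLS routine):

```python
import numpy as np

def implied_corr(lams, Phi):
    """Implied correlation matrix (oplus_j lam_j) Phi (oplus_j lam_j)' of
    criterion (40).  Block (j, k) of the result is phi_jk * lam_j lam_k'
    (lam_j lam_j' when j = k)."""
    sizes = [len(l) for l in lams]
    L = np.zeros((sum(sizes), len(lams)))      # block-diagonal loading matrix
    row = 0
    for j, l in enumerate(lams):
        L[row:row + sizes[j], j] = l
        row += sizes[j]
    return L @ np.asarray(Phi) @ L.T

# Two toy blocks with factor correlation phi_12 = 0.5
Sigma = implied_corr([np.array([0.9, 0.8]), np.array([0.7, 0.6])],
                     [[1.0, 0.5], [0.5, 1.0]])
```

Within block 1 the implied correlation of the two MVs is .9 * .8 = .72; across the blocks the first entries get .5 * .9 * .7 = .315, exactly the structure used in (37) and (38).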
In this way, confirmatory factor analysis (or perhaps rather confirmatory PCA), in the context
described here, allows the best first order reconstruction of the intra- and inter-block correlations.
First analysis
Using AMOS 6.0, we obtained Table 11 and the diagram shown in Figure 10. Confirmatory factor
analysis of the four blocks echoes in essence the results of the first principal components of the
separate PCAs for each block. We have put the significant loadings in bold in Table 11. They correspond well to the strongest variable × PC1 correlations given in Table 10. There are two exceptions:
Smell intensity at rest in block 1 and Astringency in block 4. It should be noted that these variables
have fairly high correlations with the second principal components. The GFI is less than 0.9. This is
due to the existence of the second dimensions for blocks 1, 3 and 4.
Second analysis
In order to better identify the first dimension of the phenomenon under study, it is usual in
confirmatory factor analysis to "purify" the scales: the analysis is repeated, omitting the non-
significant variables. This is where Table 12 and Figure 11 come from. All the correlations between
the manifest and latent variables of the corresponding block, and the correlations between the latent
variables, are now strongly positive. All the correlations are significant. The first dimension of the phenomenon under study has therefore been perfectly identified. The GFI of 0.983 is excellent and clearly shows the unidimensionality of the selected variables.
Table 11: Confirmatory factor analysis of the "Wine" data ( AMOS 6.0 output )
(Significant coefficient in bold, non-significant in italic)
Parameter Estimate Inf (95%) Sup (95%) P
Smell intensity at rest
Table 12: Confirmatory factor analysis of the "Wine" data on
the significant variables ( p-value < .05) of Table 11 ( AMOS 6.0 output )
Parameter Estimate Inf (95%) Sup (95%) P
Aromatic quality at rest
Figure 11: Confirmatory factor analysis of the "Wine" data on the significant variables of Table 11 (AMOS 6.0 output). The AMOS path diagram shows the four factors Rest 1, View, Shaking 1 and Tasting 1 with their retained manifest variables; all residual measurement variances are set to .00.
Third analysis
As the four LVs appearing in Figure 11 are highly correlated, it is natural to summarize them through a second-order confirmatory factor analysis. This yields Figure 12. The regression coefficient of one MV of each block has been set to 1. The second-order LV "Score 1" is similar to the standardized first principal component of the first-order LVs, as the error variances have been set to zero. The first-order LVs are evaluated by using the McDonald approach. For example, using the path coefficients shown in Figure 12, we get:

Score(Rest 1) ∝ 1 × rest2* + .88 × rest3*
Figure 12: Second-order confirmatory factor analysis of the "Wine" data on the significant variables of Table 11 (AMOS 6.0 output). The second-order LV "Score 1" loads on the first-order LVs with path coefficients .83 (Rest 1), .94 (Tasting 1), and .84 and .82 for View and Shaking 1; all residual variances are set to .00.
In the same way, the second-order LV ("Score 1") can be computed as a weighted sum of all the MVs. The regression coefficient of "Score 1" in the regression of an MV on "Score 1" is equal to the product of the path coefficient linking this MV to its LV and the path coefficient linking the LV to "Score 1". For example,

Cov(rest2, Score 1) / Var(Score 1) = Cov(λ₁₂ Rest 1 + ε₁₂, Score 1) = λ₁₂ × Cov(Rest 1, Score 1)

as the latent variable "Score 1" is standardized. This leads to:

Score 1 ∝ .83 × (1 × rest2* + .88 × rest3*) + ... + .94 × (.91 × tasting1* + ... + 1 × tasting9*)
But this formula has a severe drawback: it gives more weight to a block containing many variables than to a block with few variables. From a pragmatic point of view, we prefer to compute a weighted sum of the first-order standardized LV estimates, using the path coefficients relating the first-order LVs to the second-order LV. In fact, these weights reflect the quality of the approximation of the second-order LV by the first-order LVs. This leads to what is called here Global score (1):

Global score (1) ∝ .83 × Score(Rest 1) + ... + .94 × Score(Tasting 1)
The correlation table between these scores is given in Table 13. All the computed first-order LVs are positively correlated and very well summarized by the computed second-order LV.
Table 13: Correlation between scores related to the first dimensions of the wine data
Correlations:

               Rest 1  View 1  Shaking 1  Tasting 1  Global score 1
Rest 1          1       .671    .687       .546       .802
View 1          .671    1       .794       .838       .921
Shaking 1       .687    .794    1          .897       .942
Tasting 1       .546    .838    .897       1          .920
Global score 1  .802    .921    .942       .920       1
Fourth analysis
To identify the second dimension of the phenomenon under study, we construct a new confirmatory PCA model for the manifest variables not taken into account in the second analysis. The non-significant variables were eliminated iteratively as before. This is where Figure 13 and Table 14 come from. All the correlations between the manifest and latent variables of the corresponding block, and the correlations between the latent variables, are now strongly positive. All the correlations are significant. The second dimension of the phenomenon studied has therefore been identified. The value of the GFI is 0.919; this means that this second dimension can be accepted.
Figure 13: Confirmatory factor analysis of the "Wine" data on the variables of Table 14 (AMOS 6.0 output). The diagram shows the three factors Rest 2 (rest1, rest5), Shaking 2 (shaking1, shaking5, shaking7) and Tasting 2 (tasting3, tasting7); all residual variances are set to .00.
Table 14: Confirmatory factor analysis of the "Wine" data on the non-significant variables of Table 11 (AMOS 6.0 output). Results after iterative elimination of non-significant variables.

Parameter Estimate Inf (95%) Sup (95%) P

Smell intensity at rest
Fifth analysis
The three LVs appearing in Figure 13 being highly correlated, they are summarized, as above, through a second-order confirmatory factor analysis. This yields Figure 14. Scores related to the second dimension are computed in the same way as those related to the first dimension. The correlation table related to these scores is given in Table 15. Comments are the same as for Table 13.
Figure 14: Second-order confirmatory factor analysis of the "Wine" data on the variables of Figure 13 (AMOS 6.0 output). The second-order LV "Score 2" loads on the three first-order LVs (Rest 2, Shaking 2, Tasting 2) with path coefficients .70, .54 and .78; all residual variances are set to .00.
Table 15: Correlation between scores related to the second dimensions of the wine data
Correlations:

                Rest 2  Shaking 2  Tasting 2  Global score 2
Rest 2           1       .758       .776       .908
Shaking 2        .758    1          .793       .933
Tasting 2        .776    .793       1          .925
Global score 2   .908    .933       .925       1
Remarks:
1. The first dimension consists of variables that are all positively correlated with the global quality grade (available elsewhere). These correlations are given in Table 16. The second dimension, on the other hand, involves variables that are not correlated with the global quality grade.
2. One may wish to obtain orthogonal components within each block. It would then be necessary to use the deflation process, i.e. to run a new analysis on the residuals of the regression of each original block X_j on its first computed latent variable LV_j.
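The deflation step described in remark 2 can be sketched in a few lines of numpy: each column of the block is regressed on the latent-variable score, and the residuals form the deflated block, orthogonal to that score. The data and the crude score below are illustrative only.

```python
import numpy as np

def deflate(X, lv):
    """Residuals of the regression of each column of block X on the
    latent-variable score lv (centring is done internally)."""
    lv = lv - lv.mean()
    Xc = X - X.mean(axis=0)
    b = Xc.T @ lv / (lv @ lv)        # OLS slope of each MV on the score
    return Xc - np.outer(lv, b)      # deflated block, orthogonal to lv

# illustration with made-up data
rng = np.random.default_rng(2)
X = rng.normal(size=(10, 4))
lv = X @ np.ones(4)                  # a crude first score for the block
X1 = deflate(X, lv)
# every deflated column is exactly uncorrelated with the score
```

Running the same analysis on `X1` then yields a second component orthogonal to the first, as remark 2 requires.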
Table 16: Correlation between the variables related to the two dimensions and the global quality grade

Variables related to dimension 1            Global quality
Aromatic quality at rest                    0.62
Fruity note at rest                         0.50
Visual intensity                            0.54
Shading (from orange to purple)             0.51
Surface impression                          0.67
Smell quality                               0.76
Aromatic intensity in mouth                 0.61
Aromatic persistence in mouth               0.68
Aromatic quality in mouth                   0.85
Intensity of attack                         0.77
Alcohol                                     0.52
Balance (acidity, astringency, alcohol)     0.95
Mellowness                                  0.92
Ending intensity in mouth                   0.80
Harmony                                     0.88
Global score 1                              0.73

Variables related to dimension 2            Global quality
Smell intensity at rest                     0.04
Spicy note at rest                         -0.31
Smell intensity after shaking               0.17
Spicy note after shaking                   -0.08
Phenolic note                               0.09
Astringency                                 0.41
Bitterness                                  0.05
Global score 2                              0.08
Graphical displays
Using Global scores (1) and (2), we obtain three graphical displays. The variables are described by their correlations with Global scores (1) and (2). The individuals are visualized with these two global scores using appellation and soil markers. These graphical displays are given in Figures 15, 16 and 17. Figures 16 and 17 show clearly that soil is a much better predictor of wine quality than appellation. All the wines produced on a reference soil are positive on Score 1. The reader interested in wine can even detect that the two Saumur wines 1DAM and 2DAM are the best wines from
this sample. I can testify that I drank outstanding Saumur-Champigny produced at Dampierre-sur-Loire.
Figure 15: Graphical display of the variables
Figure 16: Graphical display of the wine with appellation markers
Figure 17: Graphical display of the wine with soil markers
IV. Comparison between the ULS-SEM and PLS approaches.
The die is not cast: the ULS-SEM approach is not uniformly more powerful than the PLS approach. We have set out the "pluses" and "minuses" of each approach in Table 16.
V. Conclusion
Roderick McDonald has built a bridge between the SEM and PLS approaches by making use of three ideas: (1) using the ULS method, (2) setting the variances of the residual terms of the measurement model to 0, and (3) estimating the latent variables by using the loadings of the MV's on their LV's. The McDonald approach has some very promising implications. Using SEM software such as AMOS 6.0 makes it possible to get back to PCA, to the analysis of multi-block data, and to a "data analysis" approach to SEM completely similar to the PLS approach. We have illustrated this process with three examples corresponding to these different themes. We have listed the advantages and disadvantages of the two approaches. We end this paper with a wish: that this ULS-SEM approach be included in PLS-SEM software. The user would then have access to a very comprehensive toolbox for a "data analysis" approach to structural equation modelling.
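The ULS-PCA link behind McDonald's first two ideas can be made concrete with a small numpy sketch (on made-up data, not the paper's): with residual variances fixed at 0, the one-factor ULS problem min ||S - λλᵀ||² amounts to the best rank-one approximation of the covariance matrix S, which is given by the first eigenvector of S scaled by the square root of its eigenvalue, i.e. by the first principal component.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(50, 4))
S = np.cov(X, rowvar=False)            # sample covariance matrix

# ULS one-factor fit with zero residual variances:
# minimize ||S - lam lam'||_F, i.e. best rank-1 symmetric approximation of S
vals, vecs = np.linalg.eigh(S)         # eigenvalues in ascending order
lam = np.sqrt(vals[-1]) * vecs[:, -1]  # loadings = scaled first eigenvector

# the implied LV score is, up to scale, the first principal component
score = (X - X.mean(axis=0)) @ lam
```

This is exactly why the ULS-SEM setting recovers PCA, and why the LV scores can be computed afterwards from the estimated loadings, outside the SEM software.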
Table 16: Comparison between the ULS-SEM and PLS approaches.

ULS-SEM, the "pluses":
- Global criterion well identified
- Use of SEM software
- Parameters can be subject to constraints
- Use of bootstrapping on all the model parameters
- Better measurement of the quality of the theoretical model
- Non-recursive models allowed

PLS, the "pluses":
- No identification problem
- Systematic convergence of the PLS algorithm
- General framework for multi-block data analysis
- Robust method for small-size samples
- Possibility of several LV's per block in the PLS-Graph software
- Explicit calculation of LV's integrated in PLS software
- Easy handling of missing data

ULS-SEM, the "minuses":
- Possible difficulty in model identification
- Possible non-convergence of the algorithm
- Explicit calculation of LV's is outside the SEM software
- Missing data are not permitted

PLS, the "minuses":
- The algorithm is often closer to a heuristic than to the optimisation of a global criterion
- It is impossible to impose constraints on the parameters
- Measurement of the quality of the inner model is underestimated
- Measurement of the quality of the outer model is overestimated
- Non-recursive models prohibited
References
Arbuckle, J.L. (2005): AMOS 6.0. AMOS Development Corporation, Spring House, PA.

Bollen, K.A. (1989): Structural Equations with Latent Variables, John Wiley & Sons.

Chin, W.W. (2001): "PLS-Graph User's Guide", C.T. Bauer College of Business, University of Houston, USA.

Hwang, H. & Takane, Y. (2004): Generalized structured component analysis, Psychometrika, 69 (1), 81-99.

McDonald, R.P. (1996): Path analysis with composite variables, Multivariate Behavioral Research, 31 (2), 239-270.

Noonan, R. & Wold, H. (1982): PLS path modeling with indirectly observed variables: a comparison of alternative estimates for the latent variable. In: Jöreskog, K.G., Wold, H. (Eds.), Systems under Indirect Observation. North-Holland, Amsterdam, pp. 75-94.

Pagès, J., Asselin, C., Morlat, R. & Robichet, J. (1987): Analyse factorielle multiple dans le traitement de données sensorielles : application à des vins rouges de la vallée de la Loire, Sciences des aliments, 7, 549-571.

Tenenhaus, M. (2007): Statistique : méthodes pour décrire, expliquer et prévoir, Dunod, Paris.

Tenenhaus, M., Esposito Vinzi, V., Chatelin, Y.-M. & Lauro, C. (2005): PLS path modeling, Computational Statistics & Data Analysis, 48, 159-205.

Tenenhaus, M. & Esposito Vinzi, V. (2005): PLS regression, PLS path modeling and generalized Procrustean analysis: a combined approach for multiblock analysis, Journal of Chemometrics, 19, 145-153.

Tenenhaus, M. & Hanafi, M. (2007): A bridge between PLS path modelling and multi-block data analysis, in Handbook of Partial Least Squares (PLS): Concepts, Methods and Applications (V. Esposito Vinzi, W. Chin, J. Henseler, H. Wang, Eds), Volume II in the series of the Handbooks of Computational Statistics, Springer, in press.

Tenenhaus, M., Pagès, J., Ambroisine, L. & Guinot, C. (2005): PLS methodology to study relationships between hedonic judgements and product characteristics, Food Quality and Preference, 16 (4), 315-325.

XLSTAT (2007): XLSTAT-PLSPM module, XLSTAT software, Addinsoft, Paris.