
Structural Equation Modelling for Small Samples

  • 8/18/2019 Structural Equation Modelling for Small Samples


    Structural Equation Modelling for small samples

    Michel Tenenhaus HEC School of Management (GRECHEC),

    1 rue de la Libération, Jouy-en-Josas, France [[email protected]]

    Abstract

    Two complementary schools have come to the fore in the field of Structural Equation Modelling

    (SEM): covariance-based SEM and component-based SEM.

The first approach was developed around Karl Jöreskog. It can be considered as a generalisation of both

     principal component analysis and factor analysis to the case of several data tables connected by

    causal links.

The second approach was developed around Herman Wold under the name "PLS" (Partial Least Squares). More recently, Hwang and Takane (2004) have proposed a new method named Generalized Structured Component Analysis. This second approach is a generalisation of principal component analysis (PCA) to the case of several data tables connected by causal links.

Covariance-based SEM is usually used with an objective of model validation and needs a large sample (what counts as large varies from one author to another: more than 100 subjects, and preferably more than 200, are often mentioned). Component-based SEM is mainly used for score computation and can be carried out on very small samples. A study based on 6 subjects was published by Tenenhaus, Pagès, Ambroisine & Guinot (2005) and will be used in this paper.

In 1996, Roderick McDonald published a paper in which he showed how to carry out a PCA using the ULS (Unweighted Least Squares) criterion in the covariance-based SEM approach. He concluded from this that he could in fact use the covariance-based SEM approach to obtain results similar to those of the PLS approach, but with a precise optimisation criterion in place of an algorithm whose properties are not well known.

In this research, we will explore the use of ULS-SEM and PLS on small samples. First experiments have already shown that score computation and bootstrap validation are quite insensitive to the choice of method. We will also study the very important contribution of these methods to multi-block analysis.

    Key words: Multi-block analysis, PLS path modelling, Structural Equation Modelling, Unweighted

    Least Squares

    Introduction

Compared with covariance-based SEM, PLS suffers from several handicaps: (1) the diffusion of PLS path modelling software is much more limited than that of covariance-based SEM software, (2) the PLS algorithm is more a heuristic than an algorithm with well-known properties, and (3) the possibility of imposing value or equality constraints on path coefficients, easily managed in covariance-based SEM, does not exist in PLS. Of course, PLS also has some advantages over covariance-based SEM (that is why PLS exists) and we can list some of them: systematic convergence of the algorithm due to its simplicity, the possibility of handling data with a small number of individuals and a large number of variables, the practical meaning of the latent variable estimates, and a general framework for multi-block analysis.


It is often said that PLS is to covariance-based SEM as PCA is to factor analysis. But the situation changed significantly when Roderick McDonald showed in his 1996 seminal paper that he could easily carry out a PCA with covariance-based SEM software by using the ULS (Unweighted Least Squares) criterion and setting the measurement error variances to zero. Furthermore, the estimation of the latent variables proposed by McDonald is similar to using the PLS Mode A and the SEM scheme (i.e. using the "theoretical" latent variables as inner LV estimates). Thus, it became possible to use covariance-based SEM software to mimic PLS.

In the first section of this paper, we recall how to use the ULS criterion for covariance-based SEM, together with the PLS way of estimating latent variables, in order to mimic PLS path modelling. The second section is devoted to showing how to carry out a PCA with covariance-based SEM software and to discussing the interest of this approach for taking parameter constraints into account and for bootstrapping. Multi-block analysis is presented in the third section as a confirmatory factor analysis.

    We have used AMOS 6.0 (Arbuckle, 2005) and XLSTAT-PLSPM, a module of the XLSTAT

software (XLSTAT, 2007), on practical examples to illustrate the paper. A list of the pros and cons of ULS-SEM and PLS concludes the paper.

    I. Using ULS and PLS estimation methods for structural equation modelling

We describe in this section the use of the ULS method for estimating the SEM parameters, and that of the PLS estimation method for computing the LV values.

We first recall the structural equation model, following Bollen (1989). A structural equation model consists of two models: the latent variable model and the measurement model.

    The latent variable model

Let η be a column vector consisting of m endogenous (dependent) centred latent variables, and ξ a column vector consisting of k exogenous (independent) centred latent variables. The structural model connecting the vector η to the vectors η and ξ is written as

(1) η = Bη + Γξ + ζ

where B is a zero-diagonal m×m matrix of regression coefficients, Γ an m×k matrix of regression coefficients, and ζ a centred random vector of dimension m.

    The measurement model

Each latent (unobservable) variable is described by a set of manifest (observable) variables. The column vector y_j of the centred manifest variables linked to the dependent latent variable η_j can be written as a function of η_j through a simple regression with the usual hypotheses:

(2) y_j = λ_j^y η_j + ε_j

The column vector y, obtained by concatenation of the y_j's, is written as

(3) y = Λ_y η + ε

where Λ_y = ⊕_{j=1}^m λ_j^y is the direct sum of λ_1^y, ..., λ_m^y and ε is a column vector obtained by concatenation of the ε_j's. It may be recalled that the direct sum of a set of matrices A_1, A_2, ..., A_m is a block-diagonal matrix in which the diagonal blocks are formed by the matrices A_1, A_2, ..., A_m.
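As a quick numerical illustration, the direct-sum operator ⊕ can be implemented in a few lines (a minimal sketch with made-up loading vectors; the helper `direct_sum` is ours, not from the paper):

```python
import numpy as np

def direct_sum(*blocks):
    """Direct sum: block-diagonal matrix whose diagonal blocks are the inputs."""
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

# Hypothetical loading vectors for m = 2 latent variables
lam1 = np.array([[0.9], [0.8]])         # 2 manifest variables for eta_1
lam2 = np.array([[0.7], [0.6], [0.5]])  # 3 manifest variables for eta_2
Lambda_y = direct_sum(lam1, lam2)       # 5 x 2, zero outside the diagonal blocks
```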


Similarly, the column vector x of the centred manifest variables linked to the independent latent variables is written as a function of ξ:

(4) x = Λ_x ξ + δ

Adding the usual hypothesis that the matrix I − B is non-singular, equation (1) can also be written as

(5) η = (I − B)^(-1) (Γξ + ζ)

and consequently (3) becomes

(6) y = Λ_y (I − B)^(-1) (Γξ + ζ) + ε

Factorisation of the manifest variable covariance matrix

Let Φ = Cov(ξ) = E(ξξ′), Ψ = Cov(ζ) = E(ζζ′), Θ_ε = Cov(ε) = E(εε′) and Θ_δ = Cov(δ) = E(δδ′). Suppose that the random vectors ξ and δ are independent of each other and that the covariance matrices Ψ, Θ_ε, Θ_δ of the error terms are diagonal. Then we get:

Σ_xx = Λ_x Φ Λ_x′ + Θ_δ,
Σ_yy = Λ_y (I − B)^(-1) (ΓΦΓ′ + Ψ) [(I − B)^(-1)]′ Λ_y′ + Θ_ε,
Σ_xy = Λ_x Φ Γ′ [(I − B)^(-1)]′ Λ_y′

From which we finally obtain:

(7) Σ = [ Σ_yy  Σ_yx ] = [ Λ_y (I − B)^(-1) (ΓΦΓ′ + Ψ) [(I − B)^(-1)]′ Λ_y′ + Θ_ε    Λ_y (I − B)^(-1) ΓΦ Λ_x′ ]
        [ Σ_xy  Σ_xx ]   [ Λ_x Φ Γ′ [(I − B)^(-1)]′ Λ_y′                            Λ_x Φ Λ_x′ + Θ_δ        ]

Let θ = {Λ_x, Λ_y, B, Γ, Φ, Ψ, Θ_ε, Θ_δ} be the set of parameters of the model and Σ(θ) the matrix (7).
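To make the factorisation concrete, here is a small numeric sketch of the matrix (7) for a toy model with one exogenous and one endogenous latent variable (all parameter values are made up for illustration):

```python
import numpy as np

# Hypothetical parameters: m = 1 endogenous LV (2 MVs), k = 1 exogenous LV (2 MVs)
Lx  = np.array([[0.9], [0.8]])   # Lambda_x
Ly  = np.array([[0.7], [0.6]])   # Lambda_y
B   = np.zeros((1, 1))           # zero-diagonal B
G   = np.array([[0.5]])          # Gamma
Phi = np.array([[1.0]])          # Cov(xi)
Psi = np.array([[0.3]])          # Cov(zeta)
Te  = np.diag([0.2, 0.2])        # Theta_eps
Td  = np.diag([0.1, 0.1])        # Theta_delta

IBinv = np.linalg.inv(np.eye(1) - B)
Syy = Ly @ IBinv @ (G @ Phi @ G.T + Psi) @ IBinv.T @ Ly.T + Te
Sxx = Lx @ Phi @ Lx.T + Td
Sxy = Lx @ Phi @ G.T @ IBinv.T @ Ly.T
Sigma = np.block([[Syy, Sxy.T], [Sxy, Sxx]])   # the matrix Sigma(theta) of (7)
```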

Model estimation using the ULS method

Let S be the empirical covariance matrix of the MV's. The object is to seek the set of parameters θ̂ = {Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, Θ̂_ε, Θ̂_δ} minimizing the criterion

(8) ||S − Σ(θ̂)||²

The aim is therefore to seek a factorisation of the empirical covariance matrix S as a function of the parameters of the structural model. In SEM software, the covariance matrix estimates Θ̂_ε = Cov(ε) and Θ̂_δ = Cov(δ) of the residual terms are computed in such a way that the diagonal of the reconstruction error matrix E = S − Σ(θ̂) is null, even when this yields a negative variance (Heywood case).


Let us denote by σ̂_ii the i-th diagonal term of Σ(Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, 0, 0) and by θ̂_ii the i-th diagonal term of Θ̂_ε ⊕ Θ̂_δ. From the formula

(9) s_ii = σ̂_ii + θ̂_ii

we may conclude that σ̂_ii is the part of the variance s_ii of the i-th MV explained by its LV (except in a Heywood case) and θ̂_ii is the estimate of the variance of the measurement error relative to this MV. As all the error terms e_ii = s_ii − (σ̂_ii + θ̂_ii) are null, this method is not oriented towards the search for parameters explaining the MV variances. It is in fact oriented towards the reconstruction of the covariances between the MV's, variances excluded.
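A small numeric check (with made-up S and implied matrix) shows how choosing θ̂_ii by formula (9) nulls the diagonal of E, so that criterion (8) only measures the misfit of the covariances:

```python
import numpy as np

# Made-up empirical covariance matrix S and implied matrix with null error variances
S = np.array([[1.00, 0.45, 0.40],
              [0.45, 1.00, 0.50],
              [0.40, 0.50, 1.00]])
Sigma0 = np.array([[0.81, 0.72, 0.63],
                   [0.72, 0.64, 0.56],
                   [0.63, 0.56, 0.49]])

theta = np.diag(S) - np.diag(Sigma0)   # formula (9): theta_ii = s_ii - sigma_ii
E = S - (Sigma0 + np.diag(theta))      # reconstruction error matrix, null diagonal
uls = np.sum(E ** 2)                   # criterion (8): only covariances contribute
```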

    The McDonald approach for parameter estimation

In his 1996 paper, McDonald proposes to estimate the model parameters subject to the constraints that all the θ̂_ii are null. The object is to seek the parameters Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂ minimizing the criterion

(10) ||S − Σ(Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, 0, 0)||²

The estimates of the variances of the residual terms ε and δ are absorbed into the diagonal terms of the reconstruction error matrix E = S − Σ(Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, 0, 0). This method is therefore oriented towards the reconstruction of the full MV covariance matrix, variances included. In a second step, the final estimates Θ̂_ε and Θ̂_δ of the variances of the residual terms ε and δ are obtained by using formula (9) again.

    Goodness of Fit

The quality of the fit can be measured by the GFI (Goodness of Fit Index) criterion of Jöreskog & Sörbom, defined by the formula

(11) GFI = 1 − ||S − Σ(Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, Θ̂_ε, Θ̂_δ)||² / ||S||²

i.e. the proportion of ||S||² explained by the model. By convention, the model under study is acceptable when the GFI is greater than 0.90.

The quantity ||S − Σ(Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, Θ̂_ε, Θ̂_δ)||² can be deduced from the CMIN criterion given in AMOS:

(12) ||S − Σ(Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, Θ̂_ε, Θ̂_δ)||² = 2 × CMIN / (N − 1)

where N is the number of cases.
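Relations (11) and (12) are easy to chain: given the CMIN reported by AMOS and the number of cases, the GFI can be recovered as follows (a sketch with made-up values; `gfi_from_cmin` is a hypothetical helper, not an AMOS function):

```python
import numpy as np

def gfi_from_cmin(S, cmin, n_cases):
    """GFI of formula (11), with the numerator ||S - Sigma(theta_hat)||^2
    recovered from AMOS's CMIN via formula (12)."""
    resid_sq = 2.0 * cmin / (n_cases - 1)
    return 1.0 - resid_sq / np.sum(S ** 2)

# Made-up empirical covariance matrix and CMIN value
S = np.array([[1.0, 0.6],
              [0.6, 1.0]])
gfi = gfi_from_cmin(S, cmin=0.05, n_cases=6)
```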


In practical applications of the McDonald approach, the difference between the GFI given by AMOS and the exact GFI computed with formula (11) will be small:

(13) GFI = 1 − ||S − Σ(Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, Θ̂_ε, Θ̂_δ)||² / ||S||²
         = 1 − [ ||S − Σ(Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, 0, 0)||² − Σ_i θ̂_ii² ] / ||S||²

Using the McDonald approach, the GFI given by AMOS is equal to

(14) GFI = 1 − ||S − Σ(Λ̂_x, Λ̂_y, B̂, Γ̂, Φ̂, Ψ̂, 0, 0)||² / ||S||²

and Σ_i θ̂_ii² / ||S||² is usually small. Furthermore, the exact GFI will always be larger than the GFI given by AMOS.

     Evaluation of the latent variables

After having estimated the parameters of the model, we now turn to the problem of evaluating the latent variables. Three approaches can be distinguished: the traditional SEM approach, the "McDonald" approach, and the "Fornell" approach. As is usual in the PLS approach, we now denote a manifest variable by the letter x and a latent variable by the letter ξ, regardless of whether they are of the dependent or independent type. The total number of latent variables is n = k + m and the number of manifest variables related to the latent variable ξ_j is p_j.

The traditional SEM approach

To construct an estimate ξ̂_j of ξ_j, one proceeds by multiple regression of ξ_j on the whole set of centred manifest variables x_11 − x̄_11, ..., x_{n p_n} − x̄_{n p_n}. In other words, if Σ̂_xx denotes the implied (i.e. predicted by the structural model) covariance matrix between the manifest variables, and Σ̂_{ξ_j x} the vector of the implied covariances between the manifest variables x and the latent variable ξ_j, one obtains an expression of ξ̂_j as a function of the whole set of manifest variables:

(15) ξ̂_j = Σ̂_{ξ_j x} Σ̂_xx^(-1) X

where X = [x_11 − x̄_11, ..., x_{n p_n} − x̄_{n p_n}]. This method is not really usable, as it is more natural to estimate a latent variable solely as a function of its own manifest variables.
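Formula (15) can be sketched numerically for a toy one-factor model (all quantities are made up; the point is only the mechanics of the regression weights Σ̂_{ξx} Σ̂_xx^(-1)):

```python
import numpy as np

# Toy one-factor model: 3 MVs with loadings l, Var(xi) = 1
l = np.array([0.9, 0.8, 0.7])
Sigma_xx  = np.outer(l, l) + np.diag([0.19, 0.36, 0.51])  # implied Cov(x)
Sigma_xix = l.copy()                                      # implied Cov(xi, x_h) = l_h

w = Sigma_xix @ np.linalg.inv(Sigma_xx)   # regression weights of formula (15)

# Applied to centred observations (rows = cases): xi_hat = X w
X = np.array([[ 0.5,  0.4,  0.3],
              [-0.5, -0.4, -0.3]])
xi_hat = X @ w
```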


    summarize the manifest variables of the block. This relationship may also be reflective: each

    manifest variable is then a reflection of a latent variable existing a priori, a theoretical concept one

would try to outline with measures. The formative mode does not require the blocks to be one-dimensional, while this is compulsory for the reflective mode. Here, we are closer to a formative mode for the physico-chemical and sensorial blocks, and in a reflective mode by construction for the hedonic block. The two modes are indicated by the direction of the arrows in Figure 1.

With regard to the PLS algorithm, it is recommended that the method of calculating the outer estimates of the latent variables be selected according to the type of relationship between the manifest variables and their latent variables: Mode A for the reflective type and Mode B for the formative type (Wold, 1985). The low number of products has obliged us to use Mode A to calculate the outer estimates of the latent variables (although the mode of relationship between the manifest and latent variables is formative for the physico-chemical and sensorial blocks).

    The ULS-SEM approach presented here is clearly oriented towards the reflective mode. Therefore

    this orange juice example will be analyzed with ULS-SEM and PLS approaches using the reflective

mode for the three blocks. Concerning the physico-chemical and sensorial blocks, the direction of the arrows connecting the MV's to their LV's shown in Figure 1 should thus be reversed.

Figure 1: Theoretical model of relationships between the hedonic, physico-chemical and sensorial data

[Path diagram with three latent variables: ξ1 (physico-chemical: Glucose, Fructose, Saccharose, Sweetening power, pH before processing, pH after centrifugation, Titer, Citric acid, Vitamin C), ξ2 (sensorial: Smell intensity, Odor typicity, Pulp, Taste intensity, Acidity, Bitterness, Sweetness) and ξ3 (hedonic: Judge 2, Judge 3, ..., Judge 96).]

    1. Use of ULS-SEM

We now use the ULS-SEM approach on the orange juice data. Following McDonald, the measurement error variances are set to 0. The results are given in Figure 2 and in Table 2.

All manifest variables have been standardized. The value 1 has been given to the path coefficients related to the manifest variables pH before centrifugation, Sweetness and Judge2.


    Table 2: Outputs of AMOS 6.0

    Table 2.1: Regression Weights (non significant weights in bold):

    Parameter Estimate Lower (90%) Upper (90%) P

    SENSORIAL


Table 2.2: Variances

    Parameter Estimate Lower Upper P

    PHYSICO-CHEMICAL .921 .429 1.120 .020

    d1 .298 .028 .364 .010

    d2 .034 .000 .044 .177

     

Table 2.3: Squared Multiple Correlations

    Parameter Estimate Lower Upper P

    SENSORIAL .655 .529 .951 .010

    HEDONIC .946 .919 1.000 .020

Table 2.4: Model Fit Summary

    Model NPAR CMIN

    Default model 42 105.613

    Model RMR GFI AGFI PGFI

    Default model .175 .904 .898 .855

    Comment: The GFI is equal to .904 and suggests that the model is acceptable.

    Figure 3: Loading plot for the PCA of judges


The main objective of component-based SEM is the construction of scores. Following the McDonald approach, we use the path coefficients given in Figure 2 and in Table 2.1. We obtain the following constructs:

For the Physico-chemical block

Score(Physico-chemical) ∝ −.765*Glucose − .764*Fructose + .890*Saccharose + .219*(Sweetening power) + 1*(pH before centrifugation) + .998*(pH after centrifugation) − .869*Titer − .877*(Citric acid) − .064*(Vitamin C)

where all the variables (score and manifest variables) are standardized.

For the Sensorial block

Score(Sensorial) ∝ .244*(Smell intensity) + .935*(Odor typicity) + .657*Pulp − .565*(Taste intensity) − .946*Acidity − .974*Bitterness + 1*Sweetness

with the same standardization as for the previous block.

For the Hedonic block

Score(Hedonic) = 1*Judge2 + .956*Judge3 + … + .821*Judge96

with the same standardization as for the previous blocks.
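The constructs above are simply standardized weighted sums of the standardized manifest variables of each block. A minimal sketch (with a made-up data matrix; `block_score` is a hypothetical helper, not an AMOS or XLSTAT function):

```python
import numpy as np

def block_score(X, weights):
    """Standardized weighted sum of the standardized MVs of a block
    (a score defined up to a proportionality constant)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each MV
    s = Z @ weights
    return (s - s.mean()) / s.std()           # standardize the score itself

# Made-up mini-example: 6 products, 3 manifest variables
rng = np.random.default_rng(1)
X = rng.normal(size=(6, 3))
score = block_score(X, np.array([1.0, 0.9, -0.8]))
```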

    The latent variable scores are given in Table 2.5 and their correlations in Table 2.6. The correlations

     between these scores and the manifest variables are given in Table 2.7.

Table 2.5: ULS-SEM latent variable scores

    Physico-chemical Sensorial Hedonic

    Pampryl r.t. -0.72 -1.26 -1.10

    Tropicana r.t. 1.05 0.43 0.66

    Fruivita refr. 0.81 0.87 1.17

    Joker r.t. -1.54 -0.77 -0.84

    Tropicana refr. 0.56 1.27 0.85

    Pampryl refr. -0.16 -0.53 -0.74

Table 2.6: ULS-SEM latent variable score correlation matrix

    Physico-chemical Sensorial Hedonic

    Physico-chemical 1.000 .810 .867

    Sensorial .810 1.000 .961

    Hedonic .867 .961 1.000


Table 2.7: Correlations between the ULS-SEM LV scores and the MV's

    Physico-chemical Sensorial Hedonic

Glucose -0.898 -0.585 -0.673
Fructose -0.898 -0.575 -0.673
Saccharose 0.926 0.755 0.817
Sweetening power 0.078 0.288 0.242
pH before centrifugation 0.950 0.896 0.947
pH after centrifugation 0.939 0.904 0.946
Titer -0.973 -0.735 -0.765
Citric acid -0.977 -0.740 -0.774
Vitamin C -0.195 -0.040 -0.001
Smell intensity 0.229 0.410 0.174
Odor typicity 0.806 0.976 0.893
Pulp 0.558 0.704 0.625
Taste intensity -0.401 -0.646 -0.552
Acidity -0.745 -0.927 -0.950
Bitterness -0.775 -0.951 -0.976
Sweetness 0.871 0.967 0.979
Judge2 0.640 0.928 0.887
Judge3 0.647 0.756 0.877
Judge6 0.656 0.662 0.794
Judge11 0.872 0.785 0.919
Judge12 0.718 0.929 0.823
Judge25 0.971 0.817 0.864
Judge30 0.742 0.518 0.637
Judge31 0.343 0.693 0.712
Judge35 0.771 0.936 0.926
Judge48 0.460 0.837 0.834
Judge52 0.791 0.840 0.944
Judge55 0.504 0.878 0.863
Judge59 0.534 0.592 0.458
Judge60 0.870 0.854 0.924
Judge63 0.343 0.693 0.712
Judge68 0.909 0.670 0.666
Judge77 0.734 0.473 0.396
Judge79 0.718 0.929 0.823
Judge84 0.953 0.934 0.941
Judge86 0.453 0.685 0.762
Judge91 0.827 0.845 0.927
Judge92 0.724 0.419 0.595
Judge96 0.554 0.679 0.744

    The estimate of the hedonic score, shown in Table 2.5, enables us to classify the products by order of

     preference:

    Fruivita refr. > Tropicana refr. > Tropicana r.t. > Pampryl refr. > Joker r.t. > Pampryl r.t.

    Using the significant regression weights of Table 2.1 and the correlations given in Table 2.7, we may

    conclude that the physico-chemical score is correlated negatively with the fructose, glucose, titer and

    citric acid characteristics and positively with the saccharose, pH before and after centrifugation


    characteristics. The sensorial score is correlated positively with odor typicity and sweetness and

    negatively with acidity and bitterness.

The hedonic score related to the homogeneous group of judges is correlated positively with the physico-chemical (.867) and sensorial (.961) scores. Consequently, this group of judges likes products with odor typicity and sweetness (Fruivita refr., Tropicana r.t., Tropicana refr.) and rejects products with an acidic and bitter nature (Joker r.t., Pampryl refr., Pampryl r.t.). This result is verified in Table 3.

    Table 3: Sensorial characteristics of the products ranked according to the hedonic score

Product           sweetness   odor typicity   acidity   bitterness   hedonic score
_____________________________________________________________________
Fruivita refr.       3.4         2.88          2.42        1.76          1.17
Tropicana refr.      3.3         3.02          2.33        1.97          0.85
Tropicana r.t.       3.3         2.82          2.55        2.08          0.66
---------------------------------------------------------------------
Pampryl refr.        2.9         2.73          3.31        2.63         -0.74
Joker r.t.           2.8         2.59          3.05        2.56         -0.84
Pampryl r.t.         2.6         2.53          3.15        2.97         -1.10

    2. Use of PLS Path modeling

    For estimating the parameters of the model, we have used the module XLSTAT-PLSPM of the

    XLSTAT software (XLSTAT, 2007). The variables have all been standardized. To calculate the

    inner estimates of the latent variables, we have used the centroid scheme recommended by Herman

    Wold (1985).

Table 4 contains the output of this modelling of the orange juice data, with comments. Figure 4 includes the regression coefficients between the latent variables of the model shown in Figure 1 and the correlation coefficients between the manifest and latent variables.

    Coefficient validation

Although the various methods used on these orange juice data give robust and stable results (the same items appear to be significant in Tenenhaus, Pagès, Ambroisine & Guinot (2005) and in the present paper), we may think that a bootstrap validation carried out on only 6 cases cannot be very reliable. One reason is the following: in this example, the data structure comes from the opposition between the two groups {Fruivita refr., Tropicana r.t., Tropicana refr.} on one side and {Pampryl refr., Joker r.t., Pampryl r.t.} on the other. If one of these groups of products is not selected in the bootstrap sampling, then the correlations between the latent variables disappear. Non-representative samples should perhaps be eliminated.

The bootstrap has been based on 200 samples, and 90% confidence intervals have been requested. Results of the bootstrap validation for the inner model are shown in Table 3.7. The confidence intervals indicate which regression coefficients are significant. We can also look at the usual Student t statistic related to the regression coefficients. By convention, a coefficient is significant if the absolute value of t is larger than 2. In this specific example, both methods give the same results. The relationship between the hedonic data and the physico-chemical data is not significant (t = 1.522), while that between the hedonic data and the sensorial data is (t = 3.546). There is also a significant connection between the physico-chemical and the sensorial data (t = 2.864).
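The fragility of bootstrapping 6 cases can be illustrated directly: resampling with replacement, some samples barely cover both product groups and the slope estimate swings. A sketch using the ULS-SEM sensorial and hedonic scores of Table 2.5 (the simple-regression slope stands in for the path coefficient):

```python
import numpy as np

rng = np.random.default_rng(0)

# Sensorial and hedonic scores of the 6 products (Table 2.5)
sens = np.array([-1.26, 0.43, 0.87, -0.77, 1.27, -0.53])
hedo = np.array([-1.10, 0.66, 1.17, -0.84, 0.85, -0.74])

slopes = []
for _ in range(200):                        # 200 bootstrap samples, as in the paper
    idx = rng.integers(0, 6, size=6)        # resample the 6 cases with replacement
    xb, yb = sens[idx], hedo[idx]
    if xb.std() == 0.0:                     # degenerate sample (one case repeated)
        continue
    slopes.append(np.polyfit(xb, yb, 1)[0]) # slope of the simple regression
slopes = np.array(slopes)
lo, hi = np.percentile(slopes, [5, 95])     # 90% percentile confidence interval
```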


    Figure 4: XLSTAT-PLSPM software output for the orange juice data


However, the strong correlation between the hedonic data and the physico-chemical data suggests that a PLS regression of the hedonic score on the physico-chemical and sensorial scores should be carried out. This PLS regression (with one component) leads to the following equation:

Hedonic score = 0.49*(Physico-Chemical score) + 0.53*(Sensorial score)

with an R² = 0.948, to be compared with R² = 0.960 in the model shown in Figure 4. Bootstrap validation for the PLS regression yields the same significant regression coefficients as for OLS regression (see Table 4). If the PLS regression is validated by jack-knife on the observed latent variables, both coefficients are now significant (Table 4 and Figure 5).
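A one-component PLS regression is simple enough to sketch in full: the weight vector is proportional to the covariances X′y, and y is then regressed on the single component t = Xw. Below it is applied to the ULS-SEM scores of Table 2.5 (this is a generic PLS1 sketch, not the XLSTAT implementation, so the coefficients will not match the 0.49/0.53 above exactly):

```python
import numpy as np

def pls1_one_component(X, y):
    """One-component PLS1: w ~ X'y, t = Xw, then OLS of y on t;
    returns the coefficients expressed on the original regressors."""
    w = X.T @ y
    w /= np.linalg.norm(w)
    t = X @ w                 # the single PLS component
    b = (t @ y) / (t @ t)     # regression of y on t
    return b * w

# ULS-SEM physico-chemical and sensorial scores (Table 2.5), hedonic score as y
X = np.array([[-0.72, -1.26], [ 1.05,  0.43], [ 0.81,  0.87],
              [-1.54, -0.77], [ 0.56,  1.27], [-0.16, -0.53]])
y = np.array([-1.10, 0.66, 1.17, -0.84, 0.85, -0.74])
beta = pls1_one_component(X, y)   # both coefficients come out positive
```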

    Figure 5: XLSTAT-PLSPM software output: Validation of the PLS regression of the hedonic score

    on the physico-chemical and sensorial scores

[Bar chart of the standardized coefficients of the physico-chemical and sensorial variables, with 95% confidence intervals.]

    Table 3: XLSTAT-PLSPM outputs for the orange juice example

    Table 3.1: Block dimensionality 

Latent variable     Dimensions   Critical value   Eigenvalues
Physico-chemical         9           1.800        6.213  1.410  1.046  0.317  0.013
Sensorial                7           1.400        4.744  1.333  0.820  0.084  0.019
Hedonic                 23           4.600        14.655  3.663  2.199  1.837  0.646

    Comment: The critical value is equal to the average eigenvalue. In this example the number of

    eigenvalues is equal to 5 as the number of observations (6) is smaller than the number of variables

    and because the variables are centered. Each block can be considered as unidimensional.
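The comment can be checked numerically: with 6 observations of p standardized variables, the centred data matrix has rank at most 5, so the correlation matrix has 5 non-null eigenvalues summing to p, and the average-eigenvalue rule gives p/5 (e.g. 9/5 = 1.8 for the physico-chemical block). A sketch with random data:

```python
import numpy as np

rng = np.random.default_rng(0)

# 6 observations, 9 standardized variables (as in the physico-chemical block)
X = rng.normal(size=(6, 9))
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
ev = np.sort(np.linalg.eigvalsh(np.cov(Z, rowvar=False)))[::-1]

nonzero = ev[ev > 1e-10]      # 6 - 1 = 5 non-null eigenvalues (centring removes one)
critical = nonzero.mean()     # average eigenvalue: 9 / 5 = 1.8
```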


    Table 3.2: Checking block dimensionality (larger correlation per row is in bold ) 

    Variables/Factors correlations (Physico-chemical):

    F1 F2 F3 F4 F5

    Glucose 0.914 0.388 -0.057 0.109 -0.013

    Fructose 0.913 0.378 -0.083 0.121 -0.050

    Saccharose -0.912 0.261 0.286 -0.127 0.049

    Sweetening power -0.035 0.947 0.319 -0.020 0.017

    pH before centrifugation -0.945 0.019 -0.026 0.325 0.006

    pH after centrifugation -0.933 0.071 -0.069 0.346 -0.006

    Titer 0.974 -0.144 0.070 0.150 0.062

    Citric acid. 0.978 -0.136 0.049 0.144 0.052

    Vitamin C 0.212 -0.328 0.916 0.080 -0.035

    Variables/Factors correlations (Sensorial):

F1 F2 F3 F4 F5
Smell intensity 0.460 0.754 -0.468 0.008 0.004

    Odor typicity 0.985 0.134 -0.058 0.077 0.041

    Pulp 0.722 0.617 0.298 -0.096 -0.031

    Taste intensity -0.650 0.429 0.626 0.005 0.048

     Acidity -0.913 0.348 -0.021 0.205 -0.057

    Bitterness -0.935 0.188 -0.285 -0.028 0.093

    Sweetness 0.955 -0.159 0.187 0.161 0.048

    Variables/Factors correlations (Hedonic):

    F1 F2 F3 F4 F5

judge2 0.894 -0.218 0.307 -0.203 0.132
 judge3 0.890 -0.318 -0.310 -0.018 0.104

     judge6 0.798 -0.039 -0.166 0.522 -0.247

     judge11 0.919 0.051 -0.177 0.278 0.212

     judge12 0.814 0.221 0.213 -0.429 -0.243

     judge25 0.849 0.422 -0.177 -0.166 0.205

     judge30 0.625 0.399 -0.631 -0.058 -0.221

     judge31 0.733 -0.565 0.321 0.179 0.090

     judge35 0.925 0.035 0.329 0.179 -0.060

     judge48 0.852 -0.474 0.207 -0.039 -0.074

     judge52 0.948 -0.049 -0.286 0.068 -0.115

     judge55 0.878 -0.398 0.131 -0.164 -0.166

     judge59 0.438 0.475 0.740 0.176 -0.062

     judge60 0.922 0.051 -0.149 -0.154 0.317

     judge63 0.733 -0.565 0.321 0.179 0.090

     judge68 0.638 0.763 -0.063 -0.072 -0.041

     judge77 0.363 0.862 0.235 -0.169 0.203

     judge79 0.814 0.221 0.213 -0.429 -0.243

     judge84 0.928 0.352 0.115 0.032 0.012

     judge86 0.778 -0.410 -0.372 -0.229 -0.187

     judge91 0.927 0.043 0.069 0.364 0.043

     judge92 0.585 0.348 -0.282 0.676 0.000

     judge96 0.755 -0.285 -0.331 -0.430 0.231


    Table 3.3: Model validation 

    Goodness of fit index:

              GoF     GoF (Bootstrap)   Standard error   Critical ratio (CR)
Absolute      0.731        0.732            0.049             14.943
Relative      0.823        0.801            0.048             17.146
Outer model   0.911        0.852            0.039             23.286
Inner model   0.903        0.940            0.048             18.815

              Lower bound (90%)   Upper bound (90%)   Minimum   1st quartile   Median   3rd quartile   Maximum
Absolute           0.645               0.799           0.522       0.711       0.738       0.762        0.821
Relative           0.707               0.847           0.525       0.790       0.811       0.824        0.855
Outer model        0.784               0.893           0.707       0.816       0.863       0.865        0.911
Inner model        0.885               0.999           0.669       0.921       0.941       0.966        1.000

    Comment: Number of bootstrap samples = 200. Level of the confidence intervals: 90% 

Absolute Goodness-of-Fit

GoF = sqrt{ [ (1/J) Σ_j Σ_h Cor²(x_jh, ξ_j) ] × [ (1 / Nb of endogenous LV) Σ_{endogenous ξ_j} R²(ξ_j; ξ_i's explaining ξ_j) ] }

where J = Σ_j p_j.

Relative Goodness-of-Fit

GoF = sqrt{ [ (1/J) Σ_j Σ_{h=1}^{p_j} Cor²(x_jh, ξ_j) / λ_j ] × [ (1 / Nb of endogenous LV) Σ_{endogenous ξ_j} R²(ξ_j; ξ_i's explaining ξ_j) / ρ_j² ] }

where the first bracketed factor corresponds to the outer model and the second to the inner model, and:
- λ_j is the first eigenvalue computed from the PCA of block j
- ρ_j is the first canonical correlation between the dependent block j and the concatenation of all the blocks i explaining the dependent block j.
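The absolute GoF can be sketched in a few lines (a minimal sketch with made-up squared MV-LV correlations, assuming the GoF is the square root of the average squared correlation times the average R² of the endogenous LVs; the two R² values are taken from Table 2.3 for illustration):

```python
import numpy as np

def absolute_gof(cor2_blocks, r2_endogenous):
    """Absolute GoF: sqrt of (mean squared MV-LV correlation over all J MVs)
    times (mean R^2 over the endogenous latent variables)."""
    all_cor2 = np.concatenate(cor2_blocks)
    return float(np.sqrt(all_cor2.mean() * np.mean(r2_endogenous)))

# Made-up squared correlations per block; R^2 of Sensorial and Hedonic (Table 2.3)
cor2 = [np.array([0.81, 0.64, 0.79]),   # physico-chemical block
        np.array([0.49, 0.36, 0.90]),   # sensorial block
        np.array([0.91, 0.88, 0.72])]   # hedonic block
gof = absolute_gof(cor2, [0.655, 0.946])
```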


    Table 3.4: Latent Variable validation 

    Cross-loadings (Monofactorial manifest variables):

    Physico-chemical Sensorial Hedonic

Glucose -0.889 -0.584 -0.689
Fructose -0.889 -0.574 -0.689

    Saccharose 0.931 0.758 0.832

    Sweetening power 0.099 0.294 0.242

    pH before centrifugation 0.952 0.896 0.955

    pH after centrifugation 0.942 0.905 0.954

    Titer -0.972 -0.738 -0.789

    Citric acid. -0.977 -0.743 -0.798

    Vitamin C -0.194 -0.045 -0.023

    Smell intensity 0.236 0.411 0.199

    Odor typicity 0.814 0.977 0.904

    Pulp 0.574 0.709 0.637

    Taste intensity -0.397 -0.639 -0.549

     Acidity -0.751 -0.925 -0.942

    Bitterness -0.784 -0.952 -0.972

    Sweetness 0.877 0.968 0.982

     judge2 0.646 0.925 0.880

     judge3 0.654 0.755 0.860

     judge6 0.665 0.667 0.787

     judge11 0.873 0.785 0.916

     judge12 0.729 0.930 0.834

     judge25 0.972 0.817 0.879

     judge30 0.750 0.524 0.648

judge31 0.349 0.690 0.689
 judge35 0.777 0.936 0.926

     judge48 0.470 0.835 0.815

     judge52 0.801 0.843 0.938

     judge55 0.517 0.876 0.847

     judge59 0.533 0.593 0.479

     judge60 0.872 0.853 0.924

     judge63 0.349 0.690 0.689

     judge68 0.910 0.673 0.695

     judge77 0.727 0.474 0.432

     judge79 0.729 0.930 0.834

     judge84 0.957 0.935 0.953

     judge86 0.467 0.685 0.742

     judge91 0.831 0.846 0.925

     judge92 0.724 0.424 0.602

     judge96 0.559 0.677 0.731

Comment:
- Sweetening power and Vitamin C are not correlated with their block.
- Smell intensity is not correlated with its own block.
- Judges 59 and 77 are weakly correlated with their block.


Table 3.5: Latent Variable weights (non-significant weights are in bold)

Latent variable | Manifest variables | Outer weight | Outer weight (Bootstrap) | Standard error | Critical ratio (CR) | Lower bound (90%) | Upper bound (90%)

    Glucose -0.124 -0.113 0.050 -2.491 -0.147 -0.057

    Fructose -0.123 -0.111 0.049 -2.498 -0.144 -0.057

    Saccharose 0.154 0.140 0.041 3.720 0.096 0.180

    Sweetening power 0.052 0.038 0.093 0.560 -0.115 0.151

    pH before centrifugation 0.180 0.159 0.021 8.460 0.121 0.184

    pH after centrifugation 0.180 0.159 0.020 9.073 0.121 0.185

    Titer -0.148 -0.137 0.017 -8.808 -0.169 -0.111

    Citric acid. -0.150 -0.139 0.016 -9.127 -0.171 -0.113

    Physico-chemical

    Vitamin C -0.007 -0.010 0.085 -0.077 -0.126 0.130

    Smell intens ity 0.052 0.033 0.099 0.527 -0.136 0.166

    Odor typicity 0.206 0.190 0.039 5.255 0.143 0.241

    Pulp 0.145 0.114 0.082 1.759 -0.065 0.207

    Taste intensity -0.113 -0.107 0.080 -1.406 -0.196 0.069

     Acidity -0.203 -0.190 0.045 -4.472 -0.227 -0.143

    Bitterness -0.210 -0.197 0.034 -6.136 -0.240 -0.143

    Sensorial

    Sweetness 0.223 0.208 0.036 6.148 0.143 0.267

     judge2 0.059 0.056 0.012 4.740 0.043 0.064

     judge3 0.053 0.051 0.015 3.435 0.034 0.065

     judge6 0.050 0.047 0.018 2.749 0.023 0.071

     judge11 0.062 0.059 0.019 3.268 0.048 0.080

     judge12 0.062 0.058 0.013 4.605 0.037 0.083

     judge25 0.067 0.063 0.013 5.247 0.051 0.078

     judge30 0.048 0.040 0.020 2.364 0.000 0.065

 judge31 0.039 0.036 0.023 1.695 0.000 0.066
 judge35 0.064 0.062 0.012 5.530 0.050 0.075

     judge48 0.049 0.047 0.016 3.047 0.020 0.064

     judge52 0.062 0.061 0.012 5.320 0.049 0.082

     judge55 0.052 0.049 0.014 3.717 0.029 0.061

     judge59 0.042 0.036 0.026 1.642 -0.021 0.066

     judge60 0.065 0.060 0.016 4.065 0.047 0.074

     judge63 0.039 0.036 0.023 1.695 0.000 0.066

     judge68 0.059 0.051 0.018 3.305 0.000 0.072

     judge77 0.045 0.037 0.027 1.649 -0.021 0.065

     judge79 0.062 0.058 0.013 4.605 0.037 0.083

 judge84 0.071 0.066 0.013 5.420 0.055 0.084
 judge86 0.043 0.039 0.020 2.168 -0.007 0.057

     judge91 0.063 0.059 0.018 3.499 0.050 0.074

     judge92 0.043 0.038 0.028 1.554 -0.007 0.081

    Hedonic

     judge96 0.046 0.044 0.020 2.345 0.013 0.066

Comment:
- Sweetening power and Vitamin C are not significant in block physico-chemical.
- Smell intensity, Pulp and Taste intensity are not significant in block sensorial.
- Judges 59, 77, 86 and 92 are not significant in block hedonic.


    Table 3.6: Correlations between MV and LV  

    Correlations:

Latent variable | Manifest variables | Standardized loadings | Communalities | Redundancies | Standardized loadings (Bootstrap) | Standard error

Glucose -0.889 0.790 -0.850 0.226

    Fructose -0.889 0.790 -0.847 0.227

    Saccharose 0.931 0.867 0.876 0.268

Sweetening power 0.099 0.010 0.101 0.591
pH before centrifugation 0.952 0.906 0.968 0.078

    pH after centrifugation 0.942 0.887 0.964 0.073

    Titer -0.972 0.946 -0.950 0.066

    Citric acid. -0.977 0.954 -0.956 0.063

    Physico-chemical

    Vitamin C -0.194 0.038 -0.203 0.435

    Smell intensity 0.411 0.169 0.113 0.285 0.497

    Odor typicity 0.977 0.954 0.641 0.940 0.110

    Pulp 0.709 0.503 0.338 0.612 0.382

    Taste intensity -0.639 0.408 0.274 -0.589 0.404

     Acidity -0.925 0.856 0.575 -0.915 0.206

    Bitterness -0.952 0.907 0.609 -0.949 0.087

    Sensorial

    Sweetness 0.968 0.936 0.629 0.967 0.069

     judge2 0.880 0.774 0.743 0.859 0.182

     judge3 0.860 0.740 0.710 0.828 0.236

     judge6 0.787 0.619 0.594 0.736 0.254

     judge11 0.916 0.840 0.806 0.885 0.230

     judge12 0.834 0.695 0.667 0.852 0.115

     judge25 0.879 0.773 0.742 0.896 0.159

     judge30 0.648 0.419 0.403 0.570 0.284

     judge31 0.689 0.475 0.456 0.619 0.340

     judge35 0.926 0.858 0.824 0.923 0.147

     judge48 0.815 0.664 0.638 0.785 0.233

     judge52 0.938 0.879 0.844 0.936 0.130

     judge55 0.847 0.717 0.688 0.809 0.215

     judge59 0.479 0.230 0.221 0.447 0.406

     judge60 0.924 0.853 0.820 0.893 0.214

     judge63 0.689 0.475 0.456 0.619 0.340

     judge68 0.695 0.483 0.464 0.685 0.258

 judge77 0.432 0.186 0.179 0.426 0.404
 judge79 0.834 0.695 0.667 0.852 0.115

     judge84 0.953 0.909 0.873 0.943 0.137

     judge86 0.742 0.551 0.529 0.670 0.308

     judge91 0.925 0.856 0.822 0.895 0.215

     judge92 0.602 0.363 0.348 0.531 0.411

    Hedonic

     judge96 0.731 0.535 0.514 0.709 0.281

    Comments:

- Standardized loading = correlation
- Communality = squared correlation
- Redundancy = Communality*R²(Dep. LV; Explanatory related LVs)
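The redundancy identity can be checked on any row; a sketch using the Odor typicity row of Table 3.6 together with R²(Sensorial) from Table 3.7:

```python
communality = 0.954          # Odor typicity, block Sensorial (Table 3.6)
r2_sensorial = 0.672         # R²(Sensorial; Physico-chemical), Table 3.7

redundancy = communality * r2_sensorial
print(round(redundancy, 3))  # ≈ 0.641, matching the Redundancies column
```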


    Table 3.6: Correlations between MV and LV (continued ) 

    Correlations:

Latent variable | Manifest variables | Critical ratio (CR) | Lower bound (90%) | Upper bound (90%)

    Glucose -3.938 -0.995 -0.569

    Fructose -3.913 -0.998 -0.589

    Saccharose 3.476 0.729 0.996

    Sweetening power 0.167 -0.860 0.950

    pH before centrifugation 12.199 0.925 1.000

    pH after centrifugation 12.966 0.899 1.000

    Titer -14.729 -1.000 -0.860

    Citric acid. -15.413 -1.000 -0.872

    Physico-chemical

    Vitamin C -0.445 -0.970 0.656

    Smell intensity 0.827 -0.684 0.930

    Odor typicity 8.912 0.763 0.999

    Pulp 1.856 -0.174 0.998

    Taste intensity -1.580 -0.998 0.203

     Acidity -4.495 -1.000 -0.752

    Bitterness -10.967 -1.000 -0.897

    Sensorial

    Sweetness 13.953 0.940 1.000

     judge2 4.832 0.639 0.981

     judge3 3.642 0.469 0.999

     judge6 3.102 0.347 0.982

     judge11 3.993 0.773 0.994

     judge12 7.258 0.648 0.999

     judge25 5.541 0.742 0.997

 judge30 2.277 0.000 0.936
 judge31 2.030 0.000 0.991

     judge35 6.293 0.825 0.997

     judge48 3.493 0.408 0.997

     judge52 7.234 0.858 0.998

     judge55 3.942 0.425 0.996

     judge59 1.179 -0.426 0.948

     judge60 4.315 0.716 0.997

     judge63 2.030 0.000 0.991

     judge68 2.694 0.000 0.997

     judge77 1.069 -0.426 0.896

 judge79 7.258 0.648 0.999
 judge84 6.934 0.911 0.998

     judge86 2.411 -0.093 0.992

     judge91 4.301 0.783 0.986

     judge92 1.465 -0.192 0.982

    Hedonic

     judge96 2.602 0.173 0.997

Comment (identical with those for the weights):
- Sweetening power and Vitamin C are not significant in block physico-chemical.
- Smell intensity, Pulp and Taste intensity are not significant in block sensorial.
- Judges 59, 77, 86 and 92 are not significant in block hedonic.


    Table 3.7: Inner model 

R² (Sensorial):

R² | R² (Bootstrap) | Standard error | Critical ratio (CR) | Lower bound (90%) | Upper bound (90%)
0.672 0.791 0.157 4.276 0.588 0.999

    Path coefficients (Sensorial):

Latent variable | Value | Standard error | t | Pr > |t| | Value (Bootstrap)
Physico-chemical 0.820 0.286 2.864 0.046 0.835

Latent variable | Standard error (Bootstrap) | Critical ratio (CR) | Lower bound (90%) | Upper bound (90%)
Physico-chemical 0.308 2.660 0.757 0.994

    Comment:

    The usual Student t test and the bootstrap approach give here the same results.

    R² (Hedonic):

R² | R² (Bootstrap) | Standard error | Critical ratio (CR) | Lower bound (90%) | Upper bound (90%)
0.960 0.986 0.017 58.017 0.947 1.000

    Path coefficients (Hedonic):

    Latent variable Value Standard error t Pr > |t|

    Physico-chemical 0.306 0.201 1.522 0.225

    Sensorial 0.713 0.201 3.546 0.038 

    Path coefficients (Hedonic):

Latent variable | Value (Bootstrap) | Standard error (Bootstrap) | Critical ratio (CR) | Lower bound (90%) | Upper bound (90%)
Physico-chemical 0.331 0.698 0.438 -0.642 1.000
Sensorial 0.651 0.674 1.058 0.000 1.397

    Comment:

The usual Student t test and the bootstrap approach give here the same results. But the non-significance of the physico-chemical path coefficient can also be due to a multicollinearity problem. PLS regression for estimating the structural regression equations can be used and is presented in Table 4.


    Table 3.8: Impact and contribution of the variables to Hedonic 

    Impact and contribution of the variables to Hedonic:

    Sensorial Physico-chemical

Correlation 0.964 0.891
Path coefficient 0.713 0.306

    Correlation * path coefficient 0.688 0.273

    Contribution to R² (%) 71.612 28.388

    Cumulative % 71.612 100.000

     

[Bar chart: Impact and contribution of the variables to Hedonic. Left axis: path coefficients (0 to 0.8); right axis: cumulative contribution to R² in % (0 to 100); bars for the latent variables Sensorial and Physico-chemical.]

    Comment:

- R²(Y; X_1, ..., X_k) = Σ_j β̂_j Cor(Y, X_j)
- When all the terms β̂_j Cor(Y, X_j) are positive, it makes sense to compute the relative contribution of each explanatory variable X_j to the R square.
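This decomposition holds exactly for an OLS regression on standardized variables. A minimal numerical sketch on synthetic data (not the orange juice data):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 3
X = rng.standard_normal((n, k))
y = X @ np.array([0.5, 0.3, 0.2]) + 0.1 * rng.standard_normal(n)

# standardize (population std) so that Cor(y, x_j) = x_j'y / n
Xs = (X - X.mean(0)) / X.std(0)
ys = (y - y.mean()) / y.std()

beta, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
r2_direct = 1 - np.sum((ys - Xs @ beta) ** 2) / np.sum(ys ** 2)
r2_decomp = sum(b * np.corrcoef(ys, Xs[:, j])[0, 1] for j, b in enumerate(beta))
assert np.isclose(r2_direct, r2_decomp)   # R² = sum of beta_j * Cor(y, x_j)
```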


    Table 3.9: Model assessment  

Latent variable | Type | Mean | R² | Adjusted R² | Mean communalities (AVE) | Mean redundancies
Physico-chemical Exogenous 0.000 0.687
Sensorial Endogenous 0.000 0.672 0.672 0.676 0.454
Hedonic Endogenous 0.000 0.960 0.950 0.634 0.609
Mean 0.816 0.654 0.532

    Comments:

- The weighted mean takes into account the number of MVs in each block.
- (Absolute GoF)² = (Mean R²)*(Weighted mean communalities)

    Table 3.10: Correlation between the latent variables 

    Correlations (Latent variable):

    Physico-chemical Sensorial Hedonic

    Physico-chemical 1.000 0.820 0.891

    Sensorial 0.820 1.000 0.964

    Hedonic 0.891 0.964 1.000

    Table 3.11: Direct, indirect and total effects 

    Direct effects (Latent variable):

    Physico-chemical Sensorial Hedonic

    Physico-chemical

    Sensorial 0.820

    Hedonic 0.306 0.713

    Comment :

    -  Sensorial = .820*Physico-chemical

    -   Hedonic = .306*Physico-chemical + .713*Sensorial


    Table 3.11: Direct, indirect and total effects (continued ) 

    Indirect effects (Latent variable):

    Physico-chemical Sensorial Hedonic

Physico-chemical
Sensorial 0.000

    Hedonic 0.585 0.000

Comment:
- Hedonic = .306*Physico-chemical + .713*.820*Physico-chemical
- Indirect effect of Physico-chemical on Hedonic = .713*.820 = 0.585

    Total effects (Latent variable):

    Physico-chemical Sensorial Hedonic

    Physico-chemical

    Sensorial 0.820

    Hedonic 0.891 0.713

    Comment :

     Hedonic = .306*Physico-chemical + .713*.820*Physico-chemical

    = .891*Physico-chemical
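The effect algebra above can be reproduced from the matrix B of direct path coefficients: with rows and columns ordered as (Physico-chemical, Sensorial, Hedonic), the total effects are (I − B)⁻¹ − I and the indirect effects are the difference with B. A minimal sketch of this standard path-analysis computation:

```python
import numpy as np

# direct effects; row = dependent LV, column = explanatory LV
B = np.array([[0.0,   0.0,   0.0],
              [0.820, 0.0,   0.0],
              [0.306, 0.713, 0.0]])

total = np.linalg.inv(np.eye(3) - B) - np.eye(3)
indirect = total - B
print(round(total[2, 0], 3))     # 0.891 = .306 + .713*.820
print(round(indirect[2, 0], 3))  # 0.585 = .713*.820
```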

    Table 3.12: Discriminant validity 

    Discriminant validity (Squared correlations < AVE) :

    Physico-chemical Sensorial Hedonic

    Physico-chemical 1 0.672 0.793

    Sensorial 0.672 1 0.930

    Hedonic 0.793 0.930 1

    Mean Communalities (AVE) 0.687 0.676 0.634

Comment: Due to non-significant MVs, the AVE criterion is too small for the three LVs.


    Table 3.13: Latent variable score 

    Summary statistics / Latent variable scores:

Variable | Observations | Minimum | Maximum | Mean | Std. deviation

    Physico-chemical 6 -1.680 1.120 0.000 1.000

    Sensorial 6 -1.381 1.378 0.000 1.000

    Hedonic 6 -1.203 1.253 0.000 1.000

    Latent variable scores :

    Physico-chemical Sensorial Hedonic

    pampryl r. t. -0.810 -1.381 -1.203

    tropicana r. t. 1.120 0.462 0.742

    fruvita refr. 0.917 0.964 1.253

     joker r. t. -1.680 -0.852 -0.991

    tropicana refr. 0.630 1.378 0.946

    pampryl refr. -0.176 -0.570 -0.747

    Table 4: PLS regression of Hedonic score on Physico-chemical and sensorial scores

    Goodness of fit statistics (Variable Hedonic):

    R² 0.948

     Bootstrap validation

    Path coefficients (Hedonic):

Latent variable | Value | Value (Bootstrap) | Standard error (Bootstrap) | Critical ratio (CR) | Lower bound (90%) | Upper bound (90%)

    Physico-chemical 0.490 0.267 0.408 1.201 -0.422 0.893

    Sensorial 0.531 0.744 0.402 1.320 0.103 1.397

     Jack-knife validation on the observed latent variables

    Standardized coefficients (Variable Hedonic):

Variable | Coefficient | Std. deviation | Lower bound (95%) | Upper bound (95%)

    Physico-chemical 0.490 0.021 0.449 0.531

    Sensorial 0.531 0.022 0.488 0.573


    3. Comparison between PLS, ULS-SEM and PCA

    Comparison between weights

When we compare the weight confidence intervals computed with PLS (Table 3.5) with those coming from ULS-SEM (Table 2.1), we find that both methods yield the same non-significant weights, with only one exception: Judge 86 (non-significant for PLS and significant for ULS-SEM). These weights are compared in Figure 6.

    Figure 6: Comparison between the PLS and ULS-SEM weights


    Comparison between PLS and ULS-SEM scores

The scores coming from PLS and ULS-SEM are compared in Figure 7. They are highly correlated. This confirms our previous findings and a general remark of Noonan and Wold (1982) that the final outer LV estimates depend very little on the selected scheme for calculating the inner LV estimates.

    Figure 7: Comparison between the PLS and ULS-SEM scores


    Comparison between the PLS and ULS-SEM scores and the block principal components 

    The correlations between the PLS and USL-SEM scores with the block principal components are

    given in Table 5.

    Table 5: Correlation between the PLS and ULS-SEM scores and the block principal components

    ULS-SEM scores PLS scores

    Physico-chemical 1st PC .999 .997

    Sensorial 1st PC .998 .998

    Hedonic 1st PC .999 .997

We may conclude that ULS-SEM, PLS and principal component analysis give practically the same scores on this orange juice example.

    II. Exploratory factor analysis, ULS-SEM and PCA

If the structural model is limited to one standardized latent variable (or common factor) ξ described by a vector x composed of p centred manifest variables, one gets the decomposition

(20) x = λ_x ξ + δ

    It is usual to add the following hypotheses:

(21) E(δ) = 0,  Cov(δ) = E(δδ') = Θ_δ is diagonal,  Cov(ξ, δ) = 0

Under these hypotheses, the covariance matrix Σ of the random vector x is written as

(22) Σ = E(xx') = λ_x λ_x' + Θ_δ

The parameters λ_x and Θ_δ in model (22) can now be estimated using the ULS method. This means searching for the parameters λ̂_x and Θ̂_δ minimizing the criterion

(23) || S − (λ̂_x λ̂_x' + Θ̂_δ) ||²

where S is the matrix of empirical covariances. To remove the indetermination on the global sign of the vector λ̂_x (if λ̂_x is a solution, then −λ̂_x is also a solution), the solution can be chosen to make the sum of the coordinates positive. This is the option chosen in AMOS 6.0.

    The advantage of the ULS   method over the other more frequently used GLS (Generalized Least

    Squares)  or  ML (Maximum Likelihood)  methods lies in its ability to function with a singular

    covariance matrix S, particularly in situations where the number of observations is less than the

    number of variables.
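As an illustration of criterion (23), the one-factor ULS solution can be approximated by iterated principal-axis factoring, a standard numerical scheme equivalent to ULS/MINRES at convergence (this is a sketch, not the algorithm used by AMOS): alternately extract the leading eigenpair of S − Θ̂ and refresh Θ̂ with the diagonal residuals.

```python
import numpy as np

def uls_one_factor(S, n_iter=500):
    """Approximately minimize ||S - (lam lam' + Theta)||^2 with Theta diagonal."""
    p = S.shape[0]
    theta = np.zeros(p)
    lam = np.zeros(p)
    for _ in range(n_iter):
        w, V = np.linalg.eigh(S - np.diag(theta))   # eigenvalues in ascending order
        lam = np.sqrt(max(w[-1], 0.0)) * V[:, -1]   # rank-one fit of S - Theta
        theta = np.clip(np.diag(S) - lam ** 2, 0.0, None)  # diagonal residuals
    if lam.sum() < 0:         # sign convention: positive sum of loadings
        lam = -lam
    return lam, theta

# recover a known one-factor structure
lam_true = np.array([0.9, 0.8, 0.7, 0.6])
S = np.outer(lam_true, lam_true) + np.diag(1 - lam_true ** 2)
lam_hat, theta_hat = uls_one_factor(S)
print(np.round(lam_hat, 3))
```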


The quality of the fit is measured by the GFI written here as

(24) GFI = 1 − || S − (λ̂_x λ̂_x' + Θ̂_δ) ||² / || S ||²

Principal component analysis (PCA) is found again if one imposes the additional condition

(25) Θ̂_δ = 0

In this case, one seeks to minimise the criterion

(26) || S − λ̂_x λ̂_x' ||²

The vector λ̂_x is now equal to √λ_1 u_1, where u_1 is the normed eigenvector of the covariance matrix S associated with the largest eigenvalue λ_1.

For each MV x_j, the explained variance (or communality) is therefore σ̂_jj = λ_1 u_1j². The residual variance (or specificity) θ_j is then estimated by θ̂_j = s_jj − λ_1 u_1j².

The quality of the fit can still be measured by the GFI:

(27) GFI = 1 − [ || S − λ̂_x λ̂_x' ||² − Σ_j (s_jj − σ̂_jj)² ] / || S ||²

The square of the norm of S is equal to the sum of the squared eigenvalues λ_h of S. In PCA, || S − λ̂_x λ̂_x' ||² is equal to the sum of the squares of the p−1 last eigenvalues of S. Consequently, in PCA one obtains

(28) GFI = [ λ_1² + Σ_j (s_jj − λ_1 u_1j²)² ] / Σ_{h=1..p} λ_h²

Moreover, SEM software allows the computation of confidence intervals for parameters by bootstrapping. It also allows criterion (26) to be minimised while imposing value constraints or equality constraints on the coordinates of the vector λ̂_x. We can continue to use criterion (27) to measure the quality of the model.
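The agreement between formulas (27) and (28) can be checked numerically; a sketch on an arbitrary covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((30, 5))
S = (X.T @ X) / 30                        # an empirical covariance matrix

w, V = np.linalg.eigh(S)                  # eigenvalues in ascending order
lam1, u1 = w[-1], V[:, -1]
lam_hat = np.sqrt(lam1) * u1              # PCA loadings, criterion (26)

# formula (27): GFI via the residual matrix, diagonal residuals absorbed
resid2 = np.sum((S - np.outer(lam_hat, lam_hat)) ** 2)
diag2 = np.sum((np.diag(S) - lam1 * u1 ** 2) ** 2)
gfi_27 = 1 - (resid2 - diag2) / np.sum(S ** 2)

# formula (28): the same GFI from the eigenvalues only
gfi_28 = (lam1 ** 2 + diag2) / np.sum(w ** 2)

assert np.isclose(gfi_27, gfi_28)
```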


     Link between ULS-SEM, Factor Analysis, PLS and Principal Component Analysis

A central point in PLS path modelling concerns the relation between each LV and the MVs related to it.

     Reflective mode

    The reflective mode is common to PLS and SEM. In this mode, each MV is related to its LV by a

    simple regression:

(29) x_j = λ_j ξ + δ_j

This model corresponds to the usual one-dimension factor analysis (FA) model. Minimization of criterion (23) allows the estimation of the parameters of this model. As the diagonal terms of the residual matrix S − (λ̂_x λ̂_x' + Θ̂_δ) are automatically null, the path coefficients λ_j are computed with the objective of reconstructing the covariance matrix terms outside the diagonal. The average variance extracted (AVE), defined by Σ_j σ̂_jj / Σ_j s_jj, measures the summary power of the LV. It is not the first objective in this approach; it is an a posteriori value of the model.

In a one-block situation, it is natural to estimate the LV ξ using the first principal component of the MVs. The minimization of criterion (26) yields this solution. Furthermore, the diagonal terms of the residual S − λ̂_x λ̂_x' are now taken into account in the minimization. The path coefficients λ_j are now computed with the objective of reconstructing the whole covariance matrix, diagonal included. The AVE still measures the summary power of the LV, but it is now part of the objective in this approach. Consequently, in the ULS-SEM context, PCA can be obtained by considering the FA model (22) and then cancelling in a first step the residual measurement variances.

In PLS path modelling software, the one-block situation has been implemented. In this situation, the outer estimate of the block LV is also taken as the inner estimate. Therefore, Mode A leads to the following equation:

(30) ξ̂ ∝ Σ_j Cov(x_j, ξ̂) x_j

The PLS algorithm will converge to the first principal component of the block of MVs, solution of equation (30).
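The fixed-point equation (30) amounts to a power iteration on the covariance matrix of the block, which is why the algorithm converges to the first principal component. A minimal sketch on synthetic data (not a full PLS implementation):

```python
import numpy as np

rng = np.random.default_rng(2)
# one-block data with a dominant first component
X = rng.standard_normal((50, 1)) @ rng.standard_normal((1, 6)) \
    + 0.2 * rng.standard_normal((50, 6))
X = X - X.mean(0)                        # centred MVs

xi = X[:, 0].copy()                      # arbitrary starting score
for _ in range(200):
    w = X.T @ xi / len(X)                # outer weights: Cov(x_j, xi), eq. (30)
    xi = X @ w                           # outer estimate of the LV
    xi /= xi.std()                       # standardize the score

pc1 = np.linalg.svd(X, full_matrices=False)[0][:, 0]  # first PC scores
print(abs(np.corrcoef(xi, pc1)[0, 1]))   # ≈ 1
```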

    Formative mode

The formative mode is easy to implement in PLS. In this mode, each LV is related to its MVs by a multiple regression:

(31) ξ = Σ_j β_j x_j + δ

But in a one-block situation, this is an indeterminate problem.


    Conclusion

The residual sum of squares (RESS), defined by RESS = Σ_{i<j} (s_ij − σ̂_ij)², is smaller for FA than for PCA. On the other hand, the AVE is larger for PCA than for FA.

    Example 2

We use data on the cubic capacity, power output, speed, weight, width and length of 24 car models in production in 2004 given in Tenenhaus (2007). We compare FA and PCA on these data with respect to the RESS and AVE criteria. The analyses are carried out on standardized variables. The correlation matrix is given in Table 6.

    Table 6: Car example: Correlation matrix

    Capacity Power Speed Weight Width Length

Capacity 1 0.954 0.885 0.692 0.706 0.664
Power 1 0.934 0.529 0.730 0.527

    Speed 1 0.466 0.619 0.578

    Weight 1 0.477 0.795

    Width 1 0.591

    Length 1

The path models for one-dimension FA and PCA are given in Figure 8. The common factor is denoted F1. The implied covariance matrices and the residual matrices produced by AMOS are given in Table 7.

[Path diagrams, AMOS output. FA loadings (residual variances): Capacity .99 (.02), Power .93 (.14), Speed .87 (.25), Weight .68 (.53), Width .74 (.45), Length .73 (.47). PCA loadings, with all residual variances fixed at .00: Capacity .96, Power .92, Speed .89, Weight .76, Width .80, Length .80.]

Figure 8: Path models for FA and PCA


Table 7: Car example: Implied covariance matrices and residuals produced by AMOS

FA implied correlations:
         Capacity Power Speed Weight Width Length
Capacity 1 .918 .860 .678 .737 .722
Power      1 .804 .633 .689 .674
Speed        1 .593 .645 .632
Weight         1 .508 .498
Width            1 .541
Length             1

FA residuals:
Capacity 0 0.036 0.025 0.014 -0.031 -0.058
Power      0 0.130 -0.104 0.041 -0.147
Speed        0 -0.127 -0.026 -0.054
Weight         0 -0.031 0.297
Width            0 0.050
Length             0

PCA implied correlations:
Capacity .926 .889 .853 .738 .771 .765
Power         .853 .818 .699 .740 .734
Speed              .785 .671 .710 .705
Weight                  .573 .606 .602
Width                        .642 .637
Length                            .632

PCA residuals:
Capacity 0.074 0.065 0.032 -0.046 -0.065 -0.101
Power          0.147 0.116 -0.170 -0.010 -0.207
Speed                0.215 -0.205 -0.091 -0.127
Weight                     0.427 -0.129 0.193
Width                            0.358 -0.046
Length                                 0.368

    The comparison between FA and PCA results is shown in Table 8.

    Table 8: Comparison between FA and PCA approaches 

     RESS AVE GFI

    FA .169 .690 .983

    PCA .230 .735 .978

For PCA, the GFI produced by AMOS has to be modified according to formula (27). The usual PCA of standardized data results in the following eigenvalues: 4.4113, .8534, .4357, .2359, .0514 and .0124. The quality of the approximation of S by λ_1 u_1 u_1' + Θ̂_δ is therefore measured by the following value of the GFI:

(32) GFI = [ λ_1² + Σ_j (s_jj − λ_1 u_1j²)² ] / Σ_{h=1..p} λ_h² = (19.459 + .519) / 20.436 = .978
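A quick numerical check of (32), using the eigenvalues above and the diagonal PCA residuals of Table 7 (a sketch):

```python
import numpy as np

eig = np.array([4.4113, 0.8534, 0.4357, 0.2359, 0.0514, 0.0124])
# diagonal PCA residuals s_jj - lam1*u1j² from Table 7
diag_resid = np.array([0.074, 0.147, 0.215, 0.427, 0.358, 0.368])

gfi = (eig[0] ** 2 + np.sum(diag_resid ** 2)) / np.sum(eig ** 2)
print(round(gfi, 3))  # ≈ 0.978
```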


     

We then used AMOS 6.0 to carry out a first-order PCA of these standardized data under the hypothesis of equality of weights for the engine variables "cubic capacity, power, speed" and similarly equality of weights for the passenger compartment variables "weight, width, length". Figure 9 shows the results of this estimation and Table 9 the 90% bootstrap confidence intervals. The bootstrap intervals contain values greater than 1 because the bootstrap samples no longer consist of standardized variables.

[Constrained PCA path diagram, AMOS output: equal loadings of .92 for Capacity, Power and Speed; equal loadings of .78 for Weight, Width and Length; all residual variances fixed at .00.]

    Figure 9 : PCA under constraints on the "Auto 2004" data ( AMOS 6.0 output )

    Table 9: PCA under constraints for the "Auto 2004" data ( AMOS 6.0 output ) 

 Estimation and bootstrap confidence interval for the coordinates of λ̂_x

    Parameter Estimate Inf (90%) Sup (90%)

    Capacity - F1 .924 .542 1.195

    Power - F1 .924 .542 1.195

    Speed - F1 .924 .542 1.195

    Weight - F1 .784 .555 1.003

Width - F1 .784 .555 1.003
Length - F1 .784 .555 1.003

The GFI for the model with constraints has the following value provided by AMOS:

(33) GFI* = 1 − || S − λ̂_x λ̂_x' ||² / || S ||² = .9505


Using the modified formula yields:

(34) GFI = GFI* + Σ_j (s_jj − λ̂_xj²)² / || S ||² = .9505 + .509/20.436 = .975

The very slight reduction of the GFI (.975 vs .978) means that one can accept the model with constraints.

In this example, we obtain the component ξ̂ as the "McDonald" estimation of the factor ξ, calculated as follows:

ξ̂ ∝ .924 (capacity* + power* + speed*) + .784 (weight* + length* + width*)

where the asterisk means that the variable is standardized.

    III. Confirmatory factor analysis, ULS-SEM and analysis of multi-block data

We assume now that the random column vector x breaks down into J blocks of random vectors x_j = (x_j1, ..., x_jp_j)'. A specific model with one standardized latent variable (and the usual hypotheses) is constructed for each block x_j:

(35) x_j = λ_j ξ_j + δ_j,   j = 1, ..., J

This model is similar to model (4) with Λ_x = ⊕_{j=1..J} λ_j. For each block j we have

(36) Σ_{x_j} = λ_j λ_j' + Θ_{δ_j}

and for two blocks j and k we get

(37) Σ_{x_j x_k} = φ_jk λ_j λ_k'

where φ_jk = Cor(ξ_j, ξ_k).

Decomposition (7) thus becomes

(38) Σ = [⊕_j λ_j] Φ [⊕_j λ_j]' + Θ_δ

The parameters λ_1, ..., λ_J, Φ and Θ_δ in model (38) can now be estimated by using the ULS method. This means seeking the parameters λ̂_1, ..., λ̂_J, Φ̂ and Θ̂_δ minimizing the criterion

(39) || S − ( [⊕_j λ̂_j] Φ̂ [⊕_j λ̂_j]' + Θ̂_δ ) ||²

Adding constraint (25) gives a new criterion to be minimized:

(40) || S − [⊕_j λ̂_j] Φ̂ [⊕_j λ̂_j]' ||²


This results in a new factorisation of the covariance matrix, allowing estimation of both the loadings and the correlations between the factors. The quality of the fit is still measured by the GFI criterion.

    Example 3

    In this example we are going to study data about wine tasting described in detail in Pagès, Asselin,

    Morlat & Robichet (1987).

     Description of the data

A collection of 21 red wines from the Bourgueil, Chinon and Saumur appellations is described by a set of 27 taste variables divided into 4 blocks:

X_1 = Smell at rest

Rest1 = smell intensity at rest, Rest2 = aromatic quality at rest, Rest3 = fruity note at rest, Rest4 = floral note at rest, Rest5 = spicy note at rest

     X 2 = View

    View1 = visual intensity, View2 = shading (from orange to purple), View3 = surface impression

     X 3 = Smell after shaking

    Shaking1 = smell intensity, Shaking2 = smell quality, Shaking3 = fruity note, Shaking4 = floral

    note, Shaking5 = spicy note, Shaking6 = vegetable note, Shaking7 = phenolic note, Shaking8 =

    aromatic intensity in mouth, Shaking9 = aromatic persistence in mouth, Shaking10 = aromatic

    quality in mouth

     X 4 = Tasting

    Tasting1 = intensity of attack, Tasting2 = acidity, Tasting3 = astringency, Tasting4 = alcohol,

    Tasting5 = balance (acidity, astringency, alcohol), Tasting6 = mellowness, Tasting7 = bitterness,

    Tasting8 = ending intensity in mouth, Tasting9 = harmony

    These data have already been analysed using PLS in Tenenhaus & Esposito Vinzi (2005) and in

    Tenenhaus & Hanafi (2007). We present here the ULS-SEM solution on the standardized variables

    with cancellation of the residual measurement variances. First of all, we present the PCA for each

    separate block in Table 10.


Table 10: Principal component analysis of each block for the "Wine" data

Smell at rest                             Comp. 1   Comp. 2
Smell intensity at rest                    .741      .551
Aromatic quality at rest                   .915     -.144
Fruity note at rest                        .854     -.191
Floral note at rest                        .345     -.537
Spicy note at rest                         .077      .933

View                                      Comp. 1   Comp. 2
Visual intensity                           .986     -.146
Shading (from orange to purple)            .983     -.163
Surface impression                         .947      .320

Smell after shaking                       Comp. 1   Comp. 2
Smell intensity                            .472      .743
Smell quality                              .881     -.180
Fruity note                                .819     -.176
Floral note                                .328     -.500
Spicy note                                 .089      .746
Vegetable note                            -.635      .593
Phenolic note                              .370      .633
Aromatic intensity in mouth                .895      .277
Aromatic persistence in mouth              .888      .307
Aromatic quality in mouth                  .882     -.372

Tasting                                   Comp. 1   Comp. 2
Intensity of attack                        .937      .082
Acidity                                   -.257      .691
Astringency                                .775      .427
Alcohol                                    .774      .378
Balance (acidity, astringency, alcohol)    .844     -.423
Mellowness                                 .901     -.380
Bitterness                                 .377      .760
Ending intensity in mouth                  .967      .117
Harmony                                    .958     -.233


    Use of ULS-SEM for the analysis of multi-block data

All the variables are standardized: S = R. The correlation matrix R is now approximated using criterion (40), with the aid of the following factorisation formula:

R̂ = [⊕_{j=1..4} λ_j] Φ [⊕_{j=1..4} λ_j]'

  = | λ_1  0    0    0   | | 1     φ_12  φ_13  φ_14 | | λ_1'  0     0     0    |
    | 0    λ_2  0    0   | | φ_21  1     φ_23  φ_24 | | 0     λ_2'  0     0    |
    | 0    0    λ_3  0   | | φ_31  φ_32  1     φ_34 | | 0     0     λ_3'  0    |
    | 0    0    0    λ_4 | | φ_41  φ_42  φ_43  1    | | 0     0     0     λ_4' |

  = | λ_1 λ_1'        φ_12 λ_1 λ_2'   φ_13 λ_1 λ_3'   φ_14 λ_1 λ_4' |
    | φ_21 λ_2 λ_1'   λ_2 λ_2'        φ_23 λ_2 λ_3'   φ_24 λ_2 λ_4' |
    | φ_31 λ_3 λ_1'   φ_32 λ_3 λ_2'   λ_3 λ_3'        φ_34 λ_3 λ_4' |
    | φ_41 λ_4 λ_1'   φ_42 λ_4 λ_2'   φ_43 λ_4 λ_3'   λ_4 λ_4'      |

    In this way, confirmatory factor analysis (or perhaps rather confirmatory PCA), in the context

    described here, allows the best first order reconstruction of the intra- and inter-block correlations.
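The ULS fit just described can be sketched numerically. The following is a minimal illustration only, not the AMOS implementation: the data, the two-block layout, and the starting values are invented, and the criterion is the squared Frobenius norm of R − ΛΦΛ' with zero residual variances.

```python
import numpy as np
from scipy.optimize import minimize

# Minimal ULS sketch (not the AMOS implementation): fit R ~ Lambda Phi Lambda'
# with zero residual variances, for two invented blocks of standardized MVs.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 5))                  # 6 subjects, 5 manifest variables
X = (X - X.mean(axis=0)) / X.std(axis=0)
R = np.corrcoef(X, rowvar=False)
blocks = [[0, 1, 2], [3, 4]]                 # MV indices of the two blocks

def unpack(theta):
    lam = np.zeros((5, 2))
    lam[blocks[0], 0] = theta[:3]            # loadings lambda_1
    lam[blocks[1], 1] = theta[3:5]           # loadings lambda_2
    phi = np.array([[1.0, theta[5]],
                    [theta[5], 1.0]])        # LV correlation matrix
    return lam, phi

def uls(theta):
    lam, phi = unpack(theta)
    return np.sum((R - lam @ phi @ lam.T) ** 2)   # squared Frobenius norm

theta0 = np.r_[np.full(5, 0.7), 0.3]
res = minimize(uls, theta0, method="BFGS")
lam_hat, phi_hat = unpack(res.x)
```

The fitted `lam_hat` plays the role of the loading vectors above and `phi_hat[0, 1]` that of the inter-block correlation.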

First analysis

Using AMOS 6.0, we obtained Table 11 and the diagram shown in Figure 10. Confirmatory factor analysis of the four blocks essentially echoes the results of the first principal components of the separate PCA's of each block. The significant loadings are shown in bold in Table 11. They correspond well to the strongest variable×PC1 correlations given in Table 10, with two exceptions: Smell intensity at rest in block 1 and Astringency in block 4. It should be noted that these two variables have fairly high correlations with the second principal components. The GFI is less than 0.9, owing to the existence of the second dimensions for blocks 1, 3 and 4.

Second analysis

In order to better identify the first dimension of the phenomenon under study, it is usual in confirmatory factor analysis to "purify" the scales: the analysis is repeated, omitting the non-significant variables. This yields Table 12 and Figure 11. All the correlations between the manifest variables and the latent variable of the corresponding block, and the correlations between the latent variables, are now strongly positive, and all are significant. The first dimension of the phenomenon under study has therefore been clearly identified. The GFI of 0.983 is excellent and confirms the unidimensionality of the selected variables.
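The purification step can be sketched as a simple elimination loop. In this sketch, `fit_cfa` is a hypothetical stand-in for the AMOS run, assumed to return the p-value of each loading for the variables currently kept.

```python
# Sketch of scale "purification": refit and drop the least significant
# variable until every loading is significant. `fit_cfa` is a hypothetical
# stand-in for the AMOS run, returning {variable: p-value of its loading}.
def purify(variables, fit_cfa, alpha=0.05):
    kept = list(variables)
    while kept:
        pvals = fit_cfa(kept)
        worst = max(kept, key=lambda v: pvals[v])
        if pvals[worst] < alpha:
            break                    # all remaining loadings significant
        kept.remove(worst)
    return kept
```

For instance, with fixed p-values {a: .01, b: .30, c: .02}, the loop drops b and keeps a and c.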


Table 11: Confirmatory factor analysis of the "Wine" data (AMOS 6.0 output)
(Significant coefficients in bold, non-significant in italic)

Parameter                    Estimate   Inf (95%)   Sup (95%)   P
Smell intensity at rest


Table 12: Confirmatory factor analysis of the "Wine" data on
the significant variables (p-value < .05) of Table 11 (AMOS 6.0 output)

Parameter                    Estimate   Inf (95%)   Sup (95%)   P
Aromatic quality at rest


     

[AMOS path diagram omitted: the four first-order latent variables Rest 1, View, Shaking 1 and Tasting 1 with their significant manifest variables, all error variances fixed at 0, and the estimated loadings.]

Figure 11: Confirmatory factor analysis of the "Wine" data on
the significant variables of Table 9 (AMOS 6.0 output)


Third analysis

As the four LV's appearing in Figure 11 are highly correlated, it is natural to summarize them through a second-order confirmatory factor analysis. This yields Figure 12. The regression coefficient of one MV in each block has been set to 1. The second-order LV "Score 1" is similar to the standardized first principal component of the first-order LV's, as the error variances have been set to zero. The first-order LV's are evaluated using the McDonald approach. For example, using the path coefficients shown in Figure 12, we get:

$$\text{Score(Rest 1)} \propto 1 \times \text{rest2}^{*} + .88 \times \text{rest3}^{*}$$
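This score computation can be sketched as follows. It is a minimal illustration with invented data, not the wine data themselves; the weights 1 and .88 are the path coefficients quoted above, and the score is re-standardized to make the proportionality concrete.

```python
import numpy as np

# Sketch: a first-order LV score as the path-coefficient-weighted sum of
# its block's standardized MVs, then re-standardized (the "∝" above).
# The 6x2 block stands in for rest2 and rest3 of the 6 wines.
def lv_score(X_block, path_coefs):
    s = X_block @ np.asarray(path_coefs)
    return (s - s.mean()) / s.std()

rng = np.random.default_rng(1)
rest = rng.normal(size=(6, 2))                     # invented rest2, rest3
rest = (rest - rest.mean(axis=0)) / rest.std(axis=0)
score_rest1 = lv_score(rest, [1.0, 0.88])
```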

[AMOS path diagram omitted: second-order confirmatory factor analysis in which the four first-order LV's Rest 1, View, Shaking 1 and Tasting 1 load on the second-order LV "Score 1", with all error and disturbance variances fixed at 0.]

Figure 12: Second order confirmatory factor analysis of the "Wine" data on
the significant variables of Table 9 (AMOS 6.0 output)


In the same way, the second-order LV ("Score 1") can be computed as a weighted sum of all the MV's. The regression coefficient of "Score 1" in the regression of an MV on "Score 1" is equal to the product of the path coefficient linking this MV to its LV and the path coefficient linking that LV to "Score 1". For example,

$$\mathrm{Cov}(\text{rest2},\,\text{Score 1}) = \mathrm{Cov}(\lambda_{12}\,\text{Rest 1} + \varepsilon_{12},\,\text{Score 1}) = \lambda_{12}\,\mathrm{Cov}(\text{Rest 1},\,\text{Score 1}) = \lambda_{12} \times \frac{\mathrm{Cov}(\text{Rest 1},\,\text{Score 1})}{\mathrm{Var}(\text{Score 1})}$$

as the latent variable "Score 1" is standardized. This leads to:

$$\text{Score 1} \propto .83 \times \left(1 \times \text{rest2}^{*} + .88 \times \text{rest3}^{*}\right) + \cdots + .94 \times \left(.91 \times \text{tasting1}^{*} + \cdots + 1 \times \text{tasting9}^{*}\right)$$

But this formula has a severe drawback: it gives more weight to a block containing many variables than to a block with few. From a pragmatic point of view, we prefer to compute a weighted sum of the standardized first-order LV estimates, using the path coefficients relating the first-order LV's to the second-order LV. These weights reflect the quality of the approximation of the second-order LV by the first-order LV's. This leads to what is called here Global score (1):

$$\text{Global score (1)} \propto .83 \times \text{Score(Rest 1)} + \cdots + .94 \times \text{Score(Tasting 1)}$$
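A sketch of this global-score computation, with invented block scores: the four weights used below are illustrative second-order path coefficients (.83 and .94 are quoted above; the two middle values are placeholders for the View and Shaking paths).

```python
import numpy as np

# Sketch: the global score as the path-coefficient-weighted sum of the
# standardized first-order LV scores, then re-standardized. Both the
# block scores and two of the four weights are illustrative.
def global_score(block_scores, path_coefs):
    g = np.column_stack(block_scores) @ np.asarray(path_coefs)
    return (g - g.mean()) / g.std()

rng = np.random.default_rng(2)
scores = [rng.normal(size=6) for _ in range(4)]    # invented LV scores
scores = [(s - s.mean()) / s.std() for s in scores]
g1 = global_score(scores, [0.83, 0.84, 0.82, 0.94])
```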

The correlation table between these scores is given in Table 13. All the computed first-order LV's are strongly positively correlated and are very well summarized by the computed second-order LV.

Table 13: Correlation between scores related to the first dimensions of the wine data

                   Rest 1   View 1   Shaking 1   Tasting 1   Global score 1
Rest 1             1         .671      .687        .546          .802
View 1              .671    1          .794        .838          .921
Shaking 1           .687     .794     1            .897          .942
Tasting 1           .546     .838      .897       1              .920
Global score 1      .802     .921      .942        .920         1

Fourth analysis

To identify the second dimension of the phenomenon under study, we construct a new confirmatory PCA model for the manifest variables not taken into account in the second analysis. The non-significant variables were eliminated iteratively as before. This yields Figure 13 and Table 14. All the correlations between the manifest variables and the latent variable of the corresponding block, and the correlations between the latent variables, are now strongly positive, and all are significant. The second dimension of the phenomenon studied has therefore been identified. The GFI value of 0.919 means that this second dimension can be accepted.


     

[AMOS path diagram omitted: the three latent variables Rest 2, Shaking 2 and Tasting 2 with their manifest variables, all error variances fixed at 0, and the estimated loadings.]

Figure 13: Confirmatory factor analysis of the "Wine" data
on the variables of Table 8 (AMOS 6.0 output)

Table 14: Confirmatory factor analysis of the "Wine" data on the non-significant variables of
Table 9 (AMOS 6.0 output). Results after iterative elimination of non-significant variables.

Parameter                    Estimate   Inf (95%)   Sup (95%)   P
Smell intensity at rest


Fifth analysis

The three LV's appearing in Figure 13 being highly correlated, they are summarized, as above, through a second-order confirmatory factor analysis. This yields Figure 14. Scores related to the second dimension are computed in the same way as those related to the first dimension. The correlation table for these scores is given in Table 15. The comments are the same as for Table 13.

[AMOS path diagram omitted: second-order confirmatory factor analysis in which the three LV's Rest 2, Shaking 2 and Tasting 2 load on the second-order LV "Score 2", with all error and disturbance variances fixed at 0.]

Figure 14: Second order confirmatory factor analysis of the "Wine" data on
the variables of Table 12 (AMOS 6.0 output)

Table 15: Correlation between scores related to the second dimensions of the wine data

                   Rest 2   Shaking 2   Tasting 2   Global score 2
Rest 2             1          .758        .776          .908
Shaking 2           .758     1            .793          .933
Tasting 2           .776      .793       1              .925
Global score 2      .908      .933        .925         1


Remarks:

1. The first dimension consists of variables that are all positively correlated with the global quality grade (available elsewhere). These correlations are given in Table 16. The second dimension, on the other hand, consists of variables not correlated with the global quality grade.

2. One may wish to obtain orthogonal components in each block. It would then be necessary to use the deflation process, i.e. to construct a new analysis on the residuals of the regression of each original block X_j on its first computed latent variable LV_j.
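The deflation step mentioned in Remark 2 can be sketched as follows. This is a minimal illustration under invented data: X stands for a centred block X_j and t for its first computed LV score.

```python
import numpy as np

# Sketch of the deflation process: replace each block X_j by the residuals
# of the regression of its columns on the block's first LV score t, so that
# a second analysis extracts components orthogonal to the first.
def deflate(X_block, t):
    t = (t - t.mean()) / t.std()
    loadings = X_block.T @ t / (t @ t)       # OLS slope of each MV on t
    return X_block - np.outer(t, loadings)   # residual block

rng = np.random.default_rng(3)
X = rng.normal(size=(6, 4))
X = X - X.mean(axis=0)                       # centred block
t = X[:, 0] + X[:, 1]                        # stand-in first LV score
X_deflated = deflate(X, t)
```

By construction the deflated columns are exactly orthogonal to the standardized score.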

Table 16: Correlation between the variables related to the two dimensions
and the global quality grade

Variables related to dimension 1             Global quality
Aromatic quality at rest                          0.62
Fruity note at rest                               0.50
Visual intensity                                  0.54
Shading (from orange to purple)                   0.51
Surface impression                                0.67
Smell quality                                     0.76
Aromatic intensity in mouth                       0.61
Aromatic persistence in mouth                     0.68
Aromatic quality in mouth                         0.85
Intensity of attack                               0.77
Alcohol                                           0.52
Balance (acidity, astringency, alcohol)           0.95
Mellowness                                        0.92
Ending intensity in mouth                         0.80
Harmony                                           0.88
Global score 1                                    0.73

Variables related to dimension 2             Global quality
Smell intensity at rest                           0.04
Spicy note at rest                               -0.31
Smell intensity after shaking                     0.17
Spicy note after shaking                         -0.08
Phenolic note                                     0.09
Astringency                                       0.41
Bitterness                                        0.05
Global score 2                                    0.08

Graphical displays

Using Global scores (1) and (2), we obtain three graphical displays. The variables are described by their correlations with Global scores (1) and (2). The individuals are visualized with these two global scores, using appellation and soil markers. These graphical displays are given in Figures 15, 16 and 17. Figures 16 and 17 show clearly that soil is a much better predictor of wine quality than appellation. All the wines produced on a reference soil are positive on Score 1. The reader interested in wine can even detect that the two Saumur wines 1DAM and 2DAM are the best wines of this sample. I can testify that I drank outstanding Saumur-Champigny produced at Dampierre-sur-Loire.

    Figure 15: Graphical display of the variables 

    Figure 16: Graphical display of the wine with appellation markers 


     

    Figure 17: Graphical display of the wine with soil markers 

IV. Comparison between the ULS-SEM and PLS approaches

The die is not cast, and the ULS-SEM approach is not uniformly more powerful than the PLS approach. We have set out the "pluses" and "minuses" of each approach in Table 16.

V. Conclusion

Roderick McDonald has thrown a bridge between the SEM and PLS approaches by making use of three ideas: (1) using the ULS method, (2) setting the variances of the residual terms of the measurement model to 0, and (3) estimating the latent variables by using the loadings of the MV's on their LV's. The McDonald approach has some very promising implications. Using SEM software such as AMOS 6.0 makes it possible to get back to PCA, to the analysis of multi-block data, and to a "data analysis" approach to SEM completely similar to the PLS approach. We have illustrated this process with three examples corresponding to these different themes. We have listed the advantages and disadvantages of the two approaches. We end this paper with a wish: that the ULS-SEM approach be included in PLS-SEM software. The user would then have access to a very comprehensive toolbox for a "data analysis" approach to structural equation modelling.


Table 16: Comparison between the ULS-SEM and PLS approaches

The "pluses" of ULS-SEM:
- Global criterion well identified
- Use of SEM software
- Parameters can be subject to constraints
- Use of bootstrapping on all the model parameters
- Better measurement of the quality of the theoretical model
- Non-recursive models allowed

The "pluses" of PLS:
- No identification problem
- Systematic convergence of the PLS algorithm
- General framework for multi-block data analysis
- Robust method for small-size samples
- Possibility of several LV's per block exists in the PLS-Graph software
- Explicit calculation of LV's integrated in PLS software
- Easy handling of missing data

The "minuses" of ULS-SEM:
- Possible difficulty in model identification
- Possible non-convergence of the algorithm
- Explicit calculation of LV's is outside the SEM software
- Missing data are not permitted

The "minuses" of PLS:
- The algorithm is often closer to a heuristic than to the optimisation of a global criterion
- It is impossible to impose constraints on the parameters
- Measurement of the quality of the inner model is underestimated
- Measurement of the quality of the outer model is overestimated
- Non-recursive models prohibited

References

Arbuckle, J.L. (2005): AMOS 6.0. AMOS Development Corporation, Spring House, PA.

Bollen, K.A. (1989): Structural Equations with Latent Variables. John Wiley & Sons.

Chin, W.W. (2001): "PLS-Graph User's Guide". C.T. Bauer College of Business, University of Houston, USA.

Hwang, H. & Takane, Y. (2004): Generalized structured component analysis. Psychometrika, 69 (1), 81-99.

McDonald, R.P. (1996): Path analysis with composite variables. Multivariate Behavioral Research, 31 (2), 239-270.

Noonan, R. & Wold, H. (1982): PLS path modeling with indirectly observed variables: a comparison of alternative estimates for the latent variable. In: Jöreskog, K.G. & Wold, H. (Eds.), Systems under Indirect Observation. North-Holland, Amsterdam, pp. 75-94.

Pagès, J., Asselin, C., Morlat, R. & Robichet, J. (1987): Analyse factorielle multiple dans le traitement de données sensorielles : application à des vins rouges de la vallée de la Loire. Sciences des aliments, 7, 549-571.

Tenenhaus, M. (2007): Statistique : Méthodes pour décrire, expliquer et prévoir. Dunod, Paris.

Tenenhaus, M., Esposito Vinzi, V., Chatelin, Y.-M. & Lauro, C. (2005): PLS path modeling. Computational Statistics & Data Analysis, 48, 159-205.


Tenenhaus, M. & Esposito Vinzi, V. (2005): PLS regression, PLS path modeling and generalized Procrustean analysis: a combined approach for multiblock analysis. Journal of Chemometrics, 19, 145-153.

Tenenhaus, M. & Hanafi, M. (2007): A bridge between PLS path modelling and multi-block data analysis. In: Handbook of Partial Least Squares (PLS): Concepts, Methods and Applications (V. Esposito Vinzi, W. Chin, J. Henseler & H. Wang, Eds), Volume II in the series of the Handbooks of Computational Statistics, Springer, in press.

Tenenhaus, M., Pagès, J., Ambroisine, L. & Guinot, C. (2005): PLS methodology to study relationships between hedonic judgements and product characteristics. Food Quality and Preference, 16 (4), 315-325.

XLSTAT (2007): XLSTAT-PLSPM module, XLSTAT software. Addinsoft, Paris.

