    Linear Algebra and its Applications 321 (2000) 153-173
    www.elsevier.com/locate/laa

    The asymptotic covariance matrix of maximum-likelihood estimates in factor analysis: the case of nearly singular matrix of estimates of unique variances

    Kentaro Hayashi (a), Peter M. Bentler (b)

    (a) Department of Psychology, Utah State University, Logan, UT 84322-2810, USA
    (b) Departments of Psychology and Statistics, University of California, P.O. Box 951563, Los Angeles, CA 90095-1563, USA

    Received 26 March 1998; accepted 3 July 2000

    Submitted by H.J. Werner

    Abstract

    This paper is concerned with the asymptotic covariance matrix (ACM) of maximum-likelihood estimates (MLEs) of factor loadings and unique variances when one element of MLEs of unique variances is nearly zero, i.e., the matrix of MLEs of unique variances is nearly singular. In this situation, standard formulas break down. We give explicit formulas for the ACM of MLEs of factor loadings and unique variances that could be used even when an element of MLEs of unique variances is very close to zero. We also discuss an alternative approach using the augmented information matrix under a nearly singular matrix of MLEs of unique variances and derive the partial derivatives of the alternative constraint functions with respect to the elements of factor loadings and unique variances. © 2000 Elsevier Science Inc. All rights reserved.

    AMS classification: 62H25; 62F12

    Keywords: Augmented information matrix; Factor loadings; Heywood case; Standard errors

    This work was supported by National Institute on Drug Abuse grant DA01070.
    Corresponding author. Tel.: +1-310-825-2893; fax: +1-310-206-4315. E-mail addresses: [email protected] (K. Hayashi), [email protected] (P.M. Bentler).

    0024-3795/00/$ - see front matter © 2000 Elsevier Science Inc. All rights reserved.
    PII: S0024-3795(00)00222-6


    1. Introduction

    This paper is concerned with the asymptotic covariance matrix (ACM) of maximum-likelihood estimates (MLEs) of factor loadings and unique variances when one element of MLEs of unique variances is nearly zero, i.e., when the MLE of the matrix of unique variances is nearly singular. It has been known for a long time that this situation occurs quite frequently in practice, e.g., Jöreskog [14] noted that "... improper solutions are quite frequent. Out of the 11 sets of data considered only two sets (Data 1 and 5) have proper solutions for all values of $k_0$ (the number of factors). This is a most remarkable result" (p. 473). While he developed an effective computational method to yield parameter estimates in spite of this problem, he did not provide standard error formulas that could be applied in this circumstance.

    The formulas for the ACM of MLEs of factor loadings and unique variances for the regular case were obtained by Lawley [16], and they were systematically presented in [18]. There are some mistakes in the formulas presented by Lawley and Maxwell [18], and the mistakes were corrected by Jennrich and Thayer [13]. Jennrich [9] and Lawley [17] introduced the augmented information matrix approach, which simplifies the process of obtaining the ACM of MLEs of factor loadings and unique variances.

    The formulas presented in [13,18] involve a reciprocal of unique variances. When one of the MLEs of unique variances approaches zero, the reciprocal diverges to plus infinity. As a result, the estimated ACM computed using the standard formulas in [13,18] breaks down as one element of the MLEs of unique variances gets very close to zero. Thus, it is desirable to have alternative formulas for the ACM of MLEs of factor loadings and unique variances which do not involve a reciprocal of unique variances. Furthermore, the standard formulas are likely to run into computational instabilities if some of the unique variances are very small, and rounding errors can result in disastrous outcomes of the computations. The alternative methods, on the other hand, should be more stable, avoiding problems with rounding errors. The purpose of this paper is to give such alternative formulas and also an alternative procedure to obtain the ACM that can be used even when the MLE of the matrix of unique variances is nearly singular.

    Jennrich and Clarkson [11] developed a method to compute approximate standard errors which makes use of a jackknife-like procedure. Their paper dealt with the Heywood case (i.e., the case with a zero value of the MLE of a unique variance) by expressing elements of the differentials of factor loadings without the inverse of unique variances. In this paper, we present exact (both standard and alternative) formulas for the ACM of MLEs of factor loadings and unique variances by making use of the differentials obtained by Jennrich and Clarkson [11].

    In addition to the formulas mentioned above, we also provide another approach to compute the ACM of MLEs of factor loadings and unique variances using the augmented information matrix under a nearly singular matrix of MLEs of unique variances. The partial derivatives of the constraint functions given by Jennrich [9] involve reciprocals of elements of unique variances, so use of the standard partial derivatives creates a problem as one of the elements of MLEs of unique variances approaches zero. We use an alternative constraint function which does not involve reciprocals of unique variances, and derive the partial derivatives of the alternative constraint functions with respect to the elements of factor loadings and unique variances to implement the augmented information matrix approach.

    We first introduce the formulas for MLEs of factor loadings and unique variances and the ACM for the regular case in Section 2. An alternative formula for MLEs of factor loadings is reviewed in Section 3, along with a discussion of several issues associated with the case of a nearly singular matrix of MLEs of unique variances. Then, in Section 4, we present our alternative formulas for the ACM of MLEs of factor loadings and unique variances. Section 5 presents both the standard and alternative formulas derived from the differentials given by Jennrich and Clarkson [11]. The augmented information matrix approach with an alternative constraint function and the partial derivatives follows in Section 6. Section 7 is a brief conclusion.

    2. The ACM of MLEs of factor loadings and unique variances: version

    Let $x_i$, $i = 1, \ldots, n$, be a $p \times 1$ random vector of observations with the mean vector $0$ and the covariance matrix $\Sigma$, $\Lambda$ be a $p \times m$ matrix of factor loadings, $f_i$ be an $m \times 1$ vector of the common factors, and $\epsilon_i$ be a $p \times 1$ vector of the unique factors. The factor analysis model is given by $x_i = \Lambda f_i + \epsilon_i$ with $E(f_i) = 0$, $\mathrm{Cov}(f_i) = I_m$, $E(\epsilon_i) = 0$, $\mathrm{Cov}(\epsilon_i) = \Psi$, $\mathrm{Cov}(f_i, \epsilon_i) = 0$, where $\Psi$ is a positive definite (or positive semidefinite) diagonal matrix. Then the covariance matrix is expressed as $\Sigma = \Lambda\Lambda' + \Psi$. In MLE, we further assume that the $x_i$'s are random samples from a multivariate normal population with mean vector $0$ and covariance matrix $\Sigma$, and the constraint that $\Lambda'\Psi^{-1}\Lambda$ is diagonal is imposed for $\Lambda$ to be identified. It is known that the MLEs of $\Lambda$ and $\Psi$ are obtained by solving the following two equations:

    \[ \Lambda = \Psi^{1/2}\Omega(\Theta - I_m)^{1/2}, \tag{1} \]

    \[ \Psi = \mathrm{diag}(S - \Lambda\Lambda'), \tag{2} \]

    where $\Theta$ is an $m \times m$ diagonal matrix whose elements are the first $m$ largest eigenvalues of $\Psi^{-1/2}S\Psi^{-1/2}$, $\Omega$ is a $p \times m$ matrix whose columns are normalized (i.e., $\Omega'\Omega = I_m$) eigenvectors corresponding to the first $m$ largest eigenvalues of $\Psi^{-1/2}S\Psi^{-1/2}$, $\mathrm{diag}(A)$ denotes the diagonal matrix whose elements are the diagonal elements of the square matrix $A$, and $S$ is the sample covariance matrix.
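As a minimal sketch of Eq. (1), assuming NumPy: for fixed, strictly positive unique variances, the loadings can be built from the $m$ largest eigenpairs of $\Psi^{-1/2}S\Psi^{-1/2}$. The function name `loadings_from_psi` and the numerical values are illustrative assumptions, not taken from the paper, and an exact factor structure stands in for the sample covariance matrix.

```python
import numpy as np

def loadings_from_psi(S, psi, m):
    """Eq. (1): Lambda = Psi^{1/2} Omega (Theta - I_m)^{1/2}, for fixed psi > 0."""
    d = np.sqrt(psi)
    scaled = S / np.outer(d, d)              # Psi^{-1/2} S Psi^{-1/2}
    vals, vecs = np.linalg.eigh(scaled)      # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:m]       # positions of the m largest
    theta, omega = vals[order], vecs[:, order]
    return d[:, None] * omega * np.sqrt(theta - 1.0)

# Made-up population values (hypothetical): p = 5 variables, m = 2 factors.
Lambda0 = np.array([[0.8, 0.1], [0.7, 0.3], [0.6, -0.2], [0.4, 0.6], [0.3, 0.7]])
psi0 = np.array([0.35, 0.42, 0.60, 0.48, 0.42])
S = Lambda0 @ Lambda0.T + np.diag(psi0)      # exact structure used in place of a sample S

Lambda = loadings_from_psi(S, psi0, m=2)
Gamma = Lambda.T @ (Lambda / psi0[:, None])  # Lambda' Psi^{-1} Lambda, diagonal by construction
psi_new = np.diag(S - Lambda @ Lambda.T)     # Eq. (2) evaluated at the same point
```

The division by `np.sqrt(psi)` is the step that becomes problematic as a unique variance approaches zero, which is the situation the later sections address.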

    Let $\hat\Lambda$ and $\hat\Psi$ be the MLEs of $\Lambda$ and $\Psi$, $\lambda = \mathrm{vec}(\Lambda)$ and $\psi = \mathrm{vdg}(\Psi)$, where $\mathrm{vec}(\Lambda)$ denotes the $pm$-vector listing the $m$ columns of the $p \times m$ matrix $\Lambda$ starting from the first column, and $\mathrm{vdg}(\Psi)$ denotes the diagonal elements of $\Psi$ arranged as a $p$-vector. Anderson and Rubin [3] established the asymptotic multinormality of $\sqrt{n}(\hat\lambda - \lambda)$ and $\sqrt{n}(\hat\psi - \psi)$ under the following three assumptions: (i) $\Phi \odot \Phi$ is nonsingular, i.e., the determinant $|\Phi \odot \Phi| \neq 0$, where $\Phi$ is defined in (13), and $\odot$ is the Hadamard product; (ii) $\Lambda$ and $\Psi$ are identified by the constraint that $\Lambda'\Psi^{-1}\Lambda$ is diagonal and the diagonal elements are different and ordered; (iii) the sample covariance matrix $S$ converges to $\Sigma$ in probability, and $\sqrt{n}(S - \Sigma)$ has an asymptotic multinormal distribution. (Actually, the joint asymptotic multinormality of $\sqrt{n}((\hat\lambda', \hat\psi')' - (\lambda', \psi')')$ also holds. See, e.g., [2].)

    We now present the standard formulas for the ACM of MLEs of factor loadings

    and unique variances. The original elementwise formulas were given by Lawley and Maxwell [18], with correction by Jennrich and Thayer [13]. The formulas that we give are a matrix version of the identical results, which is slightly modified from the matrix results given by Hayashi and Sen [7]. (Note: The proof of the equivalence between the elementwise formulas and the matrix formulas for $A$ and $B$ is given in Appendix A.3. The matrix formulas for $E$ and $\Phi$ were given by Lawley and Maxwell [18].) The formulas for the ACM are:

    \[ \begin{pmatrix} \mathrm{Var}(\hat\lambda) & \mathrm{Cov}(\hat\lambda, \hat\psi) \\ \mathrm{Cov}(\hat\psi, \hat\lambda) & \mathrm{Var}(\hat\psi) \end{pmatrix} = \frac{1}{n} \begin{pmatrix} A + 2B'EB & 2B'E \\ 2EB & 2E \end{pmatrix}, \tag{3} \]

    where $A$ (of order $pm \times pm$), $B$ (of order $p \times pm$), and $E$ (of order $p \times p$) are as follows:

    A= {M+(MM)(diag(Km))(Im)} {A1A1 A2}, (4)

    B= {2

    (Im)1

    1p}{1m+(1m )(diag(Kmb))(Im)}, (5)

    \[ E = (\Phi \odot \Phi)^{-1}, \tag{6} \]

    with

    A1=1m1p(diag())(Im1p1p), (7)

    A2= 1p1p, (8)

    =vec((Im)2 1m1m+

    12

    Im), (9)

    \[ M = \Theta(\Theta - I_m)^{-1}, \tag{10} \]

    =vec1(((ImIm)2)+1m2 ), (11)

    b =(Im Im2 diag(vec((Im))))11m2 , (12)

    \[ \Phi = \Psi^{-1} - \Psi^{-1}\Lambda(\Theta - I_m)^{-1}\Lambda'\Psi^{-1}, \tag{13} \]

    where $\Theta$ is an $m \times m$ diagonal matrix whose elements are the first $m$ largest eigenvalues of $\Psi^{-1/2}\Sigma\Psi^{-1/2}$, $\mathrm{vec}^{-1}$ is the inverse operation of vec (it rearranges an $m^2$-vector into an $m \times m$ matrix, column by column), the superscript $+$ in (11) denotes the Moore-Penrose inverse, $\otimes$ is the Kronecker product, $K_m$ is the $m^2 \times m^2$ commutation matrix defined such that $K_m \mathrm{vec}(G) = \mathrm{vec}(G')$ for any $m \times m$ matrix $G$, $\mathrm{diag}(z)$ denotes the diagonal matrix whose diagonal elements are the elements of the vector $z$, $I_m$ is the $m$-dimensional identity matrix, and $1_m$ is the $m$-vector whose elements are all 1s.

    Note that Eqs. (5) and (13) involve inverses of unique variances. Also, the estimate of the $(1, 1)$ element of $\Theta$, i.e., the estimate of the largest eigenvalue of $\Psi^{-1/2}\Sigma\Psi^{-1/2}$, gets very large as one of the MLEs of unique variances approaches zero. Thus, Eqs. (8)-(12) also need to be modified.

    3. The case with a nearly singular matrix of MLEs of unique variances

    We now discuss the case where one element of the matrix of MLEs of unique variances is very close to zero, i.e., $\hat\Psi$ is nearly singular. (We deal only with exploratory factor analysis in this paper. For confirmatory factor analysis, see, e.g., [6].) In this section, we discuss several issues associated with a nearly singular matrix of MLEs of unique variances.

    First, note that the assumption of positive semidefiniteness of $\Psi$, i.e., $\{\Psi : \Psi \ge 0\}$, is actually $\{\Psi : 0 \le \Psi \le \Sigma\}$, since $\Lambda\Lambda'$ is also positive semidefinite. We further assume that the true parameter values $\Psi_0$ of unique variances lie in the interior of the parameter space $\{\Psi : 0 \le \Psi \le \Sigma\}$; thus $\Psi_0$ is nonsingular. (One of the regularity conditions for the asymptotic normality of MLEs requires that a neighborhood of the true parameter values be inside the parameter space. See, e.g., assumption (vi) of Theorem 2 of Anderson and Amemiya [2, p. 764].) The assumption of nonsingularity of $\Psi_0$ implies that the probability of obtaining a nonsingular $\hat\Psi$ (i.e., in the interior of $\{\Psi : 0 \le \Psi \le \Sigma\}$) approaches unity as sample size increases.

    Second, the MLEs of factor loadings can be computed even when one element of MLEs of unique variances is exactly zero, by using the eigenvalues $\nu_r$ of $U'\hat\Psi U$, where $U$ is the Cholesky factor of $S^{-1}$ (i.e., $S^{-1} = UU'$), instead of using the eigenvalues $\theta_r$ of $\hat\Psi^{-1/2}S\hat\Psi^{-1/2}$. We state this as Observation 1, as follows.

    Observation 1 [12, 21, 22].
    (i) Assume that $S$ is positive definite. Let $\nu_r$ be the eigenvalues of $U'\Psi U$ with $S^{-1} = UU'$, where $U$ is an upper triangular matrix with positive diagonal elements obtained by Cholesky decomposition of $S^{-1}$, and let $\zeta_r$ be the normalized eigenvectors corresponding to $\nu_r$. Let $N$ be an $m \times m$ diagonal matrix whose elements are the $m$ smallest eigenvalues $\nu_1, \ldots, \nu_m$, and let $Z$ be the $p \times m$ matrix whose columns are the corresponding eigenvectors. Then

    \[ \Theta = N^{-1}, \tag{14} \]

    \[ \Omega = \Psi^{1/2}UZN^{-1/2}. \tag{15} \]

    (ii) The matrix of factor loadings is computed without $\Theta$ and $\Omega$, as follows:

    \[ \Lambda = (U')^{-1}Z(I_m - N)^{1/2}. \tag{16} \]

    Thus, alternatively, the MLEs of $\Lambda$ and $\Psi$ are obtained by solving Eqs. (16) and (2), instead of (1) and (2).
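The computation in Observation 1(ii) can be sketched as follows, assuming NumPy. The helper name `loadings_via_cholesky` and the numbers are illustrative assumptions. The sketch uses the fact that if $S = LL'$ is the usual lower-triangular Cholesky factorization, then $U = (L^{-1})'$ is upper triangular with positive diagonal and satisfies $UU' = S^{-1}$, so $(U')^{-1} = L$. No inverse of $\Psi$ appears, so the example sets one unique variance exactly to zero.

```python
import numpy as np

def loadings_via_cholesky(S, psi, m):
    """Eq. (16): Lambda = (U')^{-1} Z (I_m - N)^{1/2}, with S^{-1} = U U'."""
    L = np.linalg.cholesky(S)                    # S = L L', hence U = inv(L)'
    Linv = np.linalg.inv(L)
    upu = Linv @ np.diag(psi) @ Linv.T           # U' Psi U
    vals, vecs = np.linalg.eigh(upu)             # ascending eigenvalues
    nu, Z = vals[:m], vecs[:, :m]                # m smallest eigenvalues and eigenvectors
    return (L @ Z) * np.sqrt(1.0 - nu), nu

# Made-up values (hypothetical); the first unique variance is exactly zero.
Lambda0 = np.array([[0.8, 0.1], [0.7, 0.3], [0.6, -0.2], [0.4, 0.6], [0.3, 0.7]])
psi0 = np.array([0.0, 0.42, 0.60, 0.48, 0.42])
S = Lambda0 @ Lambda0.T + np.diag(psi0)          # exact structure in place of a sample S

Lambda, nu = loadings_via_cholesky(S, psi0, m=2)
H = Lambda.T @ np.linalg.solve(S, Lambda)        # Lambda' S^{-1} Lambda, equals I_m - N
```

Any factor $U$ with $UU' = S^{-1}$ would give the same product $\Lambda\Lambda'$; the triangular choice simply follows the statement of the observation.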

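    See Appendix A.1 for the proof of Observation 1. Note that Eq. (1) is derived from $(S - \Sigma)\Psi^{-1}\Lambda = 0$, while Eq. (16) is derived from $(S - \Sigma)S^{-1}\Lambda = 0$. In fact, the three equations $(S - \Sigma)\Psi^{-1}\Lambda = 0$, $(S - \Sigma)S^{-1}\Lambda = 0$, and $(S - \Sigma)\Sigma^{-1}\Lambda = 0$ are equivalent, except that positive definiteness of $\Psi$, $S$, and $\Sigma$ is assumed, respectively [19, Theorems 4.2 and 4.3]. For example, equivalence of $(S - \Sigma)\Psi^{-1}\Lambda = 0$ and $(S - \Sigma)\Sigma^{-1}\Lambda = 0$ can be shown by making use of the identity $\Psi^{-1}\Lambda\Theta^{-1} = \Sigma^{-1}\Lambda$ [18, Eq. (4.7)] to convert one to the other (see Appendix A.8).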

    Third, although the MLEs of factor loadings can be computed even when one element of MLEs of unique variances is exactly zero, caution has to be taken in interpreting a zero value of $\hat\psi_j$ in the same way as a strictly positive value of $\hat\psi_j$. Even if a zero value of an estimate is an MLE in the sense that the log-likelihood function is maximized at that value, it is not a stationary point of the likelihood equation in the interior of the parameter space. Thus, we restrict ourselves to dealing with the ACM of MLEs of factor loadings under a nearly singular, but not a strictly singular, matrix of MLEs of unique variances in this paper. (See e.g., [5,15] for attempts to include a strictly singular matrix of MLEs of unique variances as long as we are confident in our assumption that $\Psi_0$ is nonsingular. However, further studies are needed on whether the formulas given below still approximate well the true ACM in such a case.)

    Fourth, as the $j$th diagonal element $\hat\psi_j$ of the matrix of MLEs of unique variances gets closer and closer to zero, the matrix of MLEs of factor loadings follows a specific pattern: the $j$th row of the matrix of MLEs of factor loadings approaches zero, except for the $(j, 1)$ element, which gets close to the square root of the $j$th diagonal element of the sample covariance matrix (up to sign). We state this as the following observation.

    Observation 2. If $\hat\psi_j \approx 0$ (with the rest of the MLEs of unique variances not nearly zero), then $\hat\lambda_{j1} \approx s_{jj}^{1/2}$ (or $\hat\lambda_{j1} \approx -s_{jj}^{1/2}$) and $\hat\lambda_{jr} \approx 0$, $r = 2, \ldots, m$.

    The proof is given in Appendix A.2. This result must have been known by Jöreskog [14], who described the phenomenon (e.g., in his Table 5) and developed a partialing procedure to yield estimates of $\hat\lambda_{jr} = \epsilon_r$ with the exact property that $\epsilon_r = 0$.
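The pattern in Observation 2 can be illustrated numerically. The sketch below, assuming NumPy, is not the authors' procedure: it uses an exact factor structure $\Lambda_0\Lambda_0' + \Psi$ in place of a sample covariance matrix, shrinks one unique variance toward zero, and computes the loadings identified by a diagonal $\Lambda'\Psi^{-1}\Lambda$ via Eq. (1). The helper name and all numbers are hypothetical.

```python
import numpy as np

def canonical_loadings(S, psi, m):
    """Loadings from Eq. (1), so that Lambda' Psi^{-1} Lambda is diagonal and ordered."""
    d = np.sqrt(psi)
    vals, vecs = np.linalg.eigh(S / np.outer(d, d))
    order = np.argsort(vals)[::-1][:m]
    return d[:, None] * vecs[:, order] * np.sqrt(vals[order] - 1.0)

# Made-up values (hypothetical): p = 5, m = 2; variable j gets a shrinking unique variance.
Lambda0 = np.array([[0.8, 0.1], [0.7, 0.3], [0.6, -0.2], [0.4, 0.6], [0.3, 0.7]])
psi0 = np.array([0.35, 0.42, 0.60, 0.48, 0.42])
j = 0

results = []
for eps in (1e-2, 1e-4, 1e-8):
    psi = psi0.copy()
    psi[j] = eps
    S = Lambda0 @ Lambda0.T + np.diag(psi)
    Lam = canonical_loadings(S, psi, m=2)
    results.append((eps, abs(Lam[j, 0]), np.sqrt(S[j, j]), abs(Lam[j, 1])))

for eps, lam_j1, root_sjj, lam_j2 in results:
    print(f"psi_j={eps:.0e}  |lambda_j1|={lam_j1:.6f}  "
          f"sqrt(s_jj)={root_sjj:.6f}  |lambda_j2|={lam_j2:.2e}")
```

As the chosen unique variance shrinks, the first loading in row $j$ approaches $s_{jj}^{1/2}$ in absolute value and the remaining loading in that row approaches zero.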


    4. The ACM of MLEs of factor loadings and unique variances: version

    We now give alternative formulas for the asymptotic covariance matrix of MLEs of factor loadings and unique variances which can be used even when one element of estimated unique variances is very close to zero. As $\hat\psi_j$ approaches zero, Eqs. (5) and (13) become unstable (because they involve $\hat\psi_j^{-1}$, which diverges). Thus it is necessary to replace these equations with alternative formulas that do not involve $\hat\psi_j^{-1}$ (see e.g., [4]). Our approach is motivated by theirs. In addition, the $(1, 1)$ element of $\hat\Theta$ also gets very large as $\hat\psi_j$ approaches zero. Thus, Eqs. (8)-(12) also need to be modified. Our modified formulas are as follows:

    A= {M+(MM)(diag(Km))(Im)}{A1A1A2}, (17)

    B= {1M1p}{1mIp+(1m 1)(diag(Kmb))(Im)}, (18)

    \[ E = (\Phi \odot \Phi)^{-1}, \tag{19} \]

    with

    A1=1m1p(diag())(Im1p1p), (20)

    A2=N N

    N1p1p, (21)

    =vec((ImN )2N

    N2 1m1m+( 12 )Im), (22)

    \[ M = (I_m - N)^{-1}, \tag{23} \]

    N

    =vec1

    (((N ImImN )2

    )+1m2 ), (24)

    b =(NImImN2 diag(vec(ImN )))11m2 , (25)

    \[ \Phi = \Sigma^{-1} - \Sigma^{-1}\Lambda M\Lambda'\Sigma^{-1}, \tag{26} \]

    where $N$ is an $m \times m$ diagonal matrix whose elements are the $m$ smallest eigenvalues (in ascending order) of $U'\Psi U$ with $\Sigma^{-1} = UU'$, where $U$ is an upper triangular matrix with positive diagonal elements obtained by the Cholesky decomposition of $\Sigma^{-1}$. (The elementwise formulas corresponding to (17) and (18) are given in Appendix A.4.)

    It is easy to show the equivalence between Eqs. (4)-(13) and Eqs. (17)-(26), by noting the identities $N = \Theta^{-1}$, $\Psi^{-1}\Lambda\Theta^{-1} = \Sigma^{-1}\Lambda$, and $\Psi^{-1}\Lambda(\Theta - I_m)^{-1} = \Sigma^{-1}\Lambda M$. See Appendix A.5 for the proof of the equivalence. As before, we assume that the determinant $|\Phi \odot \Phi| \neq 0$ and thus $\Phi \odot \Phi$ is nonsingular, so that we can compute $E$ in (19) using $\Phi$ in (26). ($\Phi$ in itself is in general not of full rank, but of rank $p - m$; see, e.g., [1, p. 23].)

    In conclusion, the ACM of MLEs of factor loadings and unique variances can be computed using Eqs. (3) and (17)-(26), in place of Eqs. (3) and (4)-(13), including when one element of the MLEs of unique variances is nearly zero.
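The following sketch, assuming NumPy, compares the two expressions for $\Phi$ numerically: one written with $\Psi^{-1}$ as in (13) and one written with $\Sigma^{-1}$ as in (26). It relies on two assumptions that are stated here rather than taken from the paper's code: under the identification constraint $\Theta - I_m = \Lambda'\Psi^{-1}\Lambda$ and $I_m - N = \Lambda'\Sigma^{-1}\Lambda$, so the sketch inverts those $m \times m$ products directly. Function names and numbers are hypothetical.

```python
import numpy as np

def phi_psi_version(Lam, psi):
    """Phi as in Eq. (13), with (Theta - I_m) taken as Lambda' Psi^{-1} Lambda."""
    Pi = np.diag(1.0 / psi)                      # diverges as any psi_j -> 0
    PL = Pi @ Lam
    return Pi - PL @ np.linalg.inv(Lam.T @ PL) @ PL.T

def phi_sigma_version(Lam, psi):
    """Phi as in Eq. (26), with M = (Lambda' Sigma^{-1} Lambda)^{-1}."""
    Si = np.linalg.inv(Lam @ Lam.T + np.diag(psi))
    SL = Si @ Lam
    M = np.linalg.inv(Lam.T @ SL)
    return Si - SL @ M @ SL.T

# Made-up values (hypothetical): p = 6, m = 2.
Lambda0 = np.array([[0.8, 0.1], [0.7, 0.2], [0.6, -0.3],
                    [0.5, 0.5], [0.3, 0.6], [0.2, 0.7]])
psi0 = np.array([0.35, 0.47, 0.55, 0.50, 0.55, 0.47])
p, m = Lambda0.shape

phi_a = phi_psi_version(Lambda0, psi0)
phi_b = phi_sigma_version(Lambda0, psi0)
rank_phi = np.linalg.matrix_rank(phi_b, tol=1e-8)   # expected to be p - m
E = np.linalg.inv(phi_b * phi_b)                    # Eq. (19): inverse of the Hadamard square

# Nearly singular case: only the Sigma^{-1} form avoids the reciprocal 1/psi_j.
psi_near = psi0.copy()
psi_near[0] = 1e-12
phi_near = phi_sigma_version(Lambda0, psi_near)
E_near = np.linalg.inv(phi_near * phi_near)
```

In the nearly singular case the $\Psi^{-1}$ form has to subtract quantities of order $10^{12}$ from each other, whereas every intermediate quantity in the $\Sigma^{-1}$ form stays of moderate size.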

    5. Alternative matrix formulas: version and version

    Alternatively, it is possible to construct the matrix formulas for the ACM of estimates of factor loadings and unique variances based on the differentials reported in

    [11,13] with some modifications. The version of formulas is as follows: Var() Cov(, )

    Cov(,) Var( )

    =

    (Cov(s))

    , (27)

    where the matrices of partial derivatives ofand involving are given by

    =(1 Ip)

    Gp

    (1 )(1+2), (28)

    =K p( )1Kp()Gp, (29)

    with

    = 1, = 1, = 1 1,

    Kp=p

    i=1(Jp,i J

    p,iJp,i ),

    and

    1= 12 (diag(vec(1)))( )Gp

    , (30)

    2= (Im Im)+

    ( )Gp((+Im) )

    , (31)

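    with the $p$-dimensional $i$th unit vector $J_{p,i}$, and under normal sampling, $\mathrm{Cov}(s)$ is given by

    \[ \mathrm{Cov}(s) = \left(\frac{1}{n}\right)H_p(I_{p^2} + K_p)(\Sigma \otimes \Sigma)H_p', \tag{32} \]

    where $K_p$ is the $p^2 \times p^2$ commutation matrix (i.e., $K_p\mathrm{vec}(A) = \mathrm{vec}(A')$ for any $p \times p$ matrix $A$), $H_p = (G_p'G_p)^{-1}G_p'$, and $G_p$ is the $p^2 \times p(p+1)/2$ duplication matrix (i.e., $\mathrm{vec}(S) = G_p\mathrm{vech}(S)$ for any $p \times p$ symmetric matrix $S$). (Essentially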


    the identical expression to / is obtained by Ihara and Kano [8].) The proof for the alternative matrix approach ( version) is given in Appendix A.6.
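The commutation and duplication matrices used in (32) can be built explicitly. The sketch below, assuming NumPy and column-major vec and vech conventions, constructs both and evaluates $\mathrm{Cov}(s)$ for $s = \mathrm{vech}(S)$ under normal sampling. The function names and the test matrix are illustrative assumptions.

```python
import numpy as np

def commutation_matrix(p):
    """K_p with K_p vec(A) = vec(A') for column-major vec."""
    K = np.zeros((p * p, p * p))
    for i in range(p):
        for j in range(p):
            K[i * p + j, j * p + i] = 1.0
    return K

def duplication_matrix(p):
    """G_p with vec(S) = G_p vech(S) for symmetric S (vech stacks the lower triangle by columns)."""
    G = np.zeros((p * p, p * (p + 1) // 2))
    k = 0
    for j in range(p):
        for i in range(j, p):
            G[j * p + i, k] = 1.0
            G[i * p + j, k] = 1.0
            k += 1
    return G

def vech(S):
    return np.concatenate([S[j:, j] for j in range(S.shape[0])])

def cov_vech_s(Sigma, n):
    """Eq. (32): (1/n) H_p (I + K_p) (Sigma kron Sigma) H_p'."""
    p = Sigma.shape[0]
    K, G = commutation_matrix(p), duplication_matrix(p)
    H = np.linalg.solve(G.T @ G, G.T)            # H_p = (G_p' G_p)^{-1} G_p'
    return H @ (np.eye(p * p) + K) @ np.kron(Sigma, Sigma) @ H.T / n

# Made-up covariance matrix (hypothetical) and sample size.
Sigma = np.array([[1.0, 0.5, 0.3], [0.5, 1.2, 0.4], [0.3, 0.4, 0.9]])
n = 100
C = cov_vech_s(Sigma, n)
A = np.arange(9.0).reshape(3, 3)                 # arbitrary non-symmetric matrix for the K_p check
```

The first two diagonal entries of `C` correspond to $s_{11}$ and $s_{21}$, whose variances under normality are $2\sigma_{11}^2/n$ and $(\sigma_{11}\sigma_{22} + \sigma_{21}^2)/n$.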

    The version of the formulas (see also [7]) replaces the matrix of partial derivatives of and in (28) and (29) by

    =(W1ZIp)

    Gp

    (W1 )(Y1+Y2), (33)

    =K p(Q Q)1Kp(QQ)Gp, (34)

    respectively, withW= 1, Z= 1,Q=IpW1Z, and

    Y1= 12 (diag(vec(W1)))(ZZ)Gp

    , (35)Y2= (Im WWIm)+

    ((ImW )ZZ)Gp(ZZ)

    . (36)

    See Appendix A.7 for the proof of the partial derivatives in the version.

    6. The augmented information matrix approach

    An alternative method to obtain the ACM of the MLEs of factor loadings is to consider it as a constrained MLE problem, using the augmented information matrix [9,17]. The augmented information approach gives a procedure to compute the ACM, but it does not give explicit formulas for the elements of the ACM. However, this approach is easy to implement and is applicable to other rotated solutions as well, so it is a very practical approach. In this section, we consider modifying the standard augmented information matrix approach so that it can be used even when an element of the matrix of MLEs of unique variances is nearly zero.

    In the case of the unrotated, unstandardized factor loadings, the formulas for the elements of the information matrix are given by:

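    \[ x_{ir,js} = -\frac{1}{n}E\left(\frac{\partial^2 L}{\partial\lambda_{ir}\,\partial\lambda_{js}}\right) = \sigma^{ij}(\Lambda'\Sigma^{-1}\Lambda)_{rs} + (\Sigma^{-1}\Lambda)_{is}(\Sigma^{-1}\Lambda)_{jr}, \tag{37} \]

    \[ y_{ir,j} = -\frac{1}{n}E\left(\frac{\partial^2 L}{\partial\lambda_{ir}\,\partial\psi_j}\right) = \sigma^{ij}(\Sigma^{-1}\Lambda)_{jr}, \tag{38} \]

    \[ z_{i,j} = -\frac{1}{n}E\left(\frac{\partial^2 L}{\partial\psi_i\,\partial\psi_j}\right) = \frac{1}{2}(\sigma^{ij})^2, \tag{39} \]

    for $1 \le i, j \le p$, $1 \le r, s \le m$ [9], where $L$ is the log likelihood function, $\sigma^{ij}$ the $(i, j)$ element of $\Sigma^{-1}$, and $(V)_{rs}$ is the $(r, s)$ element of matrix $V$. The $m(m-1)/2$ constraint functions $g_{uv}$ on the parameters $\lambda_{ir}$ and $\psi_j$ are

    \[ g_{uv} = (\Lambda'\Psi^{-1}\Lambda)_{uv} \tag{40} \]

    for $1 \le u < v \le m$, and the partial derivatives of $g_{uv}$ with respect to $\lambda_{ir}$ and $\psi_j$ are given by

    \[ f^1_{ir,uv} = \frac{\partial g_{uv}}{\partial\lambda_{ir}} = (\delta_{ru}\lambda_{iv} + \delta_{rv}\lambda_{iu})\psi_i^{-1}, \tag{41} \]

    \[ f^2_{j,uv} = \frac{\partial g_{uv}}{\partial\psi_j} = -\lambda_{ju}\lambda_{jv}\psi_j^{-2}. \tag{42} \]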

    Now define the matrices $X = (x_{ir,js})$, $Y = (y_{ir,j})$, and $Z = (z_{i,j})$ from the coordinatewise expressions in Eqs. (37)-(39). (Note that the subscripts $r$ and $s$ serve as row and column block indices, respectively, while $i$ and $j$ are row and column indices within each block. That is, $x_{ir,js}$ is the $((r-1)p+i, (s-1)p+j)$ element of $X$, $y_{ir,j}$ the $((r-1)p+i, j)$ element of $Y$, and $z_{i,j}$ is the $(i, j)$ element of $Z$. The orders of $X$, $Y$, and $Z$ are $pm \times pm$, $pm \times p$, and $p \times p$, respectively. $X$ and $Z$ are symmetric matrices.) Likewise, define the matrices of partial derivatives $F_1 = (f^1_{ir,uv})$ and $F_2 = (f^2_{j,uv})$ from the coordinatewise expressions in Eqs. (41) and (42). ($f^1_{ir,uv}$ is the $((r-1)p+i, u(2m-u-1)/2+v-m)$ element of $F_1$, and $f^2_{j,uv}$ is the $(j, u(2m-u-1)/2+v-m)$ element of $F_2$. The orders of $F_1$ and $F_2$ are $pm \times m(m-1)/2$ and $p \times m(m-1)/2$, respectively.) Then the augmented information matrix is given by the sample size $n$ times the $(p(m+1)+m(m-1)/2) \times (p(m+1)+m(m-1)/2)$ matrix whose submatrices are arranged as

    \[ \begin{pmatrix} X & Y & F_1 \\ Y' & Z & F_2 \\ F_1' & F_2' & 0 \end{pmatrix}, \tag{43} \]

    and the ACM for the MLEs of factor loadings and unique variances is the $p(m+1) \times p(m+1)$ submatrix corresponding to the first $p(m+1)$ rows and columns of the inverse of the augmented information matrix.

    Here, note that $x_{ir,js}$, $y_{ir,j}$, and $z_{i,j}$ in (37)-(39) are functions of the elements of $\Sigma^{-1}$ and $\Lambda$, which are all finite. Thus the functions $x_{ir,js}$, $y_{ir,j}$, and $z_{i,j}$ are all finite, and the estimates of $x_{ir,js}$, $y_{ir,j}$, and $z_{i,j}$ can be computed without any modification of the formulas even when the MLE of $\psi_j$ is nearly zero. On the other hand, the equations for the partial derivatives $f^1_{ir,uv}$ and $f^2_{j,uv}$ in (41) and (42) involve $\psi_j^{-1}$, and thus for some elements, the estimates of $f^1_{ir,uv}$ and $f^2_{j,uv}$ become very large when the MLE of $\psi_j$ is nearly zero.
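The elementwise expressions (37)-(39) can be checked against the general normal-theory identity that the per-observation information between two parameters is $\tfrac{1}{2}\mathrm{tr}(\Sigma^{-1}\,\partial\Sigma/\partial\theta_a\,\Sigma^{-1}\,\partial\Sigma/\partial\theta_b)$. The sketch below, assuming NumPy, assembles $X$, $Y$, $Z$ with the block indexing described above and compares them with that trace form. The function names and numerical values are hypothetical.

```python
import numpy as np

def information_blocks(Lam, psi):
    """X, Y, Z from Eqs. (37)-(39); row index of lambda_ir is r*p + i (0-based)."""
    p, m = Lam.shape
    Si = np.linalg.inv(Lam @ Lam.T + np.diag(psi))
    SL = Si @ Lam                                # Sigma^{-1} Lambda
    H = Lam.T @ SL                               # Lambda' Sigma^{-1} Lambda
    X = np.kron(H, Si) + np.einsum('is,jr->risj', SL, SL).reshape(p * m, p * m)
    Y = np.einsum('ij,jr->rij', Si, SL).reshape(p * m, p)
    Z = 0.5 * Si ** 2
    return X, Y, Z

def information_by_trace(Lam, psi):
    """Same matrix via 0.5 * tr(Sigma^{-1} dSigma_a Sigma^{-1} dSigma_b)."""
    p, m = Lam.shape
    Si = np.linalg.inv(Lam @ Lam.T + np.diag(psi))
    derivs = []
    for r in range(m):
        for i in range(p):
            J = np.zeros((p, m))
            J[i, r] = 1.0
            derivs.append(J @ Lam.T + Lam @ J.T)  # dSigma / d lambda_ir
    for j in range(p):
        Kjj = np.zeros((p, p))
        Kjj[j, j] = 1.0
        derivs.append(Kjj)                        # dSigma / d psi_j
    k = len(derivs)
    info = np.empty((k, k))
    for a in range(k):
        for b in range(k):
            info[a, b] = 0.5 * np.trace(Si @ derivs[a] @ Si @ derivs[b])
    return info

# Made-up values (hypothetical): p = 6, m = 2.
Lam = np.array([[0.8, 0.1], [0.7, 0.2], [0.6, -0.3],
                [0.5, 0.5], [0.3, 0.6], [0.2, 0.7]])
psi = np.array([0.35, 0.47, 0.55, 0.50, 0.55, 0.47])
p, m = Lam.shape

X, Y, Z = information_blocks(Lam, psi)
info = np.block([[X, Y], [Y.T, Z]])
rank_info = np.linalg.matrix_rank(info, tol=1e-8)
```

The unaugmented information matrix is rank deficient by $m(m-1)/2$ because of the rotational indeterminacy of $\Lambda$, which is why the constraint derivatives $F_1$ and $F_2$ are appended in (43).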

    Motivated by Bentler and Yuan [4], Okamoto [19], and Swain [20], we use the alternative constraint functions $h_{uv} = (\Lambda'\Sigma^{-1}\Lambda)_{uv}$, instead of $g_{uv} = (\Lambda'\Psi^{-1}\Lambda)_{uv}$ in (40), when an element of the MLE of $\Psi$ is nearly zero. In fact, the constraint that $\Lambda'\Psi^{-1}\Lambda$ is diagonal is equivalent to the constraint that $\Lambda'\Sigma^{-1}\Lambda$ is diagonal, except that the former constraint requires the assumption that $\Psi$ is positive definite, while the latter constraint requires the assumption that $\Sigma$ is positive definite. To see the equivalence of the two constraints in the regular case, first note the identity:

    \[ \Lambda'\Sigma^{-1}\Lambda = \Lambda'\Psi^{-1}\Lambda(I_m + \Lambda'\Psi^{-1}\Lambda)^{-1} \tag{44} \]

    and note that, since the RHS of (44) is diagonal, the LHS of (44) also has to be diagonal.

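    The partial derivatives of $h_{uv}$ with respect to $\lambda_{ir}$ and $\psi_j$, which are used inside the augmented information matrix, are given by:

    \[ \frac{\partial h_{uv}}{\partial\lambda_{ir}} = (J_{ir}'\Sigma^{-1}\Lambda + \Lambda'\Sigma^{-1}J_{ir} - \Lambda'\Sigma^{-1}(J_{ir}\Lambda' + \Lambda J_{ir}')\Sigma^{-1}\Lambda)_{uv}, \tag{45} \]

    \[ \frac{\partial h_{uv}}{\partial\psi_j} = -(\Lambda'\Sigma^{-1}K_{jj}\Sigma^{-1}\Lambda)_{uv}, \tag{46} \]

    where $J_{ir}$ is a $p \times m$ matrix whose $(i, r)$ element is 1 and the rest of the elements are all zero, and $K_{jj}$ is a $p \times p$ matrix whose $(j, j)$ element is 1 and the rest of the elements are all zero (see [9, p. 125]). The proof for (45) and (46) is given in Appendix A.9.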

    Thus, to compute the ACM of MLEs of factor loadings and unique variances when one element of the MLEs of unique variances is nearly zero, we recommend using the partial derivatives of $h_{uv}$ with respect to $\lambda_{ir}$ and $\psi_j$ given in (45) and (46), in place of the partial derivatives of $g_{uv}$ with respect to $\lambda_{ir}$ and $\psi_j$ in (41) and (42). The ACM of MLEs of factor loadings and unique variances is given by the $p(m+1) \times p(m+1)$ submatrix corresponding to the first $p(m+1)$ rows and columns of the inverse of the augmented information matrix with $F_1$ and $F_2$ replaced by the partial derivatives of $h_{uv}$. We should note that the augmented information matrix approach with the partial derivatives of the alternative constraint functions can be used whether or not an element of the MLEs of unique variances is nearly zero.
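The procedure of this section can be sketched end to end as follows, assuming NumPy. This is an illustration under stated assumptions rather than the authors' code: population values stand in for estimates, the loadings are put in identified form via the Cholesky route of Eq. (16), and all function names and numbers are hypothetical. The sketch builds the information blocks (37)-(39), the constraint derivatives for both $g_{uv}$ (41)-(42) and $h_{uv}$ (45)-(46), and takes the leading block of the inverse of the augmented matrix (43).

```python
import numpy as np

def canonical_loadings(Sigma, psi, m):
    """Eq. (16)-style loadings (no Psi^{-1}); Lambda' Sigma^{-1} Lambda is diagonal."""
    L = np.linalg.cholesky(Sigma)
    Linv = np.linalg.inv(L)
    vals, vecs = np.linalg.eigh(Linv @ np.diag(psi) @ Linv.T)
    return (L @ vecs[:, :m]) * np.sqrt(1.0 - vals[:m])

def information_blocks(Lam, psi):
    p, m = Lam.shape
    Si = np.linalg.inv(Lam @ Lam.T + np.diag(psi))
    SL = Si @ Lam
    X = np.kron(Lam.T @ SL, Si) + np.einsum('is,jr->risj', SL, SL).reshape(p * m, p * m)
    Y = np.einsum('ij,jr->rij', Si, SL).reshape(p * m, p)
    return X, Y, 0.5 * Si ** 2

def jacobian_g(Lam, psi):
    """Derivatives of g_uv = (Lambda' Psi^{-1} Lambda)_uv, Eqs. (41)-(42)."""
    p, m = Lam.shape
    pairs = [(u, v) for u in range(m) for v in range(u + 1, m)]
    F1, F2 = np.zeros((p * m, len(pairs))), np.zeros((p, len(pairs)))
    for c, (u, v) in enumerate(pairs):
        F1[u * p:(u + 1) * p, c] += Lam[:, v] / psi
        F1[v * p:(v + 1) * p, c] += Lam[:, u] / psi
        F2[:, c] = -Lam[:, u] * Lam[:, v] / psi ** 2
    return F1, F2

def jacobian_h(Lam, psi):
    """Derivatives of h_uv = (Lambda' Sigma^{-1} Lambda)_uv, Eqs. (45)-(46)."""
    p, m = Lam.shape
    SL = np.linalg.inv(Lam @ Lam.T + np.diag(psi)) @ Lam
    pairs = [(u, v) for u in range(m) for v in range(u + 1, m)]
    F1, F2 = np.zeros((p * m, len(pairs))), np.zeros((p, len(pairs)))
    for r in range(m):
        for i in range(p):
            J = np.zeros((p, m))
            J[i, r] = 1.0
            D = J.T @ SL + SL.T @ J - SL.T @ (J @ Lam.T + Lam @ J.T) @ SL
            F1[r * p + i] = [D[u, v] for u, v in pairs]
    for j in range(p):
        D = -np.outer(SL[j], SL[j])
        F2[j] = [D[u, v] for u, v in pairs]
    return F1, F2

def acm_augmented(Lam, psi, n, F1, F2):
    """Leading p(m+1) block of the inverse of n times the matrix in (43)."""
    X, Y, Z = information_blocks(Lam, psi)
    info = np.block([[X, Y], [Y.T, Z]])
    F = np.vstack([F1, F2])
    q = F.shape[1]
    aug = np.block([[info, F], [F.T, np.zeros((q, q))]])
    k = info.shape[0]
    return np.linalg.inv(aug)[:k, :k] / n

# Made-up values (hypothetical): p = 6, m = 2, n = 200.
Lambda0 = np.array([[0.8, 0.1], [0.7, 0.2], [0.6, -0.3],
                    [0.5, 0.5], [0.3, 0.6], [0.2, 0.7]])
psi0 = np.array([0.35, 0.47, 0.55, 0.50, 0.55, 0.47])
n = 200

Lam = canonical_loadings(Lambda0 @ Lambda0.T + np.diag(psi0), psi0, m=2)
F1g, F2g = jacobian_g(Lam, psi0)
F1h, F2h = jacobian_h(Lam, psi0)
acm_g = acm_augmented(Lam, psi0, n, F1g, F2g)
acm_h = acm_augmented(Lam, psi0, n, F1h, F2h)

# Nearly singular case: the g-derivatives blow up, the h-derivatives stay moderate.
psi_near = psi0.copy()
psi_near[0] = 1e-10
Lam_near = canonical_loadings(Lambda0 @ Lambda0.T + np.diag(psi_near), psi_near, m=2)
F1g_near, F2g_near = jacobian_g(Lam_near, psi_near)
F1h_near, F2h_near = jacobian_h(Lam_near, psi_near)
acm_near = acm_augmented(Lam_near, psi_near, n, F1h_near, F2h_near)
```

In the regular case the two choices of constraint give the same leading block, since at a point where $\Lambda'\Psi^{-1}\Lambda$ is diagonal the derivatives of $h_{uv}$ are the derivatives of $g_{uv}$ divided by $(1+\gamma_u)(1+\gamma_v)$, with $\gamma_u$ the diagonal elements of $\Lambda'\Psi^{-1}\Lambda$; this follows from differentiating (44).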

    7. Conclusion

    In this paper, we dealt with the ACM of MLEs of (unstandardized, unrotated) factor loadings and unique variances when an element of MLEs of unique variances is nearly zero, that is, the matrix of MLEs of unique variances is nearly singular. The standard formulas for the ACM given by Lawley and Maxwell [18] involve the inverse of the unique variances. Thus, we encounter a problem when one of the MLEs of unique variances approaches zero, since the reciprocal of the MLE of this unique variance gets very large.

    We presented alternative formulas for the ACM which can be used even when an element of MLEs of unique variances is nearly zero. The derivation of the alternative formulas involved replacing the expressions in terms of the inverse of unique variances by the expressions in terms of the inverse of the covariance matrix, whose elements are all finite. The alternative formulas given in Sections 4 and 5 are exact asymptotic formulas, and they can be used whether or not an element of MLEs of unique variances is nearly zero. In this regard, we consider use of the alternative formulas to be more practical than the standard formulas.

    However, it should be noted that statistical instabilities that might arise in the case of unique variances near zero cannot be avoided just by using alternative formulas. For example, it is quite possible that the asymptotic standard error of the MLE of a unique variance that is near zero is a very bad approximation to the true standard error in small samples. (The authors thank the referee for noting this important point.) This is an area where further research is certainly needed. See also [5,15] on issues closely related to this point.

    Furthermore, we used alternative constraint functions $h_{uv} = (\Lambda'\Sigma^{-1}\Lambda)_{uv}$ instead of $g_{uv} = (\Lambda'\Psi^{-1}\Lambda)_{uv}$, in the context of the augmented information matrix approach, when an element of the MLE of unique variances is nearly zero. In fact, use of the constraint that $\Lambda'\Sigma^{-1}\Lambda$ is diagonal is not new; for example, it was mentioned by Swain [20]. However, to our knowledge, use of the constraint function $h_{uv} = (\Lambda'\Sigma^{-1}\Lambda)_{uv}$ in the context of the augmented information matrix approach for avoiding use of the inverse of unique variances, as well as the formulas for the partial derivatives of $h_{uv}$ with respect to $\lambda_{ir}$ and $\psi_j$ to be used inside the augmented information matrix, are new.

    The augmented information approach has an advantage in that it is applicable to other rotated solutions as well. In fact, only the matrices $F_1$ and $F_2$ of the partial derivatives of the constraint functions need to be modified for obtaining the standard errors for various rotated solutions. As long as the formulas for the constraint functions are not very complex, the partial derivatives can in general be obtained fairly easily.

    Appendix A

    A.1. Proof of Observation 1

    (i) By definition of the eigenvalue-eigenvector equation, $(U'\Psi U)Z = ZN$. Rearrange this equation to:

    \[ (\Psi^{-1/2}S\Psi^{-1/2})(\Psi^{1/2}UZN^{-1/2}) = (\Psi^{1/2}UZN^{-1/2})N^{-1}, \tag{A.1} \]

    and comparing (A.1) with the eigenvalue-eigenvector equation $(\Psi^{-1/2}S\Psi^{-1/2})\Omega = \Omega\Theta$ gives (14) and (15).

    (ii) Use Eq. (4.5) of Lawley and Maxwell [18]: $(S - \Sigma)\Sigma^{-1}\Lambda = 0$, and rearrange it as

    \[ (U'\Psi U)(U'\Lambda(\Lambda'S^{-1}\Lambda)^{-1/2}) = (U'\Lambda(\Lambda'S^{-1}\Lambda)^{-1/2})(I_m - \Lambda'S^{-1}\Lambda). \tag{A.2} \]

    (See, e.g., [12,19].) Comparing (A.2) with the eigenvalue-eigenvector equation $(U'\Psi U)Z = ZN$ gives $N = I_m - \Lambda'S^{-1}\Lambda$ and $Z = U'\Lambda(\Lambda'S^{-1}\Lambda)^{-1/2} = U'\Lambda(I_m - N)^{-1/2}$. Thus (16) follows.

    A.2. Proof of Observation 2

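    We omit the hat notation hereafter for simplicity. The $r$th largest eigenvalues $\theta_r$, $r = 2, \ldots, m$, of $\Psi^{-1/2}S\Psi^{-1/2}$, i.e., all but the largest eigenvalue, do not get very large as long as $\psi_j$ is the only unique variance which is nearly zero. The $(j, r)$ element of $\Lambda = \Psi^{1/2}\Omega(\Theta - I_m)^{1/2}$ in (1) is $\lambda_{jr} = \psi_j^{1/2}\omega_{jr}(\theta_r - 1)^{1/2}$. Let $\epsilon$ be a positive quantity very close to zero. When $r \neq 1$, $\psi_j = \epsilon$ with bounded $\omega_{jr}$ ($|\omega_{jr}| \le 1$) and finite (not very large) $\theta_r$ gives $\lambda_{jr} = \epsilon_r$, $r = 2, \ldots, m$, where the $\epsilon_r$ are quantities very close to zero. Thus, the $(j, j)$ element of $S \approx \Lambda\Lambda' + \Psi$ is $s_{jj} \approx \lambda_{j1}^2 + (\epsilon_2^2 + \cdots + \epsilon_m^2 + \epsilon)$, and $\lambda_{j1} \approx s_{jj}^{1/2}$ or $\lambda_{j1} \approx -s_{jj}^{1/2}$ follows.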

    A.3. Outline of proof of the equivalence of the standard elementwise expressions

    and the matrix expressions (4), (5), (7)(12)

    Fori, j= 1, . . . , p, andr, s=1, . . . , m, the standard elementwise formulas forAand B ( version: [13,18]) are

    air,jr= r ij 12r ir j r+ m

    k /=r(krk ik j k ) , (A.3)

    air,js= {r s (rs )2}is j r forr /=s, (A.4)

    bj,ir= j r (r1)12j

    ijj

    1

    2

    ir j r (r1)1 +r

    mk /=r

    (ik j k (rk )1)

    ,

    (A.5)

    where ris the rth element of the mm diagonal matrixwhich has as its elementsthe m largest eigenvalues of1/21/2; ij= 1 ifi= jand ij= 0 ifi /=j; andr andrk are defined in terms ofr s as follows:

    r=r

    r1, (A.6)

    rk= r1r

    k

    2

    1. (A.7)

    First, we show that (A.3) is an element of the first term in (4). Express (A.3) as

    air,jr= r (ij+m

    k=r (k

    rkik j k )), where

    rk= rk if k /=r , rk= 1/2 if


    k=r , and write the first term in (4) as M+(MM)(diag(Km))(Im

    )=

    (M

    Ip){

    Im

    +(Im

    )(Im

    M)(diag(vec()))(Im

    )}

    with

    =vec1(). The equivalence ofand rkis obvious by comparing (Im)2and (r1)2, and ImIm and rk. The rest is straightforward bynoting that MIp in (4) corresponds to r,Imcorresponds to ij, and(Im)(Im M)(diag(vec()))(Im)corresponds to

    mk=1(k

    rkik j k).

    Next, we show that (A.4) is an element of the second term in (4). First, notice that

    A1in (7) corresponds tois in (A.4). 1m1pincludes both the cases ofr= sandr /=s . To set the block diagonal elements which correspond to the case ofr= sequal to zero, we need to subtract a block diagonal matrix (diag())(Im1p1p)from 1m1p. Similarly,A1in (7) corresponds toj r in (A.4). Next,A2in (8)corresponds tor s /(rs )2 inair,js in (A.4) and it is easy to see that

    in (11)corresponds to(rs )2.

    It is obvious that the first term in B in (5) corresponds to j r (r1)12j inbj,ir in (A.5). Express

    1

    2

    ir j r (r1)1 +r

    mk /=r

    (ik j k (rk)1)

    as rm

    k=1ik j k brk, where

    brk= (rk )1 ifr /=k and brk= (2r (r1))1

    ifr= k . The equivalence ofb and brk is obvious by noting that ImIm corresponds to rk and (Im) corresponds to r (r1). The rest isstraightforward by noting that (1m)(diag(Kmmb))(Im)corresponds torm

    k=1(ik j k brk ).

    A.4. Alternative elementwise formulas forA andB ( version)

    For $i, j = 1, \ldots, p$ and $r, s = 1, \ldots, m$,

    air,jr= r

    ij

    1

    2

    r ir j r+

    m

    k /=r(krk ik j k )

    , (A.8)

    air,js= {r s (sr )2}is j r forr /=s, (A.9)

    bj,ir= r

    ps=1

    j s sr

    ij

    1

    2

    ir r

    ps=1

    j s sr

    +m

    k /=rik (kr )1

    ps=1

    j s sk

    , (A.10)

    wherer are the eigenvalues ofUU, andr and rk are defined in terms ofr s

    as follows:

    r= (1r )1, (A.11)

    rk

    =2k

    1rkr

    2

    1. (A.12)

    A.5. Proof of the equivalence of the matrix expressions ( version) and the

    alternative matrix expressions ( version)

    (i) To show the equivalence of (5) and (18):

    B= {2(Im)1 1p}

    {1

    m

    +(1

    m

    )(diag(K

    mb))(I

    m

    )}= {1(Im)1 1p}

    {1mIm+(1m1)(diag(Kmb))(Im)}= {1M1p}

    {1mIm+(1m1)(diag(Kmb))(Im)}= {1M1p}

    {1mIm+(1m11)()(diag(Kmb))(Im)}

    = {1

    M1p}{1mIm+(1m 1){diag(()Kmb)}(Im)}

    = {1M1p}{1mIm+(1m 1){diag(Km()b)}(Im)}

    = {1M1p}{1mIp+(1m 1)(diag(Kmb))(Im)},

    since

    1

    1 = 1, 1(Im)1 = 1M,

    and

    ()b=()(ImIm2 diag(vec((Im))))11m2=(1 ImIm1 2(11)diag(vec((Im))))11m2= [1 ImIm1 2diag{vec((Im)1)}]11m2=(NImImN2diag(vec(ImN )))11m2= b.

    (ii) To show the equivalence of (8) and (21):

    vec(

    )

    =()vec =()((ImIm)2)+1m2=((1 1)(ImIm)2)+1m2=(()(1 1)2(Im Im)2)+1m2=(1 1)((1 ImIm1)2)+1m2=(NN ){((NImImN )2)+1m2}=(NN )vecN

    =vec(NN

    N ).

    (iii) To show the equivalence of (9) and (22):

    vec((Im)2

    )

    =(Im(Im)2)((ImIm)2)+1m2= {{(Im(Im))1(Im Im)}2}+1m2= {{(Im(Im)1)(Im Im)}2}+1m2

    = {{Im

    (

    Im)

    1

    (

    Im)

    1

    }2

    }+1m2

    = {{Im(Im1)1 (Im1)11}2}+1m2= {{Im(ImN )1 N1 (ImN )1N}2}+1m2= {(N2 (ImN )2){NImImN}2}+1m2=(N2 (ImN )2)vecN

    =vec((ImN )2N

    N2).

    (iv) To show the equivalence of (10) and (23): Using $N = \Theta^{-1}$,

    $$M = \Theta(\Theta - I_m)^{-1} = N^{-1}(N^{-1} - I_m)^{-1} = (N N^{-1} - N)^{-1} = (I_m - N)^{-1}.$$

    (v) To show the equivalence of (13) and (26): Use the identity 1 = 1

    1

    1

    1, and subtract this from in (13):

    1 =1(1 (Im)1)1=1(1 (Im)1)1=1(Im)11=1M1,

    since

    1

    1 = 1

    and

    1

    (Im)1

    = 1

    (Im)1

    .
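The diagonal identity behind step (iv) can be checked numerically. The sketch below assumes that $\Theta$ is a diagonal matrix of eigenvalues greater than one and that $N = \Theta^{-1}$, so that $M = \Theta(\Theta - I_m)^{-1}$ and $(I_m - N)^{-1}$ can be compared entry by entry. The function names and the eigenvalues are hypothetical and chosen only for illustration.

```python
def m_from_theta(theta):
    """Diagonal of M = Theta (Theta - I)^{-1}, one entry per eigenvalue."""
    return [t / (t - 1.0) for t in theta]


def m_from_n(theta):
    """Diagonal of (I - N)^{-1}, where N = Theta^{-1} (entries 1 / theta_r)."""
    return [1.0 / (1.0 - 1.0 / t) for t in theta]


# Hypothetical eigenvalues; all must exceed 1 so that Theta - I is invertible.
theta = [12.5, 4.0, 1.8]

lhs = m_from_theta(theta)
rhs = m_from_n(theta)

# Largest absolute discrepancy between the two expressions for M.
max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
print(max_gap)
```

Because every matrix involved is diagonal, the matrix identity reduces to the scalar identity $\theta_r/(\theta_r - 1) = (1 - \theta_r^{-1})^{-1}$ for each $r$.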

    A.6. Proof of the expressions for the matrices of partial derivatives ofand (version)

    The likelihood equations are written in the form: $(\Sigma - S)\Psi^{-1}\Lambda = 0$, $\operatorname{diag}(\Sigma - S) = 0$, and $\operatorname{nondiag}(\Lambda'\Psi^{-1}\Lambda) = 0$, the differentials of which are

    (dd dd)1 =0,

    diag(dd dd)=0,

    and

    nondiag(d1 1 d1+1 d)=0,respectively, that is,

    d =(dd) d, (A.13)diag(dd dd)=0, (A.14)nondiag(d

    + d

    d)

    =0. (A.15)

    Vectorizing both sides of (A.13) and letting, db =vec(d), (Ip)(d)=(1 Ip)vec(dd)(Im)(db). Thus, letting b/=1+2,where 1 and 2 are the diagonal and the off-diagonal components of

    b/,respectively, (28) follows.

    Now, premultiplying (A.13) by gives d+ d= (dd). Forthe diagonal elements of d, 2 d= (dd), that is,

    d=

    1

    2

    1 (dd). (A.16)

    Vectorizing (A.16) gives

    db =

    1

    2

    vec(1)vec((dd))

    =

    1

    2

    vec(1)( )vec(dd)

    =

    1

    2

    diag(vec(1))( )vec(dd).

    Thus1 in (30) follows.

    Next, for the nondiagonal elements of d

    , inserting nondiag(

    d)

    =nondiag( dd) in (A.15) into nondiag( d+ d)=nondiag((dd))in (A.14) gives

    nondiag( dd)=nondiag( d d(+Im)). (A.17)

    The vectorized version of (A.17) is(Im Im)(db)=( )vec(d)((+Im) )vec(d),

    from which 2 in (31) follows. Finally, noting =0 and =0, (A.14) leadsto diag( dS)=diag( d). Then

    vdg(d)=()1vdg(d). (A.18)Vectorizing the diagonal matrix whose elements are (A.18) gives

    d =vec(d)=vec(diag(()1vdg((d))))

    =Kp()1

    Kpvec((d))=Kp()1Kp( )Gp(d ),

    from which (29) follows.

    A.7. Proof of the expressions for the matrices of partial derivatives ofand (version)

    The likelihood equations in the version are

    $(\Sigma - S)S^{-1}\Lambda = 0$, $\operatorname{diag}(\Sigma - S) = 0$, $\operatorname{nondiag}(\Lambda' S^{-1}\Lambda) = 0$,

    and the differentials are

    (dddd)1 =0,(A.14), and

    nondiag(d1 1 d1+1 d)=0,that is, (A.14) and

    dW= (dd)Z dZ, (A.19)

    nondiag(dZ+Z dZ dZ)=0. (A.20)Let d=dZ. Then vectorizing both sides of (A.19) leads to

    (WIp)(d)=(ZIp )vec(dd)(Im)(d),and (33) follows, whereY1 andY2 are the diagonal and off-diagonal components of

    /, respectively. For the diagonal components of dZ, premultiplication of(A.19) byZ leads to 2 diagWdZ=diag(Z(dd)Z), that is,

    dZ=( 12

    )W1 Z(dd)Z. (A.21)

    Vectorizing (A.21) gives

    d=(12 ){diag(vec(W

    1

    ))}(Z Z)vec(dd),from whichY1in (35) immediately follows.

    For the nondiagonal elements of dZ, inserting Zd =Z dZdZin (A.20) into (A.19) premultiplied byZ gives

    WdZdZW= Z dZ(ImW )Z dZ. (A.22)

    Vectorizing (A.22) leads to

    (ImWWIm)(d)=((ImW )ZZ)Gp(d )(ZZ)(d),

    from which Y2 in (36) follows. Now, let Q=IpW1Z. Then noting Q =0 and using (A.19), we have diag(Q dQ)=diag(Q dQ), that is, vdg(d)=(Q Q)1vdg(Q dQ). Thus

    d =vec(d)=vec(diag((QQ)1vdg(Q(d)Q)))=Kp(Q Q)1Kvec(Q(d)Q)=Kp(Q Q)1K(Q Q)Gp(d ),

    from which (34) results.

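    A.8. Proof of the equivalence of the three equations $(\Sigma - S)\Psi^{-1}\Lambda = 0$, $(\Sigma - S)S^{-1}\Lambda = 0$, and $(\Sigma - S)\Sigma^{-1}\Lambda = 0$ [19, Theorems 4.2 and 4.3]

    We assume the positive definiteness of $\Sigma$, $\Psi$, and $S$. Then

    $$(\Sigma - S)\Psi^{-1}\Lambda = 0 \iff (\Sigma - S)\Psi^{-1}\Lambda\Theta^{-1} = 0 \iff (\Sigma - S)\Sigma^{-1}\Lambda = 0 \quad (\text{used } \Psi^{-1}\Lambda\Theta^{-1} = \Sigma^{-1}\Lambda)$$

    $$\iff S\Sigma^{-1}\Lambda = \Lambda \iff \Lambda = \Sigma S^{-1}\Lambda \iff (\Sigma - S)S^{-1}\Lambda = 0.$$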
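The step that links the first two forms of the likelihood equation rests on the identity $\Sigma^{-1}\Lambda = \Psi^{-1}\Lambda(I_m + \Lambda'\Psi^{-1}\Lambda)^{-1}$, which holds whenever $\Sigma = \Lambda\Lambda' + \Psi$ with $\Psi$ positive definite. The following is a minimal numerical sketch of that step, assuming NumPy is available; the dimensions, the random loadings, and the variable names are illustrative assumptions and do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
p, m = 6, 2  # hypothetical numbers of variables and factors

Lam = rng.normal(size=(p, m))                  # factor loadings Lambda
Psi = np.diag(rng.uniform(0.5, 1.5, size=p))   # unique variances (diagonal)
Sigma = Lam @ Lam.T + Psi                      # model covariance matrix

Psi_inv = np.diag(1.0 / np.diag(Psi))
G = np.eye(m) + Lam.T @ Psi_inv @ Lam          # I_m + Lambda' Psi^{-1} Lambda

# Identity: Sigma^{-1} Lambda = Psi^{-1} Lambda G^{-1}
sigma_inv_lam = np.linalg.solve(Sigma, Lam)
psi_inv_lam_ginv = Psi_inv @ Lam @ np.linalg.inv(G)
identity_gap = np.max(np.abs(sigma_inv_lam - psi_inv_lam_ginv))

# For an arbitrary positive definite S, the two residual matrices differ only
# by right-multiplication with the invertible matrix G, so one vanishes
# exactly when the other does.
A = rng.normal(size=(p, p))
S = A @ A.T + p * np.eye(p)
r_sigma = (Sigma - S) @ sigma_inv_lam          # (Sigma - S) Sigma^{-1} Lambda
r_psi = (Sigma - S) @ Psi_inv @ Lam            # (Sigma - S) Psi^{-1} Lambda
link_gap = np.max(np.abs(r_sigma @ G - r_psi))

print(identity_gap, link_gap)
```

The check uses an arbitrary $S$ rather than a fitted solution, since the argument only needs the two residuals to be related by an invertible factor.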

    A.9. Proof of the expressions for (35) and (36)

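    Take the partial derivatives of $h = \Lambda'\Sigma^{-1}\Lambda$ with respect to $\lambda_{ir}$ and $\psi_j$:

    $$\frac{\partial h}{\partial \lambda_{ir}} = \frac{\partial \Lambda'}{\partial \lambda_{ir}}\Sigma^{-1}\Lambda + \Lambda'\frac{\partial \Sigma^{-1}}{\partial \lambda_{ir}}\Lambda + \Lambda'\Sigma^{-1}\frac{\partial \Lambda}{\partial \lambda_{ir}} = J_{ir}'\Sigma^{-1}\Lambda - \Lambda'\Sigma^{-1}\frac{\partial \Sigma}{\partial \lambda_{ir}}\Sigma^{-1}\Lambda + \Lambda'\Sigma^{-1}J_{ir}, \tag{A.23}$$

    $$\frac{\partial h}{\partial \psi_j} = \Lambda'\frac{\partial \Sigma^{-1}}{\partial \psi_j}\Lambda = -\Lambda'\Sigma^{-1}\frac{\partial \Sigma}{\partial \psi_j}\Sigma^{-1}\Lambda. \tag{A.24}$$

    Inserting the following partial derivatives of $\Sigma$ with respect to $\lambda_{ir}$ and $\psi_j$:

    $$\frac{\partial \Sigma}{\partial \lambda_{ir}} = \frac{\partial \Lambda}{\partial \lambda_{ir}}\Lambda' + \Lambda\frac{\partial \Lambda'}{\partial \lambda_{ir}} = J_{ir}\Lambda' + \Lambda J_{ir}', \qquad \frac{\partial \Sigma}{\partial \psi_j} = K_{jj},$$

    into (A.23) and (A.24), the results follow by taking the $(u, v)$ element.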
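The derivative expressions above can be sanity-checked against central finite differences. This is a sketch under stated assumptions: the constraint function is taken to be $h = \Lambda'\Sigma^{-1}\Lambda$ with $\Sigma = \Lambda\Lambda' + \Psi$, $J_{ir}$ is the $p \times m$ matrix with a one in position $(i, r)$ and zeros elsewhere, and $K_{jj}$ is the $p \times p$ matrix with a one in position $(j, j)$. The function names, dimensions, and random inputs are hypothetical.

```python
import numpy as np


def h_constraint(Lam, psi):
    """h = Lambda' Sigma^{-1} Lambda with Sigma = Lambda Lambda' + diag(psi)."""
    Sigma = Lam @ Lam.T + np.diag(psi)
    return Lam.T @ np.linalg.solve(Sigma, Lam)


def dh_dlambda(Lam, psi, i, r):
    """Analytic derivative of h with respect to lambda_ir, as in (A.23)."""
    p, m = Lam.shape
    Sinv = np.linalg.inv(Lam @ Lam.T + np.diag(psi))
    J = np.zeros((p, m))
    J[i, r] = 1.0
    dSigma = J @ Lam.T + Lam @ J.T  # derivative of Sigma w.r.t. lambda_ir
    return J.T @ Sinv @ Lam - Lam.T @ Sinv @ dSigma @ Sinv @ Lam + Lam.T @ Sinv @ J


def dh_dpsi(Lam, psi, j):
    """Analytic derivative of h with respect to psi_j, as in (A.24)."""
    p = Lam.shape[0]
    Sinv = np.linalg.inv(Lam @ Lam.T + np.diag(psi))
    K = np.zeros((p, p))
    K[j, j] = 1.0                   # derivative of Sigma w.r.t. psi_j
    return -Lam.T @ Sinv @ K @ Sinv @ Lam


rng = np.random.default_rng(1)
Lam = rng.normal(size=(5, 2))
psi = rng.uniform(0.5, 1.5, size=5)
i, r, j, eps = 3, 1, 2, 1e-6

# Central differences in lambda_ir.
Lp, Lm = Lam.copy(), Lam.copy()
Lp[i, r] += eps
Lm[i, r] -= eps
fd_lam = (h_constraint(Lp, psi) - h_constraint(Lm, psi)) / (2 * eps)

# Central differences in psi_j.
pp, pm = psi.copy(), psi.copy()
pp[j] += eps
pm[j] -= eps
fd_psi = (h_constraint(Lam, pp) - h_constraint(Lam, pm)) / (2 * eps)

err_lam = np.max(np.abs(fd_lam - dh_dlambda(Lam, psi, i, r)))
err_psi = np.max(np.abs(fd_psi - dh_dpsi(Lam, psi, j)))
print(err_lam, err_psi)
```

Each analytic derivative is an $m \times m$ matrix; its $(u, v)$ entry is the quantity that the elementwise expressions refer to.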

    Acknowledgement

    The authors thank the referee for very helpful comments.

    References

    [1] T.W. Anderson, Estimating linear statistical relationships, Ann. Stat. 12 (1984) 1-45.

    [2] T.W. Anderson, Y. Amemiya, The asymptotic normal distribution of estimators in factor analysis under general conditions, Ann. Stat. 16 (1988) 759-771.

    [3] T.W. Anderson, H. Rubin, Statistical inference in factor analysis, in: Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, Berkeley, vol. 5, 1956, pp. 111-150.

    [4] P.M. Bentler, K.-H. Yuan, Optimal conditionally unbiased equivalent factor score estimators, in: M. Berkane (Ed.), Latent Variables with Applications in Causality, Springer, New York, 1997, pp. 259-281.

    [5] D.W. Gerbing, J.C. Anderson, Improper solutions in the analysis of covariance structures: their interpretability and a comparison of alternative respecifications, Psychometrika 52 (1987) 99-111.

    [6] T.K. Dijkstra, On statistical inference with parameter estimates on the boundary of the parameter space, British J. Math. Stat. Psychol. 45 (1992) 289-309.

    [7] K. Hayashi, P.K. Sen, On covariance estimators of factor loadings in factor analysis, J. Multivar. Anal. 66 (1998) 38-45.

    [8] M. Ihara, Y. Kano, Asymptotic equivalence of unique variance estimators in marginal and conditional factor analysis models, Stat. Probab. Lett. 14 (1992) 337-341.

    [9] R.I. Jennrich, Simplified formulae for standard errors in maximum-likelihood factor analysis, British J. Math. Stat. Psychol. 27 (1974) 122-131.

    [10] R.I. Jennrich, A Gauss-Newton algorithm for exploratory factor analysis, Psychometrika 51 (1986) 277-284.

    [11] R.I. Jennrich, D.B. Clarkson, A feasible method for standard errors of estimate in maximum likelihood factor analysis, Psychometrika 45 (1980) 237-247.

    [12] R.I. Jennrich, S.M. Robinson, A Newton-Raphson algorithm for maximum likelihood factor analysis, Psychometrika 34 (1969) 111-123.

    [13] R.I. Jennrich, D.T. Thayer, A note on Lawley's formulas for standard errors in maximum likelihood factor analysis, Psychometrika 38 (1973) 571-580.

    [14] K.G. Jöreskog, Some contributions to maximum likelihood factor analysis, Psychometrika 32 (1967) 443-482.

    [15] Y. Kano, Causes and treatment of improper solutions: exploratory factor analysis, Bulletin, vol. 24, The Faculty of Human Sciences, Osaka University, 1998 (in Japanese).

    [16] D.N. Lawley, Some new results in maximum likelihood factor analysis, in: Proceedings of the Royal Society of Edinburgh, Section A, vol. 67, 1967, pp. 256-264.

    [17] D.N. Lawley, The inversion of an augmented information matrix occurring in factor analysis, in: Proceedings of the Royal Society of Edinburgh, Section A, 1976, pp. 171-178.

    [18] D.N. Lawley, A.E. Maxwell, Factor Analysis as a Statistical Method, second ed., Elsevier, New York, 1971.

    [19] M. Okamoto, Foundations of Factor Analysis, Nikka-Giren, Tokyo, 1986 (in Japanese).

    [20] A.J. Swain, A class of factor analysis estimation procedures with common asymptotic sampling properties, Psychometrika 40 (1975) 315-335.

    [21] O.P. van Driel, On various causes of improper solutions in maximum likelihood factor analysis, Psychometrika 43 (1978) 225-243.

    [22] H. Yanai, K. Shigemasu, S. Maekawa, M. Ichikawa, Factor Analysis, Asakura-shoten, Tokyo, 1990 (in Japanese).