Linear Algebra and its Applications 321 (2000) 153–173
www.elsevier.com/locate/laa
The asymptotic covariance matrix of maximum-likelihood estimates in factor analysis: the case of nearly singular matrix of estimates of unique variances
Kentaro Hayashi (a), Peter M. Bentler (b)
(a) Department of Psychology, Utah State University, Logan, UT 84322-2810, USA
(b) Departments of Psychology and Statistics, University of California, P.O. Box 951563, Los Angeles, CA 90095-1563, USA
Received 26 March 1998; accepted 3 July 2000
Submitted by H.J. Werner
Abstract
This paper is concerned with the asymptotic covariance matrix (ACM) of maximum-likelihood estimates (MLEs) of factor loadings and unique variances when one element of MLEs of unique variances is nearly zero, i.e., the matrix of MLEs of unique variances is nearly singular. In this situation, standard formulas break down. We give explicit formulas for the ACM of MLEs of factor loadings and unique variances that could be used even when an element of MLEs of unique variances is very close to zero. We also discuss an alternative approach using the augmented information matrix under a nearly singular matrix of MLEs of unique variances and derive the partial derivatives of the alternative constraint functions with respect to the elements of factor loadings and unique variances. © 2000 Elsevier Science Inc. All rights reserved.
AMS classification: 62H25; 62F12
Keywords: Augmented information matrix; Factor loadings; Heywood case; Standard errors
This work was supported by National Institute on Drug Abuse grant DA01070.
Corresponding author. Tel.: +1-310-825-2893; fax: +1-310-206-4315.
E-mail addresses: [email protected] (K. Hayashi), [email protected] (P.M. Bentler).
0024-3795/00/$ - see front matter © 2000 Elsevier Science Inc. All rights reserved.
PII: S0024-3795(00)00222-6
1. Introduction
This paper is concerned with the asymptotic covariance matrix (ACM) of maximum-likelihood estimates (MLEs) of factor loadings and unique variances when one element of MLEs of unique variances is nearly zero, i.e., when the MLE of the matrix of unique variances is nearly singular. It has been known for a long time that this situation occurs quite frequently in practice, e.g., Jöreskog [14] noted that "…improper solutions are quite frequent. Out of the 11 sets of data considered only two sets (Data 1 and 5) have proper solutions for all values of $k_0$ (the number of factors). This is a most remarkable result" (p. 473). While he developed an effective computational method to yield parameter estimates in spite of this problem, he did not provide standard error formulas that could be applied in this circumstance.
The formulas for the ACM of MLEs of factor loadings and unique variances for the regular case were obtained by Lawley [16], and they were systematically presented in [18]. There are some mistakes in the formulas presented by Lawley and Maxwell [18] and the mistakes were corrected by Jennrich and Thayer [13]. Jennrich [9] and Lawley [17] introduced the augmented information matrix approach which simplifies the process of obtaining the ACM of MLEs of factor loadings and unique variances.
The formulas presented in [13,18] involve a reciprocal of unique variances. When one of the MLEs of unique variances approaches zero, the reciprocal diverges to plus infinity. As a result, the estimated ACM computed using the standard formulas in [13,18] breaks down as one element of the MLEs of unique variances gets very close to zero. Thus, it is desirable to come up with alternative formulas for the ACM of MLEs of factor loadings and unique variances which do not involve a reciprocal of unique variances. Furthermore, the standard formulas are likely to run into computational instabilities if some of the unique variances are very small, and rounding errors can result in disastrous outcomes of the computations. The alternative methods, on the other hand, should be more stable, avoiding problems with rounding errors. The purpose of this paper is to give such alternative formulas and also an alternative procedure to obtain the ACM that can be used even when the MLE of the matrix of unique variances is nearly singular.
Jennrich and Clarkson [11] developed a method to compute approximate standard errors which makes use of a jackknife-like procedure. Their paper dealt with the Heywood case (i.e., case with a zero value of MLE of a unique variance) problem by expressing elements of the differentials of factor loadings without the inverse of unique variances. In this paper, we present exact (both standard and alternative) formulas for the ACM of MLEs of factor loadings and unique variances by making use of the differentials obtained by Jennrich and Clarkson [11].
In addition to the formulas mentioned above, we also provide another approach to compute the ACM of MLEs of factor loadings and unique variances using the augmented information matrix under a nearly singular matrix of MLEs of unique variances. The partial derivatives of the constraint functions given by Jennrich [9] involve reciprocals of elements of unique variances, so use of the standard partial derivatives creates a problem as one of the elements of MLEs of unique variances approaches zero. We use an alternative constraint function which does not involve reciprocals of unique variances, and derive the partial derivatives of the alternative constraint functions with respect to the elements of factor loadings and unique variances to implement the augmented information matrix approach.
We first introduce the formulas for MLEs of factor loadings and unique variances
and the ACM for the regular case in Section 2. An alternative formula for MLEs of
factor loadings is reviewed in Section 3, along with a discussion of several issues
associated with the case of a nearly singular matrix of MLEs of unique variances.
Then, in Section 4, we present our alternative formulas for the ACM of MLEs of
factor loadings and unique variances. Section 5 presents both the standard and alternative formulas derived from the differentials given by Jennrich and Clarkson [11].
The augmented information matrix approach with an alternative constraint function
and the partial derivatives follows in Section 6. Section 7 is a brief conclusion.
2. The ACM of MLEs of factor loadings and unique variances: $\Psi$ version
Let $x_i$, $i = 1, \ldots, n$, be a $p \times 1$ random vector of observations with the mean vector 0 and the covariance matrix $\Sigma$, $\Lambda$ be a $p \times m$ matrix of factor loadings, $f_i$ be an $m \times 1$ vector of the common factors, and $\varepsilon_i$ be a $p \times 1$ vector of the unique factors. The factor analysis model is given by $x_i = \Lambda f_i + \varepsilon_i$ with $E(f_i) = 0$, $\mathrm{Cov}(f_i) = I_m$, $E(\varepsilon_i) = 0$, $\mathrm{Cov}(\varepsilon_i) = \Psi$, $\mathrm{Cov}(f_i, \varepsilon_i) = 0$, where $\Psi$ is a positive definite (or positive semidefinite) diagonal matrix. Then the covariance matrix is expressed as $\Sigma = \Lambda\Lambda' + \Psi$. In MLE, we further assume that the $x_i$'s are random samples from a multivariate normal population with mean vector 0 and covariance matrix $\Sigma$, and the constraint that $\Lambda'\Psi^{-1}\Lambda$ is diagonal is imposed for $\Lambda$ to be identified. It is known that the MLEs of $\Lambda$ and $\Psi$ are obtained by solving the following two equations:
\[ \Lambda = \Psi^{1/2}\,\Omega\,(\Theta - I_m)^{1/2}, \tag{1} \]
\[ \Psi = \mathrm{diag}(S - \Lambda\Lambda'), \tag{2} \]
where $\Theta$ is an $m \times m$ diagonal matrix whose elements are the first $m$ largest eigenvalues of $\Psi^{-1/2} S \Psi^{-1/2}$, $\Omega$ a $p \times m$ matrix whose columns are normalized (i.e., $\Omega'\Omega = I_m$) eigenvectors corresponding to the first $m$ largest eigenvalues of $\Psi^{-1/2} S \Psi^{-1/2}$, $\mathrm{diag}(A)$ denotes the diagonal matrix whose elements are the diagonal elements of the square matrix $A$, and $S$ is the sample covariance matrix.
Let $\hat\Lambda$ and $\hat\Psi$ be the MLEs of $\Lambda$ and $\Psi$, $\hat\lambda = \mathrm{vec}(\hat\Lambda)$ and $\hat\psi = \mathrm{vdg}(\hat\Psi)$, where $\mathrm{vec}(\hat\Lambda)$ denotes the $pm$-vector listing $m$ columns of the $p \times m$ matrix $\hat\Lambda$ starting from the first column, and $\mathrm{vdg}(\hat\Psi)$ denotes the diagonal elements of $\hat\Psi$ arranged as a $p$-vector. Anderson and Rubin [3] established the asymptotic multinormality of $\sqrt{n}(\hat\lambda - \lambda)$ and $\sqrt{n}(\hat\psi - \psi)$ under the following three assumptions: (i) $\Phi \odot \Phi$ is nonsingular, i.e., the determinant $|\Phi \odot \Phi| \neq 0$, where $\Phi$ is defined in (13), and $\odot$ is the Hadamard product; (ii) $\Lambda$ and $\Psi$ are identified by the constraint that $\Lambda'\Psi^{-1}\Lambda$ is diagonal and the diagonal elements are different and ordered; (iii) the sample covariance matrix $S$ converges to $\Sigma$ in probability, and $\sqrt{n}(S - \Sigma)$ has an asymptotic multinormal distribution. (Actually, the joint asymptotic multinormality of $\sqrt{n}((\hat\lambda - \lambda)', (\hat\psi - \psi)')'$ also holds. See, e.g., [2].)
We now present the standard formulas for the ACM of MLEs of factor loadings and unique variances. The original elementwise formulas were given by Lawley and Maxwell [18], with correction by Jennrich and Thayer [13]. The formulas that we give are a matrix version of the identical results, which is slightly modified from the matrix results given by Hayashi and Sen [7]. (Note: The proof for the equivalence between the elementwise formulas and the matrix formulas for $A$ and $B$ is given in Appendix A.3. The matrix formulas for $E$ and $\Phi$ were given by Lawley and Maxwell [18].) The formulas for the ACM are:
\[ \begin{pmatrix} \mathrm{Var}(\hat\lambda) & \mathrm{Cov}(\hat\lambda, \hat\psi) \\ \mathrm{Cov}(\hat\psi, \hat\lambda) & \mathrm{Var}(\hat\psi) \end{pmatrix} = \frac{1}{n} \begin{pmatrix} A + 2B'EB & 2B'E \\ 2EB & 2E \end{pmatrix}, \tag{3} \]
where $A$ (of order $pm \times pm$), $B$ (of order $p \times pm$), and $E$ (of order $p \times p$) are as follows:
A= {M+(MM)(diag(Km))(Im)} {A1A1 A2}, (4)
B= {2
(Im)1
1p}{1m+(1m )(diag(Kmb))(Im)}, (5)
\[ E = (\Phi \odot \Phi)^{-1}, \tag{6} \]
with
A1=1m1p(diag())(Im1p1p), (7)
A2= 1p1p, (8)
=vec((Im)2 1m1m+
12
Im), (9)
\[ M = \Theta(\Theta - I_m)^{-1}, \tag{10} \]
=vec1(((ImIm)2)+1m2 ), (11)
b =(Im Im2 diag(vec((Im))))11m2 , (12)
\[ \Phi = \Psi^{-1} - \Psi^{-1}\Lambda(\Theta - I_m)^{-1}\Lambda'\Psi^{-1}, \tag{13} \]
where $\Theta$ is an $m \times m$ diagonal matrix whose elements are the first $m$ largest eigenvalues of $\Psi^{-1/2}\Sigma\Psi^{-1/2}$, $\mathrm{vec}^{-1}$ the inverse operation of vec, i.e., $\mathrm{vec}(\Gamma) = ((\Theta \otimes I_m - I_m \otimes \Theta)^2)^{+} 1_{m^2}$ with $\Gamma$ being an $m \times m$ matrix, $^{+}$ in (11) the Moore–Penrose inverse, $\otimes$ the Kronecker product, $K_m$ the $m^2 \times m^2$ commutation matrix defined such that $K_m \mathrm{vec}(G) = \mathrm{vec}(G')$ for any $m \times m$ matrix $G$, $\mathrm{diag}(z)$ denotes the diagonal matrix whose diagonal elements are the vector $z$, $I_m$ is the $m$-dimensional identity matrix, and $1_m$ is the $m$-vector whose elements are all 1s.
Note that Eqs. (5) and (13) involve inverses of unique variances. Also, the estimate of the (1, 1) element of $\Theta$, i.e., the estimate of the largest eigenvalue of $\Psi^{-1/2}\Sigma\Psi^{-1/2}$, gets very large as one of the MLEs of unique variances approaches zero. Thus, Eqs. (8)–(12) also need to be modified.
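The divergence described here is easy to see numerically. The following is a minimal sketch, assuming a hypothetical one-factor model with $p = 4$ (the loadings and unique variances are made up for illustration and are not from the paper). It tracks $1/\psi_1$, the largest eigenvalue $\theta_1$ of $\Psi^{-1/2}\Sigma\Psi^{-1/2}$, and the largest absolute element of $\Sigma^{-1}$ as $\psi_1$ shrinks.

```python
import numpy as np

# Hypothetical one-factor loadings (p = 4, m = 1); illustrative values only.
lam = np.array([[0.9], [0.6], [0.5], [0.4]])

def summary(psi1):
    """Return (1/psi_1, theta_1, max |element of Sigma^{-1}|) for a given psi_1."""
    psi = np.array([psi1, 0.64, 0.75, 0.84])       # unique variances
    sigma = lam @ lam.T + np.diag(psi)             # Sigma = Lambda Lambda' + Psi
    psi_inv_half = np.diag(psi ** -0.5)
    # theta_1 is the largest eigenvalue of Psi^{-1/2} Sigma Psi^{-1/2}
    theta1 = np.linalg.eigvalsh(psi_inv_half @ sigma @ psi_inv_half)[-1]
    sigma_inv_max = np.abs(np.linalg.inv(sigma)).max()
    return 1.0 / psi1, theta1, sigma_inv_max

for psi1 in (1e-1, 1e-3, 1e-6):
    recip, theta1, sinv = summary(psi1)
    print(f"psi_1={psi1:.0e}  1/psi_1={recip:.1e}  "
          f"theta_1={theta1:.3e}  max|Sigma^-1|={sinv:.3f}")
```

In this example both $1/\psi_1$ and $\theta_1$ grow without bound while the elements of $\Sigma^{-1}$ stay of order one, which is the property the alternative formulas exploit.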
3. The case with a nearly singular matrix of MLEs of unique variances
We now discuss the case where one element of the matrix of MLEs of unique variances is very close to zero, i.e., $\hat\Psi$ is nearly singular. (We deal only with exploratory factor analysis in this paper. For confirmatory factor analysis, see, e.g., [6].) In this section, we discuss several issues associated with a nearly singular matrix of MLEs of unique variances.
First, note that the assumption of positive semidefiniteness of $\Psi$, i.e., $\{\Psi : \Psi \geq 0\}$, is actually $\{\Psi : 0 \leq \Psi \leq \Sigma\}$, since $\Lambda\Lambda'$ is also positive semidefinite. We further assume that the true parameter values $\Psi_0$ of unique variances lie in the interior of the parameter space $\{\Psi : 0 \leq \Psi \leq \Sigma\}$; thus $\Psi_0$ is nonsingular. (One of the regularity conditions for the asymptotic normality of MLEs requires that the neighborhood around the true parameter values needs to be inside the parameter space. See, e.g., assumption (vi) of Theorem 2 of Anderson and Amemiya [2, p. 764].) The assumption of nonsingularity of $\Psi_0$ implies that the probability of obtaining a nonsingular $\hat\Psi$ (i.e., $\hat\Psi$ in the interior of $\{\Psi : 0 \leq \Psi \leq \Sigma\}$) approaches unity as sample size increases.
Second, the MLEs of factor loadings can be computed even when one element of MLEs of unique variances is exactly zero, by using the eigenvalues $\nu_r$ of $U\Psi U'$, where $U$ is the Cholesky factor of $S^{-1}$ (i.e., $S^{-1} = U'U$), instead of using the eigenvalues $\theta_r$ of $\Psi^{-1/2}S\Psi^{-1/2}$. We will state this as Observation 1, as follows.
Observation 1 [12, 21, 22]. (i) Assume that $S$ is positive definite. Let $\nu_r$ be the eigenvalues of $U\Psi U'$ with $S^{-1} = U'U$, where $U$ is an upper triangular matrix with positive diagonal elements obtained by Cholesky decomposition of $S^{-1}$, and let $\zeta_r$ be the normalized eigenvectors corresponding to $\nu_r$. Let $N$ be an $m \times m$ diagonal matrix whose elements are the $m$ smallest eigenvalues $\nu_1, \ldots, \nu_m$ ($\nu_1$
\[ \Omega = \Psi^{1/2} U' Z N^{-1/2}. \tag{15} \]
(ii)The matrix of factor loadings is computed without and,as follows:
\[ \Lambda = U^{-1} Z (I_m - N)^{1/2}. \tag{16} \]
Thus, alternatively, the MLEs of $\Lambda$ and $\Psi$ are obtained by solving Eqs. (16) and (2), instead of (1) and (2).
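See Appendix A.1 for the proof of Observation 1. Note that Eq. (1) is derived from $(S - \Sigma)\Psi^{-1}\Lambda = 0$, while Eq. (16) is derived from $(S - \Sigma)S^{-1}\Lambda = 0$. In fact, the three equations $(S - \Sigma)\Psi^{-1}\Lambda = 0$, $(S - \Sigma)S^{-1}\Lambda = 0$, and $(S - \Sigma)\Sigma^{-1}\Lambda = 0$ are equivalent, except that positive definiteness of $\Psi$, $S$, and $\Sigma$ are assumed, respectively [19, Theorems 4.2 and 4.3]. For example, equivalence of $(S - \Sigma)\Psi^{-1}\Lambda = 0$ and $(S - \Sigma)\Sigma^{-1}\Lambda = 0$ can be shown by making use of the identity $\Psi^{-1}\Lambda\Theta^{-1} = \Sigma^{-1}\Lambda$ [18, Eq. (4.7)] to convert one to the other (see Appendix A.8).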
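The two computational routes can be compared numerically. The sketch below is illustrative only: it assumes a hypothetical two-factor model and sets $S$ equal to a model-implied covariance matrix, so that the loadings recovered from Eq. (1) and from Eq. (16) should both reproduce $\Lambda\Lambda'$. The function names and numerical values are made up for this example. NumPy's `cholesky` returns a lower-triangular factor, so its transpose is used as the upper-triangular $U$ with $S^{-1} = U'U$.

```python
import numpy as np

def loadings_psi_route(S, psi, m):
    """Eq. (1): Lambda = Psi^{1/2} Omega (Theta - I_m)^{1/2}; requires all psi_j > 0."""
    d = psi ** -0.5
    theta, omega = np.linalg.eigh(d[:, None] * S * d[None, :])   # ascending order
    theta, omega = theta[-m:][::-1], omega[:, -m:][:, ::-1]      # m largest
    return np.sqrt(psi)[:, None] * omega * np.sqrt(theta - 1.0)

def loadings_cholesky_route(S, psi, m):
    """Eq. (16): Lambda = U^{-1} Z (I_m - N)^{1/2}; no reciprocal of psi is used."""
    U = np.linalg.cholesky(np.linalg.inv(S)).T       # S^{-1} = U'U, U upper triangular
    nu, Z = np.linalg.eigh(U @ np.diag(psi) @ U.T)   # ascending order
    nu, Z = nu[:m], Z[:, :m]                         # m smallest eigenvalues
    return np.linalg.solve(U, Z) * np.sqrt(1.0 - nu)

# Hypothetical loadings and unique variances (p = 5, m = 2).
lam0 = np.array([[0.9, 0.0], [0.7, 0.3], [0.6, -0.4], [0.2, 0.8], [0.1, 0.7]])
psi = np.array([0.19, 0.42, 0.48, 0.32, 0.50])
S = lam0 @ lam0.T + np.diag(psi)

L_psi = loadings_psi_route(S, psi, 2)
L_chol = loadings_cholesky_route(S, psi, 2)

# Exact Heywood case: psi_1 = 0. Eq. (1) would divide by zero; Eq. (16) still works.
psi_h = psi.copy()
psi_h[0] = 0.0
L_heywood = loadings_cholesky_route(lam0 @ lam0.T + np.diag(psi_h), psi_h, 2)

print(np.abs(L_psi @ L_psi.T - L_chol @ L_chol.T).max())
print(np.abs(L_heywood @ L_heywood.T - lam0 @ lam0.T).max())
```

Only the products $\Lambda\Lambda'$ are compared, because eigenvector signs are arbitrary and individual columns may differ by a sign between the two routes.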
Third, although the MLEs of factor loadings can be computed even when one element of MLEs of unique variances is exactly zero, caution has to be taken in interpreting a zero value of $\hat\psi_j$ in the same way as a strictly positive value of $\hat\psi_j$. Even if a zero value of an estimate is an MLE in the sense that the log-likelihood function is maximized at that value, it is not a stationary point of the likelihood equation in the interior of the parameter space. Thus, we restrict ourselves to dealing with the ACM of MLEs of factor loadings under a nearly singular, but not a strictly singular, matrix of MLEs of unique variances in this paper. (See e.g., [5,15] for attempts to include a strictly singular matrix of MLEs of unique variances as long as we are confident in our assumption that $\Psi_0$ is nonsingular. However, further studies are needed on whether the formulas given below still approximate well the true ACM in such a case.)
Fourth, as the $j$th diagonal element $\hat\psi_j$ of the matrix of MLEs of unique variances gets closer and closer to zero, the matrix of MLEs of factor loadings follows a specific pattern: the $j$th row of the matrix of MLEs of factor loadings approaches zero, except for the $(j, 1)$ element, which gets close to the square root of the $j$th diagonal element of the sample covariance matrix (except for the sign change). We state this as the following observation.
Observation 2. If $\hat\psi_j \approx 0$ (with the rest of the MLEs of unique variances not nearly zero), then $\hat\lambda_{j1} \approx s_{jj}^{1/2}$ (or $\hat\lambda_{j1} \approx -s_{jj}^{1/2}$) and $\hat\lambda_{jr} \approx 0$, $r = 2, \ldots, m$.
The proof is given in Appendix A.2. This result must have been known by Jöreskog [14], who described the phenomenon (e.g., in his Table 5) and developed a partialing procedure to yield estimates of $\lambda_{jr} = \varepsilon_r$ with the exact property that $\varepsilon_r = 0$.
4. The ACM of MLEs of factor loadings and unique variances: $\Sigma$ version
We now give alternative formulas for the asymptotic covariance matrix of MLEs of factor loadings and unique variances which can be used even when one element of estimated unique variances is very close to zero. As $\psi_j$ approaches zero, Eqs. (5) and (13) become unstable (because they involve $\psi_j^{-1}$, which diverges). Thus it is necessary to replace these equations with alternative formulas that do not involve $\psi_j^{-1}$ (see e.g., [4]). Our approach is motivated by theirs. In addition, the (1, 1) element of $\Theta$ also gets very large as $\psi_j$ approaches zero. Thus, Eqs. (8)–(12) also need to be modified. Our modified formulas are as follows:
A= {M+(MM)(diag(Km))(Im)}{A1A1A2}, (17)
B= {1M1p}{1mIp+(1m 1)(diag(Kmb))(Im)}, (18)
\[ E = (\Phi \odot \Phi)^{-1}, \tag{19} \]
with
A1=1m1p(diag())(Im1p1p), (20)
A2=N N
N1p1p, (21)
=vec((ImN )2N
N2 1m1m+( 12 )Im), (22)
\[ M = (I_m - N)^{-1}, \tag{23} \]
N
=vec1
(((N ImImN )2
)+1m2 ), (24)
b =(NImImN2 diag(vec(ImN )))11m2 , (25)
\[ \Phi = \Sigma^{-1} - \Sigma^{-1}\Lambda M \Lambda'\Sigma^{-1}, \tag{26} \]
where $N$ is an $m \times m$ diagonal matrix whose elements are the $m$ smallest eigenvalues (in ascending order) of $U\Psi U'$ with $\Sigma^{-1} = U'U$, where $U$ is an upper triangular matrix with positive diagonal elements obtained by the Cholesky decomposition of $\Sigma^{-1}$. (The elementwise formulas corresponding to (17) and (18) are given in Appendix A.4.)
It is easy to show the equivalence between Eqs. (4)–(13) and Eqs. (17)–(26), by noting the identities $N = \Theta^{-1}$, $\Psi^{-1}\Lambda\Theta^{-1} = \Sigma^{-1}\Lambda$, and $\Psi^{-1}\Lambda(\Theta - I_m)^{-1} = \Sigma^{-1}\Lambda M$. See Appendix A.5 for the proof of the equivalence. As before, we assume that the determinant $|\Phi \odot \Phi| \neq 0$ and thus $\Phi \odot \Phi$ is nonsingular, so that we can compute $E$ in (19) using $\Phi$ in (26). ($\Phi$ in itself is in general not of full rank, but of rank $p - m$, see, e.g., [1, p. 23].)
In conclusion, the ACM of MLEs of factor loadings and unique variances can be computed using Eqs. (3), (17)–(26), in place of Eqs. (3), (4)–(13), including when one element of the MLEs of unique variances is nearly zero.
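As a numerical sanity check on part of this equivalence, the sketch below compares $M$ and $\Phi$ computed from the formulas involving $\Psi^{-1}$ (Eqs. (10) and (13)) with the versions involving only $\Sigma^{-1}$ (Eqs. (23) and (26)). It assumes a hypothetical two-factor population with made-up values, rotated so that $\Lambda'\Psi^{-1}\Lambda$ is diagonal. It does not cover the matrices $A$ and $B$.

```python
import numpy as np

# Hypothetical population values (p = 6, m = 2); not taken from the paper.
lam0 = np.array([[0.8, 0.1], [0.7, 0.3], [0.6, -0.2],
                 [0.2, 0.7], [0.1, 0.6], [0.3, 0.5]])
psi = np.array([0.35, 0.40, 0.60, 0.45, 0.60, 0.65])
p, m = lam0.shape
sigma = lam0 @ lam0.T + np.diag(psi)
sigma_inv = np.linalg.inv(sigma)

# Rotate so that Lambda' Psi^{-1} Lambda is diagonal with decreasing diagonal.
_, R = np.linalg.eigh(lam0.T @ (lam0 / psi[:, None]))
lam = lam0 @ R[:, ::-1]

# Version with Psi^{-1}: Theta = I_m + Lambda' Psi^{-1} Lambda, Eqs. (10) and (13).
psi_inv = np.diag(1.0 / psi)
theta = np.eye(m) + np.diag(np.diag(lam.T @ psi_inv @ lam))
theta_minus_inv = np.linalg.inv(theta - np.eye(m))
M_psi = theta @ theta_minus_inv
phi_psi = psi_inv - psi_inv @ lam @ theta_minus_inv @ lam.T @ psi_inv

# Version with Sigma^{-1}: N holds the m smallest eigenvalues of U Psi U',
# where Sigma^{-1} = U'U, Eqs. (23) and (26).
U = np.linalg.cholesky(sigma_inv).T
N = np.diag(np.linalg.eigvalsh(U @ np.diag(psi) @ U.T)[:m])
M_sigma = np.linalg.inv(np.eye(m) - N)
phi_sigma = sigma_inv - sigma_inv @ lam @ M_sigma @ lam.T @ sigma_inv

E = np.linalg.inv(phi_sigma ** 2)        # Eq. (19); ** 2 is the Hadamard square

print(np.abs(M_psi - M_sigma).max(), np.abs(phi_psi - phi_sigma).max())
print(np.linalg.matrix_rank(phi_sigma))  # Phi itself has rank p - m
```

The second route never forms $\Psi^{-1}$, so it can still be evaluated when one $\psi_j$ is very small.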
5. Alternative matrix formulas: $\Psi$ version and $\Sigma$ version
Alternatively, it is possible to construct the matrix formulas for the ACM of estimates of factor loadings and unique variances based on the differentials reported in
[11,13] with some modifications. The version of formulas is as follows: Var() Cov(, )
Cov(,) Var( )
=
(Cov(s))
, (27)
where the matrices of partial derivatives ofand involving are given by
=(1 Ip)
Gp
(1 )(1+2), (28)
=K p( )1Kp()Gp, (29)
with
= 1, = 1, = 1 1,
Kp=p
i=1(Jp,i J
p,iJp,i ),
and
1= 12 (diag(vec(1)))( )Gp
, (30)
2= (Im Im)+
( )Gp((+Im) )
, (31)
with the p-dimensional i th unit vector Jp,i , and under normal sampling, Cov(s) is
given by
Cov(s)=( 1n
)Hp(Ip2+Kp)( )Hp, (32)
whereK
p is thep2
p2
commutation matrix (i.e.,K
pvec(A)
=vec(A
)
for any $p \times p$ matrix $A$), and $H_p = (G_p'G_p)^{-1}G_p'$ and $G_p$ is the $p^2 \times p(p+1)/2$ duplication matrix (i.e., $\mathrm{vec}(S) = G_p\,\mathrm{vech}(S)$ for any $p \times p$ symmetric matrix $S$). (Essentially
the identical expression to /is obtained by Ihara and Kano [8].) The proof forthe alternative matrix approach ( version) is given in Appendix A.6.
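The commutation and duplication matrices used in this section are simple to build explicitly. The following is a minimal sketch (function names are illustrative, not from the paper) that constructs both, forms $H_p = (G_p'G_p)^{-1}G_p'$, and evaluates the normal-theory covariance matrix of $s = \mathrm{vech}(S)$ in the form $(1/n)\,H_p(I_{p^2} + K)(\Sigma \otimes \Sigma)H_p'$, where $K$ denotes the commutation matrix.

```python
import numpy as np

def vec(A):
    """Stack the columns of A into one vector (column-major order)."""
    return A.flatten(order="F")

def vech(S):
    """Stack the lower-triangular part of S column by column."""
    p = S.shape[0]
    return np.concatenate([S[j:, j] for j in range(p)])

def commutation_matrix(p):
    """K with K vec(A) = vec(A') for any p x p matrix A."""
    K = np.zeros((p * p, p * p))
    for i in range(p):
        for j in range(p):
            K[i * p + j, j * p + i] = 1.0
    return K

def duplication_matrix(p):
    """G with vec(S) = G vech(S) for any symmetric p x p matrix S."""
    G = np.zeros((p * p, p * (p + 1) // 2))
    col = 0
    for j in range(p):
        for i in range(j, p):
            G[j * p + i, col] = 1.0      # position of S[i, j] in vec(S)
            G[i * p + j, col] = 1.0      # position of S[j, i] in vec(S)
            col += 1
    return G

def cov_vech_normal(sigma, n):
    """(1/n) H (I + K)(Sigma kron Sigma) H' with H = (G'G)^{-1} G'."""
    p = sigma.shape[0]
    K, G = commutation_matrix(p), duplication_matrix(p)
    H = np.linalg.solve(G.T @ G, G.T)
    return H @ (np.eye(p * p) + K) @ np.kron(sigma, sigma) @ H.T / n

# Small check with a hypothetical 2 x 2 covariance matrix.
sigma2 = np.array([[2.0, 0.5], [0.5, 1.0]])
C = cov_vech_normal(sigma2, n=100)
print(C[0, 0], C[1, 1])   # Var(s_11) and Var(s_21) under normality
```

For $p = 2$ the diagonal entries reduce to the familiar $2\sigma_{11}^2/n$ and $(\sigma_{11}\sigma_{22} + \sigma_{12}^2)/n$, which gives a quick way to confirm the index conventions.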
The version of the formulas (see also [7]) replaces the matrix of partial deriva-
tives ofand in (28) and (29) by
=(W1ZIp)
Gp
(W1 )(Y1+Y2), (33)
=K p(Q Q)1Kp(QQ)Gp, (34)
respectively, withW= 1, Z= 1,Q=IpW1Z, and
Y1= 12 (diag(vec(W1)))(ZZ)Gp
, (35)Y2= (Im WWIm)+
((ImW )ZZ)Gp(ZZ)
. (36)
See Appendix A.7 for the proof of the partial derivatives in the version.
6. The augmented information matrix approach
An alternative method to obtain the ACM of the MLEs of factor loadings is to
consider it as a constrained MLE problem, using the augmented information matrix
[9,17]. The augmented information approach gives a procedure to compute the ACM,
but it does not give explicit formulas for the elements of the ACM. However, this
approach is easy to implement; it is applicable to other rotated solutions as well;
therefore it is a very practical approach. In this section, we consider modifying the
standard augmented information matrix approach so that it can be used even when
an element of the matrix of MLEs of unique variances is nearly zero.
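In case of the unrotated, unstandardized factor loadings, the formulas for the elements of the information matrix are given by:
\[ x_{ir,js} = -\frac{1}{n} E\!\left[\frac{\partial^2 L}{\partial\lambda_{ir}\,\partial\lambda_{js}}\right] = \sigma^{ij}(\Lambda'\Sigma^{-1}\Lambda)_{rs} + (\Sigma^{-1}\Lambda)_{is}(\Sigma^{-1}\Lambda)_{jr}, \tag{37} \]
\[ y_{ir,j} = -\frac{1}{n} E\!\left[\frac{\partial^2 L}{\partial\lambda_{ir}\,\partial\psi_j}\right] = \sigma^{ij}(\Sigma^{-1}\Lambda)_{jr}, \tag{38} \]
\[ z_{i,j} = -\frac{1}{n} E\!\left[\frac{\partial^2 L}{\partial\psi_i\,\partial\psi_j}\right] = \frac{1}{2}(\sigma^{ij})^2, \tag{39} \]
for $1 \leq i, j \leq p$, $1 \leq r, s \leq m$ [9], where $L$ is the log likelihood function, $\sigma^{ij}$ the $(i, j)$ element of $\Sigma^{-1}$, and $(V)_{rs}$ is the $(r, s)$ element of matrix $V$. The $m(m-1)/2$ constraint functions $g_{uv}$ on the parameters $\lambda_{ir}$ and $\psi_j$ are
\[ g_{uv} = (\Lambda'\Psi^{-1}\Lambda)_{uv} \tag{40} \]
for $1 \leq u < v \leq m$, and the partial derivatives of $g_{uv}$ with respect to $\lambda_{ir}$ and $\psi_j$ are given by
\[ f^{1}_{ir,uv} = \frac{\partial g_{uv}}{\partial\lambda_{ir}} = (\delta_{ru}\lambda_{iv} + \delta_{rv}\lambda_{iu})\,\psi_i^{-1}, \tag{41} \]
\[ f^{2}_{j,uv} = \frac{\partial g_{uv}}{\partial\psi_j} = -\lambda_{ju}\lambda_{jv}\,\psi_j^{-2}. \tag{42} \]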
Now define the matrices $X = (x_{ir,js})$, $Y = (y_{ir,j})$, and $Z = (z_{i,j})$ from the coordinatewise expressions in Eqs. (37)–(39). (Note that the subscripts $r$ and $s$ serve as row and column block indices, respectively, while $i$ and $j$ are row and column indices within each block. That is, $x_{ir,js}$ is the $((r-1)p+i, (s-1)p+j)$ element of $X$, $y_{ir,j}$ the $((r-1)p+i, j)$ element of $Y$, and $z_{i,j}$ is the $(i, j)$ element of $Z$. The orders of $X$, $Y$, and $Z$ are $pm \times pm$, $pm \times p$, and $p \times p$, respectively. $X$ and $Z$ are symmetric matrices.) Likewise, define the matrices of partial derivatives $F_1 = (f^{1}_{ir,uv})$ and $F_2 = (f^{2}_{j,uv})$ from the coordinatewise expressions in Eqs. (41) and (42). ($f^{1}_{ir,uv}$ is the $((r-1)p+i, u(2m-u-1)/2+v-m)$ element of $F_1$, $f^{2}_{j,uv}$ is the $(j, u(2m-u-1)/2+v-m)$ element of $F_2$. The orders of $F_1$ and $F_2$ are $pm \times m(m-1)/2$ and $p \times m(m-1)/2$, respectively.) Then the augmented information matrix is given by the sample size $n$ times the $(p(m+1)+m(m-1)/2) \times (p(m+1)+m(m-1)/2)$ matrix whose submatrices are arranged as
\[ \begin{pmatrix} X & Y & F_1 \\ Y' & Z & F_2 \\ F_1' & F_2' & 0 \end{pmatrix}, \tag{43} \]
and the ACM for the MLEs of factor loadings and unique variances is the $p(m+1) \times p(m+1)$ submatrix corresponding to the first $p(m+1)$ rows and columns of the inverse of the augmented information matrix.
Here, note that $x_{ir,js}$, $y_{ir,j}$, and $z_{i,j}$ in (37)–(39) are functions of the elements of $\Sigma^{-1}$ and $\Lambda$, which are all finite. Thus the functions $x_{ir,js}$, $y_{ir,j}$, and $z_{i,j}$ are all finite, and the estimates of $x_{ir,js}$, $y_{ir,j}$, and $z_{i,j}$ can be computed without any modification of the formulas even when the MLE of $\psi_j$ is nearly zero. On the other hand, the equations for the partial derivatives $f^{1}_{ir,uv}$ and $f^{2}_{j,uv}$ in (41) and (42) involve $\psi_j^{-1}$ and thus for some elements, the estimates of $f^{1}_{ir,uv}$ and $f^{2}_{j,uv}$ become very large when the MLE of $\psi_j$ is nearly zero.
Motivated by Bentler and Yuan [4], Okamoto [19], and Swain [20], we use the alternative constraint functions $h_{uv} = (\Lambda'\Sigma^{-1}\Lambda)_{uv}$, instead of $g_{uv} = (\Lambda'\Psi^{-1}\Lambda)_{uv}$ in (40), when an element of the MLE of $\Psi$ is nearly zero. In fact, the constraint that $\Lambda'\Psi^{-1}\Lambda$ is diagonal is equivalent to the constraint that $\Lambda'\Sigma^{-1}\Lambda$ is diagonal, except that the former constraint requires the assumption that $\Psi$ is positive definite, while the latter constraint requires the assumption that $\Sigma$ is positive definite. To see the equivalence of the two constraints in the regular case, first note the identity:
\[ \Lambda'\Sigma^{-1}\Lambda = \Lambda'\Psi^{-1}\Lambda\,(I_m + \Lambda'\Psi^{-1}\Lambda)^{-1} \tag{44} \]
and note that, since the RHS of (44) is diagonal, the LHS of (44) also has to be diagonal.
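For readers who want the intermediate step, the identity relating $\Lambda'\Sigma^{-1}\Lambda$ and $\Lambda'\Psi^{-1}\Lambda$ follows from the Woodbury formula when $\Psi$ is positive definite. The following is a short sketch, writing $G = \Lambda'\Psi^{-1}\Lambda$ (a shorthand introduced here for convenience, not notation from the paper):

```latex
\begin{align*}
\Sigma^{-1} &= \Psi^{-1} - \Psi^{-1}\Lambda\,(I_m + G)^{-1}\Lambda'\Psi^{-1}, \\
\Lambda'\Sigma^{-1}\Lambda
  &= G - G\,(I_m + G)^{-1}G
   = G\,(I_m + G)^{-1}\bigl[(I_m + G) - G\bigr]
   = G\,(I_m + G)^{-1}.
\end{align*}
```

If $G$ is diagonal, so is $G(I_m + G)^{-1}$. Conversely, solving for $G$ gives $G = W(I_m - W)^{-1}$ with $W = \Lambda'\Sigma^{-1}\Lambda$, so a diagonal $W$ implies a diagonal $G$.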
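The partial derivatives of $h_{uv}$ with respect to $\lambda_{ir}$ and $\psi_j$, which are used inside the augmented information matrix, are given by:
\[ \frac{\partial h_{uv}}{\partial\lambda_{ir}} = \bigl(J_{ir}'\Sigma^{-1}\Lambda + \Lambda'\Sigma^{-1}J_{ir} - \Lambda'\Sigma^{-1}(J_{ir}\Lambda' + \Lambda J_{ir}')\Sigma^{-1}\Lambda\bigr)_{uv}, \tag{45} \]
\[ \frac{\partial h_{uv}}{\partial\psi_j} = -\bigl(\Lambda'\Sigma^{-1}K_{jj}\Sigma^{-1}\Lambda\bigr)_{uv}, \tag{46} \]
where $J_{ir}$ is a $p \times m$ matrix whose $(i, r)$ element is 1 and the rest of the elements are all zero, and $K_{jj}$ is a $p \times p$ matrix whose $(j, j)$ element is 1 and the rest of the elements are all zero (see [9, p. 125]). The proof for (45) and (46) is given in Appendix A.9.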
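The derivative formulas for $h_{uv}$ can be checked against central finite differences. The sketch below is illustrative: the function names and the numerical values are assumptions made for this example, and $\psi_1$ is deliberately set close to zero to show that nothing in the computation requires $\Psi^{-1}$.

```python
import numpy as np

def h(lam, psi):
    """h = Lambda' Sigma^{-1} Lambda with Sigma = Lambda Lambda' + Psi."""
    sigma = lam @ lam.T + np.diag(psi)
    return lam.T @ np.linalg.solve(sigma, lam)

def dh_dlambda(lam, psi, i, r):
    """Eq. (45): derivatives of all h_uv with respect to lambda_ir, as an m x m matrix."""
    p, m = lam.shape
    sigma_inv = np.linalg.inv(lam @ lam.T + np.diag(psi))
    J = np.zeros((p, m))
    J[i, r] = 1.0
    A = lam.T @ sigma_inv                        # Lambda' Sigma^{-1}
    return J.T @ A.T + A @ J - A @ (J @ lam.T + lam @ J.T) @ A.T

def dh_dpsi(lam, psi, j):
    """Eq. (46): derivatives of all h_uv with respect to psi_j, as an m x m matrix."""
    sigma_inv = np.linalg.inv(lam @ lam.T + np.diag(psi))
    z = sigma_inv[j] @ lam                       # j-th row of Sigma^{-1} Lambda
    return -np.outer(z, z)

# Hypothetical values (p = 5, m = 2) with psi_1 nearly zero.
lam = np.array([[0.9, 0.0], [0.7, 0.3], [0.6, -0.4], [0.2, 0.8], [0.1, 0.7]])
psi = np.array([1e-8, 0.42, 0.48, 0.32, 0.50])
eps = 1e-6
max_err = 0.0
for i in range(lam.shape[0]):
    for r in range(lam.shape[1]):
        step = np.zeros_like(lam)
        step[i, r] = eps
        numeric = (h(lam + step, psi) - h(lam - step, psi)) / (2 * eps)
        max_err = max(max_err, np.abs(numeric - dh_dlambda(lam, psi, i, r)).max())
for j in range(psi.size):
    step = np.zeros_like(psi)
    step[j] = eps
    numeric = (h(lam, psi + step) - h(lam, psi - step)) / (2 * eps)
    max_err = max(max_err, np.abs(numeric - dh_dpsi(lam, psi, j)).max())
print(max_err)
```

The finite-difference step for $\psi_1$ briefly moves it slightly below zero, which is harmless here because $\Sigma$ stays positive definite and $h$ is smooth in a neighborhood of the evaluation point.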
Thus to compute the ACM of MLEs of factor loadings and unique variances when one element of the MLEs of unique variances is nearly zero, we recommend using the partial derivatives of $h_{uv}$ with respect to $\lambda_{ir}$ and $\psi_j$ given in (45) and (46), in place of the partial derivatives of $g_{uv}$ with respect to $\lambda_{ir}$ and $\psi_j$ in (41) and (42). The ACM of MLEs of factor loadings and unique variances is given by the $p(m+1) \times p(m+1)$ submatrix corresponding to the first $p(m+1)$ rows and columns of the inverse of the augmented information matrix with $F_1$ and $F_2$ replaced by the partial derivatives of $h_{uv}$. We should note that the augmented information matrix approach with the partial derivatives of the alternative constraint functions can be used whether or not an element of the MLEs of unique variances is nearly zero.
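The whole procedure of this section can be sketched in a few dozen lines. The code below is an illustration under stated assumptions, not the authors' implementation: the parameter values are hypothetical, population values stand in for estimates, and the function names are invented for this example. It builds $X$, $Y$, $Z$ from (37)–(39), the constraint derivatives either from $h_{uv}$ (Eqs. (45)–(46)) or from $g_{uv}$ (Eqs. (41)–(42), taking $\psi_j$ to be the unique variance), assembles the matrix in (43), and extracts the leading block of its inverse.

```python
import numpy as np
from itertools import combinations

def information_blocks(lam, psi):
    """X, Y, Z of Eqs. (37)-(39); only Sigma^{-1} and Lambda are needed."""
    p, m = lam.shape
    s_inv = np.linalg.inv(lam @ lam.T + np.diag(psi))
    zeta = s_inv @ lam                                  # Sigma^{-1} Lambda
    W = lam.T @ zeta                                    # Lambda' Sigma^{-1} Lambda
    X = np.kron(W, s_inv) + np.einsum("is,jr->risj", zeta, zeta).reshape(p * m, p * m)
    Y = np.einsum("ij,jr->rij", s_inv, zeta).reshape(p * m, p)
    return X, Y, 0.5 * s_inv ** 2

def jacobian_h(lam, psi):
    """F1, F2 for h_uv = (Lambda' Sigma^{-1} Lambda)_uv, u < v."""
    p, m = lam.shape
    A = lam.T @ np.linalg.inv(lam @ lam.T + np.diag(psi))
    pairs = list(combinations(range(m), 2))
    F1, F2 = np.zeros((p * m, len(pairs))), np.zeros((p, len(pairs)))
    for r in range(m):
        for i in range(p):
            J = np.zeros((p, m))
            J[i, r] = 1.0
            D = J.T @ A.T + A @ J - A @ (J @ lam.T + lam @ J.T) @ A.T
            F1[r * p + i] = [D[u, v] for u, v in pairs]
    for j in range(p):
        F2[j] = [-A[u, j] * A[v, j] for u, v in pairs]
    return F1, F2

def jacobian_g(lam, psi):
    """F1, F2 for g_uv = (Lambda' Psi^{-1} Lambda)_uv; involves 1 / psi."""
    p, m = lam.shape
    pairs = list(combinations(range(m), 2))
    F1, F2 = np.zeros((p * m, len(pairs))), np.zeros((p, len(pairs)))
    for c, (u, v) in enumerate(pairs):
        F1[u * p:(u + 1) * p, c] = lam[:, v] / psi      # derivative w.r.t. lambda_iu
        F1[v * p:(v + 1) * p, c] = lam[:, u] / psi      # derivative w.r.t. lambda_iv
        F2[:, c] = -lam[:, u] * lam[:, v] / psi ** 2
    return F1, F2

def acm(lam, psi, n, jacobian):
    """Leading p(m+1) x p(m+1) block of the inverse of n times the matrix in (43)."""
    X, Y, Z = information_blocks(lam, psi)
    F1, F2 = jacobian(lam, psi)
    q, c = X.shape[0] + Z.shape[0], F1.shape[1]
    aug = np.block([[X, Y, F1], [Y.T, Z, F2], [F1.T, F2.T, np.zeros((c, c))]])
    return np.linalg.inv(n * aug)[:q, :q]

def canonical(lam0, psi):
    """Rotate lam0 so that Lambda' Sigma^{-1} Lambda is diagonal."""
    s_inv = np.linalg.inv(lam0 @ lam0.T + np.diag(psi))
    _, R = np.linalg.eigh(lam0.T @ s_inv @ lam0)
    return lam0 @ R[:, ::-1]

# Hypothetical parameter values (p = 6, m = 2), sample size n = 200.
lam0 = np.array([[0.8, 0.1], [0.7, 0.3], [0.6, -0.2],
                 [0.2, 0.7], [0.1, 0.6], [0.3, 0.5]])
psi = np.array([0.35, 0.40, 0.60, 0.45, 0.60, 0.65])
lam = canonical(lam0, psi)
acm_h = acm(lam, psi, 200, jacobian_h)
acm_g = acm(lam, psi, 200, jacobian_g)

# Nearly singular case: only the h-based derivatives avoid dividing by psi_1.
psi_near = psi.copy()
psi_near[0] = 1e-10
acm_near = acm(canonical(lam0, psi_near), psi_near, 200, jacobian_h)
print(np.abs(acm_h - acm_g).max(), np.sqrt(np.diag(acm_near))[:3])
```

In the regular case the two sets of constraint derivatives should give the same ACM up to rounding, since both constraints define the same set of parameter values. In the nearly singular case only the version based on $h_{uv}$ avoids dividing by $\psi_1$.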
7. Conclusion
In this paper, we dealt with the ACM of MLEs of (unstandardized, unrotated)
factor loadings and unique variances when an element of MLEs of unique variances
is nearly zero, that is, the matrix of MLEs of unique variances is nearly singular.
The standard formulas for the ACM given by Lawley and Maxwell [18] involve
the inverse of the unique variances. Thus, we encounter a problem when one of the
MLEs of unique variances approaches zero, since the reciprocal of the MLE of this
unique variance gets very large.
We presented alternative formulas for the ACM which can be used even when an element of MLEs of unique variances is nearly zero. The derivation of the alternative formulas involved replacing the expressions in terms of the inverse of unique variances by the expressions in terms of the inverse of the covariance matrix whose elements are all finite. The alternative formulas given in Sections 4 and 5 are exact asymptotic formulas, and they can be used whether or not an element of MLEs of unique variances is nearly zero. In this regard, we consider use of the alternative formulas to be more practical than the standard formulas.
However, it should be noted that statistical instabilities that might arise in the case of unique variances near zero cannot be avoided just by using alternative formulas. For example, it is just possible that the asymptotic standard error of the MLE of a unique variance that is near zero is a very bad approximation to the true standard error in small samples. (The authors thank the referee for noting this important point.) This is the area where we certainly need further research. See also [5,15] on the issues closely related with this point.
Furthermore, we used alternative constraint functions $h_{uv} = (\Lambda'\Sigma^{-1}\Lambda)_{uv}$ instead of $g_{uv} = (\Lambda'\Psi^{-1}\Lambda)_{uv}$, in the context of the augmented information matrix approach, when an element of the MLE of unique variances is nearly zero. In fact, use of the constraint that $\Lambda'\Sigma^{-1}\Lambda$ is diagonal is not new; for example, it was mentioned by Swain [20]. However, to our knowledge, use of the constraint function $h_{uv} = (\Lambda'\Sigma^{-1}\Lambda)_{uv}$ in the context of the augmented information matrix approach for avoiding use of the inverse of unique variances, as well as the formulas for the partial derivatives of $h_{uv}$ with respect to $\lambda_{ir}$ and $\psi_j$ to be used inside the augmented information matrix, are new.
The augmented information approach has an advantage in that this approach is applicable to other rotated solutions as well. In fact, only the matrices $F_1$ and $F_2$ of
the partial derivatives of the constraint functions need to be modified for obtaining
the standard errors for various rotated solutions. As long as the formulas for the
constraint functions are not very complex, in general, the partial derivatives can be
obtained fairly easily.
Appendix A
A.1. Proof of Observation 1
(i) By definition of the eigenvalue–eigenvector equation, $(U\Psi U')Z = ZN$. Rearrange this equation to:
\[ (\Psi^{-1/2}S\Psi^{-1/2})(\Psi^{1/2}U'ZN^{-1/2}) = (\Psi^{1/2}U'ZN^{-1/2})N^{-1}, \tag{A.1} \]
and comparing (A.1) with the eigenvalue–eigenvector equation $(\Psi^{-1/2}S\Psi^{-1/2})\Omega = \Omega\Theta$ gives (14) and (15).
(ii) Use Eq. (4.5) of Lawley and Maxwell [18]: $(S - \Sigma)\Sigma^{-1}\Lambda = 0$, and rearrange it as
\[ (U\Psi U')(U\Lambda(\Lambda'S^{-1}\Lambda)^{-1/2}) = (U\Lambda(\Lambda'S^{-1}\Lambda)^{-1/2})(I_m - \Lambda'S^{-1}\Lambda). \tag{A.2} \]
(See, e.g., [12,19].) Comparing (A.2) with the eigenvalue–eigenvector equation $(U\Psi U')Z = ZN$ gives $N = I_m - \Lambda'S^{-1}\Lambda$ and $Z = U\Lambda(\Lambda'S^{-1}\Lambda)^{-1/2} = U\Lambda(I_m - N)^{-1/2}$. Thus (16) follows.
A.2. Proof of Observation 2
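We omit the hat notation thereafter for simplicity. The $r$th largest eigenvalues $\theta_r$, $r = 2, \ldots, m$, of $\Psi^{-1/2}S\Psi^{-1/2}$, except the largest eigenvalue, do not get very large as long as $\psi_j$ is the only unique variance which is nearly zero. The $(j, r)$ element of $\Lambda = \Psi^{1/2}\Omega(\Theta - I_m)^{1/2}$ in (1) is $\lambda_{jr} = \psi_j^{1/2}\omega_{jr}(\theta_r - 1)^{1/2}$. Let $\varepsilon$ be a positive quantity very close to zero. When $r \neq 1$, $\psi_j = \varepsilon$ with bounded $\omega_{jr}$ ($|\omega_{jr}| \leq 1$) and finite (not very large) $\theta_r$ gives $\lambda_{jr} = \varepsilon_r$, $r = 2, \ldots, m$, where $\varepsilon_r$ are quantities very close to zero. Thus, the $(j, j)$ element of $S \approx \Lambda\Lambda' + \Psi$ is $s_{jj} \approx \lambda_{j1}^2 + (\varepsilon_2^2 + \cdots + \varepsilon_m^2 + \varepsilon)$, and $\lambda_{j1} \approx s_{jj}^{1/2}$ or $\lambda_{j1} \approx -s_{jj}^{1/2}$ follows.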
A.3. Outline of proof of the equivalence of the standard elementwise expressions
and the matrix expressions (4), (5), (7)(12)
Fori, j= 1, . . . , p, andr, s=1, . . . , m, the standard elementwise formulas forAand B ( version: [13,18]) are
air,jr= r ij 12r ir j r+ m
k /=r(krk ik j k ) , (A.3)
air,js= {r s (rs )2}is j r forr /=s, (A.4)
bj,ir= j r (r1)12j
ijj
1
2
ir j r (r1)1 +r
mk /=r
(ik j k (rk )1)
,
(A.5)
where ris the rth element of the mm diagonal matrixwhich has as its elementsthe m largest eigenvalues of1/21/2; ij= 1 ifi= jand ij= 0 ifi /=j; andr andrk are defined in terms ofr s as follows:
r=r
r1, (A.6)
rk= r1r
k
2
1. (A.7)
First, we show that (A.3) is an element of the first term in (4). Express (A.3) as
air,jr= r (ij+m
k=r (k
rkik j k )), where
rk= rk if k /=r , rk= 1/2 if
k=r , and write the first term in (4) as M+(MM)(diag(Km))(Im
)=
(M
Ip){
Im
+(Im
)(Im
M)(diag(vec()))(Im
)}
with
=vec1(). The equivalence ofand rkis obvious by comparing (Im)2and (r1)2, and ImIm and rk. The rest is straightforward bynoting that MIp in (4) corresponds to r,Imcorresponds to ij, and(Im)(Im M)(diag(vec()))(Im)corresponds to
mk=1(k
rkik j k).
Next, we show that (A.4) is an element of the second term in (4). First, notice that
A1in (7) corresponds tois in (A.4). 1m1pincludes both the cases ofr= sandr /=s . To set the block diagonal elements which correspond to the case ofr= sequal to zero, we need to subtract a block diagonal matrix (diag())(Im1p1p)from 1m1p. Similarly,A1in (7) corresponds toj r in (A.4). Next,A2in (8)corresponds tor s /(rs )2 inair,js in (A.4) and it is easy to see that
in (11)corresponds to(rs )2.
It is obvious that the first term in B in (5) corresponds to j r (r1)12j inbj,ir in (A.5). Express
1
2
ir j r (r1)1 +r
mk /=r
(ik j k (rk)1)
as rm
k=1ik j k brk, where
brk= (rk )1 ifr /=k and brk= (2r (r1))1
ifr= k . The equivalence ofb and brk is obvious by noting that ImIm corresponds to rk and (Im) corresponds to r (r1). The rest isstraightforward by noting that (1m)(diag(Kmmb))(Im)corresponds torm
k=1(ik j k brk ).
A.4. Alternative elementwise formulas forA andB ( version)
Fori, j= 1, . . . , p, andr, s=1, . . . , m,
air,jr= r
ij
1
2
r ir j r+
m
k /=r(krk ik j k )
, (A.8)
air,js= {r s (sr )2}is j r forr /=s, (A.9)
bj,ir= r
ps=1
j s sr
ij
1
2
ir r
ps=1
j s sr
+m
k /=rik (kr )1
ps=1
j s sk
, (A.10)
wherer are the eigenvalues ofUU, andr and rk are defined in terms ofr s
as follows:
r= (1r )1, (A.11)
rk
=2k
1rkr
2
1. (A.12)
A.5. Proof of the equivalence of the matrix expressions ( version) and the
alternative matrix expressions ( version)
(i) To show the equivalence of (5) and (18):
B= {2(Im)1 1p}
{1
m
+(1
m
)(diag(K
mb))(I
m
)}= {1(Im)1 1p}
{1mIm+(1m1)(diag(Kmb))(Im)}= {1M1p}
{1mIm+(1m1)(diag(Kmb))(Im)}= {1M1p}
{1mIm+(1m11)()(diag(Kmb))(Im)}
= {1
M1p}{1mIm+(1m 1){diag(()Kmb)}(Im)}
= {1M1p}{1mIm+(1m 1){diag(Km()b)}(Im)}
= {1M1p}{1mIp+(1m 1)(diag(Kmb))(Im)},
since
1
1 = 1, 1(Im)1 = 1M,
and
()b=()(ImIm2 diag(vec((Im))))11m2=(1 ImIm1 2(11)diag(vec((Im))))11m2= [1 ImIm1 2diag{vec((Im)1)}]11m2=(NImImN2diag(vec(ImN )))11m2= b.
(ii) To show the equivalence of (8) and (21):
vec(
)
=()vec =()((ImIm)2)+1m2=((1 1)(ImIm)2)+1m2=(()(1 1)2(Im Im)2)+1m2=(1 1)((1 ImIm1)2)+1m2=(NN ){((NImImN )2)+1m2}=(NN )vecN
=vec(NN
N ).
(iii) To show the equivalence of (9) and (22):
vec((Im)2
)
=(Im(Im)2)((ImIm)2)+1m2= {{(Im(Im))1(Im Im)}2}+1m2= {{(Im(Im)1)(Im Im)}2}+1m2
= {{Im
(
Im)
1
(
Im)
1
}2
}+1m2
= {{Im(Im1)1 (Im1)11}2}+1m2= {{Im(ImN )1 N1 (ImN )1N}2}+1m2= {(N2 (ImN )2){NImImN}2}+1m2=(N2 (ImN )2)vecN
=vec((ImN )2N
N2).
(iv) To show the equivalence of (10) and (23): UsingN= 1,
M= (Im)1 =N1(N1 Im)1 =(N N1 N )1 =(ImN )1.(v) To show the equivalence of (13) and (26): Use the identity 1 = 1
1
1
1, and subtract this from in (13):
1 =1(1 (Im)1)1=1(1 (Im)1)1=1(Im)11=1M1,
since
1
1 = 1
and
1
(Im)1
= 1
(Im)1
.
A.6. Proof of the expressions for the matrices of partial derivatives ofand (version)
The likelihood equations are written in the form: (S)1 =0,diag(S)=0, and nondiag(1)=0, the differentials of which are
(dd dd)1 =0,
diag(dd dd)=0,
and
nondiag(d1 1 d1+1 d)=0,respectively, that is,
d =(dd) d, (A.13)diag(dd dd)=0, (A.14)nondiag(d
+ d
d)
=0. (A.15)
Vectorizing both sides of (A.13) and letting, db =vec(d), (Ip)(d)=(1 Ip)vec(dd)(Im)(db). Thus, letting b/=1+2,where 1 and 2 are the diagonal and the off-diagonal components of
b/,respectively, (28) follows.
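The vectorization steps in this appendix rest on the standard identity $\mathrm{vec}(AXB) = (B' \otimes A)\,\mathrm{vec}(X)$, where vec stacks the columns of a matrix. The following is a minimal numerical sketch of that identity, assuming NumPy; the matrices `A`, `X`, `B` and their dimensions are arbitrary illustrative choices and are not quantities from the paper.

```python
import numpy as np

def vec(X):
    # Column-stacking vectorization (Fortran order), the usual vec operator.
    return X.reshape(-1, order="F")

# Hypothetical random matrices, used only to illustrate the identity.
rng = np.random.default_rng(1)
A = rng.normal(size=(4, 3))
X = rng.normal(size=(3, 5))
B = rng.normal(size=(5, 2))

lhs = vec(A @ X @ B)              # vec(A X B)
rhs = np.kron(B.T, A) @ vec(X)    # (B' kron A) vec(X)
vec_gap = float(np.max(np.abs(lhs - rhs)))
print(vec_gap)
```

Setting `A` or `B` to an identity matrix gives the special cases $(B' \otimes I)\,\mathrm{vec}(X)$ and $(I \otimes A)\,\mathrm{vec}(X)$, which is the shape of the Kronecker factors that appear when a differential is multiplied on one side only.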
Now, premultiplying (A.13) by gives d+ d= (dd). Forthe diagonal elements of d, 2 d= (dd), that is,
d=
1
2
1 (dd). (A.16)
Vectorizing (A.16) gives
db =
1
2
vec(1)vec((dd))
=
1
2
vec(1)( )vec(dd)
=
1
2
diag(vec(1))( )vec(dd).
Thus1 in (30) follows.
Next, for the nondiagonal elements of d
, inserting nondiag(
d)
=nondiag( dd) in (A.15) into nondiag( d+ d)=nondiag((dd))in (A.14) gives
nondiag( dd)=nondiag( d d(+Im)). (A.17)
The vectorized version of (A.17) is(Im Im)(db)=( )vec(d)((+Im) )vec(d),
from which 2 in (31) follows. Finally, noting =0 and =0, (A.14) leadsto diag( dS)=diag( d). Then
vdg(d)=()1vdg(d). (A.18)Vectorizing the diagonal matrix whose elements are (A.18) gives
d =vec(d)=vec(diag(()1vdg((d))))
=Kp()1
Kpvec((d))=Kp()1Kp( )Gp(d ),
from which (29) follows.
A.7. Proof of the expressions for the matrices of partial derivatives ofand (version)
The likelihood equations in the version are
(S)S1 =0,diag(S)=0,nondiag(S1)=0,
and the differentials are
(dddd)1 =0,(A.14), and
nondiag(d1 1 d1+1 d)=0,that is, (A.14) and
dW= (dd)Z dZ, (A.19)
nondiag(dZ+Z dZ dZ)=0. (A.20)Let d=dZ. Then vectorizing both sides of (A.19) leads to
(WIp)(d)=(ZIp )vec(dd)(Im)(d),and (33) follows, whereY1 andY2 are the diagonal and off-diagonal components of
/, respectively. For the diagonal components of dZ, premultiplication of(A.19) byZ leads to 2 diagWdZ=diag(Z(dd)Z), that is,
dZ=( 12
)W1 Z(dd)Z. (A.21)
Vectorizing (A.21) gives
d=(12 ){diag(vec(W
1
))}(Z Z)vec(dd),from whichY1in (35) immediately follows.
For the nondiagonal elements of dZ, inserting Zd =Z dZdZin (A.20) into (A.19) premultiplied byZ gives
WdZdZW= Z dZ(ImW )Z dZ. (A.22)
Vectorizing (A.22) leads to
(ImWWIm)(d)=((ImW )ZZ)Gp(d )(ZZ)(d),
from which Y2 in (36) follows. Now, let Q=IpW1Z. Then noting Q =0 and using (A.19), we have diag(Q dQ)=diag(Q dQ), that is, vdg(d)=(Q Q)1vdg(Q dQ). Thus
d =vec(d)=vec(diag((QQ)1vdg(Q(d)Q)))=Kp(Q Q)1Kvec(Q(d)Q)=Kp(Q Q)1K(Q Q)Gp(d ),
from which (34) results.
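A.8. Proof of the equivalence of the three equations $(\Sigma - S)\Psi^{-1}\Lambda = 0$, $(\Sigma - S)S^{-1}\Lambda = 0$, and $(\Sigma - S)\Sigma^{-1}\Lambda = 0$ [19, Theorems 4.2 and 4.3]
We assume the positive definiteness of $\Sigma$, $\Psi$, and $S$. Then
\[
\begin{aligned}
(\Sigma - S)\Psi^{-1}\Lambda = 0
&\iff (\Sigma - S)\Psi^{-1}\Lambda (I_m + \Lambda'\Psi^{-1}\Lambda)^{-1} = 0 \\
&\iff (\Sigma - S)\Sigma^{-1}\Lambda = 0 \quad (\text{used } \Psi^{-1}\Lambda (I_m + \Lambda'\Psi^{-1}\Lambda)^{-1} = \Sigma^{-1}\Lambda) \\
&\iff S\Sigma^{-1}\Lambda = \Lambda \\
&\iff \Lambda = \Sigma S^{-1}\Lambda \\
&\iff (\Sigma - S)S^{-1}\Lambda = 0.
\end{aligned}
\]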
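The passage between the $\Psi^{-1}$ form and the $\Sigma^{-1}$ form of the likelihood equation depends on the identity $\Psi^{-1}\Lambda (I_m + \Lambda'\Psi^{-1}\Lambda)^{-1} = \Sigma^{-1}\Lambda$, which holds for any $\Sigma = \Lambda\Lambda' + \Psi$ with $\Psi$ positive definite. A minimal numerical check, assuming NumPy, is sketched below; the dimensions and the randomly drawn loadings and unique variances are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Hypothetical model size and parameters, chosen only for illustration.
rng = np.random.default_rng(0)
p, m = 6, 2
Lam = rng.normal(size=(p, m))            # factor loadings (p x m)
psi = rng.uniform(0.5, 1.5, size=p)      # unique variances, kept away from zero
Psi_inv = np.diag(1.0 / psi)
Sigma = Lam @ Lam.T + np.diag(psi)       # model covariance matrix

# Left side: Psi^{-1} Lam (I_m + Lam' Psi^{-1} Lam)^{-1}
lhs = Psi_inv @ Lam @ np.linalg.inv(np.eye(m) + Lam.T @ Psi_inv @ Lam)
# Right side: Sigma^{-1} Lam, computed with a linear solve
rhs = np.linalg.solve(Sigma, Lam)

identity_gap = float(np.max(np.abs(lhs - rhs)))
print(identity_gap)
```

Because the two sides differ only by a nonsingular $m \times m$ factor on the right, $(\Sigma - S)\Psi^{-1}\Lambda$ vanishes exactly when $(\Sigma - S)\Sigma^{-1}\Lambda$ does. The check uses unique variances bounded away from zero; it is not meant to illustrate the nearly singular case.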
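A.9. Proof of the expressions for (35) and (36)
Take the partial derivatives of $h = \Lambda'\Sigma^{-1}\Lambda$ with respect to $\lambda_{ir}$ and $\psi_j$:
\[
\frac{\partial h}{\partial \lambda_{ir}}
= \frac{\partial \Lambda'}{\partial \lambda_{ir}}\Sigma^{-1}\Lambda
+ \Lambda'\frac{\partial \Sigma^{-1}}{\partial \lambda_{ir}}\Lambda
+ \Lambda'\Sigma^{-1}\frac{\partial \Lambda}{\partial \lambda_{ir}}
= J_{ir}'\Sigma^{-1}\Lambda
- \Lambda'\Sigma^{-1}\frac{\partial \Sigma}{\partial \lambda_{ir}}\Sigma^{-1}\Lambda
+ \Lambda'\Sigma^{-1}J_{ir}, \tag{A.23}
\]
\[
\frac{\partial h}{\partial \psi_j}
= \Lambda'\frac{\partial \Sigma^{-1}}{\partial \psi_j}\Lambda
= -\Lambda'\Sigma^{-1}\frac{\partial \Sigma}{\partial \psi_j}\Sigma^{-1}\Lambda. \tag{A.24}
\]
Inserting the following partial derivatives of $\Sigma$ with respect to $\lambda_{ir}$ and $\psi_j$:
\[
\frac{\partial \Sigma}{\partial \lambda_{ir}}
= \frac{\partial \Lambda}{\partial \lambda_{ir}}\Lambda' + \Lambda\frac{\partial \Lambda'}{\partial \lambda_{ir}}
= J_{ir}\Lambda' + \Lambda J_{ir}',
\qquad
\frac{\partial \Sigma}{\partial \psi_j} = K_{jj},
\]
into (A.23) and (A.24), the results follow by taking the $(u, v)$ element.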
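The derivative expressions in (A.23) and (A.24) can be checked against central finite differences. The sketch below assumes that $h = \Lambda'\Sigma^{-1}\Lambda$ with $\Sigma = \Lambda\Lambda' + \Psi$, that $J_{ir}$ is the $p \times m$ matrix with a one in position $(i, r)$, and that $K_{jj}$ is the $p \times p$ matrix with a one in position $(j, j)$. The function names, dimensions, and random parameter values are hypothetical and serve only as an illustration.

```python
import numpy as np

def h(Lam, psi):
    # h = Lam' Sigma^{-1} Lam with Sigma = Lam Lam' + diag(psi)
    Sigma = Lam @ Lam.T + np.diag(psi)
    return Lam.T @ np.linalg.solve(Sigma, Lam)

def dh_dlam(Lam, psi, i, r):
    # Analytic derivative in the form of (A.23)
    p, m = Lam.shape
    Sinv = np.linalg.inv(Lam @ Lam.T + np.diag(psi))
    J = np.zeros((p, m)); J[i, r] = 1.0
    dSigma = J @ Lam.T + Lam @ J.T
    return J.T @ Sinv @ Lam - Lam.T @ Sinv @ dSigma @ Sinv @ Lam + Lam.T @ Sinv @ J

def dh_dpsi(Lam, psi, j):
    # Analytic derivative in the form of (A.24)
    p = Lam.shape[0]
    Sinv = np.linalg.inv(Lam @ Lam.T + np.diag(psi))
    K = np.zeros((p, p)); K[j, j] = 1.0
    return -Lam.T @ Sinv @ K @ Sinv @ Lam

rng = np.random.default_rng(2)
p, m, eps = 5, 2, 1e-6
Lam = rng.normal(size=(p, m))
psi = rng.uniform(0.5, 1.5, size=p)
i, r, j = 3, 1, 2  # arbitrary indices (zero-based)

# Central finite differences in lambda_{ir} and psi_j
E = np.zeros((p, m)); E[i, r] = eps
num_lam = (h(Lam + E, psi) - h(Lam - E, psi)) / (2 * eps)
e = np.zeros(p); e[j] = eps
num_psi = (h(Lam, psi + e) - h(Lam, psi - e)) / (2 * eps)

err_lam = float(np.max(np.abs(num_lam - dh_dlam(Lam, psi, i, r))))
err_psi = float(np.max(np.abs(num_psi - dh_dpsi(Lam, psi, j))))
print(err_lam, err_psi)
```

Both discrepancies should be at the level of finite-difference error. The comparison covers every $(u, v)$ element of $h$ at once, so a sign or transpose slip in either formula would show up as a discrepancy of order one.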
Acknowledgement
The authors thank the referee for very helpful comments.
References
[1] T.W. Anderson, Estimating linear statistical relationships, Ann. Stat. 12 (1984) 1–45.
[2] T.W. Anderson, Y. Amemiya, The asymptotic normal distribution of estimators in factor analysis under general conditions, Ann. Stat. 16 (1988) 759–771.
[3] T.W. Anderson, H. Rubin, Statistical inference in factor analysis, in: Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, Berkeley, vol. 5, 1956, pp. 111–150.
[4] P.M. Bentler, K.-H. Yuan, Optimal conditionally unbiased equivalent factor score estimators, in: M. Berkane (Ed.), Latent Variables with Applications in Causality, Springer, New York, 1997, pp. 259–281.
[5] D.W. Gerbing, J.C. Anderson, Improper solutions in the analysis of covariance structures: their interpretability and a comparison of alternative respecifications, Psychometrika 52 (1987) 99–111.
[6] T.K. Dijkstra, On statistical inference with parameter estimates on the boundary of the parameter space, British J. Math. Stat. Psychol. 45 (1992) 289–309.
[7] K. Hayashi, P.K. Sen, On covariance estimators of factor loadings in factor analysis, J. Multivar. Anal. 66 (1998) 38–45.
[8] M. Ihara, Y. Kano, Asymptotic equivalence of unique variance estimators in marginal and conditional factor analysis models, Stat. Probab. Lett. 14 (1992) 337–341.
[9] R.I. Jennrich, Simplified formulae for standard errors in maximum-likelihood factor analysis, British J. Math. Stat. Psychol. 27 (1974) 122–131.
[10] R.I. Jennrich, A Gauss–Newton algorithm for exploratory factor analysis, Psychometrika 51 (1986) 277–284.
[11] R.I. Jennrich, D.B. Clarkson, A feasible method for standard errors of estimate in maximum likelihood factor analysis, Psychometrika 45 (1980) 237–247.
[12] R.I. Jennrich, S.M. Robinson, A Newton–Raphson algorithm for maximum likelihood factor analysis, Psychometrika 34 (1969) 111–123.
[13] R.I. Jennrich, D.T. Thayer, A note on Lawley's formulas for standard errors in maximum likelihood factor analysis, Psychometrika 38 (1973) 571–580.
[14] K.G. Jöreskog, Some contributions to maximum likelihood factor analysis, Psychometrika 32 (1967) 443–482.
[15] Y. Kano, Causes and treatment of improper solutions: exploratory factor analysis, Bulletin, vol. 24, The Faculty of Human Sciences, Osaka University, 1998 (in Japanese).
[16] D.N. Lawley, Some new results in maximum likelihood factor analysis, in: Proceedings of the Royal Society of Edinburgh, Section A, vol. 67, 1967, pp. 256–264.
[17] D.N. Lawley, The inversion of an augmented information matrix occurring in factor analysis, in: Proceedings of the Royal Society of Edinburgh, Section A, 1976, pp. 171–178.
[18] D.N. Lawley, A.E. Maxwell, Factor Analysis as a Statistical Method, second ed., Elsevier, New York, 1971.
[19] M. Okamoto, Foundations of Factor Analysis, Nikka-Giren, Tokyo, 1986 (in Japanese).
[20] A.J. Swain, A class of factor analysis estimation procedures with common asymptotic sampling properties, Psychometrika 40 (1975) 315–335.
[21] O.P. van Driel, On various causes of improper solutions in maximum likelihood factor analysis, Psychometrika 43 (1978) 225–243.
[22] H. Yanai, K. Shigemasu, S. Maekawa, M. Ichikawa, Factor Analysis, Asakura-shoten, Tokyo, 1990 (in Japanese).