LATTICE MODELS FOR CONDITIONAL INDEPENDENCE IN A
MULTIVARIATE NORMAL DISTRIBUTION
by
Steen Arne Andersson and Michael D. Perlman
TECHNICAL REPORT No. 155 (Revised)
August 1991
Department of Statistics, GN-22
University of Washington
Seattle, Washington 98195 USA
LATTICE MODELS FOR CONDITIONAL INDEPENDENCE IN A
MULTIVARIATE NORMAL DISTRIBUTION$^{1,2}$
BY
STEEN ARNE ANDERSSON
DEPARTMENT OF MATHEMATICS
UNIVERSITY OF INDIANA
AND
MICHAEL D. PERLMAN
DEPARTMENT OF STATISTICS
UNIVERSITY OF WASHINGTON
$^1$ This research was supported in part by the Danish Research Council and by
U.S. National Science Foundation Grant Nos. DMS 86-03489 and 89-02211.
1991.
s was out at the Instititute of Mathematical
Statistics.
Summary
The lattice conditional independence model $N(\mathcal{K})$ is defined to be the
set of all normal distributions on $\mathbb{R}^I$ such that for every pair $L, M \in \mathcal{K}$,
$x_L$ and $x_M$ are conditionally independent given $x_{L \cap M}$. Here $\mathcal{K}$ is a lattice
of subsets of the finite index set $I$ and, for $K \in \mathcal{K}$, $x_K$ is the coordinate
projection of $x \in \mathbb{R}^I$ to $\mathbb{R}^K$. Statistical properties of $N(\mathcal{K})$ are
studied, e.g., maximum likelihood inference, invariance, and the problem
of testing $H_0$: $N(\mathcal{K})$ vs. $H$: $N(\mathcal{M})$ when $\mathcal{M}$ is a sublattice of $\mathcal{K}$. The set $J(\mathcal{K})$
of join-irreducible elements of $\mathcal{K}$ plays a central role in the analysis
of $N(\mathcal{K})$. This class of statistical models is relevant to the analysis of
non-nested multivariate missing data patterns.
AMS 1980 subject classification: Primary 62H12, 62H15; Secondary 62H20,
62H25.
Key words and phrases: Distributive lattice, join-irreducible elements,
pairwise conditional independence, multivariate normal distribution,
generalized block-triangular matrices, maximum likelihood
quotient spaces,
§1. INTRODUCTION
§2. THE CLASS $P_{\mathcal{K}}(I)$ OF COVARIANCE MATRICES $\Sigma$ DETERMINED BY PAIRWISE
CONDITIONAL INDEPENDENCE WITH RESPECT TO A FINITE DISTRIBUTIVE LATTICE $\mathcal{K}$
2.1. The poset $J(\mathcal{K})$ of join-irreducible elements
2.2. The $\mathcal{K}$-parameters of $\Sigma$
2.3. Characterization of conditional independence in terms of $\Sigma^{-1}$
2.4. The $\mathcal{K}$-preserving matrices: generalized block-triangular
matrices with lattice structure
2.5. The $\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$
2.6. Transitive action of the group of $\mathcal{K}$-preserving matrices
2.7. Reconstruction of $\Sigma$ from its $\mathcal{K}$-parameters
2.8. Examples
§3. LIKELIHOOD INFERENCE FOR A NORMAL MODEL DETERMINED BY PAIRWISE
CONDITIONAL INDEPENDENCE
3.1. Factorization of the likelihood function; the MLE of $\Sigma$
3.2. Examples of pairwise conditional independence models
3.3. Invariance of the model
§4. TESTING ONE PAIRWISE CONDITIONAL INDEPENDENCE MODEL AGAINST ANOTHER
4.1. The likelihood ratio statistic
4.2. Central distribution and Box approximation
4.3. Examples of testing problems
§5. INVARIANT FORMULATION OF THE CI MODEL AND TESTING PROBLEM
5.1. The lattice structure of quotient spaces
5.2. Invariant formulation of the pairwise CI model
5.3. Reduction of the CI model to canonical coordinate-wise form
5.4. ... of the testing problem
§6. ...
APPENDIX
A.1. Decomposition Theorem and existence ...
A.2. ...
§1. INTRODUCTION.
Because conditional independence (CI) plays an increasingly important
role in statistical model building, it is of interest to study classes of
CI models with tractable statistical properties and to develop methods
for testing one CI model against another. In this paper we define and
study a class of CI models determined by finite distributive lattices.
For multivariate normal distributions, the parameter space and the
likelihood function (LF) for such a lattice CI model can be factored into
a product of parameter spaces and conditional LF's, respectively,
corresponding to ordinary multivariate normal linear regression models.
This in turn yields explicit maximum likelihood estimators (MLE) and
likelihood ratio tests (LRT) by means of standard techniques from
multivariate analysis.
These lattice CI models arise in a natural way in the analysis of
multivariate missing data sets with non-monotone missing data patterns.
The factorizations mentioned above can be readily applied to obtain
explicit MLE's and LRT's by standard linear methods (cf. [AP] (1991)$^1$).
We introduce this class of lattice CI models by means of the following
simple and familiar model. Let $(x_1, x_2, x_3)^t$ denote a random observation
from the trivariate normal distribution $N(\Sigma)$ with mean vector 0 and
unknown covariance matrix $\Sigma$.$^2$ Consider the model that specifies that $x_2$
$^1$ References to Andersson are abbreviated by [A], Andersson and Perlman by
[AP], etc.
iei paper we I assume ! is
and mean vector the YVjtJu;,a. ...ion is
tel' asswnp1:io.n is easi (1991) .
and $x_3$ are conditionally independent given $x_1$, which we express in the
familiar notation
(1.1) $x_2 \perp\!\!\!\perp x_3 \mid x_1$.
In terms of the covariance matrix $\Sigma$, (1.1) is equivalent to the condition
(1.2) $(\Sigma^{-1})_{23} = (\Sigma^{-1})_{32} = 0$.
In order to express this as a lattice CI model, let $I = \{1,2,3\}$ denote
the index set and consider
(1.3) $\mathcal{K} = \{\emptyset, \{1\}, \{1,2\}, \{1,3\}, I\}$,
a subring of the ring $\mathcal{D}(I)$ of all subsets of $I$. Clearly $\mathcal{K}$ is a finite
distributive lattice under the usual set operations $\cup$ and $\cap$. Define the
class $P_{\mathcal{K}}(I)$ of real positive definite $I \times I$ matrices as follows:
(1.4) $\Sigma \in P_{\mathcal{K}}(I) \iff x_L \perp\!\!\!\perp x_M \mid x_{L \cap M} \quad \forall\, L, M \in \mathcal{K}$,
where $x \sim N(\Sigma)$ and $x_T$ denotes the $T$-subvector of $x$ when $T \subseteq I$. It is
readily verified that (1.1) and (1.4) are equivalent conditions.$^3$
In this example the factorizations of the parameter space and the LF
mentioned above are represented as follows:
(1.5) $\Sigma \leftrightarrow (\Sigma_{11},\ \Sigma_{21}\Sigma_{11}^{-1},\ \Sigma_{22\cdot 1},\ \Sigma_{31}\Sigma_{11}^{-1},\ \Sigma_{33\cdot 1})$,
(1.6) $f(x_1, x_2, x_3) = f(x_1)\, f(x_2 \mid x_1)\, f(x_3 \mid x_1)$.
The five parameters on the right-hand side of (1.5) represent ordinary
unconditional and conditional variances and regression coefficients.
Whereas the range of the positive definite matrix $\Sigma$ in (1.5) is
constrained by (1.2), the ranges of these five parameters are
unconstrained (except for the trivial requirement that $\Sigma_{11}$, $\Sigma_{22\cdot 1}$, and
$\Sigma_{33\cdot 1}$ are positive). Thus the MLE's of these five parameters, called the
$\mathcal{K}$-parameters of the CI model, are easily obtained from (1.6), and the MLE
of $\Sigma$ may be reconstructed from these estimates.
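As a rough numerical check of the reconstruction step, the sketch below rebuilds a 3x3 covariance matrix from five values playing the role of the parameters in (1.5) and confirms that the (2,3) entry of its inverse vanishes, as required by (1.2). The function names, argument names, and numerical values are illustrative assumptions and are not taken from the paper; the remaining entry is filled in using $\Sigma_{32} = \Sigma_{31}\Sigma_{11}^{-1}\Sigma_{12}$, which is the scalar form of the CI condition (1.1).

```python
from fractions import Fraction

def sigma_from_parameters(s11, b21, s22_1, b31, s33_1):
    """Rebuild a 3x3 covariance matrix from the five parameters in (1.5).

    s11          : variance of x1
    b21, b31     : regression coefficients of x2 and of x3 on x1
    s22_1, s33_1 : conditional variances of x2 and of x3 given x1
    (hypothetical argument names, chosen for this sketch only)
    """
    s21 = b21 * s11
    s31 = b31 * s11
    s22 = s22_1 + b21 * s21
    s33 = s33_1 + b31 * s31
    s32 = s31 * s21 / s11  # forced by the CI condition (1.1)
    return [[s11, s21, s31], [s21, s22, s32], [s31, s32, s33]]

def inverse_3x3(m):
    """Exact inverse of a 3x3 matrix via cofactors (adjugate / determinant)."""
    def cofactor(i, j):
        r = [k for k in range(3) if k != i]
        c = [k for k in range(3) if k != j]
        minor = m[r[0]][c[0]] * m[r[1]][c[1]] - m[r[0]][c[1]] * m[r[1]][c[0]]
        return (-1) ** (i + j) * minor
    det = sum(m[0][j] * cofactor(0, j) for j in range(3))
    return [[cofactor(j, i) / det for j in range(3)] for i in range(3)]

# Arbitrary illustrative parameter values (exact rational arithmetic).
sigma = sigma_from_parameters(Fraction(2), Fraction(1, 2), Fraction(3),
                              Fraction(-1), Fraction(5))
precision = inverse_3x3(sigma)
print(precision[1][2], precision[2][1])  # the entries constrained by (1.2)
```

Exact fractions are used instead of floating point so that the zero in the precision matrix can be tested for equality rather than up to a tolerance.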
A nonempty subset $K \in \mathcal{K}$ is called join-irreducible if $K$ is not the join
($\equiv$ union) of two or more proper subsets of $K$ that belong to $\mathcal{K}$ (cf.
Section 2.1). The collection of all join-irreducible elements in $\mathcal{K}$ is
denoted by $J(\mathcal{K})$. Thus when $\mathcal{K}$ is given by (1.3),
(1.7) $J(\mathcal{K}) = \{\{1\}, \{1,2\}, \{1,3\}\}$.
It will be seen that the basic factorizations (1.5) and (1.6), as well as
their extensions to the general lattice CI model $N(\mathcal{K})$ defined next,
always are indexed by the members of $J(\mathcal{K})$.
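The join-irreducible elements of a small lattice of subsets can be enumerated mechanically. The sketch below is only an illustration (the function name `join_irreducibles` is hypothetical); it uses the equivalent criterion stated in Section 2.1, namely that a nonempty member $K$ is join-irreducible exactly when the union of all members strictly contained in $K$ is a proper subset of $K$.

```python
def join_irreducibles(lattice):
    """Return the join-irreducible members of a lattice of subsets.

    `lattice` is any iterable of sets assumed to be closed under union
    and intersection; this assumption is not checked here.
    """
    members = {frozenset(k) for k in lattice}
    result = set()
    for k in members:
        if not k:
            continue  # the empty set is excluded by definition
        # union of all members of the lattice strictly contained in k
        below = frozenset().union(*(m for m in members if m < k))
        if below != k:
            result.add(k)
    return result

# The lattice (1.3) on the index set {1, 2, 3}.
K = [set(), {1}, {1, 2}, {1, 3}, {1, 2, 3}]
J = join_irreducibles(K)
print(sorted(sorted(k) for k in J))
```

For the lattice (1.3) this reproduces the three sets listed in (1.7).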
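Definition (1.4) immediately extends to define the general lattice CI
model. Let $I$ be a finite index set and let $\mathcal{K}$ be an arbitrary
subring of $\mathcal{D}(I)$, so again $\mathcal{K}$ is a finite distributive lattice.$^4$ Then (1.4)
defines the class $P_{\mathcal{K}}(I)$ of covariance matrices determined by pairwise CI
restrictions with respect to the lattice $\mathcal{K}$: $\Sigma \in P_{\mathcal{K}}(I)$ if and only if $x_L$
and $x_M$ are conditionally independent given $x_{L \cap M}$ for every $L, M \in \mathcal{K}$.
If $N(\Sigma)$ denotes the normal distribution on $\mathbb{R}^I$ with mean vector 0 and
unknown covariance matrix $\Sigma$, the normal statistical model
(1.8) $N(\mathcal{K}) := \{N(\Sigma) \mid \Sigma \in P_{\mathcal{K}}(I)\}$
is the lattice conditional independence (CI) model determined by $\mathcal{K}$.
$^4$ Every finite distributive lattice can be represented as a ring of
subsets of some finite set $I$.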
In this paper we study the structure of $P_{\mathcal{K}}(I)$ and the statistical
properties of the model $N(\mathcal{K})$. In Section 2.3 (Theorem 2.1) we generalize
(1.2) by characterizing $\Sigma \in P_{\mathcal{K}}(I)$ in terms of the precision matrix $\Sigma^{-1}$.
In Section 2.5 (Theorem 2.2) we generalize (1.5) by showing that each $\Sigma \in
P_{\mathcal{K}}(I)$ can be uniquely represented in terms of its $\mathcal{K}$-parameters, whose
ranges are unconstrained, so that the parameter space $P_{\mathcal{K}}(I)$ again factors
into a product of parameter spaces for ordinary linear regression models.
In Section 2.7 we present a general algorithm for reconstructing $\Sigma \in
P_{\mathcal{K}}(I)$ from its $\mathcal{K}$-parameters. A series of examples in Section 2.8
illustrates these results.
The factorization (1.6) of the LF as a product of conditional densities
involving only the $\mathcal{K}$-parameters of $\Sigma$ is extended to the general lattice
CI model $N(\mathcal{K})$ in Section 3.1 (Theorem 3.1). The MLE's of the $\mathcal{K}$-parameters
of $\Sigma$ are obtained from the general factorization, and the MLE of $\Sigma$
can be reconstructed by the algorithm given in Section 2.7. This
estimation procedure is illustrated by examples in Section 3.2. In Remark
3.5 it is n ....1r-""rI the model is
or iveri. and Car in
( ) with addi L ......LKl.'" lattice structure.
In Section 4 we treat the problem of testing one lattice CI model
against another, i.e., testing
(1.9) $H_0$: $N(\mathcal{K})$ vs. $H$: $N(\mathcal{M})$
when $\mathcal{M}$ is a sublattice of $\mathcal{K}$.$^5$ For example, in the trivial case considered
above with $I = \{1,2,3\}$, suppose that $\mathcal{K} = \{\emptyset, \{1\}, \{1,2\}, \{1,3\}, I\}$ (cf.
(1.3)) and $\mathcal{M} = \{\emptyset, I\}$. Then $N(\mathcal{M})$ is simply the normal model with no
restriction on $\Sigma$ and (1.9) becomes the problem of testing $x_2 \perp\!\!\!\perp x_3 \mid x_1$
(equivalently, (1.2)) against the unrestricted alternative, which can be
stated equivalently as the problem of testing
(1.10)
where $\Sigma = (\sigma_{ij} \mid i,j = 1,2,3)$. If, however,$^6$
(1.11) $\mathcal{K} = \{\emptyset, \{1\}, \{3\}, \{1,2\}, \{1,3\}, I\}$
while $\mathcal{M} = \{\emptyset, \{1\}, \{1,2\}, \{1,3\}, I\}$, then (1.9) becomes the problem of
testing $(x_1, x_2) \perp\!\!\!\perp x_3$ against $x_2 \perp\!\!\!\perp x_3 \mid x_1$, which is equivalent to the
problem of testing
$^5$ Note that $\mathcal{M} \subseteq \mathcal{K} \Rightarrow N(\mathcal{K}) \subseteq N(\mathcal{M})$.
. 1) and ::I.' = =
lattices rlara,~_~~_ tions Jl . Thus
two different same eI
(1.12)
The LRT statistic $\Lambda$ for the general testing problem (1.9) is derived in
Section 4.1 and is readily expressible in terms of the MLE's of the
$\mathcal{K}$-parameters and $\mathcal{M}$-parameters of $\Sigma$. In Section 4.2 the central
distribution of $\Lambda$ is derived in terms of its moments by means of the
invariance of the testing problem. Specific examples of this testing
problem are considered in Section 4.3.
These and associated results are greatly facilitated by the fact that
the model $N(\mathcal{K})$ is invariant under a group $G \equiv GL_{\mathcal{K}}(I)$ and that $G$ acts
transitively on $P_{\mathcal{K}}(I)$. This group $G$ is a subgroup of a group of
nonsingular block-triangular $I \times I$ matrices. To illustrate this, return to
the trivariate lattice CI model considered above with $\mathcal{K}$ given by (1.3).
It can be seen that the CI model given by (1.1) $\equiv$ (1.2) is invariant
under all nonsingular linear transformations of the form
(1.13) $A = \begin{pmatrix} a_{11} & 0 & 0 \\ a_{21} & a_{22} & 0 \\ a_{31} & 0 & a_{33} \end{pmatrix}$
and that any nonsingular linear transformation $A$ that leaves this CI
model invariant must be of the form (1.13). The collection of all such
matrices $A$ forms a subgroup of the group of all $3 \times 3$ nonsingular lower
triangular matrices. It is also true, but not so easy to see, that $G$ acts
{ ( ) (1. • is
a sp~ecJlal of ( .
transitively on the class $P_{\mathcal{K}}(I)$ of all covariance matrices $\Sigma$ that satisfy
(1.1) $\equiv$ (1.2), i.e., for any such $\Sigma$ there exists $A \in G$ such that $\Sigma = AA^t$.
These facts, some of which were used by Das Gupta (1977), Giri (1979),
Banerjee and Giri (1980), and Marden (1981) to study the distribution and
optimality of invariant tests for problems such as (1.10) and (1.12),
will be extended in the present paper to the general lattice CI model
$N(\mathcal{K})$. In Section 2.4 it will be shown how $\mathcal{K}$ determines the invariance
group $GL_{\mathcal{K}}(I)$, a group of generalized block-triangular $I \times I$ matrices with
lattice structure, while the transitive action of $GL_{\mathcal{K}}(I)$ on $P_{\mathcal{K}}(I)$ is
demonstrated in Section 2.6 (Theorem 2.3), generalizing the well-known
Choleski decomposition of an arbitrary positive definite matrix. The
transitivity yields a factorization (Lemma 2.5) of the determinant of $\Sigma \in
P_{\mathcal{K}}(I)$, a generalization of the well-known Schur formula $\det(\Sigma) =
\det(\Sigma_{11})\det(\Sigma_{22\cdot 1})$.
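The invariance and the easy direction of the transitivity statement for the trivariate example can be checked numerically. The sketch below is an illustration under stated assumptions: the matrices `A`, `B`, `C` are arbitrary values chosen here, the helper names are hypothetical, and the zero pattern tested is the one forced by Proposition 2.2(iii) for the lattice (1.3) (lower triangular with a zero in position (3,2)). The CI condition (1.2) is tested through the (2,3) cofactor of $\Sigma$, which vanishes exactly when $(\Sigma^{-1})_{23} = 0$ for nonsingular symmetric $\Sigma$.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def in_group_pattern(a):
    """True if a is lower triangular with an additional zero at (3,2)."""
    return a[0][1] == 0 and a[0][2] == 0 and a[1][2] == 0 and a[2][1] == 0

def satisfies_ci(s):
    """Check (1.2) for symmetric nonsingular s via the (2,3) cofactor."""
    return s[0][0] * s[1][2] - s[0][2] * s[1][0] == 0

# Two arbitrary nonsingular matrices with the required zero pattern.
A = [[Fraction(2), 0, 0], [Fraction(1), Fraction(3), 0], [Fraction(-1), 0, Fraction(1, 2)]]
B = [[Fraction(1), 0, 0], [Fraction(4), Fraction(-2), 0], [Fraction(5), 0, Fraction(7)]]
# A general lower triangular matrix with a nonzero (3,2) entry, for contrast.
C = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]

sigma = matmul(A, transpose(A))                      # Sigma = A A^t
moved = matmul(matmul(B, sigma), transpose(B))       # B Sigma B^t
other = matmul(C, transpose(C))                      # C C^t

print(in_group_pattern(matmul(B, A)), satisfies_ci(sigma),
      satisfies_ci(moved), satisfies_ci(other))
```

The contrast matrix `C` shows that the extra zero matters: an ordinary Choleski factor with a nonzero (3,2) entry generally produces a covariance matrix outside the CI class.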
As already seen for the trivariate example above, all statistical
properties of the general lattice CI model $N(\mathcal{K})$, including the definition
of the $\mathcal{K}$-parameters of $\Sigma$, the factorizations of its parameter space and
LF as products of those for linear regression models, the form of the
MLE, the form of the LRT statistic and its central distribution, and the
partitioning and location of zeroes in the invariance matrix $A \in GL_{\mathcal{K}}(I)$,
are determined by the fundamental structure of the lattice $\mathcal{K}$, in
particular by its associated poset $J(\mathcal{K})$ of join-irreducible elements
(cf. Section 2.1). As in the case of a balanced ANOVA design, where the
poset of join-irreducible elements of the lattice of subspaces determines
. [A] lattu:e CI
the model.
in
non-monotone missing data models. Under the assumption of multivariate
normality it is well known that a monotone missing data model with
unrestricted covariance matrix $\Sigma$ admits a complete and explicit
likelihood analysis, remaining invariant under the appropriate group of
block-triangular matrices (in the usual sense), which acts transitively
on the unrestricted set of covariance matrices (cf. Eaton and Kariya
(1983), [AMP] (1990)). If the missing data pattern is non-monotone,
however, then explicit analysis is not possible in general.
The relationship between lattice CI models and non-monotone missing
data patterns is developed fully in [AP] (1991) but can be illustrated in
terms of the trivariate example considered above. Suppose that one
attempts to observe a random sample from the trivariate normal
distribution $N(\Sigma)$, where $\Sigma$ is unknown and initially unrestricted, but
that some of the observations are incomplete. For example, suppose that
we have several complete vector observations of the form $(x_1, x_2, x_3)^t$ and
also several incomplete observations of the forms $(x_1, x_2)^t$
and $(x_1, x_3)^t$.
Then the missing data pattern (actually, the pattern of the observed
data) is the set
(1.14) $\mathcal{S} := \{\{1,2\}, \{1,3\}, \{1,2,3\}\}$,
i.e., the collection of subsets of $I \equiv \{1,2,3\}$ corresponding to the
subvectors actually observed. Because the missing data pattern $\mathcal{S}$ is
non-monotone, i.e., is not totally ordered by inclusion, the LF cannot be
factored into a product of LF's of linear regression models and the MLE
of $\Sigma$ cannot be obtained explicitly.$^8$ Instead, iterative estimation
methods such as the EM algorithm must be used, possibly accompanied by
difficulties with convergence or uniqueness of the estimates (cf. Little
and Rubin (1987)).
An alternate approach, suggested by Rubin (1987) and developed in [AP]
(1991), is to restrict $\Sigma$ by imposing the CI conditions of the lattice CI
model $N(\mathcal{K})$, where $\mathcal{K} = \mathcal{K}(\mathcal{S})$ is the lattice generated by $\mathcal{S}$. With $\mathcal{S}$ given by
(1.14) it is easy to see that $\mathcal{K}$ is given by (1.3), so the corresponding
CI condition is given by (1.1). Under this condition the densities for
the complete and incomplete observations factor as
(1.15)
$f(x_1, x_2, x_3) = f(x_1)\, f(x_2 \mid x_1)\, f(x_3 \mid x_1)$,
$f(x_1, x_2) = f(x_1)\, f(x_2 \mid x_1)$,
$f(x_1, x_3) = f(x_1)\, f(x_3 \mid x_1)$,
so the overall LF is a product of LF's of only the three types $f(x_1)$,
$f(x_2 \mid x_1)$, and $f(x_3 \mid x_1)$, the latter two corresponding to simple linear
regression models. Also, the overall parameter space is the product of
the parameter spaces for these three LF's. Therefore the similar terms
may be combined and the MLE of $\Sigma$ may be obtained by maximizing these
three LF's separately, which involves only elementary calculations.
Furthermore, under the CI restriction $\Sigma \in P_{\mathcal{K}}(I)$, this non-monotone
missing data model remains invariant under the group $GL_{\mathcal{K}}(I)$ of
triangular matrices $A$ in (1.13) and $GL_{\mathcal{K}}(I)$ acts transitively on $P_{\mathcal{K}}(I)$.
fact some non-monotone ...... ,"'......."6
obf;ervat1(IDS, :I may not
Finally, the CI assumption may be tested by means of the LRT for (1.10)
as discussed above.
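The passage from the observed-data pattern (1.14) to the lattice (1.3) can be sketched as a closure computation. The code below is a minimal illustration, assuming that "the lattice generated by" a pattern means the smallest family containing the pattern, the empty set, and the full index set that is closed under union and intersection; the function name `generated_lattice` is hypothetical.

```python
def generated_lattice(pattern, index_set):
    """Close a family of subsets under union and intersection.

    The empty set and the full index set are added first; this is an
    assumption of the sketch, matching the form of the lattice (1.3).
    """
    members = {frozenset(s) for s in pattern}
    members |= {frozenset(), frozenset(index_set)}
    changed = True
    while changed:
        changed = False
        for a in list(members):
            for b in list(members):
                for c in (a | b, a & b):
                    if c not in members:
                        members.add(c)
                        changed = True
    return members

# The non-monotone pattern (1.14) on the index set {1, 2, 3}.
S = [{1, 2}, {1, 3}, {1, 2, 3}]
lattice = generated_lattice(S, {1, 2, 3})
print(sorted(sorted(k) for k in lattice))
```

The only set added beyond the pattern itself and the empty set is $\{1\} = \{1,2\} \cap \{1,3\}$, which is why the resulting CI condition is conditioning on $x_1$.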
Whereas the determination of the appropriate CI conditions and the
factorization (1.15) is transparent in this simple example, a general
missing data pattern requires the lattice-theoretic approach developed in
the present paper; see [AP] (1991) for complete details. Thus, the
results in the present paper open the possibility of applying classical
multivariate techniques to a class of missing data models much larger
than the monotone class.
In Section 5 the CI models and results already described are recast in
an invariant ($\equiv$ coordinate-free) formulation, rather than in the matrix
(coordinate-wise) formulation just given. This is done for the following
reason: a model which, when presented in matrix formulation, may not
appear to be a lattice CI model according to the non-invariant definition
given above, may in fact belong to this class after an appropriate linear
transformation.$^9$
This is readily illustrated in terms of the trivariate missing data
example given in the paragraph containing (1.14). Rather than the missing
data pattern described by (1.14), consider a missing data array that
includes incomplete observations involving not only the coordinates of $x$
but also one or more linear combinations of these coordinates. For
example, suppose that we have complete observations of the form
$(x_1, x_2, x_3)^t$ and also several incomplete observations of the forms
9Of course is no means unique to the lattice CI For
must be described
in terms rather than
values certain coordinates the mean vector.
$(x_1, x_2)^t$ and $(x_1 + x_2, x_3)^t$. Although this does not directly fit into the
framework of the coordinate-wise missing data models discussed above and
in [AP] (1991), it is easy to transform it to such a framework by means
of a nonsingular linear transformation $(y_1, y_2, y_3) = (x_1 + x_2, x_2, x_3)$. In
terms of $y_1, y_2, y_3$ the missing data pattern is now given precisely by
(1.14), hence as before the associated lattice CI model imposes the
assumption that $y_2 \perp\!\!\!\perp y_3 \mid y_1$, i.e., $x_2 \perp\!\!\!\perp x_3 \mid x_1 + x_2$ (equivalently,
$x_1 \perp\!\!\!\perp x_3 \mid x_1 + x_2$).
The existence and form of an appropriate linear transformation from $x$
to $y$ (or equivalently, of an appropriate vector basis for the observation
space) may not be so apparent in more complex missing data schemes with
linear combinations present. The invariant formulation of a general
lattice CI model, presented in Section 5, allows one to recognize and
treat, without a preliminary transformation, a set of CI conditions such
as $x_2 \perp\!\!\!\perp x_3 \mid x_1 + x_2$ in the same manner as the coordinate-wise lattice CI
conditions in (1.4).
The invariant formulation is stated in terms of a lattice $\mathcal{Q}$ of quotient
spaces $Q$ of a real finite-dimensional vector space $V$.$^{10}$ (See Section 5.1
for definitions, where it is noted that if $\mathcal{Q}$ is distributive then it is
finite.) For each $Q \in \mathcal{Q}$ let $p_Q: V \to Q$ denote the projection onto $Q$. Then
the general lattice CI model $N_V(\mathcal{Q})$ is defined in Section 5.2 to be the
for ~ R, T € l/l. Theorem 5.1 it is noted that KV(l/l) is
if and only fJ. is distributive.
I vector spaces
over the field
matrices in paper are
To express our original coordinate-wise formulation of the lattice CI
models in this invariant framework, set $V = \mathbb{R}^I$, identify each subset $K \subseteq
I$ with the quotient space $\mathbb{R}^K$, and let $p_K: \mathbb{R}^I \to \mathbb{R}^K$ denote the usual
coordinate projection mapping. Then the definition of the general lattice
CI model in the preceding paragraph reduces to (1.4).
The basic decomposition theorem for a distributive lattice $\mathcal{Q}$ of
quotient spaces (cf. Appendix A.1) states that the observation space $V$
can be represented as a product of vector spaces indexed by the poset
$J(\mathcal{Q})$ of join-irreducible elements in $\mathcal{Q}$ in such a way that for each $Q \in \mathcal{Q}$,
the projection $p_Q: V \to Q$ becomes simply a canonical projection. By means
of this representation we may choose a $\mathcal{Q}$-adapted basis for $V$ (cf.
Proposition 5.1). In Section 5.3 it is shown that in terms of the
coordinate system determined by this basis, the CI model $N_V(\mathcal{Q})$ can be
expressed in the canonical coordinate-wise form (1.4) and the statistical
analysis of the model may then proceed according to the coordinate-wise
formulation.
The general problem of testing one lattice CI model against another is
formulated invariantly as follows: test $H_0$: $N_V(\mathcal{Q})$ vs. $H$: $N_V(\mathcal{M})$, where $\mathcal{Q}$
and $\mathcal{M}$ are distributive lattices of quotient spaces of $V$ such that $\mathcal{M} \subseteq \mathcal{Q}$.
In Section 5.4 it is noted that one can choose a basis for $V$ that is both
$\mathcal{Q}$-adapted and $\mathcal{M}$-adapted, by means of which this testing problem can be
reduced to canonical coordinate-wise form (1.9).
Several extensions of the class of lattice CI models are
discussed briefly in Section 6. Three important but technical
are in
tions dire<:ted or
increasing attention. Prominent references for normal distributions
include Dempster (1972), Frydenberg (1990), Frydenberg and Lauritzen
(1989), Kiiveri, Speed, and Carlin (1984), Lauritzen (1985, 1989),
Lauritzen, Dawid, Larsen and Leimer (1990), Lauritzen and Wermuth (1984,
1989), Porteous (1985), Speed and Kiiveri (1986), and Wermuth (1976,
1980, 1985, 1988); see Whittaker (1990) for a readable introduction to
this area. In many of these studies the CI assumptions are equivalent to
the occurrence of patterns of zeroes in the precision matrix $\Sigma^{-1}$ of a
multivariate normal distribution, hence the models are linear in $\Sigma^{-1}$. It
will be seen from Examples 2.6 - 2.8, however, that unlike the special
case (1.2), in general the lattice CI models introduced here are neither
linear in $\Sigma^{-1}$ nor in $\Sigma$. Furthermore, the statistical interpretation and
analysis of a lattice CI model appear to differ from those of a model
defined by graphical conditions. Although it is of interest to determine
the relation between these two types of CI models and compare their
properties, our attempts to interpret either class in the framework of
the other have not been illuminating thus far.
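§2. THE CLASS $P_{\mathcal{K}}(I)$ OF COVARIANCE MATRICES $\Sigma$ DETERMINED BY PAIRWISE
CONDITIONAL INDEPENDENCE WITH RESPECT TO A FINITE DISTRIBUTIVE LATTICE $\mathcal{K}$.
Let $I$ be a finite index set and let $\mathcal{D}(I)$ denote the ring of all subsets of
$I$. Let $\mathcal{K}$ be a subring of $\mathcal{D}(I)$, i.e., $\mathcal{K}$ is closed under $\cup$ and $\cap$. We
shall always assume that $I, \emptyset \in \mathcal{K}$. Then $\mathcal{K}$ is a finite distributive
lattice with $\cup$ and $\cap$ as the join and meet operations. For
$T, U \in \mathcal{D}(I)$ write $T \subset U$ to indicate that $T \subseteq U$ and $T \neq U$. Let $|T|$ denote
the number of elements in a set $T$.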
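Let $N(\Sigma)$ denote the normal distribution on $\mathbb{R}^I$ with mean $0 \in \mathbb{R}^I$ and
covariance matrix $\Sigma \in P(I)$, where $P(I)$ denotes the set of all positive
definite $I \times I$ matrices. For any $T \subseteq I$ and column vector $x = (x_i \mid i \in I) \in \mathbb{R}^I$
define $x_T := (x_i \mid i \in T)$, the $T$-subcolumn of $x$. Note that $x_I = x$ and define
$x_\emptyset := \{0\}$.
Definition 2.1. The class $P_{\mathcal{K}}(I) \subseteq P(I)$ is defined as follows (cf. (1.4)):
(2.1) $\Sigma \in P_{\mathcal{K}}(I) \iff x_L \perp\!\!\!\perp x_M \mid x_{L \cap M} \quad \forall\, L, M \in \mathcal{K}$ when $x \sim N(\Sigma)$.
I.e., $x_L$ and $x_M$ are conditionally independent (CI) given $x_{L \cap M}$ $\forall\, L, M \in \mathcal{K}$. $\square$
If $L \cap M = \emptyset$ then (2.1) reduces to $x_L \perp\!\!\!\perp x_M$, that is, $x_L$ and $x_M$ are
independent. Note that the right-hand side of (2.1) is ordinarily written
in the form
(2.2) $x_{L \setminus (L \cap M)} \perp\!\!\!\perp x_{M \setminus (L \cap M)} \mid x_{L \cap M} \quad \forall\, L, M \in \mathcal{K}$.
Some of these pairwise CI conditions are trivially satisfied, e.g.,
whenever $L \subseteq M$ (or $M \subseteq L$) (also see Remark 3.2). In particular, if $\mathcal{K}$ is a
chain then $P_{\mathcal{K}}(I) = P(I)$, i.e., $\Sigma$ is unrestricted (cf. Examples 2.1 and
2.2).
2.1. The poset $J(\mathcal{K})$ of join-irreducible elements.
The structure of $\Sigma \in P_{\mathcal{K}}(I)$ depends on the poset of join-irreducible
elements of $\mathcal{K}$, which we now define. For $K \in \mathcal{K}$ let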
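$\langle K \rangle := \cup(K' \in \mathcal{K} \mid K' \subset K)$
$[K] := K \setminus \langle K \rangle$,
so that
(2.3) $K = \langle K \rangle \,\dot\cup\, [K]$,
where $\dot\cup$ indicates that the union is disjoint. Then define
$J(\mathcal{K}) := \{K \in \mathcal{K} \mid K \neq \emptyset,\ \langle K \rangle \subset K\}$
$= \{K \in \mathcal{K} \mid K \neq \emptyset,\ [K] \neq \emptyset\}$
$= \{K \in \mathcal{K} \mid K \neq \emptyset,\ \forall\, L, M \in \mathcal{K}: K = L \cup M \Rightarrow K = L \text{ or } K = M\}$.
If $K \in J(\mathcal{K})$ we say that $K$ is join-irreducible. (See Grätzer (1978),
Chapter II, or Davey and Priestley (1990), Chapter 8, for properties of
$J(\mathcal{K})$; in particular, $\mathcal{K}$ is uniquely determined by $J(\mathcal{K})$.)
For $L \in \mathcal{K}$ define $\mathcal{K}_L := \{K \in \mathcal{K} \mid K \subseteq L\}$, a sublattice of $\mathcal{K}$ ($\mathcal{K}_I = \mathcal{K}$). The
following relations are elementary:
(2.4) $L = \cup(K \mid K \in J(\mathcal{K}_L))$
(2.5) $J(\mathcal{K}_L) = J(\mathcal{K}) \cap \mathcal{K}_L$
(2.6) $J(\mathcal{K}_{L \cap M}) = J(\mathcal{K}_L) \cap J(\mathcal{K}_M)$
(2.7) $J(\mathcal{K}_{L \cup M}) = J(\mathcal{K}_L) \cup J(\mathcal{K}_M)$.
Proposition 2.1. Every $L \in \mathcal{K}$ can be decomposed according to the members
of $J(\mathcal{K})$ as follows:
(2.8) $L = \dot\cup([K] \mid K \in J(\mathcal{K}_L))$.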
Proof. Let $K, H \in J(\mathcal{K})$ with $K \neq H$, so that $K \cap H \subset K$ or $K \cap H \subset H$. Suppose
that $K \cap H \subset H$. Then $K \cap H \subseteq \langle H \rangle$ and it follows that $[K] \cap [H] =
K \cap \langle K \rangle^c \cap H \cap \langle H \rangle^c = \emptyset$, hence $([K] \mid K \in J(\mathcal{K}))$ is a disjoint family. The
inclusion $\supseteq$ in (2.8) is trivial. To establish $\subseteq$ consider $i \in L$. Define $K_i
:= \cap(L' \in \mathcal{K} \mid i \in L')$, the smallest set in $\mathcal{K}$ containing $i$. Then $K_i \in J(\mathcal{K})$,
as seen from the following indirect argument. Suppose that $K_i \notin J(\mathcal{K})$ and
thus that $K_i = L_1 \cup L_2$ where $L_1, L_2 \in \mathcal{K}$, $L_1 \subset K_i$, and $L_2 \subset K_i$. Then $i \in L_1$
or $i \in L_2$, contradicting the minimality of $K_i$. Finally, if $i \in \langle K_i \rangle$ ($\subset K_i$)
the minimality of $K_i$ again would be contradicted, hence $i \in [K_i]$. Since
$K_i \in J(\mathcal{K})$ and $K_i \subseteq L$, this establishes the inclusion $\subseteq$ in (2.8). $\square$
In particular, set $L = I$ in (2.8) to obtain
(2.9) $I = \dot\cup([K] \mid K \in J(\mathcal{K}))$.
For example, suppose that $I = \{1,2,3\}$ and $\mathcal{K}$ is given by (1.3). Then $J(\mathcal{K})$
is given by (1.7) and we find that $[\{1\}] = \{1\}$, $[\{1,2\}] = \{2\}$, and
$[\{1,3\}] = \{3\}$, so (2.9) is evident.
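The decomposition (2.9) can be verified mechanically for small lattices. The following sketch is illustrative only (the function name `bracket_parts` is an assumption of this example): for each nonempty member $K$ it computes $\langle K \rangle$ as the union of the members strictly contained in $K$ and keeps $[K] = K \setminus \langle K \rangle$ whenever it is nonempty, i.e., whenever $K$ is join-irreducible.

```python
def bracket_parts(lattice):
    """Map each join-irreducible K to its part [K] = K minus <K>."""
    members = {frozenset(k) for k in lattice}
    parts = {}
    for k in members:
        if not k:
            continue  # the empty set is never join-irreducible
        # <K>: union of all members of the lattice strictly contained in K
        angle = frozenset().union(*(m for m in members if m < k))
        bracket = k - angle
        if bracket:
            parts[k] = bracket
    return parts

# The lattice (1.3) on the index set {1, 2, 3}.
K = [set(), {1}, {1, 2}, {1, 3}, {1, 2, 3}]
parts = bracket_parts(K)
for k, b in sorted(parts.items(), key=lambda kb: sorted(kb[0])):
    print(sorted(k), "->", sorted(b))
```

The keys of the returned dictionary are the members of $J(\mathcal{K})$, and the values form a partition of the index set, in agreement with (2.9).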
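2.2. The $\mathcal{K}$-parameters of $\Sigma$.
For any finite index sets $T$ and $U$ let $M(T \times U)$ denote the vector space of
all real $T \times U$ matrices, $P(T)$ the set of all positive definite $T \times T$ matrices,
$M(T) \equiv M(T \times T)$ the algebra of real $T \times T$ matrices, and $GL(T)$ the group of
nonsingular $T \times T$ matrices. For every $\Sigma \in P(I)$ and every subset $T \subseteq I$, let
$\Sigma_T \in P(T)$ denote the $T \times T$ submatrix of $\Sigma$ and, for $K \in \mathcal{K}$, partition $\Sigma_K$
according to (2.3) as
(2.10) $\Sigma_K = \begin{pmatrix} \Sigma_{\langle K \rangle} & \Sigma_{\langle K]} \\ \Sigma_{[K \rangle} & \Sigma_{[K]} \end{pmatrix}$,
so $\Sigma_{\langle K \rangle} \in P(\langle K \rangle)$, $\Sigma_{[K]} \in P([K])$, $\Sigma_{[K \rangle} \in M([K] \times \langle K \rangle)$, and $\Sigma_{\langle K]} = \Sigma_{[K \rangle}^t$.
Furthermore, define
(2.11) $\Sigma_{[K]\cdot} := \Sigma_{[K]} - \Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1} \Sigma_{\langle K]}$
and let $\Sigma_{[K]\cdot}^{-1}$ denote $(\Sigma_{[K]\cdot})^{-1}$. Then for every $x \in \mathbb{R}^I$,
(2.12) $\mathrm{tr}(\Sigma_K^{-1} x_K x_K^t) = \mathrm{tr}(\Sigma_{\langle K \rangle}^{-1} x_{\langle K \rangle} x_{\langle K \rangle}^t) + \mathrm{tr}(\Sigma_{[K]\cdot}^{-1} (x_{[K]} - \Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1} x_{\langle K \rangle})(x_{[K]} - \Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1} x_{\langle K \rangle})^t)$.
Definition 2.2. For $\Sigma \in P(I)$, the family of matrices
(2.13) $((\Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1},\ \Sigma_{[K]\cdot}) \mid K \in J(\mathcal{K}))$
is called the family of $\mathcal{K}$-parameters of $\Sigma$. $\square$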
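2.3. Characterization of conditional independence in terms of $\Sigma^{-1}$.
Theorem 2.1 presents an algebraic characterization of the set $P_{\mathcal{K}}(I)$ of
covariance matrices $\Sigma$ defined in terms of pairwise conditional
independence (2.1). The following description of pairwise CI is useful.
Lemma 2.1. Let $x \sim N(\Sigma)$, $\Sigma \in P(I)$. Then for any $L, M \subseteq I$, $x_L \perp\!\!\!\perp x_M \mid x_{L \cap M}$
if and only if $\forall\, x \in \mathbb{R}^I$:
(2.14) $\mathrm{tr}(\Sigma_{L \cup M}^{-1} x_{L \cup M} x_{L \cup M}^t) = \mathrm{tr}(\Sigma_L^{-1} x_L x_L^t) + \mathrm{tr}(\Sigma_M^{-1} x_M x_M^t) - \mathrm{tr}(\Sigma_{L \cap M}^{-1} x_{L \cap M} x_{L \cap M}^t)$.
Proof. The difference
$\mathrm{tr}(\Sigma_{L \cup M}^{-1} x_{L \cup M} x_{L \cup M}^t) - \mathrm{tr}(\Sigma_{L \cap M}^{-1} x_{L \cap M} x_{L \cap M}^t)$
appears in the exponential term of the conditional density of
$x_{(L \cup M) \setminus (L \cap M)}$ given $x_{L \cap M}$. Therefore $x_L \perp\!\!\!\perp x_M \mid x_{L \cap M}$ if and only if this
difference is the sum of the differences appearing in the exponential
terms of the conditional densities of $x_{L \setminus (L \cap M)}$ given $x_{L \cap M}$ and $x_{M \setminus (L \cap M)}$
given $x_{L \cap M}$. This sum is
$\mathrm{tr}(\Sigma_L^{-1} x_L x_L^t) + \mathrm{tr}(\Sigma_M^{-1} x_M x_M^t) - 2\,\mathrm{tr}(\Sigma_{L \cap M}^{-1} x_{L \cap M} x_{L \cap M}^t)$
and the lemma follows. $\square$
Theorem 2.1. (Characterization of $P_{\mathcal{K}}(I)$.) For $\Sigma \in P(I)$ the following
conditions are equivalent:
(i) $\Sigma \in P_{\mathcal{K}}(I)$;
(ii) $\forall\, x \in \mathbb{R}^I$:
$\mathrm{tr}(\Sigma^{-1} x x^t) = \sum(\mathrm{tr}(\Sigma_{[K]\cdot}^{-1} (x_{[K]} - \Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1} x_{\langle K \rangle})(x_{[K]} - \Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1} x_{\langle K \rangle})^t) \mid K \in J(\mathcal{K}))$;
(iii) $\forall\, x \in \mathbb{R}^I$, $\forall\, L \in \mathcal{K}$:
$\mathrm{tr}(\Sigma_L^{-1} x_L x_L^t) = \sum(\mathrm{tr}(\Sigma_{[K]\cdot}^{-1} (x_{[K]} - \Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1} x_{\langle K \rangle})(x_{[K]} - \Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1} x_{\langle K \rangle})^t) \mid K \in J(\mathcal{K}_L))$.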
Proof. Trivially (iii) $\Rightarrow$ (ii). On the other hand, (iii) follows from
(ii) if we replace $I$ and $\mathcal{K}$ by $L$ and $\mathcal{K}_L$, respectively, in (ii).
To show (i) $\Rightarrow$ (ii), use induction on $|J(\mathcal{K})| =: q$. If $q = 1$ then by
(2.4), $\mathcal{K} = \{\emptyset, I\}$ and (ii) is trivial. Next, assume that (ii) is true
whenever $q \le k-1$ and suppose that $q = k$. If $I \in J(\mathcal{K})$ then $J(\mathcal{K}) =
J(\mathcal{K}_{\langle I \rangle}) \,\dot\cup\, \{I\}$, hence $|J(\mathcal{K}_{\langle I \rangle})| = k-1$ and (iii) is true with $L$ replaced by
$\langle I \rangle$, so (ii) follows from (2.12) with $K$ replaced by $I$. If, on the other
hand, $I \notin J(\mathcal{K})$, then $I = L \cup M$ where $L \subset I$ and $M \subset I$. It follows from (2.4)
that $|J(\mathcal{K}_L)| < k$ and $|J(\mathcal{K}_M)| < k$, so by the induction assumption, (iii)
is valid with $L$ replaced by $L$, $M$, and $L \cap M$. Then (ii) follows from (2.6),
(2.7), and Lemma 2.1.
To show (iii) $\Rightarrow$ (i), consider any pair $L, M \in \mathcal{K}$. Apply condition (iii)
four times, with $L$ replaced by $L \cup M$, $L$, $M$, and $L \cap M$, and then apply (2.6)
and (2.7) to obtain (2.14). By Lemma 2.1, therefore, (i) is satisfied. $\square$
2.4. The $\mathcal{K}$-preserving matrices: generalized block-triangular matrices
with lattice structure.
We now introduce a group $GL_{\mathcal{K}}(I)$ of nonsingular matrices $A$ that will be
seen in Section 2.6 to act transitively on $P_{\mathcal{K}}(I)$. In the present section
$GL_{\mathcal{K}}(I)$ is shown to be a group of block-triangular matrices with lattice
structure determined by $\mathcal{K}$.
For any $A \in M(I)$ and any two subsets $L, M \in J(\mathcal{K})$ let $A_{[LM]}$ denote the
$[L] \times [M]$ submatrix of $A$.
Proposition 2.2. Let $A \in M(I)$. The following conditions on $A$ are
equivalent:
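(i) $\forall\, x \in \mathbb{R}^I$, $\forall\, L \in \mathcal{K}$: $x_L = 0 \Rightarrow (Ax)_L = 0$;
(ii) $\forall\, x \in \mathbb{R}^I$, $\forall\, L \in \mathcal{K}$: $(Ax)_L = A_L x_L$;
(iii) $\forall\, L, M \in J(\mathcal{K})$: $M \not\subseteq L \Rightarrow A_{[LM]} = 0$.
Proof. (ii) $\Rightarrow$ (i) is trivial.
(iii) $\Rightarrow$ (ii): By the usual formula for matrix multiplication by
blocks,
$(Ax)_L = (\sum(A_{[KM]} x_{[M]} \mid M \in J(\mathcal{K})) \mid K \in J(\mathcal{K}_L))$
$= (\sum(A_{[KM]} x_{[M]} \mid M \in J(\mathcal{K}_L)) \mid K \in J(\mathcal{K}_L))$
$= A_L x_L$.
The first equality uses (2.8) and (2.9), the second uses condition (iii),
while the third uses (2.8) twice.
(i) $\Rightarrow$ (iii): Suppose $L, M \in J(\mathcal{K})$ with $M \not\subseteq L$. Let $c$ denote any column
vector in $\mathbb{R}^I$ satisfying $c_{[K]} = 0$ for $K \in J(\mathcal{K})$, $K \neq M$. Then $c_L = 0$ and
$(Ac)_{[L]} = A_{[LM]} c_{[M]}$.
But $(Ac)_L = 0$ by (i), hence $(Ac)_{[L]} = 0$. Since $c_{[M]}$ is arbitrary this
implies $A_{[LM]} = 0$ as required. $\square$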
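Let $M_{\mathcal{K}}(I)$ denote the set of all $A \in M(I)$ that satisfy the equivalent
conditions (i), (ii), (iii) in Proposition 2.2 and let $GL_{\mathcal{K}}(I)$ denote the
set of all nonsingular matrices in $M_{\mathcal{K}}(I)$. It follows from (i) that $M_{\mathcal{K}}(I)$
is a matrix algebra and $GL_{\mathcal{K}}(I)$ is a matrix group. It also follows from
(i) that $M_{\mathcal{K}}(I)$ is the set of all $I \times I$ matrices that for each $L \in \mathcal{K}$
preserve the kernel of the projection $\mathbb{R}^I \to \mathbb{R}^L$ given by $x \to x_L$. Note that
when $\mathcal{K} = \{\emptyset, I\}$, $M_{\mathcal{K}}(I) = M(I)$ and $GL_{\mathcal{K}}(I) = GL(I)$.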
Definition 2.3. The algebra $M_{\mathcal{K}}(I)$ is called the algebra of $\mathcal{K}$-preserving
matrices and $GL_{\mathcal{K}}(I)$ the group of $\mathcal{K}$-preserving matrices. $\square$
Remark 2.1. When $\mathcal{K}$ is a chain then $J(\mathcal{K}) = \mathcal{K} \setminus \{\emptyset\}$ is also a chain, so it
follows from Proposition 2.2(iii) that $M_{\mathcal{K}}(I)$ is an algebra of
block-triangular matrices in the usual sense. For a general $\mathcal{K}$ let $q :=
|J(\mathcal{K})|$ and let $K_1, K_2, \ldots, K_q$ be a never-decreasing listing of the members
of the poset $J(\mathcal{K})$, i.e., $i < j \Rightarrow K_j \not\subseteq K_i$. If every $A \in M(I)$ is
partitioned according to the ordered decomposition
(2.15) $I = [K_1] \,\dot\cup\, \cdots \,\dot\cup\, [K_q]$,
then it is seen from Proposition 2.2(iii) that $M_{\mathcal{K}}(I)$ can be represented
as a subalgebra of the algebra of lower block-triangular matrices. That
is, $A \in M_{\mathcal{K}}(I)$ is lower block-triangular with additional blocks of zeroes
below the main diagonal; see (1.13) and also Section 2.8 for further
examples. $\square$
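The zero pattern described by Proposition 2.2(iii) can be tabulated directly from the lattice. The sketch below is a hedged illustration rather than the authors' algorithm: the function names are hypothetical, and the index pairs it returns are the positions $(i, j)$ where an entry of a $\mathcal{K}$-preserving matrix is allowed to be nonzero, namely those with $i \in [L]$, $j \in [M]$ and $M \subseteq L$ for join-irreducible $L$, $M$.

```python
def angle(k, members):
    """<K>: union of all members of the lattice strictly contained in k."""
    return frozenset().union(*(m for m in members if m < k))

def allowed_positions(lattice):
    """Positions (i, j) not forced to zero by Proposition 2.2(iii)."""
    members = {frozenset(k) for k in lattice}
    # [K] = K minus <K>; it is nonempty exactly for join-irreducible K
    brackets = {k: k - angle(k, members) for k in members if k}
    brackets = {k: b for k, b in brackets.items() if b}
    allowed = set()
    for row_set, rows in brackets.items():
        for col_set, cols in brackets.items():
            if col_set <= row_set:  # block A[LM] may be nonzero only if M is a subset of L
                allowed |= {(i, j) for i in rows for j in cols}
    return allowed

# The lattice (1.3): the pattern should match the matrix form (1.13).
pattern = allowed_positions([set(), {1}, {1, 2}, {1, 3}, {1, 2, 3}])
for i in (1, 2, 3):
    print(" ".join("*" if (i, j) in pattern else "0" for j in (1, 2, 3)))
```

For a chain such as $\{\emptyset, \{1\}, \{1,2\}, \{1,2,3\}\}$ the same function returns the full lower-triangular pattern, consistent with the first sentence of Remark 2.1.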
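Remark 2.2. For $K \in \mathcal{K}$ and $A \in M(I)$ let $A_K$ denote the $K \times K$ submatrix of $A$
and partition $A_K$ according to (2.3) and (2.10) as follows:
(2.16) $A_K = \begin{pmatrix} A_{\langle K \rangle} & A_{\langle K]} \\ A_{[K \rangle} & A_{[K]} \end{pmatrix}$;
note that $A_{[KK]} = A_{[K]}$ when $K \in J(\mathcal{K})$. By Proposition 2.2(ii), if $A \in
M_{\mathcal{K}}(I)$ then for every $K \in J(\mathcal{K})$ and $x \in \mathbb{R}^I$,
(2.17) $A_{\langle K]} = 0$
(2.18) $(Ax)_{[K]} = A_{[K]} x_{[K]} + A_{[K \rangle} x_{\langle K \rangle}$.
Furthermore, the linear mapping
(2.19) $M_{\mathcal{K}}(I) \to \times(M([K] \times \langle K \rangle) \times M([K]) \mid K \in J(\mathcal{K}))$
$A \to ((A_{[K \rangle}, A_{[K]}) \mid K \in J(\mathcal{K}))$
is bijective. This holds because, by Proposition 2.2(iii), $A \in M_{\mathcal{K}}(I)$ if
and only if the $[K] \times (I \setminus K)$-submatrix of $A$ is 0 for every $K \in J(\mathcal{K})$. Under
the correspondence (2.19) the subset $GL_{\mathcal{K}}(I)$ corresponds to the subset
(2.20) $\times(M([K] \times \langle K \rangle) \times GL([K]) \mid K \in J(\mathcal{K}))$. $\square$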
Lemma 2.2. For A € ~(I). L € 1. and K € J{~).
(2.21)
(2.22)
(2.23)
~~_ From Proposition 2.2{ii). (AC)L = ALCL for every A.C € ~(I). L €
JJfrom{2.23}{2.ies
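Lemma 2.3. The mapping
(2.24) $\Sigma \to ((\Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1},\ \Sigma_{[K]\cdot}) \mid K \in J(\mathcal{K}))$
from $\Sigma$ to its $\mathcal{K}$-parameters commutes with the actions of $GL_{\mathcal{K}}(I)$ on $P(I)$
and on $\times(M([K] \times \langle K \rangle) \times P([K]) \mid K \in J(\mathcal{K}))$ given by
(2.25) $GL_{\mathcal{K}}(I) \times P(I) \to P(I)$
$(A, \Sigma) \to A \Sigma A^t$
and
(2.26) $GL_{\mathcal{K}}(I) \times (\times(M([K] \times \langle K \rangle) \times P([K]) \mid K \in J(\mathcal{K})))$
$\to \times(M([K] \times \langle K \rangle) \times P([K]) \mid K \in J(\mathcal{K}))$
$(A, ((R_{[K \rangle}, \Lambda_{[K]}) \mid K \in J(\mathcal{K})))$
$\to ((A_{[K]} R_{[K \rangle} A_{\langle K \rangle}^{-1} + A_{[K \rangle} A_{\langle K \rangle}^{-1},\ A_{[K]} \Lambda_{[K]} A_{[K]}^t) \mid K \in J(\mathcal{K}))$,
respectively.
Proof. It is straightforward to verify that (2.26) is a group action. We
must show that for every $A \in GL_{\mathcal{K}}(I)$, $\Sigma \in P(I)$, and $K \in J(\mathcal{K})$,
(2.27) $(A \Sigma A^t)_{[K \rangle} (A \Sigma A^t)_{\langle K \rangle}^{-1} = A_{[K]} \Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1} A_{\langle K \rangle}^{-1} + A_{[K \rangle} A_{\langle K \rangle}^{-1}$
and
(2.28) $(A \Sigma A^t)_{[K]\cdot} = A_{[K]} \Sigma_{[K]\cdot} A_{[K]}^t$.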
be =
[J
Proposition 2.3. If $\Sigma \in P_{\mathcal{K}}(I)$ and $A \in GL_{\mathcal{K}}(I)$, then $A \Sigma A^t \in P_{\mathcal{K}}(I)$.
Proof. We shall show that condition (ii) of Theorem 2.1 is valid with $\Sigma$
replaced by $A \Sigma A^t$. Since $\Sigma \in P_{\mathcal{K}}(I)$, (ii) holds for $\Sigma$. Now replace $x$ by
$A^{-1} x$ in (ii) and let $B = A^{-1}$. The left-hand side of (ii) becomes
$\mathrm{tr}((A \Sigma A^t)^{-1} x x^t)$ while the summands on the right-hand side become
The first equality uses (2.18) and Proposition 2.2(ii), the third uses
(2.22) and (2.23), and the fourth uses (2.27) and (2.28). Therefore
condition (ii) of Theorem 2.1 holds for $A \Sigma A^t$.
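2.5. The $\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$.
Theorem 2.2 below establishes the one-to-one correspondence between $\Sigma$
and its $\mathcal{K}$-parameters. Together with Theorem 2.1(ii) and Lemma 2.5, this
decomposition of the parameter space $P_{\mathcal{K}}(I)$ yields the fundamental
factorization of the likelihood function for the CI model $N(\mathcal{K})$ (cf.
Theorem 3.1).
Lemma 2.4. For any family
$((R_{[K \rangle}, \Lambda_{[K]}) \mid K \in J(\mathcal{K})) \in \times(M([K] \times \langle K \rangle) \times P([K]) \mid K \in J(\mathcal{K}))$
there exists a matrix $A \in GL_{\mathcal{K}}(I)$ such that for every $K \in J(\mathcal{K})$,
(2.29) $A_{[K \rangle} A_{\langle K \rangle}^{-1} = R_{[K \rangle}$
(2.30) $A_{[K]} A_{[K]}^t = \Lambda_{[K]}$.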
Proof. First choose matrices $A_{[K]} \in GL([K])$, $K \in J(\mathcal{K})$, that satisfy
(2.30). As in Remark 2.1 let $K_1, \ldots, K_q$ be a never-decreasing listing of
the elements in $J(\mathcal{K})$. For notational convenience abbreviate $K_k$ by $k$, $\langle K_k \rangle$
by $\langle k \rangle$, $[K_k \rangle$ by $[k \rangle$, and $[K_k]$ by $[k]$ whenever they appear as subscripts.
If $K_1 \subset K_2$ then $\langle K_2 \rangle = [K_1]$, so $A_{\langle 2 \rangle} = A_{[1]}$ and $A_{[2 \rangle}$ is uniquely
determined by (2.29); if $K_1 \not\subseteq K_2$ then $\langle K_2 \rangle = \emptyset$ so (2.29) is vacuous. Now
suppose that we have determined $A_{[2 \rangle}, \ldots, A_{[k-1 \rangle}$ satisfying (2.29). These
$k-2$ matrices (some of which may be vacuous), together with
$A_{[1]}, \ldots, A_{[k-1]}$, completely determine $A_{\langle k \rangle}$. This follows from the
decomposition (cf. (2.8))
(2.31) $\langle K_k \rangle = \dot\cup([K_i] \mid K_i \subseteq \langle K_k \rangle)$
and the fact that $K_i \subseteq \langle K_k \rangle \Rightarrow i < k$ for a never-decreasing listing. Now
$A_{[k \rangle}$ is uniquely determined by (2.29) and, after induction on $k$,
is the of
(2. A € CL:1l(I).
) is
(2.32)
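Proof. By Theorem 2.2(ii), (2.32) is injective. To show that (2.32) is
surjective, consider
    $((R_{[K\rangle}, \Lambda_{[K]}) \mid K \in J(\mathcal{K})) \in \times(M([K]\times\langle K\rangle)\times P([K]) \mid K \in J(\mathcal{K}))$.
By Lemma 2.4 there exists a matrix $A \in GL_{\mathcal{K}}(I)$ satisfying (2.29) and
(2.30). Define $\Sigma := AA^t$; then $\Sigma \in P_{\mathcal{K}}(I)$ by Proposition 2.3 (with $\Sigma = 1_I$).
The $\mathcal{K}$-parameters of $\Sigma$ are given by $\Sigma_{[K\rangle}\Sigma_{\langle K\rangle}^{-1} = A_{[K\rangle}A_{\langle K\rangle}^{-1} = R_{[K\rangle}$ and $\Sigma_{[K]\cdot} = A_{[K]}A_{[K]}^t = \Lambda_{[K]}$, $K \in J(\mathcal{K})$ (set $\Sigma = 1_I$ in (2.27) and (2.28)).    □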
2.6. Transitive action of the group of $\mathcal{K}$-preserving matrices.
Theorem 2.3. The action
(2.33)    $GL_{\mathcal{K}}(I)\times P_{\mathcal{K}}(I) \to P_{\mathcal{K}}(I)$
          $(A, \Sigma) \mapsto A\Sigma A^t$
is well-defined, transitive, continuous, and proper.
;:..;;..;:...:;;.;;;._ That (2.33) is well-defined Vii. .LV"';:' from Proposi tion By Lemma
and
Lemma so it
is transitive. That (2.33) is continuous is
) .and c!~lSs1CfL! action or I) on I} is proper i
the action is proper. u
action
(2.34)    $GL_{\mathcal{K}}(I)\times P_{\mathcal{K}}(I)^{-1} \to P_{\mathcal{K}}(I)^{-1}$
          $(A, \Delta) \mapsto (A^{-1})^t\Delta A^{-1}$
induced on $P_{\mathcal{K}}(I)^{-1}$ by (2.33) is also well-defined, transitive,
continuous, and proper.    □
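Remark 2.4. Since both $P_{\mathcal{K}}(I)$ and $P_{\mathcal{K}}(I)^{-1}$ contain the $I\times I$ identity matrix
$1_I$, it follows from the transitivity of the actions (2.33) and (2.34)
that
(2.35)    $P_{\mathcal{K}}(I) = \{AA^t \in P(I) \mid A \in GL_{\mathcal{K}}(I)\}$
(2.36)    $P_{\mathcal{K}}(I)^{-1} = \{A^tA \in P(I) \mid A \in GL_{\mathcal{K}}(I)\}$.    □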
If $\mathcal{K} = \{\emptyset, I\}$ then $P_{\mathcal{K}}(I)^{-1} = P_{\mathcal{K}}(I) = P(I)$, so both actions (2.33) and
(2.34) reduce to the well-known transitive actions of $GL(I)$ on $P(I)$. If $\mathcal{K}$
is a chain as in Examples 2.1 and 2.2 in Section 2.8 then again $P_{\mathcal{K}}(I) = P_{\mathcal{K}}(I)^{-1} = P(I)$, but now $GL_{\mathcal{K}}(I)$ is a group of nonsingular lower
block-triangular matrices in the usual sense and the actions (2.33) and
(2.34) are the well-known actions of $GL_{\mathcal{K}}(I)$ on $P(I)$.
The following lemma generalizes the Schur decomposition for
Lemma 2.5. For $\Sigma \in P_{\mathcal{K}}(I)$,
(2.37)    $\det(\Sigma) = \prod(\det(\Sigma_{[K]\cdot}) \mid K \in J(\mathcal{K}))$.
Proof. By Theorem 2.3 there exists $A \in GL_{\mathcal{K}}(I)$ such that $\Sigma = AA^t$. Thus
    $\det(\Sigma) = \det(AA^t)$
              $= \prod(\det(A_{[K]}A_{[K]}^t) \mid K \in J(\mathcal{K}))$
              $= \prod(\det(\Sigma_{[K]\cdot}) \mid K \in J(\mathcal{K}))$.
The second equality holds since $A$ can be represented as a lower
block-triangular matrix (cf. Remark 2.1), while the third equality
follows from (2.28).
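As a small numerical illustration of (2.37), the following Python sketch checks the determinant factorization on one 3×3 covariance matrix of the type that arises for the lattice with join-irreducible elements $L\cap M$, $L$, $M$ (Example 2.5 in Section 2.8). It assumes, purely for illustration, that each of the blocks $[L\cap M]$, $[L]$, $[M]$ is one-dimensional; the matrix entries and the helper `det3` are hypothetical and not taken from the text.

```python
from fractions import Fraction as F

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Rows/columns ordered as (L∩M, [L], [M]); entries chosen for illustration only.
sigma = [[F(2), F(1), F(1)],
         [F(1), F(3), F(1, 2)],
         [F(1), F(1, 2), F(4)]]

# The [M]x[L] entry must satisfy the CI constraint Sigma_[M> Sigma_{L∩M}^{-1} Sigma_<L].
assert sigma[2][1] == sigma[2][0] * sigma[0][1] / sigma[0][0]

# Conditional variances Sigma_[K]. for K = L∩M, L, M (scalar Schur complements).
lam_lm = sigma[0][0]
lam_l = sigma[1][1] - sigma[1][0] ** 2 / sigma[0][0]
lam_m = sigma[2][2] - sigma[2][0] ** 2 / sigma[0][0]

product = lam_lm * lam_l * lam_m
print(det3(sigma), product)
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to rounding.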
2.7. Reconstruction of $\Sigma$ from its $\mathcal{K}$-parameters.
By Theorem 2.2, $\Sigma \in P_{\mathcal{K}}(I)$ is uniquely determined by its $\mathcal{K}$-parameters
(2.38)
it is important to find an
explicit method for reconstructing $\Sigma \in P_{\mathcal{K}}(I)$ from its $\mathcal{K}$-parameters.
=
which is just a re-expression of Theorem 2.1(ii), where Ar(K) is the $I\times I$
matrix whose $K\times K$ submatrix is
(2.40)    $\begin{pmatrix} R_{[K\rangle}^t\Lambda_{[K]}^{-1}R_{[K\rangle} & -R_{[K\rangle}^t\Lambda_{[K]}^{-1} \\ -\Lambda_{[K]}^{-1}R_{[K\rangle} & \Lambda_{[K]}^{-1} \end{pmatrix}$
and whose remaining entries are 0. In general, however, it is not a
simple task to determine $\Sigma$ from (2.39) by matrix inversion. We now
present a step-wise algorithm for reconstructing $\Sigma$ directly from its
$\mathcal{K}$-parameters.
Let $K_1, \ldots, K_q$ be a never-decreasing listing of the members of the poset
$J(\mathcal{K})$ (cf. Remark 2.1 and the proof of Lemma 2.4), partition $\Sigma$ according
to (2.9), and list the $\mathcal{K}$-parameters in the corresponding order:
(2.41)    $(\Lambda_{[1]}, (R_{[2\rangle}, \Lambda_{[2]}), \ldots, (R_{[q\rangle}, \Lambda_{[q]})) \in$
          $P([K_1])\times M([K_2]\times\langle K_2\rangle)\times P([K_2])\times\cdots\times M([K_q]\times\langle K_q\rangle)\times P([K_q])$.
(Recall that whenever they appear as subscripts, $K_k$, $\langle K_k\rangle$, $[K_k\rangle$, and $[K_k]$
are abbreviated by $k$, $\langle k\rangle$, $[k\rangle$, and $[k]$, respectively.) The
reconstruction algorithm proceeds step-wise as follows. At step $k$ the
relations in (2.38) are inverted to determine $\Sigma_{[k\rangle}$ and $\Sigma_{[k]}$ from the
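submatrix $\Sigma_{1\cup\cdots\cup(k-1)}$ determined in step $k-1$. The
remaining entries in $\Sigma_{1\cup\cdots\cup k}$ are determined
by CI conditions.
Step 1:    $\Sigma_1 = \Lambda_{[1]}$.
Step 2:    $\Sigma_{[2\rangle} = R_{[2\rangle}\Sigma_{\langle 2\rangle}$,
           $\Sigma_{[2]} = \Lambda_{[2]} + R_{[2\rangle}\Sigma_{\langle 2]}$.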
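At this point the submatrix $\Sigma_{1\cup 2}$ is completely determined: if $K_1 \subset K_2$
then $\Sigma_{1\cup 2} = \Sigma_2$, while if $K_1 \not\subset K_2$ then $K_1\cap K_2 = \emptyset$ so the $[K_1]\times[K_2]$
submatrix of $\Sigma$ is 0 by (2.2). (Recall that $1\cup 2$ abbreviates $K_1\cup K_2$ when
appearing as a subscript.) By (2.42), $\langle K_3\rangle \subseteq K_1\cup K_2$, so $\Sigma_{\langle 3\rangle}$ is a
submatrix of $\Sigma_{1\cup 2}$, hence the next step may be carried out.
Step 3a:   $\Sigma_{[3\rangle} = R_{[3\rangle}\Sigma_{\langle 3\rangle}$,
           $\Sigma_{[3]} = \Lambda_{[3]} + R_{[3\rangle}\Sigma_{\langle 3]}$.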
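It is important to note that after Steps 1, 2, and 3a, the three
submatrices $\Sigma_1$, $\Sigma_2$, $\Sigma_3$ are now determined but the complete submatrix
$\Sigma_{1\cup 2\cup 3}$ may not yet be fully determined. The remaining
$[K_3]\times((K_1\cup K_2\cup K_3)\setminus K_3)$-submatrix of $\Sigma_{1\cup 2\cup 3}$, which we denote by $\Sigma_{[3\}}$, is
determined from $\Sigma_{1\cup 2}$ by means of the pairwise CI requirements imposed by
$\mathcal{K}$ [cf. (2.44)]:
Step 3b:   $\Sigma_{[3\}} = R_{[3\rangle}\Sigma_{\langle 3\}}$,
where $\Sigma_{\langle 3\}}$ is the $\langle K_3\rangle\times((K_1\cup K_2\cup K_3)\setminus K_3)$-submatrix of $\Sigma_{1\cup 2\cup 3}$. By (2.42) and (2.43),
however, $\Sigma_{\langle 3\}}$ is in fact a submatrix of $\Sigma_{1\cup 2}$, hence may be used
in Step 3b.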
After $k-1$ such steps the submatrix $\Sigma_{1\cup\cdots\cup(k-1)}$ is fully determined
and in turn may be used to obtain $\Sigma_{1\cup\cdots\cup k}$ as follows. First note that
the never-decreasing nature of $K_1, \ldots, K_q$ implies that
    $K_1\cup\cdots\cup K_k = \cup([K_j] \mid j = 1, \ldots, k)$,
    $K_k = \cup([K_j] \mid j = 1, \ldots, k,\ K_j \subseteq K_k)$.
From these relations and (2.3) it may be deduced that
(2.42)
(2.43)
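Thus, if we denote the $[K_k]\times((K_1\cup\cdots\cup K_k)\setminus K_k)$-submatrix of $\Sigma_{1\cup\cdots\cup k}$ by
$\Sigma_{[k\}}$ and the $\langle K_k\rangle\times((K_1\cup\cdots\cup K_k)\setminus K_k)$-submatrix by $\Sigma_{\langle k\}}$, it follows from
(2.42) and (2.43) that both $\Sigma_{\langle k\rangle}$ and $\Sigma_{\langle k\}}$ are in fact submatrices of
$\Sigma_{1\cup\cdots\cup(k-1)}$, so the next step may be carried out:
Step k:    $\Sigma_{[k\rangle} = R_{[k\rangle}\Sigma_{\langle k\rangle}$,
           $\Sigma_{[k]} = \Lambda_{[k]} + R_{[k\rangle}\Sigma_{\langle k]}$,
(2.44)     $\Sigma_{[k\}} = R_{[k\rangle}\Sigma_{\langle k\}}$.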
relation in i::>lnlCe ~ (:: L) and
is eqtlival.~nt
J-subnntJrlX of ••Uk)-1 is a zero matrix, which
··Ukis deitermbled after after q
• == I is ful
[In carrying out this algorithm one must use the convention that if $C \ne \emptyset$
and $D \ne \emptyset$, then the product of a $C\times\emptyset$ matrix with an $\emptyset\times D$ matrix is the
$C\times D$ zero matrix.]
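The step-wise algorithm can be sketched in Python for one concrete case. The sketch below assumes the lattice of Example 2.5 (join-irreducible elements $L\cap M$, $L$, $M$, listed in that never-decreasing order) and, as a simplifying assumption not made in the text, takes every block $[L\cap M]$, $[L]$, $[M]$ to be one-dimensional, so that all $\mathcal{K}$-parameters are scalars. The function name `reconstruct` and the parameter values are hypothetical.

```python
from fractions import Fraction as F

def reconstruct(lam_lm, r_l, lam_l, r_m, lam_m):
    """Rebuild Sigma, ordered as (L∩M, [L], [M]), from scalar K-parameters."""
    # Step 1: Sigma_{L∩M} = Lambda_[L∩M]
    s_lm = lam_lm
    # Step 2: Sigma_[L> = R_[L> Sigma_{L∩M};  Sigma_[L] = Lambda_[L] + R_[L> Sigma_<L]
    s_l_lm = r_l * s_lm
    s_l = lam_l + r_l * s_l_lm
    # Step 3: same two relations for M, then the cross block
    #         Sigma_[M} = R_[M> Sigma_<L], which encodes the CI requirement.
    s_m_lm = r_m * s_lm
    s_m = lam_m + r_m * s_m_lm
    s_m_l = r_m * s_l_lm
    return [[s_lm, s_l_lm, s_m_lm],
            [s_l_lm, s_l, s_m_l],
            [s_m_lm, s_m_l, s_m]]

sigma = reconstruct(F(2), F(1, 2), F(5, 2), F(1, 2), F(7, 2))

# The ([L],[M]) entry of Sigma^{-1} is this cofactor divided by det(Sigma);
# it vanishes exactly when x_[L] and x_[M] are CI given x_{L∩M}.
precision_cross_cofactor = sigma[0][2] * sigma[1][0] - sigma[0][0] * sigma[1][2]
print(sigma, precision_cross_cofactor)
```

For blocks of larger size the same steps apply with the scalar products replaced by matrix products and transposes inserted where $\Sigma_{\langle k]}$ appears.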
2.8. Examples.
A series of nine Examples will illustrate the following basic aspects
of a lattice CI model $N(\mathcal{K})$: (a) the distributive lattice $\mathcal{K} \subseteq \mathcal{D}(I)$ and the
poset $J(\mathcal{K})$ of join-irreducible elements; (b) the $\mathcal{K}$-parametrization (2.32)
of $P_{\mathcal{K}}(I)$ and the associated decomposition of $\mathrm{tr}(\Sigma^{-1}xx^t)$ given in Theorem
2.1(ii); (c) the choice of a never-decreasing listing of the members of
$J(\mathcal{K})$ and the reconstruction of the covariance matrix $\Sigma \in P_{\mathcal{K}}(I)$ from its
ordered $\mathcal{K}$-parameters (cf. (2.38)) by means of the step-wise algorithm in
Section 2.7, as well as the form of the precision matrix $\Delta = \Sigma^{-1} \in
P_{\mathcal{K}}(I)^{-1}$; (d) the form of the $\mathcal{K}$-preserving matrices, i.e., the group
$GL_{\mathcal{K}}(I)$ of matrices $A$, partitioned according to the ordered decomposition
(2.15), that acts transitively on $P_{\mathcal{K}}(I)$ (cf. Remarks 2.1 and 2.2). The
reader should verify directly that (2.35) and (2.36) hold for $P_{\mathcal{K}}(I)$, $P_{\mathcal{K}}(I)^{-1}$,
and $GL_{\mathcal{K}}(I)$ in these nine Examples.
In each Example the lattice diagram of $\mathcal{K}$ appears in an accompanying
Figure, in which the members of $J(\mathcal{K})$ are indicated by open circles and
the remaining members of :f by solid dots. the minimal
I
These Examples will be continued in Section 3.2. where the .MLE i is
of and in .;,eC;:l:lOn 4.3 to 'n"",,",U1!rle>
tional CXlHll'UClS appear in (1991).
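Example 2.1. First consider the simple case where $\mathcal{K} = \{\emptyset, L, I\}$ (see Figure
2.1).
[Figure 2.1: lattice diagram of the chain $\emptyset \subset L \subset I$.]
Since $\mathcal{K}$ is a chain, $P_{\mathcal{K}}(I) = P(I)$. Note that $J(\mathcal{K}) = \{L, I\}$ and $\langle L\rangle = \emptyset$,
$\langle I\rangle = [L] = L$. Thus the $\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$ becomes
(2.45)    $P(I) \leftrightarrow P(L)\times M([I]\times L)\times P([I])$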
and
(2.46)
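The algorithm for reconstructing $\Sigma$ from its ordered $\mathcal{K}$-parameters $\Lambda_{[L]}$,
$R_{[I\rangle}$, $\Lambda_{[I]}$ takes the following form:
Step 1:    $\Sigma_L = \Lambda_{[L]}$
Step 2:    $\Sigma_{[I\rangle} = R_{[I\rangle}\Sigma_L$
           $\Sigma_{[I]} = \Lambda_{[I]} + R_{[I\rangle}\Sigma_{\langle I]}$
The group $GL_{\mathcal{K}}(I)$ is a lower block-triangular matrix group in the ordinary
sense: $GL_{\mathcal{K}}(I)$ consists of all nonsingular $I\times I$ matrices of the form
(2.47)    $A = \begin{pmatrix} A_L & 0 \\ A_{[I\rangle} & A_{[I]} \end{pmatrix}$.    □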
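Example 2.2. If $\mathcal{K} = \{\emptyset \equiv K_0, K_1, \ldots, K_{q-1}, K_q \equiv I\}$ is an ascending chain,
i.e., $\emptyset \subset K_1 \subset \cdots \subset K_{q-1} \subset I$, then a well-known generalization of the
preceding example is obtained (see Figure 2.2).
[Figure 2.2: lattice diagram of the chain $\emptyset \subset K_1 \subset \cdots \subset K_{q-1} \subset I$.]
Again $P_{\mathcal{K}}(I) = P(I)$, but the $\mathcal{K}$-parametrization is changed. Note that $J(\mathcal{K})
= \{K_1, \ldots, K_q\}$ and $\langle K_1\rangle = \emptyset$, $\langle K_k\rangle = K_{k-1}$, $k = 2, \ldots, q$. Then the
$\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$ becomes
(2.48)    $P(I) \leftrightarrow P(K_1)\times M([K_2]\times K_1)\times P([K_2])\times\cdots\times M([K_q]\times K_{q-1})\times P([K_q])$
          $\Sigma \leftrightarrow (\Sigma_1,\ \Sigma_{[2\rangle}\Sigma_1^{-1},\ \Sigma_{[2]\cdot},\ \ldots,\ \Sigma_{[q\rangle}\Sigma_{q-1}^{-1},\ \Sigma_{[q]\cdot})$
and
(2.49)
where $K_1, K_2, \ldots, K_q$ are abbreviated $1, 2, \ldots, q$ whenever they occur as
subscripts. Then $\Sigma$ is reconstructed from its ordered $\mathcal{K}$-parameters $\Lambda_{[1]}$,
$R_{[2\rangle}$, $\Lambda_{[2]}$, $\ldots$, $R_{[q\rangle}$, $\Lambda_{[q]}$ as follows:
Step 1:    $\Sigma_1 = \Lambda_{[1]}$
Step 2:    $\Sigma_{[2\rangle} = R_{[2\rangle}\Sigma_1$
           $\Sigma_{[2]} = \Lambda_{[2]} + R_{[2\rangle}\Sigma_{\langle 2]}$
  ...
Step q:    $\Sigma_{[q\rangle} = R_{[q\rangle}\Sigma_{q-1}$
           $\Sigma_{[q]} = \Lambda_{[q]} + R_{[q\rangle}\Sigma_{\langle q]}$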
The group $GL_{\mathcal{K}}(I)$ is again a group of lower block-triangular matrices in
the usual sense. For example, when $q = 4$, $GL_{\mathcal{K}}(I)$ consists of all
nonsingular $I\times I$ matrices of the form
(2.50)    $A = \left(\begin{array}{cccc} A_1 & 0 & 0 & 0 \\ A_{[2\rangle} & A_{[2]} & 0 & 0 \\ \multicolumn{2}{c}{A_{[3\rangle}} & A_{[3]} & 0 \\ \multicolumn{3}{c}{A_{[4\rangle}} & A_{[4]} \end{array}\right)$.    □
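Example 2.3. Consider the lattice $\mathcal{K} = \{\emptyset \equiv L\cap M, L, M, L\cup M \equiv I\}$ (see
Figure 2.3).
[Figure 2.3: lattice diagram of $\mathcal{K}$.]
Here the CI requirement determined by $\mathcal{K}$ is nontrivial, so $P_{\mathcal{K}}(I) \subset P(I)$.
Now $J(\mathcal{K}) = \{L, M\}$ and $\langle L\rangle = \langle M\rangle = \emptyset$. The $\mathcal{K}$-parametrization takes the form
(2.51)    $P_{\mathcal{K}}(I) \leftrightarrow P(L)\times P(M)$
          $\Sigma \leftrightarrow (\Sigma_L, \Sigma_M)$
and
(2.52)    $\mathrm{tr}(\Sigma^{-1}xx^t) = \mathrm{tr}(\Sigma_L^{-1}x_Lx_L^t) + \mathrm{tr}(\Sigma_M^{-1}x_Mx_M^t)$.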
$\Sigma$ may be reconstructed as follows:
Step 1:    $\Sigma_L = \Lambda_{[L]}$
Step 2:    $\Sigma_M = \Lambda_{[M]}$
           $\Sigma_{[M\}} = 0$.
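Thus $P_{\mathcal{K}}(I)$ consists of all block-diagonal matrices $\Sigma$ of the form
(2.53)    $\Sigma = \begin{pmatrix} \Sigma_L & 0 \\ 0 & \Sigma_M \end{pmatrix}$,
where $\Sigma$ is partitioned according to the ordered decomposition
(2.54)    $I = L\ \dot\cup\ M$.
In this Example, as in Examples 2.1 and 2.2, $P_{\mathcal{K}}(I) = P_{\mathcal{K}}(I)^{-1}$ and both are
linear, i.e., closed under (nonnegative) linear combinations. The group
$GL_{\mathcal{K}}(I)$ consists of all nonsingular $I\times I$ matrices of the form
(2.55)    $A = \begin{pmatrix} A_L & 0 \\ 0 & A_M \end{pmatrix}$.    □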
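Example 2.4. Let $\mathcal{K} = \{\emptyset \equiv L\cap M, L, M, L\cup M, I\}$ (see Figure 2.4).
[Figure 2.4: lattice diagram of $\mathcal{K}$.]
Then $J(\mathcal{K}) = \{L, M, I\}$, $\langle L\rangle = \langle M\rangle = \emptyset$, $\langle I\rangle = L\cup M$, and the $\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$ assumes
the form
(2.56)    $P_{\mathcal{K}}(I) \leftrightarrow P(L)\times P(M)\times M([I]\times(L\cup M))\times P([I])$
          $\Sigma \leftrightarrow (\Sigma_L,\ \Sigma_M,\ \Sigma_{[I\rangle}\Sigma_{L\cup M}^{-1},\ \Sigma_{[I]\cdot})$
and
    $\mathrm{tr}(\Sigma^{-1}xx^t)$
    $= \mathrm{tr}(\Sigma_L^{-1}x_Lx_L^t) + \mathrm{tr}(\Sigma_M^{-1}x_Mx_M^t) + \mathrm{tr}(\Sigma_{[I]\cdot}^{-1}(x_{[I]} - \Sigma_{[I\rangle}\Sigma_{L\cup M}^{-1}x_{L\cup M})(\cdots)^t)$.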
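Now $L, M, I$ is a never-decreasing listing of $J(\mathcal{K})$, so $\Sigma$ may be
reconstructed from its ordered nontrivial $\mathcal{K}$-parameters $\Lambda_{[L]}$, $\Lambda_{[M]}$, $R_{[I\rangle}$,
$\Lambda_{[I]}$ as follows:
Steps 1,2: Repeat Steps 1,2 in Example 2.3.
Step 3:    $\Sigma_{[I\rangle} = R_{[I\rangle}\mathrm{Diag}(\Sigma_L, \Sigma_M)$
           $\Sigma_{[I]} = \Lambda_{[I]} + R_{[I\rangle}\Sigma_{\langle I]}$.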
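Thus $P_{\mathcal{K}}(I)$ consists of all $\Sigma$ of the form
(2.57)    $\Sigma = \begin{pmatrix} \begin{matrix} \Sigma_L & 0 \\ 0 & \Sigma_M \end{matrix} & \Sigma_{\langle I]} \\ \Sigma_{[I\rangle} & \Sigma_{[I]} \end{pmatrix}$,
where $\Sigma$ is partitioned according to the ordered decomposition
(2.58)    $I = L\ \dot\cup\ M\ \dot\cup\ [I]$.
Note that $P_{\mathcal{K}}(I)$ is linear while $P_{\mathcal{K}}(I)^{-1}$ is not.
The group $GL_{\mathcal{K}}(I)$ consists of all nonsingular $I\times I$ matrices of the form
(2.59)    $A = \left(\begin{array}{ccc} A_L & 0 & 0 \\ 0 & A_M & 0 \\ \multicolumn{2}{c}{A_{[I\rangle}} & A_{[I]} \end{array}\right)$.    □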
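Example 2.5. Suppose that $\mathcal{K} = \{\emptyset, L\cap M, L, M, L\cup M \equiv I\}$ (see Figure 2.5).
(Note that (1.3) is a special case.)
[Figure 2.5: lattice diagram of $\mathcal{K}$.]
Now $J(\mathcal{K}) = \{L\cap M, L, M\}$, and $\langle L\cap M\rangle = \emptyset$, $\langle L\rangle = \langle M\rangle = L\cap M$. The
$\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$ is given by
(2.60)    $P_{\mathcal{K}}(I) \leftrightarrow P(L\cap M)\times M([L]\times(L\cap M))\times P([L])\times M([M]\times(L\cap M))\times P([M])$
          $\Sigma \leftrightarrow (\Sigma_{L\cap M},\ \Sigma_{[L\rangle}\Sigma_{L\cap M}^{-1},\ \Sigma_{[L]\cdot},\ \Sigma_{[M\rangle}\Sigma_{L\cap M}^{-1},\ \Sigma_{[M]\cdot})$,
and
(2.61)
Since $L\cap M, L, M$ is a never-decreasing listing of $J(\mathcal{K})$, $\Sigma$ can be
reconstructed from its ordered $\mathcal{K}$-parameters $\Lambda_{[L\cap M]}$, $R_{[L\rangle}$, $\Lambda_{[L]}$, $R_{[M\rangle}$,
$\Lambda_{[M]}$ as follows: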
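Step 1:    $\Sigma_{L\cap M} = \Lambda_{[L\cap M]}$
Step 2:    $\Sigma_{[L\rangle} = R_{[L\rangle}\Sigma_{L\cap M}$
           $\Sigma_{[L]} = \Lambda_{[L]} + R_{[L\rangle}\Sigma_{\langle L]}$
Step 3:    $\Sigma_{[M\rangle} = R_{[M\rangle}\Sigma_{L\cap M}$
           $\Sigma_{[M]} = \Lambda_{[M]} + R_{[M\rangle}\Sigma_{\langle M]}$
(2.62)     $\Sigma_{[M\}} = R_{[M\rangle}\Sigma_{\langle L]}$    $(= \Sigma_{[M\rangle}\Sigma_{L\cap M}^{-1}\Sigma_{\langle L]})$.
(Note that $\Sigma_{[M\}} = \Sigma_{[L\}}^t$.) Thus $P_{\mathcal{K}}(I)$ consists of all $\Sigma \in P(I)$ of the form
(2.63)    $\Sigma = \begin{pmatrix} \Sigma_{L\cap M} & \Sigma_{\langle L]} & \Sigma_{\langle M]} \\ \Sigma_{[L\rangle} & \Sigma_{[L]} & \Sigma_{[L\}} \\ \Sigma_{[M\rangle} & \Sigma_{[M\}} & \Sigma_{[M]} \end{pmatrix}$
such that $\Sigma_{[M\}}$ satisfies (2.62) and where $\Sigma$ is partitioned according to
the ordered decomposition
(2.64)    $I = (L\cap M)\ \dot\cup\ [L]\ \dot\cup\ [M]$.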
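Then $P_{\mathcal{K}}(I)^{-1}$ consists of all $\Delta \in P(I)$ having the simple form
(2.65)    $\Delta = \begin{pmatrix} \Delta_{L\cap M} & \Delta_{\langle L]} & \Delta_{\langle M]} \\ \Delta_{[L\rangle} & \Delta_{[L]} & 0 \\ \Delta_{[M\rangle} & 0 & \Delta_{[M]} \end{pmatrix}$.
Thus, in this example $P_{\mathcal{K}}(I)^{-1}$ is linear while $P_{\mathcal{K}}(I)$ is not. The group
$GL_{\mathcal{K}}(I)$ consists of all nonsingular $I\times I$ matrices of the form
(2.66)    $A = \begin{pmatrix} A_{L\cap M} & 0 & 0 \\ A_{[L\rangle} & A_{[L]} & 0 \\ A_{[M\rangle} & 0 & A_{[M]} \end{pmatrix}$.    □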
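Example 2.6. Consider the lattice $\mathcal{K} = \{\emptyset, L\cap M, L, M, L\cup M, I\}$ (see Figure
2.6).
[Figure 2.6: lattice diagram of $\mathcal{K}$.]
Note that $J(\mathcal{K}) = \{L\cap M, L, M, I\}$ and $\langle L\cap M\rangle = \emptyset$, $\langle L\rangle = \langle M\rangle = L\cap M$, $\langle I\rangle = L\cup M$. The $\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$ is given by
(2.67)    $P_{\mathcal{K}}(I) \leftrightarrow$
          $P(L\cap M)\times M([L]\times(L\cap M))\times P([L])\times M([M]\times(L\cap M))\times P([M])\times M([I]\times(L\cup M))\times P([I])$
and
(2.68)
Since $L\cap M, L, M, I$ is a never-decreasing listing of $J(\mathcal{K})$, $\Sigma$ can be
reconstructed from its ordered $\mathcal{K}$-parameters $\Lambda_{[L\cap M]}$, $R_{[L\rangle}$, $\Lambda_{[L]}$, $R_{[M\rangle}$, $\Lambda_{[M]}$, $R_{[I\rangle}$, $\Lambda_{[I]}$ as follows:
Steps 1,2,3: Repeat Steps 1, 2 and 3 in Example 2.5 to obtain $\Sigma_{L\cup M}$.
Step 4:    $\Sigma_{[I\rangle} = R_{[I\rangle}\Sigma_{L\cup M}$
           $\Sigma_{[I]} = \Lambda_{[I]} + R_{[I\rangle}\Sigma_{\langle I]}$.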
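Thus $P_{\mathcal{K}}(I)$ consists of all $\Sigma$ of the form
(2.69)
partitioned according to $I = (L\cup M)\ \dot\cup\ [I]$, where $\Sigma_{L\cup M}$ is given by (2.63),
(2.64) and (2.62). The precision matrix $\Delta \equiv \Sigma^{-1} \in P_{\mathcal{K}}(I)^{-1}$ is
characterized by the condition that $\Sigma_{L\cup M}^{-1}$ have the form (2.65). Thus
neither $P_{\mathcal{K}}(I)$ nor $P_{\mathcal{K}}(I)^{-1}$ is linear. The group $GL_{\mathcal{K}}(I)$ consists of all
nonsingular $I\times I$ matrices of the form
(2.70)    $A = \left(\begin{array}{cccc} A_{L\cap M} & 0 & 0 & 0 \\ A_{[L\rangle} & A_{[L]} & 0 & 0 \\ A_{[M\rangle} & 0 & A_{[M]} & 0 \\ \multicolumn{3}{c}{A_{[I\rangle}} & A_{[I]} \end{array}\right)$.    □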
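Example 2.7. Let $\mathcal{K}$ be the lattice in Figure 2.7:
[Figure 2.7: lattice diagram of $\mathcal{K}$.]
Here $J(\mathcal{K}) = \{L\cap M, L, M, L', M'\}$ and $\langle L\cap M\rangle = \emptyset$, $\langle L\rangle = \langle M\rangle = L\cap M$, $\langle L'\rangle =
\langle M'\rangle = L\cup M \equiv L'\cap M'$. The $\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$ is given by
(2.71)    $P_{\mathcal{K}}(I) \leftrightarrow$
          $P(L\cap M)\times M([L]\times(L\cap M))\times P([L])\times M([M]\times(L\cap M))\times P([M])$
          $\times M([L']\times(L\cup M))\times P([L'])\times M([M']\times(L\cup M))\times P([M'])$
          $\Sigma \leftrightarrow (\Sigma_{L\cap M},\ \Sigma_{[L\rangle}\Sigma_{L\cap M}^{-1},\ \Sigma_{[L]\cdot},\ \Sigma_{[M\rangle}\Sigma_{L\cap M}^{-1},\ \Sigma_{[M]\cdot},$
              $\Sigma_{[L'\rangle}\Sigma_{L\cup M}^{-1},\ \Sigma_{[L']\cdot},\ \Sigma_{[M'\rangle}\Sigma_{L\cup M}^{-1},\ \Sigma_{[M']\cdot})$,
from which the decomposition of $\mathrm{tr}(\Sigma^{-1}xx^t)$ is directly obtained. The
matrix $\Sigma$ can be reconstructed from its ordered $\mathcal{K}$-parameters $\Lambda_{[L\cap M]}$, $R_{[L\rangle}$,
$\Lambda_{[L]}$, $R_{[M\rangle}$, $\Lambda_{[M]}$, $R_{[L'\rangle}$, $\Lambda_{[L']}$, $R_{[M'\rangle}$, $\Lambda_{[M']}$ as follows:
Steps 1,2,3: Repeat Steps 1, 2, 3 in Example 2.5 to obtain $\Sigma_{L\cup M} = \Sigma_{L'\cap M'}$.
Steps 4,5: Repeat Steps 2, 3 in Example 2.5 with $L, M$ replaced by $L', M'$.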
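Thus $P_{\mathcal{K}}(I)$ consists of all $\Sigma$ of the form (2.63) with $L, M$ replaced by $L', M'$,
partitioned according to the ordered decomposition
(2.72)    $I = (L\cup M)\ \dot\cup\ [L']\ \dot\cup\ [M']$
and where $\Sigma_{L'\cap M'} = \Sigma_{L\cup M}$ is given by (2.63). The precision matrix $\Delta = \Sigma^{-1}$
has the form (2.65) with $L, M$ replaced by $L', M'$ and satisfies the
condition that $\Sigma_{L\cup M}^{-1}$ has the form (2.65). Again, neither $P_{\mathcal{K}}(I)$ nor $P_{\mathcal{K}}(I)^{-1}$
is linear. The group $GL_{\mathcal{K}}(I)$ consists of all nonsingular $I\times I$ matrices of
the form
(2.73)    $A = \left(\begin{array}{ccccc} A_{L\cap M} & 0 & 0 & 0 & 0 \\ A_{[L\rangle} & A_{[L]} & 0 & 0 & 0 \\ A_{[M\rangle} & 0 & A_{[M]} & 0 & 0 \\ \multicolumn{3}{c}{A_{[L'\rangle}} & A_{[L']} & 0 \\ \multicolumn{3}{c}{A_{[M'\rangle}} & 0 & A_{[M']} \end{array}\right)$.    □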
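Example 2.8. Let $\mathcal{K}$ be the lattice in Figure 2.8:
[Figure 2.8: lattice diagram of $\mathcal{K}$.]
Here $J(\mathcal{K}) = \{L\cap M, L, M, L'', M'\}$ and $\langle L\cap M\rangle = \emptyset$, $\langle L\rangle = \langle M\rangle = L\cap M$, $\langle L''\rangle = L$,
$\langle M'\rangle = L\cup M = L'\cap M'$. The $\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$ is given by:
(2.74)    $P_{\mathcal{K}}(I) \leftrightarrow$
          $P(L\cap M)\times M([L]\times(L\cap M))\times P([L])\times M([M]\times(L\cap M))\times P([M])$
          $\times M([L'']\times L)\times P([L''])\times M([M']\times(L\cup M))\times P([M'])$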
I[M] .• '
I[M' ].)
from which the decomposition
can be reconstructed from its ordered :It-parameters A[lJ)M]' R[L> , A[L]'
, R[M')' A[M'] as follows:
I =
t:l0:neo. accordlna: to
twhere l{LUl =l(L"t'
l[LU) =,R.[Lu>~
l(LU] = A(L"] + R(LU)l(LU]
ThusP:t(I) consists of a.ll I of the form
twherel{M]:::::l(M.t; thus weobta.in
Step 5: l(M'> = R(M')~
l(M']= A(M'] + R(M·)l(M·]
~:E.!~~~Repeat Steps 1.2.3 in Example 2.5. to obta1n
(2.76)
(2.77)
(2.75)
. • • • •]I = U U U U .
where $\Sigma_{[M\}}$, $\Sigma_{[L''\}}$, $\Sigma_{[M'\}}$ satisfy (2.62), (2.75), (2.76), respectively.
The precision matrix $\Delta \equiv \Sigma^{-1}$ satisfies the following three conditions:
its $[M']\times[L'']$- and $[L'']\times[M']$-submatrices are 0, the $[L'']\times[M]$- and
$[M]\times[L'']$-submatrices of $\Sigma_{L'}^{-1}$ are 0, and $\Sigma_{L\cup M}^{-1}$ has the form (2.65). Neither
$P_{\mathcal{K}}(I)$ nor $P_{\mathcal{K}}(I)^{-1}$ is linear. The group $GL_{\mathcal{K}}(I)$ consists of all nonsingular
$I\times I$ matrices of the form
(2.79)    $A = \left(\begin{array}{ccccc} A_{L\cap M} & 0 & 0 & 0 & 0 \\ A_{[L\rangle} & A_{[L]} & 0 & 0 & 0 \\ A_{[M\rangle} & 0 & A_{[M]} & 0 & 0 \\ \multicolumn{2}{c}{A_{[L''\rangle}} & 0 & A_{[L'']} & 0 \\ \multicolumn{3}{c}{A_{[M'\rangle}} & 0 & A_{[M']} \end{array}\right)$.    □
Example 2.9. Finally consider the lattice $\mathcal{K}$ in Figure 2.9a:
[Figure 2.9a: the lattice $\mathcal{K}$.]
Although this lattice properly contains the lattices in Examples 2.7 and
2.8 as sublattices, the set $P_{\mathcal{K}}(I)$ that it determines is much simpler than
those in Examples 2.7 and 2.8. The reader may verify that $P_{\mathcal{K}}(I)$ is
identical to $P_{\mathcal{M}}(I)$, where $\mathcal{M}$ is the sublattice in Figure 2.9b:
(2.80)    $A = \left(\begin{array}{ccccc} A_{L\cap M} & 0 & 0 & 0 & 0 \\ A_{[L\rangle} & A_{[L]} & 0 & 0 & 0 \\ A_{[M\rangle} & 0 & A_{[M]} & 0 & 0 \\ \multicolumn{2}{c}{A_{[L''\rangle}} & 0 & A_{[L'']} & 0 \\ A_{[M''(L\cap M)]} & 0 & A_{[M''M]} & 0 & A_{[M'']} \end{array}\right)$
(Note that $(A_{[M''(L\cap M)]}\ A_{[M''M]}) = A_{[M''\rangle}$ in (2.80).)    □
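Remark 2.5. For any $K \in \mathcal{D}(I)$ define $K' := I\setminus K$. It is an elementary
exercise to verify that for $L, M \in \mathcal{D}(I)$, $x_L \perp\!\!\!\perp x_M \mid x_{L\cap M}$ under $N(\Sigma)$ if and
only if $x_{L'} \perp\!\!\!\perp x_{M'} \mid x_{L'\cap M'}$ under $N(\Sigma^{-1})$. From this it follows that $P_{\mathcal{K}}(I) = P_{\mathcal{K}'}(I)^{-1}$, where $\mathcal{K}' := \{K' \mid K \in \mathcal{K}\}$ is the dual lattice of $\mathcal{K}$. For example,
if $\mathcal{K}$ is the lattice in Figure 2.4, then $\mathcal{K}'$ has the same form as the
lattice in Figure 2.5; the relation $P_{\mathcal{K}}(I) = P_{\mathcal{K}'}(I)^{-1}$ may be verified by
comparing (2.57) and (2.65).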
§3. LIKELIHOOD INFERENCE FOR A NORMAL MODEL DETERMINED BY PAIRWISE
CONDITIONAL INDEPENDENCE.
3.1. Factorization of the likelihood function; the MLE of $\Sigma$.
Consider $n$ independent, identically distributed (i.i.d.) observations
[]
from lattice CI model JI(~) defined by (1.8) and (1.4). and
of y.
L£~ del"ot:e the submatrix of y,
K£ tion YK aCl~or,d hill: to (2.3) as
The fundamental factorization of the LF for the model $N(\mathcal{K})$ is an
immediate consequence of Theorem 2.1(ii), Lemma 2.5, and Theorem 2.2.
Theorem 3.1. (Factorization Theorem.) The likelihood function based on $n$
i.i.d. observations from the statistical model $N(\mathcal{K})$ has the following
factorization:
(3.2)    $P_{\mathcal{K}}(I)\times M(I\times N) \to\ ]0,\infty[$
         $(\Sigma, y) \mapsto (\det(\Sigma))^{-n/2}\exp(-\mathrm{tr}(\Sigma^{-1}yy^t)/2) =$
         $\prod((\det(\Sigma_{[K]\cdot}))^{-n/2}\times\exp(-\mathrm{tr}(\Sigma_{[K]\cdot}^{-1}(y_{[K]} - \Sigma_{[K\rangle}\Sigma_{\langle K\rangle}^{-1}y_{\langle K\rangle})(\cdots)^t)/2) \mid K \in J(\mathcal{K}))$.
The parameter space $P_{\mathcal{K}}(I)$ has the factorization given by (2.32).    □
Note that the factor corresponding to $K \in J(\mathcal{K})$ is the density for the
conditional distribution of $y_{[K]}$ given $y_{\langle K\rangle}$.
It follows readily from Theorem 3.1 and well-known results for the
multivariate normal linear regression model that the MLE $\hat\Sigma(y)$ of $\Sigma \in
P_{\mathcal{K}}(I)$ is unique if it exists, and it exists for a.e. $y \in M(I\times N)$ if and
only if
(3.3)    $n \ge \max\{|\langle K\rangle| + |[K]| \mid K \in J(\mathcal{K})\} \equiv \max\{|K| \mid K \in J(\mathcal{K})\}$.
case usual
ref!~ressjlon es t imators:
(3.4)
K € j{:1t).
where $S(y) = yy^t$ is the empirical covariance matrix. The explicit
expression for $\hat\Sigma$ itself may be obtained from its $\mathcal{K}$-parameters in (3.4) by
means of the reconstruction algorithm given in Section 2.7.
If $I \in J(\mathcal{K})$ then the condition (3.3) reduces to $n \ge |I|$, so in this
case $S$ is positive definite a.e., hence a fortiori $\hat\Sigma_{[K]\cdot}$ exists a.e. for
every $K \in J(\mathcal{K})$. If, on the other hand, $I \notin J(\mathcal{K})$, then condition (3.3)
does not guarantee that $S$ is positive definite, but it still guarantees
that $\hat\Sigma_{[K]\cdot}$ (and hence $\hat\Sigma$) exists a.e.
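To make condition (3.3) concrete, the following Python sketch computes the bound $\max\{|K| \mid K \in J(\mathcal{K})\}$ for a small instance of the lattice of Example 2.5. The index sets are hypothetical, and the helper `angle` assumes that $\langle K\rangle$ is the union of the join-irreducible elements properly contained in $K$, which agrees with the values of $\langle K\rangle$ listed in the Examples of Section 2.8.

```python
def angle(K, J):
    """<K>: union of the members of J properly contained in K (assumed definition)."""
    out = frozenset()
    for other in J:
        if other < K:          # proper-subset test on frozensets
            out |= other
    return out

def min_sample_size(J):
    """Right-hand side of (3.3): max over K in J of |<K>| + |[K]|, where [K] = K \\ <K>."""
    return max(len(angle(K, J)) + len(K - angle(K, J)) for K in J)

# Hypothetical instance of Example 2.5 with I = {1,...,6}.
LM = frozenset({1})
L = frozenset({1, 2, 3})
M = frozenset({1, 4, 5, 6})
J = [LM, L, M]

n_min = min_sample_size(J)
print(n_min, len(L | M))
```

In this instance the bound is $|M| = 4$, which is smaller than $|I| = 6$, in line with the remark that (3.3) does not force $S$ to be positive definite when $I \notin J(\mathcal{K})$.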
By Lemma 2.5, when (3.3) is satisfied the maximum value of the LF (3.2)
is given by
(3.5)
where c = nn/2x exp{-n.11 1/2). This fact is used in Section 4 to express
the likelihood ratio statistic for testing one model against another.
Remark 3.1. The statistical model $N(\mathcal{K})$ is a curved exponential family; it
is linear if and only if $P_{\mathcal{K}}(I)^{-1}$ is a linear set, i.e., closed under
positive linear combinations. In the linear case the MLE $\hat\Sigma$ based on $n$
1. 1.d. a minimal ~ is
not nec::es:salri in the general case. [J
associated
If $\mathcal{K}$ is a chain as in Examples 2.1 and 2.2, then $P_{\mathcal{K}}(I) = P(I)$
and $N(\mathcal{K})$ is the unrestricted covariance model regardless of the length of
the chain. (The $\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$ does depend on this length,
however.) Condition (3.3) for existence of the MLE $\hat\Sigma$ reduces to the
familiar condition $n \ge |I|$, while (3.4) reduces to $\hat\Sigma = S$.
For the lattice $\mathcal{K}$ in Example 2.3, partition the observation $x \in R^I$
according to (2.54) as $x = (x_L^t, x_M^t)^t$. The model $N(\mathcal{K})$ states simply that $x_L
\perp\!\!\!\perp x_M$. According to (3.3), the MLE $\hat\Sigma$ exists if and only if $n \ge
\max\{|L|, |M|\}$ (whereas $S$ is positive definite if and only if $n \ge |I|$) and
is given by $\hat\Sigma = \mathrm{Diag}(S_L, S_M)$.
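For the lattice $\mathcal{K}$ in Example 2.4, partition $x \in R^I$ according to (2.58)
as $x = (x_L^t, x_M^t, x_{[I]}^t)^t$. Then the model $N(\mathcal{K})$ again states that $x_L \perp\!\!\!\perp x_M$.
Condition (3.3) for the existence of the MLE takes the form $n \ge |I|$, while from (3.4),
We reconstruct $\hat\Sigma$ from its $\mathcal{K}$-parameters by following Steps 1-3 in Example
2.4 to obtain
    $\hat\Sigma_{L\cup M} = \mathrm{Diag}(S_L, S_M)$
    $\hat\Sigma_{[I\rangle} = S_{[I\rangle}S_{L\cup M}^{-1}\mathrm{Diag}(S_L, S_M)$
    $\hat\Sigma_{[I]} = S_{[I]\cdot} + S_{[I\rangle}S_{L\cup M}^{-1}\mathrm{Diag}(S_L, S_M)S_{L\cup M}^{-1}S_{\langle I]}$    $(\ne S_{[I]})$.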
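In Example 2.5, $x$ is partitioned according to (2.64) as
$x = (x_{L\cap M}^t, x_{[L]}^t, x_{[M]}^t)^t$. The model $N(\mathcal{K})$ states
that $x_L \perp\!\!\!\perp x_M \mid x_{L\cap M}$. Condition (3.3) becomes $n \ge \max\{|L|, |M|\}$, while
(3.4) becomes
(3.6a)    $\hat\Sigma_{L\cap M} = S_{L\cap M}$
(3.6b)    $\hat\Sigma_{[L\rangle}\hat\Sigma_{L\cap M}^{-1} = S_{[L\rangle}S_{L\cap M}^{-1}$,    $\hat\Sigma_{[L]\cdot} = S_{[L]\cdot}$
(3.6c)    $\hat\Sigma_{[M\rangle}\hat\Sigma_{L\cap M}^{-1} = S_{[M\rangle}S_{L\cap M}^{-1}$,    $\hat\Sigma_{[M]\cdot} = S_{[M]\cdot}$.
By Steps 1-3 in Example 2.5, $\hat\Sigma$ is given by (3.6a) and
(3.7a)
(3.7b)
(3.7c)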
In Example 2.6, $x$ is partitioned as $(x_{L\cap M}^t, x_{[L]}^t, x_{[M]}^t, x_{[I]}^t)^t$ and the
model $N(\mathcal{K})$ states that $x_{[L]} \perp\!\!\!\perp x_{[M]} \mid x_{L\cap M}$. Condition (3.3) reduces to $n \ge
|I|$, while (3.4) is given by (3.6a,b,c) and
From Steps 1-4 in Example 2.6, $\hat\Sigma$ is given by (3.6a), (3.7a,b,c), and
from
(3.8) ~=
l"uw
IC"uw
I("uw ,x[L],x[M]}'
(il) ~[X] JL
(iii) x[L"J JJ. x[M'J
b,c}.
tion
(Note that XL 'flM' =~ = ("uw ,x[LrX[Xl)'} Condition (3.3) becomes n
~ max{ IL '1.Ix' n. while (3.4) is given by (3.6a.b,c) and (3.6b.c) with
(3.9a)
(3.9b)
(3.9c)
X[L] JL x[X] IXJ.nr.I and that x[L'] JJ. xEN'] I(XJ.nr.I ,xELrxEX1) .
where~ is given by (3.8).
I T:'.___ 1 2 8' .. d (t t t t t)t In ~lIpe .• , X IS partltloneas "uw 'X[LrX[XrX[L"rX[X'l . t
may be seen fI'()J1l the form (2.11}o{ :I€P::tt(I) that the IIlOdel N(::tt} is
determined by the following three cOIlditions:
. . t t t tt tIn Example 2.1. x Isparti tioned as (XJ.nr.I ,x[LrxEXrx[L'l,xEX'l)
and the>model N(:I} states that
L,M replac:ed by L',X' (note that SL 'flM' = SLUM) • From Steps 1-5 in
Example 2.1, ~ is giyen)by(3.&3,).(3.1a,b.c}, and
~[L"]. =S[L"].
~[M·]. =S[M·].·
From Steps 1-5 in Example 2.8. ~ is given by (3.98.) and (3.7a.b.c). by
~[L") =S[L")' ~[L"] =S[L"]
by (3. 9a. b). and by
Finally, for the lattice $\mathcal{K}$ in Example 2.9, $x$ is partitioned as
$(x_{L\cap M}^t, x_{[L]}^t, x_{[M]}^t, x_{[L'']}^t, x_{[M'']}^t)^t$. It is readily seen (cf. Remark 3.2) that the
model $N(\mathcal{K})$ is determined by the single condition that
This reflects the fact that this model is of the same form as that in
Example 2.5 (see discussion in Example 2.9).
Remark 3.2. Recall the definition of the normal model H(1) for a
distributive every L.M € 1. It may be
seen from the that many of these tions are r'edundant;
may be om! exEll1npJle wltlen.evE~r L & M:.
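$L'$, $M \subseteq M'$, and $L\cap M = L'\cap M'$, then $x_{L'} \perp\!\!\!\perp x_{M'} \mid x_{L'\cap M'} \Rightarrow x_L \perp\!\!\!\perp x_M \mid x_{L\cap M}$, hence the latter condition may be omitted. The question of characterizing
a minimal set of CI conditions that determines $N(\mathcal{K})$ is currently under
investigation. For a given lattice $\mathcal{K}$, however, such minimal determining
sets are not unique. In Example 2.8, the following four sets of CI
conditions are (equivalent) minimal determining sets for $N(\mathcal{K})$:
    (i) $x_L \perp\!\!\!\perp x_M \mid x_{L\cap M}$;    (ii) $x_{L\cup M} \perp\!\!\!\perp x_{L''} \mid x_L$;    (iii) $x_{L'} \perp\!\!\!\perp x_{M'} \mid x_{L\cup M}$
    (i) $x_L \perp\!\!\!\perp x_M \mid x_{L\cap M}$;    (ii) $x_{L''} \perp\!\!\!\perp x_{M'} \mid x_L$
    (i) $x_M \perp\!\!\!\perp x_{L''} \mid x_{L\cap M}$;    (ii) $x_{L'} \perp\!\!\!\perp x_{M'} \mid x_{L\cup M}$
    (i) $x_M \perp\!\!\!\perp x_{L''} \mid x_{L\cap M}$;    (ii) $x_{L''} \perp\!\!\!\perp x_{M'} \mid x_L$.    □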
Remark 3.3. For $I = \{1, 2, 3, 4\}$, consider the statistical model consisting
of all normal distributions on $R^I$ such that $x_1$ is independent of $x_2$ and
$x_3$ is independent of $x_4$. It is readily seen that this model is not of the
form $N(\mathcal{K})$ for any $\mathcal{K}$. The same is true for the normal model determined by
the two conditions that $x_1$ and $x_2$ are CI given $(x_3, x_4)$ and $x_3$ and $x_4$ are
u
Remark 3.4. The general model $N(\mathcal{K})$ is defined by the pairwise CI
requirement (1.4) for every pair $L, M \in \mathcal{K}$. This requirement does not
neice!;ss.ry iIllPly. however. that for every SU1,SElt c,j are
CI given xn(KIKeI)' For the ::I in ~"UjJJ,""
be seen by considering the subset c,j = {L". Ll.JM. )In}.
this may
n
alternative statistical Int:erlprEltat1ctn of model
may be obtailned from (2. :X= € € is an
observation from the normal model $N(\mathcal{K})$ if and only if $x$ can be
represented in the form $x = Az$ for some (generalized block-triangular)
matrix $A \in GL_{\mathcal{K}}(I)$, where $z \equiv (z_{[K]} \mid K \in J(\mathcal{K})) \in R^I$ is an (unobservable)
stochastic variate such that $z \sim N(1_I)$. From Proposition 2.2(iii), this
representation is equivalent to the system of equations
(3.10)    $x_{[L]} = \sum(A_{[LM]}z_{[M]} \mid M \in H(L))$,    $L \in J(\mathcal{K})$,
where $H(L) := \{M \in J(\mathcal{K}) \mid M \subseteq L\} \subseteq J(\mathcal{K})$. This shows that the CI model $N(\mathcal{K})$
can be interpreted as a multivariate linear recursive model (cf. Wermuth
(1980), Kiiveri, Speed, and Carlin (1984)) with lattice constraints.
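The representation $x = Az$ with $z \sim N(1_I)$ gives $x$ the covariance matrix $AA^t$, so it can be checked numerically that a matrix $A$ of the generalized block-triangular form produces a covariance satisfying the CI constraint. The Python sketch below does this for the lattice of Example 2.5 under the simplifying assumption, not made in the text, that the blocks $[L\cap M]$, $[L]$, $[M]$ are one-dimensional; the numerical entries and function names are hypothetical.

```python
def covariance_of_Az(A):
    """Covariance of x = A z when z has identity covariance, i.e. A A^t."""
    n = len(A)
    return [[sum(A[i][k] * A[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

def satisfies_ci(S):
    """Scalar form of (2.62): S_[M][L] * S_{L∩M} == S_[M> * S_<L]."""
    return S[2][1] * S[0][0] == S[2][0] * S[0][1]

# Rows/columns ordered as (L∩M, [L], [M]).
A_in_group = [[2, 0, 0],
              [1, 3, 0],
              [5, 0, 7]]   # zero in the [M]x[L] position, as in (2.66)
A_generic = [[2, 0, 0],
             [1, 3, 0],
             [5, 1, 7]]    # ordinary lower-triangular, not of the form (2.66)

S_group = covariance_of_Az(A_in_group)
S_generic = covariance_of_Az(A_generic)
print(S_group, satisfies_ci(S_group))
print(S_generic, satisfies_ci(S_generic))
```

The second matrix is lower triangular in the ordinary sense only, and its covariance violates the constraint, which illustrates why the zero pattern of $GL_{\mathcal{K}}(I)$ matters.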
Conversely, suppose that $J$ is a finite index set and let $(H(t) \mid t \in J)$
be a family of subsets of $J$ that satisfies the following two conditions:
    (i)     $t \in H(t)$
    (ii)    $m \in H(t) \Rightarrow H(m) \subseteq H(t)$.
For each t € J let D1 and E1 be finite index sets such that IDtl ~ IEtl
and let I = U(D1It e J}. I I =U(£1 It € J}. Consi.der the normal
statistical model defined by the system of equations
(3.lt)
Dt Emwhere x(t] € IR is observable. z(m] € IR is unobservable.
€ '" N(1I} on • Atm € K(Dt xEm}. and t) =Z==
I. Let 'it
be the subsets of J lZeTler:atEld and for H € 'it
= It € ~ := { e is a of
of $I$ and the model determined by the system (3.11) has the form
(3.10), i.e., it is the model $N(\mathcal{K})$.    □
3.3. Invariance of the model.
It follows from the well-known transformation property of the
multivariate normal distribution that the i.i.d. model determined by $N(\mathcal{K})$
is invariant under the transitive action (2.33) of $GL_{\mathcal{K}}(I)$ on the
parameter space $P_{\mathcal{K}}(I)$ and the action
(3.12)    $GL_{\mathcal{K}}(I)\times M(I\times N) \to M(I\times N)$
          $(A, y) \mapsto Ay$
of $GL_{\mathcal{K}}(I)$ on the observation space $M(I\times N)$. The MLE is thus equivariant.
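§4. TESTING ONE PAIRWISE CONDITIONAL INDEPENDENCE MODEL AGAINST ANOTHER.
Let $\mathcal{K}$ and $\mathcal{M}$ be two sublattices of $\mathcal{D}(I)$ such that $\mathcal{M} \subset \mathcal{K}$. Then $P_{\mathcal{K}}(I) \subseteq
P_{\mathcal{M}}(I)$ and one can consider the following general testing problem: based
on $n$ i.i.d. observations $x_1, \ldots, x_n \in R^I$ from the model $N(\mathcal{M})$, test
(4.1)    $H_0: N(\mathcal{K})$  vs.  $H: N(\mathcal{M})$.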
problem. A, expressed in of its
moments, is derived by means of the invariance of this f"""~1"''''HT problem
under on obselrV!:ltion space space.
WhllCh establ ishes the mutual
invariant statistic 11 's
2:[K]'" K € J(:1t). Examples of the general testing problem are presented in
Section 4.3.
A warning about the notation is needed here. Since $J(\mathcal{K}) \ne J(\mathcal{M})$,
quantities such as $\langle K\rangle$, $[K]$, $\Sigma_{[K]}$, $\Sigma_{[K\rangle}$, $\Sigma_{[K]\cdot}$ depend not only on the
subset $K$ of $I$ but also on the lattice of which $K$ is considered a member.
Thus, for example, $\langle K\rangle_{\mathcal{K}}$ and $\langle K\rangle_{\mathcal{M}}$ need not be the same. To alleviate this
difficulty without introducing $\mathcal{K}$ and $\mathcal{M}$ as subscripts, the letter $K$ shall
denote a subset of $I$ that is to be considered as a member of $\mathcal{K}$, while $M$
shall denote a subset of $I$ that is to be considered a member of $\mathcal{M}$.
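4.1. The likelihood ratio statistic.
Denote the MLE's of $\Sigma$ under $N(\mathcal{K})$ and $N(\mathcal{M})$ by $\hat\Sigma_{\mathcal{K}}$ and $\hat\Sigma_{\mathcal{M}}$,
respectively.
Theorem 4.1. Suppose that $n \ge \max\{|M| \mid M \in J(\mathcal{M})\}$. Then for every $\Sigma \in
P_{\mathcal{M}}(I)$, $\hat\Sigma_{\mathcal{K}}$ and $\hat\Sigma_{\mathcal{M}}$ exist a.e. The LR statistic $\lambda$ for testing $H_0$ against $H$ is
given by
(4.2)    $\lambda^{2/n} = \frac{\det(\hat\Sigma_{\mathcal{M}})}{\det(\hat\Sigma_{\mathcal{K}})}
         = \frac{\prod(\det(\hat\Sigma_{\mathcal{M},[M]\cdot}) \mid M \in J(\mathcal{M}))}{\prod(\det(\hat\Sigma_{\mathcal{K},[K]\cdot}) \mid K \in J(\mathcal{K}))}
         = \frac{\prod(\det(S_{[M]\cdot}) \mid M \in J(\mathcal{M}))}{\prod(\det(S_{[K]\cdot}) \mid K \in J(\mathcal{K}))}$.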
Proof. The first assertion follows from (3.3) and the inequality
$\max\{|M| \mid M \in J(\mathcal{M})\} \ge \max\{|K| \mid K \in J(\mathcal{K})\}$. To establish this inequality
mapI>ing 'Ii: J(:1t) ~ J(.M) by 'Ii(K) := By an
simi to that in Pro})Osition 3. ii) of ( it may be
" is3. i) of (
implies that $\psi$ is surjective, hence
    $\max\{|M| \mid M \in J(\mathcal{M})\} = \max\{|\psi(K)| \mid K \in J(\mathcal{K})\}$
                                  $\ge \max\{|K| \mid K \in J(\mathcal{K})\}$.
The second assertion of the Theorem now follows from (3.5) and (3.4).    □
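As a minimal illustration of (4.2), consider testing the model of Example 2.3 ($x_L \perp\!\!\!\perp x_M$) against the unrestricted model $\mathcal{M} = \{\emptyset, I\}$. Then $J(\mathcal{M}) = \{I\}$ with $S_{[I]\cdot} = S$, and $J(\mathcal{K}) = \{L, M\}$ with $S_{[L]\cdot} = S_L$, $S_{[M]\cdot} = S_M$, so the last expression in (4.2) is $\det(S)/(\det(S_L)\det(S_M))$. The Python sketch below evaluates this ratio under the illustrative assumption that $L$ and $M$ are single coordinates; the entries of `S` and the function name are hypothetical.

```python
from fractions import Fraction as F

def lr_power_2_over_n(S):
    """lambda^(2/n) = det(S) / (S_L * S_M) for a 2x2 matrix S with |L| = |M| = 1."""
    det_s = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    return det_s / (S[0][0] * S[1][1])

S = [[F(4), F(2)],
     [F(2), F(9)]]

stat = lr_power_2_over_n(S)

# In this scalar case the ratio equals 1 - r^2, with r the sample correlation.
r_squared = S[0][1] ** 2 / (S[0][0] * S[1][1])
print(stat, 1 - r_squared)
```

The ratio is unchanged if `S` is multiplied by a positive constant, so it does not matter here whether the empirical covariance is normalized by $n$.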
For computational purposes. note that
(4.3)
K € J(:1I). where S(y) =yyt (cf. (3.1» with an analogous formula for
4.2. Central distribution and Box approximation.
The testing problem (4.1) is invariant under the action (3.12) of the
group $GL_{\mathcal{K}}(I)$ on the sample space $M(I\times N)$ and the action
(4.4)    $GL_{\mathcal{K}}(I)\times P_{\mathcal{M}}(I) \to P_{\mathcal{M}}(I)$
         $(A, \Sigma) \mapsto A\Sigma A^t$
on the parameter space. Let
(4. ...:Jf(IxN) -+ Jf(IxN)/~(I)
the t prC)Jectl[On onto t space
under the action (3.12). Since the LR statistic is invariant under
(3.12), $\lambda$ depends on $y \in M(I\times N)$ only through $T(y)$. The central
distribution of $\lambda$ is readily derived from this fact and Theorem 4.2,
whose proof is deferred to Appendix A.3. Since the restriction of (4.4)
to $P_{\mathcal{K}}(I)$ is transitive (cf. Theorem 2.3), under $H_0$ the distribution of $\lambda$
does not depend on $\Sigma \in P_{\mathcal{K}}(I)$.
Theorem 4.2. Under $H_0$, the statistics $T$ and $\hat\Sigma_{[K]\cdot}$, $K \in J(\mathcal{K})$, are mutually
independent. The statistic $\hat\Sigma_{[K]\cdot}$ has the Wishart distribution on $P([K])$
with $n - |\langle K\rangle|$ degrees of freedom and expected value $\Sigma_{[K]\cdot}$.    □
It follows from Theorem 4.2 that $\lambda$ and $\hat\Sigma_{[K]\cdot}$, $K \in J(\mathcal{K})$, are mutually
independent. Therefore for every $\Sigma \in P_{\mathcal{K}}(I)$ $(\subseteq P_{\mathcal{M}}(I))$ and $\alpha > 0$,
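hence from (2.37) and (4.2),
However, it follows from the Wishart distribution of $\hat\Sigma_{[K]\cdot}$
for $K \in J(\mathcal{K})$, with an analogous formula for $E(\det(\hat\Sigma_{[M]\cdot})^{\alpha})$, $M \in J(\mathcal{M})$.
Since
(4.6)    $\sum(|[K]| \mid K \in J(\mathcal{K})) = |I| = \sum(|[M]| \mid M \in J(\mathcal{M}))$
and
    $\prod(\det(\Sigma_{[K]\cdot}) \mid K \in J(\mathcal{K})) = \det(\Sigma) = \prod(\det(\Sigma_{[M]\cdot}) \mid M \in J(\mathcal{M}))$
for $\Sigma \in P_{\mathcal{K}}(I)$, one obtains that
(4.7)    $E(\lambda^{2\alpha/n}) = \frac{\prod\left(\frac{\Gamma((n-|\langle M\rangle|-i+1)/2+\alpha)}{\Gamma((n-|\langle M\rangle|-i+1)/2)} \,\middle|\, i = 1, \ldots, |[M]|,\ M \in J(\mathcal{M})\right)}{\prod\left(\frac{\Gamma((n-|\langle K\rangle|-j+1)/2+\alpha)}{\Gamma((n-|\langle K\rangle|-j+1)/2)} \,\middle|\, j = 1, \ldots, |[K]|,\ K \in J(\mathcal{K})\right)}$.
The Box approximation for the central distribution of $-2\log\lambda$ may be
obtained as in Anderson (1984) p.3H-316. In Anderson's notation we have
$a = b = |I|$ and
(4.8)    $f = -2\sum(\sum((-|\langle M\rangle|-i+1)/2 \mid i = 1, \ldots, |[M]|) \mid M \in J(\mathcal{M}))$
             $+ 2\sum(\sum((-|\langle K\rangle|-j+1)/2 \mid j = 1, \ldots, |[K]|) \mid K \in J(\mathcal{K}))$
           $= \sum(|[M]|\times|\langle M\rangle| + |[M]|(|[M]|-1)/2 \mid M \in J(\mathcal{M}))$
             $- \sum(|[K]|\times|\langle K\rangle| + |[K]|(|[K]|-1)/2 \mid K \in J(\mathcal{K}))$
           $= \sum(|[M]|\times(|\langle M\rangle| + |[M]|/2) \mid M \in J(\mathcal{M}))$
             $- \sum(|[K]|\times(|\langle K\rangle| + |[K]|/2) \mid K \in J(\mathcal{K}))$,
where the final equality is obtained using (4.6). From (2.37), one
recognizes $f$ to be the difference between the number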
4.3. Examples of testing problems.
Let $\mathcal{K}_1, \ldots, \mathcal{K}_8, \mathcal{K}_9, \mathcal{K}_{10}, \mathcal{K}_{11}$ denote the lattices appearing in Figures
2.1, ..., 2.8, 2.9a, 2.10a, 2.11a, respectively. In this subsection we
consider examples of the testing problem (4.1) with $(\mathcal{K}, \mathcal{M}) = (\mathcal{K}_i, \mathcal{K}_j)$ for
various pairs $(i, j)$. In each example the LR statistic $\lambda$ in (4.2) and the
parameter $f$ in (4.8) are rewritten in forms that reflect the statistical
interpretation of the testing problem, i.e., that reflect the conditional
independence (CI) condition being tested.
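The parameter $f$ in (4.8) depends only on the block sizes $|[K]|$ and $|\langle K\rangle|$, so it is easy to compute mechanically. The Python sketch below evaluates the second expression in (4.8), reading each lattice as a list of hypothetical pairs $(|[K]|, |\langle K\rangle|)$, one per join-irreducible element; the function name and the block sizes are illustrative assumptions.

```python
def box_f(blocks_m, blocks_k):
    """f from (4.8): each argument lists (|[K]|, |<K>|) over the join-irreducibles."""
    def total(blocks):
        # |[K]| * |<K>| + |[K]| (|[K]| - 1) / 2, summed over the lattice;
        # b * (b - 1) is always even, so integer division is exact.
        return sum(b * a + b * (b - 1) // 2 for b, a in blocks)
    return total(blocks_m) - total(blocks_k)

# Example 2.3 (|L| = 2, |M| = 3) against the unrestricted model {∅, I}:
# J(M) = {I} with [I] = I, <I> = ∅;  J(K) = {L, M} with <L> = <M> = ∅.
f_23 = box_f([(5, 0)], [(2, 0), (3, 0)])

# Example 2.5 (|L∩M| = 1, |[L]| = 2, |[M]| = 3) against the unrestricted model.
f_25 = box_f([(6, 0)], [(1, 0), (2, 1), (3, 1)])

print(f_23, f_25)
```

In both cases $f$ equals the number of entries in the covariance block that the CI condition constrains, namely $|L|\times|M|$ and $|[L]|\times|[M]|$ respectively.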
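For this purpose we must introduce the following notation: for any $\Sigma \in
P(I)$ and any $K, L \in \mathcal{D}(I)$ such that $L \subseteq K$, let
    $\Sigma_K = \begin{pmatrix} \Sigma_L & \Sigma_{L,K\setminus L} \\ \Sigma_{K\setminus L,L} & \Sigma_{K\setminus L} \end{pmatrix}$
denote the partitioning of $\Sigma_K$ according to the decomposition
    $K = L\ \dot\cup\ (K\setminus L)$
and define
    $\Sigma_{K\cdot L} = \Sigma_{K\setminus L} - \Sigma_{K\setminus L,L}\Sigma_L^{-1}\Sigma_{L,K\setminus L} \in P(K\setminus L)$.
(When $K \in J(\mathcal{K})$ and $M \in J(\mathcal{M})$, $\Sigma_{K\cdot\langle K\rangle} = \Sigma_{[K]\cdot}$ and $\Sigma_{M\cdot\langle M\rangle} = \Sigma_{[M]\cdot}$.) The well
known formula
    $\det(\Sigma_K) = \det(\Sigma_L)\det(\Sigma_{K\cdot L})$
may be applied in (4.2) to obtain forms of $\lambda^{2/n}$ that appear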
First, set $\mathcal{M} = \{\emptyset, I\}$ in (4.1) and consider the testing problems of the
form
(4.9)
for $\mathcal{K} = \mathcal{K}_3, \ldots, \mathcal{K}_8$. For $i = 3, \ldots, 8$, the following forms of the LR
statistics $\lambda_i$ directly reflect the statistical interpretations of the
models $N(\mathcal{K}_i)$ given in Section 3.2:
2/n _ det{S)~ - det(SL)det{~)'
2/n _ . det{Sl_{UlMl)
~ - det(SL. (lfll() )det{~.(IJ1M»'
= Ix I[M] I;
:It = ~:
~2/n =
f 7 = I[Lllx I[M] I + I[L 'II xl[M ' ] I ;
x deteSL' .L) x det(SI.(WM»
det(S(WM) .L)det(SLn.L) det(SL' .(WM)Jdet(~,• (WM»
_ •. det(SV_(LnM» . x • det(SI.(WM»- det(SM.(lflMJ)det(SLn• (lflM» det(~, .(WM»det(~,. (WM»
f S = I[Lllxl[M]1 + I[M]lxl[Ln]1 + I[Ln]lxl[M']1
= IILllx I[MIl + I[Lnll( I[Mll+I[M'll>
= I[Mll(IILll+ IIv'll> + I[Lnllx11M'] I;
Remark 4.1. The three equivalent expressi9ns for ~n given a.bove
correspond to the .........,......"..... determining sets of CI condi tions
for H(:ltS)
given in Remark 3.2. The expression for A~n suggested by the
fourth set is
[]
is in some
::;:
2/n . .2/n det(SI·(LUM»A:i ,6 ::;: (A.(A6) ::;: det'Sr". (l1JM) )det(~ '. (l1JM) ) ,
2/nequal to AS • Thus the fourth determining
Next we consider five testing problems of the form (4.1) with (:'It •..4t) ::;:
(:'It. ,:'It.). From (4.2) and (4.S) one may obtain the following expressions:1 J
sense·unsatisfactory for describing .N(:'ItS) '
but this is
These five testing problems involve the five adjacent pairs of lattices
in the diagram
The LR statistic $\lambda$ and the parameter $f$ for non-adjacent pairs may be
obtained from those for adjacent pairs in the usual way, for example:
Remark 4.2. It is thus seen that in each example, the LR statistic can be
represented as a product of LR statistics for testing CI of two blocks of
variates. We conjecture that this is true in general, i.e., that the LR
statistic $\lambda$ in (4.2) for the general testing problem (4.1) may be written
as such a product, and that furthermore, the factors are mutually
independent under $H_0$. Of course it must be realized that the above
examples involve only very simple lattices. More complex distributive
lattices, e.g. non-planar lattices, may lead to statistical models and
tests with more complex structure. □
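The building block mentioned in Remark 4.2, the LR statistic for testing CI of two blocks of variates, can be sketched numerically. The following is a minimal illustration, not the report's own procedure: it assumes that $S_{A \cdot C}$ denotes the Schur complement of the $C$-block in $S$, restricted to the indices of $A$ outside $C$, and it computes $\det(S_{(L \cup M)\cdot(L \cap M)}) / (\det(S_{L\cdot(L \cap M)}) \det(S_{M\cdot(L \cap M)}))$. The function names `schur_complement` and `ci_lr_factor` are hypothetical.

```python
import numpy as np

def schur_complement(S, A, C):
    """Return S_AA - S_AC S_CC^{-1} S_CA for index sets A and C (assumed disjoint)."""
    A, C = sorted(A), sorted(C)
    S_AA = S[np.ix_(A, A)]
    if not C:  # nothing to condition on
        return S_AA
    S_AC = S[np.ix_(A, C)]
    return S_AA - S_AC @ np.linalg.solve(S[np.ix_(C, C)], S_AC.T)

def ci_lr_factor(S, L, M):
    """Determinant ratio for testing x_L indep. of x_M given x_{L & M} (assumed form)."""
    L, M = set(L), set(M)
    C = L & M
    if not (L - C) or not (M - C):
        raise ValueError("L and M must each contain indices outside their intersection")
    num = np.linalg.det(schur_complement(S, (L | M) - C, C))
    den = (np.linalg.det(schur_complement(S, L - C, C))
           * np.linalg.det(schur_complement(S, M - C, C)))
    return num / den

# Illustrative 3x3 matrices (made-up data): a Markov-chain-like S and a dependent one.
S_chain = np.array([[1.0, 0.5, 0.25], [0.5, 1.0, 0.5], [0.25, 0.5, 1.0]])
S_dep = np.array([[1.0, 0.0, 0.5], [0.0, 1.0, 0.0], [0.5, 0.0, 1.0]])
```

With `L = {0, 1}` and `M = {1, 2}`, the ratio equals one exactly when the conditional cross-covariance of the two blocks vanishes, and is smaller than one otherwise.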
Let $V$ be a finite-dimensional real vector space. A quotient space (or
simply a quotient) of $V$ is defined to be a pair $(Q, p_Q)$ consisting of
a vector space $Q$ and a surjective linear mapping $p_Q: V \to Q$. For
ease of notation, $(Q, p_Q)$ is abbreviated to $Q$.
Let $R$ and $T$ be two quotients of $V$. If there exists a linear mapping
$p_{RT}: T \to R$ such that $p_R = p_{RT} \circ p_T$, then $p_{RT}$ is necessarily surjective and
unique, hence $(R, p_{RT})$ is a quotient of $T$. In this situation we write
$(R, p_R) \le (T, p_T)$, or simply $R \le T$. This relation is equivalent to the
condition that $p_R^{-1}(0) \supseteq p_T^{-1}(0)$. The relation $\le$ on the set of all
quotients of $V$ is not antisymmetric, hence one defines an equivalence
relation $\sim$ on this set by $R \sim T$ if $p_R^{-1}(0) = p_T^{-1}(0)$. The collection of
equivalence classes is denoted by $\mathcal{Q}(V)$. Equipped with the relation
induced by $\le$ (also denoted by $\le$), $\mathcal{Q}(V)$ becomes a partially ordered set
($\equiv$ poset).
We identify a quotient $(Q, p_Q)$ of $V$ with its equivalence class in $\mathcal{Q}(V)$.
A convenient representative for this equivalence class is the canonical
quotient space $(V/p_Q^{-1}(0), p)$, where $p: V \to V/p_Q^{-1}(0)$ is the canonical
quotient mapping given by $p(x) = x + p_Q^{-1}(0)$, $x \in V$.
The poset $\mathcal{Q}(V)$ is in fact a lattice: if $R, T \in \mathcal{Q}(V)$ then their minimum
and maximum exist and are given by
$$R \wedge T := V/(p_R^{-1}(0) + p_T^{-1}(0)),$$
$$R \vee T := V/(p_R^{-1}(0) \cap p_T^{-1}(0)),$$
respectively. The minimal and maximal elements exist and are given by $\{0\}$
and $V$, respectively. If $\dim(V) \ge 2$ then $\mathcal{Q}(V)$ is not distributive and
= Qt. Since V is finite dimensional. the lattice Q(V) has
length. hence so does any sublattice Q k Q(V). Therefore. if Q is a
of tl(V) it must be . The is
to Se.~t:t.on 3 ( lattices
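The formulas for $R \wedge T$ and $R \vee T$ above reduce to rank computations on the kernels $p_R^{-1}(0)$ and $p_T^{-1}(0)$. The sketch below is an illustration under stated assumptions: each quotient of $V = \mathbb{R}^n$ is represented by a matrix whose columns span its kernel, and the helper names `rank` and `quotient_dims` are hypothetical. It uses $\dim(A \cap B) = \dim A + \dim B - \dim(A + B)$.

```python
import numpy as np

def rank(A):
    """Rank of a matrix, treating a matrix with no columns as rank 0."""
    return 0 if A.size == 0 else int(np.linalg.matrix_rank(A))

def quotient_dims(n, ker_R, ker_T):
    """Dimensions of R, T, their meet and their join, as quotients of R^n.

    ker_R and ker_T are (n, k) arrays whose columns span the kernels of p_R and p_T.
    """
    k_R, k_T = rank(ker_R), rank(ker_T)
    k_sum = rank(np.hstack([ker_R, ker_T]))   # dim(ker p_R + ker p_T)
    k_int = k_R + k_T - k_sum                 # dim(ker p_R intersect ker p_T)
    return {"R": n - k_R, "T": n - k_T, "meet": n - k_sum, "join": n - k_int}

# Example in R^3: R forgets the third coordinate, T forgets the first.
ker_R = np.array([[0.0], [0.0], [1.0]])
ker_T = np.array([[1.0], [0.0], [0.0]])
dims = quotient_dims(3, ker_R, ker_T)
```

In this example the meet is the one-dimensional quotient retaining only the middle coordinate and the join is $V$ itself; in general the four dimensions satisfy $\dim(R \wedge T) + \dim(R \vee T) = \dim R + \dim T$.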
5.2. Invariant formulation of the pairwise CI model.
For $\sigma \in P(V)$ := the cone of all positive definite forms on the dual
vector space $V^*$ of $V$, let $N(\sigma)$ denote the normal distribution on $V$ with
mean vector $0 \in V$ and covariance $\sigma$ (cf. [A] (1975), Section 5). Let $\mathcal{Q} \subseteq
\mathcal{Q}(V)$ be a sublattice such that $\{0\}, V \in \mathcal{Q}$.
Definition 5.1. The class $P_{\mathcal{Q}}(V) \subseteq P(V)$ is defined as follows:
(5.1) $\sigma \in P_{\mathcal{Q}}(V) \iff p_R(x) \perp\!\!\!\perp p_T(x) \mid p_{R \wedge T}(x) \ \ \forall R, T \in \mathcal{Q}$ when $x \sim N(\sigma)$,
i.e., $p_R$ and $p_T$ are conditionally independent (CI) given $p_{R \wedge T}$ (compare to
Definition 2.1). □
Theorem 5.1. The class $P_{\mathcal{Q}}(V)$ is nonempty if and only if the lattice $\mathcal{Q}$ is
distributive.
Proof. See Appendix A.2. □
The normal statistical model $N_V(\mathcal{Q})$ defined by the requirement (5.1) of
pairwise conditional independence wrt $\mathcal{Q}$ is then given by
(5.2)
(compare to (1.8)). By Theorem 5.1, $N_V(\mathcal{Q}) \neq \emptyset$ if and only if $\mathcal{Q}$ is
distributive.
te index set.
I)I
Q(~) l;; Q(R ) as fo1
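each $K \in \mathcal{K}$ define the coordinate projection $p_K: \mathbb{R}^I \to \mathbb{R}^K$ by $p_K((x_i \mid i \in I)) =
(x_i \mid i \in K)$. Since $\mathcal{K}$ is a ring, it follows that $\mathcal{Q}(\mathcal{K}) := \{(\mathbb{R}^K, p_K) \mid K \in \mathcal{K}\}$ is a
distributive lattice of quotients of $\mathbb{R}^I$. If $\emptyset, I \in \mathcal{K}$, then $\{0\}, \mathbb{R}^I \in \mathcal{Q}(\mathcal{K})$.
Thus each canonical coordinate-wise CI model $N(\mathcal{K})$ given by (1.8) is a
special case of the general CI model $N_V(\mathcal{Q})$ given by (5.2). □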
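In the coordinate-wise setting, the pairwise CI requirement can be checked directly on a covariance matrix. The following sketch is illustrative only and rests on a standard fact about normal distributions: $x_L$ and $x_M$ are CI given $x_{L \cap M}$ exactly when the conditional cross-covariance between the coordinates in $L \setminus M$ and those in $M \setminus L$ vanishes. The function names, the example family `K` and the example matrices are all assumptions introduced for the illustration.

```python
import numpy as np
from itertools import combinations

def is_ring(K):
    """Check that a family of subsets is closed under union and intersection."""
    fam = {frozenset(k) for k in K}
    return all((a | b) in fam and (a & b) in fam for a, b in combinations(fam, 2))

def cond_cross_cov(Sigma, A, B, C):
    """Cross-covariance of blocks A and B after conditioning on block C."""
    A, B, C = sorted(A), sorted(B), sorted(C)
    S_AB = Sigma[np.ix_(A, B)]
    if not C:
        return S_AB
    return S_AB - Sigma[np.ix_(A, C)] @ np.linalg.solve(Sigma[np.ix_(C, C)],
                                                         Sigma[np.ix_(C, B)])

def satisfies_lattice_ci(Sigma, K, tol=1e-10):
    """True if x_L and x_M are CI given x_{L & M} for every pair L, M in K."""
    fam = [frozenset(k) for k in K]
    for L, M in combinations(fam, 2):
        C = L & M
        A, B = L - C, M - C
        if A and B and np.abs(cond_cross_cov(Sigma, A, B, C)).max() > tol:
            return False
    return True

# Hypothetical ring of subsets of I = {0, 1, 2} containing the empty set and I.
K = [set(), {1}, {0, 1}, {1, 2}, {0, 1, 2}]
Sigma_ok = np.array([[1.0, 0.5, 0.25], [0.5, 1.0, 0.5], [0.25, 0.5, 1.0]])
Sigma_bad = np.array([[1.0, 0.0, 0.5], [0.0, 1.0, 0.0], [0.5, 0.0, 1.0]])
```

Here `Sigma_ok` satisfies the single nontrivial condition (coordinates 0 and 2 are CI given coordinate 1), while `Sigma_bad` does not.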
Conversely, by Proposition 5.1 below every distributive sublattice $\mathcal{Q} \subseteq
\mathcal{Q}(V)$ can be represented in the form $\mathcal{Q} = \mathcal{Q}(\mathcal{K})$ for some ring of subsets $\mathcal{K}$
and every CI model $N_V(\mathcal{Q})$ can be represented as a canonical model $N(\mathcal{K})$.
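5.3. Reduction of the CI model to canonical coordinate-wise form.
Proposition 5.1. Let $\mathcal{Q} \subseteq \mathcal{Q}(V)$ be a distributive lattice of quotients.
Then there exists a set $I$, a ring $\mathcal{K}$ of subsets of $I$ with the property $\emptyset, I
\in \mathcal{K}$, a lattice isomorphism $Q \to K(Q)$ of $\mathcal{Q} \to \mathcal{K}$, and a basis $(e_i \mid i \in I)$ for
$V$ such that the quotients $(Q, p_Q) \in \mathcal{Q}$ can be represented as follows:
(5.3) $Q = \mathrm{span}\{e_i \mid i \in K(Q)\}$,
(5.4) $p_Q(e_i) = \begin{cases} e_i & \text{for } i \in K(Q) \\ 0 & \text{for } i \in I \setminus K(Q). \end{cases}$
Proof. See Appendix A.1. □
We say that a basis $(e_i \mid i \in I)$ for $V$ satisfying the conditions in
Proposition 5.1 is adapted to $\mathcal{Q}$. When $V$ is identified with $\mathbb{R}^I$
through a $\mathcal{Q}$-adapted basis $(e_i \mid i \in I)$, the lattice $\mathcal{Q} \subseteq \mathcal{Q}(V)$
is identified with the ring $\mathcal{K} \equiv \mathcal{K}(\mathcal{Q})$ of subsets of $I$ and the
quotients $p_Q$, $Q \in \mathcal{Q}$, are identified with the coordinate projections
$p_K: \mathbb{R}^I \to \mathbb{R}^K$, $K \in \mathcal{K}(\mathcal{Q})$ (cf. Example 5.1). Furthermore, $P(V)$ is identified
with $P(I)$ through the correspondence $\sigma \leftrightarrow \Sigma$, where $\Sigma$ is the matrix of $\sigma$
wrt the dual basis $(e_i^* \mid i \in I)$ for $V^*$. The condition (5.1) is then
transformed into the condition (2.1), hence $P_{\mathcal{Q}}(V)$ is identified with
$P_{\mathcal{K}(\mathcal{Q})}(I)$ and the model $N_V(\mathcal{Q})$ is transformed into the canonical form
$N(\mathcal{K}(\mathcal{Q}))$.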
Remark 5.1. Since the identity matrix $1_I \in P_{\mathcal{K}(\mathcal{Q})}(I)$, the model $N_V(\mathcal{Q})$ is
nonempty when $\mathcal{Q}$ is distributive. □
5.4. Invariant formulation of the testing problem.
Let Q and ,. be two distributive sublattices of Q(V) such that" C Q.
Then PQ{V) ~ P,.{V) and one may consider the general problem of testing
Nv(Ql against the i(possible) larger model NVC") 011 the basis oin i. i .d,
observations from V. Le .. testing
(5.5)
By Proposition 5.1 we may choose a Q-adapted basis (e. Ii E I) for V;1
clearly this basis is also adapted to '!l. It follows immediately that the
tes ting prllDJ.em is into the caJIlOJI1C:al
(4.1) by
§6.
Cnl)1C~e of a fJ.-jac:Ul.Dl:ed basis.
remain open COlllc«~nling the structure of
is QUlesl~1(Jtn under
minimal determining sets of CI conditions for $N(\mathcal{K})$ (cf. Remarks 3.2 and
4.1). A second question is whether every testing problem of the general
form (4.1) can be decomposed into a product of simpler testing problems
(cf. Remark 4.2). The answer to this question will be of use for a
decision-theoretic study of the LR test and other invariant tests for the
problem (4.1).
The normal statistical models $N(\mathcal{K})$ may be generalized in several ways.
One natural and possibly fruitful extension is suggested by an
examination of the $\mathcal{K}$-parametrization (2.32) of $P_{\mathcal{K}}(I)$. A large class of
"second-order" submodels of $N(\mathcal{K})$ may be obtained by replacing each $P([K])$
in (2.32) by $P_{\mathcal{K}'}([K])$, where each $\mathcal{K}' \equiv \mathcal{K}'(K)$ is a subring of $\mathcal{D}([K])$.
Third-order and higher-order submodels may be obtained by iterating this
process. This construction yields a rich class of normal conditional
models and associated testing problems which, despite their apparent
complexity, admit a relatively standard explicit likelihood analysis.
Alternatively, one might replace each term $M([K] \times \langle K \rangle) \times P([K])$ in the
$\mathcal{K}$-parametrization (2.32) by a suitable covariance selection model
requirement (cf. Dempster (1972), Wermuth (1976, 1980)), thus
generalizing the multivariate graphical chain models of Lauritzen and
Wermuth (1989) to "multivariate graphical lattice models".
Another interesting question the rela.tion of the lattice CI models
(1985, 1989),decomposable Q'rl~ntls
CI models determined by
and Wermuth
toexteIls:l.on.s just rlp!Qt'":l~iH(::Jl) (and
(1989), etc.). It appears that the class of decomposable graphical CI
contains nor is co:nulirled in the of lattice
ttl'"',....""~. "" contailnEid here.
APPENDIX.
In Appendices A.1 and A.2, the notation and terminology of Section 5
are followed.
A.1. The Decomposition Theorem and existence of a $\mathcal{Q}$-adapted basis.
Lemma A.1. For $R \in \mathcal{Q}(V)$, the set $\mathcal{Q}(V)_R := \{Q \in \mathcal{Q}(V) \mid Q \le R\}$ is a sublattice
of $\mathcal{Q}(V)$ isomorphic to the lattice $\mathcal{Q}(R)$ of quotients of $R$ through the
lattice isomorphism
$$\mathcal{Q}(R) \to \mathcal{Q}(V)_R$$
$$(Q, p_{QR}) \mapsto (Q, p_{QR} \circ p_R).$$
Proof. Straightforward. □
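Lemma A.2. Let $R, T \in \mathcal{Q}(V)$ with $R \vee T = V$ and let $r_R: R \to p_{R \wedge T, R}^{-1}(0)$ and
$r_T: T \to p_{R \wedge T, T}^{-1}(0)$ be surjective linear mappings. Then the linear mapping
(A.1)
$$\varphi: V \to (R \wedge T) \times p_{R \wedge T, R}^{-1}(0) \times p_{R \wedge T, T}^{-1}(0)$$
$$x \mapsto (p_{R \wedge T}(x),\ r_R(p_R(x)),\ r_T(p_T(x)))$$
is bijective.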
I1Mr(x) =0 and we obtain that l1l(x) €
r R is Similarly PT{x) = O.
Suppose that cp(x) = O.
fact PR(x) = 0
hence $x \in p_R^{-1}(0) \cap p_T^{-1}(0) = \{0\}$. The linear mapping $\varphi$ is thus injective.
Since $\dim(V) = \dim((R \wedge T) \times p_{R \wedge T, R}^{-1}(0) \times p_{R \wedge T, T}^{-1}(0))$, $\varphi$ is also surjective.
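As in Section 5.3, let $\mathcal{Q}$ be a distributive sublattice of $\mathcal{Q}(V)$ such that
$\{0\}, V \in \mathcal{Q}$. For $Q \in \mathcal{Q}$, $Q \neq \{0\}$, define
$$\langle Q \rangle := \vee(Q' \in \mathcal{Q} \mid Q' < Q)$$
and let $J(\mathcal{Q})$ denote the poset of all join-irreducible elements in $\mathcal{Q}$,
i.e.,
$$J(\mathcal{Q}) := \{Q \in \mathcal{Q} \mid Q \neq \{0\},\ \langle Q \rangle < Q\}$$
$$\phantom{J(\mathcal{Q}) :} = \{Q \in \mathcal{Q} \mid Q \neq \{0\},\ \forall R, T \in \mathcal{Q}: Q = R \vee T \Rightarrow Q = R \text{ or } Q = T\}.$$
In the following theorem the space $V$ is represented as a product of
vector spaces indexed by $J(\mathcal{Q})$ such that the space with index $Q \in J(\mathcal{Q})$ has
dimension $\dim(Q) - \dim(\langle Q \rangle)$.
Theorem A.1 (Decomposition Theorem). For each $Q \in J(\mathcal{Q})$, let
$r_Q: Q \to p_{\langle Q \rangle, Q}^{-1}(0)$ be any surjective linear mapping. Then the linear
mapping
(A.2)
$$\varphi_V: V \to \times(p_{\langle Q \rangle, Q}^{-1}(0) \mid Q \in J(\mathcal{Q}))$$
$$x \mapsto (r_Q(p_Q(x)) \mid Q \in J(\mathcal{Q}))$$
is bijective.
Proof. For $R \in \mathcal{Q}$ define $\mathcal{Q}_R := \{Q \in \mathcal{Q} \mid Q \le R\}$, a sublattice of $\mathcal{Q}$ ($\mathcal{Q}_V = \mathcal{Q}$).
$$J(\mathcal{Q}_R) = J(\mathcal{Q}) \cap \mathcal{Q}_R$$
$$\vee(Q \in J(\mathcal{Q}_R))$$
(A.3)
(A.4)
(A.5)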
(cf. (2.4) - (2.7)). The proof proceeds by induction on $|J(\mathcal{Q})| =: q$. If $q
= 1$, then $\mathcal{Q} = \{\{0\}, V\}$ and the result is trivial. Next, assume that the
result is true whenever $q \le k-1$ and suppose that $q = k$. If $V \in J(\mathcal{Q})$ then
$|J(\mathcal{Q}_{\langle V \rangle})| = k-1$, hence the mapping
is bijective by the induction assumption and Lemma A.1. Since the linear
mapping
$$V \to \langle V \rangle \times p_{\langle V \rangle}^{-1}(0)$$
$$x \mapsto (p_{\langle V \rangle}(x),\ r_V(x))$$
is bijective and $p_{Q, \langle V \rangle} \circ p_{\langle V \rangle} = p_Q$ for every $Q \in J(\mathcal{Q}_{\langle V \rangle})$, the mapping
(A.2) is bijective in this case.
If, on the other hand, $V \notin J(\mathcal{Q})$, then $V = R \vee T$ where $R < V$ and $T < V$.
It follows from (A.3) that $|J(\mathcal{Q}_R)| < k$ and $|J(\mathcal{Q}_T)| < k$, so by the
induction assumption and Lemma A.1, the mapping
$$V \to \times(p_{\langle Q \rangle, Q}^{-1}(0) \mid Q \in J(\mathcal{Q}_R))$$
$$x \mapsto (r_Q(p_Q(x)) \mid Q \in J(\mathcal{Q}_R))$$
is (equivalent to) the quotient mapping $p_R: V \to R$. Similarly, the quotient
mappings $p_T$ and $p_{R \wedge T}$ can be represented in an analogous way, hence
Thus, by (A.5) and (A.6),
Lemma A.2 now implies that $\varphi_V$ is bijective. □
Remark A.1. The representation (A.2) shows that $V$ can be identified with
a product of vector spaces indexed by $J(\mathcal{Q})$; similarly, each $R \in \mathcal{Q}$ can be
identified with the product $\times(p_{\langle Q \rangle, Q}^{-1}(0) \mid Q \in J(\mathcal{Q}_R))$ through the bijective
linear mapping $\varphi_R$ defined by $\varphi_R(x) = (r_Q(p_Q(x)) \mid Q \in J(\mathcal{Q}_R))$, $x \in R$; under
these identifications, each mapping $p_{RT}$, $R \le T \le V$, is simply a canonical
projection mapping.
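For a ring of subsets, the objects appearing in the Decomposition Theorem can be computed by direct enumeration. The sketch below is an illustration under assumptions: it works with a finite ring $\mathcal{K}$ of subsets (the coordinate counterpart of $\mathcal{Q}$ under Proposition 5.1), takes $\langle K \rangle$ to be the union of all members strictly contained in $K$, and takes $[K] := K \setminus \langle K \rangle$, so that $|[K]|$ plays the role of $\dim(Q) - \dim(\langle Q \rangle)$. The function name `join_irreducibles` and the example family are hypothetical.

```python
def join_irreducibles(K):
    """Map each join-irreducible member K of a ring of subsets to its block [K].

    A nonempty member K is join-irreducible when <K>, the union of all members
    strictly contained in K, is a proper subset of K.
    """
    fam = {frozenset(k) for k in K}
    blocks = {}
    for k in fam:
        if not k:
            continue  # the empty set is excluded by definition
        below = frozenset().union(*[q for q in fam if q < k])  # <K>
        if below < k:
            blocks[k] = k - below  # [K] = K \ <K>
    return blocks

# Hypothetical ring of subsets of I = {0, 1, 2}.
K = [set(), {1}, {0, 1}, {1, 2}, {0, 1, 2}]
blocks = join_irreducibles(K)
```

In this example the full set $\{0, 1, 2\}$ is the union of $\{0, 1\}$ and $\{1, 2\}$ and is therefore not join-irreducible, while the blocks of the three join-irreducible members partition $I$, mirroring the product representation of $V$.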
Proof of Proposition 5.1. For each Q € J(Q). let [K(Q)] be a set with
For R € Q. define
n
(A.7)
and let $I := K(V)$. From (A.5) and (A.6) it follows that $\mathcal{K} \equiv \mathcal{K}(\mathcal{Q}) :=
\{K(R) \mid R \in \mathcal{Q}\}$ is a subring of $\mathcal{D}(I)$ and the mapping $R \to K(R)$ is a lattice
isomorphism between $\mathcal{Q}$ and $\mathcal{K}$. Now Remark A.1 implies that there exists a
basis $(e_i \mid i \in I)$ for
$V$ such that the elements $(R, p_R)$ in $\mathcal{Q}$ can be
A.2. Proof of Theorem 5.1.
Lemma A.3. Suppose that $x \sim N(\sigma)$, $\sigma \in P(V)$. Then for any $R, T \in \mathcal{Q}(V)$,
$p_R(x) \perp\!\!\!\perp p_T(x) \mid p_{R \wedge T}(x) \iff p_R^{-1}(0)$ and $p_T^{-1}(0)$ are geometrically orthogonal
(g.o.) wrt the inner product $\delta := \sigma^{-1}$ on $V$ (cf. [A] (1990), Definition
4.1, for the definition of g.o.).
Proof. Let $p_R^{-1}(0)^{\perp}$, $p_T^{-1}(0)^{\perp}$, and $p_{R \wedge T}^{-1}(0)^{\perp}$ denote the orthogonal
complements of $p_R^{-1}(0)$, $p_T^{-1}(0)$, and $p_{R \wedge T}^{-1}(0)$, respectively, wrt $\delta$.
Furthermore, let $q_R$, $q_T$, and $q_{R \wedge T}$ be the orthogonal projections of $V$ onto
$p_R^{-1}(0)^{\perp}$, $p_T^{-1}(0)^{\perp}$, and $p_{R \wedge T}^{-1}(0)^{\perp}$. Then $(p_R^{-1}(0)^{\perp}, q_R)$, $(p_T^{-1}(0)^{\perp}, q_T)$ and
$(p_{R \wedge T}^{-1}(0)^{\perp}, q_{R \wedge T})$ represent the quotients $(R, p_R)$, $(T, p_T)$, and $(R \wedge T, p_{R \wedge T})$,
respectively. Therefore
onto
orthogonal
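$$\begin{aligned}
& p_R(x) \perp\!\!\!\perp p_T(x) \mid p_{R \wedge T}(x) \\
\iff\ & q_R(x) \perp\!\!\!\perp q_T(x) \mid q_{R \wedge T}(x) \\
\iff\ & (q_R(x) - q_{R \wedge T}(x)) \perp\!\!\!\perp (q_T(x) - q_{R \wedge T}(x)) \mid q_{R \wedge T}(x) \\
\iff\ & (q_R(x) - q_{R \wedge T}(x)) \perp\!\!\!\perp (q_T(x) - q_{R \wedge T}(x)) \\
\iff\ & (p_R^{-1}(0)^{\perp} \cap p_{R \wedge T}^{-1}(0)) \perp (p_T^{-1}(0)^{\perp} \cap p_{R \wedge T}^{-1}(0)) \\
\iff\ & p_R^{-1}(0)^{\perp} \text{ and } p_T^{-1}(0)^{\perp} \text{ are g.o.} \\
\iff\ & p_R^{-1}(0) \text{ and } p_T^{-1}(0) \text{ are g.o.}
\end{aligned}$$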
and this direct sum is
fourth (=) follows from (*). the
fifth and sixth <=)'s are elementary properties of geometric
Proof of Theorem 5.1. Since the correspondence $Q \leftrightarrow p_Q^{-1}(0)$ between $\mathcal{Q}(V)$
and the lattice $\mathcal{L}(V)$ of all subspaces of $V$ (cf. [A] (1990), Section 4.1)
is a lattice anti-isomorphism it follows that $\mathcal{L} := \{p_Q^{-1}(0) \mid Q \in \mathcal{Q}\} \subseteq \mathcal{L}(V)$
is a lattice and is anti-isomorphic to $\mathcal{Q}$. If $\sigma \in P_{\mathcal{Q}}(V) \neq \emptyset$, then by Lemma
A.3, $\mathcal{L}$ is g.o. wrt $\delta := \sigma^{-1}$. Thus by Proposition 4.1 of [A] (1990) $\mathcal{L}$ is
distributive, hence so is $\mathcal{Q}$. Conversely, if $\mathcal{Q}$ is distributive, then $P_{\mathcal{Q}}(V)
\neq \emptyset$ by Remark 5.1.
A.3. Proof of Theorem 4.2.
Let $\Omega \subset M(I \times N)$ be the open subset
(A.8) $\Omega := \{y \in M(I \times N) \mid \mathrm{rank}(y) = \min\{|I|, n\}\}$.
Since $M(I \times N) \setminus \Omega$ is a Lebesgue-null set, we may replace the sample space
$M(I \times N)$ by $\Omega$. Also, since $\mathrm{rank}(Ay) = \mathrm{rank}(y)$ for $A \in GL_{\mathcal{K}}(I)$ and $y \in
M(I \times N)$, it follows that $GL_{\mathcal{K}}(I)$ acts on $\Omega$ by restriction of (3.12).
Furthermore, since $\Omega$ is locally compact, Lemma A.5 at the end of this
subsection implies that this restriction is a proper action (whereas
(3.12) itself is not proper). Thus, in order to prove Theorem 4.2 we may
apply the method of [A] (1982) to study the transformation of the normal
distributions in the model $H_0$ under the mapping
(A.9) n -+ O/~{I) x (X{p([K])IK € J{:/l»))
y -+ (If"{Y) , (t[K]. (y) IK e J(:/l))).
sA and
group is
rA:::; {A E~(I114[K> ::; O. K E J(::ll}
'!J :::; {T E ~fIrl T[KJ ::; I[K]' K E J(::l}}.
Therefore we may apply the method of [AJ (1982). ~ction 5. with K =
~( I). H = rA. G = '!J. and X = {) to see that 7T can be represented as 7T =
7TrA0Tr'!J' where 1I''!J:{) ~ O/'!J and 7TrA :O/'!f ~ (O/'!J)/rA ~ O/~{I). (The action of '!J
on n is the restriction of (3.I2) to '!JxD. and the induced action of rA on
O/'!J is defined as equation (2I) of [AJ (1982).)
Since the ma.pping (A.9) is invariant un<ier the t'lction of '!J on n {cf.
(2.26J). it has a unique factorization through 7T'!J' Therefore we may first
transform the normal distributions in the model HO from n to O/'!J by 7T'!J'
To do this. we need the following explicit representation:
Lemma A.4. A representation of 7T'!J:n ~ O/'!J is given by
(A. H)
Proof: To show that O/'!J in {A. to) is a cross...s.ection of n and that 7T'!J in
(A. H) is a maximal invariant function. it sl1ffices to show that for each
yEO.
(A.12)
show k in . suppose
(A. Pr~tnnlqition 2.2{ii). {2.
for each K € J{~}. hence
{A.13}
K € J{~}. i.e .• -Ty = T~{Y}. To show the opposite inclusion~. it is easy
to verify that T~{Y} € (nt~) for every y € 0; to show that T~{Y} €
{TyIT € ~}. simply note that T~{Y} =Ty where
K € J{~). Finally. the mapping T~:O ... nt~ defined in (A.10) and (A.ll) is
clearly continuous. so this representation is also topological and the
resul t follows.
We may now apply fOl'Dll.lla. {16} of [A] {1982} to transform the normal
distributions in the model HO by the mapping T~ given by {A.ll}. In the
notation of Section 40f [A] {1982}. G = ~. X = O. ). is the restriction
of LebesglJ.e~sl.lreonll(hN} to the open subset 0.11 = T~. 13 is a Haar
measure on ~.AG = ~. == 1. and P = p-).. i.e .• P is the normal
distribution with density p given by
[]
= Y €
with respect to A. For .'I E: P::f(I). the density q of If,.(P) wrt the quotient
measure All3 on fV!f is thus given by
-nl21 -1 t(A.14) q(lf,.(Y» = (det(.'I»,.exp{-tr(.'I (Ty)(Ty) )/2}d{j(T)
where
K E: J(::f). T E: ,..
Since d{j(T) = U(~(T[K»IKE: J(::f», where ~ is the Lebesgue measure
on M([K]x<K» [cf". (2.19), the last integral in (A.14) can be calculated
using Fubini's Theorem and the translation invariance of ~. K E: J(::f).
The order of integration should be determined a never-incr~sipg
listing K1,K2, ••• ,KIJf::f)I of the elements in J(::f) [cf . Remark 2.1). After
some calculation we obtain
(A.I5)
xU( E: }
-(n-I<K>IJ/2 - 1 I= Ilf{det{2f Kl_l)
exp{-tr{2f Kl_S[Kj_{yl)/2}
•• K € J(:1ll)
1f,.{y) € on.
twhere Sly) = yy .
By Lemma A.4. where 0/,. is represented as a subset of O. the induced
action of the subgroup d on on is simply the restriction of the action
(3.12) to dx(o/sl. The next step is to represent the transformed measure
1f,.fP) q-(M/l) aS1f,.(PJ = ql-v, where v is an invariant measure under
this actiono£ don 0/,..
It follows from the statement following the proof of Proposition 2 on
p , 961 of [AJ (1982) that the quotient measure A/P is relatively
invariant under the action of d on 0/,. with multiplier X given by X(A) =
(mod~A)-IXo(A), A € d. where Xo is the multiplier for A as a relatively
invariantmea.sure under the action of ~(I) on 0 and where the
-1 .automorphisms ~A:" -+,. are defined by ~A(T) = ATA • T € ,.. SInce A =
Diag(A[K1IK € J(:ltl) it is clear that
V K e J(:1l).
hence
I I I<K>1·1 ·1·lfKJIImod~A =llf<iet(,AfKll.· det(A<K» . ·K € J(:ltl).
However.
Xo(A) = Idet(Alln
= Il( Idet(A[KJ) In IK € J(:1l».
so
= Il( ) € }.
If we define m=O/" -+ JO,co[ by
it follows that m{Az) =~(A)m(z), z € 0/,.. A € ~ (compare to (17) in [A]
1982). Thus the measure v := m-1· 0v p) is invariant under the action of ~
on 0/,.. From (A.15). the density q1 := mq of W,.(P) with respect to v is
therefore given by
~det (~ (»] (n-I <K> 1)/2[K)· y -1
H( det(z[K].) x exp{-ntr(z[K].~[K].(Y»/2}IK € J(~».
where it should be recalled that z € P~(I).
The final step in the proof of Theorem 4.2 is to obtain the
transformation of the measure W,.(P) =q1·v under the mapping
(A.16) 0/,. -+ (o/,.)/~ x (X(P([K])tK € J(~»)
w:r(y) -+ (W,.(w:r(Y»' (~[KJ~ (ylIK € J(~»).
Since the action of ~ on 0/,. is the restriction to the closed subset
tdx (0/,.) of the proper action of~(I) on 0, it is a proper action. Thus
we may apply Lemma 3 of Andersson. BrimS and Jensen (1983) to see that
there exists a unique measure " on (O/,.)/~ such that the invariant
measure v is transformed into tbe product measure d~vO under the mapping
is an invariant measure on X{P{ ) € the
proper transitive action
(A. 17)
~X(X,(p([K])IK E J(:f») X,(p([K])IK E J(:f»
(A. (A[K]IK E J(:f») (A(K]A(K]AtK]IK € j(:f».
(Lemma 3 of Andersson. Brems. and jensen (1983) is applied with G = sl. X
= DI,.. Y = X(p«(K])IK € J(:f». t = (1T,.(y) .... (~(K]. (y) IK € j(:f»). 1T = 1Tsl•
and v = v.)
Since q1 (z) depends on z := 1T,.(y) only through (~[K].(y>lK € j(:f». the
probability measure Q1·v is therefore transformed under (A.1S) into the
probabf li ty measure r· (dOvo)' where
1': (DI")/~ x (X(P([K]) IK e j(:f») IR+
(w. (A(K] IK € j(:f») ....
(A.1S) ~d teA ) ](n- I<K>I)/2
e [K] . -1llCdet(I[K].> x ex:p{-ntr(I(K].A[K])/2} IK € j(:1l».
Because I' does not depend on w. under HO it follows that 1T == 1T~01T,. is
independent of (~(K].IK€ J(:1l». 1T has distribution K. and
(~[KI.IK € J(:f» has distributions·vO' where sHA[K] IK € J(:1l» is given
by the product (A.1S). Furthermore. since "e = 8(vKIK € J(:f» where vK is
an invariant measure onP([K]) under the usual action of CL(EK]). it
follows that under ~[KI.' K<€ j(:Jt). a.relDUtua1ly independent and
~[K]. has the on perK]) with n-I<K>I degrees of
freedom and expected value I[K].' This ends the proof of Theorem 4.2.
following lemma. which was cited at the beginning of this
interest in its own group
actions in statistics.
Prooos t t.Ioa 5 (ti».
( ) . tre . 14.
Lemma A.5. Suppose that $G$ and $G'$ are locally compact groups that act
continuously on the locally compact spaces $X$ and $X'$, respectively. Let
$\varphi: G \to G'$ be a continuous group homomorphism and $\psi: X \to X'$ be a continuous
mapping such that $\psi(gx) = \varphi(g)\psi(x)$, $x \in X$, $g \in G$. If $\varphi$ is proper and if
the action of $G'$ on $X'$ is proper, then the action of $G$ on $X$ is also
proper.
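Proof: Consider the diagram
$$\begin{array}{ccc}
G \times X & \xrightarrow{\ \theta\ } & X \times X \\
\downarrow {\scriptstyle \varphi \times \psi} & & \downarrow {\scriptstyle \psi \times \psi} \\
G' \times X' & \xrightarrow{\ \theta'\ } & X' \times X',
\end{array}$$
where $\theta(g, x) = (gx, x)$ and $\theta'(g', x') = (g'x', x')$. We must show that $\theta^{-1}(C)$ is
compact whenever $C \subseteq X \times X$ is compact. Let $p_{G'}$ denote the projection of
$G' \times X'$ onto $G'$. Since the diagram commutes, i.e., $\theta' \circ (\varphi \times \psi) = (\psi \times \psi) \circ \theta$, it
follows that
$$\begin{aligned}
\theta^{-1}(C) &\subseteq \theta^{-1}((\psi \times \psi)^{-1}((\psi \times \psi)(C))) = (\varphi \times \psi)^{-1}(\theta'^{-1}((\psi \times \psi)(C))) \\
&\subseteq (\varphi \times \psi)^{-1}(p_{G'}(\theta'^{-1}((\psi \times \psi)(C))) \times X') = \varphi^{-1}(C') \times X,
\end{aligned}$$
where $C' = p_{G'}(\theta'^{-1}((\psi \times \psi)(C)))$. Since trivially $\theta^{-1}(C) \subseteq G \times p_2(C)$, where
$p_2$ denotes the projection of $X \times X$ on the second component, we have that
$$\theta^{-1}(C) \subseteq \varphi^{-1}(C') \times p_2(C).$$
$C'$ is compact since $\theta'$ is proper, and therefore $\varphi^{-1}(C')$ is compact
because $\varphi$ is proper. Thus $\theta^{-1}(C)$ is a closed subset of a compact subset
of $G \times X$, hence is compact.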
With the identifications G =G' =~(I), X =O. X' = P~(I}. ~ = the
identity mapping on ~(I). and Vi = t Lemma A.S may be applied as
indicated at the beginning of this subsection.
[J
REFERENCES
Anderson, T.W. (1985). An Introduction to Multivariate Statistical
Analysis (2nd ed.). John Wiley and Sons, New York.
Andersson, S.A. (1975). Invariant normal models. Ann. Statist. 3,
132-154.
Andersson, S.A. (1982). Distributions of maximal invariants using
quotient measures. Ann. Statist. 10, 955-961.
Andersson, S.A. (1990). The lattice structure of orthogonal linear models
and orthogonal variance component models. Scand. J. Statist. 17,
287-319.
Andersson, S.A., Brøns, H.K., and Jensen, S.T. (1983). Distribution of
eigenvalues in multivariate statistical analysis. Ann. Statist. 11,
392-415.
Andersson, S.A., Marden, J.I., and Perlman, M.D. (1990). Totally ordered
multivariate linear models with applications to monotone missing data
problems. In preparation.
Andersson. S.A.• M.D. (1991).
independence models for missing data. To apPear in Statist.
Banerjee. P.K.• and Giri. N. (1980). On D-. E-. DA- . and »X-optimality
properties of test procedures of hypotheses concerning the covariance
matrix of a normal distribution. In Multivariate Statistical Analysis
(R.P. Gupta. ed.) 11-19. North Holland Pub. Co•• New York.
Bourbaki, N. (1971). Éléments de Mathématique. Topologie générale, Chap.
1 à 4. Hermann, Paris.
Das Gupta, S. (1977). Tests on multiple correlation coefficient and
multiple partial correlation coefficient. J. Multivariate Anal. 7,
82-88.
Davey, B.A. and Priestley, H.A. (1990). Introduction to Lattices and
Order. Cambridge University Press, Cambridge.
Dempster, A. (1972). Covariance selection models. Biometrics 28, 157-175.
Eaton, M.L. (1983). Multivariate Statistics: A Vector Space Approach.
John Wiley and Sons, New York.
Eaton, M.L. and Kariya, T. (1983). Multivariate tests with incomplete
data. Ann. Statist. 11, 654-665.
Frydenberg, M. (1990). The chain graph Markov property. Scand. J.
Statist. 17, 333-353.
Frydenberg, M. and Lauritzen, S.L. (1989). Decomposition of maximum
likelihood in mixed graphical interaction models. Biometrika 76,
539-555.
Giri, N. (1979). Locally minimax test for multiple correlations. Canad.
J. Statist. 7, 53-60.
Grätzer, G. (1978). General Lattice Theory. Birkhäuser, Basel.
Kiiveri, H., Speed, T.P., and Carlin, J.B. (1984). Recursive causal
models. J. Austral. Math. Soc. (Ser. A) 36, 30-52.
Lauritzen, S.L. (1985). Test of hypotheses in decomposable mixed
interaction models. Bull. Int. Statist. Inst. 4, 24.3(1)-24.3(6).
Lauritzen, S.L. (1989). Mixed graphical association models. Scand. J.
Statist. 16, 273-306.
Lauritzen, S.L., Dawid, A.P., Larsen, B.N. and Leimer, H.G. (1990).
Independence properties of directed Markov fields. To appear in
Networks.
Lauritzen, S.L. and Wermuth, N. (1984). Mixed interaction models. Res.
rep. R-84-8, Inst. of Electr. Systems, Aalborg Univ.
Lauritzen, S.L. and Wermuth, N. (1989). Graphical models for association
between variables, some of which are qualitative and some quantitative.
Ann. Statist. 17, 31-57.
Little, R.J.A. and Rubin, D.B. (1987). Statistical Analysis with Missing
Data. John Wiley and Sons, New York.
Marden, J.I. (1981). Invariant tests on covariance matrices. Ann.
Statist. 9, 1258-1266.
Porteous, B.T. (1985). Properties of log-linear and covariance selection
models. Doctoral thesis, University of Cambridge.
Rubin, D.B. (1987). Multiple Imputation for Nonresponse in Sample
Surveys. John Wiley and Sons, New York.
Speed, T.P. and Kiiveri, H. (1986). Gaussian Markov distributions over
finite graphs. Ann. Statist. 14, 138-150.
Wermuth, N. (1976). Analogies between multiplicative models in
contingency tables and covariance selection. Biometrics 32, 95-108.
Wermuth, N. (1980). Linear recursive equations, covariance selection, and
path analysis. J. Amer. Statist. Assoc. 75, 963-972.
(
structures.
Wermuth, N. (1988). On block-recursive linear regression equations.
Manuscript, Psychologisches Institut, Universität Mainz.
Whittaker, J. (1990). Graphical Models in Applied Multivariate
Statistics. Wiley, New York.