Gram–Schmidt process

From Wikipedia, the free encyclopedia

Contents

1 General linear group
  1.1 General linear group of a vector space
  1.2 In terms of determinants
  1.3 As a Lie group
    1.3.1 Real case
    1.3.2 Complex case
  1.4 Over finite fields
    1.4.1 History
  1.5 Special linear group
  1.6 Other subgroups
    1.6.1 Diagonal subgroups
    1.6.2 Classical groups
  1.7 Related groups and monoids
    1.7.1 Projective linear group
    1.7.2 Affine group
    1.7.3 General semilinear group
    1.7.4 Full linear monoid
  1.8 Infinite general linear group
  1.9 See also
  1.10 Notes
  1.11 External links

2 Generalizations of Pauli matrices
  2.1 Generalized Gell-Mann matrices (Hermitian)
    2.1.1 Construction
  2.2 A non-Hermitian generalization of Pauli matrices
    2.2.1 Construction: The clock and shift matrices
  2.3 See also
  2.4 Notes

3 Generalized eigenvector
  3.1 For defective matrices
  3.2 Examples
    3.2.1 Example 1
    3.2.2 Example 2
  3.3 Other meanings of the term
  3.4 The Nullity of (A − λI)^k
    3.4.1 Introduction
    3.4.2 Existence of Eigenvalues
    3.4.3 Constructive proof of Schur's triangular form
    3.4.4 Nullity Theorem's Proof
  3.5 Motivation of the Procedure
    3.5.1 Introduction
    3.5.2 Notation
    3.5.3 Preliminary Observations
    3.5.4 Recursive Procedure
    3.5.5 Generalized Eigenspace Decomposition
    3.5.6 Powers of a Matrix
    3.5.7 Chains of generalized eigenvectors
    3.5.8 Ordinary linear difference equations
  3.6 Notes
  3.7 References

4 Generalized singular value decomposition
  4.1 Higher order version
  4.2 Weighted version
  4.3 Applications
  4.4 See also
  4.5 References

5 Gershgorin circle theorem
  5.1 Statement and proof
  5.2 Discussion
  5.3 Strengthening of the theorem
  5.4 Application
  5.5 Example
  5.6 See also
  5.7 References
  5.8 External links

6 Golden–Thompson inequality
  6.1 References
  6.2 External links

7 Graded (mathematics)

8 Gram–Schmidt process
  8.1 The Gram–Schmidt process
  8.2 Example
  8.3 Numerical stability
  8.4 Algorithm
  8.5 Determinant formula
  8.6 Alternatives
  8.7 References
  8.8 External links
  8.9 Text and image sources, contributors, and licenses
    8.9.1 Text
    8.9.2 Images
    8.9.3 Content license

Chapter 1

    General linear group

    For other uses of GLN, see GLN.

In mathematics, the general linear group of degree n is the set of n×n invertible matrices, together with the operation of ordinary matrix multiplication. This forms a group, because the product of two invertible matrices is again invertible, and the inverse of an invertible matrix is invertible. The group is so named because the columns of an invertible matrix are linearly independent, hence the vectors/points they define are in general linear position, and matrices in the general linear group take points in general linear position to points in general linear position.

To be more precise, it is necessary to specify what kind of objects may appear in the entries of the matrix. For example, the general linear group over R (the set of real numbers) is the group of n×n invertible matrices of real numbers, and is denoted by GLn(R) or GL(n, R).

More generally, the general linear group of degree n over any field F (such as the complex numbers), or a ring R (such as the ring of integers), is the set of n×n invertible matrices with entries from F (or R), again with matrix multiplication as the group operation.[1] Typical notation is GLn(F) or GL(n, F), or simply GL(n) if the field is understood.

More generally still, the general linear group of a vector space GL(V) is the abstract automorphism group, not necessarily written as matrices.

The special linear group, written SL(n, F) or SLn(F), is the subgroup of GL(n, F) consisting of matrices with a determinant of 1.

The group GL(n, F) and its subgroups are often called linear groups or matrix groups (the abstract group GL(V) is a linear group but not a matrix group). These groups are important in the theory of group representations, and also arise in the study of spatial symmetries and symmetries of vector spaces in general, as well as the study of polynomials. The modular group may be realised as a quotient of the special linear group SL(2, Z).

If n ≥ 2, then the group GL(n, F) is not abelian.

1.1 General linear group of a vector space

If V is a vector space over the field F, the general linear group of V, written GL(V) or Aut(V), is the group of all automorphisms of V, i.e. the set of all bijective linear transformations V → V, together with functional composition as group operation. If V has finite dimension n, then GL(V) and GL(n, F) are isomorphic. The isomorphism is not canonical; it depends on a choice of basis in V. Given a basis (e1, ..., en) of V and an automorphism T in GL(V), we have

$T e_k = \sum_{j=1}^{n} a_{jk}\, e_j$

for some constants a_jk in F; the matrix corresponding to T is then just the matrix with entries given by the a_jk.

In a similar way, for a commutative ring R the group GL(n, R) may be interpreted as the group of automorphisms of a free R-module M of rank n. One can also define GL(M) for any R-module, but in general this is not isomorphic to GL(n, R) (for any n).

1.2 In terms of determinants

Over a field F, a matrix is invertible if and only if its determinant is nonzero. Therefore an alternative definition of GL(n, F) is as the group of matrices with nonzero determinant.

Over a commutative ring R, one must be slightly more careful: a matrix over R is invertible if and only if its determinant is a unit in R, that is, if its determinant is invertible in R. Therefore GL(n, R) may be defined as the group of matrices whose determinants are units.

Over a non-commutative ring R, determinants are not at all well behaved. In this case, GL(n, R) may be defined as the unit group of the matrix ring M(n, R).
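To make the unit-determinant criterion concrete, here is a minimal sketch (Python with the sympy library; the matrices are illustrative choices, not taken from the text). An integer matrix lies in GL(2, Z) exactly when its determinant is a unit of Z, i.e. ±1:

    from sympy import Matrix

    A = Matrix([[2, 1],
                [1, 1]])   # det = 1, a unit in Z, so A is in GL(2, Z)
    B = Matrix([[2, 0],
                [0, 1]])   # det = 2: nonzero, hence invertible over Q,
                           # but 2 is not a unit in Z

    print(A.det(), A.inv())   # 1, and the inverse [[1, -1], [-1, 2]] is integral
    print(B.det(), B.inv())   # 2, and the inverse [[1/2, 0], [0, 1]] is not
                              # integral, so B is in GL(2, Q) but not GL(2, Z)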

1.3 As a Lie group

1.3.1 Real case

The general linear group GL(n, R) over the field of real numbers is a real Lie group of dimension n^2. To see this, note that the set of all n×n real matrices, M_n(R), forms a real vector space of dimension n^2. The subset GL(n, R) consists of those matrices whose determinant is non-zero. The determinant is a polynomial map, and hence GL(n, R) is an open affine subvariety of M_n(R) (a non-empty open subset of M_n(R) in the Zariski topology), and therefore[2] a smooth manifold of the same dimension.

The Lie algebra of GL(n, R), denoted $\mathfrak{gl}_n$, consists of all n×n real matrices with the commutator serving as the Lie bracket.

As a manifold, GL(n, R) is not connected but rather has two connected components: the matrices with positive determinant and the ones with negative determinant. The identity component, denoted by GL+(n, R), consists of the real n×n matrices with positive determinant. This is also a Lie group of dimension n^2; it has the same Lie algebra as GL(n, R).

The group GL(n, R) is also noncompact. The[3] maximal compact subgroup of GL(n, R) is the orthogonal group O(n), while the maximal compact subgroup of GL+(n, R) is the special orthogonal group SO(n). As for SO(n), the group GL+(n, R) is not simply connected (except when n = 1), but rather has a fundamental group isomorphic to Z for n = 2 or Z_2 for n > 2.

1.3.2 Complex case

The general linear group GL(n, C) over the field of complex numbers is a complex Lie group of complex dimension n^2. As a real Lie group it has dimension 2n^2. The set of all real matrices forms a real Lie subgroup. These correspond to the inclusions

GL(n, R) < GL(n, C) < GL(2n, R),

which have real dimensions n^2, 2n^2, and 4n^2 = (2n)^2. Complex n-dimensional matrices can be characterized as real 2n-dimensional matrices that preserve a linear complex structure; concretely, that commute with a matrix J such that J^2 = −I, where J corresponds to multiplying by the imaginary unit i.

The Lie algebra corresponding to GL(n, C) consists of all n×n complex matrices with the commutator serving as the Lie bracket.

Unlike the real case, GL(n, C) is connected. This follows, in part, since the multiplicative group of complex numbers C* is connected. The group manifold GL(n, C) is not compact; rather its maximal compact subgroup is the unitary group U(n). As for U(n), the group manifold GL(n, C) is not simply connected but has a fundamental group isomorphic to Z.


1.4 Over finite fields

[Figure: Cayley table of GL(2, 2), which is isomorphic to S3.]

If F is a finite field with q elements, then we sometimes write GL(n, q) instead of GL(n, F). When p is prime, GL(n, p) is the outer automorphism group of the group Z_p^n, and also the automorphism group, because Z_p^n is abelian, so the inner automorphism group is trivial.

The order of GL(n, q) is:

$(q^n - 1)(q^n - q)(q^n - q^2)\cdots(q^n - q^{n-1})$

This can be shown by counting the possible columns of the matrix: the first column can be anything but the zero vector; the second column can be anything but the multiples of the first column; and in general, the kth column can be any vector not in the linear span of the first k − 1 columns. In q-analog notation, this is $[n]_q!\,(q-1)^n\,q^{\binom{n}{2}}$.

For example, GL(3, 2) has order (8 − 1)(8 − 2)(8 − 4) = 168. It is the automorphism group of the Fano plane and of the group Z_2^3, and is also known as PSL(2, 7).
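The column-counting argument translates directly into a one-loop computation; a minimal sketch in Python (the function name order_gl is illustrative):

    def order_gl(n, q):
        """Order of GL(n, q): the k-th column may be any vector outside the
        span of the previous k - 1 columns, giving q^n - q^(k-1) choices."""
        order = 1
        for k in range(n):
            order *= q**n - q**k
        return order

    print(order_gl(3, 2))   # 168, the order of GL(3, 2) = PSL(2, 7)
    print(order_gl(2, 2))   # 6, matching GL(2, 2) isomorphic to S3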


More generally, one can count points of the Grassmannian over F: in other words, the number of subspaces of a given dimension k. This requires only finding the order of the stabilizer subgroup of one such subspace and dividing into the formula just given, by the orbit-stabilizer theorem.

These formulas are connected to the Schubert decomposition of the Grassmannian, and are q-analogs of the Betti numbers of complex Grassmannians. This was one of the clues leading to the Weil conjectures.

Note that in the limit q → 1 the order of GL(n, q) goes to 0, but under the correct procedure (dividing by (q − 1)^n) we see that it is the order of the symmetric group (see Lorscheid's article): in the philosophy of the field with one element, one thus interprets the symmetric group as the general linear group over the field with one element: S_n ≅ GL(n, 1).

1.4.1 History

The general linear group over a prime field, GL(ν, p), was constructed and its order computed by Évariste Galois in 1832, in his last letter (to Chevalier) and second (of three) attached manuscripts, which he used in the context of studying the Galois group of the general equation of order p^ν.[4]

1.5 Special linear group

Main article: Special linear group

The special linear group, SL(n, F), is the group of all matrices with determinant 1. They are special in that they lie on a subvariety; they satisfy a polynomial equation (as the determinant is a polynomial in the entries). Matrices of this type form a group as the determinant of the product of two matrices is the product of the determinants of each matrix. SL(n, F) is a normal subgroup of GL(n, F).

If we write F× for the multiplicative group of F (excluding 0), then the determinant is a group homomorphism

det: GL(n, F) → F×

that is surjective and its kernel is the special linear group. Therefore, by the first isomorphism theorem, GL(n, F)/SL(n, F) is isomorphic to F×. In fact, GL(n, F) can be written as a semidirect product:

GL(n, F) = SL(n, F) ⋊ F×

When F is R or C, SL(n, F) is a Lie subgroup of GL(n, F) of dimension n^2 − 1. The Lie algebra of SL(n, F) consists of all n×n matrices over F with vanishing trace. The Lie bracket is given by the commutator.

The special linear group SL(n, R) can be characterized as the group of volume and orientation preserving linear transformations of R^n.

The group SL(n, C) is simply connected, while SL(n, R) is not. SL(n, R) has the same fundamental group as GL+(n, R), that is, Z for n = 2 and Z_2 for n > 2.

1.6 Other subgroups

1.6.1 Diagonal subgroups

The set of all invertible diagonal matrices forms a subgroup of GL(n, F) isomorphic to (F×)^n. In fields like R and C, these correspond to rescaling the space; the so-called dilations and contractions.

A scalar matrix is a diagonal matrix which is a constant times the identity matrix. The set of all nonzero scalar matrices forms a subgroup of GL(n, F) isomorphic to F×. This group is the center of GL(n, F). In particular, it is a normal, abelian subgroup.

The center of SL(n, F) is simply the set of all scalar matrices with unit determinant, and is isomorphic to the group of nth roots of unity in the field F.


1.6.2 Classical groups

The so-called classical groups are subgroups of GL(V) which preserve some sort of bilinear form on a vector space V. These include the

 • orthogonal group, O(V), which preserves a non-degenerate quadratic form on V,
 • symplectic group, Sp(V), which preserves a symplectic form on V (a non-degenerate alternating form),
 • unitary group, U(V), which, when F = C, preserves a non-degenerate hermitian form on V.

These groups provide important examples of Lie groups.

    1.7 Related groups and monoids

1.7.1 Projective linear group

Main article: Projective linear group

The projective linear group PGL(n, F) and the projective special linear group PSL(n, F) are the quotients of GL(n, F) and SL(n, F) by their centers (which consist of the multiples of the identity matrix therein); they are the induced action on the associated projective space.

1.7.2 Affine group

Main article: Affine group

The affine group Aff(n, F) is an extension of GL(n, F) by the group of translations in F^n. It can be written as a semidirect product:

Aff(n, F) = GL(n, F) ⋉ F^n

where GL(n, F) acts on F^n in the natural manner. The affine group can be viewed as the group of all affine transformations of the affine space underlying the vector space F^n.

One has analogous constructions for other subgroups of the general linear group: for instance, the special affine group is the subgroup defined by the semidirect product, SL(n, F) ⋉ F^n, and the Poincaré group is the affine group associated to the Lorentz group, O(1, 3, F) ⋉ F^n.

1.7.3 General semilinear group

Main article: General semilinear group

The general semilinear group ΓL(n, F) is the group of all invertible semilinear transformations, and contains GL. A semilinear transformation is a transformation which is linear "up to a twist", meaning "up to a field automorphism under scalar multiplication". It can be written as a semidirect product:

ΓL(n, F) = Gal(F) ⋉ GL(n, F)

where Gal(F) is the Galois group of F (over its prime field), which acts on GL(n, F) by the Galois action on the entries.

The main interest of ΓL(n, F) is that the associated projective semilinear group PΓL(n, F) (which contains PGL(n, F)) is the collineation group of projective space, for n > 2, and thus semilinear maps are of interest in projective geometry.


    1.7.4 Full linear monoid

    If one removes the restriction of the determinant being non-zero, the resulting algebraic structure is a monoid, usuallycalled the full linear monoid,[5][6][7] but occasionally also full linear semigroup,[8] general linear monoid[9][10] etc. Itis actually a regular semigroup.[6]

1.8 Infinite general linear group

The infinite general linear group or stable general linear group is the direct limit of the inclusions GL(n, F) → GL(n + 1, F) as the upper left block matrix. It is denoted by either GL(F) or GL(∞, F), and can also be interpreted as invertible infinite matrices which differ from the identity matrix in only finitely many places.[11]

It is used in algebraic K-theory to define K1, and over the reals has a well-understood topology, thanks to Bott periodicity.

It should not be confused with the space of (bounded) invertible operators on a Hilbert space, which is a larger group, and topologically much simpler, namely contractible; see Kuiper's theorem.

1.9 See also

 • List of finite simple groups
 • SL2(R)
 • Representation theory of SL2(R)

1.10 Notes

[1] Here rings are assumed to be associative and unital.

[2] Since the Zariski topology is coarser than the metric topology; equivalently, polynomial maps are continuous.

[3] A maximal compact subgroup is not unique, but is essentially unique, hence one often refers to "the" maximal compact subgroup.

[4] Galois, Évariste (1846). "Lettre de Galois à M. Auguste Chevalier". Journal de Mathématiques Pures et Appliquées XI: 408–415. Retrieved 2009-02-04. GL(ν, p) discussed on p. 410.

[5] Jan Okniński (1998). Semigroups of Matrices. World Scientific. Chapter 2: Full linear monoid. ISBN 978-981-02-3445-4.

[6] Meakin (2007). "Groups and Semigroups: Connections and contrast". In C. M. Campbell. Groups St Andrews 2005. Cambridge University Press. p. 471. ISBN 978-0-521-69470-4.

[7] John Rhodes; Benjamin Steinberg (2009). The q-theory of Finite Semigroups. Springer Science & Business Media. p. 306. ISBN 978-0-387-09781-7.

[8] Eric Jespers; Jan Okniński (2007). Noetherian Semigroup Algebras. Springer Science & Business Media. 2.3: Full linear semigroup. ISBN 978-1-4020-5810-3.

[9] Meinolf Geck (2013). An Introduction to Algebraic Geometry and Algebraic Groups. Oxford University Press. p. 132. ISBN 978-0-19-967616-3.

[10] Mahir Bilen Can; Zhenheng Li; Benjamin Steinberg; Qiang Wang (2014). Algebraic Monoids, Group Embeddings, and Algebraic Combinatorics. Springer. p. 142. ISBN 978-1-4939-0938-4.

[11] Milnor, John Willard (1971). Introduction to algebraic K-theory. Annals of Mathematics Studies 72. Princeton, NJ: Princeton University Press. p. 25. MR 0349811. Zbl 0237.18005.


1.11 External links

 • Hazewinkel, Michiel, ed. (2001), "General linear group", Encyclopedia of Mathematics, Springer, ISBN 978-1-55608-010-4
 • "GL(2, p) and GL(3, 3) Acting on Points" by Ed Pegg, Jr., Wolfram Demonstrations Project, 2007.

Chapter 2

    Generalizations of Pauli matrices

In mathematics and physics, in particular quantum information, the term generalized Pauli matrices refers to families of matrices which generalize the (linear algebraic) properties of the Pauli matrices. Here, a few classes of such matrices are summarized.

    2.1 Generalized Gell-Mann matrices (Hermitian)

2.1.1 Construction

Let E_jk be the matrix with 1 in the jk-th entry and 0 elsewhere. Consider the space of d×d complex matrices, for a fixed d.

Define the following matrices:

 • For k < j: $f_{k,j}^{\,d} = E_{kj} + E_{jk}$.
 • For k > j: $f_{k,j}^{\,d} = -i\,(E_{jk} - E_{kj})$.
 • Let $h_1^d = I_d$, the identity matrix.
 • For 1 < k < d: $h_k^d = h_k^{d-1} \oplus 0$.
 • For k = d: $h_d^d = \sqrt{\tfrac{2}{d(d-1)}}\,\bigl(h_1^{d-1} \oplus (1-d)\bigr)$.

The collection of matrices defined above without the identity matrix are called the generalized Gell-Mann matrices, in dimension d.[1] The symbol ⊕ (utilized in the Cartan subalgebra above) means matrix direct sum.

The generalized Gell-Mann matrices are Hermitian and traceless by construction, just like the Pauli matrices. One can also check that they are orthogonal in the Hilbert–Schmidt inner product on the d×d complex matrices. By dimension count, one sees that they span the vector space of d×d complex matrices, gl(d, C). They then provide a Lie-algebra-generator basis acting on the fundamental representation of su(d).

In dimensions d = 2 and 3, the above construction recovers the Pauli and Gell-Mann matrices, respectively.
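A minimal sketch of this construction (Python with numpy; the helper names E and gell_mann are illustrative, not from any library), checking that d = 2 recovers the Pauli matrices and that every matrix is Hermitian and traceless:

    import numpy as np

    def E(j, k, d):
        m = np.zeros((d, d), dtype=complex)
        m[j, k] = 1
        return m

    def gell_mann(d):
        """Generalized Gell-Mann matrices in dimension d (identity excluded)."""
        mats = []
        for j in range(d):
            for k in range(j):
                mats.append(E(k, j, d) + E(j, k, d))          # symmetric pair
                mats.append(-1j * (E(k, j, d) - E(j, k, d)))  # antisymmetric pair
        for k in range(1, d):   # diagonal (Cartan) matrices
            h = np.zeros((d, d), dtype=complex)
            h[:k, :k] = np.eye(k)
            h[k, k] = -k
            mats.append(np.sqrt(2.0 / (k * (k + 1))) * h)
        return mats

    for m in gell_mann(2):                        # sigma_1, sigma_2, sigma_3
        print(np.round(m, 3))
        assert abs(np.trace(m)) < 1e-12           # traceless
        assert np.allclose(m, m.conj().T)         # Hermitian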

2.2 A non-Hermitian generalization of Pauli matrices

The Pauli matrices σ1 and σ3 satisfy the following:

$\sigma_1^2 = \sigma_3^2 = I, \qquad \sigma_1\sigma_3 = -\sigma_3\sigma_1 = e^{i\pi}\,\sigma_3\sigma_1.$

The so-called Walsh–Hadamard conjugation matrix is

$W = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.$

Like the Pauli matrices, W is both Hermitian and unitary. σ1, σ3 and W satisfy the relation

$\sigma_1 = W\,\sigma_3\,W^\dagger.$

The goal now is to extend the above to higher dimensions, d, a problem solved by J. J. Sylvester (1882).

2.2.1 Construction: The clock and shift matrices

Fix the dimension d as before. Let ω = exp(2πi/d), a root of unity. Since ω^d = 1 and ω ≠ 1, the sum of all roots annuls:

$1 + \omega + \cdots + \omega^{d-1} = 0.$

Integer indices may then be cyclically identified mod d.

Now define, with Sylvester, the shift matrix[2]

$\Sigma_1 = \begin{pmatrix} 0 & 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & 0 \end{pmatrix}$

and the clock matrix,

$\Sigma_3 = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & \omega & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \omega^{d-1} \end{pmatrix}.$

These matrices generalize σ1 and σ3, respectively.

Note that the unitarity and tracelessness of the two Pauli matrices is preserved, but not Hermiticity in dimensions higher than two. Since Pauli matrices describe quaternions, Sylvester dubbed the higher-dimensional analogs "nonions", "sedenions", etc.

These two matrices are also the cornerstone of quantum mechanical dynamics in finite-dimensional vector spaces[3][4][5] as formulated by Hermann Weyl, and find routine applications in numerous areas of mathematical physics.[6] The clock matrix amounts to the exponential of position in a "clock" of d hours, and the shift matrix is just the translation operator in that cyclic vector space, so the exponential of the momentum. They are (finite-dimensional) representations of the corresponding elements of the Heisenberg group on a d-dimensional Hilbert space.

The following relations echo those of the Pauli matrices:

$\Sigma_1^d = \Sigma_3^d = I$

and the braiding relation,

$\Sigma_3\Sigma_1 = \omega\,\Sigma_1\Sigma_3 = e^{2\pi i/d}\,\Sigma_1\Sigma_3,$

the Weyl formulation of the CCR, or

$\Sigma_3\Sigma_1\Sigma_3^{d-1}\Sigma_1^{d-1} = \omega.$
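A quick numeric sketch (Python with numpy) of the clock and shift matrices, verifying Σ1^d = Σ3^d = I and the braiding relation for the illustrative choice d = 5:

    import numpy as np

    d = 5
    omega = np.exp(2j * np.pi / d)

    # shift matrix Sigma_1 (cyclic permutation) and clock matrix Sigma_3
    sigma1 = np.roll(np.eye(d), 1, axis=0)   # 1s on the subdiagonal, corner 1
    sigma3 = np.diag(omega ** np.arange(d))

    I = np.eye(d)
    assert np.allclose(np.linalg.matrix_power(sigma1, d), I)   # Sigma_1^d = I
    assert np.allclose(np.linalg.matrix_power(sigma3, d), I)   # Sigma_3^d = I
    # braiding relation: Sigma_3 Sigma_1 = omega Sigma_1 Sigma_3
    assert np.allclose(sigma3 @ sigma1, omega * (sigma1 @ sigma3))
    print("clock/shift relations verified for d =", d)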

On the other hand, to generalize the Walsh–Hadamard matrix W, note

$W = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & \omega^{2-1} \end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & \omega^{d-1} \end{pmatrix}.$

Define, again with Sylvester, the following analog matrix,[7] still denoted by W in a slight abuse of notation,

$W = \frac{1}{\sqrt{d}}\begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & \omega^{d-1} & \omega^{2(d-1)} & \cdots & \omega^{(d-1)^2} \\ 1 & \omega^{d-2} & \omega^{2(d-2)} & \cdots & \omega^{(d-1)(d-2)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & \omega & \omega^2 & \cdots & \omega^{d-1} \end{pmatrix}.$

It is evident that W is no longer Hermitian, but is still unitary. Direct calculation yields

$\Sigma_1 = W\,\Sigma_3\,W^\dagger,$

which is the desired analog result. Thus, W, a Vandermonde matrix, arrays the eigenvectors of Σ1, which has the same eigenvalues as Σ3.

When d = 2^k, W* is precisely the matrix of the discrete Fourier transform, converting position coordinates to momentum coordinates and vice versa.

The family of d^2 unitary (but non-Hermitian) independent matrices

$\Sigma_1^{k}\,\Sigma_3^{j}, \qquad k, j = 0, 1, \dots, d-1,$

provides Sylvester's well-known basis for gl(d, C), known as "nonions" gl(3, C), "sedenions" gl(4, C), etc...[8]

This basis can be systematically connected to the above Hermitian basis.[9] (For instance, the powers of Σ3, the Cartan subalgebra, map to linear combinations of the h_k^d's.) It can further be used to identify gl(d, C), as d → ∞, with the algebra of Poisson brackets.
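The conjugation relation can be checked numerically; a sketch (Python with numpy) that builds W row by row exactly as displayed above, for the illustrative choice d = 4:

    import numpy as np

    d = 4
    omega = np.exp(2j * np.pi / d)
    sigma1 = np.roll(np.eye(d), 1, axis=0)
    sigma3 = np.diag(omega ** np.arange(d))

    # rows of W carry exponents 0, d-1, d-2, ..., 1, as in the display above
    rows = np.concatenate(([0], np.arange(d - 1, 0, -1)))
    W = omega ** np.outer(rows, np.arange(d)) / np.sqrt(d)

    assert np.allclose(W @ W.conj().T, np.eye(d))         # W is unitary
    assert np.allclose(sigma1, W @ sigma3 @ W.conj().T)   # Sigma_1 = W Sigma_3 W†
    print("W arrays the eigenvectors of the shift matrix for d =", d)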

2.3 See also

 • Hermitian matrix
 • Bloch sphere
 • Discrete Fourier transform
 • Generalized Clifford algebra
 • Weyl–Brauer matrices
 • Circulant matrix
 • Shift operator


2.4 Notes

[1] Kimura, G. (2003). "The Bloch vector for N-level systems". Physics Letters A 314 (5–6): 339. doi:10.1016/S0375-9601(03)00941-1; Bertlmann, Reinhold A.; Philipp Krammer (2008-06-13). "Bloch vectors for qudits". Journal of Physics A: Mathematical and Theoretical 41 (23): 235303. Bibcode:2008JPhA...41w5303B. doi:10.1088/1751-8113/41/23/235303. ISSN 1751-8121. Retrieved 2014-04-10.

[2] Sylvester, J. J. (1882), Johns Hopkins University Circulars I: 241–242; ibid II (1883) 46; ibid III (1884) 79. Summarized in The Collected Mathematics Papers of James Joseph Sylvester (Cambridge University Press, 1909) v III. online and further.

[3] Weyl, H., "Quantenmechanik und Gruppentheorie", Zeitschrift für Physik, 46 (1927) pp. 1–46, doi:10.1007/BF02055756.

[4] Weyl, H., The Theory of Groups and Quantum Mechanics (Dover, New York, 1931)

[5] Santhanam, T. S.; Tekumalla, A. R. (1976). "Quantum mechanics in finite dimensions". Foundations of Physics 6 (5): 583. doi:10.1007/BF00715110.

[6] For a serviceable review, see Vourdas A. (2004), "Quantum systems with finite Hilbert space", Rep. Prog. Phys. 67 267. doi:10.1088/0034-4885/67/3/R03.

[7] Sylvester, J. J. (1867). "Thoughts on inverse orthogonal matrices, simultaneous sign successions, and tessellated pavements in two or more colours, with applications to Newton's rule, ornamental tile-work, and the theory of numbers". Philosophical Magazine, 34: 461–475. online

[8] Patera, J.; Zassenhaus, H. (1988). "The Pauli matrices in n dimensions and finest gradings of simple Lie algebras of type A_{n−1}". Journal of Mathematical Physics 29 (3): 665. Bibcode:1988JMP....29..665P. doi:10.1063/1.528006.

[9] Fairlie, D. B.; Fletcher, P.; Zachos, C. K. (1990). "Infinite-dimensional algebras and a trigonometric basis for the classical Lie algebras". Journal of Mathematical Physics 31 (5): 1088. Bibcode:1990JMP....31.1088F. doi:10.1063/1.528788.

Chapter 3

    Generalized eigenvector

In linear algebra, for a matrix A, there may not always exist a full set of linearly independent eigenvectors that form a complete basis. That is, a matrix may not be diagonalizable.[1][2] This happens when the algebraic multiplicity of at least one eigenvalue λ is greater than its geometric multiplicity (the nullity of the matrix (A − λI), or the dimension of its nullspace).[3] In such cases, a generalized eigenvector of rank k corresponding to the matrix A is a nonzero vector v, which is associated with λ having algebraic multiplicity k ≥ 1, if

$(A - \lambda I)^k v = 0$

but

$(A - \lambda I)^{k-1} v \neq 0.$ [4]

The set spanned by all generalized eigenvectors for a given λ forms the generalized eigenspace for λ.[5]

Ordinary eigenvectors and eigenspaces are obtained for k = 1.[6]

3.1 For defective matrices

Generalized eigenvectors are needed to form a complete basis of a defective matrix, which is a square matrix which does not have a complete basis consisting entirely of ordinary eigenvectors.[7] The generalized eigenvectors, however, do allow choosing a complete basis, as follows from the Jordan normal form of a matrix.[8]

In particular, suppose that an eigenvalue λ of a matrix A has an algebraic multiplicity m but fewer corresponding eigenvectors. We form a sequence of m eigenvectors and generalized eigenvectors x_1, x_2, ..., x_m that are linearly independent and satisfy

$(A - \lambda I)\,x_k = \alpha_{k,1}x_1 + \cdots + \alpha_{k,k-1}x_{k-1}$

for some coefficients α_{k,1}, ..., α_{k,k−1}, for k = 1, ..., m. It follows that

$(A - \lambda I)^k x_k = 0.$

The vectors x_1, x_2, ..., x_m can always be chosen, but are not uniquely determined by the above relations. If the geometric multiplicity (dimension of the eigenspace) of λ is p, one can choose the first p vectors to be eigenvectors, but the remaining m − p vectors are only generalized eigenvectors.[9]

3.2 Examples

Here are some examples to illustrate the concept of generalized eigenvectors. Some of the details will be described later.

3.2.1 Example 1

This example is simple but clearly illustrates the point. This type of matrix is used frequently in textbooks.[10][11][12] Suppose

$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$

Then there is only one eigenvalue, λ = 1, and its algebraic multiplicity is m = 2.

There are several ways to see that there will be only one generalized eigenvector. The easiest is to notice that this matrix is in Jordan normal form but is not diagonal. Hence, this matrix is not diagonalizable. Since there is one superdiagonal entry, there will be one generalized eigenvector (or one could note that the vector space V is of dimension 2, so there can be only one generalized eigenvector). Alternatively, one could compute the dimension of the nullspace of A − λI to be p = 1, and thus there are m − p = 1 generalized eigenvectors. (See the nullspace page.)

Computing the ordinary eigenvector $v_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ is left to the reader. (See the eigenvector page for examples.) Using this eigenvector, we compute the generalized eigenvector v_2 by solving

$(A - \lambda I)\,v_2 = v_1.$

Writing out the values:

$\left(\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} - 1\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\right)\begin{pmatrix} v_{21} \\ v_{22} \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} v_{21} \\ v_{22} \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.$

This simplifies to

$v_{22} = 1.$

The element v_{21} has no restrictions. The generalized eigenvector is then $v_2 = \begin{pmatrix} a \\ 1 \end{pmatrix}$, where a can have any scalar value. The choice of a = 0 is usually the simplest.

Note that

$(A - \lambda I)\,v_2 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} a \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} = v_1,$

so that v_2 is a generalized eigenvector,

$(A - \lambda I)\,v_1 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} = 0,$

so that v_1 is an ordinary eigenvector, and that v_1 and v_2 are linearly independent and hence constitute a basis for the vector space V.
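A quick numeric check of Example 1 (Python with numpy), using the choice a = 0:

    import numpy as np

    A = np.array([[1., 1.],
                  [0., 1.]])
    lam = 1.0
    v1 = np.array([1., 0.])    # ordinary eigenvector
    v2 = np.array([0., 1.])    # generalized eigenvector (choice a = 0)

    N = A - lam * np.eye(2)
    assert np.allclose(N @ v1, 0)      # (A - I)v1 = 0
    assert np.allclose(N @ v2, v1)     # (A - I)v2 = v1
    assert np.allclose(N @ N @ v2, 0)  # (A - I)^2 v2 = 0: rank-2 generalized eigenvector
    print("Example 1 verified")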

3.2.2 Example 2

This example is more complex than Example 1. Unfortunately, it is a little difficult to construct an interesting example of low order.[13] The matrix

$A = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 3 & 1 & 0 & 0 & 0 \\ 6 & 3 & 2 & 0 & 0 \\ 10 & 6 & 3 & 2 & 0 \\ 15 & 10 & 6 & 3 & 2 \end{pmatrix}$

has eigenvalues λ1 = 1 and λ2 = 2 with algebraic multiplicities μ1 = 2 and μ2 = 3, but geometric multiplicities γ1 = 1 and γ2 = 1.

The generalized eigenspaces of A are calculated below. x_1 is the ordinary eigenvector associated with λ1. x_2 is a generalized eigenvector associated with λ1. y_1 is the ordinary eigenvector associated with λ2. y_2 and y_3 are generalized eigenvectors associated with λ2.

$(A - 1I)\,x_1 = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 3 & 0 & 0 & 0 & 0 \\ 6 & 3 & 1 & 0 & 0 \\ 10 & 6 & 3 & 1 & 0 \\ 15 & 10 & 6 & 3 & 1 \end{pmatrix}\begin{pmatrix} 0 \\ 3 \\ -9 \\ 9 \\ -3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} = 0,$

$(A - 1I)\,x_2 = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 3 & 0 & 0 & 0 & 0 \\ 6 & 3 & 1 & 0 & 0 \\ 10 & 6 & 3 & 1 & 0 \\ 15 & 10 & 6 & 3 & 1 \end{pmatrix}\begin{pmatrix} 1 \\ -15 \\ 30 \\ -1 \\ -45 \end{pmatrix} = \begin{pmatrix} 0 \\ 3 \\ -9 \\ 9 \\ -3 \end{pmatrix} = x_1,$

$(A - 2I)\,y_1 = \begin{pmatrix} -1 & 0 & 0 & 0 & 0 \\ 3 & -1 & 0 & 0 & 0 \\ 6 & 3 & 0 & 0 & 0 \\ 10 & 6 & 3 & 0 & 0 \\ 15 & 10 & 6 & 3 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 9 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix} = 0,$

$(A - 2I)\,y_2 = \begin{pmatrix} -1 & 0 & 0 & 0 & 0 \\ 3 & -1 & 0 & 0 & 0 \\ 6 & 3 & 0 & 0 & 0 \\ 10 & 6 & 3 & 0 & 0 \\ 15 & 10 & 6 & 3 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ 0 \\ 0 \\ 3 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 9 \end{pmatrix} = y_1,$

$(A - 2I)\,y_3 = \begin{pmatrix} -1 & 0 & 0 & 0 & 0 \\ 3 & -1 & 0 & 0 & 0 \\ 6 & 3 & 0 & 0 & 0 \\ 10 & 6 & 3 & 0 & 0 \\ 15 & 10 & 6 & 3 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ 0 \\ 1 \\ -2 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 3 \\ 0 \end{pmatrix} = y_2.$

This results in a basis for each of the generalized eigenspaces of A. Together the two chains of generalized eigenvectors span the space of all 5-dimensional column vectors.

$\{x_1, x_2\} = \left\{\begin{pmatrix} 0 \\ 3 \\ -9 \\ 9 \\ -3 \end{pmatrix}, \begin{pmatrix} 1 \\ -15 \\ 30 \\ -1 \\ -45 \end{pmatrix}\right\}, \qquad \{y_1, y_2, y_3\} = \left\{\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 9 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 0 \\ 3 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ 1 \\ -2 \\ 0 \end{pmatrix}\right\}.$

An "almost diagonal" matrix J in Jordan normal form, similar to A, is obtained as follows:

$M = (x_1\ x_2\ y_1\ y_2\ y_3) = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 3 & -15 & 0 & 0 & 0 \\ -9 & 30 & 0 & 0 & 1 \\ 9 & -1 & 0 & 3 & -2 \\ -3 & -45 & 9 & 0 & 0 \end{pmatrix}, \qquad J = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 2 & 1 & 0 \\ 0 & 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 0 & 2 \end{pmatrix},$

where M is a generalized modal matrix for A, the columns of M are a canonical basis for A, and AM = MJ.[14]
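The claim AM = MJ can be verified directly; a minimal check in Python with numpy:

    import numpy as np

    A = np.array([[ 1,  0, 0, 0, 0],
                  [ 3,  1, 0, 0, 0],
                  [ 6,  3, 2, 0, 0],
                  [10,  6, 3, 2, 0],
                  [15, 10, 6, 3, 2]], dtype=float)

    x1 = [0, 3, -9, 9, -3]; x2 = [1, -15, 30, -1, -45]           # chain for lambda = 1
    y1 = [0, 0, 0, 0, 9]; y2 = [0, 0, 0, 3, 0]; y3 = [0, 0, 1, -2, 0]  # lambda = 2

    M = np.array([x1, x2, y1, y2, y3], dtype=float).T   # columns x1, x2, y1, y2, y3
    J = np.array([[1, 1, 0, 0, 0],
                  [0, 1, 0, 0, 0],
                  [0, 0, 2, 1, 0],
                  [0, 0, 0, 2, 1],
                  [0, 0, 0, 0, 2]], dtype=float)

    assert np.allclose(A @ M, M @ J)                     # AM = MJ
    assert np.allclose(np.linalg.inv(M) @ A @ M, J)      # M^-1 A M = J
    print("AM = MJ verified")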

3.3 Other meanings of the term

 • The usage of generalized eigenfunction differs from this; it is part of the theory of rigged Hilbert spaces, so that for a linear operator on a function space this may be something different.
 • One can also use the term generalized eigenvector for an eigenvector of the generalized eigenvalue problem

$A v = \lambda B v.$

3.4 The Nullity of (A − λI)^k

3.4.1 Introduction

In this section it is shown that, when λ is an eigenvalue of a matrix A with algebraic multiplicity k, then the null space of (A − λI)^k has dimension k.

3.4.2 Existence of Eigenvalues

Consider a n×n matrix A. The determinant of A has the fundamental properties of being n-linear and alternating. Additionally det(I) = 1, for I the n×n identity matrix. From the determinant's definition it can be seen that for a triangular matrix T = (t_{ij}), det(T) = ∏ t_{ii}. In other words, the determinant is the product of the diagonal entries.

There are three elementary row operations: scalar multiplication, interchange of two rows, and the addition of a scalar multiple of one row to another. Multiplication of a row of A by α results in a new matrix whose determinant is α det(A). Interchange of two rows changes the sign of the determinant, and the addition of a scalar multiple of one row to another does not affect the determinant. The following simple theorem holds, but requires a little proof.

Theorem: The equation A x = 0 has a solution x ≠ 0, if and only if det(A) = 0.

Proof: Given the equation A x = 0, attempt to solve it using the elementary row operations of addition of a scalar multiple of one row to another and row interchanges only, until an equivalent equation U x = 0 has been reached, with U an upper triangular matrix. Since det(U) = ±det(A) and det(U) = ∏ u_{ii}, we have that det(A) = 0 if and only if at least one u_{ii} = 0. The back substitution procedure as performed after Gaussian elimination will allow placing at least one non-zero element in x when there is a u_{ii} = 0. When all u_{ii} ≠ 0, back substitution will require x = 0. QED

Theorem: The equation A x = λx has a solution x ≠ 0, if and only if det(λI − A) = 0.

Proof: The equation A x = λx is equivalent to (λI − A) x = 0. QED

3.4.3 Constructive proof of Schur's triangular form

The proof of the main result of this section will rely on the similarity transformation as stated and proven next.

Theorem: (Schur Transformation to Triangular Form Theorem) For any n×n matrix A, there exists a triangular matrix T and a unitary matrix Q, such that A Q = Q T. (The transformations are not unique, but are related.)

Proof: Let λ1 be an eigenvalue of the n×n matrix A and x be an associated eigenvector, so that A x = λ1 x. Normalize the length of x so that |x| = 1. For

$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$

construct a unitary matrix

$Q = \begin{pmatrix} x_1 & q_{12} & q_{13} & \cdots & q_{1n} \\ x_2 & q_{22} & q_{23} & \cdots & q_{2n} \\ \vdots & \vdots & \vdots & & \vdots \\ x_n & q_{n2} & q_{n3} & \cdots & q_{nn} \end{pmatrix}.$

Q should have x as its first column and have its columns an orthonormal basis for C^n. Now, A Q = Q U_1, with U_1 of the form:

$U_1 = \begin{pmatrix} \lambda_1 & u_{12} & \cdots & u_{1n} \\ 0 & & & \\ \vdots & & U_0 & \\ 0 & & & \end{pmatrix}.$

Let the induction hypothesis be that the theorem holds for all (n−1)×(n−1) matrices. From the construction, so far, it holds for n = 2. Choose a unitary Q_0, so that U_0 Q_0 = Q_0 U_2, with U_2 of the upper triangular form. Define Q_1 by:

$Q_1 = \begin{pmatrix} 1 & 0 \\ 0 & Q_0 \end{pmatrix}.$

Now:

$U_1 Q_1 = \begin{pmatrix} \lambda_1 & (u_{12}, \dots, u_{1n})\,Q_0 \\ 0 & U_0 Q_0 \end{pmatrix} = \begin{pmatrix} \lambda_1 & (u_{12}, \dots, u_{1n})\,Q_0 \\ 0 & Q_0 U_2 \end{pmatrix} = Q_1 U_3.$

Summarizing,

$U_1 Q_1 = Q_1 U_3$

with:

$U_3 = \begin{pmatrix} \lambda_1 & z_{12} & z_{13} & \cdots & z_{1n} \\ 0 & \lambda_2 & z_{23} & \cdots & z_{2n} \\ 0 & 0 & \lambda_3 & \cdots & z_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & \lambda_n \end{pmatrix}.$

Now, A Q = Q U_1 and U_1 Q_1 = Q_1 U_3, where Q and Q_1 are unitary and U_3 is upper triangular. Thus A Q Q_1 = Q Q_1 U_3. Since the product of two unitary matrices is unitary, the proof is done. QED
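Standard libraries implement this factorization; a sketch (Python with scipy, whose schur routine returns A = Q T Q* with Q unitary and T triangular) on a random illustrative matrix:

    import numpy as np
    from scipy.linalg import schur

    rng = np.random.default_rng(0)
    A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

    T, Q = schur(A, output='complex')               # T upper triangular, Q unitary
    assert np.allclose(Q @ Q.conj().T, np.eye(4))   # Q unitary
    assert np.allclose(np.tril(T, -1), 0)           # T triangular
    assert np.allclose(A @ Q, Q @ T)                # A Q = Q T
    print(np.diag(T))   # the diagonal of T carries the eigenvalues of A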

3.4.4 Nullity Theorem's Proof

Starting from A Q = Q U, we can solve for A to obtain A = Q U Q*, since Q Q* = I, where Q* denotes the conjugate transpose. Now, after subtracting xI from both sides, we find

$x I - A = Q\,(x I - U)\,Q^{*}$

and hence

$\det(x I - A) = \det(x I - U).$

So, the characteristic polynomial of A is the same as that for U and is given by

$p(x) = (x - \lambda_1)(x - \lambda_2)\cdots(x - \lambda_n),$

where the λi's are the eigenvalues of A and U.

Observe, the construction used in the proof above allows choosing any order for the eigenvalues of A that will end up as the diagonal elements of the upper triangular matrix U obtained. The algebraic multiplicity of an eigenvalue is the count of the number of times it occurs on the diagonal.

Now, it can be supposed for a given eigenvalue λ, of algebraic multiplicity k, that U has been contrived so that λ occurs as the first k diagonal elements. Place U − λI in block form as below:

$U - \lambda I = \begin{pmatrix} B & C \\ 0 & T \end{pmatrix}.$

The lower left block has only elements of zero, and λ_i − λ ≠ 0 for i = k+1, ..., n. Here B is the k×k subtriangular matrix, with all elements on or below the diagonal equal to 0, and T is the (n−k)×(n−k) upper triangular matrix, taken from the blocks of (U − λI). It is easy to verify the following:

$(U - \lambda I)^k = \begin{pmatrix} B^k & * \\ 0 & T^k \end{pmatrix}, \qquad B^k = 0.$

That is, B^k has only elements of 0, and T^k is triangular with all non-zero diagonal elements. Observe that if a column vector v = [v_1, v_2, ..., v_k]^T is multiplied by B, then after the first multiplication the last, kth, component is zero. After the second multiplication the second to last, (k−1)th, component is zero also, and so on.

The conclusion that (U − λI)^k has rank n−k and nullity k follows. It is only left to observe, since (A − λI)^k = Q (U − λI)^k Q*, that (A − λI)^k has rank n−k and nullity k, as well. A unitary, or any other similarity transformation by a non-singular matrix preserves rank.

The main result is now proven.

Theorem: If λ is an eigenvalue of a matrix A with algebraic multiplicity k, then the null space of (A − λI)^k has dimension k.

An important observation is that raising the power of (A − λI) above k will not affect the rank and nullity any further.
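A numeric illustration of the nullity theorem (Python with numpy), reusing the 5×5 matrix of Example 2, where λ = 2 has algebraic multiplicity k = 3:

    import numpy as np

    A = np.array([[ 1,  0, 0, 0, 0],
                  [ 3,  1, 0, 0, 0],
                  [ 6,  3, 2, 0, 0],
                  [10,  6, 3, 2, 0],
                  [15, 10, 6, 3, 2]], dtype=float)

    lam = 2.0
    N = A - lam * np.eye(5)
    for m in range(1, 5):
        P = np.linalg.matrix_power(N, m)
        nullity = 5 - np.linalg.matrix_rank(P)
        print(m, nullity)   # nullity climbs 1, 2, 3 and then stays at k = 3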

3.5 Motivation of the Procedure

3.5.1 Introduction

In the section Existence of Eigenvalues it was shown that when a n×n matrix A has an eigenvalue λ, of algebraic multiplicity k, then the null space of (A − λI)^k has dimension k.

The generalized eigenspace of A, associated with λ, will be defined to be the null space of (A − λI)^k. Many authors prefer to call this the kernel of (A − λI)^k.

Notice that if a n×n matrix has eigenvalues λ1, λ2, ..., λr with algebraic multiplicities k1, k2, ..., kr, then k1 + k2 + ... + kr = n.

It will turn out that any two generalized eigenspaces of A, associated with different eigenvalues, will have a trivial intersection of {0}. From this it follows that the generalized eigenspaces of A combined span C^n, the set of all n dimensional column vectors of complex numbers.

The motivation for using a recursive procedure starting with the eigenvectors of A and solving for a basis of the generalized eigenspace of A, using the matrix (A − λI), will be expounded on.

3.5.2 Notation

Some notation is introduced to help abbreviate statements.

 • C^n is the vector space of all n dimensional column vectors of complex numbers.
 • The null space of A, N(A) = {x : A x = 0}.
 • V ⊆ W denotes V is a subset of W.
 • V ⊂ W denotes V is a proper subset of W.
 • The range of A over V is A(V) = {y : y = A x, for some x ∈ V}.
 • W \ V denotes the set {x : x ∈ W and x ∉ V}.
 • The range of A is A(C^n) and will be denoted by R(A).
 • dim(V) denotes the dimension of V.
 • {0} is the trivial subspace of C^n.

3.5.3 Preliminary Observations

Throughout this discussion it is assumed that A is a n×n matrix of complex numbers.

Since A^m x = A(A^{m−1} x), the inclusions

N(A) ⊆ N(A^2) ⊆ ... ⊆ N(A^{m−1}) ⊆ N(A^m)

are obvious. Since A^m x = A^{m−1}(A x), the inclusions

R(A) ⊇ R(A^2) ⊇ ... ⊇ R(A^{m−1}) ⊇ R(A^m)

are clear as well.

Theorem: When the more trivial case N(A^2) = N(A) does not hold, there exists k ≥ 2 such that the inclusions

N(A) ⊂ N(A^2) ⊂ ... ⊂ N(A^{k−1}) ⊂ N(A^k) = N(A^{k+1}) = ...

and

R(A) ⊃ R(A^2) ⊃ ... ⊃ R(A^{k−1}) ⊃ R(A^k) = R(A^{k+1}) = ...

are proper.

Proof: 0 ≤ dim(R(A^{m+1})) ≤ dim(R(A^m)), so eventually dim(R(A^{m+1})) = dim(R(A^m)) for some m. From the inclusion R(A^{m+1}) ⊆ R(A^m) it is seen that a basis for R(A^{m+1}) is a basis for R(A^m) as well. That is, R(A^{m+1}) = R(A^m). Since R(A^{m+1}) = A(R(A^m)), when R(A^{m+1}) = R(A^m), it will be R(A^{m+2}) = A(R(A^{m+1})) = A(R(A^m)) = R(A^{m+1}). By the rank nullity theorem, it will also be the case that dim(N(A^{m+2})) = dim(N(A^{m+1})) = dim(N(A^m)), for the same m. From the inclusions N(A^m) ⊆ N(A^{m+1}) ⊆ N(A^{m+2}), it is clear that a basis for N(A^m) is also a basis for N(A^{m+1}) and N(A^{m+2}). So N(A^{m+2}) = N(A^{m+1}) = N(A^m). Now, k is the first m for which this happens. QED

Since certain expressions will occur many times in the following, some more notation will be introduced.

$A_{\lambda,k} = (A - \lambda I)^k$

$N_{\lambda,k} = N\bigl((A - \lambda I)^k\bigr) = N(A_{\lambda,k}), \qquad R_{\lambda,k} = R\bigl((A - \lambda I)^k\bigr) = R(A_{\lambda,k})$

From the inclusions N_{λ,1} ⊂ N_{λ,2} ⊂ ... ⊂ N_{λ,k−1} ⊂ N_{λ,k} = N_{λ,k+1} = ..., it follows that

N_{λ,k} \ {0} = ∪_{m=1}^{k} (N_{λ,m} \ N_{λ,m−1}), with N_{λ,0} = {0}.

When λ is an eigenvalue of A, in the statement above, k will not exceed the algebraic multiplicity of λ, and can be less. In fact, k is only 1 when there is a full set of linearly independent eigenvectors for λ. Let's consider when k ≥ 2.

Now, x ∈ N_{λ,m} \ N_{λ,m−1}, if and only if A_{λ,m} x = 0 and A_{λ,m−1} x ≠ 0. Make the observation that A_{λ,m} x = 0 and A_{λ,m−1} x ≠ 0, if and only if A_{λ,m−1}(A_{λ,1} x) = 0 and A_{λ,m−2}(A_{λ,1} x) ≠ 0.

So, x ∈ N_{λ,m} \ N_{λ,m−1}, if and only if A_{λ,1} x ∈ N_{λ,m−1} \ N_{λ,m−2}.

3.5.4 Recursive Procedure

Consider a matrix A with an eigenvalue λ of algebraic multiplicity k ≥ 2, such that there are not k linearly independent eigenvectors associated with λ. It is desired to extend the eigenvectors to a basis for N_{λ,k}; that is, a basis for the generalized eigenvectors associated with λ.

There exists some 2 ≤ r ≤ k, such that

N_{λ,1} ⊂ N_{λ,2} ⊂ ... ⊂ N_{λ,r−1} ⊂ N_{λ,r} = N_{λ,r+1} = ...,

N_{λ,r} \ {0} = ∪_{m=1}^{r} (N_{λ,m} \ N_{λ,m−1}), with N_{λ,0} = {0}.

The eigenvectors are N_{λ,1} \ {0}, so let x_1, ..., x_{r_1} be a basis for N_{λ,1}. Note that each N_{λ,m} is a subspace and so a basis for N_{λ,m−1} can be extended to a basis for N_{λ,m}. Because of this we can expect to find some r_2 = dim(N_{λ,2}) − dim(N_{λ,1}) linearly independent vectors x_{r_1+1}, ..., x_{r_1+r_2} such that x_1, ..., x_{r_1}, x_{r_1+1}, ..., x_{r_1+r_2} is a basis for N_{λ,2}.

Now, x ∈ N_{λ,2} \ N_{λ,1}, if and only if A_{λ,1} x ∈ N_{λ,1} \ {0}. Thus we can expect that for each x ∈ {x_{r_1+1}, ..., x_{r_1+r_2}},

A_{λ,1} x = α_1 x_1 + ... + α_{r_1} x_{r_1}, for some α_1, ..., α_{r_1} depending on x.

Suppose we have reached the stage in the construction so that m−1 sets,

{x_1, ..., x_{r_1}}, {x_{r_1+1}, ..., x_{r_1+r_2}}, ..., {x_{r_1+...+r_{m−2}+1}, ..., x_{r_1+...+r_{m−1}}},

such that

x_1, ..., x_{r_1}, x_{r_1+1}, ..., x_{r_1+r_2}, ..., x_{r_1+...+r_{m−2}+1}, ..., x_{r_1+...+r_{m−1}}

is a basis for N_{λ,m−1}, have been found. We can expect to find some

r_m = dim(N_{λ,m}) − dim(N_{λ,m−1})

linearly independent vectors

x_{r_1+...+r_{m−1}+1}, ..., x_{r_1+...+r_m}

such that

x_1, ..., x_{r_1}, x_{r_1+1}, ..., x_{r_1+r_2}, ..., x_{r_1+...+r_{m−1}+1}, ..., x_{r_1+...+r_m}

is a basis for N_{λ,m}. Again, x ∈ N_{λ,m} \ N_{λ,m−1}, if and only if A_{λ,1} x ∈ N_{λ,m−1} \ N_{λ,m−2}. Thus we can expect that for each x ∈ {x_{r_1+...+r_{m−1}+1}, ..., x_{r_1+...+r_m}},

A_{λ,1} x = α_1 x_1 + ... + α_{r_1+...+r_{m−1}} x_{r_1+...+r_{m−1}},

for some α_1, ..., α_{r_1+...+r_{m−1}} depending on x. Some of the α_{r_1+...+r_{m−2}+1}, ..., α_{r_1+...+r_{m−1}} will be non-zero, since A_{λ,1} x must lie in N_{λ,m−1} \ N_{λ,m−2}.

The procedure is continued until m = r. The α's are not truly arbitrary and must be chosen, accordingly, so that sums α_1 x_1 + α_2 x_2 + ... are in the range of A_{λ,1}.
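A minimal computational sketch of this idea (Python with sympy, exact arithmetic), watching the nested null spaces N_{λ,m} grow on the matrix of Example 2 until the null index is reached:

    from sympy import Matrix, eye

    A = Matrix([[ 1,  0, 0, 0, 0],
                [ 3,  1, 0, 0, 0],
                [ 6,  3, 2, 0, 0],
                [10,  6, 3, 2, 0],
                [15, 10, 6, 3, 2]])
    lam = 2
    N = A - lam * eye(5)

    prev_dim = 0
    for m in range(1, 6):
        dim = len((N**m).nullspace())      # dim N_{lam,m}
        print(m, dim, dim - prev_dim)      # one new chain member appears per step
        if dim == prev_dim:                # null index reached: N_{lam,m} = N_{lam,m-1}
            break
        prev_dim = dim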

3.5.5 Generalized Eigenspace Decomposition

As was stated in the Introduction, if a n×n matrix has eigenvalues λ1, λ2, ..., λr with algebraic multiplicities k1, k2, ..., kr, then k1 + k2 + ... + kr = n.

When V1 and V2 are two subspaces satisfying V1 ∩ V2 = {0}, their direct sum is defined and notated by

V1 ⊕ V2 = {v1 + v2 : v1 ∈ V1 and v2 ∈ V2}.

V1 ⊕ V2 is also a subspace and dim(V1 ⊕ V2) = dim(V1) + dim(V2).

Since dim(N_{λi,ki}) = ki, for i = 1, 2, ..., r, after it is shown that N_{λi,ki} ∩ N_{λj,kj} = {0}, for i ≠ j, we have the main result.

Theorem: (Generalized Eigenspace Decomposition Theorem)

C^n = N_{λ1,k1} ⊕ N_{λ2,k2} ⊕ ... ⊕ N_{λr,kr}.

This follows easily after we prove the theorem below.

Theorem: Let λ be an eigenvalue of A and μ ≠ λ. Then A_{μ,r}(N_{λ,m} \ N_{λ,m−1}) = N_{λ,m} \ N_{λ,m−1}, for any positive integers m and r.

Proof: If x ∈ N_{λ,1} \ {0}, so that A_{λ,1} x = (A − λI)x = 0, then A x = λx and A_{μ,1} x = (A − μI)x = (λ − μ)x. So A_{μ,1} x ∈ N_{λ,1} \ {0} and A_{μ,1}((λ − μ)^{−1}x) = x. It holds A_{μ,1}(N_{λ,1} \ {0}) = N_{λ,1} \ {0}.

Now, x ∈ N_{λ,m} \ N_{λ,m−1}, if and only if A_{λ,m} x = (A − λI)A_{λ,m−1} x = 0 and A_{λ,m−1} x ≠ 0. In the case x ∈ N_{λ,m} \ N_{λ,m−1}, we have A_{λ,m−1} x ∈ N_{λ,1} \ {0}, and A_{μ,1} A_{λ,m−1} x = (λ − μ) A_{λ,m−1} x ≠ 0. The operators A_{μ,1} and A_{λ,m−1} commute. Thus A_{λ,m}(A_{μ,1} x) = 0 and A_{λ,m−1}(A_{μ,1} x) ≠ 0, which means A_{μ,1} x ∈ N_{λ,m} \ N_{λ,m−1}.

Now, let our induction hypothesis be A_{μ,1}(N_{λ,m−1} \ N_{λ,m−2}) = N_{λ,m−1} \ N_{λ,m−2}.

The relation A_{μ,1} x = (λ − μ)x + A_{λ,1} x holds. For y ∈ N_{λ,m} \ N_{λ,m−1}, let x = (λ − μ)^{−1}y + z. Then A_{μ,1} x = y + (λ − μ)^{−1}A_{λ,1} y + (λ − μ)z + A_{λ,1} z = y + (λ − μ)^{−1}A_{λ,1} y + A_{μ,1} z. Now, A_{λ,1} y ∈ N_{λ,m−1} \ N_{λ,m−2} and, by the induction hypothesis, there exists z ∈ N_{λ,m−1} \ N_{λ,m−2} that solves A_{μ,1} z = −(λ − μ)^{−1}A_{λ,1} y. It follows x ∈ N_{λ,m} \ N_{λ,m−1} and solves A_{μ,1} x = y. So A_{μ,1}(N_{λ,m} \ N_{λ,m−1}) = N_{λ,m} \ N_{λ,m−1}.

Repeatedly applying A_{μ,r} = A_{μ,1} A_{μ,r−1} finishes the proof.

In fact, from the theorem just proved, for i ≠ j, A_{λj,kj}(N_{λi,ki}) = N_{λi,ki}.

Now, suppose that N_{λi,ki} ∩ N_{λj,kj} ≠ {0}, for some i ≠ j. Choose x ∈ N_{λi,ki} ∩ N_{λj,kj}, x ≠ 0. Since x ∈ N_{λj,kj}, it follows A_{λj,kj} x = 0. Since x ∈ N_{λi,ki}, it follows A_{λj,kj} x ≠ 0, because A_{λj,kj} preserves dimension on N_{λi,ki}. So it must be N_{λi,ki} ∩ N_{λj,kj} = {0}, for i ≠ j.

This concludes the proof of the Generalized Eigenspace Decomposition Theorem.

3.5.6 Powers of a Matrix

Using generalized eigenvectors

Assume A is a n×n matrix with eigenvalues λ1, λ2, ..., λr of algebraic multiplicities k1, k2, ..., kr. For notational convenience A_{λ,0} = I.

Note that A − μI = (λ − μ)I + A_{λ,1}, and apply the binomial theorem:

$A_{\mu,s} = (A - \mu I)^s = \bigl((\lambda - \mu)I + A_{\lambda,1}\bigr)^s = \sum_{m=0}^{s}\binom{s}{m}(\lambda - \mu)^{s-m} A_{\lambda,m}.$

When λ is an eigenvalue of algebraic multiplicity k and x ∈ N_{λ,k}, then A_{λ,m} x = 0 for m ≥ k, so in this case:

$A_{\mu,s}\,x = \sum_{m=0}^{\min(s,\,k-1)}\binom{s}{m}(\lambda - \mu)^{s-m} A_{\lambda,m}\,x.$

Since C^n = N_{λ1,k1} ⊕ N_{λ2,k2} ⊕ ... ⊕ N_{λr,kr}, any x in C^n can be expressed as x = x_1 + x_2 + ... + x_r, with each x_i ∈ N_{λi,ki}. Hence:

$A_{\mu,s}\,x = \sum_{i=1}^{r}\ \sum_{m=0}^{\min(s,\,k_i-1)}\binom{s}{m}(\lambda_i - \mu)^{s-m} A_{\lambda_i,m}\,x_i.$

The columns of A_{μ,s} are obtained by letting x vary across the standard basis vectors. The case A_{0,s} is the power A^s of A.

The minimal polynomial of a matrix

Assume A is a n×n matrix with eigenvalues λ1, λ2, ..., λr of algebraic multiplicities k1, k2, ..., kr. For each i define ν(λi), the null index of λi, to be the smallest positive integer ν such that N_{λi,ν} = N_{λi,ki}. It is often the case that ν(λi) < ki. Then

$p(x) = \prod_{i=1}^{r}(x - \lambda_i)^{\nu(\lambda_i)}$

is the minimal polynomial for A. To see this, note p(A) = ∏_i A_{λi,ν(λi)}, and the factors can be commuted in any order. So p(A)(N_{λj,kj}) = {0}, because A_{λj,ν(λj)}(N_{λj,kj}) = {0}. Being that C^n = N_{λ1,k1} ⊕ N_{λ2,k2} ⊕ ... ⊕ N_{λr,kr}, it is clear p(A) = 0. Now p(x) can not be of less degree, because A_{λj,r}(N_{λi,ki}) = N_{λi,ki} when λj ≠ λi, and so A_{λj,ν(λj)} must be a factor of p(A), for each j.
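A small exact check of the null-index description (Python with sympy) on the matrix of Example 2, where the null indices equal the algebraic multiplicities, so p(x) = (x − 1)^2 (x − 2)^3:

    from sympy import Matrix, eye, zeros

    A = Matrix([[ 1,  0, 0, 0, 0],
                [ 3,  1, 0, 0, 0],
                [ 6,  3, 2, 0, 0],
                [10,  6, 3, 2, 0],
                [15, 10, 6, 3, 2]])
    I = eye(5)

    # p(A) = (A - I)^2 (A - 2I)^3 annihilates A ...
    assert (A - I)**2 * (A - 2*I)**3 == zeros(5, 5)
    # ... and no lower exponent in either factor does:
    assert (A - I)**1 * (A - 2*I)**3 != zeros(5, 5)
    assert (A - I)**2 * (A - 2*I)**2 != zeros(5, 5)
    print("minimal polynomial is (x-1)^2 (x-2)^3")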

Using confluent Vandermonde matrices

An alternative strategy is to use the characteristic polynomial of matrix A. Let

$p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_{n-1}x^{n-1} + x^n$

be the characteristic polynomial of A. The minimal polynomial of A can be substituted for p(x) in this discussion, if it is known and different, to reduce the degree n and the multiplicities of the eigenvalues.

Then p(A) = 0 and A^n = −(a_0 I + a_1 A + a_2 A^2 + ... + a_{n−1} A^{n−1}). So

$A^{n+m} = b_{m,0}\,I + b_{m,1}\,A + b_{m,2}\,A^2 + \cdots + b_{m,n-1}\,A^{n-1},$

where the b_{m,0}, b_{m,1}, ..., b_{m,n−1} satisfy the recurrence relation

b_{m,0} = −a_0 b_{m−1,n−1},
b_{m,1} = b_{m−1,0} − a_1 b_{m−1,n−1},
b_{m,2} = b_{m−1,1} − a_2 b_{m−1,n−1},
...,
b_{m,n−1} = b_{m−1,n−2} − a_{n−1} b_{m−1,n−1},

with starting values b_{0,j} = −a_j (equivalently, one may seed the recurrence one step earlier with b_{−1,0} = ... = b_{−1,n−2} = 0 and b_{−1,n−1} = 1, which represents A^{n−1}).

This alone will reduce the number of multiplications needed to calculate a higher power of A by a factor of n^2, as compared to simply multiplying A^{n+m} by A.

In fact the b_{m,0}, b_{m,1}, ..., b_{m,n−1} can be calculated by a formula. Consider first when A has distinct eigenvalues λ1, λ2, ..., λn. Since p(λi) = 0, for each i, the λi satisfy the recurrence relation also. So:

$\begin{pmatrix} 1 & \lambda_1 & \lambda_1^2 & \cdots & \lambda_1^{n-1} \\ 1 & \lambda_2 & \lambda_2^2 & \cdots & \lambda_2^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & \lambda_n & \lambda_n^2 & \cdots & \lambda_n^{n-1} \end{pmatrix}\begin{pmatrix} b_{m,0} \\ b_{m,1} \\ \vdots \\ b_{m,n-1} \end{pmatrix} = \begin{pmatrix} \lambda_1^{n+m} \\ \lambda_2^{n+m} \\ \vdots \\ \lambda_n^{n+m} \end{pmatrix}$

The matrix V in the equation is the well studied Vandermonde's, for which formulas for its determinant and inverse are known:

$\det\bigl(V(\lambda_1, \lambda_2, \dots, \lambda_n)\bigr) = \prod_{1 \le i < j \le n}(\lambda_j - \lambda_i).$

When an eigenvalue is repeated, rows of the system are replaced by derivative rows, since a repeated root also satisfies the differentiated relation. For example, when λ2 = λ1, differentiating λ^{n+m} = b_{m,0} + b_{m,1}λ + ... + b_{m,n−1}λ^{n−1} with respect to λ and evaluating at λ2 gives:

$\begin{pmatrix} 1 & \lambda_2 & \lambda_2^2 & \cdots & \lambda_2^{n-1} \\ 0 & 1 & 2\lambda_2 & \cdots & (n-1)\lambda_2^{n-2} \\ 1 & \lambda_3 & \lambda_3^2 & \cdots & \lambda_3^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & \lambda_n & \lambda_n^2 & \cdots & \lambda_n^{n-1} \end{pmatrix}\begin{pmatrix} b_{m,0} \\ b_{m,1} \\ \vdots \\ b_{m,n-1} \end{pmatrix} = \begin{pmatrix} \lambda_2^{n+m} \\ (n+m)\lambda_2^{n+m-1} \\ \lambda_3^{n+m} \\ \vdots \\ \lambda_n^{n+m} \end{pmatrix}$

The new system has determinant:

$\det\bigl(V(\lambda_2, \lambda_3, \dots, \lambda_n)\bigr) = \prod_{3 \le j \le n}(\lambda_j - \lambda_2)^2 \prod_{3 \le i < j \le n}(\lambda_j - \lambda_i).$

As a specific example, consider a 5×5 matrix with eigenvalue λ1 of multiplicity 2 and eigenvalue λ2 of multiplicity 3. The b_{m,0}, b_{m,1}, b_{m,2}, b_{m,3}, b_{m,4}, for which

$A^{5+m} = b_{m,0}\,I + b_{m,1}\,A + b_{m,2}\,A^2 + b_{m,3}\,A^3 + b_{m,4}\,A^4,$

satisfy the confluent Vandermonde system next:

$\begin{pmatrix} 1 & \lambda_1 & \lambda_1^2 & \lambda_1^3 & \lambda_1^4 \\ 0 & 1 & 2\lambda_1 & 3\lambda_1^2 & 4\lambda_1^3 \\ 1 & \lambda_2 & \lambda_2^2 & \lambda_2^3 & \lambda_2^4 \\ 0 & 1 & 2\lambda_2 & 3\lambda_2^2 & 4\lambda_2^3 \\ 0 & 0 & 1 & 3\lambda_2 & 6\lambda_2^2 \end{pmatrix}\begin{pmatrix} b_{m,0} \\ b_{m,1} \\ b_{m,2} \\ b_{m,3} \\ b_{m,4} \end{pmatrix} = \begin{pmatrix} \lambda_1^{5+m} \\ (5+m)\lambda_1^{4+m} \\ \lambda_2^{5+m} \\ (5+m)\lambda_2^{4+m} \\ \tfrac{1}{2}(5+m)(5+m-1)\lambda_2^{3+m} \end{pmatrix}$

For λ1 = 1 and λ2 = 2 this solves, via the inverse of the confluent Vandermonde matrix, to

$\begin{pmatrix} b_{m,0} \\ b_{m,1} \\ b_{m,2} \\ b_{m,3} \\ b_{m,4} \end{pmatrix} = \begin{pmatrix} -16 & -8 & 17 & -10 & 4 \\ 48 & 20 & -48 & 29 & -12 \\ -48 & -18 & 48 & -30 & 13 \\ 20 & 7 & -20 & 13 & -6 \\ -3 & -1 & 3 & -2 & 1 \end{pmatrix}\begin{pmatrix} 1 \\ 5+m \\ 32 \cdot 2^m \\ 16(5+m)\,2^m \\ 4(5+m)(5+m-1)\,2^m \end{pmatrix}$
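The recurrence for the b_{m,j} is easy to run numerically; a sketch (Python with numpy; the seed b_{0,j} = −a_j follows the convention fixed above) checked against a directly computed power of the same 5×5 matrix, whose eigenvalues are 1, 1, 2, 2, 2:

    import numpy as np

    A = np.array([[ 1,  0, 0, 0, 0],
                  [ 3,  1, 0, 0, 0],
                  [ 6,  3, 2, 0, 0],
                  [10,  6, 3, 2, 0],
                  [15, 10, 6, 3, 2]], dtype=float)
    n = 5
    a = np.poly(A)[::-1][:-1]   # a[0..n-1]: p(x) = a0 + a1 x + ... + x^n

    def b_coeffs(m, a):
        """Coefficients with A^{n+m} = sum_j b[j] A^j, seeded at b_{0,j} = -a_j."""
        b = -a.copy()
        for _ in range(m):
            top = b[-1]                 # b_{m-1, n-1}
            b[1:] = b[:-1].copy()       # shift: b_{m,j} gets b_{m-1,j-1}
            b[0] = 0.0
            b -= a * top                # subtract a_j * b_{m-1,n-1}
        return b

    m = 3
    b = b_coeffs(m, a)
    powers = [np.linalg.matrix_power(A, j) for j in range(n)]
    rebuilt = sum(bj * P for bj, P in zip(b, powers))
    assert np.allclose(rebuilt, np.linalg.matrix_power(A, n + m))
    print("A^8 reproduced from the b_{3,j}")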

Using difference equations

Returning to the recurrence relation for b_{m,0}, b_{m,1}, ..., b_{m,n−1},

b_{m,0} = −a_0 b_{m−1,n−1},
b_{m,1} = b_{m−1,0} − a_1 b_{m−1,n−1},
b_{m,2} = b_{m−1,1} − a_2 b_{m−1,n−1},
...,
b_{m,n−1} = b_{m−1,n−2} − a_{n−1} b_{m−1,n−1}.

Upon substituting the first relation into the second,

b_{m,1} = −a_0 b_{m−2,n−1} − a_1 b_{m−1,n−1},

and now this one into the next, b_{m,2} = b_{m−1,1} − a_2 b_{m−1,n−1}:

b_{m,2} = −a_0 b_{m−3,n−1} − a_1 b_{m−2,n−1} − a_2 b_{m−1,n−1},

..., and so on, the following difference equation for the single sequence b_{m,n−1} is found:

b_{m,n−1} = −a_0 b_{m−n,n−1} − a_1 b_{m−n+1,n−1} − a_2 b_{m−n+2,n−1} − ... − a_{n−1} b_{m−1,n−1},

with initial values b_{−n,n−1} = ... = b_{−2,n−1} = 0 and b_{−1,n−1} = 1.

See the subsection on linear difference equations for more explanation.

  • 3.5. MOTIVATION OF THE PROCEDURE 27

    3.5.7 Chains of generalized eigenvectorsSome notation and results from previous sections are restated.

    A is a n n matrix of complex numbers. A, = (A I)k

    N, = N((A I)k) = N(A, ) For V1 V2 = {0}, V1 V2 = {v1 + v2 : v1 V1 and v2 V2}.

    Assume A has eigenvalues 1, 2, ..., of algebraic multiplicities k1, k2, ..., k.For each i dene (), the null index of , to be the smallest positive integer such that N, = N, .It is always the case that () k.When () 2,N, N, ... N, N, = N, = ...,N, \ {0} = (N, \ N, -), form = 1, ..., and N, = {0}.x N, \ N, -, if and only if A, x N, - \ N, -Dene a chain of generalized eigenvectors to be a set{ x1, x2, ..., x } such that x1 N, \ N, -, and x = A, x.Then x 0 and A, x = 0.When x1 N, \ {0}, {x1} can be, for the sake of not requiring extra terminology, considered trivially a chain.When a disjoint collection of chains combined form a basis set for N, ,they are often referred to as Jordan chains and are the vectors used for the columns of a transformationmatrix in theJordan canonical form.When a disjoint collection of chains that combined form a basis set, is needed that satisfy x = A, x, for somescalars , chainsas already dened can be scaled for this purpose.What will be proven here is that such a disjoint collection of chainscan always be constructed.Before the proof is started, recall a few facts about direct sums.When the notation V1 V2 is used, it is assumed V1 V2 = {0}.For x = v1 + v2 with v1 V1 and v2 V2 , then x = 0,if and only if v1 = v2 = 0.In the discussion below = dim(N, ) dim(N, ), with 1 = dim(N, ).First consider when N, \ N, {0} , Then a basis for N, can beextended to a basis for N, . If 2 = 1, then there exists x1 N, \ N, ,such that N, = N, span{x1}. Let x2 = A, x1. Thenx2 N, \ {0}, with x1 and x2 linearly independent. If dim(N, ) = 2,since {x1, x2} is a chain we are through. Otherwise x1, x2 can be extendedto a basis x1, x2, ..., x1 for N, . The sets {x1, x2}, {x3}, ..., {x1}form a disjoint collection of chains. In the case that 2 > 1, then there existlinearly independent x1, x2, ..., x2 N, \ N, , such thatN, = N, span{x1, x2, ..., x2}. Let y = A, x.Then y N, \ {0}, for i = 1, 2, ..., 2. To see the y1, y2, ..., y2are linearly independent, assume that for some 1, 2, ..., 2,that 1y1 + 2y2 + ... + 2y2 = 0, Then for x = 1x1 + 2x2 + ... + 2x2,x N, , and x span{x1, x2, ..., x2}, which implies that x = 0, and1= 2= ... = 2 = 0. Since span{y1, y2, ..., y2} N, , the vectorsx1, x2, ..., x2 , y1, y2, ..., y2 are a linearly independent set.If 2 = 1, then the sets {x1, y1}, {x2, y2}, ..., {x2, y2} form a


    disjoint collection of chains that when combined are a basis set for N_{λ;2}. If δ_1 > δ_2, then x1, ..., x_{δ_2}, y1, ..., y_{δ_2} can be extended to a basis for N_{λ;2} by some vectors x_{δ_2+1}, ..., x_{δ_1} in N_{λ;1}, so that {x1, y1}, {x2, y2}, ..., {x_{δ_2}, y_{δ_2}}, {x_{δ_2+1}}, ..., {x_{δ_1}} forms a disjoint collection of chains.

    To reduce redundancy, in the next paragraphs, when δ = 1 the notation x1, x2, ..., x_δ will be understood to mean just x1, and when δ = 2 to mean x1, x2.

    So far it has been shown that, if linearly independent x1, x2, ..., x_{δ_2} ∈ N_{λ;2} \ N_{λ;1} are chosen such that N_{λ;2} = N_{λ;1} ⊕ span{x1, x2, ..., x_{δ_2}}, then there exists a disjoint collection of chains with each of the x1, x2, ..., x_{δ_2} being the first member, or top, of one of the chains. Furthermore, this collection of vectors, when combined, forms a basis for N_{λ;2}.

    Now, let the induction hypothesis be that, if linearly independent x1, x2, ..., x_{δ_m} ∈ N_{λ;m} \ N_{λ;m−1} are chosen such that N_{λ;m} = N_{λ;m−1} ⊕ span{x1, x2, ..., x_{δ_m}}, then there exists a disjoint collection of chains with each of the x1, x2, ..., x_{δ_m} being the first member, or top, of one of the chains. Furthermore, this collection of vectors, when combined, forms a basis for N_{λ;m}.

    Consider m < ν(λ). A basis for N_{λ;m} can always be extended to a basis for N_{λ;m+1}, so linearly independent x1, x2, ..., x_{δ_{m+1}} ∈ N_{λ;m+1} \ N_{λ;m} can be chosen with N_{λ;m+1} = N_{λ;m} ⊕ span{x1, x2, ..., x_{δ_{m+1}}}. Let y_i = A_{λ;1}·x_i. Then y_i ∈ N_{λ;m} \ N_{λ;m−1}, for i = 1, 2, ..., δ_{m+1}. To see that y1, y2, ..., y_{δ_{m+1}} are linearly independent, assume that for some α1, α2, ..., α_{δ_{m+1}}, α1·y1 + α2·y2 + ⋯ + α_{δ_{m+1}}·y_{δ_{m+1}} = 0. Then for x = α1·x1 + α2·x2 + ⋯ + α_{δ_{m+1}}·x_{δ_{m+1}}, x ∈ N_{λ;m} and x ∈ span{x1, x2, ..., x_{δ_{m+1}}}, which implies x = 0 and α1 = α2 = ⋯ = α_{δ_{m+1}} = 0. In addition, span{y1, y2, ..., y_{δ_{m+1}}} ∩ N_{λ;m−1} = {0}: assume that for some α1, α2, ..., α_{δ_{m+1}}, α1·y1 + α2·y2 + ⋯ + α_{δ_{m+1}}·y_{δ_{m+1}} ∈ N_{λ;m−1}. Then for x = α1·x1 + α2·x2 + ⋯ + α_{δ_{m+1}}·x_{δ_{m+1}}, x ∈ N_{λ;m} and x ∈ span{x1, x2, ..., x_{δ_{m+1}}}, which again implies x = 0 and α1 = α2 = ⋯ = α_{δ_{m+1}} = 0. The proof is nearly done.

    At this point suppose that b1, b2, ..., b_q is any basis for N_{λ;m−1}. Then B = span{b1, b2, ..., b_q} ⊕ span{y1, y2, ..., y_{δ_{m+1}}} is a subspace of N_{λ;m}. If B ≠ N_{λ;m}, then b1, ..., b_q, y1, ..., y_{δ_{m+1}} can be extended to a basis for N_{λ;m} by some set of vectors z1, z2, ..., z_p, in which case N_{λ;m} = N_{λ;m−1} ⊕ span{y1, ..., y_{δ_{m+1}}} ⊕ span{z1, ..., z_p}. If δ_m = δ_{m+1}, then N_{λ;m} = N_{λ;m−1} ⊕ span{y1, ..., y_{δ_{m+1}}}; if δ_m > δ_{m+1}, then N_{λ;m} = N_{λ;m−1} ⊕ span{z1, ..., z_p, y1, ..., y_{δ_{m+1}}}. In either case apply the induction hypothesis to get a disjoint collection of chains with each of the y1, ..., y_{δ_{m+1}} (and, in the second case, the z's) being the first member, or top, of one of the chains; this collection of vectors, when combined, forms a basis for N_{λ;m}. Now, y_i = A_{λ;1}·x_i, for i = 1, 2, ..., δ_{m+1}, so each of the chains beginning with y_i can be extended upwards into N_{λ;m+1} \ N_{λ;m} to a chain beginning with x_i. Since N_{λ;m+1} = N_{λ;m} ⊕ span{x1, x2, ..., x_{δ_{m+1}}}, the combined vectors of the new chains form a basis for N_{λ;m+1}.
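    The inductive construction above translates directly into a computation with exact null spaces. The following is a minimal sketch using SymPy; the helper name jordan_chain and the 2 × 2 test matrix are illustrative choices of ours, not from the original text:

```python
import sympy as sp

def jordan_chain(A, lam, m):
    """Pick x1 in N((A - lam I)^m) \\ N((A - lam I)^(m-1)) and build
    the chain x_{r+1} = (A - lam I) x_r, as in the construction above."""
    n = A.shape[0]
    Alam = A - lam * sp.eye(n)
    N_m = (Alam**m).nullspace()
    N_m1 = (Alam**(m - 1)).nullspace()
    # any null vector of (A - lam I)^m outside span(N_m1) is a chain top
    x1 = next(v for v in N_m
              if sp.Matrix.hstack(*N_m1, v).rank() > len(N_m1))
    chain = [x1]
    for _ in range(m - 1):
        chain.append(Alam * chain[-1])
    return chain

# Hypothetical test case: characteristic polynomial (x - 2)^2 but only
# one independent eigenvector, so nu(2) = 2.
A = sp.Matrix([[3, 1],
               [-1, 1]])
x1, x2 = jordan_chain(A, 2, 2)
print(x1.T, x2.T)                  # top and bottom of the chain
print((A - 2*sp.eye(2)) * x2)      # zero vector: the chain terminates
```

    Here x1 is exactly the "top of chain" choice made in the proof: a null vector of (A − λI)^m outside the null space of (A − λI)^{m−1}.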


    Differential equations y′ = Ay

    Let A be an n × n matrix of complex numbers and λ an eigenvalue of A, with associated eigenvector x. Suppose y(t) is an n-dimensional vector-valued function, sufficiently smooth that y′(t) is continuous. (The restriction that y(t) be smooth can be relaxed somewhat, but that is not the main focus of this discussion.) The solutions to the equation y′(t) = Ay(t) are sought.

    The first observation is that y(t) = e^{λt}·x will be a solution. When A does not have n linearly independent eigenvectors, solutions of this kind will not provide the total of n needed for a fundamental basis set.

    In view of the existence of chains of generalized eigenvectors, seek a solution of the form y(t) = e^{λt}·x1 + t·e^{λt}·x2. Then

    y′(t) = λ·e^{λt}·x1 + e^{λt}·x2 + λ·t·e^{λt}·x2 = e^{λt}·(λ·x1 + x2) + t·e^{λt}·(λ·x2)

    and

    Ay(t) = e^{λt}·A·x1 + t·e^{λt}·A·x2.

    In view of this, y(t) will be a solution to y′(t) = Ay(t) when A·x1 = λ·x1 + x2 and A·x2 = λ·x2; that is, when (A − λI)·x1 = x2 and (A − λI)·x2 = 0. Equivalently, when {x1, x2} is a chain of generalized eigenvectors.

    Continuing with this reasoning, seek a solution of the form y(t) = e^{λt}·x1 + t·e^{λt}·x2 + t²·e^{λt}·x3. Then

    y′(t) = e^{λt}·(λ·x1 + x2) + t·e^{λt}·(λ·x2 + 2·x3) + t²·e^{λt}·(λ·x3)

    and

    Ay(t) = e^{λt}·A·x1 + t·e^{λt}·A·x2 + t²·e^{λt}·A·x3.

    As before, y(t) will be a solution when A·x1 = λ·x1 + x2, A·x2 = λ·x2 + 2·x3, and A·x3 = λ·x3; that is, when (A − λI)·x1 = x2, (A − λI)·x2 = 2·x3, and (A − λI)·x3 = 0. Since then (A − λI)·(2·x3) = 0 holds as well, this is equivalent to {x1, x2, 2·x3} being a chain of generalized eigenvectors.

    More generally, to find the progression, seek a solution of the form

    y(t) = e^{λt}·x1 + t·e^{λt}·x2 + t²·e^{λt}·x3 + t³·e^{λt}·x4 + ⋯ + t^{m−2}·e^{λt}·x_{m−1} + t^{m−1}·e^{λt}·x_m.

    Then

    y′(t) = e^{λt}·(λ·x1 + x2) + t·e^{λt}·(λ·x2 + 2·x3) + t²·e^{λt}·(λ·x3 + 3·x4) + t³·e^{λt}·(λ·x4 + 4·x5) + ⋯ + t^{m−2}·e^{λt}·(λ·x_{m−1} + (m−1)·x_m) + t^{m−1}·e^{λt}·(λ·x_m)

    and

    Ay(t) = e^{λt}·A·x1 + t·e^{λt}·A·x2 + t²·e^{λt}·A·x3 + ⋯ + t^{m−2}·e^{λt}·A·x_{m−1} + t^{m−1}·e^{λt}·A·x_m.

    Again, y(t) will be a solution to y′(t) = Ay(t) when A·x1 = λ·x1 + x2, A·x2 = λ·x2 + 2·x3, A·x3 = λ·x3 + 3·x4, A·x4 = λ·x4 + 4·x5, ..., A·x_{m−1} = λ·x_{m−1} + (m−1)·x_m, and A·x_m = λ·x_m. That is, when (A − λI)·x1 = x2, (A − λI)·x2 = 2·x3, (A − λI)·x3 = 3·x4, (A − λI)·x4 = 4·x5, ..., (A − λI)·x_{m−1} = (m−1)·x_m, and (A − λI)·x_m = 0. Since then (A − λI)·((m−1)!·x_m) = 0 holds as well, this is equivalent to the statement that {x1, 1!·x2, 2!·x3, 3!·x4, ..., (m−2)!·x_{m−1}, (m−1)!·x_m}


    is a chain of generalized eigenvectors.

    Now the basis set for all solutions will be found through a disjoint collection of chains of generalized eigenvectors of the matrix A. Assume A has eigenvalues λ1, λ2, ..., λr of algebraic multiplicities k1, k2, ..., kr. For a given eigenvalue λi there is a collection of s (with s depending on i) disjoint chains of generalized eigenvectors

    C_{i,1} = {1z1, 1z2, ..., 1z_{j1}}, C_{i,2} = {2z1, 2z2, ..., 2z_{j2}}, ..., C_{i,s} = {sz1, sz2, ..., sz_{js}},

    that when combined form a basis set for N_{λi;ν(λi)}. The total number of vectors in this set will be j1 + j2 + ⋯ + js = ki. Sets in this collection may have only one or two members, so in this discussion the notation {z1, z2, ..., z_j} is understood to mean {z1} when j = 1, {z1, z2} when j = 2, and so forth. Since this notation is cumbersome with many indices, in the next paragraphs any particular C_{i,ℓ}, when more explanation is not needed, may simply be written C = {z1, z2, ..., z_j}.

    For each such chain set C = {z1, z2, ..., z_j}, the tail sets {z_j}, {z_{j−1}, z_j}, ..., {z2, z3, ..., z_j}, {z1, z2, ..., z_j} are also chains. This is understood to mean: when C = {z1}, just {z1}; when C = {z1, z2}, just {z2} and {z1, z2}; when C = {z1, z2, z3}, just {z3}, {z2, z3}, and {z1, z2, z3}; and so on.

    The conclusion at the top of the discussion was that

    y(t) = e^{λt}·x1 is a solution when {x1} is a chain;
    y(t) = e^{λt}·x1 + t·e^{λt}·x2 is a solution when {x1, 1!·x2} is a chain;
    y(t) = e^{λt}·x1 + t·e^{λt}·x2 + t²·e^{λt}·x3 is a solution when {x1, 1!·x2, 2!·x3} is a chain.

    The progression continues to

    y(t) = e^{λt}·x1 + t·e^{λt}·x2 + t²·e^{λt}·x3 + t³·e^{λt}·x4 + ⋯ + t^{m−2}·e^{λt}·x_{m−1} + t^{m−1}·e^{λt}·x_m,

    which is a solution when {x1, 1!·x2, 2!·x3, 3!·x4, ..., (m−2)!·x_{m−1}, (m−1)!·x_m} is a chain of generalized eigenvectors.

    In light of the preceding calculations, all that must be done is to provide the proper scaling for each of the chains arising from the set C = {z1, z2, ..., z_j}. The progression for the solutions is given by

    y(t) = e^{λt}·z_j, for the chain {z_j};
    y(t) = e^{λt}·z_{j−1} + (1/1!)·t·e^{λt}·z_j, for the chain {z_{j−1}, 1!·(1/1!)·z_j};
    y(t) = e^{λt}·z_{j−2} + (1/1!)·t·e^{λt}·z_{j−1} + (1/2!)·t²·e^{λt}·z_j, for the chain {z_{j−2}, 1!·(1/1!)·z_{j−1}, 2!·(1/2!)·z_j};
    y(t) = e^{λt}·z_{j−3} + (1/1!)·t·e^{λt}·z_{j−2} + (1/2!)·t²·e^{λt}·z_{j−1} + (1/3!)·t³·e^{λt}·z_j, for the chain {z_{j−3}, 1!·(1/1!)·z_{j−2}, 2!·(1/2!)·z_{j−1}, 3!·(1/3!)·z_j};

    and so on, until

    y(t) = e^{λt}·z1 + (1/1!)·t·e^{λt}·z2 + (1/2!)·t²·e^{λt}·z3 + ⋯ + (1/(j−1)!)·t^{j−1}·e^{λt}·z_j,

    for the chain of generalized eigenvectors {z1, 1!·(1/1!)·z2, 2!·(1/2!)·z3, ..., (j−2)!·(1/(j−2)!)·z_{j−1}, (j−1)!·(1/(j−1)!)·z_j}. (Each factor r!·(1/r!) equals 1; it is written this way to display how the scaling of the preceding discussion is satisfied.)

    What is left to show is that, when all the solutions constructed from the chain sets as described are considered, they form a fundamental set of solutions. To do this it has to be shown that there are n of them and that they are linearly independent. Reiterating, for a given eigenvalue λi there is a collection of s (with s depending on i) disjoint chains of generalized eigenvectors C_{i,1} = {1z1, 1z2, ..., 1z_{j1}}, C_{i,2} = {2z1, 2z2, ..., 2z_{j2}},


    ..., C_{i,s} = {sz1, sz2, ..., sz_{js}}, that when combined form a basis set for N_{λi;ν(λi)}. The total number of vectors in this set will be j1(i) + j2(i) + ⋯ + js(i) = ki. Thus the total number of all such basis vectors, and so of solutions, is k1 + k2 + ⋯ + kr = n.

    Each solution is of one of the forms y(t) = e^{λt}·x1, y(t) = e^{λt}·x1 + t·e^{λt}·x2, y(t) = e^{λt}·x1 + t·e^{λt}·x2 + t²·e^{λt}·x3, and so on. Now each basis vector v_j, for j = 1, 2, ..., n, of the combined set of generalized eigenvectors occurs as the x1 in exactly one of these expressions. That is, for each j there is one solution y(t) = e^{λt}·v_j + t·(⋯). Since y(0) = e⁰·v_j = v_j, the set of solutions is linearly independent at t = 0.
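    As a numerical sanity check of the t^k·e^{λt} structure, one can compare the chain-built solution with y(t) = e^{tA}·y(0) computed by scipy.linalg.expm; the single 3 × 3 Jordan block below is a hypothetical test case:

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical test case: one 3x3 Jordan block with eigenvalue 2.
lam, m = 2.0, 3
A = lam * np.eye(m) + np.diag(np.ones(m - 1), 1)

# Chain in the scaled sense above: (A - lam I)x1 = x2,
# (A - lam I)x2 = 2*x3, (A - lam I)x3 = 0.
x1 = np.array([0.0, 0.0, 1.0])
x2 = (A - lam * np.eye(m)) @ x1
x3 = (A - lam * np.eye(m)) @ x2 / 2.0

t = 0.7
y_chain = np.exp(lam * t) * (x1 + t * x2 + t**2 * x3)
y_expm = expm(t * A) @ x1            # exact solution with y(0) = x1
print(np.allclose(y_chain, y_expm))  # True
```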

    Revisiting the powers of a matrix

    As a notational convenience, A_{λ;0} = I. Note that A = λI + A_{λ;1}, and apply the binomial theorem (C(s, r) denoting the binomial coefficient):

    A^s = (λI + A_{λ;1})^s = Σ_{r=0}^{s} C(s, r)·λ^{s−r}·A_{λ;r}.

    Assume λ is an eigenvalue of A, and let {x1, x2, ..., xm} be a chain of generalized eigenvectors such that x1 ∈ N_{λ;m} \ N_{λ;m−1}, x_{r+1} = A_{λ;1}·x_r, x_m ≠ 0, and A_{λ;1}·x_m = 0. Then x_{r+1} = A_{λ;r}·x1, for r = 0, 1, ..., m−1, and

    A^s·x1 = Σ_{r=0}^{s} C(s, r)·λ^{s−r}·A_{λ;r}·x1.

    So for s ≤ m − 1,

    A^s·x1 = Σ_{r=0}^{s} C(s, r)·λ^{s−r}·x_{r+1},

    and for s ≥ m − 1, since A_{λ;r}·x1 = 0 for r ≥ m,

    A^s·x1 = Σ_{r=0}^{m−1} C(s, r)·λ^{s−r}·x_{r+1}.
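    A short numerical check of the last identity, again using a single Jordan block as a hypothetical test case (math.comb supplies the binomial coefficients):

```python
import numpy as np
from math import comb

# Hypothetical test case: one Jordan block, for which
# x_{r+1} = (A - lam I)^r x1 is a chain of length m.
lam, m = 3.0, 4
A = lam * np.eye(m) + np.diag(np.ones(m - 1), 1)

x = [np.eye(m)[-1]]                 # x1 in N_{lam;m} \ N_{lam;m-1}
for r in range(1, m):
    x.append((A - lam * np.eye(m)) @ x[r - 1])

s = 9                               # any s >= m - 1
lhs = np.linalg.matrix_power(A, s) @ x[0]
rhs = sum(comb(s, r) * lam**(s - r) * x[r] for r in range(m))
print(np.allclose(lhs, rhs))        # True
```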

    3.5.8 Ordinary linear difference equations

    Ordinary linear difference equations are equations of the sort

    y_n = a·y_{n−1} + b
    y_n = a·y_{n−1} + b·y_{n−2} + c

    or, more generally,

    y_n = a_m·y_{n−1} + a_{m−1}·y_{n−2} + ⋯ + a_2·y_{n−m+1} + a_1·y_{n−m} + a_0

    with initial conditions y_0, y_1, y_2, ..., y_{m−2}, y_{m−1}. The case a_1 = 0 can be excluded, since it represents an equation of lower order. Such an equation has characteristic polynomial

    p(x) = x^m − a_m·x^{m−1} − a_{m−1}·x^{m−2} − ⋯ − a_2·x − a_1.


    To solve a difference equation it is first observed that, if y and z are both solutions, then y − z is a solution of the homogeneous equation

    y_n = a_m·y_{n−1} + a_{m−1}·y_{n−2} + ⋯ + a_2·y_{n−m+1} + a_1·y_{n−m}.

    So a particular solution to the difference equation must be found, together with all solutions of the homogeneous equation, to get the general solution for the difference equation. Another observation to make is that, if y is a solution to the inhomogeneous equation, then

    z_n = y_{n+1} − y_n

    is also a solution to the homogeneous equation. So all solutions of the homogeneous equation will be found first.

    When λ is a root of p(x) = 0, then y_n = λ^n is a solution to the homogeneous equation, since

    y_n − a_m·y_{n−1} − a_{m−1}·y_{n−2} − ⋯ − a_2·y_{n−m+1} − a_1·y_{n−m}

    becomes, upon the substitution y_n = λ^n,

    λ^n − a_m·λ^{n−1} − a_{m−1}·λ^{n−2} − ⋯ − a_2·λ^{n−m+1} − a_1·λ^{n−m}
    = λ^{n−m}·(λ^m − a_m·λ^{m−1} − a_{m−1}·λ^{m−2} − ⋯ − a_2·λ − a_1)
    = λ^{n−m}·p(λ) = 0.

    When λ is a repeated root of p(x) = 0, then y_n = n·λ^{n−1} is a solution to the homogeneous equation, since substituting it gives

    n·λ^{n−1} − a_m·(n−1)·λ^{n−2} − a_{m−1}·(n−2)·λ^{n−3} − ⋯ − a_2·(n−m+1)·λ^{n−m} − a_1·(n−m)·λ^{n−m−1}
    = (n−m)·λ^{n−m−1}·p(λ) + λ^{n−m}·p′(λ) = 0,

    the expression being the derivative d(λ^{n−m}·p(λ))/dλ, and both p(λ) = 0 and p′(λ) = 0 at a repeated root.

    After reaching this point in the calculation the mystery is solved: just notice that, when λ is a root of p(x) = 0 with multiplicity k, then for s = 1, 2, ..., k−1,

    d^s(λ^{n−m}·p(λ))/dλ^s = 0.

    Referring this back to the original equation, it is seen that

    y_n = d^s(λ^n)/dλ^s

    are solutions to the homogeneous equation. For example, if λ is a root of multiplicity 3, then y_n = n(n−1)·λ^{n−2} is a solution. In any case this gives m linearly independent solutions to the homogeneous equation.

    To look for a particular solution, first consider the simplest equation,

    y_n = a·y_{n−1} + b.

    It has a particular solution y_p given by

    y_{p,0} = 0, y_{p,1} = b, y_{p,2} = (1 + a)·b, ..., y_{p,n} = (1 + a + a² + ⋯ + a^{n−1})·b, ....

    Its homogeneous equation y_n = a·y_{n−1} has solutions y_n = a^n·y_0, so z_n = y_{n+1} − y_n = a^n·b can be telescoped to get

    y_n = (y_n − y_{n−1}) + (y_{n−1} − y_{n−2}) + ⋯ + (y_2 − y_1) + (y_1 − y_0) + y_0
    = z_{n−1} + z_{n−2} + ⋯ + z_1 + z_0 + y_0
    = (1 + a + a² + ⋯ + a^{n−1})·b,

    the particular solution with y_0 = 0.

    Now, returning to the general problem, consider the equation

    y_n = a_m·y_{n−1} + a_{m−1}·y_{n−2} + ⋯ + a_2·y_{n−m+1} + a_1·y_{n−m} + a_0.

    When y_p is a particular solution with y_{p,0} = 0, then z_n = y_{p,n+1} − y_{p,n} is a solution to the homogeneous equation with z_0 = y_{p,1}, and z can be telescoped to get

    y_{p,n} = (y_{p,n} − y_{p,n−1}) + (y_{p,n−1} − y_{p,n−2}) + ⋯ + (y_{p,2} − y_{p,1}) + (y_{p,1} − y_{p,0}) + y_{p,0}
    = z_{n−1} + z_{n−2} + ⋯ + z_1 + z_0.

    Considering

    y_{p,n} = a_m·y_{p,n−1} + a_{m−1}·y_{p,n−2} + ⋯ + a_2·y_{p,n−m+1} + a_1·y_{p,n−m} + a_0

    and rewriting the equation in terms of the z_j (with y_{p,j} = 0 for j ≤ 0),


    z_{n−1} + z_{n−2} + ⋯ + z_1 + z_0
    = a_m·(z_{n−2} + ⋯ + z_1 + z_0) + a_{m−1}·(z_{n−3} + ⋯ + z_1 + z_0) + ⋯ + a_2·(z_{n−m} + ⋯ + z_1 + z_0) + a_1·(z_{n−m−1} + ⋯ + z_1 + z_0) + a_0,

    and hence

    z_{n−1} = (a_m − 1)·z_{n−2} + (a_{m−1} + a_m − 1)·z_{n−3} + ⋯ + (a_2 + a_3 + ⋯ + a_m − 1)·z_{n−m} + (a_1 + a_2 + ⋯ + a_m − 1)·(z_{n−m−1} + ⋯ + z_1 + z_0) + a_0.

    Since a solution of the homogeneous equation can be found for any initial conditions z_0, z_1, z_2, ..., z_{m−2}, z_{m−1}, reasoning conversely, find such a z satisfying the equation just derived, and define y_p by the relation

    y_{p,0} = 0, y_{p,n} = z_{n−1} + z_{n−2} + ⋯ + z_1 + z_0.

    One choice is, for example, z_{m−1} = a_0, z_0 = z_1 = z_2 = ⋯ = z_{m−2} = 0. This choice solves the problem for all initial values equal to zero. The general solution to the inhomogeneous equation is then given by

    y_n = y_{p,n} + β_1·w(1)_n + β_2·w(2)_n + ⋯ + β_{m−1}·w(m−1)_n + β_m·w(m)_n,

    where w(1), w(2), ..., w(m−1), w(m) are a basis for the solutions of the homogeneous equation and β_1, β_2, ..., β_m are scalars.
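    A minimal SymPy sketch of the recipe for homogeneous solutions, assuming a polynomial with exact roots (the one used is that of the example below):

```python
import sympy as sp

n, x, lam = sp.symbols('n x lam')

# p(x) = x^5 - 8x^4 + 25x^3 - 38x^2 + 28x - 8 = (x - 1)^2 (x - 2)^3
p = x**5 - 8*x**4 + 25*x**3 - 38*x**2 + 28*x - 8

basis = []
for root, mult in sp.roots(p, x).items():
    for s in range(mult):
        # y_n = d^s(lam^n)/d lam^s, evaluated at the root
        basis.append(sp.diff(lam**n, lam, s).subs(lam, root))
print(basis)   # 1, n, 2**n, n*2**(n-1), n*(n-1)*2**(n-2), in some order

# Verify each one against the homogeneous recurrence.
for y in basis:
    r = y - (8*y.subs(n, n-1) - 25*y.subs(n, n-2) + 38*y.subs(n, n-3)
             - 28*y.subs(n, n-4) + 8*y.subs(n, n-5))
    assert sp.simplify(r) == 0
print("all homogeneous solutions check out")
```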

    Example

    y_n = 8·y_{n−1} − 25·y_{n−2} + 38·y_{n−3} − 28·y_{n−4} + 8·y_{n−5} + 1

    with initial conditions y_0 = 0, y_1 = 0, y_2 = 0, y_3 = 0, and y_4 = 0. The characteristic polynomial for the equation is

    p(x) = x⁵ − 8x⁴ + 25x³ − 38x² + 28x − 8 = (x − 1)²·(x − 2)³.

    The homogeneous equation has independent solutions

    w1 = 1^n = 1, w2 = n·1^{n−1} = n, w3 = 2^n, w4 = n·2^{n−1}, w5 = n(n−1)·2^{n−2}.

    The homogeneous solution

    z = −3·w1 − w2 + 3·w3 − 2·w4 + (1/2)·w5

    satisfies the initial conditions z_4 = 1, z_0 = z_1 = z_2 = z_3 = 0. A particular solution can then be found by

    y_{p,0} = 0, y_{p,n} = z_{n−1} + z_{n−2} + ⋯ + z_1 + z_0.

    Calculating sums, with w̄k denoting the cumulative sum w̄k_n = wk_{n−1} + wk_{n−2} + ⋯ + wk_1 + wk_0:

    w̄1_n = n.
    w̄2_n = (n−1)·n/2.
    w̄3_n = 2^n − 1.

    Sums of these kinds are found by differentiating (x^n − 1)/(x − 1).

    w̄4_n = (n−2)·2^{n−1} + 1.
    w̄5_n = (n² − 5n + 8)·2^{n−2} − 2.

    Now,

    y_{p,n} = −3·w̄1_n − w̄2_n + 3·w̄3_n − 2·w̄4_n + (1/2)·w̄5_n

    solves the initial value problem of this example. At this point it is worthwhile to notice that all the terms that are scalar multiples of homogeneous basis elements can be removed, as they can be absorbed into the homogeneous part of the general solution; these are any multiples of 1, n, 2^n, n·2^{n−1}, and n²·2^{n−2}. So instead the simpler particular solution

    y_{p,n} = −n²/2

    may be preferred. This solution has nonzero initial values, which must be taken into account:

    y_0 = 0, y_1 = −1/2, y_2 = −2, y_3 = −9/2, and y_4 = −8.
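    As a check on the whole example, one can iterate the recurrence directly and compare with the closed form built from y_{p,n} = −n²/2 plus the combination of homogeneous solutions that restores the zero initial values; the coefficient solve below is a numerical sketch:

```python
import numpy as np

# Iterate the example recurrence directly.
N = 20
y = [0.0] * 5
for n in range(5, N + 1):
    y.append(8*y[n-1] - 25*y[n-2] + 38*y[n-3] - 28*y[n-4] + 8*y[n-5] + 1)

# Homogeneous basis and the simple particular solution -n^2/2.
w = [lambda n: 1.0,
     lambda n: float(n),
     lambda n: 2.0**n,
     lambda n: n * 2.0**(n - 1),
     lambda n: n*(n - 1) * 2.0**(n - 2)]
yp = lambda n: -n**2 / 2.0

# Solve for the coefficients that enforce y_0 = ... = y_4 = 0.
V = np.array([[wk(n) for wk in w] for n in range(5)])
beta = np.linalg.solve(V, np.array([-yp(n) for n in range(5)]))

closed = [yp(n) + sum(b * wk(n) for b, wk in zip(beta, w))
          for n in range(N + 1)]
print(np.allclose(y, closed))   # True
```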

    3.6 Notes

    [1] Beauregard & Fraleigh (1973, p. 310)
    [2] Nering (1970, p. 118)
    [3] Golub & Van Loan (1996, p. 316)
    [4] Bronson (1970, p. 189)
    [5] Nering (1970, p. 118)
    [6] Bronson (1970, pp. 190, 202)
    [7] Golub & Van Loan (1996, p. 316)
    [8] Bronson (1970, p. 189)
    [9] Bronson (1970, pp. 194–196)
    [10] Nering (1970, p. 118)
    [11] Herstein (1964, p. 261)
    [12] Beauregard & Fraleigh (1973, p. 310)
    [13] Nering (1970, pp. 122, 123)
    [14] Bronson (1970, pp. 189–209)

    3.7 References

    • Anton, Howard (1987), Elementary Linear Algebra (5th ed.), New York: Wiley, ISBN 0-471-84819-0
    • Axler, Sheldon (1997), Linear Algebra Done Right (2nd ed.), Springer, ISBN 978-0-387-98258-8
    • Beauregard, Raymond A.; Fraleigh, John B. (1973), A First Course In Linear Algebra: with Optional Introduction to Groups, Rings, and Fields, Boston: Houghton Mifflin Co., ISBN 0-395-14017-X
    • Bronson, Richard (1970), Matrix Methods: An Introduction, New York: Academic Press, LCCN 70097490
    • Burden, Richard L.; Faires, J. Douglas (1993), Numerical Analysis (5th ed.), Boston: Prindle, Weber and Schmidt, ISBN 0-534-93219-3
    • Golub, Gene H.; Van Loan, Charles F. (1996), Matrix Computations (3rd ed.), Baltimore: Johns Hopkins University Press, ISBN 0-8018-5414-8
    • Harper, Charlie (1976), Introduction to Mathematical Physics, New Jersey: Prentice-Hall, ISBN 0-13-487538-9
    • Herstein, I. N. (1964), Topics In Algebra, Waltham: Blaisdell Publishing Company, ISBN 978-1114541016
    • Kreyszig, Erwin (1972), Advanced Engineering Mathematics (3rd ed.), New York: Wiley, ISBN 0-471-50728-8
    • Nering, Evar D. (1970), Linear Algebra and Matrix Theory (2nd ed.), New York: Wiley, LCCN 76091646

Chapter 4

    Generalized singular value decomposition

    In linear algebra, the generalized singular value decomposition (GSVD) is the name of two different techniques based on the singular value decomposition. The two versions differ because one version decomposes two (or more) matrices (much like higher-order PCA) and the other version uses a set of constraints imposed on the left and right singular vectors.

    4.1 Higher order version

    The generalized singular value decomposition (GSVD) is a matrix decomposition more general than the singular value decomposition. It is used to study the conditioning and regularization of linear systems with respect to quadratic semi-norms.

    Let F = ℝ or F = ℂ. Given matrices A ∈ F^{m×n} and B ∈ F^{p×n}, their GSVD is given by

    A = U·Σ1·[X, 0]·Q*

    and

    B = V·Σ2·[X, 0]·Q*,

    where U ∈ F^{m×m}, V ∈ F^{p×p}, and Q ∈ F^{n×n} are unitary matrices, and X ∈ F^{r×r} is non-singular, with r = rank([A; B]) (the matrix obtained by stacking A on top of B). Also, Σ1 ∈ F^{m×r} is non-negative diagonal, and Σ2 ∈ F^{p×r} is non-negative block-diagonal with diagonal blocks; Σ2 is not always diagonal. It holds that Σ1ᵀΣ1 = diag(α1², ..., αr²) and Σ2ᵀΣ2 = diag(β1², ..., βr²), and that Σ1ᵀΣ1 + Σ2ᵀΣ2 = I_r. This implies 0 ≤ αi, βi ≤ 1. The ratios σi = αi/βi are called the generalized singular values of A and B. If B is square and invertible, then the generalized singular values are the singular values, and U and V are the matrices of singular vectors, of the matrix AB⁻¹. Further, if B = I, then the GSVD reduces to the singular value decomposition, explaining the name.
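    The characterization in terms of AB⁻¹ gives a simple numerical cross-check when B is square and invertible: the generalized singular values are the singular values of AB⁻¹ and, equivalently, the square roots of the eigenvalues of the symmetric-definite pencil (AᵀA, BᵀB). A sketch with random test matrices of our own:

```python
import numpy as np
from scipy.linalg import eigh, svdvals

rng = np.random.default_rng(0)
m, n = 6, 4
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, n))     # square and (almost surely) invertible

# Generalized singular values as singular values of A B^{-1} ...
sigma = svdvals(A @ np.linalg.inv(B))

# ... and, equivalently, from the generalized eigenproblem
# A^T A x = mu B^T B x, with mu = sigma^2.
mu = eigh(A.T @ A, B.T @ B, eigvals_only=True)
print(np.allclose(np.sort(sigma**2), np.sort(mu)))   # True
```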

    4.2 Weighted version

    The weighted version of the generalized singular value decomposition (GSVD) is a constrained matrix decomposition with constraints imposed on the left and right singular vectors of the singular value decomposition.[1][2][3] This form of the GSVD is an extension of the SVD as such. Given the SVD of an m × n real or complex matrix M,

    M = UΣV*,

    where

    U*·W_u·U = V*·W_v·V = I.

    Here I is the identity matrix, and U and V are orthonormal given their constraints (W_u and W_v). Additionally, W_u and W_v are positive definite matrices (often diagonal matrices of weights). This form of the GSVD is the core of certain techniques, such as generalized principal component analysis and correspondence analysis. The weighted form of the GSVD is called as such because, with the correct selection of weights, it generalizes many techniques, such as multidimensional scaling and linear discriminant analysis.[4]
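    One standard way to obtain such a decomposition is to absorb the weights into M, take an ordinary SVD, and transform back; the sketch below assumes W_u and W_v are symmetric positive definite, and weighted_svd is an illustrative helper name of ours:

```python
import numpy as np
from scipy.linalg import sqrtm, svd

def weighted_svd(M, Wu, Wv):
    """Sketch: SVD with U* Wu U = V* Wv V = I, assuming Wu and Wv
    are symmetric positive definite."""
    Wu_h, Wv_h = sqrtm(Wu), sqrtm(Wv)        # principal square roots
    Ut, s, Vt_h = svd(Wu_h @ M @ Wv_h, full_matrices=False)
    U = np.linalg.solve(Wu_h, Ut)            # U = Wu^{-1/2} Ut
    V = np.linalg.solve(Wv_h, Vt_h.T)        # V = Wv^{-1/2} Vt^T
    return U, s, V

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 3))
Wu = np.diag(rng.uniform(0.5, 2.0, 5))       # diagonal weight matrices
Wv = np.diag(rng.uniform(0.5, 2.0, 3))

U, s, V = weighted_svd(M, Wu, Wv)
print(np.allclose(U @ np.diag(s) @ V.T, M))  # reconstructs M
print(np.allclose(U.T @ Wu @ U, np.eye(3)))  # weighted orthonormality
print(np.allclose(V.T @ Wv @ V, np.eye(3)))
```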

    4.3 Applications

    The GSVD has been successfully applied to signal processing and big data, e.g., in genomic signal processing.[5][6] These applications also inspired a higher-order GSVD (HO GSVD)[7] and a tensor GSVD.[8]

    4.4 See also

    • C. C. Paige and M. A. Saunders: "Towards a Generalized Singular Value Decomposition", SIAM J. Numer. Anal., Volume 18, Number 3, June 1981.
    • Gene Golub and Charles Van Loan, Matrix Computations, Third Edition, Johns Hopkins University Press, Baltimore, 1996, ISBN 0-8018-5414-8
    • Hansen, Per Christian, Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion, SIAM Monographs on Mathematical Modeling and Computation 4. ISBN 0-89871-403-6
    • LAPACK manual

    4.5 References

    [1] Jolliffe, I.T. Principal Component Analysis, Series: Springer Series in Statistics, 2nd ed., Springer, NY, 2002, XXIX, 487 p. 28 illus. ISBN 978-0-387-95442-4
    [2] Greenacre, Michael (1983). Theory and Applications of Correspondence Analysis. London: Academic Press. ISBN 0-12-299050-1.
    [3] Abdi, H., & Williams, L.J. (2010). "Principal component analysis". Wiley Interdisciplinary Reviews: Computational Statistics, 2: 433–459. doi:10.1002/wics.101.
    [4] Abdi, H. (2007). "Singular Value Decomposition (SVD) and Generalized Singular Value Decomposition (GSVD)". In N.J. Salkind (Ed.): Encyclopedia of Measurement and Statistics. Thousand Oaks (CA): Sage. pp. 907–912.
    [5] O. Alter, P. O. Brown and D. Botstein (March 2003). "Generalized Singular Value Decomposition for Comparative Analysis of Genome-Scale Expression Datasets of Two Different Organisms". PNAS 100 (6): 3351–3356. doi:10.1073/pnas.0530258100.
    [6] C. H. Lee, B. O. Alpert, P. Sankaranarayanan and O. Alter (January 2012). "GSVD Comparison of Patient-Matched Normal and Tumor aCGH Profiles Reveals Global Copy-Number Alterations Predicting Glioblastoma Multiforme Survival". PLoS One 7 (1): e30098. doi:10.1371/journal.pone.0030098.
    [7] S. P. Ponnapalli, M. A. Saunders, C. F. Van Loan and O. Alter (December 2011). "A Higher-Order Generalized Singular Value Decomposition for Comparison of Global mRNA Expression from Multiple Organisms". PLoS One 6 (12): e28072. doi:10.1371/journal.pone.0028072.
    [8] P. Sankaranarayanan, T. E. Schomay, K. A. Aiello and O. Alter (April 2015). "Tensor GSVD of Patient- and Platform-Matched Tumor and Normal DNA Copy-Number Profiles Uncovers Chromosome Arm-Wide Patterns of Tumor-Exclusive Platform-Consistent Alterations Encoding for Cell Transformation and Predicting Ovarian Cancer Survival". PLoS One 10 (4): e0121396. doi:10.1371/journal.pone.0121396.

Chapter 5

    Gershgorin circle theorem

    In mathematics, the Gershgorin circle theorem may be used to bound the spectrum of a square matrix. It was first published by the Soviet mathematician Semyon Aranovich Gershgorin in 1931. The spelling of S. A. Gershgorin's name has been transliterated in several different ways, including Geršgorin, Gerschgorin, Gershgorin and Hershhorn/Hirschhorn.

    5.1 Statement and proof

    Let A be a complex n × n matrix with entries a_{ij}. For i ∈ {1, ..., n}, let R_i = Σ_{j≠i} |a_{ij}| be the sum of the absolute values of the non-diagonal entries in the i-th row. Let D(a_{ii}, R_i) be the closed disc centered at a_{ii} with radius R_i. Such a disc is called a Gershgorin disc.

    Theorem: Every eigenvalue of A lies within at least one of the Gershgorin discs D(a_{ii}, R_i).

    Proof: Let λ be an eigenvalue of A and let x = (x_j) be a corresponding eigenvector. Let i ∈ {1, ..., n} be chosen so that |x_i| = max_j |x_j|. (That is to say, choose i so that x_i is the largest entry of x in absolute value.) Then |x_i| > 0, since otherwise x = 0. Since x is an eigenvector, Ax = λx, and thus

    Σ_j a_{ij}·x_j = λ·x_i for all i ∈ {1, ..., n}.

    So, splitting the sum, we get

    Σ_{j≠i} a_{ij}·x_j = λ·x_i − a_{ii}·x_i.

    We may then divide both sides by x_i (choosing i as we explained, we can be sure that x_i ≠ 0) and take the absolute value to obtain

    |λ − a_{ii}| = |Σ_{j≠i} a_{ij}·x_j/x_i| ≤ Σ_{j≠i} |a_{ij}·x_j/x_i| ≤ Σ_{j≠i} |a_{ij}| = R_i,

    where the last inequality is valid because |x_j/x_i| ≤ 1 for j ≠ i.

    Corollary: The eigenvalues of A must also lie within the Gershgorin discs C_j corresponding to the columns of A.

    Proof: Apply the theorem to Aᵀ.

    Example: For a diagonal matrix, the Gershgorin discs coincide with the spectrum. Conversely, if the Gershgorin discs coincide with the spectrum, the matrix is diagonal.
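    The theorem is straightforward to check numerically; a minimal sketch on a random complex matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))

centers = np.diag(A)
radii = np.abs(A).sum(axis=1) - np.abs(centers)   # R_i = sum_{j != i} |a_ij|

# Every eigenvalue must lie in at least one disc D(a_ii, R_i).
for lam in np.linalg.eigvals(A):
    assert (np.abs(lam - centers) <= radii + 1e-12).any()
print("all eigenvalues covered by Gershgorin discs")
```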



    5.2 Discussion

    One way to interpret this theorem is that if the off-diagonal entries of a square matrix over the complex numbers have small norms, the eigenvalues of the matrix cannot be far from the diagonal entries of the matrix. Therefore, by reducing the norms of off-diagonal entries one can attempt to approximate the eigenvalues of the matrix. Of course, diagonal entries may change in the process of minimizing off-diagonal entries.

    5.3 Strengthening of the theorem

    If one of the discs is disjoint from the others then it contains exactly one eigenvalue. If however it meets another disc it is possible that it contains no eigenvalue; for example,

    A = [ 0  1 ]     or     A = [ 1  −2 ]
        [ 4  0 ]                [ 1  −1 ]

    In the general case the theorem can be strengthened as follows:

    Theorem: If the union of k discs is disjoint from the union of the other n − k discs, then the former union contains exactly k and the latter n − k eigenvalues of A.

    Proof: Let D be the diagonal matrix with entries equal to the diagonal entries of A, and let

    B(t) = (1 − t)·D + t·A.

    We will use the fact that the eigenvalues are continuous in t, and show that if any eigenvalue moves from one of the unions to the other, then it must be outside all the discs for some t, which is a contradiction.

    The statement is true for D = B(0). The diagonal entries of B(t) are equal to those of A, so the centers of the Gershgorin circles are the same; however, their radii are t times those of A. Therefore the union of the corresponding k discs of B(t) is disjoint from the union of the remaining n − k for all t. The discs are closed, so the distance between the two unions for A is some d > 0. The distance for B(t) is a decreasing function of t, so it is always at least d. Since the eigenvalues of B(t) are a continuous function of t, for any eigenvalue λ(t) of B(t) in the union of the k discs, its distance d(t) from the union of the other n − k discs is also continuous. Obviously d(0) ≥ d; now assume λ(1) lies in the union of the n − k discs. Then d(1) = 0, so there exists 0 < t₀ < 1 such that 0 < d(t₀) < d. But this means λ(t₀) lies outside the Gershgorin discs, which is impossible. Therefore λ(1) lies in the union of the k discs, and the theorem is proven.

    5.4 Application

    The Gershgorin circle theorem is useful in solving matrix equations of the form Ax = b for x, where b is a vector and A is a matrix with a large condition number. In this kind of problem, the error in the final result is usually of the same order of magnitude as the error in the initial data multiplied by the condition number of A. For instance, if b is known to six decimal places and the condition number of A is 1000, then we can only be confident that x is accurate to three decimal places. For very high condition numbers, even very small errors due to rounding can be magnified to such an extent that the result is meaningless.

    It would be good to reduce the condition number of A. This can be done by preconditioning: a matrix P such that P ≈ A⁻¹ is constructed, and then the equation PAx = Pb is solved for x. Using the exact inverse of A would be nice, but finding the inverse of a matrix is generally very difficult.

    Now, since PA ≈ I, where I is the identity matrix, the eigenvalues of PA should all be close to 1. By the Gershgorin circle theorem, every eigenvalue of PA lies within a known area, and so we can form a rough estimate of how good our choice of P was.
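    A small illustration of the last point: for a perturbed inverse P (a hypothetical preconditioner), the Gershgorin discs of PA cluster around 1, giving a cheap estimate of how good P is:

```python
import numpy as np

rng = np.random.default_rng(3)
A = np.eye(4) + 0.1 * rng.standard_normal((4, 4))

# Hypothetical preconditioner: a slightly perturbed inverse of A.
P = np.linalg.inv(A) + 0.01 * rng.standard_normal((4, 4))

PA = P @ A
centers = np.diag(PA)
radii = np.abs(PA).sum(axis=1) - np.abs(centers)
for c, r in zip(centers, radii):
    # all centers near 1 with small radii: PA is close to I
    print(f"disc: center {c:+.3f}, radius {r:.3f}")
```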

    5.5 Example

    Use the Gershgorin circle theorem to estimate the eigenvalues of:


    Figure: The Gershgorin discs (shown in yellow in the original diagram) for the eigenvalues. The first two discs overlap and their union contains two eigenvalues. The third and fourth discs are disjoint from the others and contain one eigenvalue each.

    A = [  10   −1    0    1 ]
        [ 0.2    8  0.2  0.2 ]
        [   1    1    2    1 ]
        [  −1   −1   −1  −11 ]

    Starting with row one, we take the element on the diagonal, a_{ii}, as the center for the disc. We then take the remaining elements in the row and apply the formula

    Σ_{j≠i} |a_{ij}| = R_i

    to obtain the following four discs:

    D(10, 2), D(8, 0.6), D(2, 3), and D(−11, 3).

    Note that we can improve the accuracy of the last two discs by applying the formula to the corresponding columns of the matrix, obtaining D(2, 1.2) and D(−11, 2.2). The eigenvalues are 9.8218, 8.1478, 1.8995, and −10.86.
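    The example can be reproduced numerically, computing the row discs, the improved column discs, and the eigenvalues:

```python
import numpy as np

A = np.array([[10.0, -1.0, 0.0, 1.0],
              [0.2, 8.0, 0.2, 0.2],
              [1.0, 1.0, 2.0, 1.0],
              [-1.0, -1.0, -1.0, -11.0]])

row_radii = np.abs(A).sum(axis=1) - np.abs(np.diag(A))
col_radii = np.abs(A).sum(axis=0) - np.abs(np.diag(A))
print(list(zip(np.diag(A), row_radii)))  # (10,2), (8,0.6), (2,3), (-11,3)
print(list(zip(np.diag(A), col_radii)))  # columns give (2,1.2), (-11,2.2)
print(np.linalg.eigvals(A))              # approx 9.8218, 8.1478, 1.8995, -10.86
```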

    5.6 See also

    • For matrices with non-negative entries, see Perron–Frobenius theorem.


    • Metzler matrix
    • Doubly stochastic matrix
    • Muirhead's inequality
    • Hurwitz matrix

    5.7 References

    • Gerschgorin, S. "Über die Abgrenzung der Eigenwerte einer Matrix". Izv. Akad. Nauk. USSR Otd. Fiz.-Mat. Nauk 6, 749–754, 1931.
    • Varga, R. S. Geršgorin and His Circles. Berlin: Springer-Verlag, 2004. ISBN 3-540-21100-4. Errata.
    • Richard S. Varga 2002 Matrix Iterative Analysis, Second ed. (of 1962 Prentice Hall edition), Springer-Verlag.
    • Golub, G. H.; Van Loan, C. F. (1996). Matrix Computations. Baltimore: Johns Hopkins University Press. p. 320. ISBN 0-8018-5413-X.

    5.8 External links

    • "Gershgorin's circle theorem" at PlanetMath.org.
    • Eric W. Weisstein. "Gershgorin Circle Theorem". From MathWorld, a Wolfram Web Resource.
    • Semyon Aranovich Gershgorin biography at MacTutor

Chapter 6

Golden–Thompson inequality

    In physics and mathematics, the Golden–Thompson inequality, proved independently by Golden (1965) and Thompson (1965), says that for Hermitian matrices A and B,

    tr e^{A+B} ≤ tr (e^A e^B),

    where tr is the trace and e^A is the matrix exponential. This trace inequality is of particular significance in statistical mechanics, and was first derived in that context.

    Bertram Kostant (1973) used the Kostant convexity theorem to generalize the Golden–Thompson inequality to all compact Lie groups.
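    The inequality is easy to test numerically with scipy.linalg.expm; a sketch on random Hermitian matrices:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(4)

def random_hermitian(n):
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return (X + X.conj().T) / 2

A, B = random_hermitian(5), random_hermitian(5)
lhs = np.trace(expm(A + B)).real
rhs = np.trace(expm(A) @ expm(B)).real   # real for Hermitian A, B
print(lhs <= rhs + 1e-10)                # True: tr e^{A+B} <= tr e^A e^B
```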

    6.1 References

    • Bhatia, Rajendra (1997), Matrix Analysis, Graduate Texts in Mathematics 169, Berlin, New York: Springer-Verlag, ISBN 978-0-387-94846-1, MR 1477662
    • J. E. Cohen, S. Friedland, T. Kato, F. Kelly, "Eigenvalue inequalities for products of matrix exponentials", Linear Algebra and its Applications, Vol. 45, pp. 55–95, 1982. doi:10.1016/0024-3795(82)90211-7
    • Golden, Sidney (1965), "Lower bounds for the Helmholtz function", Phys. Rev., Series II 137: B1127–B1128, doi:10.1103/PhysRev.137.B1127, MR 0189691
    • Kostant, Bertram (1973), "On convexity, the Weyl group and the Iwasawa decomposition", Annales Scientifiques de l'École Normale Supérieure. Quatrième Série 6: 413–455, ISSN 0012-9593, MR 0364552
    • D. Petz, A survey of trace inequalities, in Functional Analysis and Operator Theory, 287–298, Banach Center Publications, 30 (Warszawa 1994).
    • Thompson, Colin J. (1965), "Inequality with applications in statistical mechanics", Journal of Mathematical Physics 6: 1812–1813, doi:10.1063/1.1704727, ISSN 0022-2488, MR 0189688

    6.2 External links

    • Tao, T. (2010), The Golden–Thompson inequality


Chapter 7

    Graded (mathematics)

    For other uses of graded, see Grade.

    In mathematics, the term graded has a number of meanings, mostly related.

    In abstract algebra, it refers to a family of concepts:

    • An algebraic structure X is said to be I-graded for an index set I if it has a gradation or grading, i.e. a decomposition into a direct sum X = ⊕_{i∈I} X_i of structures; the elements of X_i are said to be homogeneous of degree i.
    • The index set I is most commonly ℕ or ℤ, and may be required to have extra structure depending on the type of X.
    • Grading by ℤ₂ (i.e. ℤ/2ℤ) is also important.
    • The trivial (ℤ- or ℕ-) gradation has X₀ = X, X_i = 0 for i ≠ 0, and a suitable trivial structure 0.
    • An algebraic structure is said to be doubly graded if the index set is a direct product of sets; the pairs may be called bidegrees (e.g. see spectral sequence).

    An I-graded vector space, or graded linear space, is thus a vector space with a decomposition into a direct sum V = ⊕_{i∈I} V_i of spaces. A graded linear map is a map between graded vector spaces respecting their gradations.


Recommended