
    Part IB of the Mathematical Tripos

    of the University of Cambridge

    Michaelmas 2012

    Linear Algebra

    Lectured by:

    Prof. I. Grojnowski

    Notes by:

    Alex Chan

    Comments and corrections should be sent to [email protected].

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

The following resources are not endorsed by the University of Cambridge.

    Printed Friday, 11 January 2013.


    Course schedule

Definition of a vector space (over R or C), subspaces, the space spanned by a subset. Linear independence, bases, dimension. Direct sums and complementary subspaces. [3]

Linear maps, isomorphisms. Relation between rank and nullity. The space of linear maps from U to V, representation by matrices. Change of basis. Row rank and column rank. [4]

Determinant and trace of a square matrix. Determinant of a product of two matrices and of the inverse matrix. Determinant of an endomorphism. The adjugate matrix. [3]

Eigenvalues and eigenvectors. Diagonal and triangular forms. Characteristic and minimal polynomials. Cayley-Hamilton Theorem over C. Algebraic and geometric multiplicity of eigenvalues. Statement and illustration of Jordan normal form. [4]

Dual of a finite-dimensional vector space, dual bases and maps. Matrix representation, rank and determinant of dual map. [2]

Bilinear forms. Matrix representation, change of basis. Symmetric forms and their link with quadratic forms. Diagonalisation of quadratic forms. Law of inertia, classification by rank and signature. Complex Hermitian forms. [4]

Inner product spaces, orthonormal sets, orthogonal projection, $V = W \oplus W^\perp$. Gram-Schmidt orthogonalisation. Adjoints. Diagonalisation of Hermitian matrices. Orthogonality of eigenvectors and properties of eigenvalues. [4]


    Contents

1 Vector spaces
  1.1 Definitions
  1.2 Subspaces
  1.3 Bases
  1.4 Linear maps and matrices
  1.5 Conservation of dimension: the rank-nullity theorem
  1.6 Sums and intersections of subspaces

2 Endomorphisms
  2.1 Determinants

3 Jordan normal form
  3.1 Eigenvectors and eigenvalues
  3.2 Cayley-Hamilton theorem
  3.3 Combinatorics of nilpotent matrices
  3.4 Applications of JNF

4 Duals

5 Bilinear forms
  5.1 Symmetric forms
  5.2 Anti-symmetric forms

6 Hermitian forms
  6.1 Inner product spaces
  6.2 Hermitian adjoints for inner products


    1 Vector spaces

    1.1 Definitions

5 Oct

We start by fixing a field, $F$. We say that $F$ is a field if:

• $F$ is an abelian group under an operation called addition, $(+)$, with additive identity $0$;

• $F \setminus \{0\}$ is an abelian group under an operation called multiplication, $(\cdot)$, with multiplicative identity $1$;

• multiplication is distributive over addition; that is, $a \cdot (b + c) = a \cdot b + a \cdot c$ for all $a, b, c \in F$.

Fields we've encountered before include the reals $\mathbb{R}$, the complex numbers $\mathbb{C}$, the ring of integers modulo $p$, $\mathbb{Z}/p = \mathbb{F}_p$, the rationals $\mathbb{Q}$, as well as $\mathbb{Q}(\sqrt{3}) = \{a + b\sqrt{3} : a, b \in \mathbb{Q}\}$, ...

Everything we will discuss works over any field, but it's best to have $\mathbb{R}$ and $\mathbb{C}$ in mind, since that's what we're most familiar with.

Definition. A vector space over $F$ is a tuple $(V, +, \cdot)$ consisting of a set $V$, operations $+ \colon V \times V \to V$ (vector addition) and $\cdot \colon F \times V \to V$ (scalar multiplication) such that

(i) $(V, +)$ is an abelian group, that is:

• Associative: for all $v_1, v_2, v_3 \in V$, $(v_1 + v_2) + v_3 = v_1 + (v_2 + v_3)$;
• Commutative: for all $v_1, v_2 \in V$, $v_1 + v_2 = v_2 + v_1$;
• Identity: there is some (unique) $0 \in V$ such that, for all $v \in V$, $0 + v = v = v + 0$;
• Inverse: for all $v \in V$, there is some $u \in V$ with $u + v = v + u = 0$. This inverse is unique, and often denoted $-v$.

(ii) Scalar multiplication satisfies:

• Associative: for all $\lambda_1, \lambda_2 \in F$, $v \in V$, $\lambda_1 \cdot (\lambda_2 \cdot v) = (\lambda_1 \lambda_2) \cdot v$;
• Identity: for all $v \in V$, the unit $1 \in F$ acts by $1 \cdot v = v$;
• $\cdot$ distributes over $+_V$: for all $\lambda \in F$, $v_1, v_2 \in V$, $\lambda \cdot (v_1 + v_2) = \lambda \cdot v_1 + \lambda \cdot v_2$;
• $+_F$ distributes over $\cdot$: for all $\lambda_1, \lambda_2 \in F$, $v \in V$, $(\lambda_1 + \lambda_2) \cdot v = \lambda_1 \cdot v + \lambda_2 \cdot v$.

We usually say "the vector space $V$" rather than $(V, +, \cdot)$.


Let's look at some examples:

Examples 1.1.

(i) $\{0\}$ is a vector space.

(ii) Vectors in the plane under vector addition form a vector space.

(iii) The space of $n$-tuples with entries in $F$, denoted $F^n = \{(a_1, \ldots, a_n) : a_i \in F\}$, with component-wise addition

$(a_1, \ldots, a_n) + (b_1, \ldots, b_n) = (a_1 + b_1, \ldots, a_n + b_n)$

and scalar multiplication

$\lambda \cdot (a_1, \ldots, a_n) = (\lambda a_1, \ldots, \lambda a_n).$

Proving that this is a vector space is an exercise. It is also a special case of the next example.

(iv) Let $X$ be any set, and $F^X = \{f \colon X \to F\}$ be the set of all functions $X \to F$. This is a vector space, with addition defined pointwise:

$(f + g)(x) = f(x) + g(x)$

and scalar multiplication also defined pointwise:

$(\lambda \cdot f)(x) = \lambda \cdot f(x),$

for $\lambda \in F$, $f, g \in F^X$, $x \in X$. If $X = \{1, \ldots, n\}$, then $F^X = F^n$ and we have the previous example.

Proof that $F^X$ is a vector space. As $+$ in $F$ is commutative, we have

$(f + g)(x) = f(x) + g(x) = g(x) + f(x) = (g + f)(x),$

so $f + g = g + f$. Similarly, $+$ in $F$ associative implies $f + (g + h) = (f + g) + h$, and $(-f)(x) = -f(x)$ and $0(x) = 0$ give the additive inverse and the zero function. Axioms for scalar multiplication follow from the relationship between $\cdot$ and $+$ in $F$. Check this yourself!

(v) $\mathbb{C}$ is a vector space over $\mathbb{R}$.

Lemma 1.2. Let $V$ be a vector space over $F$.

(i) For all $\lambda \in F$, $\lambda \cdot 0 = 0$, and for all $v \in V$, $0 \cdot v = 0$.

(ii) Conversely, if $\lambda \cdot v = 0$ and $\lambda \in F$ has $\lambda \neq 0$, then $v = 0$.

(iii) For all $v \in V$, $(-1) \cdot v = -v$.

Proof.

(i) $\lambda \cdot 0 = \lambda \cdot (0 + 0) = \lambda \cdot 0 + \lambda \cdot 0 \implies \lambda \cdot 0 = 0$. Similarly, $0 \cdot v = (0 + 0) \cdot v = 0 \cdot v + 0 \cdot v \implies 0 \cdot v = 0$.

(ii) As $\lambda \in F$, $\lambda \neq 0$, there exists $\lambda^{-1} \in F$ such that $\lambda^{-1}\lambda = 1$, so $v = (\lambda^{-1}\lambda) \cdot v = \lambda^{-1} \cdot (\lambda \cdot v)$; hence if $\lambda \cdot v = 0$, we get $v = \lambda^{-1} \cdot 0 = 0$ by (i).

(iii) $0 = 0 \cdot v = (1 + (-1)) \cdot v = 1 \cdot v + ((-1) \cdot v) = v + ((-1) \cdot v)$, so $(-1) \cdot v = -v$.

We will write $\lambda v$ rather than $\lambda \cdot v$ from now on, as the lemma means this will not cause any confusion.


    1.2 Subspaces

Definition. Let $V$ be a vector space over $F$. A subset $U \subseteq V$ is a vector subspace (or just a subspace), written $U \leq V$, if the following holds:

(i) $0 \in U$;

(ii) if $u_1, u_2 \in U$, then $u_1 + u_2 \in U$;

(iii) if $u \in U$, $\lambda \in F$, then $\lambda u \in U$.

Equivalently, $U$ is a subspace if $U \subseteq V$, $U \neq \emptyset$ ($U$ is non-empty) and for all $u, v \in U$, $\lambda, \mu \in F$, $\lambda u + \mu v \in U$.

Lemma 1.3. If $V$ is a vector space over $F$ and $U \leq V$, then $U$ is a vector space over $F$ under the restriction of the operations $+$ and $\cdot$ on $V$ to $U$. (Proof is an exercise.)

Examples 1.4.

(i) $\{0\}$ and $V$ are always subspaces of $V$.

(ii) $\{(r_1, \ldots, r_n, 0, \ldots, 0) : r_i \in \mathbb{R}\} \leq \mathbb{R}^{n+m}$ is a subspace of $\mathbb{R}^{n+m}$.

(iii) The following are all subspaces of sets of functions:

$C^1(\mathbb{R}) = \{f \colon \mathbb{R} \to \mathbb{R} \mid f \text{ continuous and differentiable}\}$
$C(\mathbb{R}) = \{f \colon \mathbb{R} \to \mathbb{R} \mid f \text{ continuous}\}$
$\mathbb{R}^{\mathbb{R}} = \{f \colon \mathbb{R} \to \mathbb{R}\}.$

Proof. $f, g$ continuous implies $f + g$ is, and $\lambda f$ is, for $\lambda \in \mathbb{R}$; the zero function is continuous, so $C(\mathbb{R})$ is a subspace of $\mathbb{R}^{\mathbb{R}}$, and similarly for $C^1(\mathbb{R})$.

(iv) Let $X$ be any set, and write

$F[X] = (F^X)_{\mathrm{fin}} = \{f \colon X \to F \mid f(x) \neq 0 \text{ for only finitely many } x \in X\}.$

This is the set of finitely supported functions, which is a subspace of $F^X$.

Proof that this is a subspace. $f(x) = 0 \implies \lambda f(x) = 0$, so if $f \in (F^X)_{\mathrm{fin}}$, then so is $\lambda f$. Similarly,

$(f + g)^{-1}(F \setminus \{0\}) \subseteq f^{-1}(F \setminus \{0\}) \cup g^{-1}(F \setminus \{0\}),$

and if these two are finite, so is the LHS.

Special case. Consider the case $X = \mathbb{N}$, so

$F[\mathbb{N}] = (F^{\mathbb{N}})_{\mathrm{fin}} = \{(\lambda_0, \lambda_1, \ldots) \mid \text{only finitely many } \lambda_i \text{ are non-zero}\}.$

We write $x^i$ for the function which sends $i \mapsto 1$, $j \mapsto 0$ if $j \neq i$; that is, for the tuple $(0, \ldots, 0, 1, 0, \ldots)$ with the $1$ in the $i$th place. Thus

$F[\mathbb{N}] = \left\{ \sum \lambda_i x^i \mid \text{only finitely many } \lambda_i \text{ non-zero} \right\}.$


Note that we can do better than a vector space here; we can define multiplication by

$\left( \sum_i \lambda_i x^i \right) \cdot \left( \sum_j \mu_j x^j \right) = \sum_{i,j} \lambda_i \mu_j\, x^{i+j}.$

This is still in $F[\mathbb{N}]$. It is more usual to denote this $F[x]$, the polynomials in $x$ over $F$ (and this is a formal definition of the polynomial ring).
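This construction is easy to play with on a computer. Below is a minimal Python sketch (not from the notes; the names `add` and `poly_mul` are ad hoc) representing finitely supported functions $\mathbb{N} \to F$ as dictionaries, with the convolution product that makes $F[\mathbb{N}]$ into $F[x]$; it takes $F = \mathbb{Q}$ via Python's `Fraction` purely for illustration.

```python
from fractions import Fraction

def add(f, g):
    """Pointwise addition of finitely supported functions N -> Q."""
    h = dict(f)
    for i, c in g.items():
        h[i] = h.get(i, Fraction(0)) + c
    return {i: c for i, c in h.items() if c != 0}

def poly_mul(f, g):
    """(sum_i a_i x^i)(sum_j b_j x^j) = sum_{i,j} a_i b_j x^{i+j}."""
    h = {}
    for i, a in f.items():
        for j, b in g.items():
            h[i + j] = h.get(i + j, Fraction(0)) + a * b
    return {k: c for k, c in h.items() if c != 0}

# (1 + x)(1 - x) = 1 - x^2
p = {0: Fraction(1), 1: Fraction(1)}
q = {0: Fraction(1), 1: Fraction(-1)}
print(poly_mul(p, q))   # {0: Fraction(1, 1), 2: Fraction(-1, 1)}
```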

    1.3 Bases

8 Oct

Definition. Suppose $V$ is a vector space over $F$, and $S \subseteq V$ is a subset of $V$. Then $v$ is a linear combination of elements of $S$ if there is some $n > 0$ and $\lambda_1, \ldots, \lambda_n \in F$, $v_1, \ldots, v_n \in S$ such that $v = \lambda_1 v_1 + \cdots + \lambda_n v_n$, or if $v = 0$. Write $\langle S \rangle$ for the span of $S$, the set of all linear combinations of elements of $S$.

Notice that it is important in the definition to use only finitely many elements: infinite sums do not make sense in arbitrary vector spaces. We will see later why it is convenient notation to say that $0$ is a linear combination of $n = 0$ elements of $S$.

Example 1.5. $\langle \emptyset \rangle = \{0\}$.

Lemma 1.6.

(i) $\langle S \rangle$ is a subspace of $V$.

(ii) If $W \leq V$ is a subspace and $S \subseteq W$, then $\langle S \rangle \subseteq W$; that is, $\langle S \rangle$ is the smallest subspace of $V$ containing $S$.

Proof. (i) is immediate from the definition. (ii) is immediate, by (i) applied to $W$.

Definition. We say that $S$ spans $V$ if $\langle S \rangle = V$.

Example 1.7. The set $\{(1, 0, 0), (0, 1, 0), (1, 1, 0), (7, 8, 0)\}$ spans $W = \{(x, y, z) \mid z = 0\} \leq \mathbb{R}^3$.

Definition. Let $v_1, \ldots, v_n$ be a sequence of elements in $V$. We say they are linearly dependent if there exist $\lambda_1, \ldots, \lambda_n \in F$, not all zero, such that

$\sum_{i=1}^n \lambda_i v_i = 0,$

which we call a linear relation among the $v_i$. We say that $v_1, \ldots, v_n$ are linearly independent if they are not linearly dependent; that is, if there is no linear relation among them, or equivalently if

$\sum_{i=1}^n \lambda_i v_i = 0 \implies \lambda_i = 0 \text{ for all } i.$

We say that a subset $S \subseteq V$ is linearly independent if every finite sequence of distinct elements in $S$ is linearly independent.
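As a computational aside (not part of the notes), over $\mathbb{R}$ these definitions can be tested numerically: vectors are linearly independent precisely when the matrix with them as columns has rank equal to the number of vectors, and $v \in \langle S \rangle$ precisely when appending $v$ does not increase the rank. A small NumPy sketch, using the spanning set of Example 1.7:

```python
import numpy as np

def is_independent(vectors):
    """Columns are linearly independent iff rank == number of vectors."""
    A = np.column_stack(vectors)
    return np.linalg.matrix_rank(A) == len(vectors)

def in_span(v, vectors):
    """v lies in <vectors> iff appending v does not increase the rank."""
    A = np.column_stack(vectors)
    return np.linalg.matrix_rank(np.column_stack([A, v])) == np.linalg.matrix_rank(A)

S = [np.array([1., 0., 0.]), np.array([0., 1., 0.]),
     np.array([1., 1., 0.]), np.array([7., 8., 0.])]
print(is_independent(S))                    # False: four vectors spanning a 2-dim subspace
print(in_span(np.array([2., 3., 0.]), S))   # True: z = 0, so it is in the span
print(in_span(np.array([0., 0., 1.]), S))   # False
```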


Note that if $v_1, \ldots, v_n$ is linearly independent, then so is every reordering $v_{\sigma(1)}, \ldots, v_{\sigma(n)}$.

If $v_1, \ldots, v_n$ are linearly independent, and $v_{i_1}, \ldots, v_{i_k}$ is a subsequence, then the subsequence is also linearly independent.

If some $v_i = 0$, then $1 \cdot 0 = 0$ is a linear relation, so $v_1, \ldots, v_n$ is not linearly independent.

If $v_i = v_j$ for some $i \neq j$, then $1 \cdot v_i + (-1) \cdot v_j = 0$ is a linear relation, so the sequence isn't linearly independent.

If $|S|$ ...


Examples 1.9.

(i) By convention, the vector space $\{0\}$ has $\emptyset$ as a basis.

(ii) $S = \{e_1, \ldots, e_n\}$, where $e_i$ is the vector of all zeroes except for a one in the $i$th position, is a basis of $F^n$ called the standard basis.

(iii) $F[x] = F[\mathbb{N}] = (F^{\mathbb{N}})_{\mathrm{fin}}$ has basis $\{1, x, x^2, \ldots\}$.

More generally, $F[X]$ has $\{x \mid x \in X\}$ as a basis, where

$x(y) = \begin{cases} 1 & \text{if } x = y, \\ 0 & \text{otherwise,} \end{cases}$

so $F[X]$ is, formally, the set of linear combinations of elements of $X$.

For amusement: $F[\mathbb{N}] \subseteq F^{\mathbb{N}}$, and $1, x, x^2, \ldots$ are linearly independent in $F^{\mathbb{N}}$ as they are linearly independent in $F[\mathbb{N}]$, but they do not span $F^{\mathbb{N}}$, as $(1, 1, 1, \ldots) \notin F[\mathbb{N}]$. Show that if a basis of $F^{\mathbb{N}}$ exists, then it is uncountable.

Lemma 1.10. A set $S$ is a basis of $V$ if and only if every vector $v \in V$ can be written uniquely as a linear combination of elements of $S$.

Proof. ($\Leftarrow$) Writing $v$ as a linear combination of elements of $S$ for every $v \in V$ means that $\langle S \rangle = V$. Uniquely means that, in particular, $0$ can be written uniquely, and so $S$ is linearly independent.

($\Rightarrow$) If $v = \sum_{i=1}^n \lambda_i v_i = \sum_{i=1}^n \mu_i v_i$, where $v_i \in S$ and $i = 1, \ldots, n$, then $\sum_{i=1}^n (\lambda_i - \mu_i) v_i = 0$, and since the $v_i$ are linearly independent, $\lambda_i = \mu_i$ for all $i$.

Observe: if $S$ is a basis of $V$, $|S| = d$ and $|F| = q$ ...

Lemma. If $A$ is upper triangular, that is, $a_{ij} = 0$ for $i > j$, then $\det A = a_{11} \cdots a_{nn}$.

Proof. From the definition of determinant:

$\det A = \sum_{\sigma \in S_n} \varepsilon(\sigma)\, a_{1,\sigma(1)} \cdots a_{n,\sigma(n)}.$

If a product contributes, then we must have $\sigma(i) \geq i$ for all $i = 1, \ldots, n$. Since $\sigma$ is a bijection, this forces $\sigma(n) = n$, then $\sigma(n-1) = n-1$, and so on. Thus the only term that contributes is the identity, $\sigma = \mathrm{id}$, and $\det A = a_{11} \cdots a_{nn}$.

Lemma 2.3. $\det A^T = \det A$, where $(A^T)_{ij} = A_{ji}$ is the transpose.

Proof. From the definition of determinant, we have

$\det A^T = \sum_{\sigma \in S_n} \varepsilon(\sigma)\, a_{\sigma(1),1} \cdots a_{\sigma(n),n} = \sum_{\sigma \in S_n} \varepsilon(\sigma) \prod_{i=1}^n a_{\sigma(i),i}.$

Now $\prod_{i=1}^n a_{\sigma(i),i} = \prod_{i=1}^n a_{i,\sigma^{-1}(i)}$, since they contain the same factors but in a different order. We relabel the indices accordingly:

$\det A^T = \sum_{\sigma \in S_n} \varepsilon(\sigma) \prod_{k=1}^n a_{k,\sigma^{-1}(k)}.$

Now since $\varepsilon$ is a group homomorphism, we have $\varepsilon(\sigma^{-1})\,\varepsilon(\sigma) = \varepsilon(1) = 1$, and thus $\varepsilon(\sigma) = \varepsilon(\sigma^{-1})$. We also note that just as $\sigma$ runs through $S_n$, so does $\sigma^{-1}$. We thus have

$\det A^T = \sum_{\sigma \in S_n} \varepsilon(\sigma) \prod_{k=1}^n a_{k,\sigma(k)} = \det A.$
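For concreteness, here is a short Python sketch (illustrative only, not from the notes) that evaluates the defining sum $\det A = \sum_{\sigma \in S_n} \varepsilon(\sigma)\, a_{1,\sigma(1)} \cdots a_{n,\sigma(n)}$ directly and checks it against NumPy, including the identity $\det A^T = \det A$:

```python
import itertools
import numpy as np

def sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                       for j in range(i + 1, len(perm))
                       if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det_leibniz(A):
    """det A = sum over sigma of sign(sigma) * a_{1,sigma(1)} ... a_{n,sigma(n)}."""
    n = len(A)
    return sum(sign(p) * np.prod([A[i][p[i]] for i in range(n)])
               for p in itertools.permutations(range(n)))

A = np.array([[1., 7., 1.], [3., 4., 1.], [2., 3., 0.]])
print(det_leibniz(A), np.linalg.det(A))   # both give the same value
print(det_leibniz(A.T))                   # det A^T = det A
```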


Writing $v_i$ for the $i$th column of $A$, we can consider $A$ as an $n$-tuple of column vectors, $A = (v_1, \ldots, v_n)$. Then $\mathrm{Mat}_n(F) = F^n \times \cdots \times F^n$, and $\det$ is a function $F^n \times \cdots \times F^n \to F$.

Proposition 2.4. The function $\det \colon \mathrm{Mat}_n(F) \to F$ is multilinear; that is, it is linear in each column of the matrix separately, so:

$\det(v_1, \ldots, \lambda_i v_i, \ldots, v_n) = \lambda_i \det(v_1, \ldots, v_i, \ldots, v_n),$
$\det(v_1, \ldots, v_i' + v_i'', \ldots, v_n) = \det(v_1, \ldots, v_i', \ldots, v_n) + \det(v_1, \ldots, v_i'', \ldots, v_n).$

We can combine this into the single condition

$\det(v_1, \ldots, \lambda_i' v_i' + \lambda_i'' v_i'', \ldots, v_n) = \lambda_i' \det(v_1, \ldots, v_i', \ldots, v_n) + \lambda_i'' \det(v_1, \ldots, v_i'', \ldots, v_n).$

Proof. Immediate from the definition: $\det A$ is a sum of terms $\varepsilon(\sigma)\, a_{1,\sigma(1)} \cdots a_{n,\sigma(n)}$, each of which contains exactly one factor from the $i$th column, namely $a_{\sigma^{-1}(i),i}$. If this factor is $\lambda_i' a'_{\sigma^{-1}(i),i} + \lambda_i'' a''_{\sigma^{-1}(i),i}$, then the determinant expands as claimed.

Example 2.5. If we split a matrix along a single column, such as below, then $\det A = \det A' + \det A''$.

$\det \begin{pmatrix} 1 & 7 & 1 \\ 3 & 4 & 1 \\ 2 & 3 & 0 \end{pmatrix} = \det \begin{pmatrix} 1 & 3 & 1 \\ 3 & 2 & 1 \\ 2 & 1 & 0 \end{pmatrix} + \det \begin{pmatrix} 1 & 4 & 1 \\ 3 & 2 & 1 \\ 2 & 2 & 0 \end{pmatrix}$

Observe how the first and third columns remain the same, and only the second column changes. (Don't get confused: note that $\det(A + B) \neq \det A + \det B$ for general $A$ and $B$.)

Corollary 2.6. $\det(\lambda A) = \lambda^n \det A$.

Proof. This follows immediately from the definition, or from applying the result of Proposition 2.4 multiple times.

Proposition 2.7. If two columns of $A$ are the same, then $\det A = 0$.

Proof. Suppose $v_i$ and $v_j$ are the same. Let $\tau = (i\ j)$ be the transposition in $S_n$ which swaps $i$ and $j$. Then $S_n = A_n \sqcup \tau A_n$, where $A_n = \ker(\varepsilon \colon S_n \to \{\pm 1\})$. We will prove the result by splitting the sum

$\det A = \sum_{\sigma \in S_n} \varepsilon(\sigma) \prod_{i=1}^n a_{i,\sigma(i)}$

into a sum over these two cosets of $A_n$, observing that for all $\sigma \in A_n$, $\varepsilon(\sigma) = 1$ and $\varepsilon(\tau\sigma) = -1$.

Now, for all $\sigma \in A_n$ we have

$a_{1,\tau\sigma(1)} \cdots a_{n,\tau\sigma(n)} = a_{1,\sigma(1)} \cdots a_{n,\sigma(n)},$

as if $\sigma(k) \notin \{i, j\}$, then $\tau\sigma(k) = \sigma(k)$, and if $\sigma(k) = i$, then $a_{k,\tau\sigma(k)} = a_{k,\tau(i)} = a_{k,j} = a_{k,i} = a_{k,\sigma(k)}$, and similarly if $\sigma(k) = j$. Hence

$\det A = \sum_{\sigma \in A_n} \left( \prod_{i=1}^n a_{i,\sigma(i)} - \prod_{i=1}^n a_{i,\tau\sigma(i)} \right) = 0.$


Proposition 2.8. If $I$ is the identity matrix, then $\det I = 1$.

    Proof. Immediate.

Theorem 2.9

These three properties (Propositions 2.4, 2.7 and 2.8) characterise the function $\det$.

    Before proving this, we need some language.

Definition. A function $f \colon F^n \times \cdots \times F^n \to F$ is a volume form on $F^n$ if

(i) it is multilinear; that is,

$f(v_1, \ldots, \lambda_i v_i, \ldots, v_n) = \lambda_i f(v_1, \ldots, v_i, \ldots, v_n),$
$f(v_1, \ldots, v_i' + v_i'', \ldots, v_n) = f(v_1, \ldots, v_i', \ldots, v_n) + f(v_1, \ldots, v_i'', \ldots, v_n).$

We saw earlier that we can write this as a single condition:

$f(v_1, \ldots, \lambda_i' v_i' + \lambda_i'' v_i'', \ldots, v_n) = \lambda_i' f(v_1, \ldots, v_i', \ldots, v_n) + \lambda_i'' f(v_1, \ldots, v_i'', \ldots, v_n).$

(ii) It is alternating; that is, whenever $i \neq j$ and $v_i = v_j$, then $f(v_1, \ldots, v_n) = 0$.

Example 2.10. We have seen that $\det \colon F^n \times \cdots \times F^n \to F$ is a volume form. It is a volume form $f$ with $f(e_1, \ldots, e_n) = 1$ (that is, $\det I = 1$).

Remark. Let's explain the name volume form. Let $F = \mathbb{R}$, and consider the volume of a rectangular box with a corner at $0$ and sides defined by $v_1, \ldots, v_n$ in $\mathbb{R}^n$. The volume of this box is a function of $v_1, \ldots, v_n$ that almost satisfies the properties above. It doesn't quite satisfy linearity, as the volume of a box with sides defined by $-v_1, v_2, \ldots, v_n$ is the same as that of the box with sides defined by $v_1, \ldots, v_n$, but this is the only problem. (Exercise: check that the other properties of a volume form are immediate for volumes of rectangular boxes.) You should think of this as saying that a volume form gives a signed version of the volume of a rectangular box (and the actual volume is the absolute value). In any case, this explains the name. You've also seen this in multi-variable calculus, in the way that the determinant enters into the formula for what happens to integrals when you change coordinates.

Theorem 2.11: Precise form

The set of volume forms forms a vector space of dimension 1. This line is called the determinant line.

Proof. 24 Oct. It is immediate from the definition that volume forms are a vector space. Let $e_1, \ldots, e_n$ be a basis of $V$ with $n = \dim V$. Every element of $V^n$ is of the form

$\left( \sum_i a_{i1} e_i,\ \sum_i a_{i2} e_i,\ \ldots,\ \sum_i a_{in} e_i \right),$


with $a_{ij} \in F$ (that is, we have an isomorphism of sets $V^n \cong \mathrm{Mat}_n(F)$). So if $f$ is a volume form, then

$f\left( \sum_{i_1=1}^n a_{i_1 1} e_{i_1}, \ldots, \sum_{i_n=1}^n a_{i_n n} e_{i_n} \right) = \sum_{i_1=1}^n a_{i_1 1}\, f\left( e_{i_1}, \sum_{i_2=1}^n a_{i_2 2} e_{i_2}, \ldots, \sum_{i_n=1}^n a_{i_n n} e_{i_n} \right) = \cdots = \sum_{1 \leq i_1, \ldots, i_n \leq n} a_{i_1 1} \cdots a_{i_n n}\, f(e_{i_1}, \ldots, e_{i_n}),$

by linearity in each variable. But as $f$ is alternating, $f(e_{i_1}, \ldots, e_{i_n}) = 0$ unless $i_1, \ldots, i_n$ is $1, \ldots, n$ in some order; that is,

$(i_1, \ldots, i_n) = (\sigma(1), \ldots, \sigma(n))$

for some $\sigma \in S_n$.

Claim. $f(e_{\sigma(1)}, \ldots, e_{\sigma(n)}) = \varepsilon(\sigma)\, f(e_1, \ldots, e_n)$.

Given the claim, we get that the sum above simplifies to

$\sum_{\sigma \in S_n} a_{\sigma(1),1} \cdots a_{\sigma(n),n}\, \varepsilon(\sigma)\, f(e_1, \ldots, e_n),$

and so the volume form is determined by $f(e_1, \ldots, e_n)$; that is, $\dim(\{\text{volume forms}\}) \leq 1$. But $\det \colon \mathrm{Mat}_n(F) \to F$ is a well-defined non-zero volume form, so we must have $\dim(\{\text{volume forms}\}) = 1$.

Note that we have just shown that for any volume form $f$,

$f(v_1, \ldots, v_n) = \det(v_1, \ldots, v_n)\, f(e_1, \ldots, e_n).$

So to finish our proof, we just have to prove our claim.

Proof of claim. First, for any $v_1, \ldots, v_n \in V$, we show that

$f(\ldots, v_i, \ldots, v_j, \ldots) = -f(\ldots, v_j, \ldots, v_i, \ldots),$

that is, swapping the $i$th and $j$th entries changes the sign. Applying multilinearity is enough to see this:

$0 = f(\ldots, v_i + v_j, \ldots, v_i + v_j, \ldots) = f(\ldots, v_i, \ldots, v_i, \ldots) + f(\ldots, v_j, \ldots, v_j, \ldots) + f(\ldots, v_i, \ldots, v_j, \ldots) + f(\ldots, v_j, \ldots, v_i, \ldots),$

where the left hand side and the first two terms on the right are zero because $f$ is alternating.

Now the claim follows, as an arbitrary permutation can be written as a product of transpositions, and $\varepsilon(\sigma) = (-1)^{\#\text{ of transpositions}}$.

Remark. Notice that if $\mathbb{Z}/2 \subseteq F$ is not a subfield (that is, if $1 + 1 \neq 0$), then for a multilinear form $f(x, y)$ to be alternating, it suffices that $f(x, y) = -f(y, x)$. This is because we have $f(x, x) = -f(x, x)$, so $2 f(x, x) = 0$, but $2 \neq 0$ and so $2^{-1}$ exists, giving $f(x, x) = 0$. If $2 = 0$, then $f(x, y) = -f(y, x)$ for any $f$, and the correct definition of alternating is $f(x, x) = 0$.

If that didn't make too much sense, don't worry: this is included for mathematical interest, and isn't essential to understand anything else in the course.

Remark. If $\sigma \in S_n$, then we can attach to it a matrix $P(\sigma) \in \mathrm{GL}_n$ by

$P(\sigma)_{ij} = \begin{cases} 1 & \text{if } \sigma^{-1}(i) = j, \\ 0 & \text{otherwise.} \end{cases}$


Exercises 2.12. Show that:

(i) $P(\sigma)$ has exactly one non-zero entry in each row and column, and that entry is a 1. Such a matrix is called a permutation matrix.

(ii) $P(\sigma)\, e_i = e_{\sigma(i)}$; hence

(iii) $P \colon S_n \to \mathrm{GL}_n$ is a group homomorphism;

(iv) $\varepsilon(\sigma) = \det P(\sigma)$.
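A quick numerical illustration of this remark and exercise (a sketch, not from the notes): build $P(\sigma)$ from a permutation given as a tuple, check the homomorphism property on $S_3$, and read off $\varepsilon(\sigma) = \det P(\sigma)$.

```python
import numpy as np
from itertools import permutations

def perm_matrix(sigma):
    """P(sigma) e_i = e_{sigma(i)}: entry (sigma(i), i) is 1, all others 0."""
    n = len(sigma)
    P = np.zeros((n, n))
    for i in range(n):
        P[sigma[i], i] = 1.0
    return P

# P is a homomorphism: P(sigma o tau) = P(sigma) P(tau)
for s in permutations(range(3)):
    for t in permutations(range(3)):
        st = tuple(s[t[i]] for i in range(3))   # composition sigma o tau
        assert np.allclose(perm_matrix(st), perm_matrix(s) @ perm_matrix(t))

# det P(sigma) = sign of sigma, for the six permutations of {0, 1, 2}
print([int(round(np.linalg.det(perm_matrix(s)))) for s in permutations(range(3))])
# [1, -1, -1, 1, 1, -1]
```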

Theorem 2.13

Let $A, B \in \mathrm{Mat}_n(F)$. Then $\det AB = \det A \det B$.

Slick proof. Fix $A \in \mathrm{Mat}_n(F)$, and consider $f \colon \mathrm{Mat}_n(F) \to F$ taking $f(B) = \det AB$. We observe that $f$ is a volume form. (Exercise: check this!) But then

$f(B) = \det B \cdot f(e_1, \ldots, e_n),$

and by the definition,

$f(e_1, \ldots, e_n) = f(I) = \det A,$

so $\det AB = \det A \det B$.

Corollary 2.14. If $A \in \mathrm{Mat}_n(F)$ is invertible, then $\det A^{-1} = 1/\det A$.

Proof. Since $AA^{-1} = I$, we have

$\det A \cdot \det A^{-1} = \det AA^{-1} = \det I = 1,$

by the theorem, and rearranging gives the result.

Corollary 2.15. If $P \in \mathrm{GL}_n$, then

$\det(PAP^{-1}) = \det P \det A \det P^{-1} = \det A.$

Definition. Let $\varphi \colon V \to V$ be a linear map. Define $\det \varphi \in F$ as follows: choose any basis $b_1, \ldots, b_n$ of $V$, and let $A$ be the matrix of $\varphi$ with respect to the basis. Set $\det \varphi = \det A$, which is well-defined by the corollary.

Remark. Here is a coordinate-free definition of $\det \varphi$.

Pick any volume form $f$ for $V$, $f \neq 0$. Then

$(x_1, \ldots, x_n) \mapsto f(\varphi x_1, \ldots, \varphi x_n) = (f \circ \varphi)(x_1, \ldots, x_n)$

is also a volume form. But the space of volume forms is one-dimensional, so there is some $\lambda \in F$ with $f \circ \varphi = \lambda f$, and we define

$\det \varphi = \lambda.$

(Though this definition is independent of a basis, we haven't gained much, as we needed to choose a basis to say anything about it.)

Proof 2 of $\det AB = \det A \det B$. We first observe that it's true if $B$ is an elementary column operation; that is, $B = I + \lambda E_{ij}$ with $i \neq j$. Then $\det B = 1$. But

$\det AB = \det A + \lambda \det A',$


where $A'$ is $A$ except that its $j$th column is replaced by the $i$th column of $A$. But then $\det A' = 0$, as two columns are the same, so $\det AB = \det A = \det A \det B$.

Next, if $B$ is the permutation matrix $P((i\ j)) = s_{ij}$, that is, the matrix obtained from the identity matrix by swapping the $i$th and $j$th columns, then $\det B = -1$, but $A s_{ij}$ is $A$ with its $i$th and $j$th columns swapped, so $\det AB = -\det A = \det A \det B$.

Finally, if $B$ is a matrix of zeroes with $r$ ones along the leading diagonal, then if $r = n$, then $B = I$ and $\det B = 1$. If $r < n$, then $\det B = 0$. But then if $r < n$, $AB$ has some columns which are zero, so $\det AB = 0$, and so the theorem is true for these $B$ also.

Now any $B \in \mathrm{Mat}_n(F)$ can be written as a product of these three types of matrices. So if $B = X_1 \cdots X_m$ is a product of these three types of matrices, then

$\det AB = \det\big((AX_1 \cdots X_{m-1}) X_m\big) = \det(AX_1 \cdots X_{m-1}) \det X_m = \cdots = \det A \det X_1 \cdots \det X_m = \det A \det(X_1 \cdots X_m) = \det A \det B.$

Remark. That determinants behave well with respect to row and column operations is also a useful way for humans (as opposed to machines!) to compute determinants.

Proposition 2.16. Let $A \in \mathrm{Mat}_n(F)$. Then the following are equivalent:

(i) $A$ is invertible;

(ii) $\det A \neq 0$;

(iii) $r(A) = n$.

Proof. (i) $\implies$ (ii): follows since $\det A^{-1} = 1/\det A$.

(iii) $\implies$ (i): from the rank-nullity theorem, we have

$r(A) = n \iff \ker A = \{0\} \iff A \text{ invertible}.$

Finally we must show (ii) $\implies$ (iii). If $r(A) < n$, then $\ker A \neq \{0\}$, so there is some $\lambda = (\lambda_1, \ldots, \lambda_n)^T \in F^n$ such that $A\lambda = 0$, and $\lambda_k \neq 0$ for some $k$. Now let $B$ be the identity matrix with its $k$th column replaced by $\lambda$:

$B = (e_1\ \cdots\ e_{k-1}\ \lambda\ e_{k+1}\ \cdots\ e_n).$

Then $\det B = \lambda_k \neq 0$, but $AB$ is a matrix whose $k$th column is $A\lambda = 0$, so $\det AB = 0$; that is, $\det A = 0$, since $\lambda_k \neq 0$.

This is a horrible and unenlightening proof that $\det A \neq 0$ implies the existence of $A^{-1}$. A good proof would write the matrix coefficients of $A^{-1}$ in terms of $(\det A)^{-1}$ and the matrix coefficients of $A$. We will now do this, after showing some further properties of the determinant.

We can compute $\det A$ by expanding along any column or row.


Definition. Let $A_{ij}$ be the matrix obtained from $A$ by deleting the $i$th row and the $j$th column.

Theorem 2.17

(i) Expanding along the $j$th column:

$\det A = (-1)^{j+1} a_{1j} \det A_{1j} + (-1)^{j+2} a_{2j} \det A_{2j} + \cdots + (-1)^{j+n} a_{nj} \det A_{nj}$
$\phantom{\det A} = (-1)^{j+1} \left( a_{1j} \det A_{1j} - a_{2j} \det A_{2j} + a_{3j} \det A_{3j} - \cdots + (-1)^{n+1} a_{nj} \det A_{nj} \right)$

(the thing to observe here is that the signs alternate!)

(ii) Expanding along the $i$th row:

$\det A = \sum_{j=1}^n (-1)^{i+j} a_{ij} \det A_{ij}.$

The proof is boring book-keeping.

Proof. Put in the definition of $\det A_{ij}$ as a sum over $\sigma \in S_{n-1}$, and expand. We can tidy this up slightly, by writing it as follows: write $A = (v_1\ \cdots\ v_n)$, so $v_j = \sum_i a_{ij} e_i$. Then

$\det A = \det(v_1, \ldots, v_n) = \sum_{i=1}^n a_{ij} \det(v_1, \ldots, v_{j-1}, e_i, v_{j+1}, \ldots, v_n) = \sum_{i=1}^n a_{ij} (-1)^{j-1} \det(e_i, v_1, \ldots, v_{j-1}, v_{j+1}, \ldots, v_n),$

as $\varepsilon\big((1\ 2\ \ldots\ j)\big) = (-1)^{j-1}$ (in class we drew a picture of this symmetric group element, and observed it had $j - 1$ crossings). Now $e_i = (0, \ldots, 0, 1, 0, \ldots, 0)^T$, so we pick up $(-1)^{i-1}$ as the sign of the permutation $(1\ 2\ \ldots\ i)$ that rotates the 1st through $i$th rows, and so get

$\det A = \sum_i a_{ij} (-1)^{i+j-2} \det \begin{pmatrix} 1 & * \\ 0 & A_{ij} \end{pmatrix} = \sum_i (-1)^{i+j} a_{ij} \det A_{ij}.$
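The following Python sketch (not from the notes; `det_expand` is an ad hoc name) implements expansion along the first column, the $j = 1$ case of Theorem 2.17, and compares it with NumPy on the $3 \times 3$ matrix that also appears in Example 2.18 below:

```python
import numpy as np

def det_expand(A):
    """Determinant by expansion along the first column:
    det A = sum_i (-1)^(i+1) * a_{i,1} * det(A with row i and column 1 deleted)."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for i in range(n):
        minor = np.delete(np.delete(A, i, axis=0), 0, axis=1)
        total += (-1) ** i * A[i, 0] * det_expand(minor)
    return total

A = np.array([[1., 1., 2.], [0., 2., 1.], [1., 0., 2.]])
print(det_expand(A), np.linalg.det(A))   # both give 1.0 (up to rounding)
```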

Definition. For $A \in \mathrm{Mat}_n(F)$, the adjugate matrix, denoted by $\mathrm{adj}\, A$, is the matrix with

$(\mathrm{adj}\, A)_{ij} = (-1)^{i+j} \det A_{ji}.$

Example 2.18.

$\mathrm{adj} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}, \qquad \mathrm{adj} \begin{pmatrix} 1 & 1 & 2 \\ 0 & 2 & 1 \\ 1 & 0 & 2 \end{pmatrix} = \begin{pmatrix} 4 & -2 & -3 \\ 1 & 0 & -1 \\ -2 & 1 & 2 \end{pmatrix}.$

Theorem 2.19: Cramer's rule

$(\mathrm{adj}\, A)\, A = A\, (\mathrm{adj}\, A) = (\det A)\, I.$


Proof. We have

$\big( (\mathrm{adj}\, A)\, A \big)_{jk} = \sum_{i=1}^n (\mathrm{adj}\, A)_{ji}\, a_{ik} = \sum_{i=1}^n (-1)^{i+j} \det A_{ij}\, a_{ik}.$

Now, if we have a diagonal entry, $j = k$, then this is exactly the formula for $\det A$ in (i) above. If $j \neq k$, then by the same formula, this is $\det A'$, where $A'$ is obtained from $A$ by replacing its $j$th column with the $k$th column of $A$; that is, $A'$ has the $j$th and $k$th columns the same, so $\det A' = 0$, and so this term is zero.

Corollary 2.20. $A^{-1} = \frac{1}{\det A}\, \mathrm{adj}\, A$ if $\det A \neq 0$.

The proof of Cramer's rule only involved multiplying and adding, and the fact that they satisfy the usual distributive rules and that multiplication and addition are commutative. A set in which you can do this is called a commutative ring. Examples include the integers $\mathbb{Z}$, or polynomials $F[x]$.

So we've shown that if $A \in \mathrm{Mat}_n(R)$, where $R$ is any commutative ring, then there exists an inverse $A^{-1} \in \mathrm{Mat}_n(R)$ if and only if $\det A$ has an inverse in $R$: $(\det A)^{-1} \in R$. For example, an integer matrix $A \in \mathrm{Mat}_n(\mathbb{Z})$ has an inverse with integer coefficients if and only if $\det A = \pm 1$.

Moreover, the matrix coefficients of $\mathrm{adj}\, A$ are polynomials in the matrix coefficients of $A$, so the matrix coefficients of $A^{-1}$ are polynomials in the matrix coefficients of $A$ and the inverse of $\det A$ (which is itself a polynomial function of the matrix coefficients of $A$).

That's very nice to know.
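To see Cramer's rule numerically, here is a small NumPy sketch (illustrative, not from the notes) computing $\mathrm{adj}\, A$ entry by entry from the definition and checking $(\mathrm{adj}\, A)\, A = (\det A)\, I$ and $A^{-1} = (\det A)^{-1} \mathrm{adj}\, A$ for the $3 \times 3$ matrix of Example 2.18:

```python
import numpy as np

def adjugate(A):
    """(adj A)_{ij} = (-1)^{i+j} det A_{ji}, where A_{ji} deletes row j and column i."""
    n = A.shape[0]
    adj = np.empty_like(A)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, j, axis=0), i, axis=1)
            adj[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return adj

A = np.array([[1., 1., 2.], [0., 2., 1.], [1., 0., 2.]])
print(np.round(adjugate(A)))              # matches Example 2.18
print(np.round(adjugate(A) @ A))          # (det A) * I; here det A = 1
print(np.allclose(np.linalg.inv(A), adjugate(A) / np.linalg.det(A)))  # True
```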


    3 Jordan normal form

In this chapter, unless stated otherwise, we take $V$ to be a finite dimensional vector space over a field $F$, and $\varphi \colon V \to V$ to be a linear map. We're going to look at what matrices look like up to conjugacy; that is, what the map $\varphi$ looks like, given the freedom to choose a basis for $V$.

    3.1 Eigenvectors and eigenvalues

Definition. A non-zero vector $v \in V$ is an eigenvector for $\varphi \colon V \to V$ if $\varphi(v) = \lambda v$ for some $\lambda \in F$. Then $\lambda$ is called the eigenvalue associated with $v$, and the set

$V_\lambda = \{ v \in V : \varphi(v) = \lambda v \}$

is called the eigenspace of $\lambda$ for $\varphi$, which is a subspace of $V$.

We observe that if $I \colon V \to V$ is the identity map, then

$V_\lambda = \ker(\lambda I - \varphi \colon V \to V).$

So $\lambda$ is an eigenvalue (that is, $V_\lambda$ contains a non-zero vector) if and only if $\ker(\lambda I - \varphi) \neq \{0\}$, which is equivalent to saying that $\lambda I - \varphi$ is not invertible. Thus

$\det(\lambda I - \varphi) = 0,$

by the results of the previous chapter.

Definition. If $b_1, \ldots, b_n$ is a basis of $V$, and $A \in \mathrm{Mat}_n(F)$ is the matrix of $\varphi$, then

$\mathrm{ch}_\varphi(x) = \det(xI - \varphi) = \det(xI - A)$

is the characteristic polynomial of $\varphi$.

    The following properties follow from the definition:

(i) The general form is

$\mathrm{ch}_\varphi(x) = \mathrm{ch}_A(x) = \det \begin{pmatrix} x - a_{11} & -a_{12} & \cdots & -a_{1n} \\ -a_{21} & x - a_{22} & & \vdots \\ \vdots & & \ddots & \vdots \\ -a_{n1} & \cdots & & x - a_{nn} \end{pmatrix} \in F[x].$

Observe that $\mathrm{ch}_A(x) \in F[x]$ is a polynomial in $x$, equal to $x^n$ plus terms of smaller degree, and the coefficients are polynomials in the matrix coefficients $a_{ij}$.

For example, if $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ then

$\mathrm{ch}_A(x) = x^2 - x\,(a_{11} + a_{22}) + (a_{11} a_{22} - a_{12} a_{21}) = x^2 - x \cdot \mathrm{tr}\, A + \det A.$


(ii) Conjugate matrices have the same characteristic polynomials. Explicitly:

$\mathrm{ch}_{PAP^{-1}}(x) = \det(xI - PAP^{-1}) = \det\big(P(xI - A)P^{-1}\big) = \det(xI - A) = \mathrm{ch}_A(x).$

(iii) For $\lambda \in F$, $\mathrm{ch}_\varphi(\lambda) = 0$ if and only if $V_\lambda = \{v \in V : \varphi(v) = \lambda v\} \neq \{0\}$; that is, if $\lambda$ is an eigenvalue of $\varphi$. This gives us a way to find the eigenvalues of a linear map.

Example 3.1. If $A$ is upper-triangular with $a_{ii}$ in the $i$th diagonal entry, then

$\mathrm{ch}_A(x) = (x - a_{11}) \cdots (x - a_{nn}).$

It follows that the diagonal terms of an upper triangular matrix are its eigenvalues.
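As a numerical check (not part of the notes), NumPy's `np.poly` returns the coefficients of $\det(xI - A)$, from which the trace and determinant can be read off as in the $2 \times 2$ formula above, and `np.linalg.eigvals` returns its roots, the eigenvalues:

```python
import numpy as np

A = np.array([[2., 3.], [1., 2.]])
coeffs = np.poly(A)                   # coefficients of det(xI - A), leading term first
print(coeffs)                         # [ 1. -4.  1.], i.e. x^2 - (tr A) x + det A
print(np.trace(A), np.linalg.det(A))  # 4.0 and (approximately) 1.0
print(np.linalg.eigvals(A))           # roots of the characteristic polynomial
```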

Definition. We say that $\mathrm{ch}_A(x)$ factors if it factors into linear factors; that is, if

$\mathrm{ch}_A(x) = \prod_{i=1}^r (x - \lambda_i)^{n_i},$

for some $n_i \in \mathbb{N}$, $\lambda_i \in F$, and $\lambda_i \neq \lambda_j$ for $i \neq j$.

Examples 3.2. If we take $F = \mathbb{C}$, then the fundamental theorem of algebra says that every polynomial $f \in \mathbb{C}[x]$ factors into linear terms.

Over $\mathbb{R}$, consider the rotation matrix

$A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix};$

then we have characteristic polynomial

$\mathrm{ch}_A(x) = x^2 - 2x\cos\theta + 1,$

which factors over $\mathbb{R}$ if and only if $A = \pm I$, that is, $\theta = 0$ or $\pi$.

Definition. If $F$ is any field, then there is some bigger field $\bar{F}$, the algebraic closure of $F$, such that $F \subseteq \bar{F}$ and every polynomial in $\bar{F}[x]$ factors into linear factors. This is proved next year in the Galois theory course.

Theorem 3.3

If $A$ is an $n \times n$ matrix over $F$, then $\mathrm{ch}_A(x)$ factors if and only if $A$ is conjugate to an upper triangular matrix.

In particular, this means that if $F = \bar{F}$, such as $F = \mathbb{C}$, then every matrix is conjugate to an upper triangular matrix.

We can give a coordinate-free formulation of the theorem: if $\varphi \colon V \to V$ is a linear map, then $\mathrm{ch}_\varphi(x)$ factors if and only if there is some basis $b_1, \ldots, b_n$ of $V$ such that the matrix of $\varphi$ with respect to the basis is upper triangular.


Proof. ($\Leftarrow$) If $A$ is upper triangular, then $\mathrm{ch}_A(x) = \prod_i (x - a_{ii})$, so done.

($\Rightarrow$) Otherwise, set $V = F^n$, and $\varphi(x) = Ax$. We induct on $\dim V$. If $\dim V = n = 1$, then we have nothing to prove.

As $\mathrm{ch}_\varphi(x)$ factors, there is some $\lambda \in F$ such that $\mathrm{ch}_\varphi(\lambda) = 0$, so there is a non-zero eigenvector $b_1$ with eigenvalue $\lambda$. Extend this to a basis $b_1, \ldots, b_n$ of $V$.

Now conjugate $A$ by the change of basis matrix. (In other words, write the linear map $\varphi$, $x \mapsto Ax$, with respect to this basis $b_i$ rather than the standard basis $e_i$.) We get a new matrix

$A' = \begin{pmatrix} \lambda & * \\ 0 & A_1 \end{pmatrix},$

and it has characteristic polynomial

$\mathrm{ch}_{A'}(x) = (x - \lambda)\, \mathrm{ch}_{A_1}(x).$

So $\mathrm{ch}_\varphi(x)$ factors implies that $\mathrm{ch}_{A_1}(x)$ factors. Now, by induction, there is some matrix $P \in \mathrm{GL}_{n-1}(F)$ such that $P A_1 P^{-1}$ is upper triangular. But now

$\begin{pmatrix} 1 & 0 \\ 0 & P \end{pmatrix} \begin{pmatrix} \lambda & * \\ 0 & A_1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & P^{-1} \end{pmatrix} = \begin{pmatrix} \lambda & * \\ 0 & P A_1 P^{-1} \end{pmatrix}$

is upper triangular, proving the theorem.

Aside: what is the meaning of the matrix $A_1$? We can ask this question more generally. Let $\varphi \colon V \to V$ be linear, and $W \leq V$ a subspace. Choose a basis $b_1, \ldots, b_r$ of $W$, and extend it to a basis of $V$ (add $b_{r+1}, \ldots, b_n$).

Then $\varphi(W) \subseteq W$ if and only if the matrix of $\varphi$ with respect to this basis looks like

$\begin{pmatrix} X & Z \\ 0 & Y \end{pmatrix},$

where $X$ is $r \times r$ and $Y$ is $(n-r) \times (n-r)$, and it is clear that $\varphi|_W \colon W \to W$ has matrix $X$ with respect to the basis $b_1, \ldots, b_r$ of $W$.

Then our question is: what is the meaning of the matrix $Y$?

The answer requires a new concept, the quotient vector space.

Exercise 3.4. Consider $V$ as an abelian group, and consider the coset group $V/W = \{v + W : v \in V\}$. Show that this is a vector space, that $b_{r+1} + W, \ldots, b_n + W$ is a basis for it, that $\varphi \colon V \to V$ induces a linear map $\bar\varphi \colon V/W \to V/W$ by $\bar\varphi(v + W) = \varphi(v) + W$ (you need to check this is well-defined and linear), and that with respect to this basis, $Y$ is the matrix of $\bar\varphi$.

Remark. Let $V = W \oplus W'$; that is, $W \cap W' = \{0\}$, $W + W' = V$, and suppose that $\varphi(W) \subseteq W$ and $\varphi(W') \subseteq W'$. We write this as $\varphi = \varphi' \oplus \varphi''$, where $\varphi' \colon W \to W$, $\varphi'' \colon W' \to W'$ are the restrictions of $\varphi$.

In this special case the matrix of $\varphi$ looks even more special than the above: for any basis $b_1, \ldots, b_r$ of $W$ and $b_{r+1}, \ldots, b_n$ of $W'$, we have $Z = 0$ also.


Definition. The trace of a matrix $A = (a_{ij})$, denoted $\mathrm{tr}(A)$, is given by

$\mathrm{tr}(A) = \sum_i a_{ii}.$

Lemma 3.5. $\mathrm{tr}(AB) = \mathrm{tr}(BA)$.

Proof. $\mathrm{tr}(AB) = \sum_i (AB)_{ii} = \sum_{i,j} a_{ij} b_{ji} = \sum_j (BA)_{jj} = \mathrm{tr}(BA)$.

Corollary 3.6. $\mathrm{tr}(PAP^{-1}) = \mathrm{tr}(P^{-1}PA) = \mathrm{tr}(A)$.

So we define, if $\varphi \colon V \to V$ is linear, $\mathrm{tr}(\varphi) = \mathrm{tr}(A)$, where $A$ is the matrix of $\varphi$ with respect to some basis $b_1, \ldots, b_n$, and this doesn't depend on the choice of basis.

Proposition 3.7. If $\mathrm{ch}_\varphi(x)$ factors as $(x - \lambda_1) \cdots (x - \lambda_n)$ (repetition allowed), then

(i) $\mathrm{tr}\,\varphi = \sum_i \lambda_i$;

(ii) $\det \varphi = \prod_i \lambda_i$.

Proof. As $\mathrm{ch}_\varphi$ factors, there is some basis $b_1, \ldots, b_n$ of $V$ such that the matrix of $\varphi$ is upper triangular, the diagonal entries are $\lambda_1, \ldots, \lambda_n$, and we're done.

Remark. This is true whatever $F$ is. Embed $F \subseteq \bar{F}$ (for example, $\mathbb{R} \subseteq \mathbb{C}$), and $\mathrm{ch}_A$ factors as $(x - \lambda_1) \cdots (x - \lambda_n)$ with $\lambda_1, \ldots, \lambda_n \in \bar{F}$, not necessarily in $F$. Regard $A \in \mathrm{Mat}_n(\bar{F})$, which doesn't change $\mathrm{tr}\, A$ or $\det A$, and we get the same result. Note that $\sum_i \lambda_i$ and $\prod_i \lambda_i$ are in $F$ even though $\lambda_i \in \bar{F}$.

Example 3.8. Take $A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$. The eigenvalues are $e^{i\theta}, e^{-i\theta}$, so

$\mathrm{tr}\, A = e^{i\theta} + e^{-i\theta} = 2\cos\theta, \qquad \det A = e^{i\theta} \cdot e^{-i\theta} = 1.$

Note that

$\mathrm{ch}_A(x) = (x - \lambda_1) \cdots (x - \lambda_n) = x^n - \left( \sum_i \lambda_i \right) x^{n-1} + \cdots + (-1)^n \prod_i \lambda_i.$


For an upper triangular matrix, the diagonal entries are the eigenvalues. What is the meaning of the upper triangular coefficients?

This example shows there is some information in the upper triangular entries of an upper-triangular matrix, but the question is how much? We would like to always diagonalise $A$, but this example shows that it isn't always possible. Let's understand when it is possible.

31 Oct

Proposition 3.9. If $v_1, \ldots, v_k$ are eigenvectors with eigenvalues $\lambda_1, \ldots, \lambda_k$, and $\lambda_i \neq \lambda_j$ if $i \neq j$, then $v_1, \ldots, v_k$ are linearly independent.

Proof 1. Induct on $k$. This is clearly true when $k = 1$. Now if the result is false, then there are $a_i \in F$ such that $\sum_{i=1}^k a_i v_i = 0$, with some $a_i \neq 0$, and without loss of generality $a_1 \neq 0$. (In fact, all $a_i \neq 0$, as if not, we have a relation of linear dependence among $(k-1)$ eigenvectors, contradicting our inductive assumption.)

Apply $\varphi$ to $\sum_{i=1}^k a_i v_i = 0$ to get

$\sum_{i=1}^k \lambda_i a_i v_i = 0.$

Now multiply $\sum_{i=1}^k a_i v_i = 0$ by $\lambda_1$, and we get

$\sum_{i=1}^k \lambda_1 a_i v_i = 0.$

Subtract these two, and we get

$\sum_{i=2}^k \underbrace{(\lambda_i - \lambda_1)}_{\neq 0} a_i v_i = 0,$

a relation of linear dependence among $v_2, \ldots, v_k$, so $a_i = 0$ for all $i$, by induction: a contradiction.

Proof 2. Suppose $\sum a_i v_i = 0$. Apply $\varphi$: we get $\sum \lambda_i a_i v_i = 0$; apply $\varphi^2$: we get $\sum \lambda_i^2 a_i v_i = 0$, and so on, so $\sum_{i=1}^k \lambda_i^r a_i v_i = 0$ for all $r \geq 0$. In particular,

$\begin{pmatrix} 1 & \cdots & 1 \\ \lambda_1 & \cdots & \lambda_k \\ \vdots & & \vdots \\ \lambda_1^{k-1} & \cdots & \lambda_k^{k-1} \end{pmatrix} \begin{pmatrix} a_1 v_1 \\ \vdots \\ a_k v_k \end{pmatrix} = 0.$

Lemma 3.10 (The Vandermonde determinant). The determinant of the above matrix is

$\prod_{i < j} (\lambda_j - \lambda_i),$

which is non-zero as the $\lambda_i$ are distinct; hence each $a_i v_i = 0$, and so $a_i = 0$ for all $i$.


Definition. A map $\varphi$ is diagonalisable if there is some basis for $V$ such that the matrix of $\varphi \colon V \to V$ is diagonal.

Corollary 3.11. The map $\varphi$ is diagonalisable if and only if $\mathrm{ch}_\varphi(x)$ factors into

$\prod_{i=1}^r (x - \lambda_i)^{n_i},$

and $\dim V_{\lambda_i} = n_i$ for all $i$.

Proof. ($\Rightarrow$) $\varphi$ is diagonalisable means that in some basis it is diagonal, with $n_i$ copies of $\lambda_i$ in the diagonal entries; hence the characteristic polynomial is as claimed.

($\Leftarrow$) $\sum_i V_{\lambda_i}$ is a direct sum, by the proposition, so

$\dim\left( \bigoplus_i V_{\lambda_i} \right) = \sum \dim V_{\lambda_i} = \sum n_i,$

and by our assumption, $\sum n_i = n = \dim V$. Now in any basis which is the union of bases for the $V_{\lambda_i}$, the matrix of $\varphi$ is diagonal.

Corollary 3.12. If $A$ is conjugate to an upper triangular matrix with $\lambda_i$ as the diagonal entries, and the $\lambda_i$ are distinct, then $A$ is conjugate to the diagonal matrix with the $\lambda_i$ as its entries.

Example 3.13. $\begin{pmatrix} 1 & 7 \\ 0 & 2 \end{pmatrix}$ is conjugate to $\begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$.

The upper triangular entries contain no information. That is, they are an artefact of the choice of basis.
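A quick numerical illustration of Corollary 3.12 and Example 3.13 (a sketch, not from the notes): the eigenvalues 1 and 2 are distinct, so conjugating by the matrix whose columns are eigenvectors diagonalises $A$.

```python
import numpy as np

A = np.array([[1., 7.], [0., 2.]])
evals, evecs = np.linalg.eig(A)
print(evals)                                   # distinct eigenvalues 1 and 2
P = evecs                                      # columns are eigenvectors
print(np.round(np.linalg.inv(P) @ A @ P, 10))  # diag(1, 2): conjugate to the diagonal matrix
```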

Remark. If $F = \mathbb{C}$, then the diagonalisable $A$ are dense in $\mathrm{Mat}_n(\mathbb{C}) = \mathbb{C}^{n^2}$ (exercise). In general, if $F = \bar{F}$, then the diagonalisable $A$ are dense in $\mathrm{Mat}_n(F) = F^{n^2}$, in the sense of algebraic geometry.

Exercise 3.14. If $A = \begin{pmatrix} \lambda_1 & & * \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}$ is upper triangular, then $A e_i = \lambda_i e_i + \sum_{j < i} a_{ji} e_j$, ...


    3.2 Cayley-Hamilton theorem

Let $\varphi \colon V \to V$ be a linear map, and $V$ a finite dimensional vector space over $F$.

Theorem 3.15: Cayley-Hamilton theorem

Every square matrix $A$ over a commutative ring (such as $\mathbb{R}$ or $\mathbb{C}$) satisfies $\mathrm{ch}_A(A) = 0$.

Example 3.16. $A = \begin{pmatrix} 2 & 3 \\ 1 & 2 \end{pmatrix}$ and $\mathrm{ch}_A(x) = x^2 - 4x + 1$, so $\mathrm{ch}_A(A) = A^2 - 4A + I$. Then

$A^2 = \begin{pmatrix} 7 & 12 \\ 4 & 7 \end{pmatrix},$

which does equal $4A - I$.

Remark. We have
    which does equal 4A I.Remark. We have

$\mathrm{ch}_A(x) = \det(xI - A) = \det \begin{pmatrix} x - a_{11} & \cdots & -a_{1n} \\ \vdots & \ddots & \vdots \\ -a_{n1} & \cdots & x - a_{nn} \end{pmatrix} = x^n - e_1 x^{n-1} + \cdots \pm e_n,$

so we don't get a proof by saying $\mathrm{ch}_\varphi(\varphi) = \det(\varphi I - \varphi) = 0$. This just doesn't make sense. However, you can make it make sense, and our second proof will do this.
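Before the proofs, here is a short numerical sanity check of the theorem (illustrative only, not from the notes), reusing the matrix of Example 3.16:

```python
import numpy as np

A = np.array([[2., 3.], [1., 2.]])
c = np.poly(A)                                              # ch_A(x) = x^2 - 4x + 1
print(np.allclose(c[0] * A @ A + c[1] * A + c[2] * np.eye(2), 0))   # True: ch_A(A) = 0
```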

Proof 1. If $A = \begin{pmatrix} \lambda_1 & & * \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}$ is upper triangular, then $\mathrm{ch}_A(x) = (x - \lambda_1) \cdots (x - \lambda_n)$.

Now if $A$ were in fact diagonal, then

$\mathrm{ch}_A(A) = (A - \lambda_1 I) \cdots (A - \lambda_n I) = \begin{pmatrix} 0 & & \\ & \ddots & \\ & & * \end{pmatrix} \cdots \begin{pmatrix} * & & \\ & \ddots & \\ & & 0 \end{pmatrix} = 0,$

since the $i$th factor has a zero in the $i$th diagonal entry. But even when $A$ is upper triangular, this product is zero.

Example 3.17. For example, with $n = 3$,

$\begin{pmatrix} 0 & * & * \\ 0 & * & * \\ 0 & 0 & * \end{pmatrix} \begin{pmatrix} * & * & * \\ 0 & 0 & * \\ 0 & 0 & * \end{pmatrix} \begin{pmatrix} * & * & * \\ 0 & * & * \\ 0 & 0 & 0 \end{pmatrix}$

is still zero.

Here is a nice way of writing this:

Let $W_0 = \{0\}$, $W_i = \langle e_1, \ldots, e_i \rangle \leq V$. Then if $A$ is upper triangular, $A W_i \subseteq W_i$, and even $(A - \lambda_i I)\, W_i \subseteq W_{i-1}$. So $(A - \lambda_n I)\, W_n \subseteq W_{n-1}$, and so

$(A - \lambda_{n-1} I)(A - \lambda_n I)\, W_n \subseteq (A - \lambda_{n-1} I)\, W_{n-1} \subseteq W_{n-2},$

and so on, until

$\prod_{i=1}^n (A - \lambda_i I)\, W_n \subseteq W_0 = \{0\};$

that is, $\mathrm{ch}_A(A) = 0$.


Now if $F = \bar{F}$, then we can choose a basis for $V$ such that $\varphi \colon V \to V$ has an upper-triangular matrix with respect to this basis, and hence the above shows $\mathrm{ch}_\varphi(\varphi) = 0$; that is, $\mathrm{ch}_A(A) = 0$ for all $A \in \mathrm{Mat}_n(\bar{F})$.

Now, if $F \subseteq \bar{F}$, then as Cayley-Hamilton is true for all $A \in \mathrm{Mat}_n(\bar{F})$, it is certainly still true for $A \in \mathrm{Mat}_n(F) \subseteq \mathrm{Mat}_n(\bar{F})$.

Definition. The generalised eigenspace with eigenvalue $\lambda$ is given by

$\overline{V}_\lambda = \{ v \in V : (\varphi - \lambda I)^{\dim V}(v) = 0 \} = \ker\big( (\lambda I - \varphi)^{\dim V} \colon V \to V \big).$

Note that $V_\lambda \subseteq \overline{V}_\lambda$.

Example 3.18. Let $A = \begin{pmatrix} \lambda & & * \\ & \ddots & \\ 0 & & \lambda \end{pmatrix}$ be upper triangular with every diagonal entry equal to $\lambda$.

Then $(\lambda I - A)\, e_i$ only involves $e_1, \ldots, e_{i-1}$, so $(\lambda I - A)^{\dim V} e_i = 0$ for all $i$, as in our proof of Cayley-Hamilton (or indeed, by Cayley-Hamilton).

Further, if $\mu \neq \lambda$, then

$\mu I - A = \begin{pmatrix} \mu - \lambda & & * \\ & \ddots & \\ 0 & & \mu - \lambda \end{pmatrix},$

and so

$(\mu I - A)^n = \begin{pmatrix} (\mu - \lambda)^n & & * \\ & \ddots & \\ 0 & & (\mu - \lambda)^n \end{pmatrix}$

has non-zero diagonal terms, so zero kernel. Thus in this case $\overline{V}_\lambda = V$, $\overline{V}_\mu = 0$ if $\mu \neq \lambda$, and in general $\overline{V}_\mu = 0$ if $\mathrm{ch}_\varphi(\mu) \neq 0$; that is, $\ker(A - \mu I)^N = \{0\}$ for all $N \geq 0$.

2 Nov

Theorem 3.19

If $\mathrm{ch}_A(x) = \prod_{i=1}^r (x - \lambda_i)^{n_i}$, with the $\lambda_i$ distinct, then

$V = \bigoplus_{i=1}^r \overline{V}_{\lambda_i},$

and $\dim \overline{V}_{\lambda_i} = n_i$. In other words, choose any basis of $V$ which is the union of bases of the $\overline{V}_{\lambda_i}$. Then the matrix of $\varphi$ is block diagonal. Moreover, we can choose the basis of each $\overline{V}_{\lambda_i}$ so that each diagonal block is upper triangular, with only one eigenvalue on its diagonal.

We say different eigenvalues don't interact.

Remark. If $n_1 = n_2 = \cdots = n_r = 1$ (and so $r = n$), then this is our previous theorem that matrices with distinct eigenvalues are diagonalisable.


Proof. Consider

$h_i(x) = \prod_{j \neq i} (x - \lambda_j)^{n_j} = \frac{\mathrm{ch}_\varphi(x)}{(x - \lambda_i)^{n_i}}.$

Then define $W_i = \mathrm{Im}\,(h_i(A) \colon V \to V) \leq V$. Now Cayley-Hamilton implies that

$(A - \lambda_i I)^{n_i}\, h_i(A) = \mathrm{ch}_A(A) = 0,$

that is, $W_i \subseteq \ker(A - \lambda_i I)^{n_i} \subseteq \ker(A - \lambda_i I)^n = \overline{V}_{\lambda_i}$.

We want to show that

(i) $\sum_i W_i = V$;

(ii) this sum is direct.

Now, the $h_i$ are coprime polynomials, so Euclid's algorithm implies that there are polynomials $f_i \in F[x]$ such that

$\sum_{i=1}^r f_i h_i = 1,$

and so

$\sum_{i=1}^r h_i(A)\, f_i(A) = I \in \mathrm{End}(V).$

Now, if $v \in V$, then this gives

$v = \sum_{i=1}^r \underbrace{h_i(A)\, f_i(A)\, v}_{\in W_i},$

that is, $\sum_{i=1}^r W_i = V$. This is (i).

To see the sum is direct: if $0 = \sum_{i=1}^r w_i$, $w_i \in W_i$, then we want to show that each $w_i = 0$. But $h_j(A)\, w_i = 0$ for $i \neq j$, as $w_i \in \ker(A - \lambda_i I)^{n_i}$, so the identity above gives

$w_i = \sum_{j=1}^r h_j(A)\, f_j(A)\, w_i = f_i(A)\, h_i(A)\, w_i,$

so apply $f_i(A)\, h_i(A)$ to $\sum_{i=1}^r w_i = 0$ and get $w_i = 0$.

Define $\pi_i = f_i(A)\, h_i(A) = h_i(A)\, f_i(A)$.

We showed that $\pi_i \colon V \to V$ has $\mathrm{Im}\, \pi_i = W_i \subseteq \overline{V}_{\lambda_i}$ and $\pi_i|_{W_i} = \text{identity}$, and so $\pi_i^2 = \pi_i$; that is, $\pi_i$ is the projection to $W_i$. Compare with $h_i(A)$, which has $h_i(A)(V) = W_i \subseteq \overline{V}_{\lambda_i}$ and $h_i(A)|_{\overline{V}_{\lambda_i}}$ an isomorphism, but not the identity; that is, $f_i(A)|_{\overline{V}_{\lambda_i}} = \big( h_i(A)|_{\overline{V}_{\lambda_i}} \big)^{-1}$.

This tells us that to understand what matrices look like up to conjugacy, it is enough to understand matrices with a single eigenvalue $\lambda$, and by subtracting $\lambda I$ from our matrix we may as well assume that eigenvalue is zero.

    Before we continue investigating this, we digress and give another proof of Cayley-Hamilton.


Proof 2 of Cayley-Hamilton. Let $\varphi \colon V \to V$ be linear, $V$ finite dimensional over $F$. Pick a basis $e_1, \ldots, e_n$ of $V$, so $\varphi(e_i) = \sum_j a_{ji} e_j$, and we have the matrix $A = (a_{ij})$. Consider

$\varphi I - A^T = \begin{pmatrix} \varphi - a_{11} & \cdots & -a_{n1} \\ \vdots & \ddots & \vdots \\ -a_{1n} & \cdots & \varphi - a_{nn} \end{pmatrix} \in \mathrm{Mat}_n(\mathrm{End}(V)),$

where $a_{ij} \in F \subseteq \mathrm{End}(V)$ by regarding an element $\lambda$ as the operation of scalar multiplication $V \to V$, $v \mapsto \lambda v$. The elements of $\mathrm{Mat}_n(\mathrm{End}(V))$ act on $V^n$ by the usual formulas. So

$(\varphi I - A^T) \begin{pmatrix} e_1 \\ \vdots \\ e_n \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}$

by the definition of $A$.

The problem is, it isn't clear how to define $\det \colon \mathrm{Mat}_n(\mathrm{End}(V)) \to \mathrm{End}(V)$, as the matrix coefficients (that is, elements of $\mathrm{End}(V)$) do not commute in general. But the matrix elements of the above matrix do commute, so this shouldn't be a problem.

To make it not a problem, consider $\varphi I - A^T \in \mathrm{Mat}_n(F[\varphi])$; that is, $F[\varphi]$ are polynomials in the symbol $\varphi$. This is a commutative ring, and now $\det$ behaves as always:

(i) $\det(\varphi I - A^T) = \mathrm{ch}_A(\varphi) \in F[\varphi]$ (by definition);

(ii) $\mathrm{adj}(\varphi I - A^T) \cdot (\varphi I - A^T) = \det(\varphi I - A^T) \cdot I \in \mathrm{Mat}_n(F[\varphi])$, as we've shown.

This is true for any $B \in \mathrm{Mat}_n(R)$, where $R$ is a commutative ring. Here $R = F[\varphi]$, $B = \varphi I - A^T$.

Make $F[\varphi]$ act on $V$ by $\sum_i a_i \varphi^i \colon v \mapsto \sum_i a_i \varphi^i(v)$, so

$(\varphi I - A^T) \begin{pmatrix} e_1 \\ \vdots \\ e_n \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}.$

Thus

$0 = \mathrm{adj}(\varphi I - A^T)\,(\varphi I - A^T) \begin{pmatrix} e_1 \\ \vdots \\ e_n \end{pmatrix} = \det(\varphi I - A^T) \begin{pmatrix} e_1 \\ \vdots \\ e_n \end{pmatrix} = \begin{pmatrix} \mathrm{ch}_A(\varphi)\, e_1 \\ \vdots \\ \mathrm{ch}_A(\varphi)\, e_n \end{pmatrix}.$

So this says that $\mathrm{ch}_A(\varphi)\, e_i = 0$ for all $i$, so $\mathrm{ch}_A(\varphi) \colon V \to V$ is the zero map, as $e_1, \ldots, e_n$ is a basis of $V$; that is, $\mathrm{ch}_A(A) = 0$.

This correct proof is as close to the nonsense "tautological" proof (just set $x$ equal to $A$) as you can hope for. You will meet it again several times in later life, where it is called Nakayama's lemma.


    3.3 Combinatorics of nilpotent matrices

5 Nov

Definition. If $\varphi \colon V \to V$ can be written in block diagonal form; that is, if there are some $W', W'' \leq V$ such that

$\varphi(W') \subseteq W', \quad \varphi(W'') \subseteq W'', \quad V = W' \oplus W'',$

then we say that $\varphi$ is decomposable and write

$\varphi = \varphi' \oplus \varphi'', \quad \varphi' = \varphi|_{W'} \colon W' \to W', \quad \varphi'' = \varphi|_{W''} \colon W'' \to W''.$

We say that $\varphi$ is the direct sum of $\varphi'$ and $\varphi''$.

Otherwise, we say that $\varphi$ is indecomposable.

Examples 3.20.

(i) $\varphi = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} = (0 \colon F \to F) \oplus (0 \colon F \to F)$.

(ii) $\varphi = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \colon F^2 \to F^2$ is indecomposable, because there is a unique $\varphi$-stable line, $\langle e_1 \rangle$.

(iii) If $\varphi \colon V \to V$, then $\mathrm{ch}_\varphi(x) = \prod_{i=1}^r (x - \lambda_i)^{n_i}$, for $\lambda_i \neq \lambda_j$ if $i \neq j$.

Then $V = \bigoplus_{i=1}^r \overline{V}_{\lambda_i}$ decomposes $\varphi$ into pieces $\varphi_i = \varphi|_{\overline{V}_{\lambda_i}} \colon \overline{V}_{\lambda_i} \to \overline{V}_{\lambda_i}$, such that each has only one eigenvalue, $\lambda_i$.

This decomposition is precisely the amount of information in $\mathrm{ch}_\varphi(x)$. So to further understand what matrices are up to conjugacy, we will need new information.

Observe that $\varphi$ is decomposable if and only if $\varphi - \lambda I$ is, and $\varphi_i - \lambda_i I$ has zero as its only eigenvalue.

Definition. The map $\varphi$ is nilpotent if $\varphi^{\dim V} = 0$, if and only if $\ker \varphi^{\dim V} = V$, if and only if $\overline{V}_0 = V$, if and only if $\mathrm{ch}_\varphi(x) = x^{\dim V}$. (The only eigenvalue is zero.)

Theorem 3.21

Let $\varphi$ be nilpotent. Then $\varphi$ is indecomposable if and only if there is a basis $v_1, \ldots, v_n$ such that

$\varphi(v_i) = \begin{cases} 0 & \text{if } i = 1, \\ v_{i-1} & \text{if } i > 1, \end{cases}$

that is, if the matrix of $\varphi$ is

$J_n = \begin{pmatrix} 0 & 1 & & 0 \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ 0 & & & 0 \end{pmatrix}.$

This is the Jordan block of size $n$ with eigenvalue 0.


Definition. The Jordan block of size $n$, eigenvalue $\lambda$, is given by

$J_n(\lambda) = \lambda I + J_n.$

Theorem 3.22: Jordan normal form

Every matrix is conjugate to a direct sum of Jordan blocks. Moreover, these are unique up to rearranging their order.

Proof. Observe that Theorem 3.22 $\implies$ Theorem 3.21, if we show that $J_n$ is indecomposable (and Theorem 3.21 $\implies$ Theorem 3.22, existence).

[Proof of Theorem 3.21.] Put $W_i = \langle v_1, \ldots, v_i \rangle$. Then $W_i$ is the only subspace $W$ of dimension $i$ such that $\varphi(W) \subseteq W$, and $W_{n-i}$ is not a complement to it, as $W_i \cap W_{n-i} = W_{\min(i, n-i)}$.

Proof of Theorem 3.22, uniqueness. Suppose $\varphi \colon V \to V$ is nilpotent and $\varphi = \bigoplus_{i=1}^r J_{k_i}$. Rearrange their order so that $k_i \geq k_j$ for $i \leq j$, and group them together, so

$\varphi = \bigoplus_{i=1}^n m_i J_i.$

There are $m_i$ blocks of size $i$, and

$m_i = \#\{ k_a \mid k_a = i \}. \tag{$*$}$

Example 3.23. If $(k_1, k_2, \ldots) = (3, 3, 2, 1, 1, 1)$, then $n = 11$, and $m_1 = 3$, $m_2 = 1$, $m_3 = 2$, and $m_a = 0$ for $a > 3$. (It is customary to omit the zero entries when listing these numbers.)

Definition. Let $P_n = \{(k_1, k_2, \ldots, k_n) \in \mathbb{N}^n \mid k_1 \geq k_2 \geq \cdots \geq k_n \geq 0,\ \sum k_i = n\}$ be the set of partitions of $n$. This is isomorphic to the set $\{m \colon \mathbb{N} \to \mathbb{N} \mid \sum_i i\, m(i) = n\}$ as above.

We represent $k \in P_n$ by a picture, with a row of length $k_j$ for each $j$ (equivalently, with $m_i$ rows of length $i$). For example, the above partition $(3, 3, 2, 1, 1, 1)$ has picture

X X X
X X X
X X
X
X
X

Now define $k^T$, for $k \in P_n$, the dual partition, to be the partition attached to the transposed diagram. In the above example $k^T = (6, 3, 2)$.

It is clear that $k$ determines $k^T$. In formulas:

$k^T = (m_1 + m_2 + m_3 + \cdots + m_n,\ m_2 + m_3 + \cdots + m_n,\ \ldots,\ m_n).$


Now, let $\varphi \colon V \to V$, and $\varphi = \bigoplus_{i=1}^r J_{k_i} = \bigoplus_i m_i J_i$ as before. Observe that

$\dim \ker \varphi = \#\text{ of Jordan blocks} = r = \sum_{i=1}^n m_i = (k^T)_1,$

$\dim \ker \varphi^2 = \#\text{ of Jordan blocks} + \#\text{ of Jordan blocks of size} \geq 2 = \sum_{i=1}^n m_i + \sum_{i=2}^n m_i = (k^T)_1 + (k^T)_2,$

$\vdots$

$\dim \ker \varphi^n = \sum_{k=1}^n \sum_{i=k}^n m_i = \sum_{i=1}^n (k^T)_i.$

That is, $\dim \ker \varphi, \ldots, \dim \ker \varphi^n$ determine the dual partition to the partition of $n$ into Jordan blocks, and hence determine it.

It follows that the decomposition $\varphi = \bigoplus_{i=1}^n m_i J_i$ is unique.

Remark. This is a practical way to compute the JNF of a matrix $A$. First compute $\mathrm{ch}_A(x) = \prod_{i=1}^r (x - \lambda_i)^{n_i}$, then for each eigenvalue compute $\ker(A - \lambda_i I), \ker(A - \lambda_i I)^2, \ldots, \ker(A - \lambda_i I)^n$.
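Here is a small NumPy sketch of this recipe for a nilpotent matrix (not from the notes; `nilpotent_jordan_blocks` is an ad hoc name): the numbers $d_k = \dim \ker A^k$ give the dual partition via $d_k - d_{k-1} = \#\{\text{blocks of size} \geq k\}$, from which the block sizes are recovered.

```python
import numpy as np

def nilpotent_jordan_blocks(A, tol=1e-9):
    """Jordan block sizes of a nilpotent matrix from d_k = dim ker A^k."""
    n = A.shape[0]
    d = [0]
    for k in range(1, n + 1):
        d.append(n - np.linalg.matrix_rank(np.linalg.matrix_power(A, k), tol=tol))
    at_least = [d[k] - d[k - 1] for k in range(1, n + 1)]   # number of blocks of size >= k
    blocks = []
    for size in range(n, 0, -1):
        count = at_least[size - 1] - (at_least[size] if size < n else 0)
        blocks += [size] * count
    return blocks

# One string of length 3, one of length 2, one of length 1: J_3 + J_2 + J_1
A = np.zeros((6, 6))
A[0, 1] = A[1, 2] = A[3, 4] = 1.0
print(nilpotent_jordan_blocks(A))   # [3, 2, 1]
```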

Corollary 3.24. The number of nilpotent conjugacy classes is equal to the size of $P_n$.

Exercises 3.25.

(i) List all the partitions of $5$; show there are 7 of them.

(ii) Show that the size of $P_n$ is the coefficient of $x^n$ in

$\prod_{i \geq 1} \frac{1}{1 - x^i} = (1 + x + x^2 + x^3 + \cdots)(1 + x^2 + x^4 + x^6 + \cdots)(1 + x^3 + x^6 + x^9 + \cdots)\cdots = \prod_{k \geq 1} \sum_{i=0}^{\infty} x^{ki}.$
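For exercise (ii), a short Python sketch (not from the notes) that counts partitions by dynamic programming, one factor $1/(1 - x^i)$ at a time, and confirms $|P_5| = 7$:

```python
def partitions(n):
    """Number of partitions of n: the coefficient of x^n in prod_{i>=1} 1/(1 - x^i)."""
    p = [1] + [0] * n
    for part in range(1, n + 1):        # allow parts of size `part`
        for total in range(part, n + 1):
            p[total] += p[total - part]
    return p[n]

print([partitions(n) for n in range(1, 8)])   # [1, 2, 3, 5, 7, 11, 15]; in particular p(5) = 7
```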

7 Nov

Theorem 3.26: Jordan normal form

Every matrix is conjugate to a direct sum of Jordan blocks

$J_n(\lambda) = \begin{pmatrix} \lambda & 1 & & 0 \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ 0 & & & \lambda \end{pmatrix}.$

Proof. It is enough to show this when $\varphi \colon V \to V$ has a single generalised eigenspace with eigenvalue $\lambda$, and now, replacing $\varphi$ by $\varphi - \lambda I$, we can assume that $\varphi$ is nilpotent. Induct on $n = \dim V$. The case $n = 1$ is clear.

Consider $V' = \mathrm{Im}\, \varphi = \varphi(V)$. Then $V' \neq V$, as $\varphi$ is nilpotent, and $\varphi(V') \subseteq \varphi(V) = V'$, and $\varphi|_{V'} \colon V' \to V'$ is obviously nilpotent, so induction gives the existence of a basis

$\underbrace{e_1, \ldots, e_{k_1}}_{J_{k_1}},\ \underbrace{e_{k_1+1}, \ldots, e_{k_1+k_2}}_{J_{k_2}},\ \ldots,\ \underbrace{\ldots, e_{k_1 + \cdots + k_r}}_{J_{k_r}}$

such that $\varphi|_{V'}$ is in JNF with respect to this basis.


Because $V' = \mathrm{Im}\, \varphi$, it must be that the tail end of these strings is in $\mathrm{Im}\, \varphi$; that is, there exist $b_1, \ldots, b_r \in V \setminus V'$ such that $\varphi(b_i) = e_{k_1 + \cdots + k_i}$, as $e_{k_1 + \cdots + k_i} \in \varphi(V)$. Notice these are linearly independent, as if $\sum \lambda_i b_i = 0$, then $\sum \lambda_i \varphi(b_i) = \sum \lambda_i e_{k_1 + \cdots + k_i} = 0$. But $e_{k_1}, \ldots, e_{k_1 + \cdots + k_r}$ are linearly independent, hence $\lambda_1 = \cdots = \lambda_r = 0$. Even better: $\{e_j, b_i \mid j \leq k_1 + \cdots + k_r,\ 1 \leq i \leq r\}$ are linearly independent. (Proof: exercise.)

Finally, extend

$e_1, e_{k_1+1}, \ldots, e_{k_1 + \cdots + k_{r-1} + 1} \in \ker \varphi \cap \mathrm{Im}\, \varphi$

to a basis of $\ker \varphi$, by adding basis vectors. Denote these by $q_1, \ldots, q_s$. Exercise: show $\{e_j, b_i, q_k\}$ are linearly independent.

Now, the rank-nullity theorem shows that $\dim \mathrm{Im}\, \varphi + \dim \ker \varphi = \dim V$. But $\dim \mathrm{Im}\, \varphi$ is the number of the $e_i$, that is $k_1 + \cdots + k_r$, and $\dim \ker \varphi$ is the number of Jordan blocks, which is $r + s$ ($r$ is the number of blocks of size greater than one, $s$ the number of size one), which is the number of the $b_i$ plus the number of the $q_k$.

So this shows that the $e_j, b_i, q_k$ are a basis of $V$, and hence with respect to this basis,

$\varphi = J_{k_1 + 1} \oplus \cdots \oplus J_{k_r + 1} \oplus \underbrace{J_1 \oplus \cdots \oplus J_1}_{s \text{ times}}.$

    3.4 Applications of JNF

Definition. Suppose $\varphi \colon V \to V$. The minimum polynomial of $\varphi$ is a monic polynomial $p(x)$ of smallest degree such that $p(\varphi) = 0$.

Lemma 3.27. If $q(x) \in F[x]$ and $q(\varphi) = 0$, then $p \mid q$.

Proof. Write $q = pa + r$, with $a, r \in F[x]$ and $\deg r < \deg p$. Then $0 = q(\varphi) = p(\varphi)a(\varphi) + r(\varphi) = r(\varphi)$; by minimality of $\deg p$, we must have $r = 0$, and so $p \mid q$.


... in Jordan normal form. So to finish we must compute what the powers of elements in JNF look like. But

$J_n = \begin{pmatrix} 0 & 1 & & 0 \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ 0 & & & 0 \end{pmatrix}, \quad J_n^2 = \begin{pmatrix} 0 & 0 & 1 & & 0 \\ & \ddots & \ddots & \ddots & \\ & & & \ddots & 1 \\ & & & & 0 \\ 0 & & & & 0 \end{pmatrix}, \quad \ldots, \quad J_n^{n-1} = \begin{pmatrix} 0 & \cdots & 0 & 1 \\ & & & 0 \\ & & & \vdots \\ 0 & & & 0 \end{pmatrix},$

and

$(\lambda I + J_n)^a = \sum_{k \geq 0} \binom{a}{k} \lambda^{a-k} J_n^k.$

Now assume $F = \mathbb{C}$.

Definition. $\exp A = \sum_{n \geq 0} \frac{A^n}{n!}$, for $A \in \mathrm{Mat}_n(\mathbb{C})$.

This is an infinite sum, and we must show it converges. This means that each matrix coefficient converges. This is very easy, but we omit it here for lack of time.

Example 3.29. For a diagonal matrix:

$\exp \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix} = \begin{pmatrix} e^{\lambda_1} & & 0 \\ & \ddots & \\ 0 & & e^{\lambda_n} \end{pmatrix},$

and convergence is the usual convergence of $\exp$.

Exercises 3.30.

(i) If $AB = BA$, then $\exp(A + B) = \exp A \cdot \exp B$.

(ii) Hence $\exp(J_n + \lambda I) = e^\lambda \exp(J_n)$.

(iii) $P \exp(A) P^{-1} = \exp(P A P^{-1})$.

So now you know how to compute $\exp(A)$, for $A \in \mathrm{Mat}_n(\mathbb{C})$.

We can use this to solve linear ODEs with constant coefficients. Consider the linear ODE

$\frac{dy}{dt} = Ay,$

for $A \in \mathrm{Mat}_n(\mathbb{C})$, $y = (y_1(t), y_2(t), \ldots, y_n(t))^T$, $y_i(t) \in C^\infty(\mathbb{C})$.

Example 3.31. Consider

$\frac{d^n z}{dt^n} + c_{n-1} \frac{d^{n-1} z}{dt^{n-1}} + \cdots + c_0 z = 0. \tag{$*$}$


This is a particular case of the above, where $A$ is the matrix

$A = \begin{pmatrix} 0 & 1 & & \\ & 0 & 1 & \\ & & \ddots & 1 \\ -c_0 & -c_1 & \cdots & -c_{n-1} \end{pmatrix}.$

To see this, consider what $Ay = y'$ means. Set $z = y_1$; then $y_2 = y_1' = z'$, $y_3 = y_2' = z''$, \ldots, $y_n = y_{n-1}' = \frac{d^{n-1}z}{dt^{n-1}}$, and ($*$) is the last equation.

There is a unique solution of $\frac{dy}{dt} = Ay$ with fixed initial condition $y(0)$, by a theorem of analysis. On the other hand:

Exercise 3.32. $\exp(At)\, y(0)$ is a solution; that is,

$\frac{d}{dt}\big( \exp(At)\, y(0) \big) = A \exp(At)\, y(0).$

Hence it is the unique solution with value $y(0)$.

Compute this when $A = \lambda I + J_n$ is a Jordan block of size $n$.
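A short NumPy/SciPy sketch of this computation (illustrative only, not from the notes): for $A = \lambda I + J_n$, the exponential is the finite sum $e^{\lambda t} \sum_{k < n} (tN)^k / k!$ with $N = J_n$ nilpotent, and $\exp(At)\, y(0)$ indeed satisfies the ODE.

```python
import numpy as np
from math import factorial
from scipy.linalg import expm

n, lam, t = 3, 2.0, 0.5
N = np.diag(np.ones(n - 1), k=1)      # the nilpotent Jordan block J_n
A = lam * np.eye(n) + N               # J_n(lambda) = lambda I + J_n

# exp(At) = e^{lambda t} * sum_{k<n} (tN)^k / k!, a finite sum since N^n = 0
series = sum(np.linalg.matrix_power(t * N, k) / factorial(k) for k in range(n))
print(np.allclose(expm(A * t), np.exp(lam * t) * series))   # True

# y(t) = exp(At) y(0) solves dy/dt = A y: compare a central difference with A exp(At)
h = 1e-6
deriv = (expm(A * (t + h)) - expm(A * (t - h))) / (2 * h)
print(np.allclose(deriv, A @ expm(A * t), atol=1e-4))        # True
```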


    4 Duals

This chapter really belongs after chapter 1: it's just definitions and interpretations of row reduction.

    9 Nov

Definition. Let $V$ be a vector space over a field $F$. Then

$V^* = L(V, F) = \{ \text{linear functions } V \to F \}$

is the dual space of $V$.

Examples 4.1.

(i) Let $V = \mathbb{R}^3$. Then $(x, y, z) \mapsto x - y$ is in $V^*$.

(ii) If $V = C([0, 1]) = \{\text{continuous functions } [0, 1] \to \mathbb{R}\}$, then $f \mapsto \int_0^1 f(t)\, dt$ is in $C([0, 1])^*$.

Definition. Let $V$ be a finite dimensional vector space over $F$, and $v_1, \ldots, v_n$ be a basis of $V$. Then define $v_i^* \in V^*$ by

$v_i^*(v_j) = \delta_{ij} = \begin{cases} 0 & \text{if } i \neq j, \\ 1 & \text{if } i = j, \end{cases}$

and extend linearly. That is, $v_i^*\left( \sum_j \lambda_j v_j \right) = \lambda_i$.

Lemma 4.2. The set $v_1^*, \ldots, v_n^*$ is a basis for $V^*$, called the basis dual to (or dual basis for) $v_1, \ldots, v_n$. In particular, $\dim V^* = \dim V$.

Proof. Linear independence: if $\sum \lambda_i v_i^* = 0$, then $0 = \left( \sum \lambda_i v_i^* \right)(v_j) = \lambda_j$, so $\lambda_j = 0$ for all $j$. Span: if $\theta \in V^*$, then we claim

$\theta = \sum_{j=1}^n \theta(v_j)\, v_j^*.$

As $\theta$ is linear, it is enough to check that the right hand side applied to $v_k$ is $\theta(v_k)$. But

$\sum_j \theta(v_j)\, v_j^*(v_k) = \sum_j \theta(v_j)\, \delta_{jk} = \theta(v_k).$
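Concretely (a sketch, not from the notes): if a basis of $F^n$ is given by the columns of an invertible matrix $P$, then the dual basis functionals are the rows of $P^{-1}$, since $(P^{-1}P)_{ij} = \delta_{ij}$; and any functional $\theta$ decomposes as in the lemma.

```python
import numpy as np

P = np.array([[1., 1., 0.],          # columns of P are the basis vectors v_1, v_2, v_3
              [0., 1., 1.],
              [0., 0., 1.]])
dual = np.linalg.inv(P)              # row i of P^{-1} is the dual functional v_i^*
print(np.round(dual @ P))            # identity: v_i^*(v_j) = delta_{ij}

theta = np.array([2., -1., 3.])      # the functional theta(x) = 2x_1 - x_2 + 3x_3
coords = np.array([theta @ P[:, j] for j in range(3)])   # the values theta(v_j)
print(np.allclose(theta, coords @ dual))                 # True: theta = sum_j theta(v_j) v_j^*
```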

Remarks.

(i) We know in general that $\dim L(V, W) = \dim V \cdot \dim W$.

(ii) If $V$ is finite dimensional, then this shows that $V^* \cong V$, as any two vector spaces of dimension $n$ are isomorphic. But they are not canonically isomorphic (there is no natural choice of isomorphism).

If the vector space $V$ has more structure (for example, a group $G$ acts upon it), then $V$ and $V^*$ are not usually isomorphic in a way that respects this structure.

(iii) If $V = F[x]$, then $V^* \cong F^{\mathbb{N}}$ by the isomorphism $\theta \mapsto (\theta(1), \theta(x), \theta(x^2), \ldots)$, and conversely, if $\lambda_i \in F$, $i = 0, 1, 2, \ldots$ is any sequence of elements of $F$, we get an element of $V^*$ by sending $\sum a_i x^i \mapsto \sum a_i \lambda_i$ (notice this is a finite sum). Thus $V$ and $V^*$ are not isomorphic, since $\dim V$ is countable, but $\dim F^{\mathbb{N}}$ is uncountable.


Definition. Let $V$ and $W$ be vector spaces over $F$, and $\alpha$ a linear map $V \to W$, $\alpha \in L(V, W)$. Then we define $\alpha^* \colon W^* \to V^*$, $\alpha^* \in L(W^*, V^*)$, by setting $\alpha^*(\theta) = \theta \circ \alpha \colon V \to F$. (Note: $\theta$ linear, $\alpha$ linear imply $\theta \circ \alpha$ linear, and so $\theta \circ \alpha \in V^*$ as claimed, if $\theta \in W^*$.)

Lemma 4.3. Let $V, W$ be finite dimensional vector spaces, with

• $v_1, \ldots, v_n$ a basis of $V$, and $w_1, \ldots, w_m$ a basis for $W$;

• $v_1^*, \ldots, v_n^*$ the dual basis of $V^*$, and $w_1^*, \ldots, w_m^*$ the dual basis for $W^*$.

If $\alpha$ is a linear map $V \to W$, and $A$ is the matrix of $\alpha$ with respect to $v_i, w_j$, then $A^T$ is the matrix of $\alpha^* \colon W^* \to V^*$ with respect to $w_j^*, v_i^*$.

Proof. Write $\alpha^*(w_i^*) = \sum_{j=1}^n c_{ji} v_j^*$, so $(c_{ij})$ is the matrix of $\alpha^*$. Apply this to $v_k$:

LHS $= \big( \alpha^*(w_i^*) \big)(v_k) = w_i^*(\alpha(v_k)) = w_i^*\left( \sum_l a_{lk} w_l \right) = a_{ik},$
RHS $= c_{ki},$

that is, $c_{ki} = a_{ik}$ for all $i, k$.

This was the promised interpretation of $A^T$.

Corollary 4.4.

(i) $(\alpha \circ \beta)^* = \beta^* \circ \alpha^*$;

(ii) $(\alpha + \beta)^* = \alpha^* + \beta^*$;

(iii) $\det \alpha^* = \det \alpha$.

Proof. (i) and (ii) are immediate from the definition, or use the result $(AB)^T = B^T A^T$. (iii) we proved in the section on determinants, where we showed that $\det A^T = \det A$.

Now observe that $(A^T)^T = A$. What does this mean?

Proposition 4.5.

(i) Consider the map $V \to V^{**} = (V^*)^*$ taking $v \mapsto \hat{v}$, where $\hat{v}(\theta) = \theta(v)$ for $\theta \in V^*$. Then $\hat{v} \in V^{**}$, and the map $V \to V^{**}$ is linear and injective.

(ii) Hence if $V$ is a finite dimensional vector space over $F$, then this map is an isomorphism, so $V \cong V^{**}$ canonically.

Proof.

(i) We first show $\hat{v} \in V^{**}$, that is, $\hat{v} \colon V^* \to F$ is linear:

$\hat{v}(a_1 \theta_1 + a_2 \theta_2) = (a_1 \theta_1 + a_2 \theta_2)(v) = a_1 \theta_1(v) + a_2 \theta_2(v) = a_1 \hat{v}(\theta_1) + a_2 \hat{v}(\theta_2).$

Next, the map $V \to V^{**}$ is linear. This is because

$\widehat{\lambda_1 v_1 + \lambda_2 v_2}(\theta) = \theta(\lambda_1 v_1 + \lambda_2 v_2) = \lambda_1 \theta(v_1) + \lambda_2 \theta(v_2) = (\lambda_1 \hat{v}_1 + \lambda_2 \hat{v}_2)(\theta).$

Finally, if $v \neq 0$, then there exists a linear function $\theta \colon V \to F$ such that $\theta(v) \neq 0$.


(Proof: extend $v$ to a basis, and then define $\theta$ on this basis. We've only proved that this is okay when $V$ is finite dimensional, but it's always okay.)

Thus $\hat{v}(\theta) \neq 0$, so $\hat{v} \neq 0$, and $V \to V^{**}$ is injective.

(ii) Immediate.

Definition.

(i) If $U \subseteq V$, then define

$U^\circ = \{ \theta \in V^* \mid \theta(U) = 0 \} = \{ \theta \in V^* \mid \theta(u) = 0 \ \forall u \in U \} \leq V^*.$

This is the annihilator of $U$, a subspace of $V^*$, often denoted $U^\perp$.

(ii) If $W \subseteq V^*$, then define

$W^\circ = \{ v \in V \mid \theta(v) = 0 \ \forall \theta \in W \} \leq V.$

This is often denoted $W^\perp$.

Example 4.6. If $V = \mathbb{R}^3$ and $U = \langle (1, 2, 1) \rangle$, then
$$U^\circ = \Big\{\sum_{i=1}^3 a_i e_i^* \in V^* \ \Big|\ a_1 + 2a_2 + a_3 = 0\Big\} = \Big\langle \begin{pmatrix} 2 \\ -1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ -1 \\ 2 \end{pmatrix} \Big\rangle.$$

Remark. If $V$ is finite dimensional and $W \leq V^*$, then under the canonical isomorphism $V \cong V^{**}$ we have $W^\circ \cong W^\circ$, where the first $W^\circ \leq V$ (as in (ii)) and the second $W^\circ \leq (V^*)^*$ (as in (i)). Proof is an exercise.
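A quick numerical check of Example 4.6, assuming numpy: a functional $\sum a_i e_i^*$ annihilates $U$ exactly when $(a_1, a_2, a_3)$ pairs to zero with $(1, 2, 1)$, so $U^\circ$ corresponds to the null space of the $1 \times 3$ matrix $(1\ 2\ 1)$.

```python
import numpy as np

u = np.array([1., 2., 1.])

# The two spanning functionals claimed in Example 4.6 kill u:
for a in (np.array([2., -1., 0.]), np.array([0., -1., 2.])):
    print(a @ u)                                   # 0.0 both times

# dim U^o = dim V - dim U = 3 - 1 = 2, matching Lemma 4.7:
print(3 - np.linalg.matrix_rank(u.reshape(1, 3)))  # 2
```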

Lemma 4.7. Let $V$ be a finite dimensional vector space with $U \leq V$. Then $\dim U + \dim U^\circ = \dim V$.

Proof. Consider the restriction map $\operatorname{Res} : V^* \to U^*$ taking $\theta \mapsto \theta|_U$. (Note that $\operatorname{Res} = \iota^*$, where $\iota : U \to V$ is the inclusion.) Then $\ker \operatorname{Res} = U^\circ$, by definition, and $\operatorname{Res}$ is surjective (why?). So the rank-nullity theorem implies the result, as $\dim V^* = \dim V$ and $\dim U^* = \dim U$.

Proposition 4.8. Let $V, W$ be finite dimensional vector spaces over $F$, with $\alpha \in L(V, W)$. Then
(i) $\ker(\alpha^* : W^* \to V^*) = (\operatorname{Im}\alpha)^\circ$ ($\leq W^*$);
(ii) $\operatorname{rank}(\alpha^*) = \operatorname{rank}(\alpha)$; that is, $\operatorname{rank} A^T = \operatorname{rank} A$, as promised;
(iii) $\operatorname{Im}\alpha^* = (\ker\alpha)^\circ$.

Proof.
(i) Let $\theta \in W^*$. Then $\theta \in \ker\alpha^* \iff \theta\alpha = 0 \iff \theta(\alpha(v)) = 0\ \forall v \in V \iff \theta \in (\operatorname{Im}\alpha)^\circ$.
(ii) By rank-nullity, we have
$$\operatorname{rank}\alpha^* = \dim W^* - \dim\ker\alpha^* = \dim W - \dim(\operatorname{Im}\alpha)^\circ \quad\text{by (i)},$$
which equals $\dim\operatorname{Im}\alpha$ by the previous lemma, and this is $\operatorname{rank}\alpha$ by definition.


(iii) Let $\theta \in \operatorname{Im}\alpha^*$, say $\theta = \alpha^*(\phi) = \phi\alpha$ for some $\phi \in W^*$. Now let $v \in \ker\alpha$. Then $\theta(v) = \phi(\alpha(v)) = 0$, so $\theta \in (\ker\alpha)^\circ$; that is, $\operatorname{Im}\alpha^* \subseteq (\ker\alpha)^\circ$. But by (ii),
$$\dim\operatorname{Im}\alpha^* = \operatorname{rank}(\alpha^*) = \operatorname{rank}\alpha = \dim V - \dim\ker\alpha = \dim(\ker\alpha)^\circ$$
by the previous lemma; that is, they both have the same dimension, so they are equal.
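Part (ii) is easy to check numerically. A minimal numpy sketch with an arbitrary rank-2 matrix (not an example from the notes):

```python
import numpy as np

# A 4x6 matrix of rank 2, the matrix of some alpha : V -> W (dim V = 6, dim W = 4).
A = (np.outer([1, 2, 0, 1], [1, 0, 1, 0, 2, 1])
     + np.outer([0, 1, 1, 0], [2, 1, 0, 1, 0, 0])).astype(float)

r = np.linalg.matrix_rank(A)
print(r, np.linalg.matrix_rank(A.T))   # 2 2 : rank(alpha*) = rank(alpha)
print(A.shape[0] - r)                  # 2 = dim ker(alpha*) = dim (Im alpha)^o
```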

Lemma 4.9. Let $U_1, U_2 \leq V$, and $V$ finite dimensional. Then
(i) $(U_1^\circ)^\circ \cong U_1$ under the isomorphism $V \cong V^{**}$;
(ii) $(U_1 + U_2)^\circ = U_1^\circ \cap U_2^\circ$;
(iii) $(U_1 \cap U_2)^\circ = U_1^\circ + U_2^\circ$.

Proof. Exercise!


    5 Bilinear forms

12 Nov

Definition. Let $V$ be a vector space over $F$. A bilinear form on $V$ is a multilinear form $V \times V \to F$; that is, $\phi : V \times V \to F$ such that
$$\phi(v, a_1 w_1 + a_2 w_2) = a_1\,\phi(v, w_1) + a_2\,\phi(v, w_2),$$
$$\phi(a_1 v_1 + a_2 v_2, w) = a_1\,\phi(v_1, w) + a_2\,\phi(v_2, w).$$

Examples 5.1.
(i) $V = F^n$, $\phi\big((x_1,\dots,x_n),(y_1,\dots,y_n)\big) = \sum_{i=1}^n x_i y_i$, which is the dot product when $F = \mathbb{R}$.
(ii) $V = F^n$, $A \in \operatorname{Mat}_n(F)$. Define $\phi(v, w) = v^T A w$. This is bilinear. (i) is the special case when $A = I$. Another special case is $A = 0$, which is also a bilinear form.
(iii) Take $V = C([0, 1])$, the set of continuous functions on $[0, 1]$. Then
$$(f, g) \mapsto \int_0^1 f(t)\, g(t)\, dt$$
is bilinear.

Definition. The set of bilinear forms on $V$ is denoted
$$\operatorname{Bil}(V) = \{\phi : V \times V \to F \text{ bilinear}\}.$$

Exercise 5.2. If $g \in GL(V)$, $\phi \in \operatorname{Bil}(V)$, then $g\phi : (v, w) \mapsto \phi(g^{-1}v, g^{-1}w)$ is a bilinear form. Show this defines a group action of $GL(V)$ on $\operatorname{Bil}(V)$. In particular, show that $h(g\phi) = (hg)\phi$, and you'll see why the inverse is in the definition of $g\phi$.

Definition. We say that $\phi, \psi \in \operatorname{Bil}(V)$ are isomorphic if there is some $g \in GL(V)$ such that $\psi = g\phi$; that is, if they are in the same orbit.

Q: What are the orbits of $GL(V)$ on $\operatorname{Bil}(V)$; that is, what are the isomorphism classes of bilinear forms?

Compare with:
$L(V, W)/\big(GL(V) \times GL(W)\big) \leftrightarrow \{i \in \mathbb{N} \mid 0 \leq i \leq \min(\dim V, \dim W)\}$, via $\alpha \mapsto \operatorname{rank}\alpha$; here $(g, h)\alpha = h\alpha g^{-1}$.
$L(V, V)/GL(V) \leftrightarrow$ JNF; here $g\alpha = g\alpha g^{-1}$, and we require $F$ algebraically closed.
$\operatorname{Bil}(V)/GL(V) \leftrightarrow\ ??$, with $(g\phi)(v, w) = \phi(g^{-1}v, g^{-1}w)$.

First, let's express this in matrix form. Let $v_1,\dots,v_n$ be a basis for $V$, where $V$ is a finite dimensional vector space over $F$, and $\phi \in \operatorname{Bil}(V)$. Then
$$\phi\Big(\sum_i x_i v_i, \sum_j y_j v_j\Big) = \sum_{i,j} x_i y_j\,\phi(v_i, v_j).$$
So if we define a matrix $A$ by $A = (a_{ij})$, $a_{ij} = \phi(v_i, v_j)$, then we say that $A$ is the matrix of the bilinear form $\phi$ with respect to the basis $v_1,\dots,v_n$.


In other words, the isomorphism $V \cong F^n$ induces an isomorphism $\operatorname{Bil}(V) \cong \operatorname{Mat}_n(F)$, $\phi \mapsto A = (a_{ij})$, $a_{ij} = \phi(v_i, v_j)$.

Now let $v_1',\dots,v_n'$ be another basis, with $v_j' = \sum_i p_{ij} v_i$. Then
$$\phi(v_a', v_b') = \phi\Big(\sum_i p_{ia} v_i, \sum_j p_{jb} v_j\Big) = \sum_{i,j} p_{ia}\,\phi(v_i, v_j)\,p_{jb} = (P^T A P)_{ab}.$$
So if $P$ is the matrix of the linear map $g^{-1} : V \to V$, then the matrix of $g\phi = \phi(g^{-1}(\cdot), g^{-1}(\cdot))$ is $P^T A P$. So the concrete version of our question "what are the orbits of $\operatorname{Bil}(V)/GL(V)$?" is "what are the orbits of $GL_n$ on $\operatorname{Mat}_n(F)$ for this action?"
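A small numpy sketch of this change-of-basis action, with arbitrary example matrices: congruent matrices $P^TAP$ represent the same form, and in particular have the same rank.

```python
import numpy as np

A = np.array([[0., 1., 2.],
              [3., 0., 1.],
              [1., 1., 0.]])   # matrix of some bilinear form phi in one basis
P = np.array([[1., 2., 0.],
              [0., 1., 1.],
              [1., 0., 1.]])   # invertible change-of-basis matrix

B = P.T @ A @ P                # matrix of phi in the new basis
print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(B))   # equal ranks
```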

Definition. Suppose $Q$ acts on $A$ by $A \mapsto QAQ^T$. We say that $A$ and $B$ are congruent if $B = QAQ^T$ for some $Q \in GL_n$.

We want to understand when two matrices are congruent.

Recall that if $P, Q \in GL_n$, then $\operatorname{rank}(PAQ) = \operatorname{rank}(A)$. Hence taking $Q = P^T$, we get $\operatorname{rank}(PAP^T) = \operatorname{rank} A$, and so the following definition makes sense:

Definition. If $\phi \in \operatorname{Bil}(V)$, then the rank of $\phi$, denoted $\operatorname{rank}\phi$ or $\operatorname{rk}\phi$, is the rank of the matrix of $\phi$ with respect to some (and hence any) basis of $V$.

    We will see later how to give a basis independent definition of the rank.

Definition. A form $\phi \in \operatorname{Bil}(V)$ is symmetric if $\phi(v, w) = \phi(w, v)$ for all $v, w \in V$; in terms of the matrix $A$ of $\phi$, this is requiring $A^T = A$. It is anti-symmetric if $\phi(v, v) = 0$ for all $v \in V$, which implies $\phi(v, w) = -\phi(w, v)$ for all $v, w \in V$; in terms of the matrix, $A^T = -A$.

From now on, we assume that $\operatorname{char} F \neq 2$, so $1 + 1 = 2 \neq 0$ and $1/2$ exists.

Given $\phi$, put
$$\phi^+(v, w) = \tfrac{1}{2}\big(\phi(v, w) + \phi(w, v)\big), \qquad \phi^-(v, w) = \tfrac{1}{2}\big(\phi(v, w) - \phi(w, v)\big),$$
which splits a form into symmetric and anti-symmetric components, $\phi = \phi^+ + \phi^-$.

Observe that if $\phi$ is symmetric or anti-symmetric, then so is $g\phi = \phi(g^{-1}(\cdot), g^{-1}(\cdot))$; in matrix form, $A$ is symmetric or anti-symmetric if and only if $PAP^T$ is, since $(PAP^T)^T = PA^TP^T$.

So to understand $\operatorname{Bil}(V)/GL(V)$, we will first understand the simpler question of classifying symmetric and anti-symmetric forms. Set
$$\operatorname{Bil}^\epsilon(V) = \{\phi \in \operatorname{Bil}(V) \mid \phi(v, w) = \epsilon\,\phi(w, v)\ \forall v, w \in V\}, \qquad \epsilon = \pm 1.$$
So $\operatorname{Bil}^+(V)$ is the symmetric forms, and $\operatorname{Bil}^-(V)$ is the antisymmetric forms.

So our simpler question is to ask: what is $\operatorname{Bil}^\epsilon(V)/GL(V)$?

Hard exercise: once you've finished revising the course, go and classify $\operatorname{Bil}(V)/GL(V)$.


    5.1 Symmetric forms

Let $V$ be a finite dimensional vector space over $F$ with $\operatorname{char} F \neq 2$. If $\phi \in \operatorname{Bil}^+(V)$ is a symmetric form, then define $Q = Q_\phi : V \to F$ as
$$Q(v) = Q_\phi(v) = \phi(v, v).$$
We have
$$Q(u + v) = \phi(u + v, u + v) = \phi(u, u) + \phi(v, v) + \phi(u, v) + \phi(v, u) = Q(u) + Q(v) + \phi(u, v) + \phi(v, u),$$
$$Q(\lambda u) = \phi(\lambda u, \lambda u) = \lambda^2\,\phi(u, u) = \lambda^2\, Q(u).$$

Definition. A quadratic form on $V$ is a function $Q : V \to F$ such that
(i) $Q(\lambda v) = \lambda^2 Q(v)$;
(ii) setting $\phi_Q(u, v) = \tfrac{1}{2}\big(Q(u + v) - Q(u) - Q(v)\big)$, the map $\phi_Q : V \times V \to F$ is bilinear.

Lemma 5.3. The map $\operatorname{Bil}^+(V) \to \{\text{quadratic forms on } V\}$, $\phi \mapsto Q_\phi$, is a bijection; $Q \mapsto \phi_Q$ is its inverse.

Proof. Clear. We just note that
$$\phi_Q(v, v) = \tfrac{1}{2}\big(Q(2v) - 2\,Q(v)\big) = \tfrac{1}{2}\big(4\,Q(v) - 2\,Q(v)\big) = Q(v),$$
as $Q(\lambda u) = \lambda^2 Q(u)$.

Remark. If $v_1,\dots,v_n$ is a basis of $V$ with $\phi(v_i, v_j) = a_{ij}$, then
$$Q\Big(\sum_i x_i v_i\Big) = \sum_{i,j} a_{ij} x_i x_j = x^T A x,$$
that is, a quadratic form is a homogeneous polynomial of degree 2 in the variables $x_1,\dots,x_n$.

Theorem 5.4. Let $V$ be a finite dimensional vector space over $F$ and $\phi \in \operatorname{Bil}^+(V)$ a symmetric bilinear form. Then there is some basis $v_1,\dots,v_n$ of $V$ such that $\phi(v_i, v_j) = 0$ if $i \neq j$. That is, we can choose a basis so that the matrix of $\phi$ is diagonal.

Proof. Induct on $\dim V$. The case $\dim V = 1$ is clear. It is also clear if $\phi(v, w) = 0$ for all $v, w \in V$. So assume otherwise. Then there exists a $w \in V$ such that $\phi(w, w) \neq 0$. (For if $\phi(w, w) = 0$ for all $w \in V$, that is, $Q(w) = 0$ for all $w \in V$, then by the lemma $\phi(v, w) = 0$ for all $v, w \in V$.)

To continue, we need some notation. For an arbitrary $\phi \in \operatorname{Bil}(V)$ and $U \leq V$, define
$$U^\perp = \{v \in V : \phi(u, v) = 0 \text{ for all } u \in U\}.$$


Claim. $\langle w\rangle \oplus \langle w\rangle^\perp = V$; that is, the sum is direct and is all of $V$.

[Proof of claim] As $\phi(w, w) \neq 0$, $w \notin \langle w\rangle^\perp$, so $\langle w\rangle \cap \langle w\rangle^\perp = 0$, and the sum is direct. Now we must show $\langle w\rangle + \langle w\rangle^\perp = V$.

Let $v \in V$. Consider $v - \lambda w$. We want to find a $\lambda$ such that $v - \lambda w \in \langle w\rangle^\perp$, as then $v = \lambda w + (v - \lambda w)$ shows $v \in \langle w\rangle + \langle w\rangle^\perp$. But $v - \lambda w \in \langle w\rangle^\perp \iff \phi(w, v - \lambda w) = 0 \iff \phi(w, v) = \lambda\,\phi(w, w)$; that is, set
$$\lambda = \frac{\phi(v, w)}{\phi(w, w)}.$$

14 Nov

Now let $W = \langle w\rangle^\perp$, and let $\psi = \phi|_W : W \times W \to F$ be the restriction of $\phi$. This is symmetric bilinear, so by induction there is some basis $v_2,\dots,v_n$ of $W$ such that $\psi(v_i, v_j) = \lambda_i\,\delta_{ij}$ for some $\lambda_i \in F$. Hence, as $\phi(w, v_i) = \phi(v_i, w) = 0$ if $i \geq 2$, put $v_1 = w$ and we get that with respect to the basis $v_1,\dots,v_n$ the matrix of $\phi$ is
$$\begin{pmatrix} \phi(w, w) & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{pmatrix}.$$

Warning. The diagonal entries are not determined by $\phi$; for example, consider

$$\begin{pmatrix} a_1 & & \\ & \ddots & \\ & & a_n \end{pmatrix}\begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}\begin{pmatrix} a_1 & & \\ & \ddots & \\ & & a_n \end{pmatrix}^T = \begin{pmatrix} a_1^2\lambda_1 & & \\ & \ddots & \\ & & a_n^2\lambda_n \end{pmatrix},$$
that is, rescaling the basis element $v_i$ to $a_i v_i$ changes $Q(a_i v_i) = a_i^2\, Q(v_i)$.

Also, we can reorder our basis: equivalently, take $P = P(w)$, the permutation matrix of $w \in S_n$, and note $P^T = P(w^{-1})$, so
$$P(w)\, A\, P(w)^T = P(w)\, A\, P(w)^{-1}.$$

Furthermore, it's not obvious that more complicated things can't happen; for example,
$$P\begin{pmatrix} 2 & \\ & 3 \end{pmatrix}P^T = \begin{pmatrix} 5 & \\ & 30 \end{pmatrix} \quad\text{if } P = \begin{pmatrix} 1 & 1 \\ 3 & -2 \end{pmatrix}.$$
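For a quick check of this identity, assuming numpy (the point is only that congruence can scramble the diagonal entries while keeping the matrix diagonal):

```python
import numpy as np

P = np.array([[1., 1.],
              [3., -2.]])
print(P @ np.diag([2., 3.]) @ P.T)   # [[ 5.  0.]
                                     #  [ 0. 30.]]
```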

Corollary 5.5. Let $V$ be a finite dimensional vector space over $F$, and suppose $F$ is algebraically closed (such as $F = \mathbb{C}$). Then
$$\operatorname{Bil}^+(V)/GL(V) \leftrightarrow \{i : 0 \leq i \leq \dim V\},$$
under the map taking $\phi \mapsto \operatorname{rank}\phi$.

Proof. By the above, we can reorder and rescale so the matrix looks like
$$\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix},$$
as $\sqrt{\lambda_i}$ is always in $F$.


That is, there exists a basis $v_1,\dots,v_n$ of $V$ such that
$$Q\Big(\sum_{i=1}^n x_i v_i\Big) = \sum_{i=1}^r x_i^2,$$
where $r = \operatorname{rank} Q \leq n$.

Now let $F = \mathbb{R}$, and let $\phi : V \times V \to \mathbb{R}$ be bilinear symmetric. By the theorem, there is some basis $v_1,\dots,v_n$ such that $\phi(v_i, v_j) = \lambda_i\,\delta_{ij}$. Replacing $v_i$ by $v_i/\sqrt{|\lambda_i|}$ if $\lambda_i \neq 0$ and reordering the basis, we get that $\phi$ is represented by the matrix
$$\begin{pmatrix} I_p & & \\ & -I_q & \\ & & 0 \end{pmatrix},$$
for some $p, q \geq 0$; that is, with respect to this basis
$$Q\Big(\sum_{i=1}^n x_i v_i\Big) = \sum_{i=1}^p x_i^2 - \sum_{i=p+1}^{p+q} x_i^2.$$
Note that $\operatorname{rank}\phi = p + q$.

Definition. The signature of $\phi$ is $\operatorname{sign}\phi = p - q$.

    We need to show this is well defined, and not an artefact of the basis chosen.

Theorem 5.6: Sylvester's law of inertia. The signature does not depend on the choice of basis; that is, if $\phi$ is represented by
$$\begin{pmatrix} I_p & & \\ & -I_q & \\ & & 0 \end{pmatrix} \text{ w.r.t. } v_1,\dots,v_n \qquad\text{and by}\qquad \begin{pmatrix} I_{p'} & & \\ & -I_{q'} & \\ & & 0 \end{pmatrix} \text{ w.r.t. } w_1,\dots,w_n,$$
then $p = p'$ and $q = q'$.

Warning: $\operatorname{tr}(P^TAP) \neq \operatorname{tr}(A)$ in general, so we can't prove it that way.

Definition. Let $Q : V \to \mathbb{R}$ be a quadratic form on $V$, where $V$ is a vector space over $\mathbb{R}$, and let $U \leq V$. We say $Q$ is positive semi-definite on $U$ if $Q(u) \geq 0$ for all $u \in U$. If furthermore $Q(u) = 0 \iff u = 0$ (for $u \in U$), then we say that $Q$ is positive definite on $U$. If $U = V$, we just say that $Q$ is positive (semi-)definite.

We define negative (semi-)definite to mean that $-Q$ is positive (semi-)definite.

Proof of theorem. Let $P = \langle v_1,\dots,v_p\rangle$. So if $v = \sum_{i=1}^p \lambda_i v_i \in P$, then $Q(v) = \sum_i \lambda_i^2 \geq 0$, and $Q(v) = 0 \iff v = 0$, so $Q$ is positive definite on $P$.

Let $U = \langle v_{p+1},\dots,v_{p+q},\dots,v_n\rangle$, so $Q$ is negative semi-definite on $U$. And now let $P'$ be any positive definite subspace.


Claim. $P' \cap U = \{0\}$.

Proof of claim. If $v \in P'$ then $Q(v) \geq 0$; if $v \in U$ then $Q(v) \leq 0$. So if $v \in P' \cap U$, $Q(v) = 0$. But $P'$ is positive definite, so $v = 0$. Hence
$$\dim P' + \dim U = \dim(P' + U) \leq \dim V = n,$$
and so
$$\dim P' \leq \dim V - \dim U = \dim P,$$
that is, $p = \dim P$ is the maximum dimension of any positive definite subspace, and hence $p = p'$. Similarly, $q$ is the maximum dimension of any negative definite subspace, so $q = q'$.

Note that $(p, q)$ determine $(\operatorname{rank}, \operatorname{sign})$, and conversely $p = \tfrac{1}{2}(\operatorname{rank} + \operatorname{sign})$ and $q = \tfrac{1}{2}(\operatorname{rank} - \operatorname{sign})$. So we now have
$$\operatorname{Bil}^+(\mathbb{R}^n)/GL_n(\mathbb{R}) \leftrightarrow \{(p, q) : p, q \geq 0,\ p + q \leq n\} \leftrightarrow \{(\operatorname{rank}, \operatorname{sign})\}.$$

Example 5.7. Let $V = \mathbb{R}^2$, and $Q\begin{pmatrix} x_1 \\ x_2\end{pmatrix} = x_1^2 - x_2^2$. Consider the line $L_\lambda = \langle e_1 + \lambda e_2\rangle$: we have $Q(e_1 + \lambda e_2) = 1 - \lambda^2$, so $Q$ is positive definite on $L_\lambda$ if $|\lambda| < 1$. In particular, $p = q = 1$, but there are many choices of positive and negative definite subspaces of maximal dimension. (Recall that lines in $\mathbb{R}^2$ are parameterised by points on the circle $\mathbb{R} \cup \{\infty\}$.)


Example 5.8. Compute the rank and signature of
$$Q(x, y, z) = x^2 + y^2 + 2z^2 + 2xy + 2xz - 2yz.$$
Note the matrix $A$ of $Q$ is
$$A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & -1 \\ 1 & -1 & 2 \end{pmatrix}, \qquad\text{that is,}\qquad Q\begin{pmatrix} x \\ y \\ z\end{pmatrix} = (x\ y\ z)\, A\begin{pmatrix} x \\ y \\ z\end{pmatrix}.$$
(Recall that for an arbitrary quadratic form $Q$, its matrix $A$ satisfies $Q\big(\sum_i x_i v_i\big) = \sum_{i,j} a_{ij} x_i x_j = \sum_i a_{ii} x_i^2 + 2\sum_{i<j} a_{ij} x_i x_j$.)


Method 1: perform row and column operations simultaneously (each row operation accompanied by the same column operation). First $R_2 \to R_2 - R_1$ and $C_2 \to C_2 - C_1$, giving
$$\begin{pmatrix} 1 & 0 & 1 \\ 0 & 0 & -2 \\ 1 & -2 & 2 \end{pmatrix}.$$
Next $R_3 \to R_3 - R_1$ and $C_3 \to C_3 - C_1$, giving
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & -2 \\ 0 & -2 & 1 \end{pmatrix}.$$
Then swap $R_2, R_3$ and $C_2, C_3$, giving
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & -2 \\ 0 & -2 & 0 \end{pmatrix}.$$
Then $R_3 \to R_3 + 2R_2$ and $C_3 \to C_3 + 2C_2$, giving
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -4 \end{pmatrix}.$$
Finally, rescale the last basis vector, giving
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}.$$

That is, if we put
$$P = (I - E_{12})(I - E_{13})\, P\big((2\ 3)\big)\, (I + 2E_{23})\begin{pmatrix} 1 & & \\ & 1 & \\ & & \tfrac12 \end{pmatrix},$$
then
$$P^T A P = \begin{pmatrix} 1 & & \\ & 1 & \\ & & -1 \end{pmatrix}.$$
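As a quick sanity check, the matrix $P$ above can be multiplied out numerically, assuming numpy (here $E(i, j)$ denotes the matrix with a single 1 in position $(i, j)$):

```python
import numpy as np

A = np.array([[1., 1., 1.],
              [1., 1., -1.],
              [1., -1., 2.]])

I = np.eye(3)
def E(i, j):
    M = np.zeros((3, 3)); M[i, j] = 1.0; return M   # single 1 in position (i, j)

perm23 = np.array([[1., 0., 0.],                     # permutation matrix for (2 3)
                   [0., 0., 1.],
                   [0., 1., 0.]])

P = (I - E(0, 1)) @ (I - E(0, 2)) @ perm23 @ (I + 2 * E(1, 2)) @ np.diag([1., 1., 0.5])
print(P.T @ A @ P)   # diag(1, 1, -1), up to rounding
```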

Method 2: we could just try to complete the square:
$$Q(x, y, z) = (x + y + z)^2 + z^2 - 4yz = (x + y + z)^2 + (z - 2y)^2 - 4y^2,$$
which again gives rank 3 and signature $2 - 1 = 1$.

Remark. We will see in Chapter 6 that $\operatorname{sign}(A)$ is the number of positive eigenvalues minus the number of negative eigenvalues, so we could also compute it by computing the characteristic polynomial of $A$.
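Following that remark, a minimal numpy sketch computing the rank and signature of this $A$ from the signs of its eigenvalues (the tolerance $10^{-10}$ is an arbitrary choice):

```python
import numpy as np

A = np.array([[1., 1., 1.],
              [1., 1., -1.],
              [1., -1., 2.]])

eigs = np.linalg.eigvalsh(A)                 # real eigenvalues of a symmetric matrix
rank = int(np.sum(np.abs(eigs) > 1e-10))
sign = int(np.sum(eigs > 1e-10)) - int(np.sum(eigs < -1e-10))
print(rank, sign)                            # 3 1
```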

    5.2 Anti-symmetric forms

We begin with a basis independent meaning of the rank of an arbitrary bilinear form.

Proposition 5.9. Let $V$ be a finite dimensional vector space over $F$, and $\phi \in \operatorname{Bil}(V)$. Then
$$\operatorname{rank}\phi = \dim V - \dim V^\perp = \dim V - \dim{}^\perp V,$$
where $V^\perp = \{v \in V : \phi(V, v) = 0\}$ and ${}^\perp V = \{v \in V : \phi(v, V) = 0\}$.

16 Nov

Proof. Define a linear map $\operatorname{Bil}(V) \to L(V, V^*)$, $\phi \mapsto L_\phi$, with $L_\phi(v)(w) = \phi(v, w)$. First we check that this is well-defined: $\phi(v, \cdot)$ linear implies $L_\phi(v) \in V^*$, and $\phi(\cdot, w)$ linear implies $L_\phi(v + \lambda v') = L_\phi(v) + \lambda L_\phi(v')$; that is, $L_\phi$ is linear, and $L_\phi \in L(V, V^*)$.


But $\alpha(w) = \phi(w, \cdot) : V \to F$, and so $\alpha(w) = 0$ if and only if $w \in {}^\perp V$; that is, $\ker\alpha = W \cap {}^\perp V$, proving the proposition.

Remark. If you are comfortable with the notion of a quotient vector space, consider instead the map $V \to \big(W/(W \cap {}^\perp V)\big)^*$, $v \mapsto \phi(\cdot, v)$, and show it is well-defined, surjective and has $W^\perp$ as the kernel.

Example 5.12. If $V = \mathbb{R}^2$ and $Q\begin{pmatrix} x_1 \\ x_2\end{pmatrix} = x_1^2 - x_2^2$, then $A = \begin{pmatrix} 1 & 0 \\ 0 & -1\end{pmatrix}$. Then if $W = \Big\langle\begin{pmatrix} 1 \\ 1\end{pmatrix}\Big\rangle$, we have $W^\perp = W$, and the proposition says $1 + 1 - 0 = 2$.

Or if we let $V = \mathbb{C}^2$, $Q\begin{pmatrix} x_1 \\ x_2\end{pmatrix} = x_1^2 + x_2^2$, so $A = \begin{pmatrix} 1 & 0 \\ 0 & 1\end{pmatrix}$, and set $W = \Big\langle\begin{pmatrix} 1 \\ i\end{pmatrix}\Big\rangle$, then $W^\perp = W$.

Corollary 5.13. $\phi|_W : W \times W \to F$ is non-degenerate if and only if $V = W \oplus W^\perp$.

Proof. ($\Rightarrow$) $\phi|_W$ non-degenerate means that for all $w \in W \setminus \{0\}$ there is some $w' \in W$ such that $\phi(w, w') \neq 0$. So if $w \in W \cap W^\perp$ with $w \neq 0$, then $\phi(w, w') = 0$ for all $w' \in W$, a contradiction; so $W \cap W^\perp = \{0\}$. Now
$$\dim(W + W^\perp) = \dim W + \dim W^\perp \geq \dim V,$$
by the proposition, so $W + W^\perp = V$ (and also, if $\phi$ is non-degenerate on all of $V$, then $\phi|_{W^\perp}$ is clearly non-degenerate too).

($\Leftarrow$) Clear by our earlier remarks that $W \cap W^\perp = 0$ if and only if $\phi|_W$ is non-degenerate.

Theorem 5.14. Let $\phi \in \operatorname{Bil}^-(V)$ be an anti-symmetric bilinear form. Then there is some basis $v_1,\dots,v_n$ of $V$ such that the matrix of $\phi$ is block diagonal,
$$\begin{pmatrix} 0 & 1 & & & & \\ -1 & 0 & & & & \\ & & \ddots & & & \\ & & & 0 & 1 & \\ & & & -1 & 0 & \\ & & & & & 0 \end{pmatrix},$$
with some number of $2 \times 2$ blocks $\begin{pmatrix} 0 & 1 \\ -1 & 0\end{pmatrix}$ followed by a zero block. In particular, $\operatorname{rank}\phi$ is even! ($F$ is arbitrary.)

Remark. If $\phi \in \operatorname{Bil}^\epsilon(V)$, then $W^\perp = {}^\perp W$ for all $W \leq V$.


Proof. We induct on $\operatorname{rank}\phi$. If $\operatorname{rank}\phi = 0$, then $\phi = 0$ and we're done.

Otherwise, there are some $v_1, v_2 \in V$ such that $\phi(v_1, v_2) \neq 0$. If $v_2 = \lambda v_1$ then $\phi(v_1, v_2) = \lambda\,\phi(v_1, v_1) = 0$, as $\phi$ is anti-symmetric; so $v_1, v_2$ are linearly independent. Change $v_2$ to $v_2/\phi(v_1, v_2)$.

So now $\phi(v_1, v_2) = 1$. Put $W = \langle v_1, v_2\rangle$; then $\phi|_W$ has matrix $\begin{pmatrix} 0 & 1 \\ -1 & 0\end{pmatrix}$, which is non-degenerate, so the corollary gives $V = W \oplus W^\perp$. Now induction gives a basis $v_3,\dots,v_n$ of $W^\perp$ of the correct form, and $v_1,\dots,v_n$ is our basis.
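A tiny numerical illustration of the parity statement, assuming numpy (the matrix below is a random anti-symmetric $5 \times 5$ example, for which the rank is generically $4$):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M - M.T                          # anti-symmetric: phi(v, v) = 0 for all v
print(np.linalg.matrix_rank(A))      # 4 -- the rank of an anti-symmetric form is even
```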

So we've shown that there is a bijection
$$\operatorname{Bil}^-(V)/GL(V) \leftrightarrow \big\{2i : 0 \leq i \leq \tfrac{1}{2}\dim V\big\},$$
taking $\phi \mapsto \operatorname{rank}\phi$.

Remark. A non-degenerate anti-symmetric form is usually called a symplectic form. Let $\phi \in \operatorname{Bil}^-(V)$ be non-degenerate, so $\operatorname{rank}\phi = n = \dim V$ (even!). Put $L = \langle v_1, v_3, v_5, \dots\rangle$, with $v_1,\dots,v_n$ as above; then $L^\perp = L$. Such a subspace is called Lagrangian.

If $U \leq L$, then $U^\perp \supseteq L^\perp = L$, and so $U \subseteq U^\perp$. Such a subspace is called isotropic.

Definition. If $\phi \in \operatorname{Bil}(V)$, the isometries of $\phi$ are
$$\operatorname{Isom}(\phi) = \{g \in GL(V) : g\phi = \phi\} = \{g \in GL(V) : \phi(g^{-1}v, g^{-1}w) = \phi(v, w)\ \forall v, w \in V\}$$
$$= \{X \in GL_n(F) : XAX^T = A\}, \quad\text{if } A \text{ is a matrix of } \phi.$$
This is a group.

Exercise 5.15. Show that $\operatorname{Isom}(g\phi) = g\operatorname{Isom}(\phi)g^{-1}$, and so isomorphic bilinear forms have isomorphic isometry groups.

If $\phi \in \operatorname{Bil}^+(V)$ is non-degenerate, we often write $O(\phi)$, the orthogonal group of $\phi$, for the isometry group of $\phi$.

Example 5.16. Suppose $F = \mathbb{C}$. If $\phi \in \operatorname{Bil}^+(V)$ and $\phi$ is non-degenerate, then $\phi$ is isomorphic to the standard quadratic form, whose matrix is $A = I$, and so $\operatorname{Isom}(\phi)$ is conjugate to the group
$$\operatorname{Isom}(A = I) = \{X \in GL_n(\mathbb{C}) : XX^T = I\} = O_n(\mathbb{C}),$$
which is what we usually call the orthogonal group.

If $F = \mathbb{R}$, then the groups
$$O_{p,q}(\mathbb{R}) = \Big\{X \ \Big|\ X\begin{pmatrix} I_p & \\ & -I_q\end{pmatrix}X^T = \begin{pmatrix} I_p & \\ & -I_q\end{pmatrix}\Big\}$$
are the possible isometry groups of non-degenerate symmetric forms.


For any field $F$, if $\phi \in \operatorname{Bil}^-(V)$ is non-degenerate, then $\operatorname{Isom}(\phi)$ is called the symplectic group, and it is conjugate to the group
$$Sp_{2n}(F) = \{X : XJX^T = J\},$$
where $J$ is the block diagonal matrix
$$J = \begin{pmatrix} 0 & 1 & & & \\ -1 & 0 & & & \\ & & \ddots & & \\ & & & 0 & 1 \\ & & & -1 & 0 \end{pmatrix}.$$


    6 Hermitian forms

19 Nov

A non-degenerate quadratic form on a vector space over $\mathbb{C}$ doesn't behave like an inner product on $\mathbb{R}^2$. For example, if $Q\begin{pmatrix} x_1 \\ x_2\end{pmatrix} = x_1^2 + x_2^2$ then we have $Q\begin{pmatrix} 1 \\ i\end{pmatrix} = 1 + i^2 = 0$.

We don't have a notion of positive definite, but there is a modification of the notion of a bilinear form which does.

Definition. Let $V$ be a vector space over $\mathbb{C}$; then a function $\phi : V \times V \to \mathbb{C}$ is called sesquilinear if
(i) for all $v \in V$, $\phi(\cdot, v) : u \mapsto \phi(u, v)$ is linear; that is,
$$\phi(\lambda_1 u_1 + \lambda_2 u_2, v) = \lambda_1\,\phi(u_1, v) + \lambda_2\,\phi(u_2, v);$$
(ii) for all $u, v_1, v_2 \in V$ and $\lambda_1, \lambda_2 \in \mathbb{C}$,
$$\phi(u, \lambda_1 v_1 + \lambda_2 v_2) = \overline{\lambda_1}\,\phi(u, v_1) + \overline{\lambda_2}\,\phi(u, v_2),$$
where $\overline{z}$ is the complex conjugate of $z$.

It is called Hermitian if it also satisfies
(iii) $\phi(v, w) = \overline{\phi(w, v)}$ for all $v, w \in V$.

Note that (i) and (iii) imply (ii).

Let $V$ be a vector space over $\mathbb{C}$, and $\phi : V \times V \to \mathbb{C}$ a Hermitian form. Define
$$Q(v) = \phi(v, v) = \overline{\phi(v, v)}$$
by (iii), so $Q : V \to \mathbb{R}$.

Lemma 6.1. We have $Q(v) = 0$ for all $v \in V$ if and only if $\phi(v, w) = 0$ for all $v, w \in V$.

Proof. We have
$$Q(u \pm v) = \phi(u \pm v, u \pm v) = \phi(u, u) + \phi(v, v) \pm \phi(u, v) \pm \phi(v, u) = Q(u) + Q(v) \pm 2\,\Re\,\phi(u, v),$$
as $z + \overline{z} = 2\,\Re(z)$. Thus
$$Q(u + v) - Q(u - v) = 4\,\Re\,\phi(u, v), \qquad Q(u + iv) - Q(u - iv) = 4\,\Im\,\phi(u, v),$$
that is, $Q : V \to \mathbb{R}$ determines $\phi : V \times V \to \mathbb{C}$ if $\phi$ is Hermitian:
$$\phi(u, v) = \tfrac{1}{4}\big(Q(u + v) + i\,Q(u + iv) - Q(u - v) - i\,Q(u - iv)\big).$$
Note that $Q(\lambda v) = \phi(\lambda v, \lambda v) = \lambda\overline{\lambda}\,\phi(v, v) = |\lambda|^2\, Q(v)$.

If $\phi : V \times V \to \mathbb{C}$ is Hermitian, and $v_1,\dots,v_n$ is a basis of $V$, then we write $A = (a_{ij})$, $a_{ij} = \phi(v_i, v_j)$, and we call this the matrix of $\phi$ with respect to $v_1,\dots,v_n$. Observe that $\overline{A}^T = A$; that is, $A$ is a Hermitian matrix.


Corollary 6.7 (Triangle inequality). For all $v, w \in V$, $|v + w| \leq |v| + |w|$.

Proof. As you've seen many times before:
$$|v + w|^2 = \langle v + w, v + w\rangle = |v|^2 + 2\,\Re\langle v, w\rangle + |w|^2 \leq |v|^2 + 2|v||w| + |w|^2 = \big(|v| + |w|\big)^2,$$
using the lemma (Cauchy-Schwarz) for the middle inequality.

21 Nov

Given $v_1,\dots,v_n$ with $\langle v_i, v_j\rangle = 0$ if $i \neq j$, we say that $v_1,\dots,v_n$ are orthogonal. If $\langle v_i, v_j\rangle = \delta_{ij}$, then we say that $v_1,\dots,v_n$ are orthonormal.

So if $v_1,\dots,v_n$ are orthogonal and $v_i \neq 0$ for all $i$, then $\tilde v_1,\dots,\tilde v_n$ are orthonormal, where $\tilde v_i = v_i/|v_i|$.

Lemma 6.8. If $v_1,\dots,v_n$ are non-zero and orthogonal, and if $v = \sum_{i=1}^n \lambda_i v_i$, then $\lambda_i = \langle v, v_i\rangle / |v_i|^2$.

Proof. $\langle v, v_k\rangle = \big\langle\sum_{i=1}^n \lambda_i v_i, v_k\big\rangle = \lambda_k\langle v_k, v_k\rangle$, hence the result.

In particular, distinct orthonormal vectors $v_1,\dots,v_n$ are linearly independent, since $\sum_i \lambda_i v_i = 0$ implies $\lambda_i = 0$.

As $\langle\cdot,\cdot\rangle$ is Hermitian, we know there is a basis $v_1,\dots,v_n$ such that the matrix of $\langle\cdot,\cdot\rangle$ is
$$\begin{pmatrix} I_p & & \\ & -I_q & \\ & & 0\end{pmatrix}.$$
As $\langle\cdot,\cdot\rangle$ is positive definite, we know that $p = n$, $q = 0$ and $\operatorname{rank} = \dim V$; that is, this matrix is $I_n$. So we know there exists an orthonormal basis $v_1,\dots,v_n$; that is, $V \cong \mathbb{R}^n$ with $\langle x, y\rangle = \sum_i x_i y_i$, or $V \cong \mathbb{C}^n$ with $\langle x, y\rangle = \sum_i x_i\overline{y_i}$.

Here is another, constructive, proof that orthonormal bases exist.

Theorem 6.9: Gram-Schmidt orthogonalisation. Let $V$ have a basis $v_1,\dots,v_n$. Then there exists an orthonormal basis $e_1,\dots,e_n$ such that $\langle v_1,\dots,v_k\rangle = \langle e_1,\dots,e_k\rangle$ for all $1 \leq k \leq n$.

Proof. Induct on $k$. For $k = 1$, set $e_1 = v_1/|v_1|$. Suppose we've found $e_1,\dots,e_k$ such that $\langle e_1,\dots,e_k\rangle = \langle v_1,\dots,v_k\rangle$. Define
$$e_{k+1}' = v_{k+1} - \sum_{1 \leq i \leq k} \langle v_{k+1}, e_i\rangle\, e_i.$$
Thus $\langle e_{k+1}', e_i\rangle = \langle v_{k+1}, e_i\rangle - \langle v_{k+1}, e_i\rangle = 0$ if $i \leq k$.

Also $e_{k+1}' \neq 0$, as if $e_{k+1}' = 0$, then $v_{k+1} \in \langle e_1,\dots,e_k\rangle = \langle v_1,\dots,v_k\rangle$, which contradicts $v_1,\dots,v_{k+1}$ being linearly independent.

So put $e_{k+1} = e_{k+1}'/|e_{k+1}'|$, and then $e_1,\dots,e_{k+1}$ are orthonormal, and $\langle e_1,\dots,e_{k+1}\rangle = \langle v_1,\dots,v_{k+1}\rangle$.
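The proof is already an algorithm. A minimal Python/numpy sketch of it, using the convention $\langle x, y\rangle = \sum_i x_i\overline{y_i}$ (the test vectors are an arbitrary choice):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalise linearly independent vectors, following the proof:
    subtract the components along the e_i found so far, then normalise."""
    es = []
    for v in vectors:
        w = v - sum(np.vdot(e, v) * e for e in es)   # v_{k+1} - sum <v_{k+1}, e_i> e_i
        es.append(w / np.linalg.norm(w))
    return es

vs = [np.array([1., 1., 0.]), np.array([1., 0., 1.]), np.array([0., 1., 1.])]
es = gram_schmidt(vs)
G = np.array([[np.vdot(b, a) for b in es] for a in es])   # Gram matrix <e_i, e_j>
print(np.allclose(G, np.eye(3)))                          # True: orthonormal
```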


Corollary 6.10. Any orthonormal set can be extended to an orthonormal basis.

Proof. Extend the orthonormal set to a basis; now the Gram-Schmidt algorithm doesn't change $v_1,\dots,v_k$ if they are already orthonormal.

Recall that if $W \leq V$, then $W^\perp = {}^\perp W = \{v \in V \mid \langle v, w\rangle = 0\ \forall w \in W\}$.

Proposition 6.11. If $W \leq V$, with $V$ an inner product space, then $W \oplus W^\perp = V$.

Proof 1. Since $\langle\cdot,\cdot\rangle$ is positive definite on $V$, it is also positive definite on $W$, and thus $\langle\cdot,\cdot\rangle|_W$ is non-degenerate. If $F = \mathbb{R}$, then $\langle\cdot,\cdot\rangle$ is bilinear, and we've shown that $W \oplus W^\perp = V$ when the form $\langle\cdot,\cdot\rangle|_W$ is non-degenerate. If $F = \mathbb{C}$, then exactly the same proof for sesquilinear forms shows the result.

Proof 2. Pick an orthonormal basis $w_1,\dots,w_r$ for $W$, and extend it to an orthonormal basis $w_1,\dots,w_n$ for $V$. Now observe that $\langle w_{r+1},\dots,w_n\rangle = W^\perp$: the inclusion ($\subseteq$) is clear, and for ($\supseteq$), if $\sum_{i=1}^n \lambda_i w_i \in W^\perp$, then taking $\langle\cdot, w_i\rangle$ for $i \leq r$ gives $\lambda_i = 0$ for $i \leq r$. So $V = W \oplus W^\perp$.

Geometric interpretation of the key step in the Gram-Schmidt algorithm

Let $V$ be an inner product space, with $W \leq V$ and $V = W \oplus W^\perp$. Define a map $\pi : V \to W$, the orthogonal projection onto $W$, as follows: if $v \in V$, write $v = w + w'$ with $w \in W$ and $w' \in W^\perp$ (uniquely), and set $\pi(v) = w$. This satisfies $\pi|_W = \operatorname{id} : W \to W$, $\pi^2 = \pi$, and $\pi$ is linear.

Proposition 6.12. If $W$ has an orthonormal basis $e_1,\dots,e_k$ and $\pi : V \to W$ is as above, then
(i) $\pi(v) = \sum_{i=1}^k \langle v, e_i\rangle\, e_i$;
(ii) $\pi(v)$ is the vector in $W$ closest to $v$; that is, $|v - \pi(v)| \leq |v - w|$ for all $w \in W$, with equality if and only if $w = \pi(v)$.

Proof.
(i) If $v \in V$, put $w = \sum_{i=1}^k \langle v, e_i\rangle\, e_i$ and $w' = v - w$. So $w \in W$, and we want $w' \in W^\perp$. But
$$\langle w', e_i\rangle = \langle v, e_i\rangle - \langle v, e_i\rangle = 0 \quad\text{for all } i,\ 1 \leq i \leq k,$$
so indeed $w' \in W^\perp$, and $\pi(v) = w$ by definition.
(ii) We have $v - \pi(v) \in W^\perp$, and if $w \in W$ then $\pi(v) - w \in W$, so
$$|v - w|^2 = \big|(v - \pi(v)) + (\pi(v) - w)\big|^2 = |v - \pi(v)|^2 + |\pi(v) - w|^2 + 2\,\Re\underbrace{\langle v - \pi(v),\ \pi(v) - w\rangle}_{=0},$$
and so $|v - w|^2 \geq |v - \pi(v)|^2$, with equality if and only if $\pi(v) - w = 0$; that is, if $\pi(v) = w$.
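A minimal numpy illustration of (i) and (ii), projecting onto the $xy$-plane of $\mathbb{R}^3$ (the vectors are arbitrary choices):

```python
import numpy as np

e1, e2 = np.array([1., 0., 0.]), np.array([0., 1., 0.])   # orthonormal basis of W
v = np.array([3., -2., 5.])

pi_v = np.vdot(e1, v) * e1 + np.vdot(e2, v) * e2           # pi(v) = sum <v, e_i> e_i
print(pi_v)                                                # [ 3. -2.  0.]

w = np.array([1., 1., 0.])                                 # some other point of W
print(np.linalg.norm(v - pi_v) <= np.linalg.norm(v - w))   # True: pi(v) is closest
```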


    6.2 Hermitian adjoints for inner products

Let $V$ and $W$ be inner product spaces over $F$, and $\alpha : V \to W$ a linear map.

Proposition 6.13. There is a unique linear map $\alpha^* : W \to V$ such that for all $v \in V$, $w \in W$, $\langle\alpha(v), w\rangle = \langle v, \alpha^*(w)\rangle$. This map is called the Hermitian adjoint. Moreover, if $e_1,\dots,e_n$ is an orthonormal basis of $V$, $f_1,\dots,f_m$ is an orthonormal basis for $W$, and $A = (a_{ij})$ is the matrix of $\alpha$ with respect to these bases, then $\overline{A}^T$ is the matrix of $\alpha^*$.

Proof. If $\beta : W \to V$ is a linear map with matrix $B = (b_{ij})$, then $\langle\alpha(v), w\rangle = \langle v, \beta(w)\rangle$ for all $v, w$ if and only if $\langle\alpha(e_j), f_k\rangle = \langle e_j, \beta(f_k)\rangle$ for all $1 \leq j \leq n$, $1 \leq k \leq m$. But we have
$$a_{kj} = \Big\langle\sum_i a_{ij} f_i,\ f_k\Big\rangle = \langle\alpha(e_j), f_k\rangle = \langle e_j, \beta(f_k)\rangle = \Big\langle e_j,\ \sum_i b_{ik} e_i\Big\rangle = \overline{b_{jk}},$$
that is, $B = \overline{A}^T$. Now define $\alpha^*$ to be the map with matrix $\overline{A}^T$.
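Numerically, with the standard inner product on $\mathbb{C}^n$, the adjoint is just the conjugate transpose. A minimal numpy sketch with arbitrary test data:

```python
import numpy as np

A = np.array([[1. + 2.j, 0. + 1.j],
              [3. + 0.j, 1. - 1.j]])          # matrix of alpha in orthonormal bases
v = np.array([1. + 1.j, 2. - 1.j])
w = np.array([0. + 2.j, 1. + 0.j])

inner = lambda x, y: np.vdot(y, x)            # <x, y> = sum x_i conj(y_i)
print(np.isclose(inner(A @ v, w),             # <alpha(v), w>
                 inner(v, A.conj().T @ w)))   # <v, alpha*(w)>  -> True
```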

Exercise 6.14. If $F = \mathbb{R}$, identify $V \cong V^*$ by $v \mapsto \langle v, \cdot\rangle$ and $W \cong W^*$ by $w \mapsto \langle w, \cdot\rangle$, and then show that $\alpha^*$ is just the dual map.

More generally, if $\alpha : V \to W$ is a linear map over $F$, and $\phi \in \operatorname{Bil}(V)$, $\psi \in \operatorname{Bil}(W)$ are both non-degenerate, then you can define the adjoint $\alpha^*$ by $\psi(\alpha(v), w) = \phi(v, \alpha^*(w))$ for all $v \in V$, $w \in W$, and show that it is the dual map.

23 Nov

Lemma 6.15.
(i) If $\alpha, \beta : V \to W$, then $(\alpha + \beta)^* = \alpha^* + \beta^*$.
(ii) $(\lambda\alpha)^* = \overline{\lambda}\,\alpha^*$.
(iii) $\alpha^{**} = \alpha$.

Proof. Immediate from the properties of $A \mapsto \overline{A}^T$.

Definition. A map $\alpha : V \to V$ is self-adjoint if $\alpha = \alpha^*$.

If $v_1,\dots,v_n$ is an orthonormal basis for $V$, and $A$ is the matrix of $\alpha$, then $\alpha$ is self-adjoint if and only if $A = \overline{A}^T$. In short, if $F = \mathbb{R}$ then $A$ is symmetric, and if $F = \mathbb{C}$ then $A$ is Hermitian.

Theorem 6.16. Let $\alpha : V \to V$ be self-adjoint. Then
(i) All the eigenvalues of $\alpha$ are real.
(ii) Eigenvectors with distinct eigenvalues are orthogonal.
(iii) There exists an orthonormal basis of eigenvectors for $\alpha$. In particular, $\alpha$ is diagonalisable.


Proof.
(i) First assume $F = \mathbb{C}$. If $\alpha v = \lambda v$ for a non-zero vector $v$ and $\lambda \in \mathbb{C}$, then
$$\lambda\langle v, v\rangle = \langle\lambda v, v\rangle = \langle\alpha v, v\rangle = \langle v, \alpha v\rangle = \langle v, \lambda v\rangle = \overline{\lambda}\langle v, v\rangle,$$
as $\alpha$ is self-adjoint. Since $v \neq 0$, we have $\langle v, v\rangle \neq 0$ and thus $\lambda = \overline{\lambda}$.

If $F = \mathbb{R}$, then let $A = A^T$ be the matrix of $\alpha$; regard it as a matrix over $\mathbb{C}$, which is obviously Hermitian, and then the above shows that every eigenvalue of $A$ is real.

Remark. This shows that we should introduce some notation so that we can phrase this argument without choosing a basis. Here is one way: let $V$ be a vector space over $\mathbb{R}$. Define a new vector space $V_\mathbb{C} = V \oplus iV$, a vector space over $\mathbb{R}$ of twice the dimension, and make it a complex vector space by saying that $i\,(v + iw) = -w + iv$; then $\dim_\mathbb{R} V = \dim_\mathbb{C} V_\mathbb{C}$. Now suppose the matrix of $\alpha : V \to V$ is $A$. Then show that the matrix of $\alpha_\mathbb{C} : V_\mathbb{C} \to V_\mathbb{C}$ is also $A$, where $\alpha_\mathbb{C}(v + iw) = \alpha(v) + i\,\alpha(w)$.

Now we can phrase (i) of the proof using $V_\mathbb{C}$: show that $\lambda \in \mathbb{R}$ implies that we can choose a $\lambda$-eigenvector $v \in V_\mathbb{C}$ to lie in $V \subseteq V_\mathbb{C}$.

(ii) If $\alpha(v_i) = \lambda_i v_i$, $i = 1, 2$, where $v_i \neq 0$ and $\lambda_1 \neq \lambda_2$, then
$$\lambda_1\langle v_1, v_2\rangle = \langle\alpha v_1, v_2\rangle = \langle v_1, \alpha v_2\rangle = \overline{\lambda_2}\langle v_1, v_2\rangle = \lambda_2\langle v_1, v_2\rangle,$$
as $\alpha = \alpha^*$ and $\lambda_2$ is real. So if $\langle v_1, v_2\rangle \neq 0$, then $\lambda_1 = \lambda_2$, a contradiction.

(iii) Induct on $\dim V$. The case $\dim V = 1$ is clear, so assume $n = \dim V > 1$. By (i), there is a real eigenvalue $\lambda$ and an eigenvector $v_1 \in V$ with $\alpha(v_1) = \lambda v_1$; rescale so that $|v_1| = 1$. Thus $V = \langle v_1\rangle \oplus \langle v_1\rangle^\perp$, as $V$ is an inner product space. Now put $W = \langle v_1\rangle^\perp$.

Claim. $\alpha(W) \subseteq W$; that is, if $\langle x, v_1\rangle = 0$, then $\langle\alpha(x), v_1\rangle = 0$.

Proof. We have
$$\langle\alpha(x), v_1\rangle = \langle x, \alpha^*(v_1)\rangle = \langle x, \alpha(v_1)\rangle = \lambda\langle x, v_1\rangle = 0.$$

Also, $\alpha|_W : W \to W$ is self-adjoint, as $\langle\alpha(v), w\rangle = \langle v, \alpha(w)\rangle$ for all $v, w \in V$, and so in particular for all $v, w \in W$. Hence by induction $W$ has an orthonormal basis of eigenvectors $v_2,\dots,v_n$, and so $v_1, v_2,\dots,v_n$ is an orthonormal basis of eigenvectors for $V$.
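In matrix terms this is the statement that a Hermitian matrix is unitarily diagonalisable with real eigenvalues; numpy's `eigh` routine computes exactly such a decomposition. A minimal sketch with an arbitrary Hermitian matrix:

```python
import numpy as np

A = np.array([[2., 1. - 1.j],
              [1. + 1.j, 3.]])                 # Hermitian: A equals its conjugate transpose

eigvals, U = np.linalg.eigh(A)                 # eigh is for Hermitian/symmetric matrices
print(eigvals)                                 # real eigenvalues
print(np.allclose(U.conj().T @ U, np.eye(2)))  # True: eigenvector columns are orthonormal
print(np.allclose(U.conj().T @ A @ U, np.diag(eigvals)))   # True: unitary diagonalisation
```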

Definition. Let $V$ be an inner product space over $\mathbb{C}$. Then the group of isometries of the form $\langle\cdot,\cdot\rangle$, denoted $U(V)$, is defined to be
$$U(V) = \operatorname{Isom}(V) = \{\alpha : V \to V \mid \langle\alpha(v), \alpha(w)\rangle = \langle v, w\rangle\ \forall v, w \in V\}$$
$$= \{\alpha \in GL(V) \mid \langle\alpha(v), w\rangle = \langle v, \alpha^{-1}(w)\rangle\ \forall v, w \in V\},$$
putting $w = \alpha(w')$. (Note that $\alpha : V \to V$ an isometry implies that $\alpha$ is an isomorphism: $v = 0$ if and only if $|v| = 0$, and $\alpha$ is an isometry, so $\alpha v = 0$ gives $|v| = |\alpha v| = 0$, and hence $\alpha$ is injective.)
$$= \{\alpha \in GL(V) \mid \alpha^{-1} = \alpha^*\}.$$

    This is called the unitary group.


If $V = \mathbb{C}^n$ and $\langle\cdot,\cdot\rangle$ is the standard inner product $\langle x, y\rangle = \sum_i x_i\overline{y_i}$, then we write
$$U_n = U(n) = U(\mathbb{C}^n) = \{X \in GL_n(\mathbb{C}) \mid \overline{X}^T X = I\}.$$
So an orthonormal basis (that is, a choice of isomorphism $V \cong \mathbb{C}^n$) gives us an isomorphism $U(V) \cong U_n$.

Theorem 6.17.

Let $V$ be an inner product space over $\mathbb{C}$, and $\alpha : V \to V$ an isometry; that is, $\alpha^* = \alpha^{-1}$, i.e. $\alpha \in U(V)$. Then
(i) All eigenvalues $\lambda$ of $\alpha$ have $|\lambda| = 1$; that is, they lie on the unit circle.
(ii) Eigenvectors with distinct eigenvalues are orthogonal.
(iii) There exists an orthonormal basis of eigenvectors for $\alpha$; in particular $\alpha$ is diagonalisable.

Remark. If $V$ is an inner product space over $\mathbb{R}$, then $\operatorname{Isom}\langle\cdot,\cdot\rangle = O(V)$, the usual orthogonal group, also denoted $O_n(\mathbb{R})$. If we choose an orthonormal basis for $V$, then $\alpha \in O(V)$ if $A$, the matrix of $\alpha$, has $A^TA = I$. Then this theorem, applied to $A$ considered as a complex matrix, shows that $A$ is diagonalisable over $\mathbb{C}$; but as all the eigenvalues of $A$ have $|\lambda| = 1$, it is not diagonalisable over $\mathbb{R}$ unless the only eigenvalues are $\pm 1$.

Example 6.18. The matrix
$$A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta\end{pmatrix} \in O(2)$$
is diagonalisable over $\mathbb{C}$, and conjugate to
$$\begin{pmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta}\end{pmatrix},$$
but not over $\mathbb{R}$, unless $\sin\theta = 0$.
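A quick numerical check of this example, assuming numpy (the angle is an arbitrary choice):

```python
import numpy as np

theta = 0.7
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

print(np.sort_complex(np.linalg.eigvals(A)))   # approximately exp(-i*theta), exp(+i*theta)
print(np.exp(-1j * theta), np.exp(1j * theta))
```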

Proof.
(i) If $\alpha(v) = \lambda v$ for $v$ non-zero, then
$$\lambda\langle v, v\rangle = \langle\lambda v, v\rangle = \langle\alpha(v), v\rangle = \langle v, \alpha^*(v)\rangle = \langle v, \alpha^{-1}(v)\rangle = \langle v, \lambda^{-1}v\rangle = \overline{\lambda^{-1}}\langle v, v\rangle,$$
and so $\lambda = \overline{\lambda}^{-1}$ and $\lambda\overline{\lambda} = 1$.

(ii) If $\alpha(v_i) = \lambda_i v_i$, with the $v_i$ non-zero and $\lambda_i \neq \lambda_j$:
$$\lambda_i\langle v_i, v_j\rangle = \langle\alpha(v_i), v_j\rangle = \langle v_i, \alpha^{-1}(v_j)\rangle = \overline{\lambda_j^{-1}}\langle v_i, v_j\rangle = \lambda_j\langle v_i, v_j\rangle,$$
and so $\lambda_i \neq \lambda_j$ implies $\langle v_i, v_j\rangle = 0$.

(iii) Induct on $n = \dim V$. As $V$ is a vector space over $\mathbb{C}$, a non-zero eigenvector $v_1$ exists with some eigenvalue $\lambda$, so $\alpha(v_1) = \lambda v_1$. Put $W = \langle v_1\rangle^\perp$. As in Theorem 6.16, $\alpha(W) \subseteq W$: if $\langle x, v_1\rangle = 0$, then $\langle\alpha(x), v_1\rangle = \langle x, \alpha^{-1}(v_1)\rangle = \overline{\lambda^{-1}}\langle x, v_1\rangle = 0$. Moreover $\alpha|_W$ is again an isometry of $W$, so by induction $W$ has an orthonormal basis of eigenvectors $v_2,\dots,v_n$; together with $v_1/|v_1|$ this gives an orthonormal basis of eigenvectors of $V$.

