Davis Circulant Matrices

PREFACE

"Mathematics," wrote Alfred North Whitehead, "is the most powerful technique for the understanding of pattern and for the analysis of the relations of patterns." In its pursuit of pattern, however, mathematics itself exhibits pattern; the mathematics on the printed page often has visual appeal. Spatial arrangements embodied in formulae can be a source of mathematical inspiration and aesthetic delight.

The theory of matrices exhibits much that is visually attractive. Thus, diagonal matrices, symmetric matrices, (0, 1) matrices, and the like are attractive independently of their applications. In the same category are the circulants. A circulant matrix is one in which a basic row of numbers is repeated again and again, but with a shift in position. Circulant matrices have many connections to problems in physics, to image processing, to probability and statistics, to numerical analysis, to number theory, to geometry. The built-in periodicity means that circulants tie in with Fourier analysis and group theory.

A different reason may be advanced for the study of circulants. The theory of circulants is a relatively easy one. Practically every matrix-theoretic question for circulants may be resolved in "closed form." Thus the circulants constitute a nontrivial but simple set of objects that the reader may use to practice, and ultimately deepen, a knowledge of matrix theory.

Writers on matrix theory appear to have given circulants short shrift, so that the basic facts are rediscovered over and over again. This book is intended to serve as a general reference on circulants as well as to provide alternate or supplemental material for intermediate courses in matrix theory. The reader will need to be familiar with the geometry of the complex plane and with the elementary portions of matrix theory up through unitary matrices and the diagonalization of Hermitian matrices. In a few places the Jordan form is used.

This work contains some general discussion of matrices (block matrices, Kronecker products, the UDV theorem, generalized inverses). These topics have been included because of their application to circulants and because they are not always available in general books on linear algebra and matrix theory. More than 200 problems of varying difficulty have been included.

It would have been possible to develop the theory of circulants and their generalizations from the point of view of finite abelian groups and group matrices. However, my interest in the subject has a strong numerical and geometric base, which pointed me in the direction taken. The interested reader will find references to these algebraic matters.

Closely related to circulants are the Toeplitz matrices. This theory and its applications constitute a world of its own, and a few references will have to suffice. The bibliography also contains references to applications of circulants in physics and to the solution of differential equations.

I acknowledge the help and advice received from Professor Emilie V. Haynsworth. At every turn she has provided me with information, elegant proofs, and encouragement.

I have profited from numerous discussions with Professors J. H. Ahlberg and Igor Najfeld and should like to thank them for their interest in this essay. Philip R. Thrift suggested some important changes.

Thanks are also due to Gary Rosen for the Calcomp plots of the iterated n-gons and to Eleanor Addison for the figures. Katrina Avery, Frances Beagan, Ezoura Fonseca, and Frances Gajdowski have helped me enormously in the preparation of the manuscript, and I wish to thank them for this work, as well as for other help rendered in the past.

The Canadian Journal of Mathematics has allowed me to reprint portions of an article of mine and I would like to acknowledge this courtesy.

Finally, I would like to thank Beatrice Shube for inviting me to join her distinguished roster of scientific authors and the staff of John Wiley and Sons for their efficient and skillful handling of the manuscript.

Philip J. Davis

Providence, Rhode Island April, 1979

CONTENTS

Notation

Chapter 1 An Introductory Geometrical Application

1.1 Nested triangles
1.2 The transformation $\sigma$
1.3 The transformation $\sigma$, iterated with different values of s
1.4 Nested polygons

Chapter 2 Introductory Matrix Material

2.1 Block operations
2.2 Direct sums
2.3 Kronecker product
2.4 Permutation matrices
2.5 The Fourier matrix
2.6 Hadamard matrices
2.7 Trace
2.8 Generalized inverse
2.9 Normal matrices, quadratic forms, and field of values

Chapter 3 Circulant Matrices

3.1 Introductory properties
3.2 Diagonalization of circulants
3.3 Multiplication and inversion of circulants
3.4 Additional properties of circulants
3.5 Circulant transforms
3.6 Convergence questions

Chapter 4 Some Geometric Applications of Circulants

4.1 Circulant quadratic forms arising in geometry
4.2 The isoperimetric inequality for isosceles polygons
4.3 Quadratic forms under side conditions
4.4 Nested n-gons
4.5 Smoothing and variation reduction
4.6 Applications to elementary plane geometry: n-gons and Kr-grams
4.7 The special case: circ(s, t, 0, 0, ..., 0)
4.8 Elementary geometry and the Moore-Penrose inverse

Chapter 5 Generalizations of Circulants: g-Circulants and Block Circulants

5.1 g-circulants
5.2 0-circulants
5.3 PD-matrices
5.4 An equivalence relation on {1, 2, ..., n}
5.5 Jordanization of g-circulants
5.6 Block circulants
5.7 Matrices with circulant blocks
5.8 Block circulants with circulant blocks
5.9 Further generalizations

Chapter 6 Centralizers and Circulants

6.1 The leitmotiv
6.2 Systems of linear matrix equations. The centralizer
6.3 t algebras
6.4 Some classes $Z(P_\sigma, P_\tau)$
6.5 Circulants and their generalizations
6.6 The centralizer of J; magic squares
6.7 Kronecker products of I, $\pi$, and J
6.8 Best approximation by elements of centralizers

Appendix

Bibliography

Index of Authors

Index of Subjects

NOTATION

$C$ the complex number field

$C_{m \times n}$ the set of m × n matrices whose elements are in $C$

$A^T$ transpose of A

$\bar{A}$ conjugate of A

$A^*$ conjugate transpose of A

$A \otimes B$ direct (Kronecker) product of A and B

$A \circ B$ Hadamard (element by element) product of A and B

A' Moore-Penrose generalized inverse of A

$r(A)$ rank of A

If A is square,

$\det(A)$ determinant of A

$\operatorname{tr}(A)$ trace of A

$\lambda(A)$ eigenvalues of A: individually or as a set

$A^{-1}$ inverse of A

$\rho(A)$ spectral radius of A

1 AN INTRODUCTORY GEOMETRICAL APPLICATION

1.1 NESTED TRIANGLES

(4) Given a $T_1$, there is a unique triangle $T_0$ whose midpoint triangle it is.

(5) The area of $T_2$ is minimum among all triangles $T$ that are inscribed in $T_1$ and whose vertices divide the sides of $T_1$ in a fixed ratio, cyclically.

(6) If the midpoint triangle of $T_2$ is $T_3$, and successively for $T_4, T_5, \ldots$, this nested set of triangles converges to the center of gravity of $T_1$ with geometric rapidity.

[By the center of gravity (c.g.) of a triangle whose vertices have rectangular coordinates $(x_i, y_i)$, $i = 1, 2, 3$, is meant the point $\frac{1}{3}(x_1 + x_2 + x_3,\; y_1 + y_2 + y_3)$.]

Figure 1.1.2

PROBLEMS

1. Prove that the triangles $T_n$ are all similar.

2. Prove that the medians of $T_n$, $n = 2, 3, \ldots$, lie along the medians of $T_1$.

3. Prove that the c.g. of $T_n$, $n = 2, 3, \ldots$, coincides with the c.g. of $T_1$.

4. Prove that area $T_{n+1}$ = 1/4 area $T_n$.

5. Prove that the perimeter of $T_{n+1}$ = 1/2 perimeter of $T_n$.

6. Conclude, on this basis, that $T_n$ converges to c.g. $T_1$ (Figure 1.1.2).

7. Describe the situation when $T_1$ is a right triangle; when $T_1$ is equilateral.

8. Given a triangle $T_1$, construct a triangle $T_0$ such that $T_1$ is its midpoint triangle.

9. The midpoint triangle of $T_1$ divides $T_1$ into four subtriangles. Suppose that $T_2$ designates one of these, selected arbitrarily. Now let $T_n$ designate the sequence of triangles that result from an iteration of this process. Prove that $T_n$ converges to a point. Prove that every point inside $T_1$ and on its sides is the limit of an appropriate sequence $T_n$.

10. Systematize, in some way, the selection process in Problem 9.

11. If two triangles have the same area and the same perimeter, are they necessarily congruent?

12. Let $P$ be an arbitrary point lying in the triangle $T_1 = \Delta A_1B_1C_1$. Let $T_2 = \sigma(T_1)$ be determined as in Figure 1.1.3. Determine the rate at which $\sigma^n(T_1)$ converges to $P$.

Figure 1.1.3

1.2 THE TRANSFORMATION $\sigma$

As a first generalization, consider the following transformation $\sigma$ of the triangle $T_1$. Select a nonnegative number $s$: $0 < s < 1$, and set

(1.2.1) $t = 1 - s.$

Let A2, B2, C2 be the points on the sides of the

triangle T1 such that

In this equation $A_1A_2$ designates the length of the line segment from $A_1$ to $A_2$, and so on. Thus the points $A_2$, $B_2$, $C_2$ divide the sides of $T_1$ into the ratio $s/t$, working consistently in a counterclockwise fashion. (See Figure 1.2.1.) Write

(1.2.3) $T_2 = \Delta A_2B_2C_2 = \sigma(T_1)$

and in general

(1.2.4) $T_{n+1} = \sigma(T_n) = \sigma^n(T_1), \quad n = 1, 2, 3, \ldots .$

Figure 1.2.2 illustrates the sequence $T_n$ for s = t = 3/4.

Figure 1.2.2

The transformation $\sigma$ depends, of course, on the parameter $s$, and we shall write $\sigma_s$ when it is necessary to distinguish the parameter.

To analyze this situation, one might work with vectors, but it is particularly convenient in the case of plane figures to place the triangle $T_1$ in the complex plane. We write $z = x + iy$, $\bar{z} = x - iy$, $i = \sqrt{-1}$, and designate the coordinates of $T_n$ systematically by $z_{1n}, z_{2n}, z_{3n}$. Write, for simplicity, $z_{11} = z_1$, $z_{21} = z_2$, $z_{31} = z_3$. The transformation $\sigma$ operating successively on $T_1, T_2, \ldots$, is therefore given by
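As a worked form of this recursion, one may assume the cyclic pattern suggested by the matrix in Problem 6 of this section (rows $s, t, 0$ shifted cyclically) together with $t = 1 - s$. Under that assumption, which is a sketch rather than a quotation of the original display, each new vertex is a weighted average of two consecutive old vertices:

$$
\begin{aligned}
z_{1,n+1} &= s\,z_{1n} + t\,z_{2n},\\
z_{2,n+1} &= s\,z_{2n} + t\,z_{3n},\\
z_{3,n+1} &= s\,z_{3n} + t\,z_{1n}.
\end{aligned}
$$

Adding the three lines gives $z_{1,n+1} + z_{2,n+1} + z_{3,n+1} = (s + t)(z_{1n} + z_{2n} + z_{3n})$, so with $s + t = 1$ the center of gravity is unchanged by the step.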


so that

(1.2.14) $\lim_{n \to \infty} z_{in} = 0$ for $i = 1, 2, 3$.

We have therefore proved the following theorem.

Theorem 1.2.1. Let $0 < s < 1$ be fixed and let $T_n$ be the sequence of nested triangles given by $T_n = \sigma^n(T_1)$, $n = 1, 2, \ldots$. Then $T_n$ converges to c.g. $(T_1)$.

The function $V(T)$ is a simple example of a Lyapunov function for a system of difference equations. The c.g. is known as the limit set of the process.

It is also of interest to see how the area of $T_1$ changes under $\sigma$. Designate the area by $\mu(T_1)$. Assuming, as we have, that $z_1 = x_1 + iy_1$, $z_2 = x_2 + iy_2$, $z_3 = x_3 + iy_3$ are the vertices of $T_1$ in counterclockwise order, we have

Theorem 1.2.2. $\min_{0 \le s \le 1} \mu(\sigma(T_1))$ occurs uniquely when $s = 1/2$ and equals $(1/4)\mu(T_1)$.

Proof. The minimum value of $g(s) = 1 - 3s + 3s^2$ occurs uniquely when $s = 1/2$ and equals $1/4$.
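The scaling factor $g(s)$ used in the proof can be checked numerically. The sketch below assumes the cyclic form $z_k \mapsto s z_k + t z_{k+1}$, $t = 1 - s$, for the map; the function names `sigma`, `area`, `g` and the sample triangle are illustrative choices, not taken from the text.

```python
def sigma(tri, s):
    """Assumed form of the map: new vertex k is s*z_k + t*z_{k+1}, t = 1 - s."""
    t = 1.0 - s
    n = len(tri)
    return [s * tri[k] + t * tri[(k + 1) % n] for k in range(n)]


def area(tri):
    """Signed area of a polygon with complex vertices (shoelace formula)."""
    n = len(tri)
    return 0.5 * sum((tri[k].conjugate() * tri[(k + 1) % n]).imag
                     for k in range(n))


def g(s):
    """The quadratic appearing in the proof of Theorem 1.2.2."""
    return 1.0 - 3.0 * s + 3.0 * s * s


# A sample counterclockwise triangle (hypothetical data).
T1 = [0 + 0j, 4 + 0j, 1 + 3j]

# Ratio of areas after one step, for several values of s.
ratios = {s: area(sigma(T1, s)) / area(T1) for s in (0.1, 0.25, 0.5, 0.9)}
```

For each sampled $s$ the ratio agrees with $g(s)$ up to rounding, and the smallest sampled value occurs at $s = 1/2$, in line with the theorem.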

PROBLEMS

1. Interpret the transformation $\sigma$ geometrically when $s$ is real but does not satisfy $0 < s < 1$. What does $\sigma$ do when $s = 1$?

2. Interpret the transformation $\sigma$ geometrically when $s$ and $t$ are complex.

3. In this case, find a formula for $V(\sigma(T_1))$.

4. Let $V(T_1)$ designate the polar moment of inertia of $T_1$ about its center of gravity, regarding $T_1$ as a lamina of unit density. Prove that

5. Let $\sigma(T_1)$ have vertices $A_2$, $B_2$, $C_2$. Then the lines $A_1B_2$, $B_1C_2$, $C_1A_2$ are concurrent if and only if $s = t = 1/2$. (Use Ceva's theorem.)

6. Let $T$ be an equilateral triangle. Then for any $s$, $\sigma_s(T)$ is equilateral. Interpret this as an eigenvalue property of

$$\begin{pmatrix} s & t & 0 \\ 0 & s & t \\ t & 0 & s \end{pmatrix}.$$

Thus the equilateral triangles are "eigenfigures" of $\sigma$. Generalize. Hint: Let the vertices of $T$ in counterclockwise order be $z_1, z_2, z_3$. Then $T$ is equilateral if and only if $z_1 + \omega z_2 + \omega^2 z_3 = 0$, where $\omega = \exp(2\pi i/3)$.

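1.3 THE TRANSFORMATION $\sigma$, ITERATED WITH DIFFERENT VALUES OF s

As observed, the transformation $\sigma$ depends on the selection of the parameter $s$. Let us indicate this by writing $\sigma_s$. Begin with the triangle $T_1$ and form

$T_2 = \sigma_{s_1}(T_1).$

Now iterate this, using different values of the parameter $s$. We obtain

$T_3 = \sigma_{s_2}(T_2), \quad T_4 = \sigma_{s_3}(T_3), \ldots,$

so that, in general,

$T_{n+1} = \sigma_{s_n}\sigma_{s_{n-1}} \cdots \sigma_{s_1}(T_1).$

We then have from (1.2.10)

$V(T_{n+1}) = \left(\prod_{k=1}^{n} g(s_k)\right) V(T_1).$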

Whether or not $V(T_n)$ converges to 0 depends on the behavior of the infinite product $\prod_{k=1}^{\infty} g(s_k) = \prod_{k=1}^{\infty} (1 - 3s_k + 3s_k^2)$.

Let $p_k = 3s_k - 3s_k^2 = 3s_k(1 - s_k)$. Then $\prod_{k=1}^{\infty} g(s_k) = \prod_{k=1}^{\infty} (1 - p_k)$. Assuming that $0 < s_k < 1$, we have $0 < p_k \le 3/4$. As is well known, if $\sum_{k=1}^{\infty} p_k < \infty$, then $\lim_{n \to \infty} \prod_{k=1}^{n} (1 - p_k)$ exists and is not zero. On the other hand, if $\sum_{k=1}^{\infty} p_k = \infty$, then $\lim_{n \to \infty} \prod_{k=1}^{n} (1 - p_k) = 0$. (See, e.g., Knopp, 1928, pp. 219-221.) Thus we must investigate the convergence of $\sum_{k=1}^{\infty} s_k(1 - s_k)$. To this end, for $0 < s < 1$, introduce

$s^* = \min(s, 1 - s).$

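Lemma. $\sum_{k=1}^{\infty} (s_k - s_k^2) < \infty$ if and only if $\sum_{k=1}^{\infty} s_k^* < \infty$.

Proof.

$0 < s_k - s_k^2 = s_k(1 - s_k) \le \min(s_k, 1 - s_k) = s_k^*.$

Hence $\sum_{k=1}^{\infty} s_k^* < \infty$ implies $\sum_{k=1}^{\infty} (s_k - s_k^2) < \infty$. On the other hand,

$s_k - s_k^2 = s_k^*(1 - s_k^*) \ge \tfrac{1}{2} s_k^*.$

Hence $\sum_{k=1}^{\infty} s_k^* = \infty$ implies $\sum_{k=1}^{\infty} (s_k - s_k^2) = \infty$.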

This leads to

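Theorem 1.3.1

(a) If $\sum_{k=1}^{\infty} s_k^* = \infty$, then $\lim_{n \to \infty} V(T_n) = 0$.

(b) If $\sum_{k=1}^{\infty} s_k^* < \infty$, then $\lim_{n \to \infty} V(T_n) > 0$.

In case (a), as before, $\lim_{n \to \infty} T_n = $ c.g. $(T_1)$. In case (b), one conjectures that $\{T_n\}$ approach a nontrivial limiting triangle $T_\infty$ (see Figure 1.3.1). We shall return to this point in Section 3.6 for a more complete analysis.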
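The dichotomy between summable and non-summable parameter sequences can be illustrated numerically. The following sketch evaluates partial products of $g(s_k) = 1 - 3s_k + 3s_k^2$ for two sample sequences chosen here for illustration (they are not the sequences of the problems below): $s_k = 2^{-k}$, for which $\sum s_k^*$ is finite, and $s_k = 1/(k+1)$, for which it is infinite.

```python
def partial_product(s_seq, n):
    """Product of g(s_k) = 1 - 3 s_k + 3 s_k^2 for k = 1, ..., n."""
    prod = 1.0
    for k in range(1, n + 1):
        s = s_seq(k)
        prod *= 1.0 - 3.0 * s + 3.0 * s * s
    return prod


def summable(k):
    """s_k = 2^(-k): the sum of min(s_k, 1 - s_k) converges."""
    return 2.0 ** (-k)


def divergent(k):
    """s_k = 1/(k + 1): the sum of min(s_k, 1 - s_k) diverges."""
    return 1.0 / (k + 1)


limit_like = partial_product(summable, 200)     # settles at a positive value
vanishing = partial_product(divergent, 1000)    # keeps shrinking toward zero
```

In the first case the partial products stabilize at a strictly positive number, so the iterated triangles cannot shrink to a point; in the second they decay toward zero.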

PROBLEMS

1. Let $s_k = 1/(k + 1)^2$, $k = 1, 2, \ldots$. Compute, approximately, $\lim_{k \to \infty} \mu(T_k)/\mu(T_1)$.

2. Do the same with $s_k = \exp(-pk)$, $p > 0$, $k = 1, 2, \ldots$.

Figure 1.3.1

1.4 NESTED POLYGONS

We pass now from triangles to polygons. Let $z_1, z_2, \ldots, z_p$ be ordered vertices of a polygon $P$ (assumed to be located in the complex plane). We make no restrictions on the complex numbers $z_k$, so that $P$ may be convex or nonconvex, simply covered or not; furthermore, the points $z_k$ are not necessarily distinct so that the polygon may have "multiple vertices." All geometric constructions described below are to be interpreted appropriately with this in mind. We shall also call such a figure a p-gon. We shall assume, however, that the center of gravity of $P$, $\frac{1}{p}(z_1 + \cdots + z_p)$, is at the origin. This means that $z_1 + z_2 + \cdots + z_p = 0$.

Each side of $P$ is now divided in length into the ratio $s/t$, $0 < s < 1$, $t = 1 - s$, proceeding cyclically counterclockwise. The points of division form the vertices of a new polygon $\sigma(P)$. (See Figure 1.4.1.) We wish to discuss what happens when this transformation is iterated.

Figure 1.4.1

Let $P_n = \sigma^n(P_1)$, let the vertices of $P_n$ have the coordinates $z_{1,n}, z_{2,n}, \ldots, z_{p,n}$, and for simplicity write $z_{1,1} = z_1, \ldots, z_{p,1} = z_p$. The transformation $\sigma$ may obviously be written in matrix form as

If one writes

and abbreviates the p × p matrix in the right hand of (1.4.2) by $G$, then

$Z_{n+1} = GZ_n$; $Z_1$ = a given initial vector.

This is a linear autonomous system of difference equations, that is, $G$ is independent of $n$. The solution of this iteration is

$Z_{n+1} = G^n Z_1.$

Thus the limiting behavior of $P_n$ (i.e., $Z_n$) as $n \to \infty$ depends substantially on the behavior of $G^n$ as $n \to \infty$. The matrix $G$ is a circulant matrix; that is, in each successive row the elements move to the right one position (with wraparound at the edges). It is also true that the matrix $G$ is a nonnegative, doubly stochastic, irreducible, and normal matrix. In this essay we emphasize the circulant aspect of $G$. We postpone further discussion of the p-gon problem until we have somewhat developed the theory of circulants.
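The iteration $Z_{n+1} = GZ_n$ can be sketched in a few lines. The code below assumes that $G$ has first row $(s, t, 0, \ldots, 0)$ with $t = 1 - s$, which matches the verbal description of dividing each side in the ratio $s/t$; the helper names, the value $s = 0.3$, and the sample pentagon are hypothetical choices for illustration.

```python
def circulant(first_row):
    """Each row is the previous one shifted right by one, with wraparound."""
    p = len(first_row)
    return [[first_row[(j - i) % p] for j in range(p)] for i in range(p)]


def apply(G, z):
    """Matrix-vector product G z for a list-of-rows matrix."""
    return [sum(g * w for g, w in zip(row, z)) for row in G]


s = 0.3
p = 5
G = circulant([s, 1.0 - s] + [0.0] * (p - 2))

# Sample p-gon whose center of gravity is at the origin.
Z = [3 + 1j, -1 + 4j, -4 - 2j, 0 - 5j, 2 + 2j]

history = [Z]
for _ in range(200):
    history.append(apply(G, history[-1]))

# Largest distance of any vertex from the origin at each step.
radius = [max(abs(w) for w in Zn) for Zn in history]
```

Because every new vertex is a convex combination of two old ones, `radius` never increases, and for this sample it shrinks to roundoff level, consistent with convergence of the nested polygons to the center of gravity.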

PROBLEMS

1. Let $G = (g_{ij})$ be a p × p matrix. Let the p-gon $Z_1$ be transformed into the p-gon $Z_2$ linearly by means of $Z_2 = GZ_1$. What are necessary and sufficient conditions on $G$ that it preserve centers of gravity? Express as an eigenvalue-vector condition.

2. Let $G$ (as in Problem 1) satisfy $G^k = I$ for some positive integer $k$. Describe the geometric situation upon iteration.

3. Suppose that $Z_0$ is given and that

$Z_{3n+1} = G_3 Z_{3n}$,
$Z_{3n+2} = G_1 Z_{3n+1}$,
$Z_{3n+3} = G_2 Z_{3n+2}$, for $n = 0, 1, \ldots$.

Find a formula for $Z_n$.

4. Generalize this section to space p-gons (in three dimensions).

5. Develop analytical apparatus for generalizing this section to nested polyhedra. In particular, let $T_1$ be a tetrahedron. Let $T_2$ be the tetrahedron whose vertices are the c.g.'s of the faces of $T_1$. Iterate this.

REFERENCES

Convergence of nested polygons: Berlekamp et al.; Rosenman; Huston; Schoenberg [1].

p-gons in a general setting: Bachmann and Schmidt; Davis [1], [2].

Liapunov functions, limit sets: LaSalle [2].

2 INTRODUCTORY MATRIX MATERIAL

2.1 BLOCK OPERATIONS

It is very often convenient in both theoretical and computer work to partition a matrix into submatrices. This can be done in numerous ways as suggested by this example:

Each submatrix or block can be labeled by subscripts, and we can display the original matrix with submatrices or blocks for its elements. The general form of a partitioned matrix therefore is

Dotted lines, bars, commas are all used in an obvious way to indicate partitions. The size of the blocks must be such that they all fit together properly.

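This means that the number of rows in each $A_{ij}$ must be the same for each $i$ and the number of columns must be the same for each $j$. The size of $A_{ij}$ is therefore $m_i \times n_j$ for certain integers $m_i$ and $n_j$. We indicate this by writing

(2.1.1) $A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1\ell} \\ A_{21} & A_{22} & \cdots & A_{2\ell} \\ \vdots & & & \vdots \\ A_{k1} & A_{k2} & \cdots & A_{k\ell} \end{pmatrix}$

where the numbers of columns in the successive block columns are $n_1, n_2, \ldots, n_\ell$ and the numbers of rows in the successive block rows are $m_1, m_2, \ldots, m_k$.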

A square matrix $A$ of order $n$ is often partitioned symmetrically. Suppose that $n = n_1 + n_2 + \cdots + n_r$ with $n_i \ge 1$. Partition $A$ as

(2.1.2) $A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1r} \\ \vdots & & & \vdots \\ A_{r1} & A_{r2} & \cdots & A_{rr} \end{pmatrix}$

where size $A_{ij} = n_i \times n_j$. The diagonal blocks $A_{ii}$ are square matrices of order $n_i$.

Example. X X X

X X X

X X X

X X X

X X X X X X

X X X t X X X

is a symmetric partition of a 6 × 6 matrix.

Square matrices are often built up, or compounded, of square blocks all of the same size.

$$\left(\begin{array}{ccc|ccc}
X & X & X & X & X & X \\
X & X & X & X & X & X \\
X & X & X & X & X & X \\ \hline
X & X & X & X & X & X \\
X & X & X & X & X & X \\
X & X & X & X & X & X
\end{array}\right)$$

If a square matrix $A$ of order $nk$ is composed of $n \times n$ square submatrices all of order $k$, it is termed an $(n, k)$ matrix. Thus the matrix depicted above is a (2, 3) matrix.

Subject to certain conformability conditions on the blocks, the operations of scalar product, transpose, conjugation, addition, and multiplication are carried out in the same way when expressed in block notation as when they are expressed in element notation. This means

Here $T$ designates the transpose and $*$ the conjugate transpose.

where $C_{ij} = \sum_{r=1}^{\ell} A_{ir}B_{rj}$.

In (2.1.6) the size of each $A_{ij}$ must be the size of the corresponding $B_{ij}$.

In (2.1.7), designate the size of $A_{ij}$ by $\alpha_i \times \beta_j$ and the size of $B_{ij}$ by $\gamma_i \times \delta_j$. Then, if $\beta_r = \gamma_r$ for $1 \le r \le \ell$, the product $A_{ir}B_{rj}$ can be formed and produces an $\alpha_i \times \delta_j$ matrix, independently of $r$. The sum can then be found as indicated and the $C_{ij}$ are $\alpha_i \times \delta_j$ matrices and together constitute a partition. Note that the rule for forming the blocks $C_{ij}$ of the matrix product is the same as when $A_{ij}$ and $B_{ij}$ are single numbers.

Example. If A and B are n x n matrices and if

then

2.2 DIRECT SUMS

PROBLEMS

1. Let $A = A_1 \oplus A_2 \oplus \cdots \oplus A_k$. Prove that $\det A = \prod_{i=1}^{k} \det A_i$ and that for integer $p$, $A^p = A_1^p \oplus A_2^p \oplus \cdots \oplus A_k^p$.

2. Give a linear algebra interpretation of the direct sum along the following lines. Let $V$ be a finite-dimensional vector space and let $L$ and $M$ be subspaces. Write $V = L \oplus M$ if and only if every vector $x \in V$ can be written uniquely in the form $x = y + z$ with $y \in L$, $z \in M$. Show that $V = L \oplus M$ if and only if

(a) $\dim V = \dim L + \dim M$, $L \cap M = \{0\}$.
(b) if $\{x_1, \ldots, x_\ell\}$ and $\{y_1, \ldots, y_m\}$ are bases for $L$ and $M$, then $\{x_1, \ldots, x_\ell, y_1, \ldots, y_m\}$ is a basis for $V$.

3. The fundamental theorem of rank-canonical form for square matrices tells us that if $A$ is an n × n matrix of rank $r$, then there exist nonsingular matrices $P$, $Q$ such that $PAQ = I_r \oplus O_{n-r}$. Verify this formulation.

2.3 KRONECKER PRODUCT

Let $A$ and $B$ be m × n and p × q respectively. Then the Kronecker product (or tensor, or direct product) of $A$ and $B$ is that mp × nq matrix defined by

$A \otimes B = \begin{pmatrix} a_{11}B & a_{12}B & \cdots & a_{1n}B \\ \vdots & & & \vdots \\ a_{m1}B & a_{m2}B & \cdots & a_{mn}B \end{pmatrix}.$

Important properties of the Kronecker product are as follows (indicated operations are assumed to be defined):

(1) $(\alpha A) \otimes B = A \otimes (\alpha B) = \alpha(A \otimes B)$; $\alpha$ scalar.

(2) $(A + B) \otimes C = (A \otimes C) + (B \otimes C)$.

(3) $A \otimes (B + C) = (A \otimes B) + (A \otimes C)$.

(4) $A \otimes (B \otimes C) = (A \otimes B) \otimes C$.

(5) $(A \otimes B)(C \otimes D) = (AC) \otimes (BD)$.

(6) $\overline{A \otimes B} = \bar{A} \otimes \bar{B}$.

(7) $(A \otimes B)^T = A^T \otimes B^T$; $(A \otimes B)^* = A^* \otimes B^*$.

We now assume that $A$ and $B$ are square and of orders $m$ and $n$. Then

(9) $\operatorname{tr}(A \otimes B) = (\operatorname{tr}(A))(\operatorname{tr}(B))$.

(10) If $A$ and $B$ are nonsingular, so is $A \otimes B$ and $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$.

(11) $\det(A \otimes B) = (\det A)^n (\det B)^m$.

(12) There exists a permutation matrix $P$ (see Section 2.4) depending only on $m$, $n$, such that $B \otimes A = P^*(A \otimes B)P$.

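(13) Let $p(x, y)$ designate the polynomial

$p(x, y) = \sum_{j,k} a_{jk} x^j y^k.$

Let $p(A; B)$ designate the mn × mn matrix

$p(A; B) = \sum_{j,k} a_{jk} A^j \otimes B^k.$

Then the eigenvalues of $p(A; B)$ are $p(\lambda_r, \mu_s)$, $r = 1, 2, \ldots, m$, $s = 1, 2, \ldots, n$, where $\lambda_r$ and $\mu_s$ are the eigenvalues of $A$ and $B$ respectively. In particular, the eigenvalues of $A \otimes B$ are $\lambda_r \mu_s$, $r = 1, 2, \ldots, m$, $s = 1, 2, \ldots, n$.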
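Several of the listed properties are easy to spot-check on small integer matrices. The sketch below implements the Kronecker product with the block convention $a_{ij}B$ and tests properties (5), (9), and (11); the helper names and the sample matrices are illustrative assumptions, and a spot check is of course not a proof.

```python
def kron(A, B):
    """Kronecker product: block (i, j) of the result is A[i][j] * B."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def matmul(A, B):
    """Ordinary matrix product of list-of-rows matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def trace(A):
    return sum(A[i][i] for i in range(len(A)))


def det(A):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))


# A has order m = 2, B has order n = 3 (sample data).
A = [[1, 2], [3, 4]]
C = [[0, 1], [5, -2]]
B = [[2, 0, 1], [1, 3, 0], [0, 1, 1]]
D = [[1, 1, 0], [0, 2, 1], [4, 0, 1]]

mixed_lhs = matmul(kron(A, B), kron(C, D))      # (A (x) B)(C (x) D)
mixed_rhs = kron(matmul(A, C), matmul(B, D))    # (AC) (x) (BD)
```

With $m = 2$ and $n = 3$, property (11) predicts $\det(A \otimes B) = (\det A)^3 (\det B)^2$, which can be compared against `det(kron(A, B))` directly.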

PROBLEMS

1. Show that $I_m \otimes I_n = I_{mn}$.

2. Describe the matrices $I \otimes A$, $A \otimes I$.

3. If $A$ is m × m and $B$ is n × n, then $A \otimes B = (A \otimes I_n)(I_m \otimes B) = (I_m \otimes B)(A \otimes I_n)$.

4. If $A$ and $B$ are upper (or lower) triangular, then so is $A \otimes B$.

5. If $A \otimes B \ne 0$ is diagonal, so are $A$ and $B$.

6. Let $A$ and $B$ have orders $m$, $n$ respectively. Show that the matrix $(I_m \otimes B) + (A \otimes I_n)$ has the eigenvalues $\lambda_r + \mu_s$, $r = 1, 2, \ldots, m$, $s = 1, 2, \ldots, n$, where $\lambda_r$ and $\mu_s$ are the eigenvalues of $A$ and $B$. This matrix is often called the Kronecker sum of $A$ and $B$.

7. Let $A$ and $B$ be of orders $m$ and $n$. If $A$ and $B$ both are (1) normal, (2) Hermitian, (3) positive definite, (4) positive semidefinite, and (5) unitary, then $A \otimes B$ has the corresponding property. See Section 2.9.

8. Kronecker powers: Let $A^{[2]} = A \otimes A$ and, in general, $A^{[k+1]} = A \otimes A^{[k]}$. Prove that $A^{[k+\ell]} = A^{[k]} \otimes A^{[\ell]}$.

9. Prove that $(AB)^{[k]} = A^{[k]} B^{[k]}$.

10. Let $Ax = \lambda x$ and $By = \mu y$, $x = (x_1, \ldots, x_m)^T$. Define $z$ by $z^T = [x_1 y^T, x_2 y^T, \ldots, x_m y^T]$. Prove that $(A \otimes B)z = \lambda\mu z$.

2.4 PERMUTATION MATRICES

By a permutation $\sigma$ of the set $N = \{1, 2, \ldots, n\}$ is meant a one-to-one mapping of $N$ onto itself. Including the identity permutation there are $n!$ distinct permutations of $N$. One can indicate a typical permutation by

$\sigma(1) = i_1, \quad \sigma(2) = i_2, \quad \ldots, \quad \sigma(n) = i_n$

which is often written as

$\sigma: \begin{pmatrix} 1 & 2 & \cdots & n \\ i_1 & i_2 & \cdots & i_n \end{pmatrix}.$

The inverse permutation is designated by $\sigma^{-1}$. Thus $\sigma^{-1}(i_k) = k$.

Let $E_j$ designate the unit (row) vector of $n$ components which has a 1 in the $j$th position and 0's elsewhere:

$E_j = (0, \ldots, 0, 1, 0, \ldots, 0).$

By a permutation matrix of order $n$ is meant a matrix of the form

(2.4.4) $P = P_\sigma = (a_{ij})$, where $a_{i,\sigma(i)} = 1$, $i = 1, 2, \ldots, n$, and $a_{ij} = 0$ otherwise.

The $i$th row of $P$ has a 1 in the $\sigma(i)$th column and 0's elsewhere. The $j$th column of $P$ has a 1 in the $\sigma^{-1}(j)$th row and 0's elsewhere. Thus each row and each column of $P$ has precisely one 1 in it.

Example

It is easily seen that


that is, $P_\sigma A$ is $A$ with its rows permuted by $\sigma$. Moreover,

so that if $A = (a_{ij})$ is r × n,

That is, $AP_\sigma$ is $A$ with its columns permuted by $\sigma^{-1}$. Note also that

(2.4.9) $P_\sigma P_\tau = P_{\sigma\tau}$,

where the product of the permutations $\sigma$, $\tau$ is applied from left to right. Furthermore,

hence

Therefore

The permutation matrices are thus unitary, forming a subgroup of the unitary group.

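From (2.4.6), (2.4.8) and (2.4.12) it follows that if $A$ is n × n

so that the similarity transformation $P_\sigma A P_\sigma^*$ causes a consistent renumbering of the rows and columns of $A$ by the permutation $\sigma$.

Among the permutation matrices, the matrix

(2.4.14) $\pi = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 1 & 0 & 0 & \cdots & 0 \end{pmatrix}$

plays a fundamental role in the theory of circulants. This corresponds to the forward shift permutation $\sigma(1) = 2$, $\sigma(2) = 3$, ..., $\sigma(n-1) = n$, $\sigma(n) = 1$, that is, to the cycle $\sigma = (1, 2, 3, \ldots, n)$ generating the cyclic group of order $n$ ($\pi$ is for "push"). One has

(2.4.15) $\pi^2 = \begin{pmatrix} 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \end{pmatrix}$

corresponding to $\sigma^2$ for which $\sigma^2(1) = 3$, $\sigma^2(2) = 4$, ..., $\sigma^2(n) = 2$. Similarly for $\pi^k$ and $\sigma^k$. The matrix $\pi^n$ corresponds to $\sigma^n = I$, so that

(2.4.16) $\pi^n = I.$

Note also that

(2.4.17) $\pi^T = \pi^* = \pi^{-1} = \pi^{n-1}.$

A particular instance of (2.4.13) is

(2.4.18) $\pi A \pi^T = (a_{i+1,\,j+1})$

where $A = (a_{ij})$ and the subscripts are taken mod $n$.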

11

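Here is a second instance. Let $L = (\lambda_1, \lambda_2, \ldots, \lambda_n)^T$. Then, for any permutation matrix $P_\sigma$,

(2.4.19) $P_\sigma (\operatorname{diag} L) P_\sigma^* = \operatorname{diag}(P_\sigma L).$

A second permutation matrix of importance is

(2.4.20) $\Gamma = \begin{pmatrix} 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & \cdots & 1 & 0 \\ \vdots & & & & \vdots \\ 0 & 1 & \cdots & 0 & 0 \end{pmatrix}$

which corresponds to the permutation $\sigma(1) = 1$, $\sigma(2) = n$, $\sigma(3) = n - 1$, ..., $\sigma(j) = n - j + 2$, ..., $\sigma(n) = 2$. Exhibited as a product of cycles, $\sigma = (1)(2, n)(3, n - 1), \ldots, (n, 2)$. It follows that $\sigma^2 = I$, hence that

(2.4.21) $\Gamma^2 = I.$

Also,

(2.4.22) $\Gamma^* = \Gamma^T = \Gamma = \Gamma^{-1}.$

Again, as an instance of (2.4.13),

(2.4.23) $\Gamma (\operatorname{diag} L) \Gamma = \operatorname{diag}(\Gamma L).$

Finally, we cite the counteridentity $K$, which has 1's on the main counterdiagonal and 0's elsewhere:

(2.4.24) $K = \begin{pmatrix} 0 & \cdots & 0 & 1 \\ 0 & \cdots & 1 & 0 \\ \vdots & & & \vdots \\ 1 & \cdots & 0 & 0 \end{pmatrix}.$

One has $K = K^*$, $K^2 = I$, $K = K^{-1}$.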

Let $P = P_\sigma$ designate an n × n permutation matrix. Now $\sigma$ may be factored into a product of disjoint cycles. This factorization is unique up to the arrangement of factors. Suppose that the cycles in the product have lengths $p_1, p_2, \ldots, p_m$ ($p_1 + p_2 + \cdots + p_m = n$). Let $\pi_{p_k}$ designate the $\pi$ matrix (2.4.14) of order $p_k$. By a rearrangement of rows and columns, the cycles in $P_\sigma$ can be brought into the form of involving only contiguous indices, that is, indices that are successive integers. By (2.4.13), then, there exists a permutation matrix $R$ of order $n$ such that

(2.4.25) $RPR^* = RPR^{-1} = \pi_{p_1} \oplus \pi_{p_2} \oplus \cdots \oplus \pi_{p_m}.$

Since the characteristic polynomial of $\pi_{p_k}$ is $(-1)^{p_k}(\lambda^{p_k} - 1)$, it follows that the characteristic polynomial of $RPR^*$, hence of $P$, is $\prod_{k=1}^{m} (-1)^{p_k}(\lambda^{p_k} - 1)$.

The eigenvalues of the permutation matrix $P$ are therefore the roots of unity comprised in the totality of roots of the $m$ equations:

$\lambda^{p_k} = 1, \quad k = 1, 2, \ldots, m.$

Example. Let $\sigma$ be the permutation of 1, 2, 3, 4, 5, 6 for which $\sigma(1) = 5$, $\sigma(2) = 1$, $\sigma(3) = 6$, $\sigma(4) = 4$, $\sigma(5) = 2$, $\sigma(6) = 3$. Then $\sigma$ can be factored into cycles as $\sigma = (152)(4)(36)$. Therefore, $m = 3$ and $p_1 = 3$, $p_2 = 1$, $p_3 = 2$. The matrix $P_\sigma$ is

$$P_\sigma = \begin{pmatrix}
0 & 0 & 0 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0
\end{pmatrix}.$$
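The example can be reproduced mechanically. The sketch below builds $P_\sigma$ from the rule "row $i$ has its 1 in column $\sigma(i)$", extracts the cycle lengths, and checks that $P_\sigma$ is orthogonal and has order 6 (the least common multiple of the cycle lengths 3, 1, 2), as the eigenvalue description suggests. The helper names and the dictionary representation of $\sigma$ are illustrative choices.

```python
def perm_matrix(sigma):
    """Row i (1-based) has a 1 in column sigma[i] and 0's elsewhere."""
    n = len(sigma)
    return [[1 if sigma[i] == j else 0 for j in range(1, n + 1)]
            for i in range(1, n + 1)]


def cycles(sigma):
    """Disjoint cycles of sigma, each started at its smallest element."""
    seen, out = set(), []
    for start in sorted(sigma):
        if start in seen:
            continue
        cyc, k = [], start
        while k not in seen:
            seen.add(k)
            cyc.append(k)
            k = sigma[k]
        out.append(tuple(cyc))
    return out


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def power(A, k):
    """A multiplied by itself k times (k >= 1)."""
    result = A
    for _ in range(k - 1):
        result = matmul(result, A)
    return result


sigma = {1: 5, 2: 1, 3: 6, 4: 4, 5: 2, 6: 3}
P = perm_matrix(sigma)
identity = [[int(i == j) for j in range(6)] for i in range(6)]
transpose_P = [list(col) for col in zip(*P)]
cycle_lengths = sorted(len(c) for c in cycles(sigma))
```

Here `cycles(sigma)` returns the cycles (1 5 2), (3 6), (4), and `power(P, 6)` is the identity while lower powers are not.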

[The display of $P_\sigma$ is not recoverable from the source.]
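The eigenvalue statement for this example can be checked numerically. The following is a minimal sketch, assuming the convention that row $i$ of $P_\sigma$ carries its 1 in column $\sigma(i)$ (the eigenvalues are the same under the transposed convention); the helper names `permutation_matrix` and `cycle_lengths` are illustrative and do not come from the text.

```python
import numpy as np

# The permutation of the example: sigma(1)=5, sigma(2)=1, sigma(3)=6, ...
sigma = {1: 5, 2: 1, 3: 6, 4: 4, 5: 2, 6: 3}

def permutation_matrix(sigma):
    """Assumed convention: row i has its single 1 in column sigma(i)."""
    n = len(sigma)
    P = np.zeros((n, n))
    for i, j in sigma.items():
        P[i - 1, j - 1] = 1.0
    return P

def cycle_lengths(sigma):
    """Lengths p_k of the disjoint cycles of sigma."""
    seen, lengths = set(), []
    for start in sorted(sigma):
        if start in seen:
            continue
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = sigma[i]
            length += 1
        lengths.append(length)
    return lengths

P = permutation_matrix(sigma)
lengths = cycle_lengths(sigma)

# Product of the factors (lambda^p - 1); here the sign prod (-1)^{p_k} is +1.
char_poly = np.array([1.0])
for p in lengths:
    factor = np.zeros(p + 1)
    factor[0], factor[-1] = 1.0, -1.0
    char_poly = np.polymul(char_poly, factor)

eigenvalues = np.linalg.eigvals(P)
```

Here `np.poly(P)` returns the monic characteristic polynomial of `P`, which can be compared with `char_poly`; every entry of `eigenvalues` should have modulus 1.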

(2.5.2) (a) $w^n = 1$,

(b) $w\bar{w} = 1$,

(c) $\bar{w} = w^{-1}$,

(d) $\bar{w}^k = w^{-k} = w^{n-k}$,

(e) $1 + w + w^2 + \cdots + w^{n-1} = 0$.

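By the Fourier matrix of order n, we shall mean the matrix $F$ ($= F_n$) where

(2.5.3) $F^* = n^{-1/2}\,(w^{(i-1)(j-1)})$, $i, j = 1, 2, \ldots, n$.

Note the star on the left-hand member. The sequence $w^k$, $k = 0, 1, \ldots$, is periodic; hence there are only n distinct elements in F. F can therefore be written alternatively as

(2.5.4) $F^* = n^{-1/2} \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & w & w^2 & \cdots & w^{n-1} \\ 1 & w^2 & w^4 & \cdots & w^{n-2} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & w^{n-1} & w^{n-2} & \cdots & w \end{pmatrix}$.

It is easily established that F and F* are symmetric:

(2.5.5) $F = F^T$, $F^* = (F^*)^T = \bar{F}$, $F = \overline{F^*}$.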

It is of fundamental importance that

Theorem 2.5.1. F is unitary:

(2.5.6) $FF^* = F^*F = I$ or $F^{-1} = F^*$, or $F\bar{F} = \bar{F}F = I$ or $F^{-1} = \bar{F}$.

Proof. This is a result of the geometric series identity

$\sum_{r=0}^{n-1} w^{r(j-k)} = \begin{cases} n & \text{if } j = k, \\ \dfrac{1 - w^{n(j-k)}}{1 - w^{j-k}} = 0 & \text{if } j \ne k. \end{cases}$

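A second application of the geometrical identity yields

Theorem 2.5.2

$F^{*2} = \Gamma = \begin{pmatrix} 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & \cdots & 1 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 1 & \cdots & 0 & 0 \end{pmatrix}$.

Corollary. $F^{*4} = \Gamma^2 = I$. $F^{*3} = F^{*4}(F^*)^{-1} = IF = F$.

We may write the Fourier matrix picturesquely in the form

(2.5.7) $F = \sqrt[4]{I}$.

(It may be shown that all the qth roots of I are of the form $M^{-1}DM$ where $D = \mathrm{diag}(\mu_1, \mu_2, \ldots, \mu_n)$, $\mu_i^q = 1$, and where M is any nonsingular matrix.)

Corollary. The eigenvalues of F are $\pm 1$, $\pm i$, with appropriate multiplicities.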
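These properties are easy to confirm numerically. The sketch below assumes $w = \exp(2\pi i/n)$ and $F^* = n^{-1/2}(w^{jk})$ with indices running from 0; the function name `fourier_matrix` is illustrative only.

```python
import numpy as np

def fourier_matrix(n):
    """Return F, where F* = n^{-1/2} (w^{jk}) and w = exp(2*pi*i/n) (assumed)."""
    idx = np.arange(n)
    F_star = np.exp(2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)
    return F_star.conj().T

n = 8
F = fourier_matrix(n)
F_star = F.conj().T
I = np.eye(n)

# Gamma has a 1 in position (0, 0) and along the anti-diagonal of the rest.
Gamma = np.zeros((n, n))
Gamma[0, 0] = 1.0
for j in range(1, n):
    Gamma[j, n - j] = 1.0

is_symmetric = np.allclose(F, F.T)
is_unitary = np.allclose(F @ F_star, I)
square_is_gamma = np.allclose(F_star @ F_star, Gamma)
fourth_power_is_identity = np.allclose(np.linalg.matrix_power(F_star, 4), I)

# Every eigenvalue should be one of 1, -1, i, -i.
eigs = np.linalg.eigvals(F)
targets = np.array([1, -1, 1j, -1j])
distance_to_roots = np.abs(eigs[:, None] - targets[None, :]).min(axis=1)
```

All four boolean flags should come out true, and `distance_to_roots` should be at rounding level.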

Carlitz has obtained the characteristic polynomials $f(\lambda)$ of $F^*$ ($= F_n^*$). They are as follows.

$n \equiv 0 \pmod 4$: $f(\lambda) = (\lambda - 1)^2(\lambda - i)(\lambda + 1)(\lambda^4 - 1)^{(n/4)-1}$,

$n \equiv 1 \pmod 4$: $f(\lambda) = (\lambda - 1)(\lambda^4 - 1)^{(1/4)(n-1)}$,

$n \equiv 2 \pmod 4$: $f(\lambda) = (\lambda^2 - 1)(\lambda^4 - 1)^{(1/4)(n-2)}$,

$n \equiv 3 \pmod 4$: $f(\lambda) = (\lambda - i)(\lambda^2 - 1)(\lambda^4 - 1)^{(1/4)(n-3)}$.

The discrete Fourier transform. Working with complex n-tuples, write $Z = (z_1, z_2, \ldots, z_n)^T$ and $\hat{Z} = (\hat{z}_1, \hat{z}_2, \ldots, \hat{z}_n)^T$. The linear transformation

(2.5.8) $\hat{Z} = FZ$,

where F is the Fourier matrix, is known as the discrete Fourier transform (DFT). Its inverse is given simply by

(2.5.9) $Z = F^{-1}\hat{Z} = F^*\hat{Z}$.

The transform (2.5.8) often goes by the name of harmonic analysis or periodogram analysis, while the inverse transform (2.5.9) is called harmonic synthesis. The reasons behind these terms are as follows: suppose that $p(z) = a_0 + a_1 z + \cdots + a_{n-1}z^{n-1}$ is a polynomial of degree $\le n - 1$. It will be determined uniquely by specifying its values $p(z_k)$ at n distinct points $z_k$, $k = 1, 2, \ldots, n$, in the complex plane. Select these points $z_k$ as the n roots of unity $1, w, w^2, \ldots, w^{n-1}$. Then clearly

(2.5.10) $(p(1), p(w), \ldots, p(w^{n-1}))^T = n^{1/2} F^* (a_0, a_1, \ldots, a_{n-1})^T$,

so that

(2.5.11) $(a_0, a_1, \ldots, a_{n-1})^T = n^{-1/2} F\,(p(1), p(w), \ldots, p(w^{n-1}))^T$.

The passage from functional values to coefficients through (2.5.11) or (2.5.8) is an analysis of the function, while in the passage from coefficient values to functional values through (2.5.10) or (2.5.9) the functional values are built up or "synthesized."

These formulas for interpolation at the roots of unity can be given another form.

By a Vandermonde matrix $V(z_0, z_1, \ldots, z_{n-1})$ is meant a matrix of the form

(2.5.12) $V = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ z_0 & z_1 & \cdots & z_{n-1} \\ z_0^2 & z_1^2 & \cdots & z_{n-1}^2 \\ \vdots & \vdots & & \vdots \\ z_0^{n-1} & z_1^{n-1} & \cdots & z_{n-1}^{n-1} \end{pmatrix}$.

From (2.5.4) one has, clearly,

(2.5.13) $V(1, w, w^2, \ldots, w^{n-1}) = n^{1/2}F^*$, $V(1, \bar{w}, \bar{w}^2, \ldots, \bar{w}^{n-1}) = n^{1/2}\,\overline{F^*} = n^{1/2}F$.

One now has from (2.5.11)

(2.5.14) $p(z) = (1, z, \ldots, z^{n-1})(a_0, a_1, \ldots, a_{n-1})^T = n^{-1/2}(1, z, \ldots, z^{n-1})\,F\,(p(1), p(w), \ldots, p(w^{n-1}))^T = n^{-1}(1, z, \ldots, z^{n-1})\,V(1, \bar{w}, \bar{w}^2, \ldots, \bar{w}^{n-1})\,(p(1), p(w), \ldots, p(w^{n-1}))^T$.


Note. In the literature of signal processing, a sequence-to-sequence transform is known as a discrete or digital filter. Very often the transform [such as (2.5.8)] is linear and is called a linear filter.
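The analysis and synthesis steps for a polynomial sampled at the roots of unity can be sketched as follows. This is an illustration under the assumption $w = \exp(2\pi i/n)$, so that $F$ has entries $n^{-1/2}\bar{w}^{jk}$; the coefficient vector `a` is an arbitrary example, and the comparison with `numpy.fft.fft` relies on NumPy's unnormalised, negative-exponent convention.

```python
import numpy as np

n = 6
w = np.exp(2j * np.pi / n)          # assumed primitive nth root of unity
idx = np.arange(n)
F_star = w ** np.outer(idx, idx) / np.sqrt(n)
F = F_star.conj()                   # F* is symmetric, so F is its conjugate

# Coefficients a_0, ..., a_{n-1} of an example polynomial p.
a = np.array([2.0, -1.0, 0.5, 0.0, 3.0, 1.0])

# Values p(1), p(w), ..., p(w^{n-1}); polyval wants highest degree first.
values = np.polyval(a[::-1], w ** idx)

synthesis = np.sqrt(n) * F_star @ a      # coefficients -> values
analysis = F @ values / np.sqrt(n)       # values -> coefficients

# NumPy's FFT computes sum_j z_j exp(-2*pi*i*j*k/n), i.e. sqrt(n) * F z.
via_fft = np.fft.fft(values) / n
```

Under these conventions `synthesis` reproduces `values`, while `analysis` and `via_fft` both recover `a`.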

Fourier Matrices as Kronecker Products. The Fourier matrices of orders $2^n$ may be expressed as Kronecker products. This factorization is a manifestation, essentially, of the idea known as the Fast Fourier Transform (FFT) and is of vital importance in real time calculations.

Let $F'_{2^n}$ designate the Fourier matrices of order $2^n$ whose rows have been permuted according to the bit reversing permutation (see Problem 6, p. 30).

Examples

$F'_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$.

One has

(2.5.15) $F'_4 = (I_2 \otimes F'_2)\,D_4\,(F'_2 \otimes I_2)$

where $D_4 = \mathrm{diag}(1, 1, 1, i)$. This may be easily checked out. As is known, $A \otimes B = P(B \otimes A)P^*$ for some permutation matrix P that depends merely on the dimensions of A and B. We may therefore write, for some permutation matrix $S_4$ (one has, in fact, $S_4^{-1} = S_4$):

(2.5.16) $F'_4 = (I_2 \otimes F'_2)\,D_4\,S_4\,(I_2 \otimes F'_2)\,S_4$.

Similarly,

(2.5.17) $F'_{16} = (I_4 \otimes F'_4)\,D_{16}\,(F'_4 \otimes I_4)$

where

(2.5.18) $D_{16} = \mathrm{diag}(I, D^2, D, D^3)$

with

(2.5.19) $D = \mathrm{diag}(1, w, w^2, w^3)$, $w = \exp(2\pi i/16)$.

Again, for an appropriate permutation matrix $S_{16} = S_{16}^{-1} = S_{16}^T$,

(2.5.20) $F'_{16} = (I_4 \otimes F'_4)\,D_{16}\,S_{16}\,(I_4 \otimes F'_4)\,S_{16}$.

For 256 use

(2.5.21) $D_{256} = \mathrm{diag}(I, D^8, D^4, \ldots, D^{15})$

where the sequence 0, 8, 4, ..., 15 is the bit reversed order of 0, 1, ..., 15 and where

(2.5.22) $D = \mathrm{diag}(1, w, \ldots, w^{15})$, $w = e^{2\pi i/256}$.
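A numerical check of the order 4 and order 16 factorizations is sketched below. The sign convention matters: the sketch takes the matrix with entries $w^{jk}/\sqrt{n}$, $w = \exp(2\pi i/n)$, before permuting rows, which is an assumption on my part; with the conjugate convention every twiddle factor is conjugated (for example $i$ becomes $-i$). The names `root_matrix` and `bit_reverse_rows` are illustrative.

```python
import numpy as np

def root_matrix(n):
    """Matrix with entries w^{jk} / sqrt(n), w = exp(2*pi*i/n) (assumed sign)."""
    idx = np.arange(n)
    return np.exp(2j * np.pi * np.outer(idx, idx) / n) / np.sqrt(n)

def bit_reverse_rows(M):
    """Permute the rows of a 2^k x 2^k matrix into bit-reversed order."""
    n = M.shape[0]
    bits = n.bit_length() - 1
    perm = [int(format(r, f"0{bits}b")[::-1], 2) for r in range(n)]
    return M[perm, :]

G2 = bit_reverse_rows(root_matrix(2))
G4 = bit_reverse_rows(root_matrix(4))
G16 = bit_reverse_rows(root_matrix(16))
I2, I4 = np.eye(2), np.eye(4)

# Order 4: one twiddle matrix between two layers of order-2 transforms.
D4 = np.diag([1, 1, 1, 1j])
order4_ok = np.allclose(G4, np.kron(I2, G2) @ D4 @ np.kron(G2, I2))

# Order 16: blocks I, D^2, D, D^3 (bit-reversed order of 0, 1, 2, 3).
w16 = np.exp(2j * np.pi / 16)
d = w16 ** np.arange(4)
D16 = np.diag(np.concatenate([d ** 0, d ** 2, d ** 1, d ** 3]))
order16_ok = np.allclose(G16, np.kron(I4, G4) @ D16 @ np.kron(G4, I4))
```

Each Kronecker factor is sparse, which is where the savings of the FFT come from: applying the three factors costs far fewer operations than one dense matrix-vector product.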

PROBLEMS

1. Evaluate $\det F_n$.

2. Find the polynomial $p_{n-1}(z)$ of degree $\le n - 1$ that takes on the values $1/z$ at the nth roots of unity $w^j$, $j = 1, 2, \ldots, n$. What is the limiting behavior of $p_{n-1}(z)$ as $n \to \infty$? (de Mère)

3. Write $F = R + iS$ where R and S are real and $i = \sqrt{-1}$. Show that R and S are symmetric and that $R^2 + S^2 = I$, $RS = SR$.

4. Exhibit R and S explicitly.

2.6 HADAMARD MATRICES

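By a Hadamard matrix of order n, $H$ ($= H_n$), is meant a matrix whose elements are either +1 or -1 and for which

(2.6.1) $HH^T = H^TH = nI$.

Thus, $n^{-1/2}H$ is an orthogonal matrix.

Examples

$H_1 = (1)$,

$\sqrt{2}\,F_2 = H_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$,

$H_4 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix}$.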

It is known that if $n \ge 3$, then the order of an Hadamard matrix must be a multiple of 4. With one possible exception, all multiples of 4 that are $\le 200$ yield at least one Hadamard matrix.

Theorem 2.6.1. If A and B are Hadamard matrices of orders m and n respectively, then $A \otimes B$ is an Hadamard matrix of order mn.

Proof. $(A \otimes B)(A \otimes B)^T = (A \otimes B)(A^T \otimes B^T) = (AA^T) \otimes (BB^T) = (mI_m) \otimes (nI_n) = mn\,I_{mn}$.

In some areas, particularly digital signal processing, the term Hadamard matrix is limited to the matrices of order $2^n$ given specifically by the recursion

(2.6.2) $H_1 = (1)$, $H_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$, $H_{2^{n+1}} = H_2 \otimes H_{2^n}$.

These matrices have the additional property of being symmetric,

$H_{2^n} = H_{2^n}^T$,

so that

$H_{2^n}^2 = 2^n I$.

The Walsh-Hadamard Transform. By this is meant the transform

$\hat{Z} = HZ$

where H is an Hadamard matrix.
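The Kronecker construction and a fast version of the transform can be sketched as follows. The recursion used here is the Sylvester form $H_{2^{k+1}} = H_2 \otimes H_{2^k}$, and the butterfly routine `fwht` is a standard technique supplied for illustration rather than taken from the text; it also counts the additions and subtractions it performs.

```python
import numpy as np

def sylvester_hadamard(k):
    """Hadamard matrix of order 2^k built by repeated Kronecker products."""
    H = np.array([[1]])
    H2 = np.array([[1, 1], [1, -1]])
    for _ in range(k):
        H = np.kron(H2, H)
    return H

def fwht(x):
    """Fast Walsh-Hadamard transform; returns (H x, number of add/sub ops)."""
    x = np.array(x, dtype=float)      # work on a copy
    n, h, ops = len(x), 1, 0
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = x[i], x[i + h]
                x[i], x[i + h] = a + b, a - b
                ops += 2
        h *= 2
    return x, ops

k = 3
H = sylvester_hadamard(k)
x = np.arange(2 ** k, dtype=float)
y, ops = fwht(x)
```

For order $2^k$ the butterfly scheme uses $k\,2^k$ additions or subtractions, compared with roughly $4^k$ for the dense product `H @ x`.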

PROBLEMS

1. Hadamard parlor game: Write down in a row any four numbers. Then write the sum of the first two, the sum of the last two; the difference of the first two, the difference of the last two to form a second row. Iterate this procedure four times. The final row will be four times the original row. Explain, making reference to $H_4$. Generalize.

2. Define a generalized permutation matrix P as follows. P is square and every row and every column of P has exactly one nonzero element in it. That element is either a +1 or a -1. Show that if H is an Hadamard matrix, and if P and Q are generalized permutation matrices, then PHQ is an Hadamard matrix.

3. With the notation of (2.6.2) prove that


4. Using Problem 3, show that the Hadamard transform of a vector by $H_{2^n}$ can be carried out in $\le n\,2^n$ additions or subtractions.

5. If H is an Hadamard matrix of order n, prove that $|\det H| = n^{n/2}$.

2.7 TRACE

The trace of a square matrix $A = (a_{ij})$ of order n is defined as the sum of its diagonal elements:

$\mathrm{tr}\,A = \sum_{i=1}^{n} a_{ii}$.

The principal general properties of the trace are

(1) $\mathrm{tr}(aA + bB) = a\,\mathrm{tr}(A) + b\,\mathrm{tr}(B)$.

(2) $\mathrm{tr}(AB) = \mathrm{tr}(BA)$.

(3) $\mathrm{tr}\,A = \mathrm{tr}(S^{-1}AS)$, S nonsingular.

(4) If $\lambda_i$ are the eigenvalues of A, then $\mathrm{tr}\,A = \sum_{i=1}^{n} \lambda_i$.

(5) More generally, if p designates a polynomial, then $\mathrm{tr}(p(A)) = \sum_{k=1}^{n} p(\lambda_k)$.

(6) $\mathrm{tr}(AA^*) = \mathrm{tr}(A^*A) = \sum_{i,j=1}^{n} |a_{ij}|^2$ = square of Frobenius norm of A.

(7) $\mathrm{tr}(A \oplus B) = \mathrm{tr}\,A + \mathrm{tr}\,B$.

(8) $\mathrm{tr}(A \otimes B) = (\mathrm{tr}\,A)(\mathrm{tr}\,B)$.
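Several of these properties can be spot-checked on random matrices. The sketch below is only a numerical sanity check, not a proof; the matrices `A`, `B`, and `S` are arbitrary examples chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
S = np.triu(np.ones((3, 3)))          # unit upper triangular, so nonsingular
Z = np.zeros((3, 3))
direct_sum = np.block([[A, Z], [Z, B]])

checks = {
    "(2)": np.isclose(np.trace(A @ B), np.trace(B @ A)),
    "(3)": np.isclose(np.trace(A), np.trace(np.linalg.inv(S) @ A @ S)),
    "(4)": np.isclose(np.trace(A), np.linalg.eigvals(A).sum()),
    "(6)": np.isclose(np.trace(A @ A.conj().T), np.linalg.norm(A, "fro") ** 2),
    "(7)": np.isclose(np.trace(direct_sum), np.trace(A) + np.trace(B)),
    "(8)": np.isclose(np.trace(np.kron(A, B)), np.trace(A) * np.trace(B)),
}
```

Each entry of `checks` should be true up to rounding.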

2.8 GENERALIZED INVERSE

For large classes of matrices, such as the square "singular" matrices and the rectangular matrices, no inverse exists. That is, there are many matrices A for which there exists no matrix B such that $AB = BA = I$.

In discussing the solution of systems of linear equations, we know that if A is n x n and nonsingular then the solution of the equation

$AX = B$,

where X and B are n x m matrices, can be written very neatly in matrix form as

$X = A^{-1}B$.

Although the "solution" given above is symbolic, and in general is not the most economical way of solving systems of linear equations, it has important applications. However, we have so far only been able to use this idea for square nonsingular matrices. In this section we show that for every matrix A, whether square or rectangular, singular or nonsingular, there exists a unique "generalized inverse," often called the "Moore-Penrose" inverse of A, and employing it, the formal solution $X = A^{-1}B$ can be given a useful interpretation. This generalized inverse has several of the important properties of the inverse of a square nonsingular matrix, and the resulting theory is able in a remarkable way to unify a variety of diverse topics. This theory originated in the 1920s, but was rediscovered in the 1950s and has been developed extensively since then.

2.8.1 Right and Left Inverses

Definition. If A is an m x n matrix, a right inverse of A is an n x m matrix B such that $AB = I_m$. Similarly a left inverse is a matrix C such that $CA = I_n$.

Example. If

/1 1 1 \

a right inverse of A is the matrix


since $AB = I_2$. However, note that A does not have a left inverse, since for any matrix C, by the theorem on the rank of a product, $r(CA) \le r(A) = 2$, so that $CA \ne I_3$. Similarly, although A is, by definition, a left inverse of B, there exists no right inverse of B.

The following theorem gives necessary and sufficient conditions for the existence of a right or left inverse.

Theorem 2.8.1.1. An m x n matrix A has a right (left) inverse if and only if A has rank m (n).

Proof. We work first with right inverses. Assume that $AB = I_m$. Then $m = r(I_m) \le r(A) \le m$. Hence $r(A) = m$.

Conversely, suppose that $r(A) = m$. Then A has m linearly independent columns, and we can find a permutation matrix P so that the matrix $\tilde{A} = AP$ has its first m columns linearly independent. Now, if we can find a matrix $\tilde{B}$ such that $\tilde{A}\tilde{B} = AP\tilde{B} = I$, then $B = P\tilde{B}$ is clearly a right inverse for A.

Therefore, we may assume, without loss of generality, that A has its first m columns linearly independent. Hence A can be written in the block form

$A = (A_1, A_2)$

where $A_1$ is an m x m nonsingular matrix and $A_2$ is some m x (n - m) matrix. This can be factored to yield

$A = A_1(I_m, Q)$ $(Q = A_1^{-1}A_2)$.

Now let

$B = \begin{pmatrix} B_1 \\ B_2 \end{pmatrix}$

where $B_1$ is m x m and $B_2$ is (n - m) x m. Then $AB = I$ if and only if

$A_1B_1 + A_1QB_2 = I$,

or if and only if

$B_1 + QB_2 = A_1^{-1}$,

or if and only if

$B_1 = A_1^{-1} - QB_2$.

Therefore, we have

$B = \begin{pmatrix} A_1^{-1} - QB_2 \\ B_2 \end{pmatrix}$

for an arbitrary (n - m) x m matrix $B_2$. Thus there is a right inverse, and if n > m, it is not unique.

We now prove the theorem for a left inverse. Suppose, again, that A is m x n and $r(A) = n$. Then $A^T$ is n x m and $r(A^T) = n$. By the first part, $A^T$ has a right inverse: $A^TB = I$. Hence $B^TA = I$ and A has a left inverse.
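The family of right inverses constructed in the proof can be written down directly. This is a minimal sketch that assumes, as in the proof, that the first m columns of A are already linearly independent (no column permutation is attempted); `right_inverse` is an illustrative name and the matrix `A` is an arbitrary example.

```python
import numpy as np

def right_inverse(A, B2):
    """Right inverse of a full-row-rank m x n matrix A for a chosen block B2.

    Assumes the leading m x m block A1 of A is nonsingular.
    B2 is an arbitrary (n - m) x m matrix.
    """
    m, n = A.shape
    A1, A2 = A[:, :m], A[:, m:]
    Q = np.linalg.solve(A1, A2)            # Q = A1^{-1} A2
    B1 = np.linalg.inv(A1) - Q @ B2        # B1 = A1^{-1} - Q B2
    return np.vstack([B1, B2])

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])            # 2 x 3, rank 2

B_zero = right_inverse(A, np.zeros((1, 2)))
B_other = right_inverse(A, np.array([[1.0, -2.0]]))
```

Both `B_zero` and `B_other` satisfy $AB = I_2$ although they differ, which illustrates the non-uniqueness when n > m.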

Corollary. If A is n x n of rank n, then A has both a right and a left inverse and they are the same.

Proof. The existence of a right and a left inverse for A follows immediately from the theorem. To prove that they are the same we assume

$AB = I$, $CA = I$.

Then $C(AB) = CI = C$. But also,

$C(AB) = (CA)B = IB = B$,

so that B = C. This is the matrix that is defined to be the inverse of A, denoted by $A^{-1}$.


PROBLEMS

1. Find a left inverse for (i i) . Find all the left inverses.

have a left inverse?

3. Let A be m x n and have a left inverse B. Suppose that the system of linear equations AX = C has a solution. Prove that the solution is unique and is given by X = BC.

4. Let B be a left inverse for A. Prove that ABA = A and BAB = B.

5. Let A be m x n and have rank n. Prove that $A^TA$ is nonsingular and that $(A^TA)^{-1}A^T$ is a left inverse for A.

6. Let A be m x n and have rank n. Let W be m x m positive definite symmetric. Prove that $A^TWA$ is nonsingular and that $(A^TWA)^{-1}A^TW$ is a left inverse for A.

2.8.2 Generalized Inverses

Definition. Let A be an m x n matrix. Then an n x m matrix X that satisfies any or all of the following properties is called a generalized inverse:

(1) AXA = A,

(2) XAX = X,

(3) (AX)* = AX,

(4) (XA)* = XA.

Here the star * represents the conjugate transpose. A matrix satisfying all four of the properties above is called a Moore-Penrose inverse of A (for short: an M-P inverse). We show now that every matrix A has a unique M-P inverse. It is denoted by $A^{\div}$. It should be remarked that the M-P inverse is often designated by other symbols, such as $A^{+}$. The notation $A^{\div}$ is used here because (a) it is highly suggestive and (b) it comes close to one used in the APL computer language.

We first prove the following lemma on "rank factorization" of a matrix.

Lemma. If A is an m x n matrix of rank r, then A = BC, where B is m x r, C is r x n and $r(B) = r(C) = r$.

Proof. Since the rank of A is r, A has r linearly independent columns. We may assume, without loss of generality, that these are the first r columns of A, for, if not, there exists a permutation matrix P such that the first r columns of the matrix AP are the r linearly independent columns of A. But if AP can be factored as

$AP = B\tilde{C}$, $r(B) = r(\tilde{C}) = r$,

then

$A = BC$

where $C = \tilde{C}P^{-1}$ and $r(C) = r(\tilde{C}) = r$, since P is nonsingular.

Thus if we let B be the m x r matrix consisting of the first r columns of A, the remaining n - r columns are linear combinations of the columns of B, of the form $BQ^{(j)}$ for some r x 1 vector $Q^{(j)}$. Then if we let Q be the r x (n - r) matrix

$Q = (Q^{(1)}, \ldots, Q^{(n-r)})$,

we have

$A = (B, BQ)$ (the first block has r columns, the second n - r).

If we let

$C = (I_r, Q)$,

we have

$A = B(I_r, Q) = BC$

and $r(B) = r(C) = r$.


We next show the existence of an M-P inverse in the case where A has full row or full column rank.

Theorem 2.8.2.1

(a) If A is square and nonsingular, set $A^{\div} = A^{-1}$.

(b) If A is n x 1 (or 1 x n) and $A \ne 0$, set $A^{\div} = \dfrac{1}{A^*A}A^*$ (or $A^{\div} = \dfrac{1}{AA^*}A^*$).

(c) If A is m x n and $r(A) = m$, set $A^{\div} = A^*(AA^*)^{-1}$. If A is m x n and $r(A) = n$, set $A^{\div} = (A^*A)^{-1}A^*$.

Then $A^{\div}$ is an M-P inverse for A. Moreover, in the case of full row rank, it is a right inverse; in the case of full column rank, it is a left inverse.

Note that (a) and (b) are really special cases of (c).

Proof. Direct calculation. Observe that if A is m x n and $r(A) = m$, then $AA^*$ is m x m. It is well known that $r(AA^*) = m$, so that $(AA^*)^{-1}$ can be formed. Similarly for $A^*A$.

We can now show the existence of an M-P inverse for any m x n matrix A.

If A = 0, set $A^{\div} = 0^* = 0_{n,m}$. This is readily verified to satisfy requirements (1), (2), (3), and (4) for a generalized inverse.

If $A \ne 0$, factor A as in the lemma into the product

$A = BC$

where B is m x r, C is r x n and $r(B) = r(C) = r$. Now B has full column rank while C has full row rank, so that $B^{\div}$ and $C^{\div}$ may be found as in the previous theorem. Now set

$A^{\div} = C^{\div}B^{\div}$.

Theorem 2.8.2.2. Let $A^{\div}$ be defined as above. Then it is an M-P inverse for A.

Proof. It is easier to verify properties (3) and (4) first. They will then be used in proving properties (1) and (2).

(3) $AA^{\div} = B(CC^{\div})B^{\div} = BIB^{\div} = BB^{\div}$, and since $BB^{\div} = (BB^{\div})^*$, we have $AA^{\div} = (AA^{\div})^*$.

(4) Similarly, $A^{\div}A = C^{\div}C = (C^{\div}C)^* = (A^{\div}A)^*$.

(1) $(AA^{\div})A = (BB^{\div})BC = BC = A$.

(2) $(A^{\div}A)A^{\div} = (C^{\div}C)C^{\div}B^{\div} = C^{\div}B^{\div} = A^{\div}$.
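The construction $A^{\div} = C^{\div}B^{\div}$ from a rank factorization can be sketched numerically. One substitution should be noted: instead of selecting independent columns as in the lemma, the sketch obtains a rank factorization from NumPy's singular value decomposition, which is simply a convenient and stable way to get some B and C of full rank with A = BC. The names `rank_factorization` and `mp_inverse` and the tolerance are illustrative assumptions.

```python
import numpy as np

def rank_factorization(A, tol=1e-10):
    """Return B (m x r, full column rank) and C (r x n, full row rank), A = BC."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    r = int(np.sum(s > tol * s[0])) if s.size and s[0] > 0 else 0
    return U[:, :r] * s[:r], Vh[:r, :]

def mp_inverse(A):
    """M-P inverse via A = BC, using the full-rank formulas for B and C."""
    A = np.asarray(A, dtype=complex)
    B, C = rank_factorization(A)
    if B.shape[1] == 0:                      # A = 0
        return np.zeros(A.shape[::-1], dtype=complex)
    B_div = np.linalg.solve(B.conj().T @ B, B.conj().T)   # (B*B)^{-1} B*
    C_div = C.conj().T @ np.linalg.inv(C @ C.conj().T)    # C*(CC*)^{-1}
    return C_div @ B_div

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])              # rank 2
X = mp_inverse(A)

penrose_conditions = [
    np.allclose(A @ X @ A, A),
    np.allclose(X @ A @ X, X),
    np.allclose((A @ X).conj().T, A @ X),
    np.allclose((X @ A).conj().T, X @ A),
]
```

All four entries of `penrose_conditions` should be true, and `X` should agree with `numpy.linalg.pinv(A)`.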

Now we prove that for any matrix A the M-P inverse is unique.

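Theorem 2.8.2.3. Given an m x n matrix A, there is only one matrix $A^{\div}$ that satisfies all four properties for the Moore-Penrose inverse.

Proof. Suppose that there exist matrices B and C satisfying

(1) $ABA = A$, $ACA = A$,

(2) $BAB = B$, $CAC = C$,

(3) $(AB)^* = AB$, $(AC)^* = AC$,

(4) $(BA)^* = BA$, $(CA)^* = CA$.

Then

$B \overset{(2)}{=} BAB \overset{(4)}{=} (BA)^*B = A^*B^*B \overset{(1)}{=} (ACA)^*B^*B = (CA)^*(BA)^*B \overset{(4)}{=} CABAB \overset{(2)}{=} CAB$

and

$C \overset{(2)}{=} CAC \overset{(3)}{=} C(AC)^* = CC^*A^* \overset{(1)}{=} CC^*(ABA)^* = (CC^*A^*)(AB)^* \overset{(3)}{=} (CC^*A^*)(AB) \overset{(3),(2)}{=} CAB$.

Therefore B = C. The integers over the equality signs show the equations used to derive the equality.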

Penrose has given the following recursive method for computing $A^{\div}$, which is included in case the reader would like to write a computer program.

Theorem 2.8.2.4 (the Penrose algorithm). Let A be m x n and have rank r > 0.

(a) Set $B = A^*A$ (B is n x n).

(b) Set $C_1 = I$ ($C_1$ is n x n).

(c) Set recursively for i = 1, 2, ..., r - 1: $C_{i+1} = (1/i)\,\mathrm{tr}(C_iB)\,I - C_iB$ ($C_i$ is n x n).

Then $\mathrm{tr}(C_rB) \ne 0$ and $A^{\div} = rC_rA^*/\mathrm{tr}(C_rB)$. Moreover, $C_{r+1}B = 0$. We therefore do not need to know r beforehand, but merely stop the recurrence when we have arrived at this stage.

The proof is omitted.
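A direct transcription of this recursion might look as follows. It is a sketch under stated assumptions: in floating point the stopping test $C_{r+1}B = 0$ has to be replaced by a tolerance, and the scaled threshold used here (`tol` times a power of the norm of B) is my own choice, not part of the theorem. The function name `penrose_pinv` is illustrative.

```python
import numpy as np

def penrose_pinv(A, tol=1e-10):
    """M-P inverse by the recursion C_{i+1} = (1/i) tr(C_i B) I - C_i B."""
    A = np.asarray(A, dtype=complex)
    if not A.any():                          # rank 0: the inverse is 0
        return np.zeros(A.shape[::-1], dtype=complex)
    B = A.conj().T @ A
    n = B.shape[0]
    I = np.eye(n)
    scale = np.linalg.norm(B)
    C = I.astype(complex)
    for i in range(1, n + 1):
        CB = C @ B
        t = np.trace(CB)
        C_next = (t / i) * I - CB
        # Stop once C_{i+1} B vanishes (up to rounding); then i equals r.
        if np.linalg.norm(C_next @ B) <= tol * scale ** (i + 1):
            return i * (C @ A.conj().T) / t
        C = C_next
    raise RuntimeError("recursion did not terminate; try a larger tol")

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])              # rank 2
rank_one = np.array([[1.0, 2.0],
                     [2.0, 4.0],
                     [0.0, 0.0]])            # rank 1

X = penrose_pinv(A)
Y = penrose_pinv(rank_one)
```

For the rank 1 example the recursion stops at i = 1 and returns $A^*/\mathrm{tr}(A^*A)$. The method is attractive for hand computation; for ill-conditioned matrices an SVD-based routine is usually a safer choice.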

Also very useful is the Greville algorithm.

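Theorem 2.8.2.5. Define $A_k = (A_{k-1}, a_k)$ where $a_k$ is the kth column of A and $A_{k-1}$ is the submatrix of A consisting of its first k - 1 columns. Set $d_k = A_{k-1}^{\div}a_k$ and $c_k = a_k - A_{k-1}d_k$. Set $b_k = c_k^{\div}$ if $c_k \ne 0$. If $c_k = 0$, set $b_k = (1 + d_k^*d_k)^{-1}d_k^*A_{k-1}^{\div}$. Then

$A_k^{\div} = \begin{pmatrix} A_{k-1}^{\div} - d_kb_k \\ b_k \end{pmatrix}$.

To start: set $A_1^{\div} = 0$ if $a_1 = 0$; if not, set $A_1^{\div} = (a_1^*a_1)^{-1}a_1^*$.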
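The column-by-column recursion can be sketched as below. The test "$c_k = 0$" is replaced by a relative tolerance, which is an assumption needed in floating point; `greville_pinv` is an illustrative name.

```python
import numpy as np

def greville_pinv(A, tol=1e-10):
    """M-P inverse built one column at a time (Greville's recursion)."""
    A = np.asarray(A, dtype=complex)
    m, n = A.shape
    a1 = A[:, :1]
    if np.linalg.norm(a1) <= tol:
        X = np.zeros((1, m), dtype=complex)
    else:
        X = a1.conj().T / (a1.conj().T @ a1)
    for k in range(1, n):
        ak = A[:, k:k + 1]
        d = X @ ak                           # d_k = A_{k-1}^div a_k
        c = ak - A[:, :k] @ d                # part of a_k outside range(A_{k-1})
        if np.linalg.norm(c) > tol * max(1.0, np.linalg.norm(ak)):
            b = c.conj().T / (c.conj().T @ c)             # b_k = c_k^div
        else:
            b = (d.conj().T @ X) / (1.0 + d.conj().T @ d)
        X = np.vstack([X - d @ b, b])
    return X

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])   # third column = first + second, so c_3 = 0
X = greville_pinv(A)
```

The example is chosen so that both branches are exercised: the second column is independent of the first, while the third lies in the span of the first two.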

PROBLEMS 2 2 0

1. If A = (1 2 1) , verify that 1 2 1


If $A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$, find $A^{\div}$.

1 2

find A'

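Use Penrose's formulas to compute the inverse of the nonsingular matrix

Use Greville's algorithm.

If c is a nonzero scalar, prove that $(cA)^{\div} = (1/c)A^{\div}$.

Prove that $(A^{\div})^{\div} = A$.

Prove that $(A^{\div})^* = (A^*)^{\div}$.

If d is a scalar, define $d^{\div}$ by $d^{\div} = d^{-1}$ if $d \ne 0$, $d^{\div} = 0$ if $d = 0$. Let $A = \mathrm{diag}(d_1, \ldots, d_n)$. Prove that $A^{\div} = \mathrm{diag}(d_1^{\div}, \ldots, d_n^{\div})$.

Prove that $\begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}^{\div} = \begin{pmatrix} A^{\div} & 0 \\ 0 & B^{\div} \end{pmatrix}$ and $\begin{pmatrix} 0 & A \\ B & 0 \end{pmatrix}^{\div} = \begin{pmatrix} 0 & B^{\div} \\ A^{\div} & 0 \end{pmatrix}$.

Prove that if $A^{\div} = 0$, then $A = 0^*$.

Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and have rank 1. Prove that

$A^{\div} = \dfrac{1}{|a|^2 + |b|^2 + |c|^2 + |d|^2}\,A^*$.

Let J be the J matrix of order n. Prove that $J^{\div} = (1/n^2)J$.

Let S be an n x n matrix with 1's on the superdiagonal and 0's elsewhere. Find $S^{\div}$.

Let P be any projection matrix (i.e., $P^2 = P$, $P^* = P$). Prove that $P^{\div} = P$.

15. Prove that both $AA^{\div}$ and $A^{\div}A$ are projections.

16. Prove that $A^{\div} = (A^*A)^{\div}A^* = A^*(AA^*)^{\div}$.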

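17. Prove that $r(A) = r(A^{\div}) = r(A^{\div}A) = \mathrm{tr}(A^{\div}A)$.

18. Taking $A = (1, 0)$, $B = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$, show that, in general, $(AB)^{\div} \ne B^{\div}A^{\div}$.

19. If a and b are column vectors, then $a^{\div} = (a^*a)^{\div}a^*$, and $(ab^*)^{\div} = (a^*a)^{\div}(b^*b)^{\div}ba^*$.

20. Prove that $(A \otimes B)^{\div} = A^{\div} \otimes B^{\div}$.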

2.8.3 The UDV Theorem and the M-P Inverse

We begin by establishing a theorem that is of great utility in visualizing the action and facilitating the manipulation of rectangular (or square) matrices. This is the UDV theorem, also called the diagonal decomposition theorem or the singular value decomposition theorem.

Theorem 2.8.3.1. Let A be an m x n matrix with complex elements and of rank r. Then there exist unitary matrices U, V of orders m and n respectively such that

(2.8.3.1) $A = UDV^*$

where

(2.8.3.2) $D = \begin{pmatrix} D_1 & 0 \\ 0 & 0 \end{pmatrix}$

is m x n and where $D_1 = \mathrm{diag}(d_1, d_2, \ldots, d_r)$ is a nonsingular diagonal matrix of order r.

Note that the representation (2.8.3.1) can be written as $U^*AV = D$ or, changing the notation, $UAV = D$, and so on (since U and V are unitary).

Let A be m x n; then, as is well known, $AA^*$ is positive semidefinite Hermitian symmetric and $r(AA^*) = r(A) = r(A^*)$. Hence the eigenvalues of $AA^*$ are real and nonnegative. Write them as $d_1^2, d_2^2, \ldots, d_r^2, 0, 0, \ldots, 0$ where the $d_i$'s are positive and where there are m - r 0's in the list. The numbers $d_1, d_2, \ldots, d_r$ are known as the singular values of A.


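Proof. Define $D_1 = \mathrm{diag}(d_1, d_2, \ldots, d_r)$. Let $U_1$ be m x r and consist of the (orthonormal) eigenvectors of $AA^*$ corresponding to the eigenvalues $d_1^2, d_2^2, \ldots, d_r^2$ (cf. Theorems 2.9.3 and 2.9.9). We have $AA^*U_1 = U_1D_1^2$ and $U_1^*U_1 = I_r$. Let $U_2$ be the m x (m - r) matrix whose columns consist of an orthonormal basis for the null space of $A^*$. Then $A^*U_2 = 0$ and $U_2^*U_2 = I_{m-r}$. Write $U = (U_1, U_2)$ (block notation). Then

$U^*U = \begin{pmatrix} U_1^*U_1 & U_1^*U_2 \\ U_2^*U_1 & U_2^*U_2 \end{pmatrix}$.

Now, since $AA^*U_1 = U_1D_1^2$, $U_2^*AA^*U_1 = U_2^*U_1D_1^2$. But $A^*U_2 = 0$, so that $U_2^*A = 0$, hence $U_2^*U_1D_1^2 = 0$. Since $D_1^2$ is nonsingular, it follows that $U_2^*U_1 = 0$, and therefore $U_1^*U_2 = 0$. This means that

$U^*U = \begin{pmatrix} I_r & 0 \\ 0 & I_{m-r} \end{pmatrix}$

and hence that U is unitary.

Let $V_1$ be the n x r matrix defined by $V_1 = A^*U_1D_1^{-1}$. Let $V_2$ be the n x (n - r) matrix whose n - r columns are a set of n - r orthonormal vectors for the null space of A. Thus $AV_2 = 0$ and $V_2^*V_2 = I_{n-r}$. Define V as the n x n matrix $V = (V_1, V_2)$. Now

$V_1^*V_1 = D_1^{-1}U_1^*AA^*U_1D_1^{-1} = D_1^{-1}U_1^*U_1D_1^2D_1^{-1} = I_r$

and $V_2^*V_1 = V_2^*A^*U_1D_1^{-1} = (AV_2)^*U_1D_1^{-1} = 0$. It follows that V is unitary. Finally,

$U^*AV = \begin{pmatrix} U_1^*AV_1 & U_1^*AV_2 \\ U_2^*AV_1 & U_2^*AV_2 \end{pmatrix} = \begin{pmatrix} D_1 & 0 \\ 0 & 0 \end{pmatrix} = D$.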


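Using the UDV theorem, we can produce a very convenient formula for $A^{\div}$.

Theorem 2.8.3.2. If $A = U^*DV^*$, where U, V, D are as above, then

$A^{\div} = VD^{\div}U$

where

$D^{\div} = \begin{pmatrix} D_1^{-1} & 0 \\ 0 & 0 \end{pmatrix}$

is n x m (the column blocks have r and m - r columns).

Proof. By a direct computation, it is easy to show that the n x m matrix $D^{\div}$ is the M-P inverse of D. Since $A(VD^{\div}U) = U^*\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}U$ and $(VD^{\div}U)A = V\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}V^*$, the third and fourth properties for the generalized inverse are satisfied. Also,

$AA^{\div}A = (U^*DV^*)(VD^{\div}U)(U^*DV^*) = U^*DD^{\div}DV^* = U^*DV^* = A$.

Similarly $A^{\div}AA^{\div} = A^{\div}$, proving the first two properties.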
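In numerical practice this formula is the usual route to the M-P inverse. The sketch below uses `numpy.linalg.svd`, which returns the factorization in the form $A = U\,\mathrm{diag}(s)\,V^h$, that is, the form (2.8.3.1) with $V^h = V^*$, so the inverse reads $V D^{\div} U^*$ in that notation. The function name and the rank tolerance are assumptions for illustration.

```python
import numpy as np

def pinv_via_svd(A, tol=1e-12):
    """M-P inverse from the singular value decomposition A = U D Vh."""
    A = np.asarray(A, dtype=complex)
    m, n = A.shape
    U, s, Vh = np.linalg.svd(A)              # U is m x m, Vh is n x n
    r = int(np.sum(s > tol * s[0])) if s.size and s[0] > 0 else 0
    D_div = np.zeros((n, m))
    D_div[:r, :r] = np.diag(1.0 / s[:r])     # invert only the nonzero d_k
    return Vh.conj().T @ D_div @ U.conj().T

A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [0.0, 1.0],
              [1.0, 1.0]])
singular = np.array([[1.0, 2.0],
                     [2.0, 4.0]])

X = pinv_via_svd(A)
Y = pinv_via_svd(singular)
```

Since `A` has full column rank, `X` is also a left inverse of `A`; `singular` has rank 1, so only one singular value is inverted.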

Theorem 2.8.3.3. For each A there exist polynomials p and q such that

$A^{\div} = A^*p(AA^*) = q(A^*A)A^*$.

Proof. Let A be m x n and have rank r. Then by the diagonal decomposition theorem there exist unitary matrices U, V of order m and n and an m x n matrix

$D = \begin{pmatrix} D_1 & 0 \\ 0 & 0 \end{pmatrix}$ (row blocks of r and m - r rows, column blocks of r and n - r columns),

where $D_1 = \mathrm{diag}(d_1, d_2, \ldots, d_r)$, $d_1d_2 \cdots d_r \ne 0$, such that $A = U^*DV^*$. Then $A^* = VD^*U$, $AA^* = U^*DD^*U$, and $A^{\div} = VD^{\div}U$. For an arbitrary polynomial p(z), $p(AA^*) = p(U^*(DD^*)U) = U^*p(DD^*)U$. Hence $A^*p(AA^*) = VD^*p(DD^*)U$. Therefore for $A^{\div}$ to equal $A^*p(AA^*)$ it is necessary and sufficient that $D^{\div} = D^*p(DD^*)$. Equivalently,

$\begin{pmatrix} D_1^{-1} & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} D_1^* & 0 \\ 0 & 0 \end{pmatrix} p\!\begin{pmatrix} D_1D_1^* & 0 \\ 0 & 0 \end{pmatrix}$

or $d_k^{-1} = \bar{d}_k\,p(|d_k|^2)$, k = 1, 2, ..., r. Thus $p(|d_k|^2) = 1/|d_k|^2$, k = 1, 2, ..., r, is necessary and sufficient. Let s designate the number of distinct values among $|d_1|, |d_2|, \ldots, |d_r|$. Then by the fundamental theorem of polynomial interpolation (see any book on interpolation, approximation, or numerical analysis) there is a unique polynomial of degree $\le s - 1$ that takes on the values $|d_k|^{-2}$ at the s points $|d_k|^2$. The second identity for $A^{\div}$ is proved similarly.

PROBLEMS

1. Let U and V be unitary. Prove that $(UAV)^{\div} = V^*A^{\div}U^*$.

2. Let A be normal. Give a representation for $A^{\div}$ in terms of the characteristic values of A. See Section 2.9.

3. Prove that if A is normal, $AA^{\div} = A^{\div}A$.

4. Prove that $A^{\div} = A^*$ if and only if the singular values of A are 0 or 1.

5. Prove that $A^{\div} = \lim_{t \to 0} A^*(tI + AA^*)^{-1}$.

2.8.4 Generalized Inverses and Systems of Linear Equations

Using the properties of the generalized inverse we are able to determine, for any system of equations

$AX = B$,

whether or not the system has a solution. If it does, we can obtain a matrix equation, involving the generalized inverse, which exhibits this solution. Oddly enough, we need only the first property of a generalized inverse. That is, we may use any matrix $A^{(1)}$ such that $AA^{(1)}A = A$.

Definition. If A is m x n, any n x m matrix that satisfies $AA^{(1)}A = A$ is called a (1)-inverse of A. More generally, any matrix that satisfies any combination of the four requirements for the generalized inverse on page 44 is designated accordingly.

Example. A (1, 2, 4)-inverse for A is one that satisfies conditions (1), (2), and (4).

Theorem 2.8.4.1. Let A be m x n. The system of equations

$AX = B$

has a solution if and only if $B = AA^{(1)}B$, for any (1)-inverse of A. In this case, the general solution is given by

$X = A^{(1)}B + (I - A^{(1)}A)Y$

for an arbitrary n x 1 vector Y.

Proof. Let $B = AA^{(1)}B$. Then $AX = AA^{(1)}B$ is solved by $X = A^{(1)}B$. Suppose, conversely, that the system has a solution $X_0$: $AX_0 = B$. Then, for any (1)-inverse $A^{(1)}$,

$AA^{(1)}B = AA^{(1)}AX_0 = AX_0 = B$.

Moreover, if $X = A^{(1)}B + (I - A^{(1)}A)Y$, then with $B = AA^{(1)}B$,

$AX = AA^{(1)}B + (A - AA^{(1)}A)Y = B$.

Therefore any such X is a solution. To show that it is the general solution, we must show that every solution $X_0$ has this form. Set $R = X_0 - A^{(1)}B$. Then $AR = AX_0 - AA^{(1)}B = B - B = 0$. Now therefore $R = R - A^{(1)}AR$. Hence, $X_0 = A^{(1)}B + (I - A^{(1)}A)R$, which is of the required form with Y = R.

In the numerical utilization of this theorem one should, of course, use some standard (1)-inverse of A such as $A^{\div}$.
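The consistency test and the general solution can be sketched in a few lines. The (1)-inverse used here is the M-P inverse from `numpy.linalg.pinv`, in line with the remark above; the helper name `general_solution` and the tolerance are illustrative assumptions.

```python
import numpy as np

def general_solution(A, B, G, Y=None, tol=1e-10):
    """Solve AX = B using a (1)-inverse G of A.

    Returns None when B != A G B (no solution); otherwise returns
    X = G B + (I - G A) Y for the supplied Y (Y = 0 by default).
    """
    if np.linalg.norm(A @ G @ B - B) > tol * max(1.0, np.linalg.norm(B)):
        return None
    n = A.shape[1]
    if Y is None:
        Y = np.zeros(n)
    return G @ B + (np.eye(n) - G @ A) @ Y

A = np.array([[1.0, 1.0],
              [1.0, 1.0]])
G = np.linalg.pinv(A)                 # one convenient (1)-inverse

X_particular = general_solution(A, np.array([2.0, 2.0]), G)
X_shifted = general_solution(A, np.array([2.0, 2.0]), G, Y=np.array([3.0, -1.0]))
X_none = general_solution(A, np.array([1.0, 2.0]), G)
```

Different choices of `Y` sweep out the whole solution set of a consistent system; for the inconsistent right-hand side the function returns `None`.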

PROBLEMS

1. Show that if A is an m x n matrix and B is any (1)-inverse of A , then AB and BA are idempotent of orders m and n respectively and BAB is a (1,2)- inverse of A.

2. Show that if A is m x n (n x m), of rank m, then any (1)-inverse of A is a right (left) inverse of A, and any right (left) inverse of A is a (1,2,3)- [(1,2,4)-] inverse of A.

3. Consider two systems of equations: (1) AX = B, ( 2 ) CX = D . Find conditions such that every solution of (1) is a solution of (2).

4 . What happens in Problem 3 if B = D = O ?

5. Prove that the matrix equation AXB = C has a solution if and only if $AA^{\div}CB^{\div}B = C$. In this case, the general solution is given by

$X = A^{\div}CB^{\div} + Y - A^{\div}AYBB^{\div}$

for an arbitrary Y.

2.8.5 The M-P Inverse and Least Square Problems

Let A be m x n, X be n x 1, and B be m x 1, and consider the system of equations

$AX = B$.

If the vector B lies in the range of A, then there exists one or more solutions to this system. If the solution is not unique we might want to know which solution has minimum norm. If the vector B is not in the range of A, then there is no solution to the system, but it is often desirable to find a vector X in some way closest to a solution. To this end, for any X, define the residual vector $R = AX - B$ and consider its Euclidean norm $\|R\| = \sqrt{R^*R}$. A least squares solution to the system is a vector $X_0$ such that its residual has minimum norm. That is,

$\|R_0\| = \|AX_0 - B\| \le \|AX - B\|$ for all n x 1 vectors X.

Theorem 2.8.5.1. The system of equations AX = B always has a least squares solution. This solution is unique if and only if the columns of A are linearly independent. In this case, the unique least squares solution is given by $X = A^{\div}B$.

Proof. Let $R(A)$ designate the range space of A and by $[R(A)]^{\perp}$ designate its orthogonal complement. Then we can write $B = B_1 + B_2$ where $B_1$ is in $R(A)$ and $B_2$ is in the orthogonal complement $[R(A)]^{\perp}$. For any X, AX is in $R(A)$ as is $AX - B_1$, which hence is orthogonal to $B_2$. Now $AX - B = AX - B_1 - B_2$. Hence, for any X,

$\|AX - B\|^2 = \|AX - B_1\|^2 + \|B_2\|^2 \ge \|B_2\|^2$.

Therefore $\|B_2\|^2$ is a lower bound for the values $\|AX - B\|^2$ and is achieved if and only if $AX = B_1$. Since $B_1$ is in $R(A)$, there is a solution $X_0$ to $AX = B_1$. For this vector $X_0$,

$\|R_0\|^2 = \|AX_0 - B\|^2 = \|B_2\|^2 \le \|AX - B\|^2$

so that the lower bound is achieved. Since a unique solution to $AX = B_1$ exists if and only if the columns of A are linearly independent, the theorem is proved.

For any solution $X_0$ to $AX = B_1$,

$R_0 = AX_0 - B = B_1 - (B_1 + B_2) = -B_2$ is in $[R(A)]^{\perp}$.

Therefore $A^*R_0 = 0$, or

$A^*AX_0 = A^*B$.

These are the normal equations determining the least squares solution.

If the columns of A are independent, then $r(A^*A) = r(A) = n$, so that the n x n matrix $A^*A$ is nonsingular. The least squares solution $X_0$ is determined by $A^*AX_0 = A^*B$, so that $X_0 = (A^*A)^{-1}A^*B$. But, from our previous work, $A^{\div} = (A^*A)^{-1}A^*$.

Finally, we take up the general case.

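Lemma. Let $P = AA^{\div}$, $Q = A^{\div}A$. Then, if X and Y are arbitrary vectors (conformable),

(1) $\|AX + (I - AA^{\div})Y\|^2 = \|AX\|^2 + \|(I - AA^{\div})Y\|^2$

and

(2) $\|A^{\div}X + (I - A^{\div}A)Y\|^2 = \|A^{\div}X\|^2 + \|(I - A^{\div}A)Y\|^2$.

Proof. Since $A = AA^{\div}A$, $AX = AA^{\div}AX = PZ$ with Z = AX. We now prove that $PZ \perp (I - P)Y$. This is equivalent to $(PZ)^*(I - P)Y = 0$ or $Z^*P^*(I - P)Y = 0$. But $P^* = P$ and $P^2 = (AA^{\div}A)A^{\div} = AA^{\div} = P$. Therefore, $P^*(I - P) = P - P^2 = 0$. The first equality above now follows from Pythagoras' theorem. The second equality can be derived from the first using $(A^{\div})^{\div} = A$.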

Another way of phrasing this work is that P is the projection onto the range space R(A) of A while I - P is the projection onto the orthogonal complement of R(A).

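Theorem 2.8.5.2. Let A be m x n and B be m x 1. Let $X_0 = A^{\div}B$. Then for any n x 1 $X \ne X_0$, we have either

(1) $\|AX - B\| > \|AX_0 - B\|$

or

(2) $\|AX - B\| = \|AX_0 - B\|$ and $\|X\| > \|X_0\|$.

Proof. For any X we have

$AX - B = AX - AA^{\div}B + AA^{\div}B - B = A(X - A^{\div}B) + (I - AA^{\div})(-B)$.

By the previous lemma,

$\|AX - B\|^2 = \|A(X - X_0)\|^2 + \|(I - AA^{\div})(-B)\|^2 = \|A(X - X_0)\|^2 + \|AX_0 - B\|^2 \ge \|AX_0 - B\|^2$.

The equality holds here if and only if $A(X - X_0) = 0$. Hence if $AX \ne AX_0$, inequality (1) holds. Suppose, then, that $AX = AX_0$. Then $A^{\div}AX = A^{\div}AX_0 = A^{\div}AA^{\div}B = A^{\div}B = X_0$. Therefore, $X = X_0 + (X - X_0) = A^{\div}B + (I - A^{\div}A)X$. Hence by (2) of the lemma,

$\|X\|^2 = \|A^{\div}B\|^2 + \|(I - A^{\div}A)X\|^2 = \|X_0\|^2 + \|X - X_0\|^2 > \|X_0\|^2$,

so that

$\|X\| > \|X_0\|$.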

This theorem may be rephrased as follows. Given the system AX = B. Then the vector $A^{\div}B$ is either the unique least squares solution or it is the least squares solution of minimum norm.
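Both cases of the rephrased theorem can be illustrated numerically. The sketch below uses `numpy.linalg.pinv` for $A^{\div}$; the matrices and right-hand side are arbitrary examples chosen for illustration.

```python
import numpy as np

B = np.array([1.0, 0.0, 2.0])

# Case 1: independent columns, so the least squares solution is unique.
A_full = np.array([[1.0, 0.0],
                   [1.0, 1.0],
                   [1.0, 2.0]])
X_full = np.linalg.pinv(A_full) @ B
X_normal = np.linalg.solve(A_full.T @ A_full, A_full.T @ B)  # normal equations

# Case 2: dependent columns, so least squares solutions form a line.
A_def = np.ones((3, 2))
X0 = np.linalg.pinv(A_def) @ B
X_other = X0 + np.array([1.0, -1.0])       # (1, -1) lies in the null space

residual_X0 = np.linalg.norm(A_def @ X0 - B)
residual_other = np.linalg.norm(A_def @ X_other - B)
```

In the second case `X0` and `X_other` give the same residual, but `X0` has the smaller norm, as the theorem asserts.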

PROBLEM

1. A is square and singular. Characterize the solution $A^{\div}B$.

2.9 NORMAL MATRICES, QUADRATIC FORMS, AND FIELD OF VALUES

We record here a number of important facts. By a normal matrix is meant a square matrix A for which

(2.9.1) AA* = A*A.

Examples. Hermitian, skew-Hermitian, and unitary matrices are normal. Hence real symmetric, skew- symmetric, and orthogonal matrices are also normal. All circulants are normal, as we shall see.

Theorem 2.9.1. A is normal if and only if there is a unitary U and diagonal D such that A = U*DU.

Theorem 2.9.2. A is normal if and only if there is a polynomial p(x) such that A* = p(A).

Theorem 2.9.3. A is Hermitian if and only if there is a unitary matrix U and a real diagonal D such that A = U*DU.

Theorem 2.9.4. A is (real) symmetric if and only if there is a (real) orthogonal matrix U and a real diagonal D such that A = U*DU.


PROBLEMS

1. Prove that A is normal if and only if A = R + iS where R and S are real symmetric and commute.

2. Prove that A is normal if and only if in the polar decomposition of A (A = HU with H positive semidefinite Hermitian, U unitary) one has HU = UH.

3. Let A have eigenvalues $\lambda_1, \ldots, \lambda_n$. Prove that A is normal if and only if the eigenvalues of $AA^*$ are $|\lambda_1|^2, |\lambda_2|^2, \ldots, |\lambda_n|^2$.

4. Prove that A is normal if and only if the eigenvalues of $A + A^*$ are $\lambda_1 + \bar{\lambda}_1, \lambda_2 + \bar{\lambda}_2, \ldots, \lambda_n + \bar{\lambda}_n$.

5. If A is normal and p(z) is a polynomial, then p(A) is normal.

6. If A is normal, prove that $A^{\div}$ is normal.

7. If A and B are normal, prove that $A \otimes B$ is normal.

8. Use Theorem 2.9.1 to prove Theorem 2.9.2.

Quadratic Forms. Let M be n x n and let $Z = (z_1, z_2, \ldots, z_n)^T$. By a quadratic form is meant the function of $z_1, \ldots, z_n$ given by

$Z^*MZ = \sum_{i,j=1}^{n} m_{ij}\bar{z}_iz_j$.

It is often of importance to distinguish the quadratic form from a matrix that gives rise to it. The real and the complex cases are essentially different.

Lemma 2.9.5. Let Q be real and square and U a real column. Then $U^TQU = 0$ for all U if and only if $Q = -Q^T$, that is, if and only if Q is skew-symmetric.

Proof.

(a) Let $Q = -Q^T$. If $a = U^TQU$, then $a^T = a = U^TQ^TU = U^T(-Q)U = -a$. Therefore a = 0.

(b) Let $U^TQU = 0$ for all (real) U. Write $Q = Q_1 + Q_2$ where $Q_1$ is symmetric and $Q_2$ is skew-symmetric. Then, for all U,

$0 = U^TQU = U^TQ_1U + U^TQ_2U = U^TQ_1U$.

Since $Q_1$ is symmetric, we have for some orthogonal P and real diagonal matrix $\Lambda$: $Q_1 = P^T\Lambda P$. Therefore for all real U, $U^TP^T\Lambda PU = (PU)^T\Lambda(PU) = 0$. Write $PU = \hat{U} = (\hat{u}_1, \ldots, \hat{u}_n)^T$, $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. Then we have $\sum_{k=1}^{n} \lambda_k(\hat{u}_k)^2 = 0$ for all $(u_1, \ldots, u_n)$, hence for all $(\hat{u}_1, \ldots, \hat{u}_n)$. This clearly implies $\lambda_k = 0$, for k = 1, 2, ..., n. Hence $Q_1 = 0$ and $Q = Q_2$ is skew-symmetric.

Theorem 2.9.6. Let Q and R be real square and U be a real column. Then $U^TQU = U^TRU$ for all U if and only if Q - R is skew-symmetric.

Proof. $U^TQU = U^TRU$ if and only if $U^T(Q - R)U = 0$.

Corollary. Let Q be real and U be a real column. Then

$U^TQU = U^T\left(\tfrac{1}{2}(Q + Q^T)\right)U$.

The matrix $\tfrac{1}{2}(Q + Q^T)$ is known as the symmetrization of Q.

We pass now to the complex case.

Lemma 2.9.7. Let M be a square matrix with complex elements and let Z be a column with complex elements. Then

$Z^*MZ = 0$

for all complex Z if and only if M = 0.

Proof.

(a) The "if" is trivial.

(b) "Only if." Write Z = X + iY, M = R + iS where X, Y, R, S are all real. Then we are given

(2.9.4) $(X^* - iY^*)(R + iS)(X + iY) = 0$ for all real X, Y.

Select Y = 0. Then $X^*(R + iS)X = 0$ for all real X, or $X^*RX = 0$ and $X^*SX = 0$. Therefore, by the first lemma, R and S must be skew-symmetric: $R + R^T = 0$, $S + S^T = 0$. Expanding the product on the left side of (2.9.4), we obtain

$X^*RX + Y^*RY - X^*SY + Y^*SX + i(X^*SX + Y^*SY + X^*RY - Y^*RX) = 0$.

In view of the skew symmetry of R and S and the first lemma, we have $X^*RX = X^*SX = Y^*RY = Y^*SY = 0$. Therefore, we have for all real X, Y:

$Y^*SX - X^*SY = 0$, $X^*RY - Y^*RX = 0$.

Thus, for all real X, Y, $X^*(R - R^*)Y = 0$ and $Y^*(S - S^*)X = 0$. Selecting X and Y as appropriate unit vectors (0, ..., 0, 1, 0, ..., 0), this tells us that $R - R^* = 0$ and $S - S^* = 0$. But $R^* = R^T = -R$ and $S^* = S^T = -S$, therefore R = S = 0 and M = 0.

Theorem 2.9.8. Let M and N be square matrices of order n with complex elements and suppose that

(2.9.5)  $Z^*MZ = Z^*NZ$

for all complex vectors Z. Then M = N.

Proof. As before, $Z^*MZ = Z^*NZ$ if and only if $Z^*(M - N)Z = 0$.

Note that this theorem is false if (2.9.5) holds only for real Z.

Corollary. $Z^*MZ$ is real for all complex Z if and only if M is Hermitian.

Proof. $Z^*MZ$ is real if and only if $Z^*MZ = (Z^*MZ)^* = Z^*M^*Z$. Hence $M = M^*$.

Let M be a Hermitian matrix. It is called positive definite if $Z^*MZ > 0$ for all $Z \ne 0$. It is called positive semidefinite if $Z^*MZ \ge 0$ for all Z. It is called indefinite if there exist $Z_1 \ne 0$ and $Z_2 \ne 0$ such that $Z_1^*MZ_1 > 0 > Z_2^*MZ_2$.

Theorem 2.9.9. Let M be a Hermitian matrix of order n with eigenvalues $\lambda_1, \ldots, \lambda_n$. Then

(a) M is positive definite if and only if $\lambda_k > 0$, k = 1, 2, ..., n.

(b) M is positive semidefinite if and only if $\lambda_k \ge 0$, k = 1, 2, ..., n.

(c) M is indefinite if and only if there are integers j, k, $j \ne k$, with $\lambda_j > 0$, $\lambda_k < 0$.

Field of Values. Let M designate a matrix of order n. The set of all complex numbers $Z^*MZ$ with $\|Z\| = 1$ is known as the field of values of M and is designated by $\mathcal{F}(M)$. $\|Z\|$ designates the Euclidean norm of Z.

The following facts, due to Hausdorff and Toeplitz, are known.

(1) $\mathcal{F}(M)$ is a closed, bounded, connected, convex subset of the complex plane.

(2) The field of values is invariant under unitary transformations:

(2.9.6)  $\mathcal{F}(M) = \mathcal{F}(U^*MU)$,  U = unitary.

(3) If ch M designates the convex hull of the eigenvalues of M, then

(2.9.7)  ch M $\subseteq \mathcal{F}(M)$.

(4) If M is normal, then $\mathcal{F}(M)$ = ch M.

PROBLEMS

1. Show that the field of values of a 2 x 2 matrix M is either an ellipse (circle), a straight line segment, or a single point. More specifically, by Schur's theorem**, if one reduces M unitarily to upper triangular form,

$M = U^*\begin{pmatrix}\lambda_1 & m \\ 0 & \lambda_2\end{pmatrix}U,$

then

(a) M is not normal if and only if $m \ne 0$.

(a') $\lambda_1 \ne \lambda_2$. $\mathcal{F}(M)$ is the interior and boundary of an ellipse with foci at $\lambda_1$, $\lambda_2$; length of minor axis is $|m|$. Length of major axis $(|m|^2 + |\lambda_1 - \lambda_2|^2)^{1/2}$.

(a'') $\lambda_1 = \lambda_2$. $\mathcal{F}(M)$ is the disk with center at $\lambda_1$ and radius $|m|/2$.

(b) M is normal (m = 0).

(b') $\lambda_1 \ne \lambda_2$. $\mathcal{F}(M)$ is the line segment joining $\lambda_1$ and $\lambda_2$.

(b'') $\lambda_1 = \lambda_2$. $\mathcal{F}(M)$ is the single point $\lambda_1$.

REFERENCES

General: Aitken, [1]; Barnett and Story; Bellman, [2]; Browne; Eisele and Mason; Forsythe and Moler; Gantmacher; Lancaster, [1]; MacDuffee; Marcus; Marcus and Minc; Muir and Metzler; Newman; M. Pearl; Pullman; Suprunenko and Tyshkevich; Todd; Turnbull and Aitken.

Vandermonde matrices: Gautschi.

Discrete Fourier transforms: Aho, Hopcroft and Ullman; Carlitz; Davis and Rabinowitz; Fiduccia; Flinn and McCowan; Harmuth; Nussbaumer; Winograd; J. Pearl.

**Any square matrix is unitarily similar to an upper triangular matrix.

Hadamard matrices: Ahmed and Rao; Hall; Harmuth; Wallis, Street, and Wallis.

Generalized inverses: Ben-Israel and Greville; Meyer.

UDV theorem: Ben-Israel and Greville; Forsythe and Moler; Golub and Reinsch (numerical methods).

CIRCULANT MATRICES

3.1 INTRODUCTORY PROPERTIES

By a circulant matrix of order n, or circulant for short, is meant a square matrix of the form

$C = \mathrm{circ}(c_1, c_2, \ldots, c_n) = \begin{pmatrix} c_1 & c_2 & \cdots & c_n \\ c_n & c_1 & \cdots & c_{n-1} \\ \vdots & \vdots & & \vdots \\ c_2 & c_3 & \cdots & c_1 \end{pmatrix}.$

The elements of each row of C are identical to those of the previous row, but are moved one position to the right and wrapped around. The whole circulant is evidently determined by the first row (or column). We may also write a circulant in the form

$C = (c_{jk}) = (c_{k-j+1})$,  subscripts mod n.

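Notice that

$\alpha\,\mathrm{circ}(a_1, \ldots, a_n) + \beta\,\mathrm{circ}(b_1, \ldots, b_n) = \mathrm{circ}(\alpha a_1 + \beta b_1, \ldots, \alpha a_n + \beta b_n),$

so that the circulants form a linear subspace of the set of all matrices of order n. However, as we shall see subsequently, they possess a structure far richer.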

Theorem 3.1.1. Let A be n x n. Then A is a circulant if and only if

(3.1.3)  $A\pi = \pi A.$

The matrix $\pi$ = circ(0, 1, 0, ..., 0). See (2.4.14).

Proof. Write $A = (a_{ij})$ and let the permutation $\sigma$ be the cycle $\sigma$ = (1, 2, ..., n). Then from (2.4.13)

$P_\sigma AP_\sigma^* = (a_{\sigma(i),\sigma(j)})$

where, in the present instance, $P_\sigma = \pi$. But A is evidently a circulant if and only if $a_{ij} = a_{\sigma(i),\sigma(j)}$, that is, if and only if $\pi A\pi^* = A$. This is equivalent to (3.1.3) by (2.4.17).

We may express this as follows: the circulants comprise all the (square) matrices that commute with $\pi$, or are invariant under the similarity $A \to \pi A\pi^{-1}$.

Corollary. A is a circulant if and only if A* is a circulant.

Proof. Star (3.1.3).
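The criterion (3.1.3) is easy to try out on small cases. The sketch below uses only the standard library; the helper names `circ`, `matmul` and `commutes_with_pi`, and the two sample matrices, are illustrative choices made here rather than notation from the text.

```python
def circ(row):
    """Circulant with first row `row`; entry (j, k) is row[(k - j) mod n]."""
    n = len(row)
    return [[row[(k - j) % n] for k in range(n)] for j in range(n)]

def matmul(A, B):
    """Plain matrix product of two square lists-of-lists."""
    return [[sum(a * b for a, b in zip(r, c)) for c in zip(*B)] for r in A]

def commutes_with_pi(A):
    """Criterion (3.1.3): A is a circulant iff A*pi == pi*A (order n >= 2)."""
    n = len(A)
    pi = circ([0, 1] + [0] * (n - 2))   # basic circulant circ(0, 1, 0, ..., 0)
    return matmul(A, pi) == matmul(pi, A)

A = circ([2, -1, 0, 3, 7])                              # a circulant of order 5
N = [[5 * i + j for j in range(5)] for i in range(5)]   # not a circulant

is_circ_A = commutes_with_pi(A)   # True
is_circ_N = commutes_with_pi(N)   # False
```

Integer entries are used so that the comparison of the two products is exact.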

PROBLEMS

1. What are the conditions on $c_j$ in order that circ($c_1, c_2, \ldots, c_n$) be symmetric? Be Hermitian symmetric? Be skew-symmetric? Be diagonal?

2. Call a square matrix A a magic square if its row sums, column sums, and principal diagonal sums are all equal. What are the conditions on $c_j$ in order that circ($c_1, c_2, \ldots, c_n$) be a magic square?

3. Prove that circ(1, 1, 1, -1) is an Hadamard matrix. It has been conjectured that there are no other circulants that are Hadamard matrices. This has been proved for orders $\le$ 12,100. (Best result as of 1978.)

A Second Representation of Circulants. In view of the structure of the permutation matrices $\pi^k$, k = 0, 1, ..., n - 1, it is clear that

(3.1.4)  $\mathrm{circ}(c_1, c_2, \ldots, c_n) = c_1I + c_2\pi + c_3\pi^2 + \cdots + c_n\pi^{n-1}.$

Thus, from (3.1.2), C is a circulant if and only if $C = p(\pi)$ for some polynomial p(z). Associate with the n-tuple $\gamma = (c_1, c_2, \ldots, c_n)$ the polynomial

(3.1.5)  $p_\gamma(z) = c_1 + c_2z + \cdots + c_nz^{n-1}.$

The polynomial $p_\gamma(z)$ will be called the representer of the circulant. The association $\gamma \leftrightarrow p_\gamma(z)$ is obviously linear. (Note: In the literature of signal processing the association $\gamma \leftrightarrow p_\gamma(1/z)$ is known as the z-transform.) The function

(3.1.5')  $\phi(\theta) = \phi_\gamma(\theta) = c_1 + c_2e^{i\theta} + \cdots + c_ne^{i(n-1)\theta}$

is also useful as a representer. Thus,

(3.1.6)  $C = \mathrm{circ}\,\gamma = p_\gamma(\pi).$

Inasmuch as polynomials in the same matrix commute, it follows that all circulants of the same order commute. If C is a circulant so is C*. Hence C and C* commute and therefore all circulants are normal matrices.
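The representation (3.1.6) and its two consequences (commutativity and normality) can be checked on a small example. This is only a sketch under our own naming: `poly_of_matrix` and `conj_transpose` are helpers introduced here, and the first rows `gamma`, `delta` are arbitrary. Entries are small Gaussian integers, so the floating-point comparisons happen to be exact.

```python
def circ(row):
    """Circulant with first row `row`; entry (j, k) is row[(k - j) mod n]."""
    n = len(row)
    return [[row[(k - j) % n] for k in range(n)] for j in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(r, c)) for c in zip(*B)] for r in A]

def poly_of_matrix(coeffs, M):
    """Evaluate c1*I + c2*M + ... + cn*M^(n-1) by Horner's rule."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, M)
        for i in range(n):
            result[i][i] += c
    return result

def conj_transpose(A):
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A))]

gamma = [3, 1 + 2j, 0, -4]       # arbitrary first rows, chosen for illustration
delta = [1, 0, 2j, 5]
pi = circ([0, 1, 0, 0])
C, D = circ(gamma), circ(delta)

is_poly = poly_of_matrix(gamma, pi) == C        # (3.1.6): C = p_gamma(pi)
commute = matmul(C, D) == matmul(D, C)          # circulants of one order commute
Cs = conj_transpose(C)
normal = matmul(C, Cs) == matmul(Cs, C)         # C C* = C* C
```

All three flags come out true for this data.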

PROBLEMS

1. Using the criterion (3.1.3), prove that if A and B are circulants, then AB is a circulant.

2. Prove that if A is a circulant and k is a nonnegative integer, then $A^k$ is a circulant. If A is nonsingular, then this holds when k is a negative integer.

3. A square matrix A is called a "left circulant" or a (-1)-circulant if its rows are obtained from the first row by successive shifts to the left of one position. Prove that A is a left circulant if and only if $A = \pi A\pi$ (see Section 5.1).

4. A generalized permutation matrix is a square matrix with precisely one nonzero element in each row and column. That nonzero element must be +1 or -1. How many generalized permutation matrices of order n are there?

5. Let C be a circulant with integer elements. Suppose that $CC^T = I$. Prove that C is a generalized permutation matrix.

6. Prove that a circulant is symmetric about its main counterdiagonal.

7. Let C = circ($a_1, a_2, \ldots, a_n$). Then, for integer m,

$\pi^mC = \mathrm{circ}(a_{1-m}, a_{2-m}, \ldots, a_{n-m})$,  subscripts mod n.

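8. By a semicirculant of order n is meant a matrix of the form

$C = \begin{pmatrix} c_1 & c_2 & \cdots & c_n \\ 0 & c_1 & \cdots & c_{n-1} \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & c_1 \end{pmatrix}.$

Introduce the matrix

$E = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix}.$

Show that E is nilpotent. Show that C is a semicirculant if and only if it is of the form C = p(E) for some polynomial p(z).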

9. Prove that if (d, n) = (greatest common divisor of d and n) = 1, then C is a circulant if and only if it commutes with $\pi^d$. Hence, in particular, if and only if it commutes with $\pi^*$.

10. Let K[w] designate the ring of polynomials in w of degree < n and with complex coefficients. In K[w] the usual rules of polynomial addition and multiplication are to hold, but higher powers are to be replaced by lower powers using $w^n = 1$. Prove that the mapping $\mathrm{circ}(c_1, c_2, \ldots, c_n) \leftrightarrow c_1 + c_2w + \cdots + c_nw^{n-1}$ [or $\mathrm{circ}\,\gamma \leftrightarrow p_\gamma(w)$] is a ring isomorphism:

(a) If $\alpha$ is a scalar, $\alpha\,\mathrm{circ}\,\gamma \leftrightarrow \alpha p_\gamma(w)$.

(b) $\mathrm{circ}\,\gamma_1 + \mathrm{circ}\,\gamma_2 \leftrightarrow p_{\gamma_1}(w) + p_{\gamma_2}(w)$.

(c) $(\mathrm{circ}\,\gamma_1)(\mathrm{circ}\,\gamma_2) \leftrightarrow p_{\gamma_1}(w)\,p_{\gamma_2}(w)$.

11. Let $\mathrm{circ}\,\gamma \leftrightarrow p_\gamma(w)$. Then $(\mathrm{circ}\,\gamma)^T \leftrightarrow w^np_\gamma(w^{-1})$.

Block Decomposition of Circulants; Toeplitz Matrices. The square matrix $T = (t_{ij})$ of order n is said to be Toeplitz if

(3.1.7)  $t_{ij} = t_{i+1,j+1}$,  i, j = 1, 2, ..., n - 1.

Thus Toeplitz matrices are those that are constant along all diagonals parallel to the principal diagonal.

Example.

It is clear that the Toeplitz matrices of order n form a linear subspace of dimension 2n - 1 of the space of all matrices of order n. It is clear, furthermore, that a circulant is Toeplitz but not necessarily conversely.

A circulant C of composite order n = pq is automatically a block circulant in which each block is Toeplitz. The blocks are of order q, and the arrangement of blocks is p x p.

Example. The circulant of order 6 may be broken up into 3 x 3 blocks of order 2 as follows:

$\mathrm{circ}(a, b, c, d, e, f) = \begin{pmatrix} A & B & C \\ C & A & B \\ B & C & A \end{pmatrix}$

where

$A = \begin{pmatrix} a & b \\ f & a \end{pmatrix}$,  $B = \begin{pmatrix} c & d \\ b & c \end{pmatrix}$,  $C = \begin{pmatrix} e & f \\ d & e \end{pmatrix}.$

It may also be broken up into 2 x 2 blocks each of order 3.

A block circulant is not necessarily a circulant. This circulant may also be written in the form

$I_3 \otimes A + \pi_3 \otimes B + \pi_3^2 \otimes C.$

Quite generally, if C is a circulant of order n = pq, then

$C = I_p \otimes A_1 + \pi_p \otimes A_2 + \cdots + \pi_p^{p-1} \otimes A_p$

where $I_p$, $\pi_p$ are of order p and where the $A_j$ are Toeplitz of order q.

A general Toeplitz matrix T of order n may be embedded in a circulant of order 2n as $\begin{pmatrix} T & * \\ * & T \end{pmatrix}$. See also Chapter 5.
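The block structure of a circulant of order 6 = 3 x 2 can be verified directly. The sketch below is an illustration under assumptions of our own: the numerical first row (1, ..., 6) is arbitrary, and `kron`, `add` and `block` are helper names introduced here. It cuts the circulant into 2 x 2 blocks, checks that each block is Toeplitz, and reassembles the matrix from Kronecker products with powers of the basic circulant of order 3.

```python
def circ(row):
    """Circulant with first row `row`; entry (j, k) is row[(k - j) mod n]."""
    n = len(row)
    return [[row[(k - j) % n] for k in range(n)] for j in range(n)]

def kron(X, Y):
    """Kronecker product: the block matrix whose (i, j) block is X[i][j] * Y."""
    return [[x * y for x in xr for y in yr] for xr in X for yr in Y]

def add(*Ms):
    """Entrywise sum of matrices of equal shape."""
    return [[sum(v) for v in zip(*rows)] for rows in zip(*Ms)]

def block(M, i, j, q=2):
    """The q x q block in block-row i, block-column j (0-based)."""
    return [r[q * j:q * (j + 1)] for r in M[q * i:q * (i + 1)]]

C6 = circ([1, 2, 3, 4, 5, 6])
A, B, C = block(C6, 0, 0), block(C6, 0, 1), block(C6, 0, 2)
I3, P3, P3sq = circ([1, 0, 0]), circ([0, 1, 0]), circ([0, 0, 1])

rebuilt = add(kron(I3, A), kron(P3, B), kron(P3sq, C))
blocks_toeplitz = all(X[0][0] == X[1][1] for X in (A, B, C))
same = rebuilt == C6
```

For this data `same` and `blocks_toeplitz` are both true, and the first block is A = [[1, 2], [6, 1]].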


3.2 DIAGONALIZATION OF CIRCULANTS

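This will follow readily from the diagonalization of the basic circulant $\pi$.

Definition. Let n be a fixed integer $\ge 1$. Let $\omega = \exp(2\pi i/n) = \cos(2\pi/n) + i\sin(2\pi/n)$, $i = \sqrt{-1}$. Let

(3.2.1)  $\Omega = (\Omega_n) = \mathrm{diag}(1, \omega, \omega^2, \ldots, \omega^{n-1}).$

Note that $\Omega^k = \mathrm{diag}(1, \omega^k, \omega^{2k}, \ldots, \omega^{(n-1)k})$.

Theorem 3.2.1

(3.2.2)  $\pi = F^*\Omega F.$

Proof. From (2.5.3), the jth row of $F^*$ is $(1/\sqrt n)(\omega^{(j-1)0}, \omega^{(j-1)1}, \ldots, \omega^{(j-1)(n-1)})$. Hence the jth row of $F^*\Omega$ is $(1/\sqrt n)(\omega^{(j-1)r}\omega^r) = (1/\sqrt n)(\omega^{jr})$, r = 0, 1, ..., n - 1. The kth column of F is $(1/\sqrt n)(\bar\omega^{(k-1)r})$, r = 0, 1, ..., n - 1. Thus the (j, k)th element of $F^*\Omega F$ is

$\frac1n\sum_{r=0}^{n-1}\omega^{jr}\bar\omega^{(k-1)r} = \frac1n\sum_{r=0}^{n-1}\omega^{r(j-k+1)}$, which equals 1 if $j \equiv k - 1 \pmod n$ and 0 if $j \not\equiv k - 1 \pmod n$.

Then (3.2.2) follows.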

Now

(3.2.3)  $C = \mathrm{circ}\,\gamma = p_\gamma(\pi) = p_\gamma(F^*\Omega F) = F^*p_\gamma(\Omega)F.$

Thus we arrive at the fundamental

Theorem 3.2.2. If C is a circulant, it is diagonalized by F. More precisely,

(3.2.4)  $C = F^*\Lambda F$

where

(3.2.5)  $\Lambda = \Lambda_C = \mathrm{diag}(p_\gamma(1), p_\gamma(\omega), \ldots, p_\gamma(\omega^{n-1})).$

The eigenvalues of C are therefore

(3.2.6)  $\lambda_j = p_\gamma(\omega^{j-1})$,  j = 1, 2, ..., n.

(Note: The eigenvalues need not be distinct.) The columns of $F^*$ are a universal set of (right) eigenvectors for all circulants. They may be written as $F^*(0, \ldots, 0, 1, 0, \ldots, 0)^T$.
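Theorem 3.2.2 lends itself to a direct numerical check. In the sketch below (first row `gamma` chosen arbitrarily, helper names our own) each column of $F^*$ is multiplied by C and compared with the corresponding multiple $p_\gamma(\omega^{j})$ of itself; the loop index j runs from 0, so it plays the role of j - 1 in (3.2.6).

```python
import cmath

def circ(row):
    """Circulant with first row `row`; entry (j, k) is row[(k - j) mod n]."""
    n = len(row)
    return [[row[(k - j) % n] for k in range(n)] for j in range(n)]

gamma = [4, 1, 0, -2, 3, 5]          # arbitrary first row, for illustration
n = len(gamma)
C = circ(gamma)
omega = cmath.exp(2j * cmath.pi / n)

def representer(z):
    """p_gamma(z) = c1 + c2 z + ... + cn z^(n-1)."""
    return sum(c * z**k for k, c in enumerate(gamma))

residuals = []
for j in range(n):
    lam = representer(omega**j)                       # candidate eigenvalue
    v = [omega**(i * j) / n**0.5 for i in range(n)]   # a column of F*
    Cv = [sum(C[r][k] * v[k] for k in range(n)) for r in range(n)]
    residuals.append(max(abs(Cv[r] - lam * v[r]) for r in range(n)))

max_residual = max(residuals)   # only rounding error remains
```

The residual is at the level of floating-point rounding, consistent with $Cv = \lambda v$ for every column of $F^*$.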

We have conversely

Theorem 3.2.3. Let $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$; then $C = F^*\Lambda F$ is a circulant.

Proof. By the fundamental theorem of polynomial interpolation, we can find a unique polynomial r(z) of degree $\le n - 1$, $r(z) = d_1 + d_2z + \cdots + d_nz^{n-1}$, such that $r(\omega^{j-1}) = \lambda_j$, j = 1, 2, ..., n. Now, form $D = \mathrm{circ}(d_1, d_2, \ldots, d_n)$. It follows that $D = F^*\Lambda F = C$, so that C is a circulant.

With regard to the diagonalization (3.2.4), it should be observed that there is really no "natural" order for the eigenvalues of a matrix. Corresponding to every permutation of eigenvalues, there will be a unitary matrix $\hat F$ for which a formula analogous to (3.2.4) will be valid.

More precisely, let $C = F^*\Lambda F$ and let $P_\sigma$ be the permutation matrix corresponding to the permutation $\sigma$. Then $C = F^*(P_\sigma^*P_\sigma)\Lambda(P_\sigma^*P_\sigma)F = (F^*P_\sigma^*)(P_\sigma\Lambda P_\sigma^*)(P_\sigma F)$. Now if $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ and $L = (\lambda_1, \ldots, \lambda_n)^T$, then from (2.4.19), $P_\sigma\Lambda P_\sigma^* = \mathrm{diag}(P_\sigma L)$. If we now let $\hat F$ be the unitary matrix $\hat F = P_\sigma F$, we have

$C = \hat F^*\,\mathrm{diag}(P_\sigma L)\,\hat F.$

We have found it to be convenient to standardize the order of the eigenvalues in the way we have done, leading to (3.2.4).

Let us exhibit the solution of this interpolation problem more explicitly. Write

$L = (\lambda_1, \ldots, \lambda_n)^T$ and $\gamma^T = (c_1, \ldots, c_n)^T$.

Then, from (2.5.1) and (2.5.14),

(3.2.7)  $\gamma^T = n^{-1/2}FL$

and

(3.2.8)  $p_\gamma(z) = n^{-1/2}(1, z, \ldots, z^{n-1})FL.$

Also,

(3.2.9)  $\Lambda = n^{1/2}\,\mathrm{diag}(F^*\gamma^T).$

Since $F^2 = \Gamma$ and $FF^* = I$, one also has the identity

(3.2.10)  $F\gamma^T = F^2(F^*\gamma^T) = n^{-1/2}\Gamma L.$

On the basis of the fundamental representation (3.2.4), it is now easy to establish that

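Theorem 3.2.4. If A and B are circulants of order n and $\alpha_k$ are scalars, then $A^T$, $A^*$, $\alpha_1A + \alpha_2B$, $AB$, $\sum_{k=0}^{r}\alpha_kA^k$ are circulants. Moreover, A and B commute. If A is nonsingular, its inverse is a circulant. With $A = F^*\Lambda F$, $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, its inverse is given by

$A^{-1} = F^*\Lambda^{-1}F$

where

$\Lambda^{-1} = \mathrm{diag}(\lambda_1^{-1}, \ldots, \lambda_n^{-1}).$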

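Since

$(\mathrm{circ}(c_1, c_2, \ldots, c_n))^T = \mathrm{circ}(c_1, c_n, c_{n-1}, \ldots, c_2)$ and $(c_1, c_n, c_{n-1}, \ldots, c_2)^T = \Gamma(c_1, c_2, \ldots, c_n)^T$,

if we write

$\gamma = (c_1, c_2, \ldots, c_n)$, we have

$(\mathrm{circ}\,\gamma)^T = \mathrm{circ}\,(\Gamma\gamma^T)^T.$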
The determinant of a square matrix is the product of its eigenvalues. Therefore from (3.2.6),

(3.2.14)  $\det(\mathrm{circ}\,\gamma) = \det\mathrm{circ}(c_1, c_2, \ldots, c_n) = \prod_{j=1}^n p_\gamma(\omega^{j-1}).$
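Formula (3.2.14) can be compared with a determinant computed by elimination. The following is a sketch only: the first row (2, -1, 3, 5) is an arbitrary choice, and `det` is a small Gaussian-elimination helper written here for the purpose, not a library routine.

```python
import cmath

def circ(row):
    """Circulant with first row `row`; entry (j, k) is row[(k - j) mod n]."""
    n = len(row)
    return [[row[(k - j) % n] for k in range(n)] for j in range(n)]

def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [[complex(x) for x in row] for row in M]
    n, d = len(A), 1
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[p][c]) == 0:
            return 0
        if p != c:
            A[c], A[p] = A[p], A[c]
            d = -d                      # a row swap changes the sign
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
    return d

gamma = [2, -1, 3, 5]
n = len(gamma)
omega = cmath.exp(2j * cmath.pi / n)

def representer(z):
    return sum(c * z**k for k, c in enumerate(gamma))

product = 1
for j in range(n):
    product *= representer(omega**j)    # product of the eigenvalues (3.2.6)

det_direct = det(circ(gamma))
```

For this first row the eigenvalues are 9, -1 - 6i, 1 and -1 + 6i, so both `product` and `det_direct` agree with 9 x 37 = 333 up to rounding.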

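If

$f(z) = a_0z^m + a_1z^{m-1} + \cdots + a_m$,  $a_0 \ne 0$,

$g(z) = b_0z^n + b_1z^{n-1} + \cdots + b_n$,  $b_0 \ne 0$

and have roots $\alpha_1, \ldots, \alpha_m$; $\beta_1, \ldots, \beta_n$ respectively, the resultant R(f, g) of f and g is defined by

$R(f, g) = a_0^n\,g(\alpha_1)g(\alpha_2)\cdots g(\alpha_m) = (-1)^{mn}\,b_0^m\,f(\beta_1)f(\beta_2)\cdots f(\beta_n) = (-1)^{mn}R(g, f).$

Thus, with $f(z) = z^n - 1$, $g(z) = p_\gamma(z)$, we have

$\det\mathrm{circ}\,\gamma = R(z^n - 1, p_\gamma(z)) = c_n^n\prod_{k=1}^{n-1}(\mu_k^n - 1)$

where $\mu_1, \ldots, \mu_{n-1}$ are the roots of $p_\gamma(z)$.

In this way, det circ $\gamma$ is expressed as the resultant of the two polynomials $z^n - 1$ and $p_\gamma(z)$.

In the case of real elements, the representation (3.2.14) may be simplified somewhat. Let $\gamma = (c_1, c_2, \ldots, c_n)$, $p_\gamma(z) = c_1 + c_2z + \cdots + c_nz^{n-1}$, $\omega = \exp(2\pi i/n)$. Then

$\bar\omega^j = \omega^{n-j}$, since $\omega^{n-j}\omega^j = \omega^n = 1$,

and therefore, with c's real,

$\overline{p_\gamma(\omega^j)} = p_\gamma(\bar\omega^j) = p_\gamma(\omega^{n-j}).$

If now n = 2r + 1 = odd, then

$\det\mathrm{circ}\,\gamma = \prod_{j=0}^{n-1}p_\gamma(\omega^j) = p_\gamma(1)\prod_{j=1}^r|p_\gamma(\omega^j)|^2.$

If n = 2r + 2 = even,

$\det\mathrm{circ}\,\gamma = p_\gamma(1)\,p_\gamma(-1)\prod_{j=1}^r|p_\gamma(\omega^j)|^2.$

Corollary. Let $\gamma = (c_1, c_2, \ldots, c_n)$ have real components. If n is odd, then $\sum_{j=1}^n c_j \ge 0$ implies det circ $\gamma \ge 0$.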

If n is even and n = 2r + 2, then

Proof. We have $p_\gamma(1) = \sum_{j=1}^n c_j$ and $p_\gamma(-1) = -\sum_{j=1}^n (-1)^jc_j$. Since $|p_\gamma(\omega^j)|^2 \ge 0$, the odd case is immediate. For the even case, note that

Conditions for det circ $\gamma$ > 0 or for det circ $\gamma$ < 0 are easily formulated.

A square matrix is called nondefective or simple if the multiplicity of each of its distinct eigenvalues equals its geometric multiplicity. By geometric multiplicity of an eigenvalue is meant the maximal number of linearly independent (right) eigenvectors associated with that eigenvalue. A matrix is simple, therefore, if and only if its right eigenvectors span $\mathbb{C}^n$. Equivalently, a matrix is simple if and only if it is diagonalizable. It follows from Theorem 3.2.2 that all circulants are simple.

As we have seen, all circulants are diagonalized by the Fourier matrix, and the Fourier matrix is a particular instance of a Vandermonde matrix. It is therefore of interest to ask: what are the matrices that are diagonalized by Vandermonde matrices?

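Toward this end, we recall the following definition. Let

(3.2.15)  $\phi(x) = x^n - a_{n-1}x^{n-1} - a_{n-2}x^{n-2} - \cdots - a_1x - a_0$

be a monic polynomial of degree n. The companion matrix of $\phi$, $C_\phi$, is defined by

(3.2.16)  $C_\phi = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ a_0 & a_1 & a_2 & \cdots & a_{n-1} \end{pmatrix}.$

It is well known and easily verified that the characteristic polynomial of $C_\phi$ is precisely $\phi(x)$. Hence, if $\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$ are the eigenvalues of $C_\phi$, we have

(3.2.17)  $\phi(\alpha_j) = 0$,  j = 0, 1, ..., n - 1.

Theorem 3.2.5. Let $V = V(\alpha_0, \alpha_1, \ldots, \alpha_{n-1})$ designate the Vandermonde formed with $\alpha_0, \ldots, \alpha_{n-1}$ [see (2.5.12)]. Let $D = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_{n-1})$. Then

(3.2.18)  $VD = C_\phi V.$

If the $\alpha_j$ are distinct, V is nonsingular, which gives us the diagonalization

(3.2.19)  $C_\phi = VDV^{-1}.$

Hence, for any polynomial p(z),

(3.2.20)  $p(C_\phi) = V\,p(D)\,V^{-1} = V\,\mathrm{diag}(p(\alpha_0), \ldots, p(\alpha_{n-1}))\,V^{-1}.$

Proof. A direct computation shows that the first n - 1 rows of VD and of $C_\phi V$ are identical. Now the element in the (n, j) position of VD computes out to be $\alpha_{j-1}^n$. The element in the (n, j) position of $C_\phi V$ computes out to be

$a_0 + a_1\alpha_{j-1} + a_2\alpha_{j-1}^2 + \cdots + a_{n-1}\alpha_{j-1}^{n-1}.$

By (3.2.15) this is $\alpha_{j-1}^n - \phi(\alpha_{j-1})$, and by (3.2.17) this reduces to $\alpha_{j-1}^n$. Therefore $VD = C_\phi V$.

Since $\det V = \prod_{i<j}(\alpha_j - \alpha_i)$, it follows that V is nonsingular if and only if the $\alpha_j$ are distinct. In this case we can arrive at (3.2.19).

Example. If we select $\phi(x) = x^n - 1$, then $C_\phi = \pi$. The roots of $\phi$ are $\omega^j$, j = 0, 1, ..., n - 1, and V is a scaled version of $F^*$. Since all polynomials in $C_\phi = \pi$ are circulants and vice versa, (3.2.20) reduces to (3.2.4).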
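The identity $VD = C_\phi V$ of Theorem 3.2.5 can be seen on a cubic with known roots. The sketch below rests on two assumptions that should be kept in mind: the companion matrix is taken with ones on the superdiagonal and last row $(a_0, a_1, \ldots, a_{n-1})$, which is the layout the proof above requires, and the Vandermonde matrix is taken with the powers running down each column. The polynomial $\phi(x) = (x-1)(x-2)(x-3)$ is our own choice.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(r, c)) for c in zip(*B)] for r in A]

# phi(x) = x^3 - 6x^2 + 11x - 6 = x^3 - a2 x^2 - a1 x - a0
a = [6, -11, 6]            # (a0, a1, a2) in the sign convention of (3.2.15)
alphas = [1, 2, 3]         # the roots of phi
n = len(a)

# Assumed layout: ones on the superdiagonal, last row (a0, ..., a_{n-1}).
companion = [[1 if k == j + 1 else 0 for k in range(n)] for j in range(n - 1)] + [a]
# Vandermonde with row i holding the i-th powers of the roots.
V = [[alpha**i for alpha in alphas] for i in range(n)]
D = [[alphas[i] if i == j else 0 for j in range(n)] for i in range(n)]

lhs = matmul(V, D)             # V D
rhs = matmul(companion, V)     # C_phi V
identity_holds = lhs == rhs
```

Both products equal the matrix with rows (1, 2, 3), (1, 4, 9), (1, 8, 27), so `identity_holds` is true.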

Let us note another consequence of (3.2.2) which is of interest.

Let P (= $P_\sigma$) be the permutation matrix corresponding to the permutation $\sigma$. From (2.4.11) we know that PP* = P*P = I, so that P is unitary and normal. It follows from general theory that P is unitarily diagonalizable. It is often useful to be able to exhibit this diagonalization explicitly.

In Section 2.4, we arrived at the following identity. Let $\sigma$ be factored into the product of disjoint cycles of lengths $p_1, p_2, \ldots, p_m$. Then, by (2.4.25), there is a permutation matrix R such that

$RPR^* = \pi_{p_1} \oplus \pi_{p_2} \oplus \cdots \oplus \pi_{p_m}.$

From (3.2.2),

$\pi_{p_j} = F_{p_j}^*\Omega_{p_j}F_{p_j}$,  j = 1, 2, ..., m,

where $F_{p_j}$ and $\Omega_{p_j}$ are the Fourier and $\Omega$ matrices of order $p_j$. Thus if we set

(3.2.21)  $U = F_{p_1} \oplus F_{p_2} \oplus \cdots \oplus F_{p_m}$,  $\Lambda = \Omega_{p_1} \oplus \Omega_{p_2} \oplus \cdots \oplus \Omega_{p_m}$,

we have

$RPR^* = U^*\Lambda U,$

so that

$P = R^*U^*\Lambda UR = (UR)^*\Lambda(UR).$

Observe that $\Lambda$ is diagonal and U, and hence UR, are unitary.

PROBLEMS

1. If A and B are square and AB is a circulant, are A and B circulants?

2. If is a circulant, is A a circulant?

3. Diagonalize J = circ(1, 1, ..., 1).

4. Diagonalize circ(a, a + h, a + 2h, ..., a + (n - 1)h). Find its determinant.

5. Diagonalize $\mathrm{circ}(a, ah, ah^2, \ldots, ah^{n-1})$. Find its determinant.

6. Diagonalize circ(1, 3, 6, 10, ..., n(n + 1)/2).

7. Diagonalize A = pI + qJ. J is as in Problem 3. Find det A.

8. In Problem 7, prove that if p > 0 and p + nq > 0, A is positive definite symmetric.

9. Diagonalize circ(1, s, 0, 0, ..., 0, S).

10. Let C be a circulant with eigenvalues $\lambda_k$. Show that $C^T = F^*\,\mathrm{diag}(\lambda_1, \lambda_n, \lambda_{n-1}, \ldots, \lambda_2)F$.

11. Diagonalize the checkerboard circulant circ(01 01 01 ... 01).

12. Diagonalize circ(001 001 001).

13. Diagonalize circ(0, 1/2, 0, 0, ..., 0, 1/2) = 1/2($\pi + \pi^*$). (Random walk on a circle. One-dimensional lattice.)

14. Analyze circ(0, p, 0, ..., 0, q), p + q = 1.

15. Prove that a circulant C is real and has eigenvalues $\lambda_j$ if and only if $\bar\lambda_j = \lambda_{n+1-j}$, j = 1, 2, ..., n.


Let

$G_2 = \tfrac18\,\mathrm{circ}(5,\ 1 + \sqrt2,\ -1,\ 1 - \sqrt2,\ 1,\ 1 - \sqrt2,\ -1,\ 1 + \sqrt2).$

Show that $G_2$ and $G_3$ are symmetric circulants and that $G_2G_3 = G_3G_2 = G_2$.

Let A, B be circulants of order n with eigenvalues $\lambda_{A,j}$, $\lambda_{B,j}$, j = 1, 2, ..., n. Prove that AB = A if and only if $\lambda_{B,j} = 1$ whenever $\lambda_{A,j} \ne 0$.

Prove that a circulant is Hermitian if and only if its eigenvalues are real.

Prove that a circulant is unitary if and only if its eigenvalues lie on the unit circle.

Prove that a circulant is Hermitian positive definite if and only if its eigenvalues are positive.

Prove that circ($c_1, c_2, \ldots, c_n$) has all row and column sums equal to a if and only if $\sum_{k=1}^n c_k = a$.

Prove that if A is normal and has all row sums equal to a, then all column sums equal a.

Prove that A is normal if and only if there exists a unitary U and a circulant C such that A = U*CU. In other words, A is normal if and only if it is the unitary transform of a circulant.

A matrix M is said to be periodic if there exists $p \ge 1$ such that $M^p = I$. Find all the circulants of order n that satisfy this equation.

Prove that det circ(x, 1, 1, 1, 1) = $(x + 4)(x - 1)^4$.

Prove that $\det\mathrm{circ}(a_1, a_2, a_3, 0, 0, \ldots, 0) = a_1^n + a_3^n - \zeta_1^n - \zeta_2^n$ where $\zeta_1$ and $\zeta_2$ are the roots of $x^2 + a_2x + a_1a_3 = 0$.

Prove that

det circ(a, a, ..., a; b, b, ..., b) = $(ma + nb)(a - b)^{m+n-1}$ if (m, n) = 1, and = 0 if (m, n) > 1.

Here m = number of a's, n = number of b's, and (m, n) = greatest common divisor of m and n.

Prove that

$\det\mathrm{circ}(1, a, a^2, \ldots, a^{r-1}, 0, 0, \ldots, 0)$

(O. Ore.)

Prove that

$\det\mathrm{circ}(a_0, a_1, a_2, 0, 0, \ldots, 0) = a_0^n + a_2^n - \sum_{s=0}^{[n/2]}(-1)^{n+s}\frac{n}{n-s}\binom{n-s}{s}(a_0a_2)^s\,a_1^{n-2s}.$

(O. Ore.)

The matrix circ(1, -2, 1, 0, 0, ..., 0) occurs in the theory of morphogenesis (diffusion on a circle). Diagonalize it. Generalize; for example, circ(1, -3, 3, -1, 0, ..., 0), circ(1, -4, 6, -4, 1, 0, 0, ..., 0).

Let $c_2 = c_{n+1} = c_{N-n+1} = c_N = 1$. All other c's = 0. Find the eigenvalues of circ($c_1, c_2, \ldots, c_N$). (Two-dimensional lattice.)

Let p(z) be the representer of the circulant C. Prove that C is idempotent ($C^2 = C$) if and only if $p(\omega^j) = 0$ or 1 for j = 0, 1, ..., n - 1.

If A is square, of order n, define per(A) as the determinantal expansion of A in n! terms where all the minus signs have been changed to plus. For example, $\mathrm{per}\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad + bc$; per(A) is called the permanent of A. Let $D_n$ = per(J - I) with J as in Problem 3. Prove that

(For this and applications of circulants to combinatorial problems, see Minc.)

3.2.1 Skew Circulants

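A skew circulant matrix is a circulant followed by a change in sign to all the elements below the main diagonal.

Example

(3.2.1.1)  $\mathrm{scirc}(a, b, c, d) = \begin{pmatrix} a & b & c & d \\ -d & a & b & c \\ -c & -d & a & b \\ -b & -c & -d & a \end{pmatrix}.$

In the same way that the theory of circulants is related to the matrix $\pi$, the theory of skew circulants is related to the matrix

$\eta = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -1 & 0 & 0 & \cdots & 0 \end{pmatrix}.$

The main development of the theory is given in the next group of problems, and the solutions can be carried out along the lines already indicated for circulants. Skew circulants have also been called negacyclic matrices.

The notion can be extended somewhat by using the matrix

$\eta_k = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ k & 0 & 0 & \cdots & 0 \end{pmatrix}$

where |k| = 1. A {k}-circulant is one which commutes with $\eta_k$. For k = 1, k = -1 we obtain the circulants and skew circulants respectively. Representations analogous to those given in the Problems are valid.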

PROBLEMS

3. A is a skew circulant if and only if $A\eta = \eta A$.

4. The characteristic polynomial of $\eta$ is $(-1)^n(\lambda^n + 1)$, and its eigenvalues are $\sigma, \sigma\omega, \sigma\omega^2, \ldots, \sigma\omega^{n-1}$ where

$\sigma = \cos\frac{\pi}{n} + i\sin\frac{\pi}{n}$,  $\omega = \sigma^2 = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}$.

Note that $\bar\sigma = \sigma^{-1}$.

5. The eigenvectors of $\eta$ corresponding to these roots are $(1, \sigma, \sigma^2, \ldots, \sigma^{n-1})^T$, $(1, \sigma\omega, (\sigma\omega)^2, \ldots, (\sigma\omega)^{n-1})^T$, $(1, \sigma\omega^2, (\sigma\omega^2)^2, \ldots, (\sigma\omega^2)^{n-1})^T$, ..., $(1, \sigma\omega^{n-1}, (\sigma\omega^{n-1})^2, \ldots, (\sigma\omega^{n-1})^{n-1})^T$.

6. The eigenvalues of scirc($a_1, a_2, \ldots, a_n$) are

$p(\sigma\omega^j)$,  j = 0, 1, ..., n - 1,

where $p(z) = a_1 + a_2z + a_3z^2 + \cdots + a_nz^{n-1}$.

7. Define $\Omega^{1/2} = \mathrm{diag}(1, \sigma, \sigma^2, \ldots, \sigma^{n-1})$, $\Omega = \mathrm{diag}(1, \omega, \omega^2, \ldots, \omega^{n-1})$. $\Omega^{1/2}$ and $\Omega$ are unitary. Moreover,

$\eta = (F\Omega^{-1/2})^*(\sigma\Omega)(F\Omega^{-1/2}).$

8. S is a skew circulant if and only if it is of the form $S = (F\Omega^{-1/2})^*\Lambda(F\Omega^{-1/2})$ where $\Lambda$ is diagonal.

9. S is a skew circulant if and only if it is of the form $S = \Omega^{1/2}C\,\Omega^{-1/2}$, where C is a circulant.

11. If S, V are skew circulants and q(z) is a polynomial in z, then $S^T$, $S^*$, SV, q(S), $S^{\div}$ (cf. Theorem 3.3.1), $S^{-1}$ (if it exists) are skew circulants. Moreover, S and V commute.
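The eigenvalue statements in the problems on skew circulants can be tested numerically before they are proved. The sketch below assumes the sign pattern of (3.2.1.1), that is, a circulant with the sign changed strictly below the main diagonal; the helper name `scirc` follows the text, while the first row (1, 2, 3, 4) is an arbitrary choice.

```python
import cmath

def scirc(row):
    """Skew circulant: circulant pattern, sign changed below the main diagonal."""
    n = len(row)
    return [[row[(k - j) % n] if k >= j else -row[(k - j) % n] for k in range(n)]
            for j in range(n)]

a = [1, 2, 3, 4]
n = len(a)
S = scirc(a)
sigma = cmath.exp(1j * cmath.pi / n)     # sigma^n = -1
omega = sigma**2

residuals = []
for j in range(n):
    tau = sigma * omega**j
    lam = sum(c * tau**k for k, c in enumerate(a))    # p(sigma * omega^j)
    v = [tau**i for i in range(n)]                    # (1, tau, tau^2, ...)^T
    Sv = [sum(S[r][k] * v[k] for k in range(n)) for r in range(n)]
    residuals.append(max(abs(Sv[r] - lam * v[r]) for r in range(n)))

max_residual = max(residuals)   # only rounding error remains
```

The last row of `S` is (-2, -3, -4, 1), matching the pattern of the example, and every vector $(1, \tau, \ldots, \tau^{n-1})^T$ with $\tau = \sigma\omega^j$ is an eigenvector with eigenvalue $p(\tau)$ to within rounding.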

3.3. MULTIPLICATION AND INVERSION OF CIRCULANTS

Since a circulant is determined by its first row, it is really a "one-dimensional" rather than a "two-dimensional" object. The product of two circulants is itself a circulant, so that a good fraction of the arithmetic normally carried out in matrix multiplication is redundant. For circulants of low order, multiplication can be performed with pencil and paper using the abbreviated scheme sketched below.

Product of two circulants:

$\begin{pmatrix} 1 & 2 & 4 \\ 4 & 1 & 2 \\ 2 & 4 & 1 \end{pmatrix}\begin{pmatrix} 4 & 5 & 6 \\ 6 & 4 & 5 \\ 5 & 6 & 4 \end{pmatrix} = \begin{pmatrix} 36 & 37 & 32 \\ 32 & 36 & 37 \\ 37 & 32 & 36 \end{pmatrix}$

Abridged multiplication:

(1, 2, 4) x (4, 5, 6):  1(4, 5, 6) + 2(6, 4, 5) + 4(5, 6, 4) = (36, 37, 32).

It is seen from this that the multiplication of two circulants of order n can be carried out in at most $n^2$ multiplications and n(n - 1) additions.

However, using fast Fourier transform techniques, the order of magnitude $n^2$ may be improved to O(n log n). Recall the relationship between the first row $\gamma$ of a circulant C = circ $\gamma$ = circ($c_1, c_2, \ldots, c_n$) and its eigenvalues $\lambda_1, \ldots, \lambda_n$. From (3.2.7) we have

(3.3.1)  $\gamma^T = n^{-1/2}F(\lambda_1, \ldots, \lambda_n)^T.$

Now let A have first row $\alpha$ and eigenvalues $\lambda_{A,1}, \ldots, \lambda_{A,n}$ and B have first row $\beta$ and eigenvalues $\lambda_{B,1}, \ldots, \lambda_{B,n}$. Let the product AB have first row $\gamma$. Then

(3.3.2)  $A = \mathrm{circ}\,\alpha = F^*\,\mathrm{diag}(\lambda_{A,1}, \ldots, \lambda_{A,n})F$,  $B = \mathrm{circ}\,\beta = F^*\,\mathrm{diag}(\lambda_{B,1}, \ldots, \lambda_{B,n})F$,

so that

(3.3.3)  $AB = \mathrm{circ}\,\gamma = F^*\,\mathrm{diag}(\lambda_{A,1}\lambda_{B,1}, \ldots, \lambda_{A,n}\lambda_{B,n})F.$

Now from (3.3.1)

$n^{1/2}F^*\alpha^T = (\lambda_{A,1}, \ldots, \lambda_{A,n})^T$,

$n^{1/2}F^*\beta^T = (\lambda_{B,1}, \ldots, \lambda_{B,n})^T$.

Therefore, we have

(3.3.4)  $\gamma^T = n^{1/2}F[(F^*\alpha^T) \circ (F^*\beta^T)].$

The symbol $\circ$ is used to designate element-by-element product of two vectors.

Thus the multiplication of two circulants can be effected by three Fourier transforms plus O(n) ordinary multiplications. Since it is known that fast techniques permit a Fourier transform to be carried out in O(n log n) multiplications, it follows that circulant-by-circulant multiplication can be done in O(n log n) multiplications.
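The transform route to the product can be sketched in a few lines. The code below is an illustration, not a fast implementation: `transform` is a naive O(n^2) discrete Fourier transform written here so that the block is self-contained, and in practice an FFT routine would take its place. With sign +1 it applies $n^{1/2}F^*$ and with sign -1 it applies $n^{1/2}F$, which is the assumption under which it matches (3.3.4).

```python
import cmath

def circ(row):
    """Circulant with first row `row`; entry (j, k) is row[(k - j) mod n]."""
    n = len(row)
    return [[row[(k - j) % n] for k in range(n)] for j in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(r, c)) for c in zip(*B)] for r in A]

def transform(x, sign):
    """Naive DFT: sign=+1 applies sqrt(n) F*, sign=-1 applies sqrt(n) F."""
    n = len(x)
    w = cmath.exp(sign * 2j * cmath.pi / n)
    return [sum(x[k] * w**(j * k) for k in range(n)) for j in range(n)]

def circulant_product_row(alpha, beta):
    """First row of circ(alpha) circ(beta), following (3.3.4)."""
    n = len(alpha)
    lam_a = transform(alpha, +1)                  # eigenvalues of A
    lam_b = transform(beta, +1)                   # eigenvalues of B
    lam = [x * y for x, y in zip(lam_a, lam_b)]   # element-by-element product
    return [z / n for z in transform(lam, -1)]    # back to a first row

gamma = circulant_product_row([1, 2, 4], [4, 5, 6])
gamma_rounded = [round(z.real) for z in gamma]    # [36, 37, 32]
agrees = circ(gamma_rounded) == matmul(circ([1, 2, 4]), circ([4, 5, 6]))
```

The rounded result (36, 37, 32) is the first row of the product of circ(1, 2, 4) and circ(4, 5, 6) obtained by ordinary matrix multiplication.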

It would be interesting to know, using specific computer programs, just where the crossover value of n is between naive abridged multiplication and fast Fourier techniques.

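Moore-Penrose Inverse. For scalar $\lambda$ set

(3.3.5)  $\lambda^{\div} = 1/\lambda$ if $\lambda \ne 0$,  $\lambda^{\div} = 0$ if $\lambda = 0$,

and for $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$ set

(3.3.6)  $\Lambda^{\div} = \mathrm{diag}(\lambda_1^{\div}, \lambda_2^{\div}, \ldots, \lambda_n^{\div}).$

Theorem 3.3.1. If C is the circulant $C = F^*\Lambda F$, then its Moore-Penrose generalized inverse (M-P inverse) is the circulant

(3.3.7)  $C^{\div} = F^*\Lambda^{\div}F.$

Proof. The four conditions of Section 2.8.2 are immediately verifiable for $C^{\div}$ (or see Theorem 2.8.3.2).

Corollary

(3.3.8)  $C^{\div} = \sum_{k=1}^n \lambda_k^{\div}B_k$

where $B_k$ are the matrices $B_k = F^*\Lambda_kF$, $\Lambda_k = \mathrm{diag}(0, 0, \ldots, 0, 1, 0, \ldots, 0)$. In particular,

(3.3.9)  $B_k^{\div} = B_k.$

Circulants of Rank n - r, $1 \le r \le n$. Insofar as a circulant is diagonalizable, a circulant of rank n - 1 has precisely one zero eigenvalue. If $C = F^*\Lambda F$, then C has rank n - 1 if and only if for some integer j, $1 \le j \le n$,

(3.3.10)  $\Lambda = \mathrm{diag}(\mu_1, \ldots, \mu_{j-1}, 0, \mu_{j+1}, \ldots, \mu_n)$

with $\mu_i \ne 0$, $i \ne j$. Now,

(3.3.11)  $\Lambda^{\div} = \mathrm{diag}(\mu_1^{-1}, \ldots, \mu_{j-1}^{-1}, 0, \mu_{j+1}^{-1}, \ldots, \mu_n^{-1})$

and $C^{\div} = F^*\Lambda^{\div}F$, so that

(3.3.12)  $CC^{\div} = C^{\div}C = F^*\,\mathrm{diag}(1, 1, \ldots, 1, 0, 1, \ldots, 1)F,$

where 0 occurs in the jth position. From this it follows that

(3.3.13)  $CC^{\div} = C^{\div}C = F^*(I - \Lambda_j)F = I - F^*\Lambda_jF = I - B_j.$

The $B_j$ are the matrices given by (3.3.8). For circulants of rank n - 2, one has

(3.3.14)  $CC^{\div} = C^{\div}C = I - B_j - B_k$

for some j, k, $j \ne k$.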

PROBLEMS

1. Let A, X, B be of order n. Let A and B be circulants. Prove that AX = B has a solution if and only if, wherever an eigenvalue of B is not 0, the corresponding eigenvalue of A is not 0. In this case, there is a solution X that is a circulant.

2. Let A, B be circulants of order n with eigenvalues $\lambda_1, \ldots, \lambda_n$; $\mu_1, \ldots, \mu_n$. Let p(x, y) be a polynomial in x, y. Prove that the eigenvalues of p(A, B) are precisely $p(\lambda_j, \mu_j)$, j = 1, 2, ..., n.

Remark: A theorem of Frobenius says that if A and B commute, then the eigenvalues of p(A, B) are precisely $p(\lambda_j, \mu_j)$, j = 1, 2, ..., n for some pairing of the eigenvalues. This has been generalized by numerous authors.

Circulant Inverses, Continued. Let C = circ($a_1, a_2, \ldots, a_n$) and let

(3.3.15)  $p(z) = a_1 + a_2z + \cdots + a_nz^{n-1}$

be its representer. From (3.1.4) one has

(3.3.16)  $C = p(\pi).$

The last few coefficients in (3.3.15) may be zero. Assuming that $C \ne 0$, let us rewrite (3.3.15) in the form

(3.3.17)  $p(z) = a_1 + a_2z + \cdots + a_rz^{r-1}$

with $1 \le r \le n$ and $a_r \ne 0$.

Suppose that $\mu_1, \mu_2, \ldots, \mu_{r-1}$ are the zeros of the representer p(z) (to be distinguished from the eigenvalues of C). Thus $p(z) = a_r(z - \mu_1)(z - \mu_2)\cdots(z - \mu_{r-1})$, hence

(3.3.18)  $C = p(\pi) = a_r(\pi - \mu_1I)(\pi - \mu_2I)\cdots(\pi - \mu_{r-1}I).$

This gives us a factorization of any circulant into a product of circulants $\pi - \mu_kI$ that are of a particularly elementary type.

Suppose now that C is nonsingular. This is true if and only if none of the eigenvalues of C is zero. That is, if and only if $\lambda_j = p(\omega^{j-1}) \ne 0$, j = 1, 2, ..., n. This will be true if and only if $\mu_k \ne$ an nth root of unity. Thus $\mu_k^n \ne 1$, k = 1, 2, ..., r - 1. From (3.3.18) one has

(3.3.19)  $C^{-1} = a_r^{-1}(\pi - \mu_1I)^{-1}(\pi - \mu_2I)^{-1}\cdots(\pi - \mu_{r-1}I)^{-1}.$

Let us examine a typical factor. Let $\mu$ be a complex variable. Then, for a given matrix M, a matrix of the form $(M - \mu I)^{-1}$ is called the resolvent function of M. The resolvent of $\pi$ has a particularly simple form.

Theorem 3.3.1. Let $\mu^n \ne 1$. Then

(3.3.20)  $(\pi - \mu I)^{-1} = \frac{1}{1 - \mu^n}\left[\mu^{n-1}I + \mu^{n-2}\pi + \mu^{n-3}\pi^2 + \cdots + \pi^{n-1}\right].$

Proof. Multiply the right side by $\pi - \mu I$ and use the fact that $\pi^n = I$.
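The resolvent formula (3.3.20) can be confirmed in exact arithmetic. The sketch below uses `fractions.Fraction` from the standard library; the values n = 4 and $\mu$ = 2 are arbitrary choices satisfying $\mu^n \ne 1$, and the helper names are our own.

```python
from fractions import Fraction

def circ(row):
    """Circulant with first row `row`; entry (j, k) is row[(k - j) mod n]."""
    n = len(row)
    return [[row[(k - j) % n] for k in range(n)] for j in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(r, c)) for c in zip(*B)] for r in A]

n, mu = 4, Fraction(2)                 # any mu with mu**n != 1 would do
identity = circ([1] + [0] * (n - 1))
pi_minus_mu = circ([-mu, 1] + [0] * (n - 2))      # pi - mu*I as a circulant

# Right side of (3.3.20): the coefficient of pi^k is mu^(n-1-k) / (1 - mu^n).
resolvent = circ([mu**(n - 1 - k) / (1 - mu**n) for k in range(n)])

is_inverse = matmul(pi_minus_mu, resolvent) == identity
```

Here the resolvent is $-\tfrac{1}{15}\,\mathrm{circ}(8, 4, 2, 1)$, and its product with $\pi - 2I$ is exactly the identity.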


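We may also relate $C^{-1}$ to the reciprocal of p(z). Let C be a circulant with representer p(z). Suppose that $p(e^{i\theta}) \ne 0$, $0 \le \theta \le 2\pi$. Then, since the zeros of a polynomial are isolated, p(z) is not zero in some open annulus A that contains |z| = 1 in its interior. Thus $[p(z)]^{-1}$ is regular there, hence has a Laurent expansion

(3.3.21)  $[p(z)]^{-1} = \sum_{j=-\infty}^{\infty} b_jz^j$

which converges absolutely in A, and $p(z)\left(\sum_{j=-\infty}^{\infty} b_jz^j\right) = 1$. It follows that the series

(3.3.22)  $\sum_{j=-\infty}^{\infty} b_j\pi^j$

converges, and one has $p(\pi)\left(\sum_{j=-\infty}^{\infty} b_j\pi^j\right) = I$.

Theorem 3.3.2. Let $p(e^{i\theta}) \ne 0$, $0 \le \theta \le 2\pi$. Then

(3.3.23)  $C^{-1} = \sum_{k=0}^{n-1}\Bigl(\sum_{r=-\infty}^{\infty} b_{k+rn}\Bigr)\pi^k.$

Proof. Make use of $\pi^n = I$ to regroup the terms in $\sum_{j=-\infty}^{\infty} b_j\pi^j$.

Circulant Inversion by FFT Techniques. Let C = circ $\gamma$ = circ($c_1, c_2, \ldots, c_n$) = $F^*\,\mathrm{diag}(\lambda_1, \ldots, \lambda_n)F$. Then $C^{\div} = F^*\,\mathrm{diag}(\lambda_1^{\div}, \lambda_2^{\div}, \ldots, \lambda_n^{\div})F$. Let $C^{\div}$ = circ $\beta$; then from (3.2.7) or (3.3.1)

$n^{1/2}F^*\gamma^T = (\lambda_1, \ldots, \lambda_n)^T$,  $\beta^T = n^{-1/2}F(\lambda_1^{\div}, \lambda_2^{\div}, \ldots, \lambda_n^{\div})^T$.

Thus

(3.3.24)  $\beta^T = \frac1n F(F^*\gamma^T)^{\div}.$

The notation $(\ )^{\div}$ means apply "$\div$" element by element. A somewhat more aesthetic form of (3.3.24) is as follows. For C = circ $\gamma$, write $C^{\div}$ = circ $\gamma^{\div}$. Then

(3.3.25)  $(\gamma^{\div})^T = \frac1n F(F^*\gamma^T)^{\div}.$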

From (3.3.25) it appears that a circulant inverse (or generalized inverse) can be computed in two Fourier transforms plus n ordinary reciprocations. Thus it can be done in O(n log n) multiplications.
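The two-transform recipe can be sketched as follows. As in any such illustration, several choices here are ours and not the text's: `transform` is a naive O(n^2) stand-in for an FFT routine (sign +1 gives the eigenvalues of circ $\gamma$, sign -1 followed by division by n returns a first row), and the tolerance `tol` used to decide when an eigenvalue counts as zero is an assumption suited only to small, well-scaled examples.

```python
import cmath

def transform(x, sign):
    """Naive DFT: sum over k of x[k] * exp(sign * 2*pi*i*j*k/n)."""
    n = len(x)
    w = cmath.exp(sign * 2j * cmath.pi / n)
    return [sum(x[k] * w**(j * k) for k in range(n)) for j in range(n)]

def circulant_pinv_row(gamma, tol=1e-12):
    """First row of the (generalized) inverse of circ(gamma)."""
    n = len(gamma)
    lam = transform(gamma, +1)                            # eigenvalues of circ(gamma)
    lam_inv = [1 / z if abs(z) > tol else 0 for z in lam] # reciprocate, 0 stays 0
    return [z / n for z in transform(lam_inv, -1)]

beta = circulant_pinv_row([4, 1, 0, 1])   # nonsingular: the ordinary inverse
jinv = circulant_pinv_row([1, 1, 1])      # singular: circ(1, 1, 1) has rank 1
```

For circ(4, 1, 0, 1) the eigenvalues are 6, 4, 2, 4 and the computed first row is (7/24, -1/12, 1/24, -1/12) up to rounding; for the singular circ(1, 1, 1) the result is (1/9, 1/9, 1/9), the first row of its M-P inverse.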

The same line of reasoning allows us to compute f(C) where f is any function defined on the eigenvalues $\lambda_k$ of the circulant C. Write C = circ $\gamma$ and f(C) = circ $\beta$. Then $n^{1/2}F^*\gamma^T = (\lambda_1, \lambda_2, \ldots, \lambda_n)^T$. But the eigenvalues of f(C) are $f(\lambda_1), \ldots, f(\lambda_n)$, so that $\beta^T = n^{-1/2}F(f(\lambda_1), f(\lambda_2), \ldots, f(\lambda_n))^T$. Thus

$\beta^T = n^{-1/2}F\,f(n^{1/2}F^*\gamma^T)$

where we use the notation

$f((x_1, \ldots, x_n)^T) = (f(x_1), \ldots, f(x_n))^T.$

PROBLEM

1. Let C be a circulant of order n with representer p(z) and characteristic polynomial q(z). Prove that $z^n - 1$ divides q(p(z)).

3.4 ADDITIONAL PROPERTIES OF CIRCULANTS

Multiplication of Circulants. Let us look more closely at the product of circulants. Let $C_k$, k = 1, 2, ..., p be circulants with diagonalization $C_k = F^*\Lambda_kF$, $\Lambda_k$ = diagonal. Then

(3.4.1)  $C_1C_2\cdots C_p = F^*(\Lambda_1\Lambda_2\cdots\Lambda_p)F.$

From this it follows that the eigenvalues of the product $C_1C_2\cdots C_p$ are the product of the eigenvalues.

This is an essential feature of all families of matrices that are simultaneously diagonalizable by a fixed matrix.

A special case of (3.4.1) is

(3.4.2)  $C^k = F^*\Lambda^kF.$

Rank. The rank of a diagonalizable matrix is equal to the number of its nonzero eigenvalues. Hence, if $C = F^*\Lambda F$, $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$, then r(C) = number of the $\lambda$'s that are not zero. From (3.4.2) it follows that

(3.4.3)  $r(C^k) = r(C)$,  k = 1, 2, ....

Trace. Let C = circ($c_1, c_2, \ldots, c_n$) = $F^*\Lambda F$, $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$. Then

$\mathrm{tr}\,C = nc_1 = \sum_{j=1}^n \lambda_j = \sum_{j=1}^n p_\gamma(\omega^{j-1})$

where $\gamma = (c_1, c_2, \ldots, c_n)$.

From (2.7.16) we have

(3.4.6)  $\mathrm{tr}(CC^*) = \mathrm{tr}(F^*\Lambda\bar\Lambda F) = \mathrm{tr}(\Lambda\bar\Lambda).$

Determinant. The determinant of circ($c_1, c_2, \ldots, c_n$) is a homogeneous polynomial of degree n in the variables $c_1, \ldots, c_n$. There are no "simple" formulas.

We note the first four cases:

(3.4.7)  n = 1,  det circ($c_1$) = $c_1$,

n = 2,  det circ($c_1, c_2$) = $c_1^2 - c_2^2$,

n = 3,  det circ($c_1, c_2, c_3$) = $c_1^3 + c_2^3 + c_3^3 - 3c_1c_2c_3$,

n = 4,  det circ($c_1, c_2, c_3, c_4$) = $c_1^4 - c_2^4 + c_3^4 - c_4^4 - 2c_1^2c_3^2 + 2c_2^2c_4^2 - 4c_1^2c_2c_4 + 4c_1c_2^2c_3 - 4c_2c_3^2c_4 + 4c_1c_3c_4^2.$
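The low-order determinant formulas can be checked against a direct expansion in exact integer arithmetic. In the sketch below the n = 4 expression is the full expansion of $[(c_1+c_3)^2 - (c_2+c_4)^2][(c_1-c_3)^2 + (c_2-c_4)^2]$, the product of the four eigenvalues; it is written out here as our own working and should be read as an assumption about the intended closed form. The sample first rows are arbitrary.

```python
def circ(row):
    """Circulant with first row `row`; entry (j, k) is row[(k - j) mod n]."""
    n = len(row)
    return [[row[(k - j) % n] for k in range(n)] for j in range(n)]

def det(M):
    """Exact determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1)**k * M[0][k] * det([r[:k] + r[k + 1:] for r in M[1:]])
               for k in range(len(M)))

def det_closed_form(c):
    """Closed forms for det circ(c1, ..., cn), n <= 4."""
    if len(c) == 1:
        return c[0]
    if len(c) == 2:
        c1, c2 = c
        return c1**2 - c2**2
    if len(c) == 3:
        c1, c2, c3 = c
        return c1**3 + c2**3 + c3**3 - 3 * c1 * c2 * c3
    c1, c2, c3, c4 = c
    return (c1**4 - c2**4 + c3**4 - c4**4 - 2 * c1**2 * c3**2 + 2 * c2**2 * c4**2
            - 4 * c1**2 * c2 * c4 + 4 * c1 * c2**2 * c3
            - 4 * c2 * c3**2 * c4 + 4 * c1 * c3 * c4**2)

samples = [[7], [2, -1], [2, -1, 3], [2, -1, 3, 5]]
agree = [det(circ(c)) == det_closed_form(c) for c in samples]
```

All four comparisons agree; for instance det circ(2, -1, 3) = 52 and det circ(2, -1, 3, 5) = 333.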

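Spectral Decomposition. Let $C = F^*\Lambda F$ where $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$. Introduce the diagonal matrices

(3.4.8)  $\Lambda_k = \mathrm{diag}(0, \ldots, 0, 1, 0, \ldots, 0)$,  k = 1, 2, ..., n,

where the 1 occurs in the kth position. Now $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_n) = \sum_{k=1}^n \lambda_k\Lambda_k$, so that $C = \sum_{k=1}^n \lambda_kF^*\Lambda_kF$.

If we set

(3.4.9)  $B_k = F^*\Lambda_kF$,

then we can write

(3.4.10)  $C = \sum_{k=1}^n \lambda_kB_k.$

The matrices $B_k$ are the component or principal idempotent matrices of the circulant C. The matrices $B_k$ are, of course, circulants. Note that $B_jB_k = 0$ for $j \ne k$.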
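The principal idempotents can be written down explicitly and the decomposition checked numerically. The sketch below relies on the observation, made here and easily verified from $B_k = F^*\Lambda_kF$, that the (r, s) entry of the idempotent belonging to $\omega^j$ is $\omega^{j(r-s)}/n$; the list index j runs from 0, so `B[j]` corresponds to $B_{j+1}$ of the text. The first row `gamma` and the helper `close` are our own choices.

```python
import cmath

def circ(row):
    """Circulant with first row `row`; entry (j, k) is row[(k - j) mod n]."""
    n = len(row)
    return [[row[(k - j) % n] for k in range(n)] for j in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(r, c)) for c in zip(*B)] for r in A]

def close(X, Y, tol=1e-9):
    """Entrywise comparison up to rounding error."""
    return all(abs(x - y) < tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

gamma = [4, 1, 0, -2]
n = len(gamma)
omega = cmath.exp(2j * cmath.pi / n)

# Eigenvalues p_gamma(omega^j) and the idempotents as circulants.
lam = [sum(c * omega**(j * k) for k, c in enumerate(gamma)) for j in range(n)]
B = [circ([omega**(-j * s) / n for s in range(n)]) for j in range(n)]

total = [[sum(lam[j] * B[j][r][s] for j in range(n)) for s in range(n)]
         for r in range(n)]
zero = [[0] * n for _ in range(n)]

reconstructs = close(total, circ(gamma))                   # C = sum lambda_k B_k
idempotent = all(close(matmul(Bk, Bk), Bk) for Bk in B)    # B_k^2 = B_k
orthogonal = all(close(matmul(B[j], B[k]), zero)
                 for j in range(n) for k in range(n) if j != k)
```

All three flags are true for this example, in line with (3.4.10).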



2. Let $B_j$ be the matrices of (3.4.9). Let $C$ be a circulant with eigenvalues $\eta_1, \ldots, \eta_n$. Prove that $B_jC = \eta_jB_j$.

3. Let $Y = (1, w, w^2, \ldots, w^{n-1})^T$. Prove that $B_jY = \delta_jY$ where $\delta_2 = 1$, $\delta_j = 0$ otherwise. Prove that $B_j\bar{Y} = \epsilon_j\bar{Y}$ where $\epsilon_n = 1$, $\epsilon_j = 0$ otherwise.

4. Outer product expansion. Let $A$ be of order $n$ and have the singular value decomposition $A = UDV^*$ where $U$ and $V$ are unitary and $D = \operatorname{diag}(d_1, \ldots, d_n)$ (see (2.8.3.1)). Let $\Lambda_k$ be as in (3.4.8) and set $B_k = U\Lambda_kV^*$, $k = 1, \ldots, n$. Let $u_i$ be the $i$th column of $U$ and $v_j^*$ be the $j$th row of $V^*$. Show
(a) $B_k = u_kv_k^*$ (the outer product of $u_k$ and $v_k$);
(b) the matrices $B_k$ have rank 1;
(c) $B_iB_j^* = 0$, $i \neq j$;
(d) $\sum_{i=1}^{n} B_iB_i^* = I$;
(e) $\operatorname{tr}(B_iB_i^*) = 1$ (see

Minimal Polynomial of a Circulant. Let $A$ be a matrix whose characteristic polynomial is

$c(\lambda) = \prod_{k=1}^{s} (\lambda - \lambda_k)^{\alpha_k}$,

where $\lambda_1, \ldots, \lambda_s$ are distinct and the integers $\alpha_k \ge 1$. Then the minimal polynomial of $A$ has the form

$m(\lambda) = \prod_{j=1}^{s} (\lambda - \lambda_j)^{\beta_j}$

with $1 \le \beta_j \le \alpha_j$, $j = 1, 2, \ldots, s$. Now, it is known that a matrix is simple (diagonalizable) if and only if its minimal polynomial has only simple zeros. Therefore if $A$ is simple, in particular, if $A$ is a circulant, then

$m(\lambda) = \prod_{j=1}^{s} (\lambda - \lambda_j)$.

In other words, $m(\lambda)$ is that monic polynomial of minimal degree which has as its zeros all the distinct eigenvalues of $A$. Of course, one has $m(A) = 0$.

Derivatives of Circulants and of Determinants of Circulants. Let $A$ be an $m \times n$ matrix whose elements $a_{jk} = a_{jk}(t)$ are differentiable functions of $t$ on some common interval. By $dA/dt$ or $\dot{A}$ we mean the $m \times n$ matrix $((d/dt)a_{jk})$. It is easy to verify the identities

(3.4.22) $\dfrac{d}{dt}(\alpha A + \beta B) = \alpha\dfrac{dA}{dt} + \beta\dfrac{dB}{dt}$; $\alpha$, $\beta$ scalar constants,

(3.4.23) $\dfrac{d}{dt}(\alpha A) = \alpha\dfrac{dA}{dt} + \dfrac{d\alpha}{dt}A$; $\alpha$ = scalar function.

If $A$ and $B$ are compatible for multiplication,

(3.4.24) $\dfrac{d}{dt}(AB) = \dfrac{dA}{dt}B + A\dfrac{dB}{dt}$.

If $A$ is square and nonsingular,

$\dfrac{d}{dt}A^{-1} = -A^{-1}\dfrac{dA}{dt}A^{-1}$.

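Now let $A = A(t) = \operatorname{circ}(c_1, c_2, \ldots, c_n)$ where $c_j = c_j(t)$ are differentiable functions. Then by (3.2.2)

$A(t) = F^*\Lambda(t)F$,

where $\Lambda = \operatorname{diag}(\lambda_1(t), \ldots, \lambda_n(t))$ and

$\lambda_j(t) = \sum_{k=1}^{n} c_k(t)\,w^{(j-1)(k-1)}$.

Then

$\dfrac{dA}{dt} = F^*\dfrac{d\Lambda}{dt}F$

with

$\dfrac{d\Lambda}{dt} = \operatorname{diag}(\lambda_1'(t), \ldots, \lambda_n'(t))$.

Of course, one also has from (3.1.4)

$\dfrac{dA}{dt} = \sum_{j=0}^{n-1} \dfrac{dc_{j+1}}{dt}\,\pi^j$.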

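Let $c_j = c_j(t)$ be differentiable functions and set $\Delta = \Delta(t) = \det \operatorname{circ}(c_1, c_2, \ldots, c_n)$. The following identity is valid.

(3.4.27) $\dfrac{d\Delta}{dt} = n \det \begin{pmatrix} c_1' & c_2' & \cdots & c_{n-1}' & c_n' \\ c_n & c_1 & \cdots & c_{n-2} & c_{n-1} \\ \vdots & & & & \vdots \\ c_2 & c_3 & \cdots & c_n & c_1 \end{pmatrix}$

From the ordinary law of determinant differentiation, one has

$\dfrac{d\Delta}{dt} = \det \begin{pmatrix} c_1' & c_2' & \cdots & c_n' \\ c_n & c_1 & \cdots & c_{n-1} \\ \vdots & & & \vdots \\ c_2 & c_3 & \cdots & c_1 \end{pmatrix} + \det \begin{pmatrix} c_1 & c_2 & \cdots & c_n \\ c_n' & c_1' & \cdots & c_{n-1}' \\ \vdots & & & \vdots \\ c_2 & c_3 & \cdots & c_1 \end{pmatrix} + \cdots + \det \begin{pmatrix} c_1 & c_2 & \cdots & c_n \\ c_n & c_1 & \cdots & c_{n-1} \\ \vdots & & & \vdots \\ c_2' & c_3' & \cdots & c_1' \end{pmatrix}$

Now it turns out that these $n$ determinants are all equal; hence the theorem.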

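In order not to get lost in a welter of notation, we show this in the case $n = 3$. It is merely a row-column interchange. The method is perfectly general. Write $M_1$, $M_2$, $M_3$ for the matrices obtained from $\operatorname{circ}(c_1, c_2, c_3)$ by differentiating the first, second, and third row respectively:

$M_1 = \begin{pmatrix} c_1' & c_2' & c_3' \\ c_3 & c_1 & c_2 \\ c_2 & c_3 & c_1 \end{pmatrix}$, $M_2 = \begin{pmatrix} c_1 & c_2 & c_3 \\ c_3' & c_1' & c_2' \\ c_2 & c_3 & c_1 \end{pmatrix}$, $M_3 = \begin{pmatrix} c_1 & c_2 & c_3 \\ c_3 & c_1 & c_2 \\ c_2' & c_3' & c_1' \end{pmatrix}$.

Note that

$\pi M_2 \pi^* = M_1$

and

$\pi M_3 \pi^* = M_2$.

Since $\pi^* = \pi^{-1}$, we find, upon taking determinants, that all the determinants in the previous expansion are equal.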

3.5 CIRCULANT TRANSFORMS

Let $C = \operatorname{circ} \gamma$, $\gamma = (c_1, c_2, \ldots, c_n)$ be a circulant of order $n$. Let $Z = (z_1, z_2, \ldots, z_n)^T$ and $W = (w_1, w_2, \ldots, w_n)^T$. If $W$ is related to $Z$ by means of

(3.5.1) $W = CZ$,

then $W$ is called the circulant transform of $Z$ by $C$. It is also called the circular convolution or the wrapped convolution of $\gamma$ and $Z$.

We mention a number of circulant transforms of particular interest:

(1) $C = I = \operatorname{circ}(1, 0, \ldots, 0)$. This is the identity.

(2) $\pi = \operatorname{circ}(0, 1, 0, \ldots, 0)$. This is the fundamental circulant. $\pi$ causes a circular shifting of the components of $Z$.

(3) For integer $r$, $\pi^r$ causes a circular shifting of the components of $Z$ by $r$ positions.

(4) $D = I - \pi = \operatorname{circ}(1, -1, 0, 0, \ldots, 0)$. Since $DZ = (z_1 - z_2, z_2 - z_3, \ldots, z_n - z_1)^T$, it is clear that $D$ is a circular differencing operator.

(5) For integer $r \ge 0$, $D^r = (I - \pi)^r$ is a circular differencing operator of the $r$th order.

(6) For $s, t > 0$, $s + t = 1$, the circulant transform $C = sI + t\pi$ is, as we shall show later, a smoothing operator.
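The transforms listed above are easy to try on a small vector. The following sketch is an illustration only; the function names are hypothetical, and the circulant is built with each row equal to the previous row shifted one place to the right.

```python
def circ(c):
    """Circulant with first row c; row j is row j-1 shifted one place to the right."""
    n = len(c)
    return [[c[(k - j) % n] for k in range(n)] for j in range(n)]

def circulant_transform(c, z):
    """Return W = C Z for C = circ(c)."""
    return [sum(row[k] * z[k] for k in range(len(z))) for row in circ(c)]

z = [10, 20, 30, 40]
shifted = circulant_transform([0, 1, 0, 0], z)       # pi Z: circular shift
differenced = circulant_transform([1, -1, 0, 0], z)  # (I - pi) Z: circular differences
smoothed = circulant_transform([0.5, 0.5, 0, 0], z)  # (I/2 + pi/2) Z: neighbor averages

print(shifted)      # [20, 30, 40, 10]
print(differenced)  # [-10, -10, -10, 30]
print(smoothed)     # [15.0, 25.0, 35.0, 25.0]
```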

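Let $C = F^*\Lambda F$; then (3.5.1) becomes

$W = F^*\Lambda FZ$, that is, $FW = \Lambda FZ$,

so that if one writes $\hat{Z}$ and $\hat{W}$ for the Fourier transforms of $Z$ and $W$, one has

$\hat{W} = \Lambda\hat{Z}$.

If $C$ is nonsingular, then the inverse transform is given by

$Z = C^{-1}W = F^*\Lambda^{-1}FW$

and is itself a circulant transform. If $C$ is singular, then (3.5.1) may be solved in the sense of least squares, yielding

$Z = C^{\div}W = F^*\Lambda^{\div}FW$,

where $C^{\div}$ denotes the Moore-Penrose generalized inverse of $C$. This, again, is a circulant transform that is often of interest. As a concrete instance of (3.5.4), select $C = \pi^r$, $r = 0, \pm 1, \pm 2, \ldots$. Then $\pi^rZ$ is just $Z$ shifted circularly by $r$ indices. Since $\pi^r = F^*\Omega^rF$, $\Omega = \operatorname{diag}(1, w, w^2, \ldots, w^{n-1})$, one has

$\widehat{\pi^rZ} = \Omega^r\hat{Z}$.

This is known as the shift theorem.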
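The shift theorem can be checked directly. The sketch below assumes the Fourier transform $\hat{Z} = FZ$ with $F$ having entries $\bar{w}^{(j-1)(k-1)}/\sqrt{n}$, $w = e^{2\pi i/n}$; the names `fourier` and `shift` are hypothetical, and indices are 0-based.

```python
import cmath

def fourier(z):
    """Z-hat = F Z with F[j][k] = conj(w)**(j*k) / sqrt(n) (assumed convention)."""
    n = len(z)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(w ** (-j * k) * z[k] for k in range(n)) / n ** 0.5 for j in range(n)]

def shift(z, r):
    """pi**r Z: component j of the result is z[(j + r) mod n]."""
    n = len(z)
    return [z[(j + r) % n] for j in range(n)]

def shift_theorem_error(z, r):
    """Largest deviation between the transform of pi**r Z and Omega**r applied to Z-hat."""
    n = len(z)
    w = cmath.exp(2j * cmath.pi / n)
    lhs = fourier(shift(z, r))
    zh = fourier(z)
    rhs = [w ** (r * j) * zh[j] for j in range(n)]
    return max(abs(a - b) for a, b in zip(lhs, rhs))

print(shift_theorem_error([1, 2 + 1j, -3, 0.5j], 3))  # rounding-error level
```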

PROBLEM

1. Is the circular convolution of two vectors a commutative operation?

3.6 CONVERGENCE QUESTIONS

Convergence of Sequences of Matrices. Let $M_1, M_2, \ldots$ be a sequence of matrices all of the same order. Iteration problems often lead to questions about whether certain infinite sequences or infinite products of matrices converge. In the case of infinite products, particular importance attaches to whether the limiting matrix is or is not the zero matrix.

Prior to discussing this question, we recall the definition of matrix convergence. Let

$A_r = (a_{jk}^{(r)})$, $r = 1, 2, \ldots$,

be a sequence of matrices all of size $m \times n$. We shall say that

(3.6.1) $\lim_{r\to\infty} A_r = A = (a_{jk})$ if and only if $\lim_{r\to\infty} a_{jk}^{(r)} = a_{jk}$, for $j = 1, 2, \ldots, m$; $k = 1, 2, \ldots, n$.

The notation $\sum_{r=1}^{\infty} A_r = A$ is an abbreviation for $\lim_{k\to\infty} \sum_{r=1}^{k} A_r = A$ and the notation $\prod_{r=1}^{\infty} A_r = A$ is an abbreviation for $\lim_{k\to\infty} \prod_{r=1}^{k} A_r = A$. One sometimes writes $A_r \to A$ for convergence.

Elementary properties of convergent sequences of matrices are:

(1) If $A_r \to A$, then $\alpha A_r \to \alpha A$; $\alpha$, scalar.

(2) If $A_r$, $B_r$ are of the same size, then $A_r \to A$, $B_r \to B$ implies $A_r + B_r \to A + B$.

(3) If $A_r$ are $m \times n$ and $B_r$ are $n \times p$ and if $A_r \to A$, $B_r \to B$ then $A_rB_r \to AB$.

(4) If $A$ is $m \times n$ and $\|A\|$ designates the matrix norm

k=l

then $A_r \to A$ if and only if $\lim_{r\to\infty} \|A - A_r\| = 0$.

If $A_r$ is a sequence of square matrices of order $n$, the question of the convergence of $\prod_{r=1}^{\infty} A_r$ may be a difficult one. Somewhat simpler to deal with is the case in which all the $A_r$ are simultaneously diagonalizable by one and the same matrix.

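Theorem 3.6.1. Let $A_r = M\Lambda_rM^{-1}$, $r = 1, 2, \ldots$, where $M$ is a nonsingular matrix and where $\Lambda_r = \operatorname{diag}(\lambda_1^{(r)}, \ldots, \lambda_n^{(r)})$. Then $\prod_{r=1}^{\infty} A_r$ exists if and only if $\prod_{r=1}^{\infty} \lambda_j^{(r)}$ exists for $j = 1, 2, \ldots, n$. In such a case,

$\prod_{r=1}^{\infty} A_r = M \operatorname{diag}\left(\prod_{r=1}^{\infty} \lambda_1^{(r)}, \ldots, \prod_{r=1}^{\infty} \lambda_n^{(r)}\right) M^{-1}$.

Proof. $\prod_{r=1}^{k} A_r = \prod_{r=1}^{k} (M\Lambda_rM^{-1}) = M\left(\prod_{r=1}^{k} \Lambda_r\right)M^{-1}$ and $\prod_{r=1}^{k} \Lambda_r = M^{-1}\left(\prod_{r=1}^{k} A_r\right)M$. Hence $\prod_{r=1}^{k} A_r$ converges if and only if $\prod_{r=1}^{k} \Lambda_r$ does. But $\prod_{r=1}^{k} \Lambda_r = \operatorname{diag}\left(\prod_{r=1}^{k} \lambda_1^{(r)}, \ldots, \prod_{r=1}^{k} \lambda_n^{(r)}\right)$. The theorem now follows.

Corollary. An infinite product of circulants converges if and only if the infinite products of the respective eigenvalues converge.

Proof. All circulants are simultaneously diagonalizable by $F$.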

Note. We have said that $\prod_{r=1}^{\infty} \lambda_r$ converges if and only if $\lim_{k\to\infty} \prod_{r=1}^{k} \lambda_r$ exists. This terminology is at variance with some parts of complex variable theory which requires also that $\lim_{k\to\infty} \prod_{r=1}^{k} \lambda_r \neq 0$.

Corollary. If $C$ is a circulant with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, then $\lim_{k\to\infty} C^k$ exists if and only if

(3.6.2) for each $r = 1, 2, \ldots, n$, either $\lambda_r = 1$ or $|\lambda_r| < 1$.

If $\lim_{k\to\infty} C^k$ exists we shall designate its limiting value by $C^{\infty}$. It is useful to have an explicit form for the limiting value $C^{\infty}$ of a circulant $C$. Let $J_C$ designate the subset of integers $r = 1, 2, \ldots, n$ for which $\lambda_r = 1$.

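Corollary. Assuming (3.6.2),

(3.6.3) $C^{\infty} = \sum_{r \in J_C} B_r$ if $J_C \neq \emptyset$ (the null set), and $C^{\infty} = 0$ if $J_C = \emptyset$.

Proof. If $C = F^*\Lambda F$, $\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, then $C^{\infty} = F^*\Lambda^{\infty}F$, $\Lambda^{\infty} = \operatorname{diag}(\lambda_1^{\infty}, \lambda_2^{\infty}, \ldots, \lambda_n^{\infty})$, where $\lambda_r^{\infty} = 1$ if $\lambda_r = 1$ and $0$ if $|\lambda_r| < 1$. The statement now follows from (3.4.8) and (3.4.9).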
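A numerical illustration, offered only as a sketch with hypothetical helper names: for $C = \operatorname{circ}(\tfrac12, \tfrac12, 0, 0)$ the only unimodular eigenvalue is 1, so the powers should approach $B_1 = \tfrac1n \operatorname{circ}(1, \ldots, 1)$. For $\pi = \operatorname{circ}(0, 1, 0, 0)$ the powers cycle, while the averages of the first $r$ powers still settle down to the same matrix.

```python
def circ(c):
    n = len(c)
    return [[c[(k - j) % n] for k in range(n)] for j in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)] for i in range(n)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def power(C, r):
    """C**r by repeated multiplication (fine for small matrices)."""
    P = identity(len(C))
    for _ in range(r):
        P = matmul(P, C)
    return P

def cesaro_mean(C, r):
    """(I + C + ... + C**(r-1)) / r."""
    n = len(C)
    total = [[0] * n for _ in range(n)]
    P = identity(n)
    for _ in range(r):
        total = [[total[i][j] + P[i][j] for j in range(n)] for i in range(n)]
        P = matmul(P, C)
    return [[total[i][j] / r for j in range(n)] for i in range(n)]

limit = power(circ([0.5, 0.5, 0, 0]), 200)   # every entry close to 1/4
mean = cesaro_mean(circ([0, 1, 0, 0]), 400)  # every entry equal to 1/4
print(limit[0], mean[0])
```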

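Corollary. Let $C$ be a circulant with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then the Cesàro mean

$\lim_{r\to\infty} \dfrac{1}{r}(I + C + \cdots + C^{r-1})$

exists if and only if

(3.6.4) $|\lambda_r| \le 1$, $r = 1, 2, \ldots, n$.

The representation (3.6.3) persists with the Cesàro limit replacing $C^{\infty}$.

Proof. Write $C = F^*\Lambda F$, $\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$. Then

$\dfrac{1}{r}(I + \cdots + C^{r-1}) = F^* \operatorname{diag}\left(\dfrac{1}{r}(1 + \lambda_j + \lambda_j^2 + \cdots + \lambda_j^{r-1})\right) F$.

Now

$\sigma_r = \dfrac{1}{r}(1 + \lambda + \cdots + \lambda^{r-1}) = \dfrac{1}{r}\,\dfrac{1 - \lambda^r}{1 - \lambda}$, $\lambda \neq 1$,

and

$\sigma_r = 1$, $\lambda = 1$.

It is clear that $\sigma_r$ converges if and only if $|\lambda| \le 1$. It converges to 1 if and only if $\lambda = 1$ and to 0 if and only if $\lambda \neq 1$, $|\lambda| \le 1$.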

In discussing convergence problems, it is useful to introduce the spectral radius or norm, $\rho(M)$, of a matrix $M$ by means of

(3.6.5) $\rho(M) = \max_{j=1,2,\ldots,n} |\lambda_j|$,

where $\lambda_j$ are the eigenvalues of $M$.

Inasmuch as circulants are a special case of a diagonalizable matrix, we append a table of the behavior of $M^r$ as $r \to \infty$ for diagonalizable matrices. All results are obtained by using $M^r = S\Lambda^rS^{-1}$ and an examination of the individual behavior of $\lambda_j^r$ as $r \to \infty$.

By a unimodular eigenvalue we mean an eigenvalue $\lambda_j$ for which $|\lambda_j| = 1$.

It is of interest to contrast this tabulation with the general theorem on the existence of $M^{\infty}$, where $M$ is not necessarily diagonalizable.

Theorem 3.6.2

(a) If $\lambda = 1$ is an eigenvalue of $M$, then $M^{\infty}$ exists if and only if $\lambda = 1$ is a simple root of the minimal polynomial of $M$ and if all other roots are less than 1 in absolute value.

(b) If $\lambda = 1$ is not an eigenvalue of $M$, then $M^{\infty}$ exists if and only if $\rho(M) < 1$, in which case $M^{\infty} = 0$.

What is the general form of infinite powers?

Omit the trivial case $M^{\infty} = 0$. Assume $M$ has order $n$. Then, since the Jordan blocks corresponding to the eigenvalue $\lambda = 1$ all must be of dimension 1, it follows that $M$ can be Jordanized as follows:

(3.6.6) $M = SQS^{-1}$,

where $S$ is nonsingular and where $Q$ has the form

(3.6.7) $Q = \begin{pmatrix} I_m & 0 \\ 0 & X \end{pmatrix}$.

Behavior of $M^r$, $r \to \infty$; $M$ Diagonalizable

Behavior                                  Necessary and Sufficient Conditions
----------------------------------------  ------------------------------------------------------------
Converges to 0                            $\rho(M) < 1$
Converges to $M^{\infty} \neq 0$          $\rho(M) = 1$; all unimodular eigenvalues equal 1
Diverges boundedly                        $\rho(M) = 1$; not all unimodular eigenvalues equal 1
Cesàro mean converges to 0                $\rho(M) = 1$; no unimodular eigenvalue equals 1
Cesàro mean converges, but not to 0       $\rho(M) = 1$; at least one, but not all unimodular
                                          eigenvalues equal 1
Finite number of limit points             $\rho(M) = 1$; not all unimodular eigenvalues equal 1;
                                          all unimodular eigenvalues are roots of unity
Infinite number of limit points           $\rho(M) = 1$; at least one unimodular eigenvalue is not
                                          a root of unity
Diverges unboundedly                      $\rho(M) > 1$

In (3.6.7), $I_m$ is the identity matrix of a certain order $m$, $1 \le m \le n$, and $X$ is $(n - m) \times (n - m)$ and $\rho(X) < 1$. Hence $X^{\infty} = 0$, so that

$Q^{\infty} = \begin{pmatrix} I_m & 0 \\ 0 & 0 \end{pmatrix}$.

Therefore, $M^{\infty} = SQ^{\infty}S^{-1}$. Now write $S$ in block form as $S = (A \mid B)$ where $A$ is $n \times m$ and $B$ is $n \times (n - m)$. Write

$S^{-1} = \begin{pmatrix} C \\ D \end{pmatrix}$,

where $C$ is $m \times n$ and $D$ is $(n - m) \times n$. Then from (3.6.6) it follows that $M^{\infty} = AC$.

PROBLEMS

1. Investigate the convergence of sequences of direct sums.

2. Investigate the convergence of sequences of Kronecker products.

3. Prove that if $A_k$ are square, $\lim_{k\to\infty} A_k = A$, and $A$ is nonsingular, then for $k$ sufficiently large, $A_k$ is nonsingular and $\lim_{k\to\infty} A_k^{-1} = A^{-1}$.

4. Let $A$, $B$ be square of same order and commute. Let $\lim_{k\to\infty} A^k = A^{\infty}$, $\lim_{k\to\infty} B^k = B^{\infty}$ exist. Then $\lim_{k\to\infty} (AB)^k = A^{\infty}B^{\infty}$.

5. Show that the identity of Problem 4 may not be valid if $AB \neq BA$. Take $A = \begin{pmatrix} .5 & 2 \\ 0 & 0 \end{pmatrix}$, $B = A^*$.

6. What functions of matrices are continuous under matrix convergence? For example: determinant, rank, etc.

7. Let $\lambda = 1$ be an eigenvalue of $A$ and a simple root of its minimal polynomial $\mu(\lambda)$. Let $A^{\infty}$ exist. Then, if one writes $\mu(\lambda) = (\lambda - 1)q(\lambda)$, $q(1) \neq 0$, one has $A^{\infty} = (q(1))^{-1}q(A)$. (Greville.)

8. When is $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ an infinite power?

9. Level spirits. Take three glasses, containing different amounts of vodka. By pouring, adjust the first two glasses so that the level in both is the same. Adjust the level in the second and third glasses. Then in the third and first glasses. Iterate. Predict the result after $n$ iterations. What happens as $n \to \infty$? What if the glasses do not have the same cross-section? What if the glasses do not have constant cross-sectional area? What if after the $k$th leveling, an amount $v_k$ is drunk from both of the leveled glasses?

10. Prove the statement at the end of Section 1.3. Generalize it.

REFERENCES

Circulant matrices first appear in the mathematical literature in 1846 in a paper by E. Catalan.

Identity (3.2.14) for the determinant of a circulant is essentially due to Spottiswoode, 1853.

For articles on circulants in the older literature see the bibliographies of Muir, [1]-[6].

Circulants: Aitken, [1], [2]; Bellman, [1]; Carlitz; Charmonman and Julius; Davis, [1], [2]; Marcus and Minc, [2]; Muir, [1]; Muir and Metzler, [7]; Ore; Trapp; Varga.

z-Transform: Jury.

Frobenius theorem: Taussky.

Convergence: Greville, [1]; Ortega.

Skew circulants; {k}-circulants: Beckenbach and Bellman; Smith, [1].

Toeplitz matrices: Gray, [1]-[4]; Grenander and Szegö; Widom.

Determinantal inequality: Beckenbach and Bellman.

Outer product: Andrews and Patterson.

SOME GEOMETRICAL APPLICATIONS OF CIRCULANTS

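We are interested here in the quadratic form

(4.0.1) $Q(Z) = Z^*QZ$,

where $Q$ is a circulant matrix. The reader will perceive that some of what is presented is valid in a wider context. In (4.0.1) we have written $Z = (z_1, \ldots, z_n)^T$. Insofar as $Q = F^*\Lambda F$, $\Lambda = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, one has

(4.0.2) $Z^*QZ = Z^*F^*\Lambda FZ = (FZ)^*\Lambda(FZ)$.

This is the reduction of $Q(Z)$ to a sum of squares. If one writes $\hat{Z}$ for the Fourier transform of $Z$,

(4.0.3) $\hat{Z} = (\hat{z}_1, \hat{z}_2, \ldots, \hat{z}_n)^T = FZ$,

then one has

(4.0.4) $Q(Z) = \sum_{k=1}^{n} \lambda_k|\hat{z}_k|^2$.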
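The reduction to a sum of squares can be confirmed numerically. The sketch below is illustrative only: it assumes $F$ has entries $\bar{w}^{(j-1)(k-1)}/\sqrt{n}$ and $\lambda_k = p(w^{k-1})$ with $w = e^{2\pi i/n}$, and the function names are hypothetical. It evaluates $Z^*QZ$ entry by entry and compares it with $\sum_k \lambda_k|\hat{z}_k|^2$.

```python
import cmath

def fourier(z):
    """Z-hat = F Z with F[j][k] = conj(w)**(j*k) / sqrt(n) (assumed convention)."""
    n = len(z)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(w ** (-j * k) * z[k] for k in range(n)) / n ** 0.5 for j in range(n)]

def eigenvalues(c):
    """lambda_j = p(w**j), j = 0..n-1, for Q = circ(c)."""
    n = len(c)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(c[m] * w ** (j * m) for m in range(n)) for j in range(n)]

def quadratic_form(c, z):
    """Z* Q Z computed directly from the entries Q[j][k] = c[(k - j) mod n]."""
    n = len(z)
    return sum(z[j].conjugate() * c[(k - j) % n] * z[k] for j in range(n) for k in range(n))

def spectral_form(c, z):
    """Sum of lambda_k * |z-hat_k|**2."""
    return sum(lam * abs(a) ** 2 for lam, a in zip(eigenvalues(c), fourier(z)))

c = [2, -1, 0, -1]          # Q = (I - pi)* (I - pi) for n = 4
z = [1 + 1j, 2, -1j, 3]
print(abs(quadratic_form(c, z) - spectral_form(c, z)))  # rounding-error level
```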

4.1 CIRCULANT QUADRATIC FORMS ARISING IN GEOMETRY

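We list a number of specific quadratic forms $Q(Z)$ in which $Q$ are Hermitian circulants and which are of importance in geometry.

$Q = I$: $Z^*Z = \sum_{k=1}^{n} |z_k|^2 = \|Z\|^2$ = polar moment of inertia around $z = 0$ of the $n$-gon $Z$ whose vertices are unit point masses.

From (4.0.4),

(4.1.1') $\|Z\|^2 = \|\hat{Z}\|^2$,

which expresses the isometric nature of the unitary transformation $F$.

$Q_2 = (I - \pi)^*(I - \pi)$: $Z^*Q_2Z = \sum_{k=1}^{n} |z_{k+1} - z_k|^2$ = sum of squares of the sides of the $n$-gon $Z$.

$Q = ((I - \pi)^k)^*(I - \pi)^k$, where $k$ is a positive integer. $Z^*QZ$ = sum of squares of the $k$th-order cyclic differences of the vertices of $Z$. For example, for $k = 2$, $Z^*QZ = \sum_{j=1}^{n} |z_j - 2z_{j+1} + z_{j+2}|^2$.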

We wish next to exhibit the area of an $n$-gon as a quadratic form in $Z$. Since for a general $Z$, the geometrical $n$-gon may be a multiply covered figure, it is more convenient to deal with the oriented or signed area of $Z$.

Let $z_k = x_k + iy_k$, $k = 1, 2, 3$ be the vertices of a triangle $T$ taken in counterclockwise order. From (1.2.15) we have


2. Let $J = (1, 1, \ldots, 1)^T$. Prove that $Q_3(Z + cJ) = Q_3(Z)$. Interpret geometrically.

3. Prove that $Q_3(\pi Z) = Q_3(Z)$. Interpret.

4. Prove that $Q_3(\Gamma Z) = -Q_3(Z)$. (See p. 28 for $\Gamma$.) Interpret.

4.2 THE ISOPERIMETRIC INEQUALITY FOR ISOSCELES POLYGONS

Consider a simply connected, bounded, plane region $\mathcal{R}$ with a rectifiable boundary. If $A$ designates its area and $L$ the length of its boundary, the nondimensional ratio $A/L^2$ is known as its isoperimetric ratio. The famous isoperimetric inequality asserts that for all $\mathcal{R}$

(4.2.1) $\dfrac{A}{L^2} \le \dfrac{1}{4\pi}$

and that equality holds in (4.2.1) if and only if $\mathcal{R}$ is a circle. If $\mathcal{R}$ is a regular polygon of $n$ sides each of length $2a$, it is easily shown that $L = 2na$, $A = na^2\cot(\pi/n)$. Hence the isoperimetric ratio for a regular polygon of $n$ sides is

$\dfrac{A}{L^2} = \dfrac{1}{4n}\cot\dfrac{\pi}{n} = \dfrac{1}{4n\tan(\pi/n)} < \dfrac{1}{4\pi}$.

It is a reasonable conjecture that if $\mathcal{R}$ is any equilateral polygon of $n$ sides, with area $A$ and perimeter $L$, then

(4.2.2) $\dfrac{A}{L^2} \le \dfrac{1}{4n\tan(\pi/n)}$

with equality holding if and only if $\mathcal{R}$ is regular, that is, equiangular as well. We can now establish the truth of this conjecture. Write (4.2.2) in the form

$L^2 - 4n\left(\tan\dfrac{\pi}{n}\right)A \ge 0$.

From (4.1.9) we have, using the double angle formula and observing that the first term of the series vanishes,

$4n\left(\tan\dfrac{\pi}{n}\right)A = 4n\sum_{j=2}^{n} \tan\dfrac{\pi}{n}\,\sin\dfrac{(j-1)\pi}{n}\,\cos\dfrac{(j-1)\pi}{n}\,|\hat{z}_j|^2$.

Now if $\mathcal{R}$ is equilateral, then for some $b > 0$, $|z_{j+1} - z_j| = b$, $j = 1, 2, \ldots, n$, so that $L = nb$, $L^2 = n^2b^2$. Now $Q_2(Z) = \sum_{j=1}^{n} |z_{j+1} - z_j|^2 = nb^2 = L^2/n$. Thus from (4.1.8), since the first term of the series vanishes,

$L^2 = nQ_2(Z) = 4n\sum_{j=2}^{n} \sin^2\dfrac{(j-1)\pi}{n}\,|\hat{z}_j|^2$.

For $j = 2$, we have $(\tan \pi/n)(\sin \pi/n)(\cos \pi/n) = \sin^2 \pi/n$, so that

$L^2 - 4n\left(\tan\dfrac{\pi}{n}\right)A = 4n\sum_{j=3}^{n} \sin\dfrac{(j-1)\pi}{n}\left[\sin\dfrac{(j-1)\pi}{n} - \tan\dfrac{\pi}{n}\cos\dfrac{(j-1)\pi}{n}\right]|\hat{z}_j|^2$.

Notice that $\sin[(j-1)\pi/n] > 0$ for $j = 3, 4, \ldots, n$. The bracketed quantity

$\sin\dfrac{(j-1)\pi}{n} - \tan\dfrac{\pi}{n}\cos\dfrac{(j-1)\pi}{n} = \cos\dfrac{(j-1)\pi}{n}\left[\tan\dfrac{(j-1)\pi}{n} - \tan\dfrac{\pi}{n}\right]$.

When $\cos[(j-1)\pi/n] = 0$, then $\sin[(j-1)\pi/n] > 0$. When the cos $> 0$, the tan $> 0$ and $\tan[(j-1)\pi/n] > \tan \pi/n$. When the cos $< 0$, the tan $< 0$. Therefore the coefficients of $|\hat{z}_j|^2$ are always positive. It follows that $L^2 - 4n(\tan \pi/n)A \ge 0$, and equality holds if and only if $\hat{z}_3 = \hat{z}_4 = \cdots = \hat{z}_n = 0$. To interpret the equality, one has $Z = F^*\hat{Z}$ so that

$Z = F^*(\hat{z}_1, \hat{z}_2, 0, \ldots, 0)^T = \alpha(1, 1, \ldots, 1)^T + \beta(1, w, w^2, \ldots, w^{n-1})^T$

for some $\alpha$, $\beta$. Thus, in the case of equality,

$z_k = \alpha + \beta w^{k-1}$, $k = 1, 2, \ldots, n$,

and these are the vertices of a regular polygon of $n$ sides.
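As a sanity check on (4.2.2), one can compare the isoperimetric ratio of a few equilateral polygons with the bound $1/(4n\tan(\pi/n))$. The sketch below is only an illustration with hypothetical helper names; it uses the signed-area formula $\tfrac12 \operatorname{Im} \sum \bar{z}_k z_{k+1}$ and two simple families, regular $n$-gons and rhombi.

```python
import cmath
import math

def signed_area(z):
    """Oriented area (1/2) Im sum conj(z_k) z_{k+1}; positive for counterclockwise order."""
    n = len(z)
    return 0.5 * sum((z[k].conjugate() * z[(k + 1) % n]).imag for k in range(n))

def perimeter(z):
    n = len(z)
    return sum(abs(z[(k + 1) % n] - z[k]) for k in range(n))

def isoperimetric_ratio(z):
    return signed_area(z) / perimeter(z) ** 2

def equilateral_bound(n):
    """Right-hand side of (4.2.2)."""
    return 1.0 / (4 * n * math.tan(math.pi / n))

def regular_polygon(n):
    return [cmath.exp(2j * math.pi * k / n) for k in range(n)]

def rhombus(phi):
    """Equilateral quadrilateral with unit sides and interior angle phi at the origin."""
    e = cmath.exp(1j * phi)
    return [0j, 1 + 0j, 1 + e, e]

print(isoperimetric_ratio(regular_polygon(7)), equilateral_bound(7))  # equal up to rounding
print(isoperimetric_ratio(rhombus(1.0)), equilateral_bound(4))        # ratio below the bound
```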

4.3 QUADRATIC FORMS UNDER SIDE CONDITIONS

Pick an $r$ with $1 \le r \le n$. Let $Z^{(r)}$ be an eigenvector of $Q$ corresponding to $\lambda_r$. Then, up to a scalar factor, $Z^{(r)} = F^*(0, \ldots, 0, 1, 0, \ldots, 0)^T$, where the 1 is in the $r$th position. Suppose now that $Z \perp Z^{(r)}$, that is, $Z^*Z^{(r)} = 0$. Then $Z^*F^*(0, \ldots, 0, 1, 0, \ldots, 0)^T = (FZ)^*(0, \ldots, 0, 1, 0, \ldots, 0)^T = 0$. This is valid if and only if $\hat{z}_r = 0$. Hence

(4.3.1) $Z \perp Z^{(r)}$ implies $Q(Z) = \sum_{k \neq r} \lambda_k|\hat{z}_k|^2$.

For distinct $r_1, r_2, \ldots, r_m$, $0 \le m \le n$,

(4.3.2) $Z \perp Z^{(r_k)}$, $k = 1, 2, \ldots, m$, implies $Q(Z) = \sum_{k \neq r_1, r_2, \ldots, r_m} \lambda_k|\hat{z}_k|^2$.

In particular, since $Z^{(1)} = (1/\sqrt{n})(1, 1, \ldots, 1)^T$,

(4.3.3) $z_1 + z_2 + \cdots + z_n = 0$ implies $Q(Z) = \sum_{k=2}^{n} \lambda_k|\hat{z}_k|^2$.

The eigenvalues $\lambda_k$ are, of course, generally neither real nor positive. For a given matrix $Q$, the set of all values $Q(Z)$ with $\|Z\| = 1$ is the field of values of $Q$ (see page 63).

It is easily shown, using the fact that a normal matrix is unitarily diagonalizable, that the field of values of a normal matrix is the convex hull of its eigenvalues. Since circulants are normal, the same may be asserted for the field of values of a circulant. The $\lambda_k$ are real if and only if a circulant $Q$ is Hermitian. Then from (4.0.4), $Q(Z)$ will be real for all $Z$. In this case, one has the Rayleigh inequalities arrived at as follows. Let $\lambda_{\min}$ and $\lambda_{\max}$ be the smallest and largest of the $\lambda_k$. Then

$\lambda_{\min}\sum_{k=1}^{n} |\hat{z}_k|^2 \le \sum_{k=1}^{n} \lambda_k|\hat{z}_k|^2 \le \lambda_{\max}\sum_{k=1}^{n} |\hat{z}_k|^2$.

Hence, from (4.1.1') and (4.0.4),

(4.3.4) $\lambda_{\min}\|Z\|^2 \le Q(Z) \le \lambda_{\max}\|Z\|^2$.

Therefore, for any $Z \neq 0$,

(4.3.5) $\lambda_{\min} \le \dfrac{Q(Z)}{\|Z\|^2} \le \lambda_{\max}$.

In all our work so far with circulants, it has been convenient to number the eigenvalues so that $\lambda_j = p(w^{j-1})$, where $p$ is the representer of the circulant [cf. (3.2.6)]. To derive equality conditions and further conclusions along the lines of what is now called the Courant-Fischer theorem, it is convenient briefly to renumber the eigenvalues and vectors so that one has

(4.3.6) $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$.

The corresponding eigenvectors of $Q$ will be $Z^{(j)}$. Suppose now that we have a vector $Z \neq 0$ for which

(4.3.7) $Q(Z) = \lambda_{\min}\|Z\|^2 = \lambda_n\|Z\|^2$.

Then

$Q(Z) = \sum_{k=1}^{n} \lambda_k|\hat{z}_k|^2 = \lambda_n\|Z\|^2 = \lambda_n\|\hat{Z}\|^2 = \lambda_n\sum_{k=1}^{n} |\hat{z}_k|^2$.

Thus $\sum_{k=1}^{n} (\lambda_k - \lambda_n)|\hat{z}_k|^2 = 0$. Since $(\lambda_k - \lambda_n) \ge 0$, $k = 1, 2, \ldots, n$, it follows that $(\lambda_k - \lambda_n)|\hat{z}_k|^2 = 0$, $k = 1, 2, \ldots, n$. Now assume that

(4.3.8) $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{n-1} > \lambda_n$.

Then $(\lambda_k - \lambda_n) \neq 0$ for $k = 1, 2, \ldots, n - 1$. Thus (4.3.7) holds if and only if $\hat{z}_1 = \hat{z}_2 = \cdots = \hat{z}_{n-1} = 0$.

Therefore, $Z = F^*\hat{Z} = F^*(0, 0, \ldots, \hat{z}_n)^T = \hat{z}_nZ^{(n)}$. In other words, (4.3.7) holds if and only if $Z$ is an eigenvector corresponding to $\lambda_n$ (i.e., to $\lambda_{\min}$).

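Let now $Z$ be a vector such that $Z \perp Z^{(n)}$. As observed, $\hat{z}_n = 0$, and from (4.0.4)

$Q(Z) = \sum_{k=1}^{n-1} \lambda_k|\hat{z}_k|^2 \ge \lambda_{n-1}\sum_{k=1}^{n-1} |\hat{z}_k|^2 = \lambda_{n-1}\|Z\|^2$,

or briefly,

(4.3.9) $Q(Z) \ge \lambda_{n-1}\|Z\|^2$ for all vectors $Z \perp Z^{(n)}$.

Make the further hypothesis that

(4.3.10) $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{n-3} > \lambda_{n-2} = \lambda_{n-1} > \lambda_n$

and suppose that equality holds in (4.3.9):

(4.3.11) $Q(Z) = \lambda_{n-1}\|Z\|^2$, $Z \perp Z^{(n)}$.

Then

$\sum_{k=1}^{n-1} \lambda_k|\hat{z}_k|^2 = \lambda_{n-1}\sum_{k=1}^{n-1} |\hat{z}_k|^2$,

so that

$\sum_{k=1}^{n-1} (\lambda_k - \lambda_{n-1})|\hat{z}_k|^2 = 0$.

Since $(\lambda_k - \lambda_{n-1}) \ge 0$ for $k = 1, 2, \ldots, n-1$, it follows that $(\lambda_k - \lambda_{n-1})|\hat{z}_k|^2 = 0$ for $k = 1, 2, \ldots, n-1$. Hence, by (4.3.10), $\hat{z}_k = 0$ for $k = 1, 2, \ldots, n-3$. The structure of $\hat{Z}$ must therefore be $\hat{Z} = (0, 0, \ldots, 0, \hat{z}_{n-2}, \hat{z}_{n-1}, 0)^T$ for arbitrary $\hat{z}_{n-2}$, $\hat{z}_{n-1}$, so that $Z = F^*\hat{Z} = \hat{z}_{n-2}Z^{(n-2)} + \hat{z}_{n-1}Z^{(n-1)}$.

In summary, if (4.3.10) holds, then (4.3.11) holds if and only if $Z$ is a linear combination of the eigenvectors $Z^{(n-1)}$ and $Z^{(n-2)}$.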

We now present an application of these ideas. Select $Q = (I - \pi)^*(I - \pi)$. From (4.1.8), the eigenvalues of $Q$ are (in the usual ordering)

$\lambda_j = 4\sin^2\dfrac{(j-1)\pi}{n}$, $j = 1, 2, \ldots, n$.

The eigenvalue of smallest value is 0, corresponding to $j = 1$. The next two in size are paired, corresponding to $j = 2$ and $j = n$. The common value is $4\sin^2(\pi/n)$. Thus we arrive at

Theorem 4.3.1. Let $z_1, z_2, \ldots, z_n$, $z_{n+1} = z_1$ be complex numbers with $\sum_{k=1}^{n} z_k = 0$. Then

(4.3.12) $\sum_{k=1}^{n} |z_{k+1} - z_k|^2 \ge 4\sin^2\dfrac{\pi}{n}\sum_{k=1}^{n} |z_k|^2$.

Equality in (4.3.12) holds if and only if

(4.3.13) $z_k = \alpha w^{k-1} + \beta\bar{w}^{k-1}$, $k = 1, 2, \ldots, n$,

for constants $\alpha$, $\beta$.

Proof.

$Q(Z) = Z^*(I - \pi)^*(I - \pi)Z = \sum_{k=1}^{n} |z_{k+1} - z_k|^2$.

The eigenvalue of $Q$ of lowest value is 0; the corresponding eigenvector is $(1, 1, \ldots, 1)$. The eigenvalues next in size are paired; the eigenvectors are $(1, w, w^2, \ldots, w^{n-1})$ and $(1, w^{n-1}, w^{n-2}, \ldots, w)$ (second and last columns of $F^*$).

The inequality (4.3.12) goes by the name of the discrete inequality of Wirtinger.
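The discrete Wirtinger inequality and its equality case lend themselves to a quick numerical check. This is a minimal sketch with hypothetical function names, taking $w = e^{2\pi i/n}$; `wirtinger_gap` returns the left side of (4.3.12) minus the right side, which should be nonnegative for centered data and zero for vectors of the form (4.3.13).

```python
import cmath
import math

def side_squares(z):
    """Sum of |z_{k+1} - z_k|**2 with cyclic indexing."""
    n = len(z)
    return sum(abs(z[(k + 1) % n] - z[k]) ** 2 for k in range(n))

def wirtinger_gap(z):
    """Left side minus right side of the discrete Wirtinger inequality."""
    n = len(z)
    return side_squares(z) - 4 * math.sin(math.pi / n) ** 2 * sum(abs(v) ** 2 for v in z)

def equality_case(n, alpha, beta):
    """z_k = alpha * w**k + beta * conj(w)**k, k = 0..n-1."""
    w = cmath.exp(2j * math.pi / n)
    return [alpha * w ** k + beta * w ** (-k) for k in range(n)]

centered = [3, -1 + 2j, -2 - 1j, -1j]   # components sum to zero
print(wirtinger_gap(centered))                       # about 4: strict inequality
print(wirtinger_gap(equality_case(7, 2, 1 - 1j)))    # about 0: equality
```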

For upper bounds we must obtain

$\lambda_{\max} = \max_j 4\sin^2\dfrac{(j-1)\pi}{n}$.

For $n = 2p$, one has $\lambda_{\max} = 4$, occurring when $j = p + 1$. For $n = 2p + 1$, one has $\lambda_{\max} = 4\sin^2(p\pi/n) = 4\cos^2(\pi/2n)$, occurring doubled when $j = p + 1$, $p + 2$.

This information may now be inserted in (4.3.5).

PROBLEMS

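1. Let $z_1, z_2, \ldots, z_n$ be complex numbers with $\sum_{k=1}^{n} z_k = 0$. For other integers $k$, define $z_k$ cyclically. Let $\Delta$ designate the difference operator ($\Delta z_k = z_{k+1} - z_k$, $\Delta^2z_k = \Delta(\Delta z_k)$, etc.). Then for all integers $p \ge 0$, use $(I - \pi)^p$ to prove that

$\sum_{k=1}^{n} |\Delta^pz_k|^2 \ge \left(4\sin^2\dfrac{\pi}{n}\right)^p \sum_{k=1}^{n} |z_k|^2$.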

2. For real $x_k$ write the Wirtinger inequality in the form

$\dfrac{2\pi}{n}\sum_{k=1}^{n} x_k^2 \le \left[\dfrac{2\pi/n}{2\sin(\pi/n)}\right]^2 \dfrac{2\pi}{n}\sum_{k=1}^{n} \left(\dfrac{x_{k+1} - x_k}{2\pi/n}\right)^2$.

Use this, together with $n \to \infty$, to prove that if $f(t)$ has period $2\pi$ and $\int_0^{2\pi} f(t)\,dt = 0$, then

$\int_0^{2\pi} f^2(t)\,dt \le \int_0^{2\pi} (f'(t))^2\,dt$.

What integrability conditions on $f(t)$ are required here? This is Wirtinger's integral inequality.

3. Let $z_k$, $k = 1, 2, \ldots, n$ be as in Problem 1. Prove that the $z_k$ are the real affine images of the vertices of a regular $n$-gon (see p. 123 for "affine").

4. Let $C$ be a circulant whose eigenvalues have equal moduli $\sigma$. Then, for all vectors $Z$, $\|CZ\| = \sigma\|Z\|$.

5. Prove that the field of values of any matrix is a convex set in the complex plane.

6. Prove that for any matrix, the convex hull of its eigenvalues is contained in the field of values.

4.4 NESTED n-GONS

(See Section 1.4.) Let $Z = (z_1, z_2, \ldots, z_n)^T$ designate the vertices of an $n$-gon and let the transformation $C$ ($= C_s$) be applied iteratively where

$C_s = sI + t\pi$, $s > 0$, $t > 0$, $s + t = 1$.

The eigenvalues of $C$ are $\lambda_k = s + tw^{k-1}$, $k = 1, 2, \ldots, n$. These numbers are strictly convex combinations of 1 and $w^{k-1}$. Hence, $\lambda_1 = 1$ and for $k = 2, \ldots, n$, one has $|\lambda_k| < 1$. See Figure 4.4.1. In fact, these numbers lie on a circle interior to and tangent to the unit circle at $z = 1$. One has

$|\lambda_k|^2 = s^2 + t^2 + 2st\cos\dfrac{2\pi(k-1)}{n}$.

Figure 4.4.1

It is clear that the eigenvalues of absolute value next in size to $\lambda_1 = 1$ are $\lambda_2$ and $\lambda_n$ ($= \bar{\lambda}_2$) for which

(4.4.3) $|\lambda_2|^2 = |\lambda_n|^2 = s^2 + t^2 + 2st\cos\dfrac{2\pi}{n}$.

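From (3.4.14) one has for $r = 0, 1, \ldots$,

(4.4.4) $C^rZ = B_1Z + \lambda_2^rB_2Z + \cdots + \lambda_n^rB_nZ$;

hence

(4.4.4') $\lim_{r\to\infty} C^rZ = B_1Z$.

Since from (3.4.13), $B_1 = (1/n)\operatorname{circ}(1, 1, \ldots, 1)$, $B_1Z = (1/n)(z_1 + z_2 + \cdots + z_n)(1, 1, \ldots, 1)^T$. Hence, as $r \to \infty$, each component of $C^rZ$ approaches the c.g. of $Z$ with geometric rapidity. It is useful, therefore, to assume that this c.g. is at $z = 0$, eliminating the first term in (4.4.4). Thus we assume that

(4.4.5) $z_1 + z_2 + \cdots + z_n = 0$.

Further asymptotic analysis may be carried out along the line of the power method in numerical analysis for the computation of matrix eigenvalues. Write

(4.4.6) $C^rZ = \lambda_2^rB_2Z + \lambda_n^rB_nZ + (\lambda_3^rB_3 + \cdots + \lambda_{n-1}^rB_{n-1})Z$.

Then, since $|\lambda_n| = |\lambda_2|$,

$\dfrac{C^rZ}{|\lambda_2|^r} = \left(\dfrac{\lambda_2}{|\lambda_2|}\right)^rB_2Z + \left(\dfrac{\lambda_n}{|\lambda_n|}\right)^rB_nZ + \left(\left(\dfrac{\lambda_3}{|\lambda_2|}\right)^rB_3 + \cdots + \left(\dfrac{\lambda_{n-1}}{|\lambda_2|}\right)^rB_{n-1}\right)Z$.

Now since $|\lambda_3|, |\lambda_4|, \ldots, |\lambda_{n-1}| < |\lambda_2|$, the term in the parentheses approaches 0 as $r \to \infty$. We designate it by $\epsilon(r)$. (It is a column vector.) Let

(4.4.8) $\lambda_2 = |\lambda_2|e^{i\theta}$, $\theta = \tan^{-1}\dfrac{t\sin(2\pi/n)}{s + t\cos(2\pi/n)}$, $\lambda_n = |\lambda_2|e^{-i\theta}$.

Therefore,

(4.4.9) $\dfrac{C^rZ}{|\lambda_2|^r} = e^{ir\theta}B_2Z + e^{-ir\theta}B_nZ + \epsilon(r)$.

Write

(4.4.10) $Y_r = e^{ir\theta}B_2Z + e^{-ir\theta}B_nZ$,

so that

$\dfrac{C^rZ}{|\lambda_2|^r} = Y_r + \epsilon(r)$.

Since from (3.4.9) $B_k = F^*\Lambda_kF$, we have

$Y_r = e^{ir\theta}B_2Z + e^{-ir\theta}B_nZ = F^*(e^{ir\theta}\Lambda_2 + e^{-ir\theta}\Lambda_n)\hat{Z} = F^*(0, e^{ir\theta}\hat{z}_2, 0, \ldots, 0, e^{-ir\theta}\hat{z}_n)^T$.

Hence

$\|Y_r\|^2 = |\hat{z}_2|^2 + |\hat{z}_n|^2$ = constant (as far as $r$ is concerned).

From this follows immediately that if the second and $n$th components of $FZ$, the Fourier transform of $Z$, are not both zero, then the $Y_r$ are a family of nonzero $n$-gons of constant moment of inertia.

In this case, then, the rate of convergence of $C^rZ$ is precisely $|\lambda_2|^r$, $r \to \infty$. Notice from (4.4.3) or Figure 4.4.1 that as $n \to \infty$, $|\lambda_2| \uparrow 1$, so that the more vertices in the $n$-gon, the slower the convergence.

The sequence of $n$-gons $C^rZ/|\lambda_2|^r$ will be called normalized, and the normalized $n$-gons "approach" the family $Y_r$. It is of some interest to look at the geometric nature of $Y_r$.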

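Lemma. Let $Z = (z_1, z_2, \ldots, z_n)^T$. Let

(4.4.12) $p_Z(u) = z_1 + z_2u + z_3u^2 + \cdots + z_nu^{n-1}$.

For $r = 1, 2, \ldots, n$, let $k = n - r + 1$, so that $w^k = \bar{w}^{r-1}$. Then

$B_rZ = \dfrac{1}{n}\,p_Z(w^k)\,(1, \bar{w}^k, \bar{w}^{2k}, \ldots, \bar{w}^{(n-1)k})^T$.

In particular,

(4.4.15) $B_2Z = \dfrac{1}{n}\,(p_Z(\bar{w}))\,(1, w, w^2, \ldots, w^{n-1})^T$,

$B_nZ = \dfrac{1}{n}\,(p_Z(w))\,(1, \bar{w}, \bar{w}^2, \ldots, \bar{w}^{n-1})^T$.

Proof. From (3.4.12), $B_r = (1/n)\operatorname{circ}(1, w^k, w^{2k}, \ldots, w^{(n-1)k})$. Hence each row of $B_r$ is the previous row multiplied by $\bar{w}^k$. The identities should now be obvious.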

Lemma. Let $z = x + iy$, $z' = x' + iy'$, $\tau_1$, $\tau_2$ complex. Then

(4.4.17) $z' = \tau_1z + \tau_2\bar{z}$

is an affine transformation of the $(x, y)$-plane. It is nonsingular if and only if $|\tau_1| \neq |\tau_2|$.

Proof. Write $\tau_1 = \xi_1 + i\eta_1$, $\tau_2 = \xi_2 + i\eta_2$, where the $\xi$'s and $\eta$'s are real. Then the transformation (4.4.17) can be written as

(4.4.18) $x' = (\xi_1 + \xi_2)x + (\eta_2 - \eta_1)y$,
$y' = (\eta_1 + \eta_2)x + (\xi_1 - \xi_2)y$.

This is an affine transformation of the $x, y$ plane. The determinant $\Delta$ of the transformation is

$\Delta = \xi_1^2 - \xi_2^2 + \eta_1^2 - \eta_2^2 = |\tau_1|^2 - |\tau_2|^2$,

so that $\Delta \neq 0$ if and only if $|\tau_1| \neq |\tau_2|$.

Theorem 4.4.1. If $|\hat{z}_2| \neq |\hat{z}_n|$, the $n$-gons $Y_r$ are nonzero, and of constant moment of inertia. They are the affine images of the regular unit polygon of $n$ sides, hence are convex.


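Proof. We have

$Y_r = \dfrac{1}{n}e^{ir\theta}p_Z(\bar{w})(1, w, \ldots, w^{n-1})^T + \dfrac{1}{n}e^{-ir\theta}p_Z(w)(1, \bar{w}, \ldots, \bar{w}^{n-1})^T$.

Hence if we write $\tau_1 = (1/n)e^{ir\theta}p_Z(\bar{w})$, $\tau_2 = (1/n)e^{-ir\theta}p_Z(w)$, the vertices in $Y_r$ are the images of $(1, w, w^2, \ldots, w^{n-1})$ under $z' = \tau_1z + \tau_2\bar{z}$. Since $p_Z(\bar{w}) = \sqrt{n}\,\hat{z}_2$ and $p_Z(w) = \sqrt{n}\,\hat{z}_n$, it follows that $|\tau_1| \neq |\tau_2|$. This is a nonsingular affine transformation and all such transformations send convex figures into convex figures.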
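The lemma on $z' = \tau_1z + \tau_2\bar{z}$ can be verified by expanding real and imaginary parts. The sketch below, with hypothetical function names, builds the real $2 \times 2$ matrix obtained from that expansion and checks it against complex arithmetic, together with the determinant $|\tau_1|^2 - |\tau_2|^2$.

```python
def affine_matrix(t1, t2):
    """Real 2x2 matrix of z -> t1*z + t2*conj(z), acting on the column (x, y)."""
    x1, e1 = t1.real, t1.imag
    x2, e2 = t2.real, t2.imag
    return ((x1 + x2, e2 - e1),
            (e1 + e2, x1 - x2))

def apply_matrix(m, x, y):
    return (m[0][0] * x + m[0][1] * y, m[1][0] * x + m[1][1] * y)

def determinant(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

t1, t2 = 2 + 1j, 0.5 - 3j
z = 1.5 - 2j
image = t1 * z + t2 * z.conjugate()
m = affine_matrix(t1, t2)

print(apply_matrix(m, z.real, z.imag), (image.real, image.imag))  # same point twice
print(determinant(m), abs(t1) ** 2 - abs(t2) ** 2)                # both about -4.25
```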

For further analysis, one makes the assumption that $\theta$ is a rational multiple of $2\pi$. In this case, one can identify limits of subsequences of the normalized figures $C^rZ/|\lambda_2|^r$, $r = 0, 1, 2, \ldots$.

Instead of working generally, we shall assume that

$s = t = \tfrac{1}{2}$.

This leads immediately to

(4.4.20) $|\lambda_2| = \cos\dfrac{\pi}{n}$, $\theta = \dfrac{\pi}{n}$,

so that (4.4.9) becomes

(4.4.21) $\dfrac{C^rZ}{(\cos \pi/n)^r} = e^{\pi ir/n}B_2Z + e^{-\pi ir/n}B_nZ + \epsilon(r)$.

Let now

(4.4.22) $r = 2jn + b$, $0 \le b \le 2n - 1$, $j = 0, 1, \ldots$.

Then (4.4.21) becomes

(4.4.23) $\dfrac{C^{2jn+b}Z}{(\cos \pi/n)^{2jn+b}} = e^{\pi ib/n}B_2Z + e^{-\pi ib/n}B_nZ + \epsilon(2jn + b)$.

Writing

(4.4.24) $U_b = e^{\pi ib/n}B_2Z + e^{-\pi ib/n}B_nZ$,

one now has

(4.4.25) $\lim_{j\to\infty} \dfrac{C^{2jn+b}Z}{(\cos \pi/n)^{2jn+b}} = U_b$, $b = 0, 1, 2, \ldots, 2n - 1$,

so that the normalized $n$-gons approach $2n$ limiting $n$-gons, each of which is an affine transform of a regular $n$-gon. See Figure 4.4.2.
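The behavior for $s = t = \tfrac12$ can be observed numerically. The following sketch, with hypothetical names and the assumed convention $\hat{Z} = FZ$ with $F$ having entries $\bar{w}^{(j-1)(k-1)}/\sqrt{n}$, iterates the midpoint map on a centered pentagon, normalizes by $(\cos \pi/n)^r$, and compares the resulting norm with $(|\hat{z}_2|^2 + |\hat{z}_n|^2)^{1/2}$. A moderate $r$ is used so that rounding errors in the mean component stay negligible.

```python
import cmath
import math

def midpoint_step(z):
    """One application of C = (I + pi)/2: each vertex moves to the midpoint of a side."""
    n = len(z)
    return [(z[k] + z[(k + 1) % n]) / 2 for k in range(n)]

def fourier(z):
    """Z-hat = F Z with F[j][k] = conj(w)**(j*k) / sqrt(n) (assumed convention)."""
    n = len(z)
    w = cmath.exp(2j * math.pi / n)
    return [sum(w ** (-j * k) * z[k] for k in range(n)) / n ** 0.5 for j in range(n)]

def normalized_norm(z, r):
    """Euclidean norm of C**r Z divided by cos(pi/n)**r."""
    n = len(z)
    for _ in range(r):
        z = midpoint_step(z)
    return math.sqrt(sum(abs(v) ** 2 for v in z)) / math.cos(math.pi / n) ** r

def limiting_norm(z):
    """sqrt(|z-hat_2|**2 + |z-hat_n|**2) in the 1-based numbering of the text."""
    zh = fourier(z)
    return math.sqrt(abs(zh[1]) ** 2 + abs(zh[-1]) ** 2)

pentagon = [4, 1 + 3j, -2 + 1j, -3 - 2j, -2j]   # vertices sum to zero
print(normalized_norm(pentagon, 40), limiting_norm(pentagon))  # nearly equal
```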

PROBLEMS

1. Prove that if $|\hat{z}_2| \neq |\hat{z}_n|$, the sequence of corresponding normalized vertices of the nested $n$-gons $r = 0, 1, 2, \ldots$ lie asymptotically on an ellipse.

2. Analyze what happens when $Z$ is taken as the vertices of a regular polygon.

3. Take $Z = (1, w^2, w^4, w, w^3)^T$, $w^5 = 1$ (a regular pentagram). What happens under $C_{1/2}$? Do the successive iterates ever become convex?

4. Analyze what happens when $Z$ is taken as the affine image of a regular polygon.

5. Let $C_{r^{-1}} = \operatorname{circ}(1/r, 1 - (1/r), 0, 0, \ldots, 0)$, $r = 1, 2, 3, \ldots$. Discuss $\prod_{r=1}^{\infty} C_{r^{-1}}$, and apply it to nested $n$-gons.

Figure 4.4.2

4.5 SMOOTHING AND VARIATION REDUCTION

The smoothing or filtering of data is a common operation and is worthy of discussion within the present framework. We assume that we have a finite sequence of data values $Z = (z_1, \ldots, z_n)$ and we subject the data to a linear transformation with matrix $A$:

(4.5.1) $\hat{Z} = AZ$.

What properties of the matrix $A$ will be required for smoothing? Numerous definitions have been put forward. Greville has proposed the following. A matrix $A$ will be called smoothing if:

(1) $A$ has $\lambda = 1$ as an eigenvalue,

(2) $A^{\infty} = \lim_{p\to\infty} A^p$ exists.

The rationale behind this definition is as follows. The eigenspace $S$ of vectors corresponding to $\lambda = 1$ has the property that if $Z \in S$, $AZ = Z$. Call $S$ the set of smooth vectors. Then vectors that are already smooth are unaffected by the operation $A$. Now take any vector $Z$ and "smooth" it over and over again by applying $A$. Then this will approach $A^{\infty}Z$. Now since $A(A^{\infty}Z) = A^{\infty}Z$, $A^{\infty}Z \in S$, hence it is a smooth vector.

Referring to Theorem 3.6.2, we see that the necessary and sufficient condition for $A$ to be smoothing in the sense of Greville is that:

(1) $\lambda = 1$ be an eigenvalue of $A$.

(2) $\lambda = 1$ be a simple root of the minimal polynomial of $A$ and if $\lambda \neq 1$ is an eigenvalue, then $|\lambda| < 1$.

If $A$ is a circulant then the criterion simplifies somewhat.

Theorem 4.5.1. A circulant $C$ is a smoothing operator if and only if

(1) $\lambda = 1$ is an eigenvalue of $C$.

(2) If $\lambda \neq 1$ is an eigenvalue of $C$, then $|\lambda| < 1$.
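The criterion of Theorem 4.5.1 is straightforward to test on the eigenvalues $p(w^{j})$ of a circulant. The sketch below is an illustration only: the function names and the numerical tolerance `tol` are assumptions, since exact comparison with 1 is not meaningful in floating point.

```python
import cmath

def circ_eigenvalues(c):
    """Eigenvalues p(w**j), j = 0..n-1, of circ(c), with w = exp(2*pi*i/n)."""
    n = len(c)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(c[k] * w ** (j * k) for k in range(n)) for j in range(n)]

def is_smoothing(c, tol=1e-12):
    """True if 1 is an eigenvalue of circ(c) and every other eigenvalue has modulus < 1."""
    lam = circ_eigenvalues(c)
    has_one = any(abs(v - 1) <= tol for v in lam)
    others_small = all(abs(v) < 1 - tol for v in lam if abs(v - 1) > tol)
    return has_one and others_small

print(is_smoothing([0.5, 0.5, 0, 0]))  # True: sI + t*pi with s = t = 1/2
print(is_smoothing([0, 1, 0, 0]))      # False: pi has other unimodular eigenvalues
print(is_smoothing([0.5, 0, 0, 0]))    # False: 1 is not an eigenvalue
```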


have for real $\mu_k \ge 0$, $D = \mathrm{diag}(\mu_1, \ldots, \mu_n)$ and unitary $U$, $A^*A = U^*DU$. Hence $\eta^2 I - A^*A = U^*(\eta^2 I - D)U$. So the eigenvalues of $\eta^2 I - A^*A$ are $\eta^2 - \mu_k$. Thus $0 \le \mu_k \le \eta^2$ is necessary and sufficient.

Corollary. $\|AZ\| \le \eta\|Z\|$ for all $Z$ if and only if $\rho(A^*A) \le \eta^2$.

If $0 \le \eta \le 1$, condition (4.5.6) may be described by saying that $A$ is norm reducing (more strictly: norm nonincreasing). If $0 < \eta < 1$, $A$ is a contraction. [A contraction generally means that (4.5.6) is valid with $0 \le \eta < 1$ where $\|\cdot\|$ can be taken to be any vector norm.]

Lemma. Let $M_k$, $k = 1, 2, \ldots$, be a sequence of matrices. Then

(a) $\lim_{k\to\infty} M_kZ = 0$ for all $Z$, if and only if

(b) $\lim_{k\to\infty} M_k = 0$.

Proof. Using a compatible matrix norm $\|M\|$, one has $\|M_kZ\| \le \|M_k\|\,\|Z\|$. Now $\lim_{k\to\infty} M_k = 0$ if and only if $\lim_{k\to\infty}\|M_k\| = 0$. Hence (b) implies (a). Conversely, (b) follows from (a) if, in (a), one selects $Z$ successively as all the unit vectors.

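Theorem 4.5.2. Let $M_k$, $k = 1, 2, \ldots$, be a sequence of matrices and set $\sigma_k = \rho(M_k^*M_k)$ = spectral radius of $M_k^*M_k$. Let

(4.5.8) $\lim_{r\to\infty}\prod_{k=1}^{r}\sigma_k = 0.$

Then

(4.5.9) $\lim_{r\to\infty} M_rM_{r-1}\cdots M_1Z = 0$

for all $Z$, hence

(4.5.10) $\lim_{r\to\infty} M_rM_{r-1}\cdots M_1 = 0.$

Proof. From the previous corollary, $\|M_kZ\|^2 \le \sigma_k\|Z\|^2$ for every $Z$, so that $\|M_rM_{r-1}\cdots M_1Z\|^2 \le \sigma_r\sigma_{r-1}\cdots\sigma_1\|Z\|^2$, which tends to 0 by (4.5.8). The second statement then follows from the lemma.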

If we wish to obtain a condition such as (4.5.7) or (4.5.8) directly on the eigenvalues of M (and not on those of M*M), it is convenient to hypothesize that M is normal.

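For in this case $M = U^*\mathrm{diag}(\lambda_1, \ldots, \lambda_n)U$ so that $M^*M = U^*\mathrm{diag}(\bar\lambda_1\lambda_1, \bar\lambda_2\lambda_2, \ldots, \bar\lambda_n\lambda_n)U$, and the eigenvalues of $M^*M$ are precisely $|\lambda_1|^2, |\lambda_2|^2, \ldots, |\lambda_n|^2$. In this way we are led to our next result.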

Theorem 4.5.3. Let $M_k$, $k = 1, 2, \ldots$, be a sequence of normal matrices. Assume that

(4.5.11) $\prod_{k=1}^{\infty}\rho(M_k) = 0.$

Then

(4.5.12) $\prod_{k=1}^{\infty} M_k = 0.$

In the case of a sequence of circulants, see corollary to Theorem 3.6.1 for a stronger statement.

We return now to the inequality $\|AZ\|^2 \le \|BZ\|^2$. We have already seen that a necessary and sufficient condition for this is that $B^*B - A^*A$ be positive semidefinite. We should like to be able to "decouple" the matrices $A$ and $B$. To this end, we make the hypothesis that $A$ and $B$ are normal and commute. (Recall that this means that $A^*A = AA^*$, $B^*B = BB^*$, $AB = BA$.) Such pairs of matrices are remarkable in that they are simultaneously unitarily diagonalizable. We shall now prove this basic fact.

Theorem 4.5.4. Let $A$ and $B$ be square matrices of the same order. Then $A$ and $B$ are normal and commute if and only if they are simultaneously diagonalizable by one and the same unitary matrix.

Proof. "If." Let $A = U^*D_1U$, $B = U^*D_2U$ where $U$ is unitary and $D_1$, $D_2$ are diagonal. Then $A^*A = U^*\bar D_1UU^*D_1U = U^*\bar D_1D_1U = U^*D_1\bar D_1U = AA^*$ so that $A$ is normal. Similarly for $B$. Now $AB = U^*D_1UU^*D_2U = U^*D_1D_2U = U^*D_2D_1U = BA$.

"Only if." Assume that $A$, $B$ are normal and commute. Since $A$ is normal, we have for some unitary $U$ and diagonal $D$, $A = U^*DU$. Since $AB = BA$, we have $U^*DUB = BU^*DU$. Hence $D(UBU^*) = (UBU^*)D$. Set $C = UBU^*$. Hence $B = U^*CU$. Then $DC = CD$. Write

$D = \mu_1I_{a_1} \oplus \mu_2I_{a_2} \oplus \cdots \oplus \mu_sI_{a_s}$

where $\mu_1, \mu_2, \ldots, \mu_s$ are distinct and where $\mu_1$ is repeated $a_1$ times, ..., $\mu_s$ is repeated $a_s$ times, $a_1 + a_2 + \cdots + a_s = n$. This displays the possible multiplicities of the eigenvalues of $A$. If now $C = (c_{jk})$ and $d_j$ denotes the $j$th diagonal element of $D$, then $DC = CD$ implies $d_jc_{jk} = d_kc_{jk}$, $j, k = 1, 2, \ldots, n$. Therefore

if $d_j \ne d_k$ then $c_{jk} = 0$,

if $d_j = d_k$ then $c_{jk}$ = arbitrary.

Therefore $C$ must be of the form $C = C_1 \oplus C_2 \oplus \cdots \oplus C_s$ where $C_r$ is of order $a_r$ and is arbitrary. Since $B$ is normal, so is $C$. Since $C$ is normal, so is each $C_k$, $k = 1, 2, \ldots, s$ (as is easily established). Hence for appropriate unitary $V_k$ and diagonal $\Lambda_k$ of order $a_k$, we have $C_k = V_k^*\Lambda_kV_k$. Thus,

$B = U^*CU = U^*V^*\Lambda VU$

where

$V = V_1 \oplus V_2 \oplus \cdots \oplus V_s$,

$\Lambda = \Lambda_1 \oplus \Lambda_2 \oplus \cdots \oplus \Lambda_s$.

Now

$A = U^*DU = U^*(\mu_1V_1^*V_1 \oplus \mu_2V_2^*V_2 \oplus \cdots \oplus \mu_sV_s^*V_s)U$

$= U^*(V_1^* \oplus V_2^* \oplus \cdots \oplus V_s^*)(\mu_1I_{a_1} \oplus \cdots \oplus \mu_sI_{a_s})(V_1 \oplus V_2 \oplus \cdots \oplus V_s)U$

$= U^*V^*DVU.$

Therefore $VU$ diagonalizes $A$ and $B$. It is easily verified that $VU$ is unitary.

Theorem 4.5.5. Let $A$ and $B$ be normal and commute. Then $\|AZ\| \le \|BZ\|$ for all $Z$ if and only if there is an ordering of the eigenvalues of $A$ and $B$, $\lambda_1, \lambda_2, \ldots, \lambda_n$; $\mu_1, \mu_2, \ldots, \mu_n$ (under a simultaneous diagonalization) such that

(4.5.13) $|\lambda_k| \le |\mu_k|, \quad k = 1, 2, \ldots, n.$

Proof. Let $A$ and $B$ be normal and commute. Then we can find a unitary $U$ such that $A = U^*\mathrm{diag}(\lambda_1, \ldots, \lambda_n)U$, $B = U^*\mathrm{diag}(\mu_1, \ldots, \mu_n)U$. Hence $B^*B - A^*A = U^*\mathrm{diag}(|\mu_1|^2 - |\lambda_1|^2, |\mu_2|^2 - |\lambda_2|^2, \ldots, |\mu_n|^2 - |\lambda_n|^2)U$. Condition (4.5.13) is now equivalent to the positive semidefiniteness of $B^*B - A^*A$.

Corollary. If $A$ and $B$ are circulants, then (4.5.13) is necessary and sufficient for $\|AZ\| \le \|BZ\|$ for all $Z$.

Proof. Circulants are normal and commute.

In dealing with pairs of matrices that are normal and commute, it is useful to assume that their eigenvalues have been ordered so as to be consistent with the simultaneous diagonalization by unitary U.

Let M be a square matrix. We shall call a matrix A M-reducing if

(4.5.14) $\|MAZ\| \le \|MZ\|$ for all $Z$.

Theorem 4.5.6

(a) $A$ is $M$-reducing if and only if $M^*M - (MA)^*(MA)$ is positive semidefinite.

(b) Let $A$ and $M$ be normal and commute. Let $\lambda_1, \ldots, \lambda_n$; $\mu_1, \ldots, \mu_n$ be the eigenvalues of $A$ and $M$. Let $J_M$ be the set of integers $r = 1, 2, \ldots, n$ for which $\mu_r \ne 0$. Then a necessary and sufficient condition that $A$ be $M$-reducing is that

(4.5.15) $|\lambda_k| \le 1$ for $k \in J_M$.

Proof. Under the hypothesis, there is a unitary $U$ such that $A = U^*\mathrm{diag}(\lambda_1, \ldots, \lambda_n)U$, $M = U^*\mathrm{diag}(\mu_1, \ldots, \mu_n)U$. Therefore $T = M^*M - (MA)^*(MA) = U^*\mathrm{diag}(\bar\mu_k\mu_k - \bar\lambda_k\lambda_k\bar\mu_k\mu_k)U$. Hence the condition for positive semidefiniteness of $T$ is $|\mu_k|^2(1 - |\lambda_k|^2) \ge 0$, $k = 1, 2, \ldots, n$. This is equivalent to (4.5.15).


Corollary. $A$ is variation reducing [see (4.5.5)] if and only if $(I - \pi)^*(I - \pi) - ((I - \pi)A)^*((I - \pi)A)$ is positive semidefinite.

Proof. Set $M = I - \pi$.

Corollary. Let $A$ be a circulant with eigenvalues $\lambda_1, \ldots, \lambda_n$. Then a necessary and sufficient condition that $A$ be variation reducing is that

$|\lambda_k| \le 1, \quad k = 2, 3, \ldots, n.$

Proof. The eigenvalues of $M = I - \pi$ are $1 - w^{j-1}$, $j = 1, 2, \ldots, n$. Hence $J_{I-\pi} = \{2, 3, \ldots, n\}$.
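The last corollary can be checked on small examples. The sketch below assumes, following the corollary with $M = I - \pi$, that the variation of $Z$ is measured by $\|(I - \pi)Z\|$, the Euclidean norm of the cyclic first differences; the helper names (`circulant_eigenvalues`, `is_variation_reducing`, `circ_apply`, `variation`) are hypothetical and chosen only for illustration.

```python
import cmath

def circulant_eigenvalues(first_row):
    """Eigenvalues p(w^k), k = 0..n-1, of circ(first_row)."""
    n = len(first_row)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(c * w ** (j * k) for j, c in enumerate(first_row))
            for k in range(n)]

def is_variation_reducing(first_row, tol=1e-12):
    """|lambda_k| <= 1 for k = 2, ..., n; lambda_1 is not constrained."""
    eigs = circulant_eigenvalues(first_row)
    return all(abs(lam) <= 1 + tol for lam in eigs[1:])

def circ_apply(first_row, z):
    """Product circ(first_row) z; entry (i, j) of the matrix is c_{(j-i) mod n}."""
    n = len(z)
    return [sum(first_row[(j - i) % n] * z[j] for j in range(n))
            for i in range(n)]

def variation(z):
    """|| (I - pi) z ||, i.e. the norm of the cyclic differences z_i - z_{i+1}."""
    n = len(z)
    return sum(abs(z[i] - z[(i + 1) % n]) ** 2 for i in range(n)) ** 0.5

A = [0.25, 0.5, 0.25, 0.0]
Z = [1.0, 5.0, -2.0, 0.5]
print(is_variation_reducing(A))
print(variation(circ_apply(A, Z)) <= variation(Z))
print(is_variation_reducing([2.0, -1.0, 0.0, 0.0]))
```

Note that `circ(1, 1, 1, 1)` also passes the test even though its first eigenvalue is 4: the condition leaves $\lambda_1$ free, so a variation-reducing circulant need not be norm reducing.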

PROBLEM

1. Consider the nonautonomous system of difference equations $Z_{n+1} = G_nZ_n$ where

Show that $\rho(G_n) < 1$, but the sequence $Z_n$ may diverge. (Markus-Yamabe, discretized.)

4.6 APPLICATIONS TO ELEMENTARY PLANE GEOMETRY: n-GONS AND K_r-GRAMS

We begin with two theorems from elementary plane geometry.

Theorem A. Let $z_1, z_2, z_3, z_4$ be the vertices of a quadrilateral. Connect the midpoints of the sides cyclically. Then the figure that results is always a parallelogram (Figure 4.6.1). Write $P = (z_1, z_2, z_3, z_4)^T$, $C_{1/2} = \mathrm{circ}(1/2, 1/2, 0, 0)$. This means that $C_{1/2}P$ is always a parallelogram. Hence the transformation $C_{1/2}$ is not invertible. (For if it were, there would be quadrilaterals whose midpoint quadrilaterals would be arbitrary.)

Figure 4.6.1

Theorem B. Given any triangle, erect upon its sides outwardly (or inwardly) equilateral triangles. Then the centers of the three equilateral triangles form an equilateral triangle (see Figure 4.6.2). This is known as Napoleon's theorem.

Figure 4.6.2


Our object is now to unify and generalize these two theorems by means of circulant transforms and to derive extremal properties of certain familiar geometrical configurations by means of the M-P inverses of relevant circulants.

Let us first find simple characterizations for equilateral triangles and parallelograms. Let $z_1, z_2, z_3$ be the vertices of a triangle $T$ in counterclockwise order. Then $T$ is equilateral if and only if

(4.6.1a) $z_1 + wz_2 + w^2z_3 = 0, \quad w = \exp(2\pi i/3),$

while

(4.6.1b) $z_1 + w^2z_2 + wz_3 = 0$

is necessary and sufficient for clockwise equilaterality. The proof is easily derived from the fact that $z_1, z_2, z_3$ are counterclockwise equilateral if and only if they are the images under $z \to a + bz$ of $1, w, w^2$; that is, if and only if for some $a$, $b$, $z_1 = a + b$, $z_2 = a + bw$, $z_3 = a + bw^2$. Of course, if $b = 0$, the three points degenerate to a single point. The center of the triangle is defined to be $z = a = \mathrm{c.g.}(z_1, z_2, z_3)$.

Let $z_1, z_2, z_3, z_4$ be a non-self-intersecting quadrilateral $Q$ given counterclockwise. Then $Q$ is a parallelogram if and only if

(4.6.2) $z_1 - z_2 + z_3 - z_4 = 0.$
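Conditions (4.6.1a), (4.6.1b), and (4.6.2) are single linear equations in the vertices, so Theorem A can be checked directly with complex arithmetic. The following sketch uses hypothetical helper names and an arbitrary sample quadrilateral; the tolerance `tol` is an assumption needed for floating point.

```python
import cmath

W3 = cmath.exp(2j * cmath.pi / 3)  # w = exp(2*pi*i/3)

def is_ccw_equilateral(z1, z2, z3, tol=1e-9):
    """Condition (4.6.1a): z1 + w z2 + w^2 z3 = 0."""
    return abs(z1 + W3 * z2 + W3 ** 2 * z3) < tol

def is_cw_equilateral(z1, z2, z3, tol=1e-9):
    """Condition (4.6.1b): z1 + w^2 z2 + w z3 = 0."""
    return abs(z1 + W3 ** 2 * z2 + W3 * z3) < tol

def is_parallelogram(z1, z2, z3, z4, tol=1e-9):
    """Condition (4.6.2): z1 - z2 + z3 - z4 = 0."""
    return abs(z1 - z2 + z3 - z4) < tol

def midpoint_polygon(vertices):
    """Apply circ(1/2, 1/2, 0, ..., 0): join the midpoints of successive sides."""
    n = len(vertices)
    return [(vertices[k] + vertices[(k + 1) % n]) / 2 for k in range(n)]

quad = [0 + 0j, 4 + 1j, 5 + 4j, -1 + 2j]      # an arbitrary sample quadrilateral
print(is_parallelogram(*quad))                    # the quadrilateral itself
print(is_parallelogram(*midpoint_polygon(quad)))  # its midpoint quadrilateral
print(is_ccw_equilateral(1, W3, W3 ** 2))
print(is_cw_equilateral(1, W3 ** 2, W3))
```

The alternating sum of the midpoints telescopes to zero for every quadrilateral, which is exactly the statement of Theorem A.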

This is readily established. For integer $n \ge 3$ and integer $r$ set $w = \exp(2\pi i/n)$ and set

(4.6.3) $K_r = \frac{1}{n}\,\mathrm{circ}(1, w^r, w^{2r}, \ldots, w^{(n-1)r}).$

Notice that the rows of $K_r$ are identical to the first row $1, w^r, \ldots, w^{(n-1)r}$, multiplied by some power of $w$. In particular, one has

(4.6.4) $n = 3$, $r = 1$: $K_1 = \frac{1}{3}\,\mathrm{circ}(1, w, w^2)$, $w = \exp(2\pi i/3)$,

(4.6.5) $n = 4$, $r = 2$: $K_2 = \frac{1}{4}\,\mathrm{circ}(1, -1, 1, -1)$, $w = \exp(2\pi i/4) = i.$

We see from (4.6.1) and (4.6.2) that $P$ is equilateral or a parallelogram (interpreted properly) if and only if $KP = 0$, that is, if and only if $P$ lies in the null space of $K$. This leads to the definition

Definition. An $n$-gon $P = (z_1, z_2, \ldots, z_n)^T$ will be called a $K_r$-gram if and only if

(4.6.6a) $K_rP = 0$

or equivalently if and only if

(4.6.6b) $z_1 + w^rz_2 + w^{2r}z_3 + \cdots + w^{(n-1)r}z_n = 0.$

The representer polynomial for $K_r$ is $p(z) = (1/n)(1 + w^rz + w^{2r}z^2 + \cdots + w^{(n-1)r}z^{n-1}) = ((w^rz)^n - 1)/(n(w^rz - 1))$. The eigenvalues of $K_r$ are $p(w^{j-1})$, $j = 1, 2, \ldots, n$. Now for $j - 1 \ne n - r$, $p(w^{j-1}) = 0$, while $p(w^{n-r}) = 1$. Thus if

(4.6.7) $j \equiv n - r + 1 \pmod n,$

then $K_r = F^*\mathrm{diag}(0, 0, \ldots, 0, 1, 0, \ldots, 0)F$, the 1 occurring in the $j$th position. This means that

(4.6.8) $K_r = F^*\Lambda_jF = B_j$ [see (3.4.9)].

The $B_j$ are the principal idempotents of all circulants of order $n$. We have [see after (3.4.10)]

$B_jB_k = 0$ for $j \ne k$, $\quad B_j^2 = B_j$, $\quad B_1 + B_2 + \cdots + B_n = I.$

If $C$ is a circulant of rank $n - 1$, then by (3.3.13), for some integer $j$, $1 \le j \le n$,

(4.6.9) $CC^{\div} = C^{\div}C = I - B_j = I - K_r.$

From (4.6.8), (4.6.9), and Section 2.8.2, properties (1) and (2),

(4.6.10) $K_rC = CK_r = 0, \quad K_rC^{\div} = C^{\div}K_r = 0.$

Several more identities will be of use. Again, let $K_r = (1/n)\,\mathrm{circ}(1, w^r, w^{2r}, \ldots, w^{(n-1)r})$. Let $Y$ be an arbitrary circulant so that one can write $Y = F^*\mathrm{diag}(\eta_1, \eta_2, \ldots, \eta_n)F$ for appropriate $\eta_i$. Now $K_rY = (F^*\Lambda_jF)(F^*\mathrm{diag}(\eta_1, \ldots, \eta_n)F) = F^*\mathrm{diag}(0, \ldots, 0, \eta_j, 0, \ldots, 0)F = \eta_jF^*\Lambda_jF = \eta_jK_r$. Thus

(4.6.11) $K_rY = \eta_jK_r.$

In particular, if $Y$ is merely a column vector $Y = (y_0, y_1, \ldots, y_{n-1})^T$, then

(4.6.12) $K_rY = \eta_j\,\mathrm{fc}(K_r)$

where the notation $\mathrm{fc}(K_r)$ designates the first column of $K_r$. One also has

(4.6.13) $K_rY = \frac{\sigma}{n}\,(1, w^{(n-1)r}, w^{(n-2)r}, \ldots, w^r)^T$

where

(4.6.14) $\sigma = y_0 + y_1w^r + \cdots + y_{n-1}w^{(n-1)r}.$

Let $Y$ be further specialized to $Y = \mathrm{fc}(K_r)$. Then $Y = (1/n)(1, w^{(n-1)r}, w^{(n-2)r}, \ldots, w^r)^T$. Therefore from (4.6.14), $\sigma = 1$, and from (4.6.13)

(4.6.15) $K_r\,\mathrm{fc}(K_r) = \mathrm{fc}(K_r).$
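The identities (4.6.13) to (4.6.15), together with the idempotence of $K_r$, can be confirmed numerically. The sketch below builds $K_r$ entrywise, assuming entry $(i, j)$ (counted from 0) equals $w^{r(j-i)}/n$; the helper names, the choice $n = 5$, $r = 2$, and the sample vector `Y` are illustrative only.

```python
import cmath

def K(n, r):
    """K_r = (1/n) circ(1, w^r, ..., w^((n-1)r)) as a list of rows."""
    w = cmath.exp(2j * cmath.pi / n)
    return [[w ** (r * ((j - i) % n)) / n for j in range(n)] for i in range(n)]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def close(u, v, tol=1e-9):
    return all(abs(a - b) < tol for a, b in zip(u, v))

n, r = 5, 2
w = cmath.exp(2j * cmath.pi / n)
Kr = K(n, r)
fc = [row[0] for row in Kr]                 # first column of K_r
Y = [1 + 2j, -3, 0.5j, 4, 2 - 1j]           # arbitrary column vector

sigma = sum(y * w ** (r * k) for k, y in enumerate(Y))              # (4.6.14)
rhs = [sigma / n * w ** (r * ((n - i) % n)) for i in range(n)]      # (4.6.13)

print(close(matvec(Kr, Y), rhs))                                    # (4.6.13)
print(close(matvec(Kr, fc), fc))                                    # (4.6.15)
print(all(close(a, b) for a, b in zip(matmul(Kr, Kr), Kr)))         # K_r^2 = K_r
```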

Each circulant $C$ of rank $n - 1$ determines an integer $j$ uniquely, and through (3.3.13) and (4.6.9) a matrix $K_r$, hence a class of $K_r$-grams. In the following theorems this determination will be assumed.

Theorem 4.6.1. Let $P$ be an $n$-gon. Then there exists an $n$-gon $\hat P$ such that $C\hat P = P$ if and only if $P$ is a $K_r$-gram.

Proof. The system of equations $C\hat P = P$ has a solution if and only if $P = CC^{\div}P$. This is equivalent to $P = (I - K_r)P = P - K_rP$ or $K_rP = 0$ [by (4.6.9)].

Corollary. Let $P$ be a $K_r$-gram. Then the general solution to $C\hat P = P$ is given by

(4.6.16) $\hat P = C^{\div}P + \tau\,\mathrm{fc}(K_r)$

for an arbitrary constant $\tau$.

Proof. If $P$ is a $K_r$-gram, then the general solution to $C\hat P = P$ is given by $\hat P = C^{\div}P + (I - C^{\div}C)Y = C^{\div}P + K_rY$ for an arbitrary column vector $Y$. From (4.6.12), $K_rY = \eta_j\,\mathrm{fc}(K_r)$ and the statement follows.

Corollary. $P$ is a $K_r$-gram if and only if there is an $n$-gon $Q$ such that $P = CQ$.

Proof. Let $P = CQ$. Then $K_rP = K_rCQ$. Since $K_rC = 0$, it follows that $K_rP = 0$ so that $P$ is a $K_r$-gram. Conversely, let $P$ be a $K_r$-gram. Now take for $Q$ any $\hat P$ whose existence is guaranteed by the previous corollary.

Corollary. Given an $n$-gon $P$ which is a $K_r$-gram. Then given an arbitrary complex number $\hat z_1$, we can find a unique $n$-gon $\hat P = (\hat z_1, \hat z_2, \ldots, \hat z_n)^T$, with $\hat z_1$ as its first vertex and such that $C\hat P = P$.

Proof. Since the general solution of $C\hat P = P$ is $\hat P = C^{\div}P + \tau\,\mathrm{fc}(K_r)$, given $\hat z_1$, we may solve uniquely for an appropriate $\tau$ since the first component of $\mathrm{fc}(K_r)$ is $1/n$ ($\ne 0$).

Theorem 4.6.2. Let $P$ be an $n$-gon which is a $K_r$-gram. Then there is a unique $n$-gon $Q$ which is a $K_r$-gram and such that $CQ = P$. It is given by $Q = C^{\div}P$.

Proof. (a) Since $P$ is a $K_r$-gram, it has the form $P = CR$ for some $R$. Hence $Q = C^{\div}P = C^{\div}CR = C(C^{\div}R)$. Hence $Q$ is a $K_r$-gram.

(b) $Q$ is a solution of $CQ = P$, as we can see by selecting $\tau = 0$ in the above.

(c) All solutions are of the form $\hat P = C^{\div}P + \tau\,\mathrm{fc}(K_r)$. Now $\hat P$ is a $K_r$-gram if and only if $K_r\hat P = 0$. That is, if and only if $K_rC^{\div}P + \tau K_r\,\mathrm{fc}(K_r) = 0$. Now $K_rC^{\div} = 0$. But $K_r\,\mathrm{fc}(K_r) = \mathrm{fc}(K_r) \ne 0$. Therefore $\tau = 0$.

Theorem 4.6.3. Let $P$ be a $K_r$-gram. Among the infinitely many $n$-gons $R$ for which $CR = P$, there is a unique one of minimum norm $\|R\|$. It is given by $R = C^{\div}P$. Hence it coincides with the unique $K_r$-gram $Q$ such that $CQ = P$.

Proof. Use the last theorem and the least-squares characterization of the M-P inverse.

Suppose now that $P$ is a general $n$-gon and we wish to approximate it by a $K_r$-gram $R$ such that $\|P - R\|$ = minimum. Every $K_r$-gram can be written as $R = CQ$ for some $n$-gon $Q$ so that our problem is: given $P$, find a $Q$ such that $\|P - CQ\|$ = minimum. This problem has a solution, and the solution is unique if and only if the columns of $C$ are linearly independent. This is not the case (the rank of $C$ being $n - 1$), hence $Q = C^{\div}P$ is the solution with minimum $\|Q\|$. Thus, $R = CQ = CC^{\div}P$ is the best approximation of the $n$-gon $P$ by a $K_r$-gram with minimum $\|Q\|$. We phrase this as follows.

Theorem 4.6.4. Given a general $n$-gon $P = (z_1, \ldots, z_n)^T$. The unique $K_r$-gram $R = CQ$ for which $\|P - R\|$ = minimum and $\|Q\|$ = minimum is given by

(4.6.17) $R = CC^{\div}P = (I - K_r)P = P - K_rP = P - \frac{\sigma}{n}\,(1, w^{(n-1)r}, w^{(n-2)r}, \ldots, w^r)^T$

where $\sigma = z_1 + z_2w^r + \cdots + z_nw^{(n-1)r}$. Alternatively, this can be written as

$R = P - \eta_j\,\mathrm{fc}(K_r)$

where $\eta_j$ is determined from

$\mathrm{circ}(z_1, z_2, \ldots, z_n) = F^*\mathrm{diag}(\eta_1, \eta_2, \ldots, \eta_n)F.$

Proof. As before, $R = CC^{\div}P = (I - K_r)P = P - K_rP$. By (4.6.12), $K_rP = \eta_j\,\mathrm{fc}(K_r)$. Notice that $R$ is a $K_r$-gram because $K_rR = K_r(P - \eta_j\,\mathrm{fc}(K_r)) = K_rP - \eta_jK_r\,\mathrm{fc}(K_r)$. Since by (4.6.15) $K_r\,\mathrm{fc}(K_r) = \mathrm{fc}(K_r)$, $K_rR = 0$.

Notice also that if $P$ is already a $K_r$-gram, $\sigma = z_1 + z_2w^r + \cdots + z_nw^{(n-1)r} = 0$. In this case, from (4.6.17), $R = P$: so, as expected, $P$ is its own best approximation.

Generally, of course, the operation $R(P) = CC^{\div}P$ is a projection onto the row or column space of $C$.
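Formula (4.6.17) gives the best $K_r$-gram approximation without forming any matrix: one computes the scalar $\sigma$ and subtracts a multiple of a fixed vector. The sketch below does this for a sample quadrilateral with $n = 4$, $r = 2$ (nearest parallelogram); the function names `kr_defect` and `nearest_kr_gram` are hypothetical, and the $k$th component of the correction is written as $w^{-rk}$, which equals $w^{(n-k)r}$.

```python
import cmath

def kr_defect(P, r):
    """sigma = z_1 + z_2 w^r + ... + z_n w^((n-1)r); zero exactly for K_r-grams."""
    n = len(P)
    w = cmath.exp(2j * cmath.pi / n)
    return sum(z * w ** (r * k) for k, z in enumerate(P))

def nearest_kr_gram(P, r):
    """R = P - K_r P = P - (sigma/n)(1, w^((n-1)r), ..., w^r)^T, as in (4.6.17)."""
    n = len(P)
    w = cmath.exp(2j * cmath.pi / n)
    sigma = kr_defect(P, r)
    return [z - sigma / n * w ** (-r * k) for k, z in enumerate(P)]

quad = [0 + 0j, 4 + 1j, 5 + 4j, -1 + 2j]   # not a parallelogram
R = nearest_kr_gram(quad, 2)               # n = 4, r = 2

print(abs(kr_defect(quad, 2)))             # nonzero defect of the original
print(abs(kr_defect(R, 2)) < 1e-9)         # R satisfies (4.6.6b)
print(R)
```

Applying `nearest_kr_gram` a second time leaves `R` unchanged, which reflects the fact that $I - K_r$ is a projection.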

4.7 THE SPECIAL CASE: circ(s, t, 0, ..., 0)

An interesting class of cyclic transformations comes about from circ(s, t, 0, 0, ..., 0), of order $n$, where one assumes that $s + t = 1$, $st \ne 0$, and that the rank is $n - 1$. Write

(4.7.1) $C_s = \mathrm{circ}(s, 1 - s, 0, \ldots, 0).$

The representer polynomial is $p(z) = s + (1 - s)z$, so that the eigenvalues of $C_s$ are $p(w^k) = s + (1 - s)w^k$, $k = 0, 1, \ldots, n - 1$. Suppose that for a fixed $j$, $0 \le j \le n - 1$, $s + (1 - s)w^j = 0$. Thus, there will be a zero eigenvalue if and only if $s = w^j/(w^j - 1)$, $t = 1/(1 - w^j)$. For such $s$, $C_s$ can have no more than one zero eigenvalue since $s + (1 - s)w^k = s + (1 - s)w^j = 0$ implies that $w^k = w^j$, or $k = j$. Thus we have

Theorem 4.7.1. The circulant $C_s$ has rank $n - 1$ if and only if for some integer $j$, $0 \le j \le n - 1$,

(4.7.2) $s = \dfrac{w^j}{w^j - 1}, \quad t = \dfrac{1}{1 - w^j}.$

In this case,

(4.7.3) $C_sC_s^{\div} = I - K_{n-j}.$

If $s$ is real, then $C_s$ has rank $n - 1$ if and only if $n$ is even and $s = t = 1/2$.

Proof. The $(j + 1)$st eigenvalue of $C_s$ is zero. Hence (4.7.3) follows by (4.6.7), (4.6.9). If $s$ is real, so is $1 - s$ and hence $1 - w^j$. Therefore $w^j$ is real. Since $j = 0$ is impossible ($s = \infty$), $w^j = -1$. This can happen if and only if $n$ is even. From (4.7.2), $s = t = 1/2$.
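Theorem 4.7.1 can be explored numerically by listing the eigenvalues $s + (1 - s)w^k$ and counting those that vanish. The following sketch uses hypothetical function names; the parameter `idx` plays the role of $j$ in (4.7.2) and must satisfy $1 \le$ `idx` $\le n - 1$, since $j = 0$ would divide by zero.

```python
import cmath

def singular_s(n, idx):
    """s = w^j / (w^j - 1) from (4.7.2), with j = idx and w = exp(2*pi*i/n)."""
    wj = cmath.exp(2j * cmath.pi * idx / n)
    return wj / (wj - 1)

def eigenvalues_Cs(n, s):
    """Eigenvalues s + (1 - s) w^k, k = 0..n-1, of circ(s, 1 - s, 0, ..., 0)."""
    w = cmath.exp(2j * cmath.pi / n)
    return [s + (1 - s) * w ** k for k in range(n)]

def zero_eigenvalue_indices(n, s, tol=1e-9):
    return [k for k, lam in enumerate(eigenvalues_Cs(n, s)) if abs(lam) < tol]

print(singular_s(6, 3))                                  # close to 1/2 (n even, j = n/2)
print(zero_eigenvalue_indices(6, singular_s(6, 3)))      # exactly one zero eigenvalue
print(zero_eigenvalue_indices(5, singular_s(5, 2)))      # complex s, still one zero
print(zero_eigenvalue_indices(5, 0.5))                   # n odd, real s: nonsingular
```

A single vanishing eigenvalue corresponds to rank $n - 1$, and for real $s$ this happens only in the midpoint case $s = t = 1/2$ with $n$ even.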

If $s$ is real, the transformation induced by $C_s$ is interesting visually because the vertices of $\hat P = C_sP$ lie on the sides (possibly extended) of $P$. Moreover, if $s$ and $t$ are limited by

$s > 0, \quad t > 0, \quad s + t = 1,$

that is, a convex combination, then $\hat P$ is obtained from $P$ in a simple manner: the vertices of $\hat P$ divide the sides of $P$ internally into the ratio $s : 1 - s$. (Cf. Section 1.2.)

If $s$ and $t$ are complex, we shall point out a geometric interpretation subsequently.

As seen, if $n$ is even and $s$ is real, then $C_s$ is singular if and only if $s = t = 1/2$. In all other real cases, the circulant $C_s$ is nonsingular and hence, given an arbitrary $n$-gon $P$, it will have a unique pre-image $\hat P$ under $C_s$: $C_s\hat P = P$.

Example. Let $n = 4$, $s = t = 1/2$. If $Q$ is any quadrilateral, then $C_{1/2}Q$ is obtained from $Q$ by joining successively the midpoints of the sides of $Q$. It is therefore a parallelogram. Hence, if one starts with a quadrilateral $Q$ which is not a parallelogram, it can have no pre-image under $C_{1/2}$.

Since in such a case the system of equations can be "solved" by the application of a generalized inverse, we seek a geometric interpretation of this process.

4.8 ELEMENTARY GEOMETRY AND THE MOORE-PENROSE INVERSE

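Select $n$ even, $s = t = 1/2$. Then $C_s = \mathrm{circ}(1/2, 1/2, 0, \ldots, 0)$. For simplicity designate $C_{1/2}$ by $D$:

(4.8.1) $D = \mathrm{circ}(1/2, 1/2, 0, \ldots, 0).$

This corresponds to $j = n/2$ in (4.7.2). Hence by (4.7.3)

(4.8.2) $DD^{\div} = I - K_{n/2}$

where by (4.6.3)

(4.8.3) $K_{n/2} = \frac{1}{n}\,\mathrm{circ}(1, -1, 1, -1, \ldots, 1, -1).$

For simplicity we write $K = K_{n/2}$.

It is of some interest to have the explicit expression for $D^{\div}$.

Theorem. Let $D = \mathrm{circ}(1/2, 1/2, 0, 0, \ldots, 0)$ be of order $n$, where $n$ is even. Let

(4.8.4) $E = \dfrac{(-1)^{(n/2)-1}}{n}\,\mathrm{circ}\bigl((-1)^{(n/2)-1}(n - 1), \ldots, 5, -3, 1, 1, -3, 5, \ldots, (-1)^{(n/2)-1}(n - 1)\bigr).$

Then $E = D^{\div}$.

As particular instances note:

$n = 4$: $D^{\div} = \frac{1}{4}\,\mathrm{circ}(3, -1, -1, 3)$,

$n = 6$: $D^{\div} = \frac{1}{6}\,\mathrm{circ}(5, -3, 1, 1, -3, 5)$.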

Proof. (a) A simple computation shows that

$DE = \frac{1}{n}\,\mathrm{circ}(n - 1, 1, -1, 1, -1, \ldots, -1, 1) = I - K.$

Hence $DED = (I - K)D = D - KD = D$, since by (4.6.10) (or by a direct computation) $KD = 0$.

(b) On the other hand, $EDE = DEE = (I - K)E = E - KE$. An equally simple computation shows that $KE = 0$. Hence $EDE = E$. Thus by (2.8.2) (1)-(4), $E = D^{\div}$.
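The four Penrose conditions used in this proof can be verified in exact rational arithmetic. In the sketch below the first row of $E$ is generated by the closed form $e_k = (-1)^k(n - 1 - 2k)/n$, $k = 0, \ldots, n - 1$; this is a restatement introduced here for convenience (it reproduces the instances $n = 4$ and $n = 6$ above), and all function names are hypothetical.

```python
from fractions import Fraction

def circ(first_row):
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def D_matrix(n):
    """D = circ(1/2, 1/2, 0, ..., 0) of even order n."""
    return circ([Fraction(1, 2), Fraction(1, 2)] + [Fraction(0)] * (n - 2))

def E_matrix(n):
    """Candidate M-P inverse with first row e_k = (-1)^k (n - 1 - 2k) / n."""
    return circ([Fraction((-1) ** k * (n - 1 - 2 * k), n) for k in range(n)])

def satisfies_penrose(D, E):
    """Conditions (1)-(4): DED = D, EDE = E, DE and ED symmetric (real case)."""
    DE, ED = matmul(D, E), matmul(E, D)
    return (matmul(DE, D) == D and matmul(ED, E) == E
            and transpose(DE) == DE and transpose(ED) == ED)

for n in (4, 6, 8):
    print(n, E_matrix(n)[0], satisfies_penrose(D_matrix(n), E_matrix(n)))
```

Because `Fraction` arithmetic is exact, the equality tests here are genuine identities and not approximations.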

From (4.6.6b) or (4.6.6a), in the case under study, a $K$-gram is an $n$-gon whose vertices $z_1, \ldots, z_n$ satisfy

(4.8.5) $z_1 - z_2 + z_3 - z_4 + \cdots + z_{n-1} - z_n = 0.$

It is easily verified that for $n = 4$ the condition

$z_1 - z_2 + z_3 - z_4 = 0$

holds if and only if $z_1, z_2, z_3, z_4$ (in that order) form a conventional parallelogram. Thus, an $n$-gon which satisfies (4.8.5) is a "generalized" parallelogram. The sequence of theorems of Section 4.6 can now be given specific content in terms of parallelograms or generalized parallelograms. We shall write it up in terms of parallelograms.

Theorem 4.8.2. Let $P$ be a quadrilateral. Then there exists a quadrilateral $\hat P$ such that $D\hat P = P$ (the midpoint property) if and only if $P$ is a parallelogram.

Corollary. Let $P$ be a parallelogram. Then the general solution to $D\hat P = P$ is given by

$\hat P = D^{\div}P + \tau\,\mathrm{fc}(K)$

for an arbitrary constant $\tau$.

Corollary. P is a parallelogram if and only if there is a quadrilateral Q such that P = DQ.

Corollary. Let $P$ be a parallelogram. Then, given an arbitrary number $\hat z_1$, we can find a unique quadrilateral $\hat P$ with $\hat z_1$ as its first vertex such that $D\hat P = P$.

Theorem 4.8.3. Let $P$ be a parallelogram. Then there is a unique parallelogram $Q$ such that $DQ = P$. It is given by $Q = D^{\div}P$.

Notice what this is saying. $DQ$ is the parallelogram formed from the midpoints of the sides of $Q$. Given a parallelogram $P$, we can find infinitely many quadrilaterals $Q$ such that $DQ = P$. The first vertex may be chosen arbitrarily and this fixes all other vertices uniquely. But there is a unique parallelogram $Q$ such that $DQ = P$. It can be found from $Q = D^{\div}P$ (see Figure 4.8.1).

Figure 4.8.1

Theorem 4.8.4. Let $P$ be a parallelogram. Among the infinitely many quadrilaterals $R$ for which $DR = P$, there is a unique one of minimum norm $\|R\|$. It is given by $R = D^{\div}P$. Hence it coincides with the unique parallelogram $Q$ such that $DQ = P$.

Theorem 4.8.5. Let $P$ be a general quadrilateral. The unique parallelogram $R = DQ$ for which $\|P - R\|$ = minimum and $\|Q\|$ = minimum is given by $R = (I - K)P$.

In the theorem of Section 4.7, select $n = 3$ and $w = \exp(2\pi i/3)$, so that $w^3 = 1$. Select $j = 1$, so that $s = w/(w - 1)$, $1 - s = 1/(1 - w)$. In view of $1 + w + w^2 = 0$, this simplifies to $s = \frac{1}{3}(1 - w)$, $1 - s = \frac{1}{3}(1 - w^2)$. On the other hand, the selection $j = 2$ leads to $s = w^2/(w^2 - 1) = \frac{1}{3}(1 - w^2)$, $1 - s = 1/(1 - w^2) = \frac{1}{3}(1 - w)$. The corresponding circulants $C_s$ we shall designate by $N$ (in honor of Napoleon):

(4.8.7) $N_I = \frac{1}{3}\,\mathrm{circ}(1 - w, 1 - w^2, 0)$, $j = 1$,

$N_O = \frac{1}{3}\,\mathrm{circ}(1 - w^2, 1 - w, 0)$, $j = 2$,

the subscripts $I$, $O$ standing for "inner" and "outer." For brevity we exhibit only the outer case, writing

(4.8.7') $N = \frac{1}{3}\,\mathrm{circ}(1 - w^2, 1 - w, 0).$

We have

(4.8.8) $K_0 = \frac{1}{3}\,\mathrm{circ}(1, 1, 1)$, $\quad K_1 = \frac{1}{3}\,\mathrm{circ}(1, w, w^2)$, $\quad K_2 = \frac{1}{3}\,\mathrm{circ}(1, w^2, w)$, $\quad K_0 + K_1 + K_2 = I.$

From (4.7.3) with $n = 3$, $j = 2$,

$NN^{\div} = I - K_1.$

Theorem 4.8.6. $N^{\div} = K_0 - wK_2$.

Proof. Let $E = K_0 - wK_2$. Then from (4.8.7'), $N = K_0 - w^2K_2$. Hence, $NE = (K_0 - w^2K_2)(K_0 - wK_2) = K_0 + w^3K_2 = K_0 + K_2 = I - K_1$ [cf. after (4.6.8)]. Therefore $NEN = (I - K_1)(K_0 - w^2K_2) = K_0 - w^2K_2 = N$. Similarly, $ENE = (I - K_1)(K_0 - wK_2) = K_0 - wK_2 = E$. Thus, by Section 2.8.2, properties (1) to (4), $E = N^{\div}$.
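The computation in this proof can be repeated in floating-point complex arithmetic. The sketch below forms $N$, $K_0$, $K_2$ as 3 by 3 circulants, sets $E = K_0 - wK_2$, and tests the four Penrose conditions together with $NE = I - K_1$; the helper names and the tolerance are illustrative assumptions.

```python
import cmath

w = cmath.exp(2j * cmath.pi / 3)

def circ(first_row):
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def adjoint(A):
    """Conjugate transpose."""
    return [[complex(x).conjugate() for x in col] for col in zip(*A)]

def close(A, B, tol=1e-9):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

K0 = circ([1 / 3, 1 / 3, 1 / 3])
K1 = circ([1 / 3, w / 3, w ** 2 / 3])
K2 = circ([1 / 3, w ** 2 / 3, w / 3])
N = circ([(1 - w ** 2) / 3, (1 - w) / 3, 0])                  # (4.8.7')
E = [[a - w * b for a, b in zip(r0, r2)] for r0, r2 in zip(K0, K2)]

I3 = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
I_minus_K1 = [[a - b for a, b in zip(ri, rk)] for ri, rk in zip(I3, K1)]

NE, EN = matmul(N, E), matmul(E, N)
print(close(NE, I_minus_K1))               # NE = I - K_1
print(close(matmul(NE, N), N))             # NEN = N
print(close(matmul(EN, E), E))             # ENE = E
print(close(adjoint(NE), NE), close(adjoint(EN), EN))   # both Hermitian
```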

It follows from (4.6.1a) and (4.6.1b) that a counterclockwise equilateral triangle is a $K_1$-gram, while a clockwise equilateral triangle is a $K_2$-gram.

Let now $(z_1, z_2, z_3)$ be the vertices of an arbitrary triangle. On the sides of this triangle erect equilateral triangles outwardly. Let their vertices be $z_1', z_2', z_3'$. From (4.6.1a),


The centers of the equilateral triangles are therefore

This may be written as

providing us with a geometric interpretation of the transformation induced by Napoleon's matrix.
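The geometric reading of Napoleon's matrix can be checked on a sample triangle. The sketch below rests on one assumption that is not spelled out in the surviving text: for a counterclockwise triangle, the outward apex $c$ on the side from $a$ to $b$ is taken to be the point for which $(b, a, c)$ is counterclockwise equilateral, so that by (4.6.1a) $b + wa + w^2c = 0$, that is, $c = -wb - w^2a$. The function names are hypothetical.

```python
import cmath

w = cmath.exp(2j * cmath.pi / 3)

def outward_apex(a, b):
    """Apex c with b + w a + w^2 c = 0 (assumed outward for a counterclockwise triangle)."""
    return -w * b - w ** 2 * a

def napoleon_centers(z1, z2, z3):
    """Centers of gravity of the equilateral triangles erected on the three sides."""
    sides = [(z1, z2), (z2, z3), (z3, z1)]
    return [(a + b + outward_apex(a, b)) / 3 for a, b in sides]

def apply_N(z1, z2, z3):
    """N (z1, z2, z3)^T with N = (1/3) circ(1 - w^2, 1 - w, 0), see (4.8.7')."""
    s, t = (1 - w ** 2) / 3, (1 - w) / 3
    return [s * z1 + t * z2, s * z2 + t * z3, s * z3 + t * z1]

def is_ccw_equilateral(z1, z2, z3, tol=1e-9):
    """Condition (4.6.1a)."""
    return abs(z1 + w * z2 + w ** 2 * z3) < tol

triangle = (0j, 4 + 0j, 1 + 3j)            # counterclockwise sample triangle
centers = napoleon_centers(*triangle)

print(all(abs(c - n) < 1e-9 for c, n in zip(centers, apply_N(*triangle))))
print(is_ccw_equilateral(*centers))
```

Under the stated assumption the centers coincide with the components of $NP$, and they satisfy (4.6.1a), which is Napoleon's theorem for this example.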

The sequence of theorems of Section 4.6 can now be given specific content in terms of the Napoleon operator. In what follows all figures are taken counterclockwise.

Theorem 4.8.7. Let $T$ be a triangle. Then there exists a triangle $\hat T$ such that $N\hat T = T$ if and only if $T$ is equilateral. (The "only if" part is Napoleon's theorem.)

Corollary. Let $T$ be equilateral. Then the general solution to $N\hat T = T$ is given by

$\hat T = N^{\div}T + \tau\,\mathrm{fc}(K_1)$

for an arbitrary constant $\tau$.

Corollary. T is equilateral if and only if T = NQ for some triangle Q.

Corollary. Given an equilateral triangle $T$. Given also an arbitrary complex number $\hat z_1$. There is a unique triangle $\hat T$ with $\hat z_1$ as its first vertex such that $N\hat T = T$.

Theorem 4.8.8. Let $T$ be an equilateral triangle. Then there is a unique equilateral triangle $Q$ such that $NQ = T$. It is given by $Q = N^{\div}T$.

Theorem 4.8.9. Let $T$ be equilateral. Let $R$ be any triangle with $NR = T$. The unique such $R$ of minimum norm $\|R\|$ is the equilateral triangle $R = N^{\div}T$. It is identical to the unique equilateral triangle $Q$ for which $NQ = T$. (See Figure 4.8.2.)

Figure 4.8.2

Finally, suppose we are given an arbitrary triangle $T$ and we wish to approximate it optimally by an equilateral triangle. Here is the story.

Theorem 4.8.10. Let $T$ be arbitrary; then the equilateral triangle $NR$ for which $\|T - NR\|$ = minimum and such that $\|R\|$ = minimum is given by $R = N^{\div}T$ and $NR = NN^{\div}T = (I - K_1)T$.

PROBLEMS

1. Discuss the matrix circ(l/3, 1/3, 1/3, 0, 0, 0) from the present points of view and derive geometrical theorems. To start: this matrix maps every 6-gon into a parahexagon, that is, a 6-gon whose


156 Generalizations of Circulants

places is the same as a shift of $g \bmod n$ places. By convention, if $g$ is negative, shifting to the right $g$ places will be equivalent to shifting to the left $(-g)$ places. Thus, for any integers $g$, $g'$ with $g' \equiv g \pmod n$ a $g'$-circulant and a $g$-circulant are synonymous.

Example 1. A 4-circulant of order 6 is

$$\begin{pmatrix}
a_1 & a_2 & a_3 & a_4 & a_5 & a_6 \\
a_3 & a_4 & a_5 & a_6 & a_1 & a_2 \\
a_5 & a_6 & a_1 & a_2 & a_3 & a_4 \\
a_1 & a_2 & a_3 & a_4 & a_5 & a_6 \\
a_3 & a_4 & a_5 & a_6 & a_1 & a_2 \\
a_5 & a_6 & a_1 & a_2 & a_3 & a_4
\end{pmatrix}$$

Example 2. A 1-circulant is an (ordinary) circulant.

Example 3. A 0-circulant is one in which all rows are identical.

Example 4. $J = \mathrm{circ}(1, 1, \ldots, 1)$ is a $g$-circulant for all $g$.

Example 5. A $(-1)$-circulant (or an $(n - 1)$-circulant) has each successive row moved one place to the left. It is sometimes called a left circulant or an anticirculant or a retrocirculant. Thus

$(-1)\text{-circ}(0, 0, \ldots, 0, 1)$

is the anti-identity or the counter-identity.

Let $A = (a_{ij})$. Then, evidently, $A$ is a $g$-circulant if and only if

(5.1.2) $a_{ij} = a_{i+1,\,j+g}, \quad i, j = 1, 2, \ldots, n$ (subscripts mod $n$).

Equivalently, if $A = (a_{ij}) = g\text{-circ}(a_1, a_2, \ldots, a_n)$, then

(5.1.3) $a_{ij} = a_{j-(i-1)g}$ (subscript mod $n$).

Take $g > 0$ and let $(n, g)$ designate the greatest common divisor of $n$ and $g$. The $g$-circulants split into two types depending on whether $(n, g) = 1$ or $(n, g) > 1$. The multiples $kg$, $k = 1, 2, \ldots, n$, run through a complete residue system mod $n$ if and only if $(n, g) = 1$. Hence the rows of the general $g$-circulant are distinct if and only if $(n, g) = 1$. In this case, the rows of a $g$-circulant may be permuted so as to yield an ordinary circulant. Similarly for columns. Hence if $A$ is a $g$-circulant, $(n, g) = 1$, then for appropriate permutation matrices $P_1$, $P_2$

(5.1.4a) $A = P_1C,$

(5.1.4b) $A = CP_2,$

where in (5.1.4a) $C$ is an ordinary circulant whose first row is identical to that of $A$. In a certain sense, then, if $(n, g) = 1$, a $g$-circulant is an ordinary circulant followed by a renumbering.

However, the details of the diagonalization, and so on, are considerable. If $(n, g) > 1$, this is a degenerate case, and naturally there are further complications.

Example. Making use of the geometric construction of Section 1.4, we shall illustrate this distinction by the two matrices of order 8:

In the first case, transformation of the vertices of a regular octagon by $A$ yields a regular octagon in permuted order (Figure 5.1.1). In the second case, a square covered twice (Figure 5.1.2).

Theorem 5.1.1. $A$ is a $g$-circulant if and only if

(5.1.5) $\pi A = A\pi^g.$

Proof. In (2.4.6) take $\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ 2 & 3 & \cdots & 1 \end{pmatrix}$. Then $P_\sigma = \pi$ so that if $A = (a_{ij})$, $\pi A = (a_{i+1,\,j})$. In (2.4.8), take $\tau = \begin{pmatrix} 1 & 2 & \cdots & n \\ 1 + g & 2 + g & \cdots & n + g \end{pmatrix}$; then $P_\tau^{-1} = (\pi^g)^{-1} = \pi^{-g}$. Hence $\pi A\pi^{-g} = (a_{i+1,\,j+g})$. The result now follows from (5.1.2).

Figure 5.1.1 ($g = 3$, $n = 8$)

Figure 5.1.2 ($g = 2$, $n = 8$)

Corollary. Let $A$ and $B$ be $g$-circulants. Then $AB^*$ is a 1-circulant. In particular, if $A$ is a $g$-circulant, $AA^*$ is a 1-circulant.

Proof. $A = \pi^*A\pi^g$, $B = \pi^*B\pi^g$. Hence $AB^* = \pi^*A\pi^g(\pi^g)^*B^*\pi = \pi^*AB^*\pi$.

Theorem 5.1.2. If $A$ is a $g$-circulant and $B$ is an $h$-circulant then $AB$ is a $gh$-circulant.

Proof. $\pi A = A\pi^g$ and $\pi B = B\pi^h$. Now

$\pi(AB) = A\pi^gB = (A\pi^{g-1})(\pi B) = (A\pi^{g-1})(B\pi^h)$

$= (A\pi^{g-2})(\pi B\pi^h) = (A\pi^{g-2})(B\pi^h)\pi^h = (A\pi^{g-2})(B\pi^{2h}).$

Keep this up for $g$ times, leading to $\pi(AB) = (A\pi^{g-g})(B\pi^{gh}) = (AB)\pi^{gh}$. Now apply Theorem 5.1.1.
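Theorems 5.1.1 and 5.1.2 are convenient to test on integer matrices. The sketch below assumes the 0-indexed form of the definition, in which row $i$ of $g$-circ$(a_0, \ldots, a_{n-1})$ has entry $a_{(j - gi) \bmod n}$ in column $j$ (each row is the previous one shifted $g$ places to the right); the function names are hypothetical.

```python
def g_circulant(g, first_row):
    """g-circulant with the given first row: entry (i, j) is a[(j - g*i) mod n]."""
    n = len(first_row)
    return [[first_row[(j - g * i) % n] for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matpow(A, k):
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    for _ in range(k):
        result = matmul(result, A)
    return result

def is_g_circulant(A, g):
    """Criterion (5.1.5): pi A = A pi^g, with pi = circ(0, 1, 0, ..., 0)."""
    n = len(A)
    pi = g_circulant(1, [0, 1] + [0] * (n - 2))
    return matmul(pi, A) == matmul(A, matpow(pi, g % n))

A = g_circulant(4, [1, 2, 3, 4, 5, 6])     # the 4-circulant of Example 1
B = g_circulant(5, [2, 0, 1, 7, 3, 4])     # a 5-circulant, i.e. a (-1)-circulant

print(A[1])                                # second row: first row shifted right 4 places
print(is_g_circulant(A, 4), is_g_circulant(A, 1))
print(is_g_circulant(matmul(A, B), 4 * 5)) # AB is a 20-circulant, i.e. a 2-circulant
```

Since $\pi^n = I$, only $g \bmod n$ matters, which is why the criterion reduces the exponent before forming the power.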

W e r e q u i r e s e v e r a l f a c t s from t h e e l e m e n t a r y t h e o r y o f numbers.

Lemma 5.1.3. L e t q , n be i n t e g e r s n o t bo th 0. Then t h e e q u a t i o n

h a s a s o l u t i o n i f and o n l y i f ( n , g ) = 1.

P r o o f . I t is wel l known t h a t g iven i n t e g e r s g , n , - n o t bo th 0, t h e n t h e r e e x i s t i n t e g e r s x , y such t h a t gx - ny = ( n , g ) . Hence i f ( n , g ) = 1, ( 5 . 1 . 6 ) h a s a s o l u t i o n . Converse ly , i f ( 5 . 1 . 6 ) h o l d s , t h e n f o r some

160 Generalizations of Circulants

integer k, gx - 1 = kn. If q and k have a common factor > 1, it would divide 1, which is impossible. I Corollary. For (n, 9) = 1, the solution to qx = 1 (mod n) is unique mod n.

proof. Let gxl = 1 (mod n) and qx2 = 1 mod n;

then q(xl - x ) = 0 (mod n). Since (n, g) = 1. 2

( x ~ - x ) = 0 (mod n). 2

For (n, g) = 1 we shall designate the unique - 1

solution of (5.1.6) by g . Theorem 5.1.4. Let A be a nonsinqular g-circulant. - Then A-l is a g l-circulant. 1

Proof. Since A is nonsingular, it follows that - - 1 - 1 (n, g) = 1, hence that g exists with qg = 1 (mod

-1"-1 - n) : Now, from (5.1.5) nA = An9 so that A - T - ~ A - ~ . Hence

-g+l -1 -g+l -1 -1 T12 T A - ~ = n A n = n (A n )

-9+1 ("-qA-l), 2 = 71 -2q+lA-ln2* = Tl

DO this s times, and we obtain

n'4-l = n -sq+lA-ins -

Now select s = q and there is obtained nA-l = - 1 -1 . -

A-ln9 , which tells us that A is a q l-circulant. I Theorem 5.1.5. A is a g-circulant if and only if (Ai) * is a g-circulant. I

Proof. Let A be a g-circulant. Then A = n-lAn9. - 1 -g i

Hence (since n , n , n9 are unitary) A+ = n A n.

~ h u s ( A ~ ) * = n* (A+) * (n-9)* = IT-~(A+)*T~. Therefore

( A $ ) * is a g-circulant. Conversely, let (Ai)* be a g-circulant. Then by 1

what we have just shown, ( ((Ai) * ) ' I * is also a g- circulant. But this is precisely A.

Corollary. If $A$ is a $g$-circulant, then $AA^{\div}$ is a 1-circulant.

Proof. In the corollary to Theorem 5.1.1, take $B = (A^{\div})^{*}$. This is a $g$-circulant by what we have just shown. Hence $AB^{*} = AA^{\div}$ is a 1-circulant.

If $A$ is a $g$-circulant, then $AA^{*}$ is a 1-circulant. Hence it may be written as $AA^{*} = F^{*}\Lambda_{AA^{*}}F$, where $\Lambda_{AA^{*}}$ is the diagonal of eigenvalues of $AA^{*}$. Now by Problem 16 of Section 2.8.2, for any matrix $M$, $M^{\div} = M^{*}(MM^{*})^{\div}$. Hence

Theorem 5.1.6. If $A$ is a $g$-circulant, then

(5.1.7) $A^{\div} = A^{*}(AA^{*})^{\div} = A^{*}F^{*}\Lambda^{\div}_{AA^{*}}F.$

We now produce a generalization of the representation (3.1.4). Let

$Q_g = g\text{-circ}(1, 0, 0, \ldots, 0).$

Notice that $Q_g$ is a permutation matrix and is unitary if and only if $(n, g) = 1$. (For in this case and only in this case will $Q_g$ have precisely one 1 in each row and column.)
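The permutation criterion is easy to test by brute force. The sketch below assumes that $Q_g$ is the $g$-circulant with first row $(1, 0, \ldots, 0)$, so that (with 0-based indices) row $i$ carries its single 1 in column $ig \bmod n$; the names `q_matrix` and `is_permutation_matrix` are illustrative.

```python
from math import gcd


def q_matrix(n, g):
    """Q_g of order n: row i has a single 1 in column (i*g) mod n."""
    return [[1 if j == (i * g) % n else 0 for j in range(n)] for i in range(n)]


def is_permutation_matrix(m):
    """True if every row and every column contains exactly one 1 and zeros elsewhere."""
    n = len(m)
    unit = [0] * (n - 1) + [1]
    rows_ok = all(sorted(row) == unit for row in m)
    cols_ok = all(sorted(col) == unit for col in zip(*m))
    return rows_ok and cols_ok


# Compare the criterion (n, g) = 1 with a direct inspection of Q_g.
for n in range(1, 9):
    for g in range(n):
        assert is_permutation_matrix(q_matrix(n, g)) == (gcd(n, g) == 1)

print(q_matrix(4, 2))  # rows repeat, so this Q_g is not a permutation matrix
```

Every row of $Q_g$ always contains exactly one 1; it is the columns that fail when $(n, g) > 1$, because the map $i \mapsto ig \bmod n$ is then not one-to-one.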

Theorem 5.1.7. $A = g\text{-circ}(a_1, a_2, \ldots, a_n)$ if and only if

$A = \sum_{k=1}^{n} a_k Q_g \pi^{k-1}.$

Proof. The positions in $A$ occupied by the symbol $a_1$ are precisely those occupied by a 1 in $Q_g$. The positions occupied by the symbol $a_2$ in $A$ are one place to the right (with wraparound) of those occupied by $a_1$. Since right multiplication by $\pi$ pushes all the elements of $A$ one space to the right, it follows that the positions occupied by $a_2$ in $A$ are precisely those occupied by 1 in $Q_g\pi$. Similarly for $a_3, \ldots, a_n$.

Corollary. $A$ is a $g$-circulant if and only if it is of the form $Q_gC$ where $C$ is a circulant.

Proof. Use (3.1.4).
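Both representations can be confirmed on a small example. The sketch below assumes $Q_g = g\text{-circ}(1, 0, \ldots, 0)$ and $\pi = \text{circ}(0, 1, 0, \ldots, 0)$, and it deliberately takes $n = 6$, $g = 4$, so that $(n, g) \neq 1$, to show that the factorization does not need $Q_g$ to be a permutation matrix. All function names are illustrative.

```python
def g_circulant(first_row, g):
    """Row i is the first row shifted right by i*g places (with wraparound)."""
    n = len(first_row)
    return [[first_row[(j - i * g) % n] for j in range(n)] for i in range(n)]


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def shift_power(n, k):
    """pi^k, where pi = circ(0, 1, 0, ..., 0): ones at positions (i, (i + k) mod n)."""
    return [[1 if j == (i + k) % n else 0 for j in range(n)] for i in range(n)]


a = [3, 1, 4, 1, 5, 9]
n, g = 6, 4

A = g_circulant(a, g)                      # the g-circulant itself
C = g_circulant(a, 1)                      # ordinary circulant with the same first row
Q = g_circulant([1] + [0] * (n - 1), g)    # Q_g

# Corollary: A = Q_g C.
product_form = matmul(Q, C)

# Sum form: A = a_1 Q_g + a_2 Q_g pi + ... + a_n Q_g pi^(n-1).
terms = [matmul(Q, shift_power(n, k)) for k in range(n)]
sum_form = [[sum(a[k] * terms[k][i][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

print(A == product_form, A == sum_form)
```

The product form is usually the more convenient of the two, since it separates the row-selection pattern $Q_g$ from the ordinary circulant $C$.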

Since $Q_{-1} = \Gamma$, one has

Corollary. $A$ is a $(-1)$-circulant if and only if it has the form $A = \Gamma C$ where $C$ is a circulant and where the first rows of $A$ and $C$ are identical.

Corollary. $A$ is a $(-1)$-circulant if and only if it has the form

(5.1.10) $A = F^{*}(\Gamma\Lambda)F$

where $\Lambda$ is diagonal. In this case,

$A^{n} = F^{*}(\Gamma\Lambda)^{n}F$

for integer values of $n$.

Proof. $A = \Gamma C$ with circulant $C$. But such $C = F^{*}\Lambda F$, so that $A = (\Gamma F^{*})\Lambda F$. From the corollary to Theorem 2.5.2, $F^{*2} = \Gamma^{*} = \Gamma$ so that $\Gamma F^{*} = F^{*3} = F^{*}\Gamma$ and (5.1.10) follows.

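If $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$, then

$\Gamma\Lambda = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & \cdots & 0 & \lambda_n \\ 0 & 0 & \cdots & \lambda_{n-1} & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & \lambda_2 & \cdots & 0 & 0 \end{pmatrix}.$

The eigenvalues of the $(-1)$-circulant $A$ are identical to those of $\Gamma\Lambda$ and the latter are easily computed. (See Section 5.3.)

Note also that

(5.1.12) $(\Gamma\Lambda)^{2} = \operatorname{diag}(\lambda_1\lambda_1, \lambda_n\lambda_2, \lambda_{n-1}\lambda_3, \ldots, \lambda_2\lambda_n)$

so that the even powers of $\Gamma\Lambda$ are readily available.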
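Formula (5.1.12) can be verified directly for a small order. The sketch below assumes that $\Gamma$ is the permutation matrix $(-1)\text{-circ}(1, 0, \ldots, 0)$, which with 0-based indices has its 1 in row $i$ at column $-i \bmod n$; the sample diagonal entries are arbitrary integers chosen for illustration.

```python
def gamma(n):
    """Gamma of order n: row i has a single 1 in column (-i) mod n."""
    return [[1 if j == (-i) % n else 0 for j in range(n)] for i in range(n)]


def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


lam = [2, 3, 5, 7, 11]          # lambda_1, ..., lambda_5
n = len(lam)

GL = matmul(gamma(n), diag(lam))
GL_squared = matmul(GL, GL)

# (5.1.12) in 0-based form: entry i of the diagonal is lambda_{-i mod n} * lambda_i.
expected = diag([lam[(-i) % n] * lam[i] for i in range(n)])

print(GL_squared == expected)
print([GL_squared[i][i] for i in range(n)])   # [4, 33, 35, 35, 33]
```

Since $(\Gamma\Lambda)^2$ is diagonal, any even power of $\Gamma\Lambda$ is obtained by raising these diagonal entries to the appropriate power.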

PROBLEMS

1. Prove that g-circulants form a linear space under matrix addition and scalar multiplication.

2. Let $S$ denote the set of all matrices of order $n$ that are of the form $\alpha A + \beta B$ where $A$ is a circulant and $B$ is a $(-1)$-circulant. Show that they form a ring under matrix addition and multiplication.

3. What conditions on n and g are sufficient to guarantee that the g-circulants form a ring?

4. Let $A$ be a $g$-circulant. Then for integer $k$, $\pi^{k}A = A\pi^{kg}$. Hence if $g \mid n$, $\pi^{n/g}A = A$.

5. Let $(n, g) = 1$ and suppose that $A$ is a $g$-circulant. Prove that there exists a minimum integer $r \geq 1$ such that $A^{r}$ is a circulant. Hint: use the Euler-Fermat theorem. See Section 5.4.2.

6. Let $(n, g) = 1$. Prove that if $A$ is a $g$-circulant, each column can be obtained from the previous column by a downshift of $g^{-1}$ places.

If $g = 0$, each row of $A$ is the previous row "shifted" zero places. Hence all the rows are identical. Since the rows are identical, $r(A) \leq 1$. If $r(A) = 0$, $A = 0$, and the work is trivial. Suppose, then, that $r(A) = 1$. Then, by a familiar theorem (see Lancaster [1], p. 56), $A$ must have a zero eigenvalue of multiplicity $\geq n - 1$. Its characteristic polynomial is therefore of the form $\lambda^{n} - \sigma\lambda^{n-1}$. If we write $A = 0\text{-circ}(a_1, a_2, \ldots, a_n) =$
