PREFACE
"Mathematics," wrote Alfred North Whitehead, "is the most powerful technique for the understanding of pattern and for the analysis of the relations of patterns." In its pursuit of pattern, however, mathematics itself exhibits pattern; the mathematics on the printed page often has visual appeal. Spatial arrangements embodied in formulae can be a source of mathematical inspiration and aesthetic delight.
The theory of matrices exhibits much that is visually attractive. Thus, diagonal matrices, symmetric matrices, (0, 1) matrices, and the like are attractive independently of their applications. In the same category are the circulants. A circulant matrix is one in which a basic row of numbers is repeated again and again, but with a shift in position. Circulant matrices have many connections to problems in physics, to image processing, to probability and statistics, to numerical analysis, to number theory, and to geometry. The built-in periodicity means that circulants tie in with Fourier analysis and group theory.
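Concretely, a circulant can be generated by repeated cyclic shifts of its basic row. The following minimal Python sketch is our own illustration (the helper name `circulant` is not from the text):

```python
def circulant(row):
    """Return the n x n circulant whose first row is `row`; each
    successive row is the previous row shifted one place to the
    right, with wraparound at the edges."""
    n = len(row)
    return [[row[(j - i) % n] for j in range(n)] for i in range(n)]

C = circulant([1, 2, 3, 4])
for r in C:
    print(r)
```

Every row contains the same numbers; only the starting position shifts, which is precisely the periodicity alluded to above.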
A different reason may be advanced for the study of circulants. The theory of circulants is a relatively easy one. Practically every matrix-theoretic question for circulants may be resolved in "closed form." Thus the circulants constitute a nontrivial but simple set of objects that the reader may use to practice, and ultimately deepen, a knowledge of matrix theory.
Writers on matrix theory appear to have given circulants short shrift, so that the basic facts are rediscovered over and over again. This book is intended to serve as a general reference on circulants as well as to provide alternate or supplemental material for intermediate courses in matrix theory. The reader will need to be familiar with the geometry of the complex plane and with the elementary portions of matrix theory up through unitary matrices and the diagonalization of Hermitian matrices. In a few places the Jordan form is used.
This work contains some general discussion of matrices (block matrices, Kronecker products, the UDV theorem, generalized inverses). These topics have been included because of their application to circulants and because they are not always available in general books on linear algebra and matrix theory. More than 200 problems of varying difficulty have been included.

It would have been possible to develop the theory of circulants and their generalizations from the point of view of finite abelian groups and group matrices. However, my interest in the subject has a strong numerical and geometric base, which pointed me in the direction taken. The interested reader will find references to these algebraic matters.

Closely related to circulants are the Toeplitz matrices. This theory and its applications constitute a world of its own, and a few references will have to suffice. The bibliography also contains references to applications of circulants in physics and to the solution of differential equations.
I acknowledge the help and advice received from Professor Emilie V. Haynsworth. At every turn she has provided me with information, elegant proofs, and encouragement.
I have profited from numerous discussions with Professors J. H. Ahlberg and Igor Najfeld and should like to thank them for their interest in this essay. Philip R. Thrift suggested some important changes.
Thanks are also due to Gary Rosen for the Calcomp plots of the iterated n-gons and to Eleanor Addison for the figures. Katrina Avery, Frances Beagan, Ezoura Fonseca, and Frances Gajdowski have helped me enormously in the preparation of the manuscript, and I wish to thank them for this work, as well as for other help rendered in the past.
The Canadian Journal of Mathematics has allowed me to reprint portions of an article of mine and I would like to acknowledge this courtesy.
Finally, I would like to thank Beatrice Shube for inviting me to join her distinguished roster of scientific authors and the staff of John Wiley and Sons for their efficient and skillful handling of the manuscript.
Philip J. Davis
Providence, Rhode Island April, 1979
CONTENTS

Notation  xiii

Chapter 1  An Introductory Geometrical Application  1
  1.1 Nested triangles, 1
  1.2 The transformation σ, 4
  1.3 The transformation σs, iterated with different values of s, 10
  1.4 Nested polygons, 12

Chapter 2  Introductory Matrix Material  16
  2.1 Block operations, 16
  2.2 Direct sums, 21
  2.3 Kronecker product, 22
  2.4 Permutation matrices, 24
  2.5 The Fourier matrix, 31
  2.6 Hadamard matrices, 37
  2.7 Trace, 40
  2.8 Generalized inverse, 40
  2.9 Normal matrices, quadratic forms, and field of values, 59

Chapter 3  Circulant Matrices  66
  3.1 Introductory properties, 66
  3.2 Diagonalization of circulants, 72
  3.3 Multiplication and inversion of circulants, 85
  3.4 Additional properties of circulants, 91
  3.5 Circulant transforms, 99
  3.6 Convergence questions, 101

Chapter 4  Some Geometric Applications of Circulants  108
  4.1 Circulant quadratic forms arising in geometry, 108
  4.2 The isoperimetric inequality for isosceles polygons, 112
  4.3 Quadratic forms under side conditions, 114
  4.4 Nested n-gons, 119
  4.5 Smoothing and variation reduction, 131
  4.6 Applications to elementary plane geometry: n-gons and Kr-grams, 139
  4.7 The special case circ(s, t, 0, 0, ..., 0), 146
  4.8 Elementary geometry and the Moore-Penrose inverse, 148

Chapter 5  Generalizations of Circulants: g-Circulants and Block Circulants  155
  5.1 g-circulants, 155
  5.2 0-circulants, 163
  5.3 PD matrices, 166
  5.4 An equivalence relation on {1, 2, ..., n}, 171
  5.5 Jordanization of g-circulants, 173
  5.6 Block circulants, 176
  5.7 Matrices with circulant blocks, 181
  5.8 Block circulants with circulant blocks, 184
  5.9 Further generalizations, 191

Chapter 6  Centralizers and Circulants  192
  6.1 The leitmotiv, 192
  6.2 Systems of linear matrix equations. The centralizer, 192
  6.3 t-algebras, 203
  6.4 Some classes Z(Pσ, Pτ), 206
  6.5 Circulants and their generalizations, 208
  6.6 The centralizer of J; magic squares, 214
  6.7 Kronecker products of I, π, and J, 223
  6.8 Best approximation by elements of centralizers, 224

Appendix
Bibliography
Index of Authors
Index of Subjects
NOTATION

C           the complex number field
C^(m x n)   the set of m x n matrices whose elements are in C
A^T         transpose of A
Ā           conjugate of A
A*          conjugate transpose of A
A ⊗ B       direct (Kronecker) product of A and B
A ∘ B       Hadamard (element-by-element) product of A and B
A†          Moore-Penrose generalized inverse of A
r(A)        rank of A

If A is square:

det(A)      determinant of A
tr(A)       trace of A
λ(A)        eigenvalues of A, individually or as a set
A^(-1)      inverse of A
ρ(A)        spectral radius of A
(4) Given a T2, there is a unique triangle T1 whose midpoint triangle it is.

(5) The area of T2 is minimum among all triangles T2' that are inscribed in T1 and whose vertices divide the sides of T1 in a fixed ratio, cyclically.

(6) If the midpoint triangle of T2 is T3, and so on successively for T4, T5, ..., this nested set of triangles converges to the center of gravity of T1 with geometric rapidity.

[By the center of gravity (c.g.) of a triangle whose vertices have rectangular coordinates (xi, yi), i = 1, 2, 3, is meant the point (1/3)(x1 + x2 + x3, y1 + y2 + y3).]

Figure 1.1.2
PROBLEMS

1. Prove that the triangles Tn are all similar.

2. Prove that the medians of Tn, n = 2, 3, ..., lie along the medians of T1.

3. Prove that the c.g. of Tn, n = 2, 3, ..., coincides with the c.g. of T1 (Figure 1.1.3).

4. Prove that area Tn+1 = (1/4) area Tn.

5. Prove that the perimeter of Tn+1 = (1/2) perimeter of Tn. Conclude, on this basis, that Tn converges to c.g. T1 (Figure 1.1.2).

6. Describe the situation when T1 is a right triangle; when T1 is equilateral.

7. Given a triangle T1, construct a triangle T0 such that T1 is its midpoint triangle.

8. If two triangles have the same area and the same perimeter, are they necessarily congruent?

9. The midpoint triangle of T1 divides T1 into four subtriangles. Suppose that T2 designates one of these, selected arbitrarily. Now let Tn designate the sequence of triangles that results from an iteration of this process. Prove that Tn converges to a point. Prove that every point inside T1 and on its sides is the limit of an appropriate sequence Tn.

10. Systematize, in some way, the selection process in Problem 9.

11. Let P be an arbitrary point lying in the triangle T1 = ΔA1B1C1. Let T2 = σ(T1) be determined ... Determine the rate at which σⁿ(T1) converges to P.
1.2 THE TRANSFORMATION σ

As a first generalization, consider the following transformation σ of the triangle T1. Select a nonnegative number s, 0 ≤ s ≤ 1, and set

(1.2.1) t = 1 - s.

Let A2, B2, C2 be the points on the sides of the triangle T1 such that

(1.2.2) A1A2 = s·A1B1,  B1B2 = s·B1C1,  C1C2 = s·C1A1.

In this equation A1A2 designates the length of the line segment from A1 to A2, and so on. Thus the points A2, B2, C2 divide the sides of T1 in the ratio s/t, working consistently in a counterclockwise fashion. (See Figure 1.2.1.) Write

(1.2.3) T2 = ΔA2B2C2 = σ(T1),

and in general

(1.2.4) Tn+1 = σ(Tn) = σⁿ(T1), n = 1, 2, 3, ... .

Figure 1.2.2 illustrates the sequence Tn for s = 1/4, t = 3/4.

Figure 1.2.2

The transformation σ depends, of course, on the parameter s, and we shall write σs when it is necessary to distinguish the parameter.

To analyze this situation, one might work with vectors, but it is particularly convenient in the case of plane figures to place the triangle T1 in the complex plane. We write z = x + iy, z̄ = x - iy, i = √-1, and designate the coordinates of Tn systematically by z1n, z2n, z3n. Write, for simplicity, z11 = z1, z21 = z2, z31 = z3. The transformation σ, operating successively on T1, T2, ..., is therefore given by

(1.2.5) z_{i,n+1} = t·z_{i,n} + s·z_{i+1,n}, i = 1, 2, 3 (first subscripts taken mod 3).
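The action of σ on complex vertices is easy to simulate. The sketch below is illustrative (the value s = 0.3 and the triangle are our own choices); it iterates the map z_i ↦ t·z_i + s·z_{i+1} and watches the vertices collapse toward the center of gravity, which the map leaves fixed:

```python
def sigma(z, s):
    # one application of sigma_s to a figure given by complex vertices
    t = 1.0 - s
    n = len(z)
    return [t * z[i] + s * z[(i + 1) % n] for i in range(n)]

z = [0 + 0j, 4 + 0j, 1 + 3j]      # vertices of T1
cg = sum(z) / 3                   # center of gravity, invariant under sigma
for _ in range(100):
    z = sigma(z, 0.3)
print(z)
```

After 100 iterations all three vertices agree with the c.g. to many decimal places.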
so that

(1.2.14) lim_{n→∞} z_{i,n} = 0 for i = 1, 2, 3.

We have therefore proved the following theorem.

Theorem 1.2.1. Let 0 < s < 1 be fixed and let Tn be the sequence of nested triangles given by Tn = σ^{n-1}(T1), n = 1, 2, ... . Then Tn converges to c.g.(T1).

The function V(T) is a simple example of a Lyapunov function for a system of difference equations. The c.g. is known as the limit set of the process.

It is also of interest to see how the area of T1 changes under σ. Designate the area by μ(T1). Assuming, as we have, that z1 = x1 + iy1, z2 = x2 + iy2, z3 = x3 + iy3 are the vertices of T1 in counterclockwise order, we have

μ(σ(T1)) = g(s)·μ(T1),  g(s) = 1 - 3s + 3s².

Theorem 1.2.2. min_{0≤s≤1} μ(σ(T1)) occurs uniquely when s = 1/2 and equals (1/4)μ(T1).

Proof. The minimum value of g(s) = 1 - 3s + 3s² occurs uniquely when s = 1/2 and equals 1/4.
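Theorem 1.2.2 can be spot-checked numerically with the shoelace area formula. The sketch below is illustrative (the triangle and the sampled values of s are our own choices); the computed area ratios match g(s) = 1 - 3s + 3s²:

```python
def area(z):
    # shoelace formula; positive for counterclockwise complex vertices
    n = len(z)
    return sum(z[i].real * z[(i + 1) % n].imag -
               z[(i + 1) % n].real * z[i].imag for i in range(n)) / 2.0

def sigma(z, s):
    t = 1.0 - s
    return [t * z[i] + s * z[(i + 1) % len(z)] for i in range(len(z))]

T1 = [0 + 0j, 5 + 1j, 2 + 4j]
ratios = {s: area(sigma(T1, s)) / area(T1) for s in (0.2, 0.5, 0.8)}
print(ratios)
```

At s = 1/2 the ratio is exactly 1/4, the minimum asserted by the theorem; note also the symmetry g(s) = g(1 - s).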
PROBLEMS

1. Interpret the transformation σ geometrically when s is real but does not satisfy 0 ≤ s ≤ 1. What does σ do when s = 1?

2. Interpret the transformation σ geometrically when s and t are complex.

3. In this case, find a formula for V(σ(T1)).

4. Let V(T1) designate the polar moment of inertia of T1 about its center of gravity, regarding T1 as a lamina of unit density. Prove that ...

5. Let σ(T1) have vertices A2, B2, C2. Then the lines A1B2, B1C2, C1A2 are concurrent if and only if s = t = 1/2. (Use Ceva's theorem.)

6. Let T be an equilateral triangle. Then, for any s, σ(T) is equilateral. Interpret this as an eigenvalue property of the matrix

    ( t s 0 )
    ( 0 t s )
    ( s 0 t )

Thus the equilateral triangles are "eigenfigures" of σ. Generalize. Hint: Let the vertices of T in counterclockwise order be z1, z2, z3. Then T is equilateral if and only if z1 + ωz2 + ω²z3 = 0, where ω = exp(2πi/3).
1.3 THE TRANSFORMATION σ, ITERATED WITH DIFFERENT VALUES OF s

As observed, the transformation σ depends on the selection of the parameter s. Let us indicate this by writing σs. Begin with the triangle T1 and form

T2 = σ_{s1}(T1).

Now iterate this, using different values of the parameter s. We obtain

T3 = σ_{s2}(T2),  T4 = σ_{s3}(T3),  ...,

so that, in general,

Tn+1 = σ_{sn} σ_{s(n-1)} ··· σ_{s1}(T1).

We then have from (1.2.10)

V(Tn+1) = g(s1)g(s2) ··· g(sn)·V(T1).

Whether or not V(Tn) converges to 0 depends on the behavior of the infinite product Π_{k=1}^∞ g(s_k) = Π_{k=1}^∞ (1 - 3s_k + 3s_k²).

Let p_k = 3s_k - 3s_k² = 3s_k(1 - s_k). Then Π_{k=1}^∞ g(s_k) = Π_{k=1}^∞ (1 - p_k). Assuming that 0 < s_k < 1, we have 0 < p_k ≤ 3/4. As is well known, if Σ_{k=1}^∞ p_k < ∞, then lim_{n→∞} Π_{k=1}^n (1 - p_k) exists and is not zero. On the other hand, if Σ_{k=1}^∞ p_k = ∞, then lim_{n→∞} Π_{k=1}^n (1 - p_k) = 0. (See, e.g., Knopp, 1928, pp. 219-221.) Thus we must investigate the convergence of Σ_{k=1}^∞ s_k(1 - s_k). To this end, for 0 < s_k < 1, introduce

s_k* = min(s_k, 1 - s_k).

Lemma. Σ_{k=1}^∞ (s_k - s_k²) < ∞ if and only if Σ_{k=1}^∞ s_k* < ∞.
Proof. We have 0 < s_k - s_k² = s_k(1 - s_k), and since 1/2 ≤ max(s_k, 1 - s_k) < 1,

(1/2)s_k* ≤ s_k(1 - s_k) ≤ s_k*.

Hence Σ_{k=1}^∞ s_k* < ∞ implies Σ_{k=1}^∞ (s_k - s_k²) < ∞. On the other hand, Σ_{k=1}^∞ s_k* = ∞ implies Σ_{k=1}^∞ (s_k - s_k²) = ∞.

This leads to

Theorem 1.3.1.

(a) If Σ_{k=1}^∞ s_k* = ∞, then lim_{n→∞} V(Tn) = 0.

(b) If Σ_{k=1}^∞ s_k* < ∞, then lim_{n→∞} V(Tn) exists and is not zero.

In Case (a), as before, lim_{n→∞} Tn = c.g.(T1). In Case (b), one conjectures that the Tn approach a nontrivial limiting triangle T∞ (see Figure 1.3.1). We shall return to this point in Section 3.6 for a more complete analysis.
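The dichotomy is easy to observe numerically. In this illustrative sketch (the two parameter sequences are our own choices), a summable sequence s_k keeps the product of the g(s_k) away from zero, while a constant sequence drives it to zero:

```python
def g(s):
    return 1 - 3 * s + 3 * s * s

# s_k = 1/(k+1)^2: the sum of s_k* converges, so the product has a
# nonzero limit (Case (b))
p_conv = 1.0
for k in range(1, 100001):
    p_conv *= g(1.0 / (k + 1) ** 2)

# s_k = 1/2 for all k: the sum diverges; each factor is 1/4, so the
# product tends to 0 (Case (a))
p_div = 1.0
for k in range(1, 200):
    p_div *= g(0.5)

print(p_conv, p_div)
```

The first product settles near 0.12; the second is astronomically small after only 200 factors.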
PROBLEMS

1. Let s_k = 1/(k + 1)², k = 1, 2, ... . Compute, approximately, lim_{k→∞} μ(T_k)/μ(T_1).

2. Do the same with s_k = exp(-ρk), ρ > 0, k = 1, 2, ... .

Figure 1.3.1

1.4 NESTED POLYGONS

We pass now from triangles to polygons. Let z1, z2, ..., zp be ordered vertices of a polygon P (assumed to be located in the complex plane). We make no restrictions on the complex numbers zk, so that P may be convex or nonconvex, simply covered or not; furthermore, the points zk are not necessarily distinct, so that the polygon may have "multiple vertices." All geometric constructions described below are to be interpreted appropriately with this in mind. We shall also call such a figure a p-gon. We shall assume, however, that the center of gravity of P, (1/p)(z1 + ··· + zp), is at the origin.

Each side of P is now divided in length in the ratio s/t, 0 < s < 1, t = 1 - s, proceeding cyclically counterclockwise. The points of division form the vertices of a new polygon σ(P). (See Figure 1.4.1.) We wish to discuss what happens when this transformation is iterated.

Figure 1.4.1

Let Pn = σ^{n-1}(P1), let the vertices of Pn have the coordinates z_{1,n}, z_{2,n}, ..., z_{p,n}, and for simplicity write z_{1,1} = z1, ..., z_{p,1} = zp. The transformation σ may obviously be written in matrix form as

(1.4.2)
    ( z_{1,n+1} )   ( t s 0 ··· 0 ) ( z_{1,n} )
    ( z_{2,n+1} )   ( 0 t s ··· 0 ) ( z_{2,n} )
    (     ⋮     ) = (      ⋱      ) (    ⋮    )
    ( z_{p,n+1} )   ( s 0 0 ··· t ) ( z_{p,n} )

If one writes Zn = (z_{1,n}, z_{2,n}, ..., z_{p,n})^T and abbreviates the p × p matrix on the right-hand side of (1.4.2) by G, then

(1.4.3) Z_{n+1} = GZn;  Z1 = a given initial vector.

This is a linear autonomous system of difference equations; that is, G is independent of n. The solution of this iteration is

(1.4.4) Z_{n+1} = GⁿZ1.

Thus the limiting behavior of Pn (i.e., Zn) as n → ∞ depends substantially on the behavior of Gⁿ as n → ∞. The matrix G is a circulant matrix; that is, in each successive row the elements move to the right one position (with wraparound at the edges). It is also true that the matrix G is a nonnegative, doubly stochastic, irreducible, and normal matrix. In this essay we emphasize the circulant aspect of G. We postpone further discussion of the p-gon problem until we have somewhat developed the theory of circulants.
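A direct simulation of the p-gon iteration; the first row (t, s, 0, ..., 0) of G follows the s/t side division described above, and the particular pentagon and s = 0.3 are our own illustrative choices:

```python
import cmath

p, s = 5, 0.3
t = 1.0 - s

def apply_G(z):
    # multiply by the circulant G whose first row is (t, s, 0, ..., 0)
    return [t * z[i] + s * z[(i + 1) % p] for i in range(p)]

# an irregular pentagon, with its center of gravity moved to the origin
Z = [cmath.exp(2j * cmath.pi * k / p) + 0.3j * k for k in range(p)]
m = sum(Z) / p
Z = [z - m for z in Z]
for _ in range(200):
    Z = apply_G(Z)
print(Z)
```

Because G is doubly stochastic, the c.g. (here the origin) is preserved at every step, and since every other eigenvalue of G has modulus less than 1, the vertices shrink to the origin.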
PROBLEMS

1. Let G = (g_ij) be a p × p matrix. Let the p-gon Z1 be transformed into the p-gon Z2 linearly by means of Z2 = GZ1. What are necessary and sufficient conditions on G that it preserve centers of gravity? Express as an eigenvalue-vector condition.

2. Let G (as in Problem 1) satisfy G^k = I for some positive integer k. Describe the geometric situation upon iteration.

3. Suppose that Z0 is given and that

    Z_{3n+1} = G3·Z_{3n},
    Z_{3n+2} = G1·Z_{3n+1},
    Z_{3n+3} = G2·Z_{3n+2},

for n = 0, 1, ... . Find a formula for Zn.

4. Generalize this section to space p-gons (in three dimensions).

5. Develop analytical apparatus for generalizing this section to nested polyhedra. In particular, let T1 be a tetrahedron. Let T2 be the tetrahedron whose vertices are the c.g.'s of the faces of T1. Iterate this.

REFERENCES

Convergence of nested polygons: Berlekamp et al.; Rosenman; Huston; Schoenberg [1].

p-gons in a general setting: Bachmann and Schmidt; Davis [1], [2].

Liapunov functions, limit sets: LaSalle [2].
2 INTRODUCTORY MATRIX MATERIAL

2.1 BLOCK OPERATIONS

It is very often convenient in both theoretical and computer work to partition a matrix into submatrices. This can be done in numerous ways. Each submatrix or block can be labeled by subscripts, and we can display the original matrix with submatrices or blocks for its elements. The general form of a partitioned matrix therefore is

(2.1.1)
        ( A11 A12 ··· A1s )
    A = ( A21 A22 ··· A2s )
        (  ⋮            ⋮ )
        ( Ar1 Ar2 ··· Ars )

Dotted lines, bars, and commas are all used in an obvious way to indicate partitions. The sizes of the blocks must be such that they all fit together properly. This means that the number of rows in each A_ij must be the same for each i, and the number of columns must be the same for each j. The size of A_ij is therefore m_i × n_j for certain integers m_i and n_j.

A square matrix A of order n is often partitioned symmetrically. Suppose that n = n1 + n2 + ··· + nr with n_i ≥ 1. Partition A as

(2.1.2)
        ( A11 A12 ··· A1r )
    A = ( A21 A22 ··· A2r )
        (  ⋮            ⋮ )
        ( Ar1 Ar2 ··· Arr )

where size A_ij = n_i × n_j. The diagonal blocks A_ii are square matrices of order n_i.
Example.

    X X X | X X X
    X X X | X X X
    X X X | X X X
    ------+------
    X X X | X X X
    X X X | X X X
    X X X | X X X

is a symmetric partition of a 6 × 6 matrix.

Square matrices are often built up, or compounded, of square blocks all of the same size, as in the partition depicted above. If a square matrix A of order nk is composed of n × n square submatrices all of order k, it is termed an (n, k) matrix. Thus the matrix depicted above is a (2, 3) matrix.

Subject to certain conformability conditions on the blocks, the operations of scalar product, transpose, conjugation, addition, and multiplication are carried out in the same way when expressed in block notation as when they are expressed in element notation. This means

(2.1.3) αA = (αA_ij),
(2.1.4) A^T = (A_ji^T),
(2.1.5) A* = (A_ji*),
(2.1.6) A + B = (A_ij + B_ij),
(2.1.7) AB = C = (C_ij), where C_ij = Σ_{r=1}^R A_ir B_rj.

Here T designates the transpose and * the conjugate transpose.

In (2.1.6) the size of each A_ij must be the size of the corresponding B_ij. In (2.1.7), designate the size of A_ij by α_i × β_j and the size of B_ij by γ_i × δ_j. Then, if β_r = γ_r for 1 ≤ r ≤ R, the product A_ir B_rj can be formed and produces an α_i × δ_j matrix, independently of r. The sum can then be found as indicated, and the C_ij are α_i × δ_j matrices and together constitute a partition. Note that the rule for forming the blocks C_ij of the matrix product is the same as when the A_ij and B_ij are single numbers.
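The block-multiplication rule can be verified mechanically. The following illustrative sketch (all helpers are our own) checks it for a 4 × 4 matrix partitioned into four 2 × 2 blocks:

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def madd(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def block(M, i, j):
    # the 2 x 2 block of M in block position (i, j)
    return [row[2 * j:2 * j + 2] for row in M[2 * i:2 * i + 2]]

A = [[1, 2, 0, 1], [3, 4, 1, 0], [0, 1, 2, 2], [1, 0, 3, 1]]
B = [[2, 1, 1, 0], [0, 1, 0, 1], [1, 0, 2, 1], [0, 2, 1, 1]]
C = matmul(A, B)

# C_01 computed blockwise: A_00 B_01 + A_01 B_11
C01 = madd(matmul(block(A, 0, 0), block(B, 0, 1)),
           matmul(block(A, 0, 1), block(B, 1, 1)))
print(C01 == block(C, 0, 1))
```

The blockwise result agrees with the corresponding block of the elementwise product, exactly as the rule asserts.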
PROBLEMS

1. Let A = A1 ⊕ A2 ⊕ ··· ⊕ Ak. Prove that det A = Π_{i=1}^k det A_i and that, for integer p, A^p = A1^p ⊕ A2^p ⊕ ··· ⊕ Ak^p.

2. Give a linear algebra interpretation of the direct sum along the following lines. Let V be a finite dimensional vector space and let L and M be subspaces. Write V = L ⊕ M if and only if every vector x ∈ V can be written uniquely in the form x = y + z with y ∈ L, z ∈ M. Show that V = L ⊕ M if and only if

   (a) dim V = dim L + dim M, L ∩ M = {0};

   (b) if {x1, ..., xl} and {y1, ..., ym} are bases for L and M, then {x1, ..., xl, y1, ..., ym} is a basis for V.

3. The fundamental theorem of rank-canonical form for square matrices tells us that if A is an n × n matrix of rank r, then there exist nonsingular matrices P, Q such that PAQ = I_r ⊕ O_{n-r}. Verify this formulation.
2.3 KRONECKER PRODUCT

Let A and B be m × n and p × q respectively. Then the Kronecker product (or tensor, or direct product) of A and B is that mp × nq matrix defined by

(2.3.1) A ⊗ B = (a_ij B), i = 1, ..., m, j = 1, ..., n.

Important properties of the Kronecker product are as follows (indicated operations are assumed to be defined):

(1) (αA) ⊗ B = A ⊗ (αB) = α(A ⊗ B); α scalar.

(2) (A + B) ⊗ C = (A ⊗ C) + (B ⊗ C).

(3) A ⊗ (B + C) = (A ⊗ B) + (A ⊗ C).

(4) A ⊗ (B ⊗ C) = (A ⊗ B) ⊗ C.

(5) (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD).

(6) The conjugate of A ⊗ B is Ā ⊗ B̄.

(7) (A ⊗ B)^T = A^T ⊗ B^T; (A ⊗ B)* = A* ⊗ B*.
We now assume that A and B are square and of orders m and n. Then:

(9) tr(A ⊗ B) = (tr A)(tr B).

(10) If A and B are nonsingular, so is A ⊗ B, and (A ⊗ B)^(-1) = A^(-1) ⊗ B^(-1).

(11) det(A ⊗ B) = (det A)^n (det B)^m.

(12) There exists a permutation matrix P (see Section 2.4), depending only on m, n, such that B ⊗ A = P*(A ⊗ B)P.

(13) Let p(x, y) designate the polynomial p(x, y) = Σ_{j,k} c_{jk} x^j y^k, and let p(A; B) designate the mn × mn matrix p(A; B) = Σ_{j,k} c_{jk} (A^j ⊗ B^k). Then the eigenvalues of p(A; B) are p(λ_r, μ_s), r = 1, 2, ..., m, s = 1, 2, ..., n, where λ_r and μ_s are the eigenvalues of A and B respectively. In particular, the eigenvalues of A ⊗ B are λ_r μ_s, r = 1, 2, ..., m, s = 1, 2, ..., n.
PROBLEMS

1. Show that I_m ⊗ I_n = I_mn.

2. Describe the matrices I ⊗ A, A ⊗ I.

3. If A is m × m and B is n × n, then A ⊗ B = (A ⊗ I_n)(I_m ⊗ B) = (I_m ⊗ B)(A ⊗ I_n).

4. If A and B are upper (or lower) triangular, then so is A ⊗ B.

5. If A ⊗ B ≠ 0 is diagonal, so are A and B.

6. Let A and B have orders m, n respectively. Show that the matrix (I_m ⊗ B) + (A ⊗ I_n) has the eigenvalues λ_r + μ_s, r = 1, 2, ..., m, s = 1, 2, ..., n, where λ_r and μ_s are the eigenvalues of A and B. This matrix is often called the Kronecker sum of A and B.

7. Let A and B be of orders m and n. If A and B both are (1) normal, (2) Hermitian, (3) positive definite, (4) positive semidefinite, or (5) unitary, then A ⊗ B has the corresponding property. See Section 2.9.

8. Kronecker powers: Let A^[2] = A ⊗ A and, in general, A^[k+1] = A ⊗ A^[k]. Prove that A^[k+l] = A^[k] ⊗ A^[l].

9. Prove that (AB)^[k] = A^[k] B^[k].

10. Let Ax = λx and By = μy, x = (x1, ..., xm)^T. Define z by z^T = (x1 y^T, x2 y^T, ..., xm y^T). Prove that (A ⊗ B)z = λμz.
2.4 PERMUTATION MATRICES

By a permutation σ of the set N = {1, 2, ..., n} is meant a one-to-one mapping of N onto itself. Including the identity permutation, there are n! distinct permutations of N. One can indicate a typical permutation by

(2.4.1) σ(1) = i1, σ(2) = i2, ..., σ(n) = in,

which is often written as

(2.4.2) σ = ( 1  2  ···  n  )
            ( i1 i2 ···  in )

The inverse permutation is designated by σ^(-1). Thus σ^(-1)(i_k) = k.

Let e_j designate the unit (row) vector of n components which has a 1 in the jth position and 0's elsewhere:

(2.4.3) e_j = (0, ..., 0, 1, 0, ..., 0).

By a permutation matrix of order n is meant a matrix of the form

(2.4.4) P = Pσ = (a_ij), where a_{i,σ(i)} = 1, i = 1, 2, ..., n, and a_ij = 0 otherwise.

The ith row of P has a 1 in the σ(i)th column and 0's elsewhere. The jth column of P has a 1 in the σ^(-1)(j)th row and 0's elsewhere. Thus each row and each column of P has precisely one 1 in it.

It is easily seen that

(2.4.6) Pσ A = (a_{σ(i),j});

that is, Pσ A is A with its rows permuted by σ. Moreover, if A = (a_ij) is m × n,

(2.4.8) A Pσ = (a_{i,σ^(-1)(j)});

that is, APσ is A with its columns permuted by σ^(-1). Note also that

(2.4.9) Pσ Pτ = Pστ,

where the product of the permutations σ, τ is applied from left to right. Furthermore,

Pσ Pσ^T = Pσ^T Pσ = I,

hence

(2.4.12) Pσ^(-1) = Pσ^T = Pσ* = P_{σ^(-1)}.

The permutation matrices are thus unitary, forming a subgroup of the unitary group.

From (2.4.6), (2.4.8), and (2.4.12) it follows that if A is n × n,

(2.4.13) Pσ A Pσ* = (a_{σ(i),σ(j)}),

so that the similarity transformation Pσ A Pσ* causes a consistent renumbering of the rows and columns of A by the permutation σ.
Among the permutation matrices, the matrix

(2.4.14)
        ( 0 1 0 ··· 0 )
        ( 0 0 1 ··· 0 )
    π = (  ⋮        ⋮ )
        ( 0 0 0 ··· 1 )
        ( 1 0 0 ··· 0 )

plays a fundamental role in the theory of circulants. This corresponds to the forward shift permutation σ(1) = 2, σ(2) = 3, ..., σ(n-1) = n, σ(n) = 1, that is, to the cycle σ = (1, 2, 3, ..., n) generating the cyclic group of order n (π is for "push"). One has π², corresponding to σ², for which σ²(1) = 3, σ²(2) = 4, ..., σ²(n) = 2; similarly for π^k and σ^k. The matrix πⁿ corresponds to σⁿ = identity, so that

(2.4.16) πⁿ = I.

Note also that

(2.4.17) π^T = π* = π^(-1) = π^(n-1).

A particular instance of (2.4.13) is

(2.4.18) π A π^T = (a_{i+1,j+1}),

where A = (a_ij) and the subscripts are taken mod n.
Here is a second instance. Let Λ = (λ1, λ2, ..., λn)^T. Then, for any permutation matrix Pσ,

(2.4.19) Pσ (diag Λ) Pσ* = diag(Pσ Λ).

A second permutation matrix of importance is

(2.4.20)
        ( 1 0 ··· 0 0 )
        ( 0 0 ··· 0 1 )
    Γ = ( 0 0 ··· 1 0 )
        (  ⋮        ⋮ )
        ( 0 1 ··· 0 0 )

which corresponds to the permutation σ(1) = 1, σ(2) = n, σ(3) = n - 1, ..., σ(j) = n - j + 2, ..., σ(n) = 2. Exhibited as a product of cycles, σ = (1)(2, n)(3, n - 1) ··· . It follows that σ² is the identity, hence that

(2.4.21) Γ² = I.

Also,

(2.4.22) Γ* = Γ^T = Γ = Γ^(-1).

Again, as an instance of (2.4.13),

(2.4.23) Γ (diag Λ) Γ^(-1) = diag(ΓΛ).

Finally, we cite the counteridentity K, which has 1's on the main counterdiagonal and 0's elsewhere:

(2.4.24)
        ( 0 ··· 0 1 )
    K = ( 0 ··· 1 0 )
        (  ⋮        )
        ( 1 0 ··· 0 )

One has K = K*, K² = I, K = K^(-1).
Let P = Pσ designate an n × n permutation matrix. Now σ may be factored into a product of disjoint cycles. This factorization is unique up to the arrangement of factors. Suppose that the cycles in the product have lengths p1, p2, ..., pm (p1 + p2 + ··· + pm = n). Let π_{pk} designate the π matrix (2.4.14) of order pk. By a rearrangement of rows and columns, the cycles in Pσ can be brought into a form involving only contiguous indices, that is, indices that are successive integers. By (2.4.13), then, there exists a permutation matrix R of order n such that

(2.4.25) RPR* = RPR^(-1) = π_{p1} ⊕ π_{p2} ⊕ ··· ⊕ π_{pm}.

Since the characteristic polynomial of π_{pk} is (-1)^{pk}(λ^{pk} - 1), it follows that the characteristic polynomial of RPR*, hence of P, is Π_{k=1}^m (-1)^{pk}(λ^{pk} - 1). The eigenvalues of the permutation matrix P are therefore the roots of unity comprised in the totality of roots of the m equations

λ^{pk} = 1, k = 1, 2, ..., m.

Example. Let σ be the permutation of 1, 2, 3, 4, 5, 6 for which σ(1) = 5, σ(2) = 1, σ(3) = 6, σ(4) = 4, σ(5) = 2, σ(6) = 3. Then σ can be factored into cycles as σ = (1, 5, 2)(4)(3, 6). Therefore m = 3 and p1 = 3, p2 = 1, p3 = 2. The matrix Pσ is

    ( 0 0 0 0 1 0 )
    ( 1 0 0 0 0 0 )
    ( 0 0 0 0 0 1 )
    ( 0 0 0 1 0 0 )
    ( 0 1 0 0 0 0 )
    ( 0 0 1 0 0 0 )
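The example can be checked by machine. This illustrative sketch builds Pσ from (2.4.4) and confirms that its multiplicative order is lcm(3, 1, 2) = 6, as the cycle lengths predict:

```python
sigma = {1: 5, 2: 1, 3: 6, 4: 4, 5: 2, 6: 3}
n = 6
# (2.4.4): a 1 in row i, column sigma(i) (1-indexed), zeros elsewhere
P = [[1 if sigma[i + 1] == j + 1 else 0 for j in range(n)]
     for i in range(n)]
I = [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

powers = [P]
for _ in range(5):
    powers.append(matmul(powers[-1], P))
print([Q == I for Q in powers])   # only the 6th power is the identity
```

The matrix is also orthogonal, in line with (2.4.12): P times its transpose is the identity.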
(2.5.2) (a) wⁿ = 1,
        (b) w·w̄ = 1,
        (c) w̄ = w^(-1),
        (d) w̄^k = w^(-k) = w^(n-k),
        (e) 1 + w + w² + ··· + w^(n-1) = 0.

By the Fourier matrix of order n, we shall mean the matrix F (= F_n) where

(2.5.3)
             ( 1  1        1        ···  1          )
             ( 1  w        w²       ···  w^(n-1)    )
F* = n^(-1/2)( 1  w²       w⁴       ···  w^(2(n-1)) )
             (  ⋮                        ⋮          )
             ( 1  w^(n-1)  w^(2(n-1)) ·· w^((n-1)²) )

Note the star on the left-hand member. The sequence w^k, k = 0, 1, ..., is periodic; hence there are only n distinct elements in F. F can therefore be written alternatively as

(2.5.4) F* = n^(-1/2)(w^((i-1)(j-1))), i, j = 1, 2, ..., n.

It is easily established that F and F* are symmetric:

(2.5.5) F^T = F, (F*)^T = F*, F̄ = F*.

It is of fundamental importance that

Theorem 2.5.1. F is unitary:

(2.5.6) FF* = F*F = I, or F^(-1) = F*, or (F*)^(-1) = F.

Proof. This is a result of the geometric series identity

(2.5.7) Σ_{r=0}^{n-1} w^(r(j-k)) = n if j = k, and
        Σ_{r=0}^{n-1} w^(r(j-k)) = (1 - w^(n(j-k)))/(1 - w^(j-k)) = 0 if j ≠ k.
A second application of the geometric identity yields

Theorem 2.5.2.

             ( 1 0 ··· 0 0 )
             ( 0 0 ··· 0 1 )
    (F*)² =  ( 0 0 ··· 1 0 ) = Γ.
             (  ⋮        ⋮ )
             ( 0 1 ··· 0 0 )

Corollary. (F*)⁴ = Γ² = I, and (F*)³ = (F*)⁴(F*)^(-1) = F.

We may write the Fourier matrix picturesquely in the form

F* = I^(1/4).

(It may be shown that all the qth roots of I are of the form MDM^(-1), where D = diag(μ1, μ2, ..., μn), μ_i^q = 1, and where M is any nonsingular matrix.)

Corollary. The eigenvalues of F are ±1, ±i, with appropriate multiplicities.
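The unitarity of F and the relation F⁴ = I are easy to verify numerically. An illustrative sketch with n = 8, using pure-Python complex arithmetic (the normalization follows (2.5.3) as reconstructed above):

```python
import cmath

n = 8
w = cmath.exp(2j * cmath.pi / n)
Fstar = [[w ** (j * k) / n ** 0.5 for k in range(n)] for j in range(n)]
F = [[w ** (-j * k) / n ** 0.5 for k in range(n)] for j in range(n)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

FF = matmul(F, Fstar)                     # should be (numerically) I
F4 = matmul(matmul(F, F), matmul(F, F))   # should also be I
err = max(abs(FF[i][j] - (1 if i == j else 0))
          for i in range(n) for j in range(n))
print(err)
```

One can also confirm Theorem 2.5.2 on the same data: F² has a 1 in position (1, 1) and in positions (j, n - j + 2), i.e., it is the permutation Γ.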
Carlitz has obtained the characteristic polynomials f(λ) of F* (= F_n*). They are as follows.

n ≡ 0 (mod 4):  f(λ) = (λ - 1)^2 (λ + 1)(λ - i)(λ^4 - 1)^{(n/4)-1},

n ≡ 1 (mod 4):  f(λ) = (λ - 1)(λ^4 - 1)^{(1/4)(n-1)},

n ≡ 2 (mod 4):  f(λ) = (λ^2 - 1)(λ^4 - 1)^{(1/4)(n-2)},

n ≡ 3 (mod 4):  f(λ) = (λ - i)(λ^2 - 1)(λ^4 - 1)^{(1/4)(n-3)}.

The discrete Fourier transform. Working with complex n-tuples, write

Z = (z1, z2, ..., zn)^T  and  Ẑ = (ẑ1, ẑ2, ..., ẑn)^T.

The linear transformation

(2.5.8) Ẑ = FZ,

where F is the Fourier matrix, is known as the discrete Fourier transform (DFT). Its inverse is given simply by

(2.5.9) Z = F^{-1}Ẑ = F*Ẑ.

The transform (2.5.8) often goes by the name of harmonic analysis or periodogram analysis, while the inverse transform (2.5.9) is called harmonic synthesis. The reasons behind these terms are as follows. Suppose that p(z) = a0 + a1 z + ... + a_{n-1} z^{n-1} is a polynomial of degree ≤ n - 1. It will be determined uniquely by specifying its values p(z_k) at n distinct points z_k, k = 1, 2, ..., n, in the complex plane. Select these points z_k as the n roots of unity 1, w, w^2, ..., w^{n-1}. Then clearly

(2.5.10) p(w^{k-1}) = Σ_{j=0}^{n-1} a_j w^{(k-1)j},  k = 1, 2, ..., n,

so that

(2.5.11) (a0, a1, ..., a_{n-1})^T = n^{-1/2} F (p(1), p(w), ..., p(w^{n-1}))^T.

The passage from functional values to coefficients through (2.5.11) or (2.5.8) is an analysis of the function, while in the passage from coefficient values to functional values through (2.5.10) or (2.5.9) the functional values are built up or "synthesized."

These formulas for interpolation at the roots of unity can be given another form.

By a Vandermonde matrix V(z0, z1, ..., z_{n-1}) is meant a matrix of the form

(2.5.12) V(z0, z1, ..., z_{n-1}) = ( 1  z0       z0^2       ...  z0^{n-1}      )
                                   ( 1  z1       z1^2       ...  z1^{n-1}      )
                                   ( ...................................... )
                                   ( 1  z_{n-1}  z_{n-1}^2  ...  z_{n-1}^{n-1} ).

From (2.5.4) one has, clearly,

(2.5.13) V(1, w, w^2, ..., w^{n-1}) = n^{1/2} F*,
         V(1, w̄, w̄^2, ..., w̄^{n-1}) = n^{1/2} F̄* = n^{1/2} F.

One now has from (2.5.11)

(2.5.14) p(z) = (1, z, ..., z^{n-1})(a0, a1, ..., a_{n-1})^T
              = n^{-1/2} (1, z, ..., z^{n-1}) F (p(1), p(w), ..., p(w^{n-1}))^T
              = n^{-1} (1, z, ..., z^{n-1}) V(1, w̄, w̄^2, ..., w̄^{n-1}) (p(1), p(w), ..., p(w^{n-1}))^T.
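The analysis-synthesis pair is easy to demonstrate numerically. The sketch below (ours, not from the original text; the polynomial is chosen arbitrarily) evaluates a polynomial at the roots of unity and then recovers its coefficients:

```python
import cmath

# p(z) = 1 + 2z + 3z^2, padded to n = 4 coefficients
coeffs = [1.0, 2.0, 3.0, 0.0]
n = len(coeffs)
w = cmath.exp(2j * cmath.pi / n)

# "Synthesis": functional values p(w^k) from the coefficients
values = [sum(c * w ** (k * j) for j, c in enumerate(coeffs)) for k in range(n)]

# "Analysis": recover the coefficients, a_j = (1/n) sum_k p(w^k) w^{-jk}
# (the two factors n^{-1/2} of the normalized transforms combined into 1/n)
recovered = [sum(v * w ** (-j * k) for k, v in enumerate(values)) / n
             for j in range(n)]

print([round(abs(c), 6) for c in recovered])   # [1.0, 2.0, 3.0, 0.0]
```

Evaluation at the roots of unity and recovery of the coefficients are inverse maps, which is exactly the content of the interpolation formulas above.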
3 6 Introductory Matrix Material
Note. In the literature of signal processing, a sequence-to-sequence transform is known as a discrete or digital filter. Very often the transform (such as (2.5.8)) is linear and is called a linear filter.
Fourier Matrices as Kronecker Products. The Fourier matrices of orders 2^n may be expressed as Kronecker products. This factorization is a manifestation, essentially, of the idea known as the Fast Fourier Transform (FFT) and is of vital importance in real-time calculations.

Let F'_{2^n} designate the Fourier matrix of order 2^n whose rows have been permuted according to the bit-reversing permutation (see Problem 6, p. 30).
Examples

F'_2 = 2^{-1/2} ( 1  1 )
                ( 1 -1 ).

One has

(2.5.15) F'_4 = (I_2 ⊗ F'_2) D_4 (F'_2 ⊗ I_2),

where D_4 = diag(1, 1, 1, i). This may be easily checked out.

As is known, A ⊗ B = P(B ⊗ A)P* for some permutation matrix P that depends merely on the dimensions of A and B. We may therefore write, for some permutation matrix S_4 (one has, in fact, S_4^{-1} = S_4):

(2.5.16) F'_4 = (I_2 ⊗ F'_2) D_4 S_4 (I_2 ⊗ F'_2) S_4.

Similarly,

(2.5.17) F'_16 = (I_4 ⊗ F'_4) D_16 (F'_4 ⊗ I_4),

where

(2.5.18) D_16 = diag(I_4, D, D^2, D^3),

with

(2.5.19) D = diag(1, w, w^2, w^3),  w = exp(2πi/16).

Again, for an appropriate permutation matrix S_16 (S_16^{-1} = S_16^T),

(2.5.20) F'_16 = (I_4 ⊗ F'_4) D_16 S_16 (I_4 ⊗ F'_4) S_16^{-1}.

For 256 use

(2.5.21) F'_256 = (I_16 ⊗ F'_16) D_256 S_256 (I_16 ⊗ F'_16) S_256^{-1},

where D_256 = diag(D^0, D^8, D^4, ..., D^15), the sequence 0, 8, 4, ..., 15 being the bit-reversed order of 0, 1, ..., 15, and where

(2.5.22) D = diag(1, w, ..., w^15),  w = e^{2πi/256}.
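The practical force of these factorizations is that a transform of order 2n reduces to two transforms of order n plus O(n) "twiddle-factor" multiplications. A minimal recursive radix-2 sketch (ours, not the book's; unnormalized, so it computes the sums Σ_k z_k w^{jk} rather than the unitary transform) compared against the direct O(n²) evaluation:

```python
import cmath

def fft(z):
    # X_j = sum_k z_k w^{jk}, w = exp(2*pi*i/n); len(z) must be a power of 2
    n = len(z)
    if n == 1:
        return z[:]
    even, odd = fft(z[0::2]), fft(z[1::2])   # two half-size transforms
    X = [0j] * n
    for j in range(n // 2):
        t = cmath.exp(2j * cmath.pi * j / n) * odd[j]   # twiddle factor w^j
        X[j] = even[j] + t
        X[j + n // 2] = even[j] - t
    return X

def dft(z):
    # direct O(n^2) evaluation, for comparison
    n = len(z)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(z[k] * w ** (j * k) for k in range(n)) for j in range(n)]

z = [1, 2, 3, 4, 0, -1, 0, 1]
print(max(abs(a - b) for a, b in zip(fft(z), dft(z))) < 1e-9)   # True
```

The recursion performs O(n log n) operations in place of O(n²), which is the point of the Kronecker-product factorizations above.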
PROBLEMS

1. Evaluate det F_n.

2. Find the polynomial p_n(z) of degree ≤ n - 1 that takes on the values 1/z_j at the nth roots of unity z_j, j = 1, 2, ..., n. What is the limiting behavior of p_n(z) as n → ∞? (de Moivre)

3. Write F = R + iS where R and S are real and i = √(-1). Show that R and S are symmetric and that R^2 + S^2 = I, RS = SR.

4. Exhibit R and S explicitly.
2.6 HADAMARD MATRICES
By a Hadamard matrix of order n, H (= H_n), is meant a matrix whose elements are either +1 or -1 and for which

(2.6.1) HH^T = nI.

Thus, n^{-1/2}H is an orthogonal matrix.
Examples

H_1 = (1),

H_2 = √2 F_2 = ( 1  1 )
               ( 1 -1 ),

H_4 = ( 1  1  1  1 )
      ( 1 -1  1 -1 )
      ( 1  1 -1 -1 )
      ( 1 -1 -1  1 ).
It is known that if n ≥ 3, then the order of an Hadamard matrix must be a multiple of 4. With one possible exception, all multiples of 4 up to 200 yield at least one Hadamard matrix.
Theorem 2.6.1. If A and B are Hadamard matrices of orders m and n respectively, then A ⊗ B is an Hadamard matrix of order mn.

Proof. (A ⊗ B)(A ⊗ B)^T = (A ⊗ B)(A^T ⊗ B^T) = (AA^T) ⊗ (BB^T) = (mI_m) ⊗ (nI_n) = mnI_{mn}.
In some areas, particularly digital signal processing, the term Hadamard matrix is limited to the matrices of order 2^n given specifically by the recursion

(2.6.2) H_{2^{n+1}} = H_2 ⊗ H_{2^n} = ( H_{2^n}  H_{2^n} )
                                      ( H_{2^n} -H_{2^n} ).
Hadamard Matrices 3 9
These matrices have the additional property of being symmetric,

H_{2^n}^T = H_{2^n},

so that

H_{2^n}^2 = H_{2^n}H_{2^n}^T = 2^n I.
The Walsh-Hadamard Transform. By this is meant the transform

X̂ = HX,

where H is an Hadamard matrix.
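The recursion (2.6.2) is immediate to program. A sketch (ours, not part of the original text) builds H_8 by the recursion and checks the defining property HH^T = nI as well as the symmetry just noted:

```python
def next_hadamard(H):
    # H_{2n} = ( H  H )
    #          ( H -H )   -- the recursion for orders 2^n
    return ([row + row for row in H] +
            [row + [-x for x in row] for row in H])

H = [[1]]
for _ in range(3):          # three doublings: build H_8
    H = next_hadamard(H)

n = len(H)
gram = [[sum(H[i][k] * H[j][k] for k in range(n)) for j in range(n)]
        for i in range(n)]
print(all(gram[i][j] == (n if i == j else 0)
          for i in range(n) for j in range(n)))       # H H^T = n I: True
print(H == [list(row) for row in zip(*H)])            # H is symmetric: True
```

Entries are ±1 and the rows are mutually orthogonal, so n^{-1/2}H is orthogonal, as the text observes.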
PROBLEMS
1. Hadamard parlor game: Write down in a row any four numbers. Then write the sum of the first two, the sum of the last two, the difference of the first two, and the difference of the last two to form a second row. Iterate this procedure four times. The final row will be four times the original row. Explain, making reference to H_4. Generalize.
2. Define a generalized permutation matrix P as follows: P is square and every row and every column of P has exactly one nonzero element in it. That element is either a +1 or a -1. Show that if H is an Hadamard matrix, and if P and Q are generalized permutation matrices, then PHQ is an Hadamard matrix.
3. With the notation of (2.6.2) prove that
40 Introductory Matrix Material
4. Using Problem 3, show that the Hadamard transform of a vector by H_{2^n} can be carried out in ≤ n·2^n additions or subtractions.

5. If H is an Hadamard matrix of order n, prove that |det H| = n^{n/2}.
2.7 TRACE
The trace of a square matrix A = (a_{ij}) of order n is defined as the sum of its diagonal elements:

(2.7.1) tr A = Σ_{i=1}^n a_{ii}.
The principal general properties of the trace are

(1) tr(aA + bB) = a tr(A) + b tr(B).
(2) tr(AB) = tr(BA).
(3) tr A = tr(S^{-1}AS), S nonsingular.
(4) If λ_i are the eigenvalues of A, then tr A = Σ_{i=1}^n λ_i.
(5) More generally, if p designates a polynomial, then tr(p(A)) = Σ_{k=1}^n p(λ_k).
(6) tr(AA*) = tr(A*A) = Σ_{i,j=1}^n |a_{ij}|^2 = square of the Frobenius norm of A.
(7) tr(A ⊕ B) = tr A + tr B.
(8) tr(A ⊗ B) = (tr A)(tr B).
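Properties (2) and (8), say, are easy to spot-check numerically. A small sketch (ours; the matrices are arbitrary):

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def kron(A, B):
    # Kronecker product A ⊗ B: block (i,j) is A[i][j] * B
    return [[a * b for a in rowA for b in rowB]
            for rowA in A for rowB in B]

A = [[1, 2], [3, 4]]
B = [[0, 5], [6, 7]]

print(trace(matmul(A, B)) == trace(matmul(B, A)))     # property (2): True
print(trace(kron(A, B)) == trace(A) * trace(B))       # property (8): True
```

Of course a numerical check is no proof; both identities follow in one line from the definition of the trace.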
2.8 GENERALIZED INVERSE
For large classes of matrices, such as the square "singular" matrices and the rectangular matrices, no inverse exists. That is, there are many matrices A for which there exists no matrix B such that AB = BA = I.
In discussing the solution of systems of linear equations, we know that if A is n × n and nonsingular then the solution of the equation

AX = B,

where X and B are n × m matrices, can be written very neatly in matrix form as

X = A^{-1}B.
Although the "solution" given above is symbolic, and in general is not the most economical way of solving systems of linear equations, it has important applications. However, we have so far only been able to use this idea for square nonsingular matrices. In this section we show that for every matrix A, whether square or rectangular, singular or nonsingular, there exists a unique "generalized inverse," often called the "Moore-Penrose" inverse of A, and employing it, the formal solution X = A^†B can be given a useful interpretation. This generalized inverse has several of the important properties of the inverse of a square nonsingular matrix, and the resulting theory is able in a remarkable way to unify a variety of diverse topics. This theory originated in the 1920s, but was rediscovered in the 1950s and has been developed extensively since then.
2.8.1 Right and Left Inverses
Definition. If A is an m × n matrix, a right inverse of A is an n × m matrix B such that AB = I_m. Similarly, a left inverse is an n × m matrix C such that CA = I_n.
Example. If

A = ( 1 1 1 )
    ( 0 1 1 ),

a right inverse of A is the matrix

B = ( 1 -1 )
    ( 0  1 )
    ( 0  0 ),

since AB = I_2. However, note that A does not have a left inverse, since for any matrix C, by the theorem on the rank of a product, r(CA) ≤ r(A) = 2, so that CA ≠ I_3. Similarly, although A is, by definition, a left inverse of B, there exists no right inverse of B.
The following theorem gives necessary and sufficient conditions for the existence of a right or left inverse.

Theorem 2.8.1.1. An m × n matrix A has a right (left) inverse if and only if A has rank m (n).
Proof. We work first with right inverses. Assume that AB = I_m. Then m = r(I_m) ≤ r(A) ≤ m. Hence r(A) = m.

Conversely, suppose that r(A) = m. Then A has m linearly independent columns, and we can find a permutation matrix P so that the matrix Â = AP has its first m columns linearly independent. Now, if we can find a matrix B̂ such that ÂB̂ = APB̂ = I, then B = PB̂ is clearly a right inverse for A. Therefore, we may assume, without loss of generality, that A has its first m columns linearly independent. Hence A can be written in the block form

A = (A1, A2),

where A1 is an m × m nonsingular matrix and A2 is some m × (n - m) matrix. This can be factored to yield

A = A1(I_m, Q),   Q = A1^{-1}A2.

Now let

B = ( B1 )
    ( B2 ),

where B1 is m × m and B2 is (n - m) × m. Then AB = I if and only if

A1B1 + A1QB2 = I,

or if and only if

B1 + QB2 = A1^{-1},

or if and only if

B1 = A1^{-1} - QB2.

Therefore, we have

B = ( A1^{-1} - QB2 )
    ( B2            )

for an arbitrary (n - m) × m matrix B2. Thus there is a right inverse, and if n > m, it is not unique.

We now prove the theorem for a left inverse. Suppose, again, that A is m × n and r(A) = n. Then A^T is n × m and r(A^T) = n. By the first part, A^T has a right inverse: A^TB = I. Hence B^TA = I and A has a left inverse.
Corollary. If A is n × n of rank n, then A has both a right and a left inverse and they are the same.

Proof. The existence of a right and a left inverse for A follows immediately from the theorem. To prove that they are the same we assume

AB = I,  CA = I.

Then C(AB) = CI = C. But also, C(AB) = (CA)B = IB = B, so that B = C. This is the matrix that is defined to be the inverse of A, denoted by A^{-1}.
PROBLEMS
1. Find a left inverse for (i i) . Find all the left inverses.
have a left inverse?
3. Let A be m x n and have a left inverse B. Suppose that the system of linear equations AX = C has a solution. Prove that the solution is unique and is given by X = BC.
4. Let B be a left inverse for A. Prove that ABA = A and BAB = B.
5. Let A be m × n and have rank n. Prove that A^TA is nonsingular and that (A^TA)^{-1}A^T is a left inverse for A.

6. Let A be m × n and have rank n. Let W be m × m positive definite symmetric. Prove that A^TWA is nonsingular and that (A^TWA)^{-1}A^TW is a left inverse for A.
2.8.2 Generalized Inverses
Definition. Let A be an m x n matrix. Then an n x m matrix X that satisfies any or all of the following properties is called a generalized inverse:
(1) AXA = A,
(2) XAX = X,
(3) (AX)* = AX,
(4) (XA)* = XA.
Here the star * represents the conjugate transpose. A matrix satisfying all four of the properties above is called a Moore-Penrose inverse of A (for short: an MP inverse). We show now that every matrix A has a unique MP inverse. It is denoted here by A^†. It should be remarked that the MP inverse is often designated by other symbols, such as A^+. The notation A^† is used here because (a) it is highly suggestive and (b) it comes close to one used in the APL computer language.
We first prove the following lemma on "rank factorization" of a matrix.
Lemma. If A is an m × n matrix of rank r, then A = BC, where B is m × r, C is r × n, and r(B) = r(C) = r.
Proof. Since the rank of A is r, A has r linearly independent columns. We may assume, without loss of generality, that these are the first r columns of A; for, if not, there exists a permutation matrix P such that the first r columns of the matrix AP are r linearly independent columns of A. But if AP can be factored as

AP = BĈ,  r(B) = r(Ĉ) = r,

then

A = BC,

where C = ĈP^{-1} and r(C) = r(Ĉ) = r, since P is nonsingular.

Thus, if we let B be the m × r matrix consisting of the first r columns of A, the remaining n - r columns are linear combinations of the columns of B, of the form BQ^{(j)} for some r × 1 vector Q^{(j)}. Then if we let Q be the r × (n - r) matrix

Q = (Q^{(1)}, ..., Q^{(n-r)}),

we have

A = (B, BQ)   (the blocks having r and n - r columns respectively).

If we let

C = (I_r, Q),

we have

A = B(I_r, Q) = BC

and r(B) = r(C) = r.
We next show the existence of an MP inverse in the case where A has full row or full column rank.

Theorem 2.8.2.1

(a) If A is square and nonsingular, set A^† = A^{-1}.
(b) If A is n × 1 (or 1 × n) and A ≠ 0, set A^† = (A*A)^{-1}A* (or A^† = A*(AA*)^{-1}).
(c) If A is m × n and r(A) = m, set A^† = A*(AA*)^{-1}. If A is m × n and r(A) = n, set A^† = (A*A)^{-1}A*.

Then A^† is an MP inverse for A. Moreover, in the case of full row rank, it is a right inverse; in the case of full column rank, it is a left inverse.

Note that (a) and (b) are really special cases of (c).

Proof. Direct calculation. Observe that if A is m × n and r(A) = m, then AA* is m × m. It is well known that r(AA*) = m, so that (AA*)^{-1} can be formed. Similarly for A*A.
We can now show the existence of an MP inverse for any m × n matrix A.

If A = 0, set A^† = 0_{n,m}. This is readily verified to satisfy requirements (1), (2), (3), and (4) for a generalized inverse.

If A ≠ 0, factor A as in the lemma into the product

A = BC,

where B is m × r, C is r × n, and r(B) = r(C) = r. Now B has full column rank while C has full row rank, so that B^† and C^† may be found as in the previous theorem. Now set

A^† = C^†B^†.

Theorem 2.8.2.2. Let A^† be defined as above. Then it is an MP inverse for A.
Proof. It is easier to verify properties (3) and (4) first. They will then be used in proving properties (1) and (2).

(3) AA^† = B(CC^†)B^† = BI_rB^† = BB^†, and since BB^† = B(B*B)^{-1}B* = (BB^†)*, we have AA^† = (AA^†)*.

(4) Similarly, A^†A = C^†C = (C^†C)* = (A^†A)*.

(1) (AA^†)A = (BB^†)BC = BC = A.

(2) (A^†A)A^† = (C^†C)C^†B^† = C^†B^† = A^†.
Now we prove that for any matrix A the MP inverse is unique.
Theorem 2.8.2.3. Given an m × n matrix A, there is only one matrix A^† that satisfies all four properties for the Moore-Penrose inverse.
Proof. Suppose that there exist matrices B and C satisfying

ABA = A (1),   BAB = B (2),   (AB)* = AB (3),   (BA)* = BA (4),

and

ACA = A (1'),   CAC = C (2'),   (AC)* = AC (3'),   (CA)* = CA (4').

Then

B =(2) BAB =(3) BB*A* =(1') BB*(ACA)* = BB*A*C*A* =(3,3') B(AB)(AC) =(2) BAC,

and, similarly,

C =(2') CAC =(4') (CA)*C = A*C*C =(1) (ABA)*C*C = A*B*A*C*C =(4,4') (BA)(CA)C =(2') BAC.

Therefore B = C. The labels over the equality signs show the equations used to derive each equality.
Penrose has given the following recursive method for computing A^†, which is included in case the reader would like to write a computer program.

Theorem 2.8.2.4 (the Penrose algorithm). Let A be m × n and have rank r > 0.

(a) Set B = A*A (B is n × n).
(b) Set C1 = I (C1 is n × n).
(c) Set recursively, for i = 1, 2, ..., r - 1:

C_{i+1} = (1/i) tr(C_iB)I - C_iB   (C_{i+1} is n × n).

Then tr(C_rB) ≠ 0 and A^† = rC_rA*/tr(C_rB). Moreover, C_{r+1}B = 0. We therefore do not need to know r beforehand, but merely stop the recurrence when we have arrived at this stage.

The proof is omitted.
Also very useful is the Greville algorithm.

Theorem 2.8.2.5. Define A_k = (A_{k-1}, a_k), where a_k is the kth column of A and A_{k-1} is the submatrix of A consisting of its first k - 1 columns. Set d_k = A_{k-1}^†a_k and c_k = a_k - A_{k-1}d_k. Set b_k = c_k^† if c_k ≠ 0. If c_k = 0, set b_k = (1 + d_k*d_k)^{-1}d_k*A_{k-1}^†. Then

A_k^† = ( A_{k-1}^† - d_kb_k )
        (        b_k         ).

To start: set A_1^† = 0 if a_1 = 0; if not, set A_1^† = (a_1*a_1)^{-1}a_1*.
PROBLEMS

1. If A = ( 2 2 0 )
          ( 1 2 1 )
          ( 1 2 1 ),

compute A^† and verify that it satisfies the four MP properties.
2. If A = ( 1 1 )
          ( 1 1 )
          ( 1 2 ),

find A^†.

3. If A = ( 1 1 )
          ( 1 2 ),

find A^†.

4. Use Penrose's formulas to compute the inverse of a nonsingular matrix of your choice. Do the same using Greville's algorithm.
5. If c is a nonzero scalar, prove that (cA)^† = (1/c)A^†.

6. Prove that (A^†)^† = A.

7. Prove that (A^†)* = (A*)^†.

8. If d is a scalar, define d^† by d^† = d^{-1} if d ≠ 0, d^† = 0 if d = 0. Let A = diag(d1, ..., dn). Prove that A^† = diag(d1^†, ..., dn^†).

9. Prove that

( A 0 )^†   ( A^† 0   )          ( 0 A )^†   ( 0   B^† )
( 0 B )   = ( 0   B^† )   and    ( B 0 )   = ( A^† 0   ).

10. Prove that if A^† = 0, then A = 0.

11. Let A = ( a b )
            ( c d )

and have rank 1. Prove that

A^† = (|a|^2 + |b|^2 + |c|^2 + |d|^2)^{-1}A*.

12. Let J be the J matrix (all elements 1) of order n. Prove that J^† = (1/n^2)J.

13. Let S be an n × n matrix with 1's on the superdiagonal and 0's elsewhere. Find S^†.

14. Let P be any projection matrix (i.e., P^2 = P, P* = P). Prove that P^† = P.
15. Prove that both AA^† and A^†A are projections.

16. Prove that A^† = (A*A)^†A* = A*(AA*)^†.

17. Prove that r(A) = r(A^†) = r(A^†A) = tr(A^†A).

18. Taking A = (1, 0) and B = (1, 1)^T, show that, in general, (AB)^† ≠ B^†A^†.

19. If a and b are column vectors, then a^† = (a*a)^†a*, and (ab*)^† = (a*a)^†(b*b)^†ba*.

20. Prove that (A ⊗ B)^† = A^† ⊗ B^†.
2.8.3 The UDV Theorem and the MP Inverse
We begin by establishing a theorem that is of great utility in visualizing the action and facilitating the manipulation of rectangular (or square) matrices. This is the UDV theorem, also called the diagonal decomposition theorem or the singular value decomposition theorem.

Theorem 2.8.3.1. Let A be an m × n matrix with complex elements and of rank r. Then there exist unitary matrices U, V of orders m and n respectively such that

(2.8.3.1) A = UDV*,

where

(2.8.3.2) D = ( D1 0 )
              ( 0  0 )

is m × n and where D1 = diag(d1, d2, ..., dr) is a nonsingular diagonal matrix of order r.

Note that the representation (2.8.3.1) can be written as U*AV = D or, changing the notation, UAV = D, and so on (since U and V are unitary).

Let A be m × n; then, as is well known, AA* is positive semidefinite Hermitian and r(AA*) = r(A) = r(A*). Hence the eigenvalues of AA* are real and nonnegative. Write them as d1^2, d2^2, ..., dr^2, 0, 0, ..., 0, where the d_i's are positive and where there are m - r 0's in the list. The numbers d1, d2, ..., dr are known as the singular values of A.
Proof. Define D1 = diag(d1, d2, ..., dr). Let U1 be m × r and consist of the (orthonormal) eigenvectors of AA* corresponding to the eigenvalues d1^2, d2^2, ..., dr^2 (cf. Theorems 2.9.3 and 2.9.9). We have AA*U1 = U1D1^2 and U1*U1 = I_r. Let U2 be the m × (m - r) matrix whose columns consist of an orthonormal basis for the null space of A*. Then A*U2 = 0 and U2*U2 = I_{m-r}. Write U = (U1, U2) (block notation). Then

U*U = ( U1*U1  U1*U2 )
      ( U2*U1  U2*U2 ).

Now, since AA*U1 = U1D1^2, U2*AA*U1 = U2*U1D1^2. But A*U2 = 0, so that U2*A = 0; hence U2*U1D1^2 = 0. Since D1^2 is nonsingular, it follows that U2*U1 = U1*U2 = 0. This means that

U*U = ( I_r  0       )
      ( 0    I_{m-r} ),

and hence that U is unitary.

Let V1 be the n × r matrix defined by V1 = A*U1D1^{-1}. Let V2 be the n × (n - r) matrix whose n - r columns are a set of n - r orthonormal vectors for the null space of A. Thus AV2 = 0 and V2*V2 = I_{n-r}. Define V as the n × n matrix V = (V1, V2). Now

V1*V1 = D1^{-1}U1*AA*U1D1^{-1} = D1^{-1}D1^2D1^{-1} = I_r,

and V2*V1 = V2*A*U1D1^{-1} = (AV2)*U1D1^{-1} = 0. It follows that V is unitary. Finally,

U*AV = ( U1*AV1  U1*AV2 ) = ( D1  0 ) = D,
       ( U2*AV1  U2*AV2 )   ( 0   0 )

since U1*AV1 = U1*AA*U1D1^{-1} = D1^2D1^{-1} = D1, AV2 = 0, and U2*A = 0. Thus A = UDV*.
Using the UDV theorem, we can produce a very convenient formula for A^†.

Theorem 2.8.3.2. If A = U*DV*, where U, V, D are as above, then

A^† = VD^†U,

where

D^† = ( D1^{-1} 0 )
      ( 0       0 )

is n × m.

Proof. By a direct computation,

A(VD^†U) = U*( I_r 0 )U   and   (VD^†U)A = V( I_r 0 )V*;
             ( 0   0 )                      ( 0   0 )

both are Hermitian, so the third and fourth properties for the generalized inverse are satisfied. Also,

AA^†A = (U*DV*)(VD^†U)(U*DV*) = U*DD^†DV* = U*DV* = A.

Similarly A^†AA^† = A^†, proving the first two properties.
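This is precisely how the pseudoinverse is computed in floating-point practice: take the singular value decomposition, invert the nonzero singular values, and reassemble. A sketch (ours, not part of the original text; it assumes NumPy is available):

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 0.0]])                     # 3 x 2, rank 1

U, s, Vh = np.linalg.svd(A)                    # numpy returns U, the singular values, and V*

# Form D^+ by inverting the nonzero singular values and transposing the shape
m, n = A.shape
D_plus = np.zeros((n, m))
for k, x in enumerate(s):
    if x > 1e-12:
        D_plus[k, k] = 1.0 / x

A_plus = Vh.conj().T @ D_plus @ U.conj().T     # A^+ = V D^+ U*

print(np.allclose(A_plus, np.linalg.pinv(A)))  # agrees with the library pseudoinverse
```

The cutoff 1e-12 plays the role of deciding the numerical rank r; library routines expose the same idea through a relative tolerance parameter.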
Theorem 2.8.3.3. For each A there exist polynomials p and q such that

A^† = A*p(AA*) = q(A*A)A*.

Proof. Let A be m × n and have rank r. Then by the diagonal decomposition theorem there exist unitary matrices U, V of orders m and n and an m × n matrix

D = ( D1 0 )
    ( 0  0 ),

where D1 = diag(d1, d2, ..., dr), d1d2···dr ≠ 0, such that A = U*DV*. Then A* = VD*U, AA* = U*DD*U, and A^† = VD^†U. For an arbitrary polynomial p(z), p(AA*) = p(U*(DD*)U) = U*p(DD*)U. Hence A*p(AA*) = VD*p(DD*)U. Therefore, for A^† to equal A*p(AA*) it is necessary and sufficient that D^† = D*p(DD*). Equivalently,

( D1^{-1} 0 ) = ( D1*p(D1D1*) 0 )
( 0       0 )   ( 0           0 ),

or d_k^{-1} = d̄_k p(|d_k|^2), k = 1, 2, ..., r. Thus p(|d_k|^2) = 1/|d_k|^2, k = 1, 2, ..., r, is necessary and sufficient. Let s designate the number of distinct values among |d1|, |d2|, ..., |dr|. Then by the fundamental theorem of polynomial interpolation (see any book on interpolation, approximation, or numerical analysis) there is a unique polynomial of degree ≤ s - 1 that takes on the values 1/|d_k|^2 at the s points |d_k|^2. The second identity for A^† is proved similarly.
PROBLEMS

1. Let U and V be unitary. Prove that (UAV)^† = V*A^†U*.

2. Let A be normal. Give a representation for A^† in terms of the characteristic values of A. See Section 2.9.

3. Prove that if A is normal, AA^† = A^†A.

4. Prove that A^† = A* if and only if the singular values of A are 0 or 1.

5. Prove that A^† = lim_{t→0} A*(tI + AA*)^{-1}.
2.8.4 Generalized Inverses and Systems of Linear Equations
Using the properties of the generalized inverse we are able to determine, for any system of equations

AX = B,

whether or not the system has a solution. If it does, we can obtain a matrix equation, involving the generalized inverse, which exhibits this solution. Oddly enough, we need only the first property of a generalized inverse. That is, we may use any matrix A^{(1)} such that AA^{(1)}A = A.

Definition. If A is m × n, any n × m matrix A^{(1)} that satisfies AA^{(1)}A = A is called a (1)-inverse of A.

More generally, any matrix that satisfies any combination of the four requirements for the generalized inverse on page 44 is designated accordingly.

Example. A (1, 2, 4)-inverse for A is one that satisfies conditions (1), (2), and (4).

Theorem 2.8.4.1. Let A be m × n. The system of equations

AX = B

has a solution if and only if B = AA^{(1)}B, for any (1)-inverse A^{(1)} of A. In this case, the general solution is given by

X = A^{(1)}B + (I - A^{(1)}A)Y

for an arbitrary n × 1 vector Y.
Proof. If B = AA^{(1)}B, then AX = B is solved by X = A^{(1)}B. Suppose, conversely, that the system has a solution X0: AX0 = B. Then

AA^{(1)}B = AA^{(1)}AX0 = AX0 = B.

Moreover, if X = A^{(1)}B + (I - A^{(1)}A)Y, then, with B = AA^{(1)}B,

AX = AA^{(1)}B + (A - AA^{(1)}A)Y = B + 0 = B.

Therefore any such X is a solution. To show that it is the general solution, we must show that every solution is of this form. Let X0 be a solution and set R = X0 - A^{(1)}B. Then AR = AX0 - AA^{(1)}B = B - B = 0; therefore R = R - A^{(1)}AR. Hence X0 = A^{(1)}B + (I - A^{(1)}A)R, which is of the required form with Y = R.
In the numerical utilization of this theorem one should, of course, use some standard (1)-inverse of A such as A^†.
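For instance, taking A^† as the (1)-inverse, the solvability test B = AA^{(1)}B becomes a one-line check. A sketch (ours; the pseudoinverse of the rank-1 matrix below is (1/4)A, cf. Problem 11 of Section 2.8.2):

```python
def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

A      = [[1.0, 1.0], [1.0, 1.0]]          # rank 1
A_pinv = [[0.25, 0.25], [0.25, 0.25]]      # A^† = (1/4)A for this A

def consistent(B):
    # AX = B is solvable iff A A^(1) B = B
    return matvec(A, matvec(A_pinv, B)) == B

print(consistent([2.0, 2.0]))   # True:  (2, 2) lies in the range of A
print(consistent([1.0, 0.0]))   # False: (1, 0) does not
```

When the test fails, no exact solution exists, which is precisely the situation taken up by least squares in the next section.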
PROBLEMS
1. Show that if A is an m × n matrix and B is any (1)-inverse of A, then AB and BA are idempotent of orders m and n respectively, and BAB is a (1,2)-inverse of A.

2. Show that if A is m × n (n × m), of rank m, then any (1)-inverse of A is a right (left) inverse of A, and any right (left) inverse of A is a (1,2,3) [(1,2,4)] inverse of A.

3. Consider two systems of equations: (1) AX = B, (2) CX = D. Find conditions such that every solution of (1) is a solution of (2).

4. What happens in Problem 3 if B = D = 0?
5. Prove that the matrix equation AXB = C has a solution if and only if AA^†CB^†B = C. In this case, the general solution is given by

X = A^†CB^† + Y - A^†AYBB^†

for an arbitrary Y.
2.8.5 The MP Inverse and Least Square Problems
Let A be m × n, let X be n × 1 and B be m × 1, and consider the system of equations

AX = B.

If the vector B lies in the range of A, then there exist one or more solutions to this system. If the solution is not unique we might want to know which solution has minimum norm. If the vector B is not in the range of A, then there is no solution to the system, but it is often desirable to find a vector X in some way closest to a solution. To this end, for any X, define the residual vector R = AX - B and consider its Euclidean norm ||R|| = (R*R)^{1/2}. A least squares solution to the system is a vector X0 such that its residual has minimum norm. That is,

||R0|| = ||AX0 - B|| ≤ ||AX - B|| for all n × 1 vectors X.
Theorem 2.8.5.1. The system of equations AX = B always has a least squares solution. This solution is unique if and only if the columns of A are linearly independent. In this case, the unique least squares solution is given by X = A^†B.
Proof. Let R(A) designate the range space of A and [R(A)]⊥ its orthogonal complement. Then we can write B = B1 + B2, where B1 is in R(A) and B2 is in the orthogonal complement [R(A)]⊥. For any X, AX is in R(A), as is AX - B1; hence AX - B1 is orthogonal to B2. Now AX - B = AX - B1 - B2. Hence, for any X,

||AX - B||^2 = ||AX - B1||^2 + ||B2||^2 ≥ ||B2||^2.

Therefore ||B2||^2 is a lower bound for the values ||AX - B||^2, and it is achieved if and only if AX = B1. Since B1 is in R(A), there is a solution X0 to AX = B1. For this vector X0,

||R0||^2 = ||AX0 - B||^2 = ||B2||^2 ≤ ||AX - B||^2,

so that the lower bound is achieved. Since a unique solution to AX = B1 exists if and only if the columns of A are linearly independent, the theorem is proved.
For any solution X0 to AX = B1,

R0 = AX0 - B = B1 - (B1 + B2) = -B2

is in [R(A)]⊥. Therefore A*R0 = 0, or

A*AX0 = A*B.

These are the normal equations determining the least squares solution.

If the columns of A are independent, then r(A*A) = r(A) = n, so that the n × n matrix A*A is nonsingular. The least squares solution X0 is determined by A*AX0 = A*B, so that X0 = (A*A)^{-1}A*B. But, from our previous work, A^† = (A*A)^{-1}A*.

Finally, we take up the general case.
Lemma. Let P = AA^†, Q = A^†A. Then, if X and Y are arbitrary vectors (conformable),

(1) ||AX + (I - P)Y||^2 = ||AX||^2 + ||(I - P)Y||^2

and

(2) ||A^†X + (I - Q)Y||^2 = ||A^†X||^2 + ||(I - Q)Y||^2.

Proof. Since A = AA^†A, AX = AA^†AX = PZ with Z = AX. We now prove that PZ ⊥ (I - P)Y. This is equivalent to (PZ)*(I - P)Y = 0, or Z*P*(I - P)Y = 0. But P* = P and P^2 = (AA^†A)A^† = AA^† = P. Therefore P*(I - P) = 0. The first equality above now follows from Pythagoras' theorem. The second equality can be derived from the first using (A^†)^† = A.
Another way of phrasing this work is that P is the projection onto the range space R(A) of A while I  P is the projection onto the orthogonal complement of R(A).
Theorem 2.8.5.2. Let A be m × n and B be m × 1. Let X0 = A^†B. Then for any n × 1 vector X ≠ X0 we have either

(1) ||AX - B|| > ||AX0 - B||

or

(2) ||AX - B|| = ||AX0 - B||  and  ||X|| > ||X0||.

Proof. For any X we have

AX - B = (AX - AA^†B) + (AA^†B - B) = A(X - X0) - (I - P)B.

By the previous lemma,

||AX - B||^2 = ||A(X - X0)||^2 + ||(I - P)B||^2 ≥ ||AX0 - B||^2.

Equality holds here if and only if A(X - X0) = 0. Hence if AX ≠ AX0, inequality (1) holds. Suppose, then, that AX = AX0. Then A^†AX = A^†AX0 = A^†AA^†B = A^†B = X0. Therefore X = X0 + (X - X0) = A^†B + (I - A^†A)X. Hence by equality (2) of the lemma,

||X||^2 = ||A^†B||^2 + ||(I - A^†A)X||^2 ≥ ||X0||^2,

so that ||X|| > ||X0|| unless X = X0.
This theorem may be rephrased as follows: given the system AX = B, the vector A^†B is either the unique least squares solution, or it is the least squares solution of minimum norm.
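In the full-column-rank case the least squares solution can be computed directly from the normal equations A^TAX = A^TB. A sketch (ours, not from the original text), fitting a line y = x0 + x1·t to three data points that admit no exact fit:

```python
def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inv2(M):
    # inverse of a 2 x 2 matrix by the adjugate formula
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Fit y = x0 + x1*t at t = 0, 1, 2 to the data y = 0, 1, 1
A = [[1.0, 0.0],
     [1.0, 1.0],
     [1.0, 2.0]]
B = [[0.0], [1.0], [1.0]]

At = transpose(A)
X = matmul(inv2(matmul(At, A)), matmul(At, B))   # X = (A^T A)^{-1} A^T B = A^† B
print(X)   # approximately [[1/6], [1/2]]
```

Because A has independent columns, this X is the unique least squares solution of Theorem 2.8.5.1; no minimum-norm tie-breaking is needed.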
PROBLEM
1. A is square and singular. Characterize the solution A^†B.
2.9 NORMAL MATRICES, QUADRATIC FORMS, AND FIELD OF VALUES
We record here a number of important facts. By a normal matrix is meant a square matrix A for which
(2.9.1) AA* = A*A.
Examples. Hermitian, skew-Hermitian, and unitary matrices are normal. Hence real symmetric, real skew-symmetric, and orthogonal matrices are also normal. All circulants are normal, as we shall see.
Theorem 2.9.1. A is normal if and only if there is a unitary U and diagonal D such that A = U*DU.
Theorem 2.9.2. A is normal if and only if there is a polynomial p(x) such that A* = p(A).
Theorem 2.9.3. A is Hermitian if and only if there is a unitary matrix U and a real diagonal D such that A = U*DU.
Theorem 2.9.4. A is (real) symmetric if and only if there is a (real) orthogonal matrix U and a real diagonal D such that A = U*DU.
PROBLEMS
1. Prove that A is normal if and only if A = R + iS where R and S are Hermitian and commute.

2. Prove that A is normal if and only if in the polar decomposition of A (A = HU with H positive semidefinite Hermitian, U unitary) one has HU = UH.

3. Let A have eigenvalues λ1, ..., λn. Prove that A is normal if and only if the eigenvalues of AA* are |λ1|^2, |λ2|^2, ..., |λn|^2.

4. Prove that A is normal if and only if the eigenvalues of A + A* are λ1 + λ̄1, λ2 + λ̄2, ..., λn + λ̄n.

5. If A is normal and p(z) is a polynomial, then p(A) is normal.

6. If A is normal, prove that A^† is normal.

7. If A and B are normal, prove that A ⊗ B is normal.

8. Use Theorem 2.9.1 to prove Theorem 2.9.2.
Quadratic Forms. Let M be n × n and let Z = (z1, z2, ..., zn)^T. By a quadratic form is meant the function of z1, ..., zn given by

(2.9.2) Z*MZ = Σ_{i,j=1}^n m_{ij} z̄_i z_j.
It is often of importance to distinguish the quadratic form from a matrix that gives rise to it. The real and the complex cases are essentially dif ferent.
Lemma 2.9.5. Let Q be real and square and U a real column. Then U^TQU = 0 for all U if and only if Q^T = -Q, that is, if and only if Q is skew-symmetric.

Proof.

(a) Let Q^T = -Q. If a = U^TQU, then a = a^T = U^TQ^TU = -U^TQU = -a. Therefore a = 0.

(b) Let U^TQU = 0 for all (real) U. Write Q = Q1 + Q2, where Q1 is symmetric and Q2 is skew-symmetric. Then, for all U,

U^TQU = U^TQ1U = 0.

Since Q1 is symmetric, we have, for some orthogonal P and real diagonal matrix Λ, Q1 = P^TΛP. Therefore, for all real U, U^TP^TΛPU = (PU)^TΛ(PU) = 0. Write PU = (u1, ..., un)^T and Λ = diag(λ1, ..., λn). Then we have Σ_{k=1}^n λ_k u_k^2 = 0 for all (u1, ..., un). This clearly implies λ_k = 0 for k = 1, 2, ..., n. Hence Q1 = 0 and Q = Q2 is skew-symmetric.
Theorem 2.9.6. Let Q and R be real square and U be a real column. Then U^TQU = U^TRU for all U if and only if Q - R is skew-symmetric.

Proof. U^TQU = U^TRU for all U if and only if U^T(Q - R)U = 0 for all U.

Corollary. Let Q be real and U be a real column. Then

U^TQU = U^T(½(Q + Q^T))U.

The matrix ½(Q + Q^T) is known as the symmetrization of Q.

We pass now to the complex case.
Lemma 2.9.7. Let M be a square matrix with complex elements and let Z be a column with complex elements. Then

(2.9.3) Z*MZ = 0

for all complex Z if and only if M = 0.

Proof.

(a) The "if" is trivial.

(b) "Only if." Write Z = X + iY, M = R + iS, where X, Y, R, S are all real. Then we are given

(2.9.4) (X^T - iY^T)(R + iS)(X + iY) = 0 for all real X, Y.

Select Y = 0. Then X^T(R + iS)X = 0 for all real X, so X^TRX = 0 and X^TSX = 0. Therefore, by the first lemma, R and S must be skew-symmetric: R + R^T = 0, S + S^T = 0. Expanding the product on the left side of (2.9.4), and using the skew symmetry of R and S together with the first lemma (so that X^TRX = X^TSX = Y^TRY = Y^TSY = 0), we have, for all real X, Y,

X^TRY - Y^TRX = 0  and  Y^TSX - X^TSY = 0.

Thus, for all real X, Y, X^T(R - R^T)Y = 0 and Y^T(S - S^T)X = 0. Selecting X and Y as appropriate unit vectors (0, ..., 0, 1, 0, ..., 0)^T, this tells us that R - R^T = 0 and S - S^T = 0. But R^T = -R and S^T = -S; therefore R = S = 0 and M = 0.
Theorem 2.9.8. Let M and N be square matrices of order n with complex elements and suppose that

(2.9.5) Z*MZ = Z*NZ

for all complex vectors Z. Then M = N.

Proof. As before, Z*MZ = Z*NZ if and only if Z*(M - N)Z = 0.
~ormal Matrlces 6 3
Note that this theorem is false if (2.9.5) holds only for real Z.
Corollary. Z*MZ is real for all complex Z if and only if M is Hermitian.

Proof. Z*MZ is real if and only if Z*MZ = (Z*MZ)* = Z*M*Z. Hence M = M*.
Let M be a Hermitian matrix. It is called positive definite if Z*MZ > 0 for all Z ≠ 0. It is called positive semidefinite if Z*MZ ≥ 0 for all Z. It is called indefinite if there exist Z1 ≠ 0 and Z2 ≠ 0 such that Z1*MZ1 > 0 > Z2*MZ2.
Theorem 2.9.9. Let M be a Hermitian matrix of order n with eigenvalues λ1, ..., λn. Then

(a) M is positive definite if and only if λ_k > 0, k = 1, 2, ..., n.
(b) M is positive semidefinite if and only if λ_k ≥ 0, k = 1, 2, ..., n.
(c) M is indefinite if and only if there are integers j, k, j ≠ k, with λ_j > 0, λ_k < 0.
Field of Values. Let M designate a matrix of order n. The set of all complex numbers Z*MZ with ||Z|| = 1 is known as the field of values of M and is designated by F(M). ||Z|| designates the Euclidean norm of Z.

The following facts, due to Hausdorff and Toeplitz, are known.

(1) F(M) is a closed, bounded, connected, convex subset of the complex plane.

(2) The field of values is invariant under unitary transformations:

(2.9.6) F(M) = F(U*MU),  U = unitary.

(3) If ch M designates the convex hull of the eigenvalues of M, then

(2.9.7) ch M ⊆ F(M).

(4) If M is normal, then F(M) = ch M.
PROBLEMS

1. Show that the field of values of a 2 × 2 matrix M is either an ellipse (circle), a straight line segment, or a single point. More specifically, by Schur's theorem**, if one reduces M unitarily to upper triangular form,

U*MU = ( λ1 m  )
       ( 0  λ2 ),

then

(a) M is not normal if and only if m ≠ 0.
    (a') λ1 ≠ λ2: F(M) is the interior and boundary of an ellipse with foci at λ1, λ2. The length of the minor axis is |m|; the length of the major axis is (|m|^2 + |λ1 - λ2|^2)^{1/2}.
    (a'') λ1 = λ2: F(M) is the disk with center at λ1 and radius |m|/2.
(b) M is normal (m = 0).
    (b') λ1 ≠ λ2: F(M) is the line segment joining λ1 and λ2.
    (b'') λ1 = λ2: F(M) is the single point λ1.

REFERENCES
General: Aitken [1]; Barnett and Story; Bellman [2]; Browne; Eisele and Mason; Forsythe and Moler; Gantmacher; Lancaster [1]; MacDuffee; Marcus; Marcus and Minc; Muir and Metzler; Newman; M. Pearl; Pullman; Suprunenko and Tyshkevich; Todd; Turnbull and Aitken.
Vandermonde matrices: Gautschi.
Discrete Fourier transforms: Aho, Hopcroft and Ullman; Carlitz; Davis and Rabinowitz; Fiduccia; Flinn and McCowan; Harmuth; Nussbaumer; Winograd; J. Pearl.
**Any square matrix is unitarily similar to an upper triangular matrix.
Hadamard matrices: Ahmed and Rao; Hall; Harmuth; Wallis, Street, and Wallis.
Generalized inverses: Ben-Israel and Greville; Meyer.

UDV theorem: Ben-Israel and Greville; Forsythe and Moler; Golub and Reinsch (numerical methods).
CIRCULANT MATRICES
3.1 INTRODUCTORY PROPERTIES
By a circulant matrix of order n, or circulant for short, is meant a square matrix of the form

(3.1.1)  C = circ(c1, c2, ..., cn) = ( c1   c2   ...  cn
                                       cn   c1   ...  c_{n−1}
                                       ...................
                                       c2   c3   ...  c1  ).

The elements of each row of C are identical to those of the previous row, but are moved one position to the right and wrapped around. The whole circulant is evidently determined by the first row (or column). We may also write a circulant in the form

(3.1.2)  C = (c_{jk}) = (c_{k−j+1}),  subscripts mod n.
Notice that

α1 circ(γ1) + α2 circ(γ2) = circ(α1 γ1 + α2 γ2),

so that the circulants form a linear subspace of the set of all matrices of order n. However, as we shall see subsequently, they possess a far richer structure.
Theorem 3.1.1. Let A be n × n. Then A is a circulant if and only if

(3.1.3)  Aπ = πA,

where π is the matrix π = circ(0, 1, 0, ..., 0). See (2.4.14).
Proof. Write A = (a_{ij}) and let the permutation σ be the cycle σ = (1, 2, ..., n). Then from (2.4.13),

P_σ A P_σ* = (a_{σ(i),σ(j)}),

where, in the present instance, P_σ = π. But A is evidently a circulant if and only if a_{ij} = a_{σ(i),σ(j)}, that is, if and only if πAπ* = A. This is equivalent to (3.1.3) by (2.4.17).
We may express this as follows: the circulants comprise all the (square) matrices that commute with π, or equivalently, that are invariant under the similarity A → πAπ^{−1}.

Corollary. A is a circulant if and only if A* is a circulant.
Proof. Star (3.1.3).
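Criterion (3.1.3) is easy to test numerically: a matrix is a circulant precisely when it commutes with π = circ(0, 1, 0, ..., 0). A minimal sketch; the helper `circ` (ours) builds the matrix row by row with `np.roll`.

```python
import numpy as np

def circ(gamma):
    """circ(c1, ..., cn): row k is the first row shifted k-1 places right."""
    gamma = np.asarray(gamma, dtype=float)
    return np.array([np.roll(gamma, k) for k in range(len(gamma))])

pi = circ([0, 1, 0, 0, 0])            # the fundamental circulant, see (2.4.14)
C = circ([3.0, 1.0, 4.0, 1.0, 5.0])

commutes = np.allclose(pi @ C, C @ pi)     # (3.1.3): C is a circulant
A = np.arange(25.0).reshape(5, 5)          # a generic non-circulant
not_circ = not np.allclose(pi @ A, A @ pi)
```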
PROBLEMS
1. What are the conditions on the c_j in order that circ(c1, c2, ..., cn) be symmetric? Be Hermitian? Be skew-symmetric? Be diagonal?
2. Call a square matrix A a magic square if its row sums, column sums, and principal diagonal sums are all equal. What are the conditions on the c_j in order that circ(c1, c2, ..., cn) be a magic square?
3. Prove that circ(−1, 1, 1, 1) is an Hadamard matrix. It has been conjectured that there are no other circulants that are Hadamard matrices. This has been proved for orders ≤ 12,100. (Best result as of 1978.)
A Second Representation of Circulants. In view of the structure of the permutation matrices π^k, k = 0, 1, ..., n−1, it is clear that

(3.1.4)  C = circ(c1, c2, ..., cn) = c1 I + c2 π + c3 π² + ... + cn π^{n−1}.

Thus, from (3.1.4), C is a circulant if and only if C = p(π) for some polynomial p(z). Associate with the n-tuple γ = (c1, c2, ..., cn) the polynomial
(3.1.5)  p_γ(z) = c1 + c2 z + ... + cn z^{n−1}.

The polynomial p_γ(z) will be called the representer of the circulant. The association γ ↔ p_γ(z) is obviously linear. (Note: In the literature of signal processing the association γ ↔ p_γ(1/z) is known as the z-transform.) The function

(3.1.5')  φ(θ) = φ_γ(θ) = c1 + c2 e^{iθ} + ... + cn e^{i(n−1)θ}

is also useful as a representer. Thus,

(3.1.6)  C = circ γ = p_γ(π).
Inasmuch as polynomials in the same matrix commute, it follows that all circulants of the same order commute. If C is a circulant, so is C*. Hence C and C* commute, and therefore all circulants are normal matrices.
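Equation (3.1.6) says that circ γ is the polynomial p_γ evaluated at π. A sketch verifying this, together with the resulting commutativity and normality (helper names are ours):

```python
import numpy as np

def circ(gamma):
    gamma = np.asarray(gamma, dtype=float)
    return np.array([np.roll(gamma, k) for k in range(len(gamma))])

gamma = [1.0, 2.0, 0.0, 4.0]
n = len(gamma)
pi = circ([0.0, 1.0, 0.0, 0.0])

# C = p_gamma(pi) = c1 I + c2 pi + ... + cn pi^(n-1)   -- (3.1.4)/(3.1.6)
C = sum(gamma[k] * np.linalg.matrix_power(pi, k) for k in range(n))
same = np.allclose(C, circ(gamma))

# all circulants of one order commute, and every circulant is normal
D = circ([5.0, 0.0, 6.0, 7.0])
commute = np.allclose(C @ D, D @ C)
normal = np.allclose(C @ C.T, C.T @ C)
```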
PROBLEMS
1. Using the criterion (3.1.3), prove that if A and B are circulants, then AB is a circulant.
2. Prove that if A is a circulant and k is a nonnegative integer, then A^k is a circulant. If A is nonsingular, then this holds also when k is a negative integer.
3. A square matrix A is called a "left circulant" or a (−1)-circulant if its rows are obtained from the first row by successive shifts to the left of one position. Prove that A is a left circulant if and only if A = πAπ (see Section 5.1).
4. A generalized permutation matrix is a square matrix with precisely one nonzero element in each row and column. That nonzero element must be +1 or −1. How many generalized permutation matrices of order n are there?
5. Let C be a circulant with integer elements. Suppose that CC^T = I. Prove that C is a generalized permutation matrix.
6. Prove that a circulant is symmetric about its main counterdiagonal.
7. Let C = circ(a1, a2, ..., an). Then, for integer m,

π^m C = circ(a_{1−m}, a_{2−m}, ..., a_{n−m}),  subscripts mod n.
8. By a semicirculant of order n is meant a matrix of the form

C = ( c1   c2   ...  cn
      0    c1   ...  c_{n−1}
      ................
      0    0    ...  c1  ).

Introduce the matrix

E = ( 0  1  0  ...  0
      0  0  1  ...  0
      ..............
      0  0  0  ...  1
      0  0  0  ...  0 ).

Show that E is nilpotent. Show that C is a semicirculant if and only if it is of the form C = p(E) for some polynomial p(z).
9. Prove that if (d, n) = (greatest common divisor of d and n) = 1, then C is a circulant if and only if it commutes with π^d. Hence, in particular, C is a circulant if and only if it commutes with π*.
10. Let K[w] designate the ring of polynomials in w of degree < n and with complex coefficients. In K[w] the usual rules of polynomial addition and multiplication are to hold, but higher powers are to be replaced by lower powers using w^n = 1. Prove that the mapping circ(c1, c2, ..., cn) ↔ c1 + c2 w + ... + cn w^{n−1} [or circ γ ↔ p_γ(w)] is a ring isomorphism:

(a) If α is a scalar, α circ γ ↔ α p_γ(w).
(b) circ γ1 + circ γ2 ↔ p_{γ1}(w) + p_{γ2}(w).
(c) (circ γ1)(circ γ2) ↔ p_{γ1}(w) p_{γ2}(w).

11. Let circ γ ↔ p_γ(w). Then (circ γ)^T ↔ p_γ(w^{n−1}).
Block Decomposition of Circulants; Toeplitz Matrices. The square matrix T = (t_{ij}) of order n is said to be Toeplitz if

(3.1.7)  t_{ij} = t_{i+1,j+1},  i, j = 1, 2, ..., n − 1.

Thus Toeplitz matrices are those that are constant along all diagonals parallel to the principal diagonal.
Example:

T = ( a  b  c
      d  a  b
      e  d  a ).

It is clear that the Toeplitz matrices of order n form a linear subspace of dimension 2n − 1 of the space of all matrices of order n. It is clear, furthermore, that a circulant is Toeplitz but not necessarily conversely.
A circulant C of composite order n = pq is automatically a block circulant in which each block is Toeplitz. The blocks are of order q, and the arrangement of blocks is p × p.
Example. The circulant of order 6 may be broken up into 3 × 3 blocks of order 2; each block is Toeplitz. It may also be broken up into 2 × 2 blocks, each of order 3. A block circulant is not necessarily a circulant.

Quite generally, if C is a circulant of order n = pq, then

C = I_p ⊗ A_1 + π_p ⊗ A_2 + ... + π_p^{p−1} ⊗ A_p,

where I_p, π_p are of order p and where the A_j are Toeplitz of order q.

A general Toeplitz matrix T of order n may be embedded in a circulant of order 2n. See also Chapter 5.
3.2 DIAGONALIZATION OF CIRCULANTS
This will follow readily from the diagonalization of the basic circulant π.
Definition. Let n be a fixed integer ≥ 1. Let ω = exp(2πi/n) = cos(2π/n) + i sin(2π/n), i = √−1. Let

(3.2.1)  Ω = diag(1, ω, ω², ..., ω^{n−1}).

Note that Ω^k = diag(1, ω^k, ω^{2k}, ..., ω^{(n−1)k}).

Theorem 3.2.1

(3.2.2)  π = F*ΩF.
Proof. From (2.5.3), the jth row of F* is (1/√n)(ω^{(j−1)·0}, ω^{(j−1)·1}, ..., ω^{(j−1)(n−1)}). Hence the jth row of F*Ω is (1/√n)(ω^{(j−1)r} ω^r) = (1/√n)(ω^{jr}), r = 0, 1, ..., n−1. The kth column of F is (1/√n)(ω̄^{(k−1)r}), r = 0, 1, ..., n−1. Thus the (j,k)th element of F*ΩF is

(1/n) Σ_{r=0}^{n−1} ω^{(j−k+1)r} = { 1 if j ≡ k − 1 mod n,
                                    { 0 otherwise.

Then (3.2.2) follows.
Now

(3.2.3)  C = circ γ = p_γ(π) = p_γ(F*ΩF) = F* p_γ(Ω) F.

Thus we arrive at the fundamental
Theorem 3.2.2. If C is a circulant, it is diagonalized by F. More precisely,
(3.2.4)  C = F*ΛF,

where

(3.2.5)  Λ = Λ_γ = diag(p_γ(1), p_γ(ω), ..., p_γ(ω^{n−1})).

The eigenvalues of C are therefore

(3.2.6)  λ_j = p_γ(ω^{j−1}),  j = 1, 2, ..., n.

(Note: The eigenvalues need not be distinct.) The columns of F* are a universal set of (right) eigenvectors for all circulants. They may be written as F*(0, ..., 0, 1, 0, ..., 0)^T.
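Theorem 3.2.2 is exactly the statement behind the FFT treatment of circulants: the eigenvalues are p_γ(ω^{j−1}), and since NumPy's *inverse* transform carries the +2πi/n sign convention, they are produced by `n * np.fft.ifft(gamma)`. A sketch, also checking the universal eigenvectors:

```python
import numpy as np

def circ(gamma):
    gamma = np.asarray(gamma, dtype=float)
    return np.array([np.roll(gamma, k) for k in range(len(gamma))])

gamma = np.array([6.0, 2.0, 0.0, 1.0, 3.0])
n = len(gamma)
C = circ(gamma)
omega = np.exp(2j * np.pi / n)

# (3.2.6): lambda_j = p_gamma(omega^(j-1)), computed directly and via FFT
lam_direct = np.array([sum(gamma[k] * omega ** (j * k) for k in range(n))
                       for j in range(n)])
lam = n * np.fft.ifft(gamma)
agree = np.allclose(lam, lam_direct)

# columns of F* are eigenvectors of every circulant
ok_eigs = True
for j in range(n):
    f = omega ** (j * np.arange(n)) / np.sqrt(n)  # jth column of F*
    ok_eigs &= np.allclose(C @ f, lam[j] * f)
```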
We have conversely
Theorem 3.2.3. Let Λ = diag(λ1, λ2, ..., λn); then C = F*ΛF is a circulant.

Proof. By the fundamental theorem of polynomial interpolation, we can find a unique polynomial r(z) of degree ≤ n − 1, r(z) = d1 + d2 z + ... + dn z^{n−1}, such that r(ω^{j−1}) = λ_j, j = 1, 2, ..., n. Now form D = circ(d1, d2, ..., dn). It follows that D = F*ΛF = C, so that C is a circulant.
With regard to the diagonalization (3.2.4), it should be observed that there is really no "natural" order for the eigenvalues of a matrix. Corresponding to every permutation of the eigenvalues, there will be a unitary matrix F̃ for which a formula analogous to (3.2.4) will be valid.
More precisely, let C = F*ΛF and let P_σ be the permutation matrix corresponding to the permutation σ. Then C = F*(P_σ* P_σ)Λ(P_σ* P_σ)F = (F*P_σ*)(P_σ Λ P_σ*)(P_σ F). Now if Λ = diag(λ1, ..., λn) and L = (λ1, ..., λn)^T, then from (2.4.19), P_σ Λ P_σ* = diag(P_σ L). If we now let F̃ be the unitary matrix F̃ = P_σ F, we have

C = F̃* diag(P_σ L) F̃.
We have found it to be convenient to standardize the order of the eigenvalues in the way we have done, leading to (3.2.4).
Let us exhibit the solution of this interpolation problem more explicitly. Write

L = (λ1, λ2, ..., λn)^T  and  γ^T = (c1, c2, ..., cn)^T.

Then, from (2.5.11) and (2.5.14),

(3.2.7)  γ^T = n^{−1/2} F L

and

(3.2.8)  p_γ(z) = n^{−1/2} (1, z, z², ..., z^{n−1}) F L.

Also,

(3.2.9)  Λ = n^{1/2} diag(F* γ^T).

Since F² = Γ and FF* = I, one also has the identity

(3.2.10)  F γ^T = F²(F* γ^T) = n^{−1/2} Γ L.

On the basis of the fundamental representation (3.2.4), it is now easy to establish
Theorem 3.2.4. If A and B are circulants of order n and the α_k are scalars, then A^T, A*, α1 A + α2 B, AB, and Σ_k α_k A^k are circulants. Moreover, A and B commute. If A is nonsingular, its inverse is a circulant: with A = F*ΛF, Λ = diag(λ1, ..., λn), its inverse is given by

(3.2.11)  A^{−1} = F* Λ^{−1} F,

where

(3.2.12)  Λ^{−1} = diag(λ1^{−1}, ..., λn^{−1}).
Since

(circ(c1, c2, ..., cn))^T = circ(c1, cn, c_{n−1}, ..., c2),

if we write

γ' = (c1, cn, c_{n−1}, ..., c2),

we have

(3.2.13)  (circ γ)^T = circ γ'.
The determinant of a square matrix is the product of its eigenvalues. Therefore, from (3.2.6),

(3.2.14)  det(circ γ) = det circ(c1, c2, ..., cn) = ∏_{j=1}^{n} p_γ(ω^{j−1}).
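Formula (3.2.14) gives the determinant as the product of the n values of the representer at the nth roots of unity. A sketch, checked against `numpy.linalg.det` and against the n = 3 closed form c1³ + c2³ + c3³ − 3 c1 c2 c3:

```python
import numpy as np

def circ(gamma):
    gamma = np.asarray(gamma, dtype=float)
    return np.array([np.roll(gamma, k) for k in range(len(gamma))])

def det_circ(gamma):
    """det circ(gamma) = prod_j p_gamma(omega^(j-1))  -- (3.2.14)."""
    n = len(gamma)
    lam = n * np.fft.ifft(gamma)   # p_gamma at the nth roots of unity
    return np.prod(lam).real       # real for real gamma

g = [2.0, 3.0, 5.0]
ok_numeric = np.isclose(det_circ(g), np.linalg.det(circ(g)))

c1, c2, c3 = g
ok_closed = np.isclose(det_circ(g), c1**3 + c2**3 + c3**3 - 3*c1*c2*c3)
```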
If

f(z) = a0 z^m + a1 z^{m−1} + ... + a_m,  a0 ≠ 0,
g(z) = b0 z^n + b1 z^{n−1} + ... + b_n,  b0 ≠ 0,

have roots α1, ..., α_m; β1, ..., β_n respectively, the resultant R(f, g) of f and g is defined by

R(f, g) = a0^n g(α1) g(α2) ··· g(α_m)
        = (−1)^{mn} b0^m f(β1) f(β2) ··· f(β_n)
        = (−1)^{mn} R(g, f).
Thus, with f(z) = z^n − 1, g(z) = p_γ(z), we have

det circ γ = R(z^n − 1, p_γ(z)) = cn^n ∏_{k=1}^{n−1} (μ_k^n − 1),

where μ1, ..., μ_{n−1} are the roots of p_γ(z) (assuming cn ≠ 0).
In this way, det circ γ is expressed as the resultant of the two polynomials z^n − 1 and p_γ(z).

In the case of real elements, the representation (3.2.14) may be simplified somewhat. Let γ = (c1, c2, ..., cn), p_γ(z) = c1 + c2 z + ... + cn z^{n−1}, ω = exp(2πi/n). Then
ω̄^j = ω^{n−j} = exp(−2πij/n),

and therefore, with the c's real,

p_γ(ω^{n−j}) = p_γ(ω̄^j) = conj(p_γ(ω^j)).

If now n = 2r + 1 is odd, then

det circ γ = ∏_{j=0}^{n−1} p_γ(ω^j) = p_γ(1) ∏_{j=1}^{r} |p_γ(ω^j)|².
If n = 2r + 2 is even,

det circ γ = p_γ(1) p_γ(−1) ∏_{j=1}^{r} |p_γ(ω^j)|².
Corollary. Let γ = (c1, c2, ..., cn) have real components. If n is odd, then Σ_{j=1}^{n} c_j ≥ 0 implies det circ γ ≥ 0.

If n is even and n = 2r + 2, then

(Σ_{j=1}^{n} c_j)(Σ_{j=1}^{n} (−1)^{j−1} c_j) ≥ 0

implies det circ γ ≥ 0.

Proof. We have p_γ(1) = Σ_{j=1}^{n} c_j and p_γ(−1) = Σ_{j=1}^{n} (−1)^{j−1} c_j. Since |p_γ(ω^j)|² ≥ 0, the odd case is immediate. For the even case, note that det circ γ = p_γ(1) p_γ(−1) ∏_{j=1}^{r} |p_γ(ω^j)|².
Conditions for det circ γ > 0 or for det circ γ < 0 are easily formulated.
A square matrix is called nondefective or simple if the multiplicity of each of its distinct eigenvalues equals its geometric multiplicity. By the geometric multiplicity of an eigenvalue is meant the maximal number of linearly independent (right) eigenvectors associated with that eigenvalue. A matrix is simple, therefore, if and only if its right eigenvectors span C^n. Equivalently, a matrix is simple if and only if it is diagonalizable. It follows from Theorem 3.2.2 that all circulants are simple.
As we have seen, all circulants are diagonalized by the Fourier matrix, and the Fourier matrix is a particular instance of a Vandermonde matrix. It is therefore of interest to ask: what are the matrices that are diagonalized by Vandermonde matrices?
Toward this end, we recall the following definition. Let

(3.2.15)  φ(x) = x^n − a_{n−1} x^{n−1} − a_{n−2} x^{n−2} − ... − a1 x − a0

be a monic polynomial of degree n. The companion matrix of φ, C_φ, is defined by

(3.2.16)  C_φ = ( 0    1    0    ...  0
                  0    0    1    ...  0
                  ....................
                  0    0    0    ...  1
                  a0   a1   a2   ...  a_{n−1} ).
It is well known and easily verified that the characteristic polynomial of C_φ is precisely φ(x). Hence, if α0, α1, ..., α_{n−1} are the eigenvalues of C_φ, we have

(3.2.17)  α_j^n = a0 + a1 α_j + ... + a_{n−1} α_j^{n−1},  j = 0, 1, ..., n − 1.
Theorem 3.2.5. Let V = V(α0, α1, ..., α_{n−1}) designate the Vandermonde matrix formed with α0, ..., α_{n−1} [see (2.5.12)]. Let D = diag(α0, α1, ..., α_{n−1}). Then

(3.2.18)  VD = C_φ V.

If the α_j are distinct, V is nonsingular, which gives us the diagonalization

(3.2.19)  C_φ = V D V^{−1}.
Hence, for any polynomial p(z),

(3.2.20)  p(C_φ) = V p(D) V^{−1} = V diag(p(α0), ..., p(α_{n−1})) V^{−1}.
Proof. A direct computation shows that the first n − 1 rows of VD and of C_φ V are identical. The element in the (n, j) position of VD computes out to be α_{j−1}^n. The element in the (n, j) position of C_φ V computes out to be

a0 + a1 α_{j−1} + ... + a_{n−1} α_{j−1}^{n−1};

by (3.2.17) this reduces to α_{j−1}^n. Therefore VD = C_φ V.

Since det V = ∏_{i<j} (α_j − α_i), it follows that V is nonsingular if and only if the α_j are distinct. In this case we arrive at (3.2.19).
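Theorem 3.2.5 can be checked directly: build C_φ from the a_j, the Vandermonde matrix V from the α_j, and verify VD = C_φ V. A sketch with φ(x) = (x − 1)(x − 2)(x − 3), written in the form (3.2.15):

```python
import numpy as np

alpha = np.array([1.0, 2.0, 3.0])   # intended eigenvalues
n = len(alpha)
# phi(x) = (x-1)(x-2)(x-3) = x^3 - 6x^2 + 11x - 6
#        = x^3 - a2 x^2 - a1 x - a0, so a = (a0, a1, a2):
a = np.array([6.0, -11.0, 6.0])

# companion matrix (3.2.16): superdiagonal ones, last row (a0, ..., a_{n-1})
Cphi = np.zeros((n, n))
Cphi[:-1, 1:] = np.eye(n - 1)
Cphi[-1, :] = a

V = np.vander(alpha, increasing=True).T   # V[i, j] = alpha_j ** i

ok_intertwine = np.allclose(V @ np.diag(alpha), Cphi @ V)     # (3.2.18)
ok_spectrum = np.allclose(np.sort(np.linalg.eigvals(Cphi).real), alpha)
```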
Example. If we select φ(x) = x^n − 1, then C_φ = π. The roots of φ are ω^j, j = 0, 1, ..., n−1, and V is a scaled version of F*. Since all polynomials in C_φ = π are circulants and vice versa, (3.2.20) reduces to (3.2.4).
Let us note another consequence of (3.2.2) which is of interest.
Let P (= P_σ) be the permutation matrix corresponding to the permutation σ. From (2.4.11) we know that PP* = P*P = I, so that P is unitary and normal. It follows from general theory that P is unitarily diagonalizable. It is often useful to be able to exhibit this diagonalization explicitly.
In Section 2.4, we arrived at the following identity. Let σ be factored into the product of disjoint cycles of lengths p1, p2, ..., pm. Then, by (2.4.25), there is a permutation matrix R such that

RPR* = π_{p1} ⊕ π_{p2} ⊕ ... ⊕ π_{pm}.
From (3.2.2),

π_{pj} = F_{pj}* Ω_{pj} F_{pj},  j = 1, 2, ..., m,

where F_{pj} and Ω_{pj} are the Fourier and Ω matrices of order pj. Thus if we set

(3.2.21)  U = F_{p1} ⊕ F_{p2} ⊕ ... ⊕ F_{pm},
          Λ = Ω_{p1} ⊕ Ω_{p2} ⊕ ... ⊕ Ω_{pm},

we have
RPR* = U*ΛU,

so that

P = (UR)* Λ (UR).

Observe that Λ is diagonal and U, and hence UR, is unitary.
PROBLEMS
1. If A and B are square and AB is a circulant, are A and B circulants?
2. If A² is a circulant, is A a circulant?
3. Diagonalize J = circ(1, 1, ..., 1).
4. Diagonalize circ(a, a + h, a + 2h, ..., a + (n − 1)h). Find its determinant.
5. Diagonalize circ(a, ah, ah², ..., ah^{n−1}). Find its determinant.
6. Diagonalize circ(1, 3, 6, 10, ..., n(n + 1)/2).
7. Diagonalize A = pI + qJ, where J is as in Problem 3. Find det A.
8. In Problem 7, prove that if p > 0 and p + nq > 0, then A is positive definite symmetric.
9. Diagonalize circ(1, s, 0, 0, ..., 0, s).
10. Let C be a circulant with eigenvalues λ_k. Show that C^T = F* diag(λ1, λn, λ_{n−1}, ..., λ2) F.
11. Diagonalize the checkerboard circulant circ(0, 1, 0, 1, ..., 0, 1).
12. Diagonalize circ(0, 0, 1, 0, 0, 1, ..., 0, 0, 1).
13. Diagonalize circ(0, 1/2, 0, 0, ..., 0, 1/2) = (1/2)(π + π*). (Random walk on a circle; one-dimensional lattice.)
14. Analyze circ(0, p, 0, ..., 0, q), p + q = 1.
15. Prove that a circulant C with eigenvalues λ_j is real if and only if λ_j = conj(λ_{n+2−j}), subscripts mod n, j = 1, 2, ..., n.
16. Let

G2 = circ(5, 1 + √2, 1, 1 − √2, 1, 1 − √2, 1, 1 + √2).

Show that G2 and G3 are symmetric circulants and that G2 G3 = G3 G2 = G2.

17. Let A, B be circulants of order n with eigenvalues λ_{A,j}, λ_{B,j}, j = 1, 2, ..., n. Prove that AB = A if and only if λ_{B,j} = 1 whenever λ_{A,j} ≠ 0.

18. Prove that a circulant is Hermitian if and only if its eigenvalues are real.
19. Prove that a circulant is unitary if and only if its eigenvalues lie on the unit circle.
20. Prove that a circulant is Hermitian positive definite if and only if its eigenvalues are positive.
21. Prove that circ(c1, c2, ..., cn) has all row and column sums equal to s if and only if Σ_{k=1}^{n} c_k = s.
22. Prove that if A is normal and has all row sums equal to a, then all column sums equal a.
23. Prove that A is normal if and only if there exist a unitary U and a circulant C such that A = U*CU. In other words, A is normal if and only if it is a unitary transform of a circulant.
24. A matrix M is said to be periodic if there exists an integer p ≥ 1 such that M^p = I. Find all the circulants of order n that satisfy this equation.
25. Prove that det circ(x, 1, 1, 1, 1) = (x + 4)(x − 1)⁴.
26. Prove that det circ(a1, a2, a3, 0, 0, ..., 0) = a1^n + a3^n − ζ1^n − ζ2^n, where ζ1 and ζ2 are the roots of x² + a2 x + a1 a3 = 0.

27. Prove that

det circ(a, a, ..., a, b, b, ..., b) = (ma + nb)(a − b)^{m+n−1}  if (m, n) = 1,
                                    = 0                          if (m, n) > 1.

Here m = number of a's, n = number of b's, and (m, n) = greatest common divisor of m and n.
28. Prove that

det circ(1, a, a², ..., a^{r−1}, 0, 0, ..., 0) = (1 − a^{rn/d})^d / (1 − a^n),  d = (r, n).

(O. Ore.)
29. Prove that

det circ(a0, a1, a2, 0, 0, ..., 0) = a0^n + a2^n + Σ_{s=0}^{⌊n/2⌋} (−1)^{n+s+1} (n/(n − s)) C(n−s, s) (a0 a2)^s a1^{n−2s},

where C(n−s, s) denotes the binomial coefficient. (O. Ore.)
30. The matrix circ(1, 2, 1, 0, 0, ..., 0) occurs in the theory of morphogenesis (diffusion on a circle). Diagonalize it. Generalize; for example, circ(1, 3, 3, 1, 0, ..., 0), circ(1, 4, 6, 4, 1, 0, 0, ..., 0).

31. Let c2 = c_{n+1} = c_{N−n+1} = c_N = 1, and all other c's = 0. Find the eigenvalues of circ(c1, c2, ..., c_N). (Two-dimensional lattice.)
32. Let p(z) be the representer of the circulant C. Prove that C is idempotent (C² = C) if and only if p(ω^j) = 0 or 1 for j = 0, 1, ..., n−1.

33. If A is square, of order n, define per(A) as the determinantal expansion of A in n! terms where all the minus signs have been changed to plus. For example,

per ( a  b
      c  d ) = ad + bc;

per(A) is called the permanent of A. Let Dn = per(J − I), with J as in Problem 3. Prove that

Dn = n! Σ_{k=0}^{n} (−1)^k / k!

(For this and applications of circulants to combinatorial problems, see Minc.)
3.2.1 Skew Circulants
A skew circulant matrix is a circulant in which all the elements below the main diagonal have their signs changed.
Example

(3.2.1.1)  scirc(a, b, c, d) = (  a   b   c   d
                                 −d   a   b   c
                                 −c  −d   a   b
                                 −b  −c  −d   a ).
In the same way that the theory of circulants is related to the matrix π, the theory of skew circulants is related to the matrix

η = scirc(0, 1, 0, ..., 0) = (  0   1   0   ...  0
                                0   0   1   ...  0
                               ...................
                                0   0   0   ...  1
                               −1   0   0   ...  0 ).

The main development of the theory is given in the next group of problems, and the solutions can be carried out along the lines already indicated for circulants. Skew circulants have also been called negacyclic matrices.
The notion can be extended somewhat by using the matrix π_k, which is the matrix π with its corner element 1 (in the (n,1) position) replaced by k, where |k| = 1. A {k}-circulant is one which commutes with π_k. For k = 1 and k = −1 we obtain the circulants and the skew circulants, respectively. Representations analogous to those given in the Problems are valid.
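A quick numerical check of the {−1}-circulant definition: scirc(a, b, c, d) of (3.2.1.1) commutes with π_{−1} (the matrix π with its corner 1 replaced by −1), while an ordinary circulant of order > 2 in general does not. A sketch; helper names are ours.

```python
import numpy as np

def scirc(gamma):
    """Skew circulant: the circulant with its below-diagonal elements negated."""
    gamma = np.asarray(gamma, dtype=float)
    n = len(gamma)
    C = np.array([np.roll(gamma, k) for k in range(n)])
    return np.where(np.tri(n, k=-1, dtype=bool), -C, C)

S = scirc([3.0, 1.0, 4.0, 1.0])
pi_m1 = scirc([0.0, 1.0, 0.0, 0.0])     # pi with corner entry -1

is_skew = np.allclose(S @ pi_m1, pi_m1 @ S)

# a plain circulant with the same first row fails the test
C = np.array([np.roll([3.0, 1.0, 4.0, 1.0], k) for k in range(4)])
not_skew = not np.allclose(C @ pi_m1, pi_m1 @ C)
```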
PROBLEMS
3. A is a skew circulant if and only if Aη = ηA.
4. The characteristic polynomial of η is (−1)^n (z^n + 1), and its eigenvalues are θ, θω, θω², ..., θω^{n−1}, where

θ = cos(π/n) + i sin(π/n),  ω = θ² = cos(2π/n) + i sin(2π/n).

Note that θ̄ = θ^{−1}.

5. The eigenvectors of η corresponding to these roots are

(1, θ, θ², ..., θ^{n−1})^T, (1, θω, (θω)², ..., (θω)^{n−1})^T, (1, θω², (θω²)², ..., (θω²)^{n−1})^T, ..., (1, θω^{n−1}, (θω^{n−1})², ..., (θω^{n−1})^{n−1})^T.
6. The eigenvalues of scirc(a1, a2, ..., an) are p(θ), p(θω), ..., p(θω^{n−1}), where p(z) = a1 + a2 z + a3 z² + ... + an z^{n−1}.
7. Define Ω^{1/2} = diag(1, θ, θ², ..., θ^{n−1}) and Ω = diag(1, ω, ω², ..., ω^{n−1}). Ω^{1/2} and Ω are unitary. Moreover,

η = (F Ω^{−1/2})* (θΩ) (F Ω^{−1/2}).

8. S is a skew circulant if and only if it is of the form S = (F Ω^{−1/2})* Λ (F Ω^{−1/2}), where Λ is diagonal.

9. S is a skew circulant if and only if it is of the form S = Ω^{1/2} C Ω^{−1/2}, where C is a circulant.
11. If S, V are skew circulants and q(z) is a polynomial in z, then S^T, S*, SV, q(S), S† (cf. Theorem 3.3.1), and S^{−1} (if it exists) are skew circulants. Moreover, S and V commute.
3.3 MULTIPLICATION AND INVERSION OF CIRCULANTS
Since a circulant is determined by its first row, it is really a "one-dimensional" rather than a "two-dimensional" object. The product of two circulants is itself a circulant, so that a good fraction of the arithmetic normally carried out in matrix multiplication is redundant. For circulants of low order, multiplication can be performed with pencil and paper using the abbreviated scheme sketched below.
Product of two circulants:

circ(1, 2, 4) · circ(4, 5, 6) = circ(36, 37, 32).

Abridged multiplication, using the first rows only:

(1, 2, 4) ∘ (4, 5, 6):  36 = 1·4 + 2·6 + 4·5,
                        37 = 1·5 + 2·4 + 4·6,
                        32 = 1·6 + 2·5 + 4·4.
It is seen from this that the multiplication of two circulants of order n can be carried out in at most n² multiplications and n(n − 1) additions.
However, using fast Fourier transform techniques, the order of magnitude n² may be improved to O(n log n). Recall the relationship between the first row γ of a circulant C = circ γ = circ(c1, c2, ..., cn) and its eigenvalues λ1, ..., λn. From (3.2.7) we have

(3.3.1)  (λ1, ..., λn)^T = n^{1/2} F* γ^T.
Now let A have first row α and eigenvalues λ_{A,1}, ..., λ_{A,n}, and let B have first row β and eigenvalues λ_{B,1}, ..., λ_{B,n}. Let the product AB have first row γ. Then
(3.3.2)  A = circ α = F* diag(λ_{A,1}, ..., λ_{A,n}) F,
         B = circ β = F* diag(λ_{B,1}, ..., λ_{B,n}) F,

so that

(3.3.3)  AB = circ γ = F* diag(λ_{A,1} λ_{B,1}, ..., λ_{A,n} λ_{B,n}) F.

Now from (3.3.1),

n^{1/2} F* α^T = (λ_{A,1}, ..., λ_{A,n})^T,
n^{1/2} F* β^T = (λ_{B,1}, ..., λ_{B,n})^T.

Therefore, we have

(3.3.4)  γ^T = n^{1/2} F [(F* α^T) ∗ (F* β^T)].

The symbol ∗ is used to designate the element-by-element product of two vectors.
Thus the multiplication of two circulants can be effected by three Fourier transforms plus O(n) ordinary multiplications. Since it is known that fast techniques permit a Fourier transform to be carried out in O(n log n) multiplications, it follows that circulant-by-circulant multiplication can be done in O(n log n) multiplications.
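In NumPy terms (whose `fft` carries the opposite sign convention, which amounts to a harmless relabeling of the eigenvalues), formula (3.3.4) becomes: transform the two first rows, multiply elementwise, transform back. A sketch compared against the n² matrix product:

```python
import numpy as np

def circ(gamma):
    gamma = np.asarray(gamma, dtype=float)
    return np.array([np.roll(gamma, k) for k in range(len(gamma))])

a = np.array([1.0, 2.0, 4.0])
b = np.array([4.0, 5.0, 6.0])

# first row of circ(a) @ circ(b): two transforms, an elementwise
# product, and one inverse transform
g = np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)).real

ok_product = np.allclose(circ(g), circ(a) @ circ(b))
ok_example = np.allclose(g, [36.0, 37.0, 32.0])  # circ(1,2,4)circ(4,5,6) = circ(36,37,32)
```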
It would be interesting to know, using specific computer programs, just where the crossover value of n is between naive abridged multiplication and fast Fourier techniques.
Moore-Penrose Inverse. For scalar λ set

(3.3.5)  λ† = { 1/λ  if λ ≠ 0,
              { 0    if λ = 0,

and for Λ = diag(λ1, λ2, ..., λn) set

(3.3.6)  Λ† = diag(λ1†, λ2†, ..., λn†).
Theorem 3.3.1. If C is the circulant C = F*ΛF, then its Moore-Penrose generalized inverse (MP inverse) is the circulant

(3.3.7)  C† = F* Λ† F.

Proof. The four conditions of Section 2.8.2 are immediately verifiable for C† (or see Theorem 2.8.3.2).
Corollary

(3.3.8)  C† = Σ_{k=1}^{n} λ_k† B_k,

where the B_k are the matrices B_k = F* Λ_k F, Λ_k = diag(0, 0, ..., 0, 1, 0, ..., 0), with the 1 in the kth position. In particular,

(3.3.9)  B_k† = B_k.
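Theorem 3.3.1 in computational form: pseudo-invert the eigenvalues and transform back. A sketch on a singular circulant, compared against `numpy.linalg.pinv` (which computes the MP inverse via the SVD); the tolerance is ours.

```python
import numpy as np

def circ(gamma):
    gamma = np.asarray(gamma, dtype=float)
    return np.array([np.roll(gamma, k) for k in range(len(gamma))])

def circ_pinv(gamma):
    """Moore-Penrose inverse of circ(gamma) via (3.3.7)."""
    n = len(gamma)
    lam = n * np.fft.ifft(gamma)          # eigenvalues
    lam_dag = np.zeros_like(lam)
    mask = np.abs(lam) > 1e-10
    lam_dag[mask] = 1.0 / lam[mask]       # dagger of each eigenvalue
    return circ(np.fft.fft(lam_dag).real / n)   # back to a first row

g = np.array([1.0, 1.0, 0.0, 0.0])   # p(z) = 1 + z vanishes at z = -1: singular
C = circ(g)
Cd = circ_pinv(g)

ok = np.allclose(Cd, np.linalg.pinv(C))
```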
Circulants of Rank n − r, 1 ≤ r ≤ n. Inasmuch as a circulant is diagonalizable, a circulant of rank n − 1 has precisely one zero eigenvalue. If C = F*ΛF, then C has rank n − 1 if and only if for some integer j, 1 ≤ j ≤ n,

(3.3.10)  Λ = diag(μ1, ..., μ_{j−1}, 0, μ_{j+1}, ..., μn)

with μ_i ≠ 0, i ≠ j. Now,

(3.3.11)  Λ† = diag(μ1^{−1}, ..., μ_{j−1}^{−1}, 0, μ_{j+1}^{−1}, ..., μn^{−1}),

and C† = F* Λ† F, so that
(3.3.12)  CC† = C†C = F* diag(1, 1, ..., 1, 0, 1, ..., 1) F,

where the 0 occurs in the jth position. From this it follows that

(3.3.13)  CC† = C†C = F*(I − Λ_j)F = I − F* Λ_j F = I − B_j.
The B_j are the matrices given by (3.3.8). For circulants of rank n − 2, one has

(3.3.14)  CC† = C†C = I − B_j − B_k

for some j, k, j ≠ k.
PROBLEMS
1. Let A, X, B be of order n. Let A and B be circulants. Prove that AX = B has a solution if and only if, wherever an eigenvalue of B is not 0, the corresponding eigenvalue of A is not 0. In this case, there is a solution X that is a circulant.
2. Let A, B be circulants of order n with eigenvalues λ1, ..., λn; μ1, ..., μn. Let p(x, y) be a polynomial in x, y. Prove that the eigenvalues of p(A, B) are precisely p(λ_j, μ_j), j = 1, 2, ..., n.

Remark: A theorem of Frobenius says that if A and B commute, then the eigenvalues of p(A, B) are precisely p(λ_j, μ_{j'}), j = 1, 2, ..., n, for some pairing of the eigenvalues. This has been generalized by numerous authors.
Circulant Inverses, Continued. Let C = circ(a1, a2, ..., an) and let

(3.3.15)  p(z) = a1 + a2 z + ... + an z^{n−1}

be its representer. From (3.1.4) one has

(3.3.16)  C = p(π).
The last few coefficients in (3.3.15) may be zero. Assuming that C ≠ 0, let us rewrite (3.3.15) in the form

(3.3.17)  p(z) = a1 + a2 z + ... + a_r z^{r−1},  a_r ≠ 0,

with 1 ≤ r ≤ n. Suppose that μ1, μ2, ..., μ_{r−1} are the zeros of the representer p(z) (to be distinguished from the eigenvalues of C). Thus

p(z) = a_r (z − μ1)(z − μ2) ··· (z − μ_{r−1}),

hence

(3.3.18)  C = p(π) = a_r (π − μ1 I)(π − μ2 I) ··· (π − μ_{r−1} I).

This gives us a factorization of any circulant into a product of circulants that are of a particularly elementary type.
Suppose now that C is nonsingular. This is true if and only if none of the eigenvalues of C is zero; that is, if and only if λ_j = p(ω^{j−1}) ≠ 0, j = 1, 2, ..., n. This will be true if and only if no μ_k is an nth root of unity, that is, μ_k^n ≠ 1, k = 1, 2, ..., r−1. From (3.3.18) one has

(3.3.19)  C^{−1} = a_r^{−1} (π − μ_{r−1} I)^{−1} ··· (π − μ1 I)^{−1}.
Let us examine a typical factor. Let μ be a complex variable. Then, for a given matrix M, a matrix of the form (M − μI)^{−1} is called the resolvent of M. The resolvent of π has a particularly simple form.
Theorem 3.3.1. Let μ^n ≠ 1. Then

(3.3.20)  (π − μI)^{−1} = (1/(1 − μ^n)) [μ^{n−1} I + μ^{n−2} π + μ^{n−3} π² + ... + π^{n−1}].

Proof. Multiply the right side by π − μI and use the fact that π^n = I.
We may also relate C^{−1} to the reciprocal of p(z). Let C be a circulant with representer p(z). Suppose that p(e^{iθ}) ≠ 0, 0 ≤ θ ≤ 2π. Then, since the zeros of a polynomial are isolated, p(z) is not zero in some open annulus A that contains |z| = 1 in its interior. Thus [p(z)]^{−1} is regular there, hence has a Laurent expansion

(3.3.21)  [p(z)]^{−1} = Σ_{j=−∞}^{∞} b_j z^j,

which converges absolutely in A. It follows that the series Σ_{j=−∞}^{∞} b_j π^j converges, and one has p(π)(Σ_{j=−∞}^{∞} b_j π^j) = I.
Theorem 3.3.2. Let p(e^{iθ}) ≠ 0, 0 ≤ θ ≤ 2π. Then

(3.3.22)  C^{−1} = Σ_{k=0}^{n−1} β_k π^k,  β_k = Σ_{j ≡ k (mod n)} b_j.

Proof. Make use of π^n = I to regroup the terms in Σ_{j=−∞}^{∞} b_j π^j.

Circulant Inversion by FFT Techniques. Let C = circ γ = circ(c1, c2, ..., cn) = F* diag(λ1, ..., λn) F. Then
C† = F* diag(λ1†, λ2†, ..., λn†) F. Let C† = circ δ; then from (3.2.7) or (3.3.1),

n^{1/2} F* γ^T = (λ1, ..., λn)^T,  δ^T = n^{−1/2} F (λ1†, λ2†, ..., λn†)^T.

Thus

(3.3.24)  δ^T = n^{−1} F (F* γ^T)†.

The notation ( )† means: apply † element by element. A somewhat more aesthetic form of (3.3.24) is as follows. For C = circ γ, write C† = circ γ†. Then

(3.3.25)  (γ†)^T = n^{−1} F (F* γ^T)†.
From (3.3.25) it appears that a circulant inverse (or generalized inverse) can be computed in two Fourier transforms plus n ordinary reciprocations. Thus it can be done in O(n log n) multiplications.
The same line of reasoning allows us to compute f(C), where f is any function defined on the eigenvalues λ_k of the circulant C. Write C = circ γ and f(C) = circ δ. Then n^{1/2} F* γ^T = (λ1, λ2, ..., λn)^T. But the eigenvalues of f(C) are f(λ1), ..., f(λn), so that δ^T = n^{−1/2} F (f(λ1), f(λ2), ..., f(λn))^T. Thus

(3.3.26)  δ^T = n^{−1/2} F f(n^{1/2} F* γ^T),

where we use the notation f((x1, x2, ..., xn)^T) = (f(x1), f(x2), ..., f(xn))^T.
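The recipe above — apply f to the eigenvalues, then transform back to a first row — takes only two FFTs. A sketch with f = exp, checked against a truncated power series for the matrix exponential (helper names ours):

```python
import numpy as np

def circ(gamma):
    gamma = np.asarray(gamma, dtype=float)
    return np.array([np.roll(gamma, k) for k in range(len(gamma))])

def f_of_circ(gamma, f):
    """f(circ(gamma)) = F* diag(f(lambda_1), ..., f(lambda_n)) F."""
    n = len(gamma)
    lam = n * np.fft.ifft(gamma)     # eigenvalues, cf. (3.2.6)
    delta = np.fft.fft(f(lam)) / n   # first row of the result
    return circ(delta.real)          # real for a real circulant

g = np.array([0.5, 0.1, 0.0, 0.2])
C = circ(g)
E = f_of_circ(g, np.exp)

# reference: exp(C) = sum_k C^k / k!, truncated (||C|| < 1, so this converges fast)
ref = np.zeros_like(C)
term = np.eye(len(g))
for k in range(1, 25):
    ref += term
    term = term @ C / k
ok = np.allclose(E, ref)
```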
PROBLEM
1. Let C be a circulant of order n with representer p(z) and characteristic polynomial q(z). Prove that z^n − 1 divides q(p(z)).
3.4 ADDITIONAL PROPERTIES OF CIRCULANTS
Multiplication of Circulants. Let us look more closely at the product of circulants. Let C_k, k = 1, 2, ..., p, be circulants with diagonalization C_k = F* Λ_k F, Λ_k diagonal. Then

(3.4.1)  C1 C2 ··· C_p = F*(Λ1 Λ2 ··· Λ_p)F.
From this it follows that the eigenvalues of the product C1 C2 ··· C_p are the products of the eigenvalues of the factors. This is an essential feature of all families of matrices that are simultaneously diagonalizable by a fixed matrix.

A special case of (3.4.1) is

(3.4.2)  C^p = F* Λ^p F.
Rank. The rank of a diagonalizable matrix is equal to the number of its nonzero eigenvalues. Hence, if C = F*ΛF, Λ = diag(λ1, ..., λn), then r(C) = the number of the λ's that are not zero. From (3.4.2) it follows that

(3.4.3)  r(C^p) = r(C),  p = 1, 2, ... .
Trace. Let C = circ(c1, c2, ..., cn) = F*ΛF, Λ = diag(λ1, ..., λn). Then

(3.4.4)  tr(C) = n c1 = Σ_{k=1}^{n} λ_k = Σ_{k=1}^{n} p_γ(ω^{k−1}),

where γ = (c1, c2, ..., cn). From (2.7.16) we have

(3.4.6)  tr(CC*) = tr(F* Λ* Λ F) = tr(Λ*Λ) = Σ_{k=1}^{n} |λ_k|² = n Σ_{k=1}^{n} |c_k|².
Determinant. The determinant of circ(c1, c2, ..., cn) is a homogeneous polynomial of degree n in the variables c1, ..., cn. There are no "simple" formulas. We note the first four cases:
(3.4.7)
n = 1:  det circ(c1) = c1,
n = 2:  det circ(c1, c2) = c1² − c2²,
n = 3:  det circ(c1, c2, c3) = c1³ + c2³ + c3³ − 3 c1 c2 c3,
n = 4:  det circ(c1, c2, c3, c4) = c1⁴ − c2⁴ + c3⁴ − c4⁴ − 2 c1² c3² + 2 c2² c4²
        + 4(c1 c2² c3 − c2 c3² c4 − c1² c2 c4 + c1 c3 c4²).
Spectral Decomposition. Let C = F*ΛF, where Λ = diag(λ1, λ2, ..., λn). Introduce the diagonal matrices

(3.4.8)  Λ_k = diag(0, ..., 0, 1, 0, ..., 0),

where the 1 occurs in the kth position. Now Λ = diag(λ1, ..., λn) = Σ_{k=1}^{n} λ_k Λ_k, so that C = Σ_{k=1}^{n} λ_k F* Λ_k F. If we set

(3.4.9)  B_k = F* Λ_k F,

then we can write

(3.4.10)  C = Σ_{k=1}^{n} λ_k B_k.

The matrices B_k are the component or principal idempotent matrices of the circulant C. The matrices B_k are, of course, circulants. Note that B_j B_k = 0 for j ≠ k, B_k² = B_k, and Σ_{k=1}^{n} B_k = I.
H
o
w3
H
.
a
a
a.
w
r
ti
wr
t
I1 
P
xu
 xu
.
11
>w
wq
'1
t
ip
**
M
II
(D
(D
>>
x 3
0 X
u.
I1
Sm
wm
mI
I*
>
w I
rm
r.
x.
e
m
or
t
m
0
ti
P
II ti
rm
W 3t
3q
c 0
mr
t
m
0
07
3
rt
I1
 r
om
mr
.
m m
11
3x
*x
m
m
$
x  H
. 11
w 'i U
'
rt
m
wm
x
7O
OH
* m
H
3
3
>r
.F
am
x
*a
>
mx
(D
I1 (D
m
X
 *
n
F
I 7
C m

mr
rt
3
w r
3Y

r
tr
r
 m
\n
,
r
3
m
w
. 2 22:
3 
rt
Yt
Y
r
m3
0
rt
m
tT
+f
(D
f 
ft
T r
t n
m
7
r
(D
. N
w  (D
I+
P
. +
c "
2
3
(0
.n
3
.r
t
C
r.
w 0
! +
3 c m
f
a m
 (D
rm 0
3
r.
m
 3
0
rt
a s
I1 0
0
rt3
0
7
r.
m*
ti
r
I
S z
$2
m
r.
. o
5'
" ?
2G:
0
m
a
P
r..
ti
wr
t '
n
0
 m
u
lr
a
w m
c
oo
w
0
ti
03
(D
n
z a
ul
F
ma
\
w n
W
N
O
WN
Zr
a
n
II n
n
r.
ti
0
* ?;
g c
a
ww
c
I
r.
a
rt
rt
w w
0
r.
(D
3
ID
ao
.
rt
 rt
3
Y
n VF
w 0
P \
m
\N
<
N
 W

Y
F?
2
,I
ma
ti
\
N
OF
<
 m
in 
E.
Y
n
.
xr
r
7
. \ m
.
N
 .
rt
n
Y
0
r
3r
\
Ym
h
N
79

 m
c
N
'1
3
W .
ti
I1 r.
m
m
n
9 6 Circulant Matrices
2. Let the B_j be the matrices of (3.4.9). Let C be a circulant with eigenvalues η1, ..., ηn. Prove that B_j C = η_j B_j.

3. Let Y = (1, ω, ω², ..., ω^{n−1})^T. Prove that B_j Y = δ_j Y, where δ2 = 1 and δ_j = 0 otherwise. Prove that B_j Ȳ = ε_j Ȳ, where εn = 1 and ε_j = 0 otherwise.
4. Outer product expansion. Let A be of order n and have the singular value decomposition A = UDV*, where U and V are unitary and D = diag(d1, ..., dn) (see (2.8.3.1)). Let the Λ_k be as in (3.4.8) and set B_k = U Λ_k V*, k = 1, ..., n. Let u_k be the kth column of U and v_k* the kth row of V*. Show that

(a) B_k = u_k v_k* (the outer product of u_k and v_k);
(b) the matrices B_k have rank 1;
(c) B_i B_j* = 0, i ≠ j;
(d) Σ_{i=1}^{n} B_i B_i* = I;
(e) tr(B_i B_i*) = 1.
Minimal Polynomial of a Circulant. Let A be a matrix whose characteristic polynomial is

q(λ) = (λ − λ1)^{α1} ··· (λ − λs)^{αs},

where λ1, ..., λs are distinct and the integers α_k ≥ 1. Then the minimal polynomial of A has the form

m(λ) = (λ − λ1)^{β1} ··· (λ − λs)^{βs},

with 1 ≤ β_j ≤ α_j, j = 1, 2, ..., s. Now, it is known that a matrix is simple (diagonalizable) if and only if its minimal polynomial has only simple zeros. Therefore if A is simple, in particular if A is a circulant, then

m(λ) = (λ − λ1)(λ − λ2) ··· (λ − λs).

In other words, m(λ) is the monic polynomial of minimal degree which has as its zeros all the distinct eigenvalues of A. Of course, one has m(A) = 0.
Derivatives of Circulants and of Determinants of Circulants. Let A be an m × n matrix whose elements a_{ij} = a_{ij}(t) are differentiable functions of t on some common interval. By dA/dt or Ȧ we mean the m × n matrix ((d/dt) a_{ij}). It is easy to verify the identities

(3.4.22)  (d/dt)(αA + βB) = α dA/dt + β dB/dt,  α, β scalar constants,

(3.4.23)  (d/dt)(uA) = u dA/dt + (du/dt) A,  u = scalar function.

If A and B are compatible for multiplication,

(3.4.24)  (d/dt)(AB) = (dA/dt) B + A (dB/dt).

If A is square and nonsingular,

(3.4.25)  (d/dt)(A^{−1}) = −A^{−1} (dA/dt) A^{−1}.
Now let A = A(t) = circ(c1, c2, ..., cn), where the c_j = c_j(t) are differentiable functions. Then by (3.2.4),

A = F*ΛF,

where Λ = diag(λ1(t), ..., λn(t)) and λ_j(t) = Σ_{k=1}^{n} c_k(t) ω^{(j−1)(k−1)}. Then

dA/dt = F* (dΛ/dt) F,

with

dΛ/dt = diag(dλ1/dt, ..., dλn/dt).
Of course, one also has from (3.1.4)

dA/dt = Σ_{k=1}^{n} (dc_k/dt) π^{k−1}.
Let the c_j = c_j(t) be differentiable functions and set Δ = Δ(t) = det circ(c1, c2, ..., cn). The following identity is valid:

(3.4.27)  dΔ/dt = n det ( c1′  c2′  ...  cn′
                          cn   c1   ...  c_{n−1}
                          ....................
                          c2   c3   ...  c1   ).
From the ordinary law of determinant differentiation, one has
... C' C'
... C
 dA = det d t
C n 1
C1 C2 ... C C n 1
C' C' ... C'
+ det
C2 C3 . a . 'n C 1
+ ... I C1 C2 . .. C
+ det (: ! I c c . . . c' c' ! i
n 1
Additional Properties of Circulants 9 9
Now it turns out that these n determinants are all equal; hence the theorem.
In order not to get lost in a welter of notation, we show this in the case n = 3. It is merely a row column interchange. The method is perfectly general. Note that
and
Since n* =  1 n , we find, upon taking determinants, that all the determinants in the previous expansion are equal.
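Identity (3.4.27) is easy to test numerically in the case n = 3 by comparing its right side with a finite-difference approximation to dΔ/dt. A Python sketch; the particular functions c_j(t) and the test point are our own choices, not from the text:

```python
import math

def det3(m):
    # Expansion of a 3x3 determinant along the first row.
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def circ3(c):
    # circ(c1, c2, c3): each row is the previous one shifted right.
    return [[c[0], c[1], c[2]],
            [c[2], c[0], c[1]],
            [c[1], c[2], c[0]]]

def c(t):          # sample differentiable entries (illustrative choice)
    return [t, t * t, math.sin(t)]

def cprime(t):     # their derivatives
    return [1.0, 2.0 * t, math.cos(t)]

def delta(t):
    return det3(circ3(c(t)))

t0, h = 0.7, 1e-6
# Right side of (3.4.27): n = 3 times the determinant whose first row
# holds the derivatives c_j' and whose other rows are those of circ(c).
m = circ3(c(t0))
m[0] = cprime(t0)
rhs = 3 * det3(m)
# Left side: central-difference approximation to dDelta/dt.
lhs = (delta(t0 + h) - delta(t0 - h)) / (2 * h)
print(abs(lhs - rhs) < 1e-5)   # True: the two sides agree
```

For n = 3 one can also verify (3.4.27) symbolically, since Δ = c_1³ + c_2³ + c_3³ − 3c_1c_2c_3.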
3.5 CIRCULANT TRANSFORMS
Let C = circ(γ), γ = (c_1, c_2, ..., c_n), be a circulant of order n. Let Z = (z_1, z_2, ..., z_n)^T and W = (w_1, w_2, ..., w_n)^T. If W is related to Z by means of

(3.5.1)   W = CZ,

then W is called the circulant transform of Z by C. It is also called the circular convolution or the wrapped convolution of γ and Z.
We mention a number of circulant transforms of particular interest:
(1) C = I = circ(1, 0, ..., 0). This is the identity.

(2) π = circ(0, 1, 0, ..., 0). This is the fundamental circulant. π causes a circular shifting of the components of Z.

(3) For integer r, π^r causes a circular shifting of the components of Z by r positions.

(4) D = I − π = circ(1, −1, 0, 0, ..., 0). Since DZ = (z_1 − z_2, z_2 − z_3, ..., z_n − z_1)^T, it is clear that D is a circular differencing operator.

(5) For integer r ≥ 0, D^r = (I − π)^r is a circular differencing operator of the rth order.

(6) For s, t > 0, s + t = 1, the circulant transform C = sI + tπ is, as we shall show later, a smoothing operator.
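Transforms (1), (2), and (4) are easy to experiment with. A minimal Python sketch (the helper name and the sample vector are ours); the indexing follows the convention that the first row of circ(c) is c and each succeeding row is shifted one place to the right:

```python
def circ_apply(c, z):
    # W = circ(c) Z, where circ(c)[j][k] = c[(k - j) mod n] (0-indexed).
    n = len(c)
    return [sum(c[(k - j) % n] * z[k] for k in range(n)) for j in range(n)]

z = [5.0, 2.0, 7.0, 1.0]

ident = [1.0, 0.0, 0.0, 0.0]    # (1) the identity circ(1, 0, ..., 0)
shift = [0.0, 1.0, 0.0, 0.0]    # (2) the fundamental circulant pi
diff  = [1.0, -1.0, 0.0, 0.0]   # (4) the differencing operator D = I - pi

print(circ_apply(ident, z))     # [5.0, 2.0, 7.0, 1.0]
print(circ_apply(shift, z))     # [2.0, 7.0, 1.0, 5.0] -- components shifted
print(circ_apply(diff, z))      # [3.0, -5.0, 6.0, -4.0] = (z1-z2, ..., z4-z1)
```

Note that circ_apply(c, z) is symmetric in c and z, which bears on Problem 1 below.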
Let C = F*ΛF; then (3.5.1) becomes

   W = F*ΛFZ,

so that if one writes Ẑ and Ŵ for the Fourier transforms of Z and W, one has

(3.5.2)   Ŵ = ΛẐ.

If C is nonsingular, then the inverse transform is given by

(3.5.3)   Z = C^{−1}W = F*Λ^{−1}FW

and is itself a circulant transform. If C is singular, then (3.5.1) may be solved in the sense of least squares, yielding

(3.5.4)   Z = C^+W = F*Λ^+FW.

This, again, is a circulant transform that is often of interest.

As a concrete instance, select C = π^r, r = 0, ±1, ±2, ... . Then π^rZ is just Z shifted circularly by r indices. Since π^r = F*R^rF, R = diag(1, ω, ω², ..., ω^{n−1}), one has

(3.5.5)   (π^rZ)^ = R^rẐ.
This is known as the shift theorem.
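The shift theorem can be checked directly. The Python sketch below assumes the DFT convention F = n^{−1/2}(ω^{−(j−1)(k−1)}); with the conjugate convention the same identity holds with R replaced by its conjugate:

```python
import cmath

def dft(z):
    # zhat = F Z with F[j][k] = n**-0.5 * omega**(-j*k), omega = exp(2*pi*i/n).
    n = len(z)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(z[k] * w ** (-j * k) for k in range(n)) / n ** 0.5
            for j in range(n)]

def shift(z, r):
    # pi^r Z: components shifted circularly by r positions.
    n = len(z)
    return [z[(j + r) % n] for j in range(n)]

z = [3.0, 1.0, 4.0, 1.0, 5.0]
n, r = len(z), 2
w = cmath.exp(2j * cmath.pi / n)

lhs = dft(shift(z, r))                                    # F (pi^r Z)
rhs = [w ** (j * r) * zh for j, zh in enumerate(dft(z))]  # R^r (F Z)
print(max(abs(a - b) for a, b in zip(lhs, rhs)) < 1e-12)  # True
```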
PROBLEM
1. Is the circular convolution of two vectors a commutative operation?
3.6 CONVERGENCE QUESTIONS
Convergence of Sequences of Matrices. Let M_1, M_2, ... be a sequence of matrices, all of the same order. Iteration problems often lead to questions about whether certain infinite sequences or infinite products of matrices converge. In the case of infinite products, particular importance attaches to whether the limiting matrix is or is not the zero matrix.
Prior to discussing this question, we recall the definition of matrix convergence. Let

   A_1, A_2, A_3, ...

be a sequence of matrices, all of size m × n. We shall say that

(3.6.1)   lim_{r→∞} A_r = A = (a_jk)

if and only if

   lim_{r→∞} a_jk^{(r)} = a_jk,   for j = 1, 2, ..., m; k = 1, 2, ..., n.

The notation Σ_{r=1}^∞ A_r = A is an abbreviation for lim_{k→∞} Σ_{r=1}^k A_r = A, and the notation Π_{r=1}^∞ A_r = A is an abbreviation for lim_{k→∞} Π_{r=1}^k A_r = A. One sometimes writes A_r → A for convergence.
Elementary properties of convergent sequences of matrices are:

(1) If A_r → A, then αA_r → αA, α a scalar.

(2) If A_r, B_r are of the same size, then A_r → A, B_r → B implies A_r + B_r → A + B.

(3) If A_r are m × n and B_r are n × p and if A_r → A, B_r → B, then A_rB_r → AB.

(4) If A = (a_jk) is m × n and ||A|| designates the matrix norm

   ||A|| = (Σ_{j=1}^m Σ_{k=1}^n |a_jk|²)^{1/2},

then A_r → A if and only if lim_{r→∞} ||A − A_r|| = 0.
If A_r is a sequence of square matrices of order n, the question of the convergence of Π_{r=1}^∞ A_r may be a difficult one. Somewhat simpler to deal with is the case in which all the A_r are simultaneously diagonalizable by one and the same matrix.

Theorem 3.6.1. Let A_r = MΛ_rM^{−1}, r = 1, 2, ..., where M is a nonsingular matrix and where Λ_r = diag(λ_1^{(r)}, ..., λ_n^{(r)}). Then Π_{r=1}^∞ A_r exists if and only if Π_{r=1}^∞ λ_j^{(r)} exists for j = 1, 2, ..., n. In such a case,

   Π_{r=1}^∞ A_r = M diag(Π_{r=1}^∞ λ_1^{(r)}, ..., Π_{r=1}^∞ λ_n^{(r)}) M^{−1}.

Proof. Π_{r=1}^k A_r = Π_{r=1}^k (MΛ_rM^{−1}) = M(Π_{r=1}^k Λ_r)M^{−1}. Hence Π_{r=1}^∞ A_r converges if and only if Π_{r=1}^∞ Λ_r does. But Π_{r=1}^k Λ_r = diag(Π_{r=1}^k λ_1^{(r)}, ..., Π_{r=1}^k λ_n^{(r)}). The theorem now follows.

Corollary. An infinite product of circulants converges if and only if the infinite products of the respective eigenvalues converge.

Proof. All circulants are simultaneously diagonalizable by F.
Note. We have said that Π_{r=1}^∞ λ_r exists if and only if lim_{k→∞} Π_{r=1}^k λ_r exists. This terminology is at variance with some parts of complex variable theory, which require also that lim_{k→∞} Π_{r=1}^k λ_r ≠ 0.
Corollary. If C is a circulant with eigenvalues λ_1, λ_2, ..., λ_n, then lim_{k→∞} C^k exists if and only if

(3.6.2)   for each r, either λ_r = 1 or |λ_r| < 1.

If lim_{k→∞} C^k exists, we shall designate its limiting value by C^∞. It is useful to have an explicit form for the limiting value C^∞ of a circulant C. Let J_C designate the subset of integers r = 1, 2, ..., n for which λ_r = 1.

Corollary. Assuming (3.6.2),

(3.6.3)   C^∞ = Σ_{r∈J_C} B_r   if J_C ≠ ∅ (the null set);   C^∞ = 0 if J_C = ∅.

Proof. If C = F*ΛF, Λ = diag(λ_1, λ_2, ..., λ_n), then C^∞ = F*Λ^∞F, Λ^∞ = diag(λ_1^∞, λ_2^∞, ..., λ_n^∞), where λ_r^∞ = 1 if λ_r = 1 and λ_r^∞ = 0 if |λ_r| < 1. The statement now follows from (3.4.8) and (3.4.9).
Corollary. Let C be a circulant with eigenvalues λ_1, λ_2, ..., λ_n. Then the Cesàro mean

   C̃ = lim_{r→∞} (1/r)(I + C + ... + C^{r−1})

exists if and only if

(3.6.4)   |λ_r| ≤ 1, r = 1, 2, ..., n.

The representation (3.6.3) persists with C̃ replacing C^∞.
Proof. Write C = F*ΛF, Λ = diag(λ_1, λ_2, ..., λ_n). Then

   (1/r)(I + C + ... + C^{r−1}) = F* diag(σ_r(λ_1), ..., σ_r(λ_n)) F,

where

   σ_r(λ) = (1/r)(1 + λ + λ² + ... + λ^{r−1}).

Now

   σ_r(λ) = (1/r)(1 − λ^r)/(1 − λ) for λ ≠ 1,   while σ_r(1) = 1.

It is clear that σ_r(λ) converges if and only if |λ| ≤ 1. It converges to 1 if and only if λ = 1, and to 0 if and only if λ ≠ 1, |λ| ≤ 1.
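The three regimes for σ_r(λ) are easy to observe numerically. A Python sketch (the sample eigenvalues are our own choices):

```python
def cesaro(lam, r):
    # sigma_r(lam) = (1/r)(1 + lam + lam^2 + ... + lam^(r-1))
    total, power = 0j, 1 + 0j
    for _ in range(r):
        total += power
        power *= lam
    return total / r

# lam = 1: the mean converges to 1.
print(abs(cesaro(1, 10000) - 1) < 1e-9)     # True
# |lam| = 1 but lam != 1 (here lam = i): the mean converges to 0.
print(abs(cesaro(1j, 10000)) < 1e-3)        # True
# |lam| < 1: the mean also converges to 0.
print(abs(cesaro(0.9j, 10000)) < 1e-3)      # True
```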
In discussing convergence problems, it is useful to introduce the spectral radius, ρ(M), of a matrix M by means of

(3.6.5)   ρ(M) = max_{j=1,2,...,n} |λ_j|,

where the λ_j are the eigenvalues of M.

Inasmuch as circulants are a special case of diagonalizable matrices, we append a table of the behavior of M^r as r → ∞ for diagonalizable matrices. All results are obtained by using M^r = SΛ^rS^{−1} and an examination of the individual behavior of λ_j^r as r → ∞. By a unimodular eigenvalue we mean an eigenvalue λ_j for which |λ_j| = 1.
It is of interest to contrast this tabulation with the general theorem on the existence of M^∞, where M is not necessarily diagonalizable.
Theorem 3.6.2.

(a) If λ = 1 is an eigenvalue of M, then M^∞ exists if and only if λ = 1 is a simple root of the minimal polynomial of M and all other roots are less than 1 in absolute value.

(b) If λ = 1 is not an eigenvalue of M, then M^∞ exists if and only if ρ(M) < 1, in which case M^∞ = 0.

What is the general form of infinite powers? Omit the trivial case M^∞ = 0. Assume M has order n. Then, since the Jordan blocks corresponding to the eigenvalue λ = 1 must all be of dimension 1, it follows that M can be Jordanized as follows:

(3.6.6)   M = SQS^{−1},

where S is nonsingular and where Q has the form

(3.6.7)   Q = I_m ⊕ X.
Behavior of M^r, r → ∞; M Diagonalizable

   Behavior                         Necessary and Sufficient Conditions
   Converges to 0                   ρ(M) < 1
   Converges to M^∞ ≠ 0             ρ(M) = 1; all unimodular eigenvalues equal 1
   Diverges boundedly               ρ(M) = 1; not all unimodular eigenvalues equal 1
   Cesàro mean converges to 0       ρ(M) = 1; no unimodular eigenvalue equals 1
   Cesàro mean converges,           ρ(M) = 1; at least one, but not all,
     but not to 0                     unimodular eigenvalues equal 1
   Finite number of limit points    ρ(M) = 1; not all unimodular eigenvalues equal 1;
                                      all unimodular eigenvalues are roots of unity
   Infinite number of limit points  ρ(M) = 1; at least one unimodular eigenvalue
                                      is not a root of unity
   Diverges unboundedly             ρ(M) > 1
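The rows of the table that depend only on eigenvalue moduli can be turned into a small classifier. A Python sketch (function name and tolerance are ours; the root-of-unity distinctions in the remaining rows are omitted, since they cannot be decided reliably from floating-point eigenvalues):

```python
def classify(eigs, tol=1e-12):
    # Behavior of M^r, r -> infinity, for diagonalizable M, read off
    # from its eigenvalue list (first, second, third, and last table rows).
    rho = max(abs(l) for l in eigs)          # spectral radius (3.6.5)
    if rho > 1 + tol:
        return "diverges unboundedly"
    if rho < 1 - tol:
        return "converges to 0"
    # rho = 1: examine the unimodular eigenvalues.
    unimod = [l for l in eigs if abs(abs(l) - 1) <= tol]
    if all(abs(l - 1) <= tol for l in unimod):
        return "converges (limit may be nonzero)"
    return "diverges boundedly"

print(classify([0.5, -0.25]))   # converges to 0
print(classify([1.0, 0.3]))     # converges (limit may be nonzero)
print(classify([-1.0, 0.3]))    # diverges boundedly
print(classify([1.5]))          # diverges unboundedly
```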
In (3.6.7), I_m is the identity matrix of a certain order m, 1 ≤ m ≤ n, and X is (n − m) × (n − m) with ρ(X) < 1. Hence X^∞ = 0, so that

   Q^∞ = I_m ⊕ 0.

Therefore, M^∞ = SQ^∞S^{−1}. Now write S in block form as S = (A | B), where A is n × m and B is n × (n − m). Write S^{−1} = (C over D), where C is m × n and D is (n − m) × n. Then from (3.6.6) it follows that M^∞ = AC.
PROBLEMS

1. Investigate the convergence of sequences of direct sums.

2. Investigate the convergence of sequences of Kronecker products.

3. Prove that if the A_k are square, lim_{k→∞} A_k = A, and A is nonsingular, then for k sufficiently large, A_k is nonsingular and lim_{k→∞} A_k^{−1} = A^{−1}.

4. Let A, B be square, of the same order, and commute. Let lim_{k→∞} A^k = A_∞ and lim_{k→∞} B^k = B_∞ exist. Then lim_{k→∞} (AB)^k = A_∞B_∞.

5. Show that the identity of Problem 4 may not be valid if AB ≠ BA. Take

   A = | 1/2  2 |
       |  0   0 |,   B = A*.

6. What functions of matrices are continuous under matrix convergence? For example: determinant, rank, etc.

7. Let λ = 1 be an eigenvalue of A and a simple root of its minimal polynomial μ(λ). Let A^∞ exist. Then, if one writes μ(λ) = (λ − 1)q(λ), q(1) ≠ 0, one has A^∞ = (q(1))^{−1} q(A). (Greville.)

8. When is | a  b |
           | c  d |   an infinite power?
9. Level spirits. Take three glasses, containing different amounts of vodka. By pouring, adjust the first two glasses so that the level in both is the same. Adjust the level in the second and third glasses. Then in the third and first glasses. Iterate. Predict the result after n iterations. What happens as n → ∞? What if the glasses do not have the same cross-section? What if the glasses do not have constant cross-sectional area? What if, after the kth leveling, an amount v_k is drunk from both of the leveled glasses?

10. Prove the statement at the end of Section 1.3. Generalize it.
REFERENCES

Circulant matrices first appear in the mathematical literature in 1846 in a paper by E. Catalan.

Identity (3.2.14) for the determinant of a circulant is essentially due to Spottiswoode, 1853.

For articles on circulants in the older literature see the bibliographies of Muir, [1]-[6].

Circulants: Aitken, [1], [2]; Bellman, [1]; Carlitz; Charmonman and Julius; Davis, [1], [2]; Marcus and Minc, [2]; Muir, [1]; Muir and Metzler, [7]; Ore; Trapp; Varga.

z-Transform: Jury.

Frobenius theorem: Taussky.

Convergence: Greville, [1]; Ortega.

Skew circulants; {k}-circulants: Beckenbach and Bellman; Smith, [1].

Toeplitz matrices: Gray, [1]-[4]; Grenander and Szegő; Widom.

Determinantal inequality: Beckenbach and Bellman.

Outer product: Andrews and Patterson.
SOME GEOMETRICAL APPLICATIONS OF CIRCULANTS
We are interested here in the quadratic form

(4.0.1)   Q(Z) = Z*QZ,

where Q is a circulant matrix. The reader will perceive that some of what is presented is valid in a wider context. In (4.0.1) we have written Z = (z_1, ..., z_n)^T. Insofar as Q = F*ΛF, Λ = diag(λ_1, λ_2, ..., λ_n), one has

(4.0.2)   Q(Z) = (FZ)*Λ(FZ).

This is the reduction of Q(Z) to a sum of squares. If one writes Ẑ for the Fourier transform of Z,

(4.0.3)   Ẑ = (ẑ_1, ẑ_2, ..., ẑ_n)^T = FZ,

then one has

(4.0.4)   Q(Z) = Σ_{j=1}^n λ_j |ẑ_j|².
4.1 CIRCULANT QUADRATIC FORMS ARISING IN GEOMETRY
We list a number of specific quadratic forms Q(Z) in which the matrices Q are Hermitian circulants and which are of importance in geometry.
(1)   (4.1.1)   Q_1(Z) = Z*IZ = Σ_{k=1}^n |z_k|²
= polar moment of inertia around z = 0 of the n-gon Z whose vertices are unit point masses. From (4.0.4),

(4.1.1′)   Q_1(Z) = Σ_{j=1}^n |ẑ_j|²,

which expresses the isometric nature of the unitary transformation F.

(2)   Q_2(Z) = Z*(I − π)*(I − π)Z = Σ_{k=1}^n |z_{k+1} − z_k|²   (z_{n+1} = z_1)
= sum of squares of the sides of the n-gon Z.

(3)   Q(Z) = Z*[(I − π)^k]*(I − π)^k Z,
where k is a positive integer. Z*QZ = sum of squares of the kth-order cyclic differences of the vertices of Z. For example, for k = 2 the summands are |z_{j+2} − 2z_{j+1} + z_j|².
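By (4.0.4), the form Q_2 can also be written through the eigenvalues 4 sin²((j−1)π/n) of (I − π)*(I − π): the sum of squared side lengths equals Σ_j 4 sin²((j−1)π/n) |ẑ_j|². A Python check of this Parseval-type reduction (the DFT convention below is one common choice, and the 5-gon is our own example):

```python
import cmath, math

def dft(z):
    # Unitary DFT: zhat[j] = n**-0.5 * sum_k z[k] * omega**(-j*k).
    n = len(z)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(z[k] * w ** (-j * k) for k in range(n)) / math.sqrt(n)
            for j in range(n)]

# An arbitrary 5-gon (vertices as complex numbers).
z = [1 + 2j, -0.5 + 1j, 2 - 1j, 0.3 + 0.7j, -1 - 1j]
n = len(z)

# Q2(Z): sum of squares of the side lengths, indices taken cyclically.
q2_direct = sum(abs(z[(k + 1) % n] - z[k]) ** 2 for k in range(n))

# The same form through the eigenvalues 4 sin^2(j*pi/n) (0-indexed j).
zh = dft(z)
q2_spectral = sum(4 * math.sin(j * math.pi / n) ** 2 * abs(zh[j]) ** 2
                  for j in range(n))

print(abs(q2_direct - q2_spectral) < 1e-10)   # True: the two agree
```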
We wish next to exhibit the area of an n-gon as a quadratic form in Z. Since for a general Z the geometrical n-gon may be a multiply covered figure, it is more convenient to deal with the oriented or signed area of Z.

Let z_k = x_k + iy_k, k = 1, 2, 3, be the vertices of a triangle T taken in counterclockwise order. From (1.2.15) we have for its area

   area(T) = (1/4i)[(z̄_1z_2 − z_1z̄_2) + (z̄_2z_3 − z_2z̄_3) + (z̄_3z_1 − z_3z̄_1)] = (1/2) Im(z̄_1z_2 + z̄_2z_3 + z̄_3z_1).

Decomposing the n-gon into triangles and summing the signed areas, one finds

   Q_3(Z) = (1/4i) Σ_{k=1}^n (z̄_kz_{k+1} − z_kz̄_{k+1})   (z_{n+1} = z_1)
          = (1/4i) Z*(π − π*)Z,

so that the signed area is the quadratic form belonging to the Hermitian circulant (1/4i)(π − π*). The eigenvalues of this circulant are (1/4i)(ω^{j−1} − ω̄^{j−1}) = sin((j−1)π/n) cos((j−1)π/n), so that, alongside

(4.1.8)   Q_2(Z) = Σ_{j=1}^n 4 sin²((j−1)π/n) |ẑ_j|²,

one has

(4.1.9)   Q_3(Z) = Σ_{j=1}^n sin((j−1)π/n) cos((j−1)π/n) |ẑ_j|².

PROBLEMS
2. Let J = (1, 1, ..., 1)^T. Prove that Q_3(Z + cJ) = Q_3(Z). Interpret geometrically.

3. Prove that Q_3(πZ) = Q_3(Z). Interpret.

4. Prove that Q_3(TZ) = Q_3(Z). (See p. 28 for T.) Interpret.
4.2 THE ISOPERIMETRIC INEQUALITY FOR ISOSCELES POLYGONS
Consider a simply connected, bounded, plane region P with a rectifiable boundary. If A designates its area and L the length of its boundary, the nondimensional ratio A/L² is known as its isoperimetric ratio. The famous isoperimetric inequality asserts that for all P,

(4.2.1)   A/L² ≤ 1/(4π),

and that equality holds in (4.2.1) if and only if P is a circle.

If P is a regular polygon of n sides, each of length 2a, it is easily shown that L = 2na, A = na² cot(π/n). Hence the isoperimetric ratio for a regular polygon of n sides is

   A/L² = (1/4n) cot(π/n) = 1/(4n tan(π/n)) < 1/(4π).

It is a reasonable conjecture that if P is any equilateral polygon of n sides, with area A and perimeter L, then

(4.2.2)   A/L² ≤ 1/(4n tan(π/n)),

with equality holding if and only if P is regular, that is, equiangular as well. We can now establish the truth of this conjecture. Write (4.2.2) in the form

   L² − 4n(tan π/n)A ≥ 0.
From (4.1.9) we have, using the double-angle formula and observing that the first term of the series vanishes,

   4n(tan π/n)A = 4n Σ_{j=2}^n tan(π/n) sin((j−1)π/n) cos((j−1)π/n) |ẑ_j|².

Now if P is equilateral, then for some b > 0, |z_{j+1} − z_j| = b, j = 1, 2, ..., n, so that L = nb, L² = n²b². Now Q_2(Z) = Σ_{j=1}^n |z_{j+1} − z_j|² = nb² = L²/n. Thus from (4.1.8), since the first term of the series vanishes,

   L² = n Σ_{j=2}^n 4 sin²((j−1)π/n) |ẑ_j|².

For j = 2, we have (tan π/n)(sin π/n)(cos π/n) = sin²(π/n), so that

   L² − 4n(tan π/n)A = 4n Σ_{j=3}^n sin((j−1)π/n) [sin((j−1)π/n) − tan(π/n) cos((j−1)π/n)] |ẑ_j|².

Notice that sin((j−1)π/n) > 0 for j = 3, 4, ..., n. The bracketed quantity

   sin((j−1)π/n) − tan(π/n) cos((j−1)π/n) = cos((j−1)π/n)[tan((j−1)π/n) − tan(π/n)]

is also positive. When cos((j−1)π/n) = 0, the left side reduces to sin((j−1)π/n) > 0. When the cos > 0, the tan > 0 and tan((j−1)π/n) > tan(π/n). When the cos < 0, the tan < 0. Therefore the coefficients of |ẑ_j|² are always positive. It follows that L² − 4n(tan π/n)A ≥ 0, and equality holds if and only if ẑ_3 = ẑ_4 = ... = ẑ_n = 0. To interpret the equality, one has Z = F*Ẑ, so that

   z_k = α + βω^{k−1},   k = 1, 2, ..., n,

for some α, β. Thus, in the case of equality, the z_k are the vertices of a regular polygon of n sides.
4.3 QUADRATIC FORMS UNDER SIDE CONDITIONS
Pick an r with 1 ≤ r ≤ n. Let Z^{(r)} be an eigenvector of Q corresponding to λ_r. Then, up to a scalar factor, Z^{(r)} = F*(0, ..., 0, 1, 0, ..., 0)^T, where the 1 is in the rth position. Suppose now that Z ⊥ Z^{(r)}, that is, Z*Z^{(r)} = 0. Then Z*F*(0, ..., 0, 1, 0, ..., 0)^T = (FZ)*(0, ..., 0, 1, 0, ..., 0)^T = 0. This is valid if and only if ẑ_r = 0. Hence

(4.3.1)   Z ⊥ Z^{(r)} implies Q(Z) = Σ_{k≠r} λ_k |ẑ_k|².

For distinct r_1, r_2, ..., r_m, 0 < m < n,

(4.3.2)   Z ⊥ Z^{(r_k)}, k = 1, 2, ..., m, implies Q(Z) = Σ_{k≠r_1,r_2,...,r_m} λ_k |ẑ_k|².

In particular, since Z^{(1)} = (1/√n)(1, 1, ..., 1)^T,

(4.3.3)   z_1 + z_2 + ... + z_n = 0 implies Q(Z) = Σ_{k=2}^n λ_k |ẑ_k|².

The eigenvalues λ_k are, of course, generally neither real nor positive. For a given matrix Q, the set of all values Q(Z) with ||Z|| = 1 is the field of values of Q (see p. 63).
It is easily shown, using the fact that a normal matrix is unitarily diagonalizable, that the field of values of a normal matrix is the convex hull of its eigenvalues. Since circulants are normal, the same may be asserted for the field of values of a circulant.

The λ_k are all real if and only if the circulant Q is Hermitian. Then from (4.0.4), Q(Z) will be real for all Z. In this case one has the Rayleigh inequalities, arrived at as follows. Let λ_min and λ_max be the smallest and largest of the λ_k. Then

   λ_min Σ_k |ẑ_k|² ≤ Σ_k λ_k |ẑ_k|² ≤ λ_max Σ_k |ẑ_k|².

Hence, from (4.1.1′) and (4.0.4),

(4.3.4)   λ_min ||Z||² ≤ Q(Z) ≤ λ_max ||Z||².

Therefore, for any Z ≠ 0,

(4.3.5)   λ_min ≤ Q(Z)/||Z||² ≤ λ_max.

In all our work so far with circulants, it has been convenient to number the eigenvalues so that λ_j = p(ω^{j−1}), where p is the representer of the circulant [cf. (3.2.6)]. To derive equality conditions and further conclusions along the lines of what is now called the Courant-Fischer theorem, it is convenient briefly to renumber the eigenvalues and vectors so that one has

   λ_1 ≥ λ_2 ≥ ... ≥ λ_n.
The corresponding eigenvectors of Q will be Z^{(j)}. Suppose now that we have a vector Z ≠ 0 for which

(4.3.7)   Q(Z) = λ_min ||Z||² = λ_n ||Z||².

Then

   Q(Z) = Σ_{k=1}^n λ_k |ẑ_k|² = λ_n ||Z||² = λ_n Σ_{k=1}^n |ẑ_k|².

Thus Σ_k (λ_k − λ_n)|ẑ_k|² = 0. Since (λ_k − λ_n) ≥ 0, k = 1, 2, ..., n, it follows that (λ_k − λ_n)|ẑ_k|² = 0, k = 1, 2, ..., n. Now assume that

(4.3.8)   λ_1 ≥ λ_2 ≥ ... ≥ λ_{n−1} > λ_n.

Then (λ_k − λ_n) ≠ 0 for k = 1, 2, ..., n − 1. Thus (4.3.7) holds if and only if ẑ_1 = ẑ_2 = ... = ẑ_{n−1} = 0. Therefore Z = F*Ẑ = F*(0, 0, ..., ẑ_n)^T = ẑ_n Z^{(n)}. In other words, (4.3.7) holds if and only if Z is an eigenvector corresponding to λ_n (i.e., to λ_min).
Let now Z be a vector such that Z ⊥ Z^{(n)}. As observed, ẑ_n = 0, and from (4.0.4),

   Q(Z) = Σ_{k=1}^{n−1} λ_k |ẑ_k|² ≥ λ_{n−1} Σ_{k=1}^{n−1} |ẑ_k|² = λ_{n−1} ||Z||²,

or briefly,

(4.3.9)   Q(Z) ≥ λ_{n−1} ||Z||²   for all vectors Z ⊥ Z^{(n)}.

Make the further hypothesis that

(4.3.10)   λ_1 ≥ λ_2 ≥ ... ≥ λ_{n−3} > λ_{n−2} = λ_{n−1} ≥ λ_n,

and suppose that equality holds in (4.3.9):

(4.3.11)   Q(Z) = λ_{n−1} ||Z||²,   Z ⊥ Z^{(n)}.

Then

   Σ_{k=1}^{n−1} λ_k |ẑ_k|² = λ_{n−1} Σ_{k=1}^{n−1} |ẑ_k|²,

so that

   Σ_{k=1}^{n−1} (λ_k − λ_{n−1}) |ẑ_k|² = 0.

Since (λ_k − λ_{n−1}) ≥ 0 for k = 1, 2, ..., n−1, it follows that (λ_k − λ_{n−1})|ẑ_k|² = 0 for k = 1, 2, ..., n−1. Hence, by (4.3.10), ẑ_k = 0 for k = 1, 2, ..., n−3. The structure of Ẑ must therefore be Ẑ = (0, 0, ..., 0, ẑ_{n−2}, ẑ_{n−1}, 0)^T for arbitrary ẑ_{n−2}, ẑ_{n−1}, so that Z = F*Ẑ = ẑ_{n−2}Z^{(n−2)} + ẑ_{n−1}Z^{(n−1)}.

In summary, if (4.3.10) holds, then (4.3.11) holds if and only if Z is a linear combination of the eigenvectors Z^{(n−1)} and Z^{(n−2)}.
We now present an application of these ideas. Select Q = (I − π)*(I − π). From (4.1.8), the eigenvalues of Q are (in the usual ordering)

   λ_j = 4 sin²((j−1)π/n),   j = 1, 2, ..., n.

The eigenvalue of smallest value is 0, corresponding to j = 1. The next two in size are paired, corresponding to j = 2 and j = n. The common value is 4 sin²(π/n). Thus we arrive at

Theorem 4.3.1. Let z_1, z_2, ..., z_n, z_{n+1} = z_1 be complex numbers with Σ_{k=1}^n z_k = 0. Then

(4.3.12)   Σ_{k=1}^n |z_{k+1} − z_k|² ≥ 4 sin²(π/n) Σ_{k=1}^n |z_k|².
Equality in (4.3.12) holds if and only if

(4.3.13)   z_k = αω^{k−1} + βω̄^{k−1},   k = 1, 2, ..., n,

for constants α, β.

Proof.

   Q(Z) = Z*(I − π)*(I − π)Z = Σ_{k=1}^n |z_{k+1} − z_k|².

The eigenvalue of Q of lowest value is 0; the corresponding eigenvector is (1, 1, ..., 1)^T. The eigenvalues next in size are paired; the eigenvectors are (1, ω, ω², ..., ω^{n−1})^T and (1, ω^{n−1}, ω^{n−2}, ..., ω)^T (second and last columns of F*).

The inequality (4.3.12) goes by the name of the discrete inequality of Wirtinger.
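Theorem 4.3.1 is easy to test numerically: for any mean-zero Z the inequality (4.3.12) holds, with equality for the vertices of a regular n-gon (the case β = 0 of (4.3.13)). A Python sketch with our own sample data:

```python
import cmath, math

def wirtinger_sides(z):
    # Returns (sum of squared sides, 4 sin^2(pi/n) * sum of squared moduli).
    n = len(z)
    lhs = sum(abs(z[(k + 1) % n] - z[k]) ** 2 for k in range(n))
    rhs = 4 * math.sin(math.pi / n) ** 2 * sum(abs(zk) ** 2 for zk in z)
    return lhs, rhs

n = 7
w = cmath.exp(2j * cmath.pi / n)

# A mean-zero vector: subtract the centroid from arbitrary data.
raw = [complex(k * k, 3 - k) for k in range(n)]
mean = sum(raw) / n
z = [zk - mean for zk in raw]
lhs, rhs = wirtinger_sides(z)
print(lhs >= rhs - 1e-9)        # True: the inequality holds

# Equality case: z_k = alpha * w^(k-1), a regular n-gon.
zeq = [2.0 * w ** k for k in range(n)]
lhs, rhs = wirtinger_sides(zeq)
print(abs(lhs - rhs) < 1e-9)    # True: equality for the regular n-gon
```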
For upper bounds we must obtain

   λ_max = max_j 4 sin²((j−1)π/n).

For n = 2p, one has λ_max = 4, occurring when j = p + 1. For n = 2p + 1, one has λ_max = 4 sin²(pπ/n) = 4 cos²(π/2n), occurring doubled when j = p + 1, p + 2. This information may now be inserted in (4.3.5).
PROBLEMS
1. Let z_1, z_2, ..., z_n be complex numbers with Σ_{k=1}^n z_k = 0. For other integers k, define z_k cyclically. Let Δ designate the difference operator (Δz_k = z_{k+1} − z_k, Δ²z_k = Δ(Δz_k), etc.). Then, for all integers p ≥ 0, use (I − π)^p to prove that

   Σ_{k=1}^n |Δ^p z_k|² ≥ [4 sin²(π/n)]^p Σ_{k=1}^n |z_k|².
2. For real x_k, write the Wirtinger inequality in the form

   (2π/n) Σ_{k=1}^n x_k² ≤ [(2π/n)/(2 sin(π/n))]² (2π/n) Σ_{k=1}^n [(x_{k+1} − x_k)/(2π/n)]².

Use this, together with n → ∞, to prove that if f(t) has period 2π and ∫₀^{2π} f(t) dt = 0, then

   ∫₀^{2π} f²(t) dt ≤ ∫₀^{2π} (f′(t))² dt.

What integrability conditions on f(t) are required here? This is Wirtinger's integral inequality.
3. Let z_k, k = 1, 2, ..., n, be as in Problem 1. Prove that the z_k are the real affine images of the vertices of a regular n-gon (see p. 123 for "affine").

4. Let C be a circulant whose eigenvalues have equal moduli σ. Then, for all vectors Z, ||CZ|| = σ||Z||.

5. Prove that the field of values of any matrix is a convex set in the complex plane.

6. Prove that for any matrix, the convex hull of its eigenvalues is contained in the field of values.
4.4 NESTED n-GONS
(See Section 1.4.) Let Z = (z_1, z_2, ..., z_n)^T designate the vertices of an n-gon, and let the transformation C (= C_s) be applied iteratively, where

(4.4.1)   C = sI + tπ,   s > 0, t > 0, s + t = 1.

The eigenvalues of C are λ_k = s + tω^{k−1}, k = 1, 2, ..., n. These numbers are strictly convex combinations of 1 and ω^{k−1}. Hence λ_1 = 1 and, for k = 2, ..., n, one has |λ_k| < 1. See Figure 4.4.1. In fact, these numbers lie on a circle interior to and tangent to the unit circle at z = 1. One has

(4.4.2)   |λ_k|² = s² + t² + 2st cos(2π(k − 1)/n).

[Figure 4.4.1]
It is clear that the eigenvalues of absolute value next in size to λ_1 = 1 are λ_2 and λ_n (= λ̄_2), for which

(4.4.3)   |λ_2|² = |λ_n|² = s² + t² + 2st cos(2π/n).
From (3.4.14) one has, for r = 0, 1, ...,

(4.4.4)   C^rZ = B_1Z + λ_2^r B_2Z + ... + λ_n^r B_nZ;

hence

(4.4.4′)   lim_{r→∞} C^rZ = B_1Z.

Since from (3.4.13) B_1 = (1/n) circ(1, 1, ..., 1), B_1Z = (1/n)(z_1 + z_2 + ... + z_n)(1, 1, ..., 1)^T. Hence, as r → ∞, each component of C^rZ approaches the c.g. of Z with geometric rapidity. It is useful, therefore, to assume that this c.g. is at z = 0, eliminating the first term in (4.4.4). Thus we assume that

(4.4.5)   z_1 + z_2 + ... + z_n = 0.
Further asymptotic analysis may be carried out along the lines of the power method in numerical analysis for the computation of matrix eigenvalues. Write

(4.4.6)   C^rZ = λ_2^r B_2Z + λ_n^r B_nZ + (λ_3^r B_3 + ... + λ_{n−1}^r B_{n−1})Z.

Then, since |λ_n| = |λ_2|,

(4.4.7)   C^rZ/|λ_2|^r = (λ_2/|λ_2|)^r B_2Z + (λ_n/|λ_2|)^r B_nZ + E(r),

where E(r) = |λ_2|^{−r}(λ_3^r B_3 + ... + λ_{n−1}^r B_{n−1})Z. Now since |λ_3|, |λ_4|, ..., |λ_{n−1}| < |λ_2|, the term E(r) approaches 0 as r → ∞. (It is a column vector.) Let

(4.4.8)   λ_2 = |λ_2|e^{iθ},   θ = tan^{−1}[t sin(2π/n)/(s + t cos(2π/n))],

so that λ_n = |λ_2|e^{−iθ}. Therefore,

(4.4.9)   C^rZ/|λ_2|^r = e^{irθ}B_2Z + e^{−irθ}B_nZ + E(r).

Write

(4.4.10)   Y_r = e^{irθ}B_2Z + e^{−irθ}B_nZ,

so that

(4.4.11)   C^rZ/|λ_2|^r = Y_r + E(r).
Since from (3.4.9) B_k = F*Λ_kF, we have

   Y_r = e^{irθ}B_2Z + e^{−irθ}B_nZ = F*(e^{irθ}Λ_2 + e^{−irθ}Λ_n)FZ,

hence

   ||Y_r||² = |ẑ_2|² + |ẑ_n|² = constant (as far as r is concerned).

From this it follows immediately that if the second and nth components of FZ, the Fourier transform of Z, are not both zero, then the Y_r are a family of nonzero n-gons of constant moment of inertia.

In this case, then, the rate of convergence of C^rZ is precisely |λ_2|^r, r → ∞. Notice from (4.4.3) or Figure 4.4.1 that as n → ∞, λ_2 → 1, so that the more vertices in the n-gon, the slower the convergence.
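The geometric decay of C^rZ toward the origin (once the c.g. is placed there) at rate |λ_2|^r is easy to observe. A Python sketch; the polygon, the weights s = t = 1/2, and the iteration count are our own choices:

```python
import cmath

def smooth_step(z, s, t):
    # One application of C = s*I + t*pi: z_k -> s*z_k + t*z_{k+1} (cyclic).
    n = len(z)
    return [s * z[k] + t * z[(k + 1) % n] for k in range(n)]

z = [1 + 1j, -2 + 0.5j, 0.3 - 2j, 2 + 0j, -1.3 + 0.5j]
n = len(z)
z = [zk - sum(z) / n for zk in z]        # put the c.g. at the origin (4.4.5)

s, t = 0.5, 0.5
lam2 = abs(s + t * cmath.exp(2j * cmath.pi / n))   # |lambda_2|, the decay rate

zr = z
for _ in range(40):
    zr = smooth_step(zr, s, t)

size = max(abs(zk) for zk in zr)
# After r = 40 steps the polygon has shrunk on the order of |lambda_2|^40.
print(size < lam2 ** 40 * 100)           # True
```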
The sequence of n-gons C^rZ/|λ_2|^r will be called normalized, and the normalized n-gons "approach" the family Y_r. It is of some interest to look at the geometric nature of Y_r.

Lemma. Let Z = (z_1, z_2, ..., z_n)^T. Let

(4.4.12)   p_Z(u) = z_1 + z_2u + z_3u² + ··· + z_nu^{n−1}.

For r = 1, 2, ..., n, let

(4.4.13)   y_r = (1, ω^{r−1}, ω^{2(r−1)}, ..., ω^{(n−1)(r−1)})^T.

Then

(4.4.14)   B_rZ = (1/n) p_Z(ω̄^{r−1}) y_r.

In particular,

(4.4.15)   B_2Z = (1/n) p_Z(ω̄)(1, ω, ω², ..., ω^{n−1})^T,

(4.4.16)   B_nZ = (1/n) p_Z(ω)(1, ω̄, ω̄², ..., ω̄^{n−1})^T.

Proof. From (3.4.12), B_r = (1/n) circ(1, ω̄^k, ω̄^{2k}, ..., ω̄^{(n−1)k}), k = r − 1. Hence each row of B_r is the previous row multiplied by ω^k. The identities should now be obvious.
Lemma. Let z = x + iy, z′ = x′ + iy′, and let τ_1, τ_2 be complex. Then

(4.4.17)   z′ = τ_1z + τ_2z̄

is an affine transformation of the (x, y)-plane. It is nonsingular if and only if |τ_1| ≠ |τ_2|.

Proof. Write τ_1 = ξ_1 + iη_1, τ_2 = ξ_2 + iη_2, where the ξ's and η's are real. Then the transformation (4.4.17) can be written as

(4.4.18)   x′ = (ξ_1 + ξ_2)x + (η_2 − η_1)y,
           y′ = (η_1 + η_2)x + (ξ_1 − ξ_2)y.

This is an affine transformation of the x, y plane. The determinant Δ of the transformation is

   Δ = ξ_1² − ξ_2² + η_1² − η_2² = |τ_1|² − |τ_2|²,

so that Δ ≠ 0 if and only if |τ_1| ≠ |τ_2|.

Theorem 4.4.1. If |ẑ_2| ≠ |ẑ_n|, the n-gons Y_r are nonzero and of constant moment of inertia. They are the affine images of the regular unit polygon of n sides, hence are convex.
Proof. We have

   Y_r = e^{irθ}B_2Z + e^{−irθ}B_nZ.

Hence if we write τ_1 = (1/n)e^{irθ}p_Z(ω̄), τ_2 = (1/n)e^{−irθ}p_Z(ω), the vertices of Y_r are the images of 1, ω, ω², ..., ω^{n−1} under z′ = τ_1z + τ_2z̄. Since p_Z(ω̄) = √n ẑ_2 and p_Z(ω) = √n ẑ_n, it follows that |τ_1| ≠ |τ_2|. This is a nonsingular affine transformation, and all such transformations send convex figures into convex figures.
For further analysis, one makes the assumption that θ is a rational multiple of 2π. In this case, one can identify limits of subsequences of the normalized figures C^rZ/|λ_2|^r, r = 0, 1, 2, ... .

Instead of working generally, we shall assume that s = t = 1/2. This leads immediately to

(4.4.20)   |λ_2| = cos(π/n),   θ = π/n,

so that (4.4.9) becomes

(4.4.21)   C^rZ/(cos π/n)^r = e^{irπ/n}B_2Z + e^{−irπ/n}B_nZ + E(r).

Let now

(4.4.22)   r = 2jn + b,   0 ≤ b ≤ 2n − 1,   j = 0, 1, ... .
Then (4.4.21) becomes

(4.4.23)   C^{2jn+b}Z/(cos π/n)^{2jn+b} = e^{iπb/n}B_2Z + e^{−iπb/n}B_nZ + E(2jn + b).

Writing

(4.4.24)   U_b = e^{iπb/n}B_2Z + e^{−iπb/n}B_nZ,

one now has

(4.4.25)   lim_{j→∞} C^{2jn+b}Z/(cos π/n)^{2jn+b} = U_b,   b = 0, 1, 2, ..., 2n − 1,

so that the normalized n-gons approach 2n limiting n-gons, each of which is an affine transform of a regular n-gon. See Figure 4.4.2.
PROBLEMS
1. Prove that if |ẑ_2| ≠ |ẑ_n|, the corresponding normalized vertices of the nested n-gons C^rZ, r = 0, 1, 2, ..., lie asymptotically on an ellipse.

2. Analyze what happens when Z is taken as the vertices of a regular polygon.

3. Take Z = (1, ω², ω⁴, ω, ω³)^T, ω⁵ = 1 (a regular pentagram). What happens under C_{1/2}? Do the successive iterates ever become convex?

4. Analyze what happens when Z is taken as the affine image of a regular polygon.

5. Let C_r = circ(1/r, 1 − (1/r), 0, 0, ..., 0), r = 1, 2, 3, ... . Discuss Π_{r=1}^∞ C_r, and apply it to nested n-gons.
[Figure 4.4.2]
4.5 SMOOTHING AND VARIATION REDUCTION
The smoothing or filtering of data is a common operation and is worthy of discussion within the present framework. We assume that we have a finite sequence of data values Z = (z_1, ..., z_n)^T and we subject the data to a linear transformation with matrix A:

(4.5.1)   Ẑ = AZ.
What properties of the matrix A will be required for smoothing? Numerous definitions have been put forward. Greville has proposed the following. A matrix A will be called smoothing if:
(1) A has λ = 1 as an eigenvalue;

(2) A^∞ = lim_{r→∞} A^r exists.
The rationale behind this definition is as follows. The eigenspace S of vectors corresponding to λ = 1 has the property that if z ∈ S, then Az = z. Call S the set of smooth vectors. Then vectors that are already smooth are unaffected by the operation A. Now take any vector Z and "smooth" it over and over again by applying A. This will approach A^∞Z. Now since A(A^∞Z) = A^∞Z, A^∞Z ∈ S; hence it is a smooth vector.

Referring to Theorem 3.6.2, we see that the necessary and sufficient condition for A to be smoothing in the sense of Greville is that:

(1) λ = 1 be an eigenvalue of A;

(2) λ = 1 be a simple root of the minimal polynomial of A, and if λ ≠ 1 is an eigenvalue, then |λ| < 1.

If A is a circulant, then the criterion simplifies somewhat.
Theorem 4.5.1. A circulant C is a smoothing operator if and only if

(1) λ = 1 is an eigenvalue of C;

(2) if λ ≠ 1 is an eigenvalue of C, then |λ| < 1.
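Theorem 4.5.1 reduces the smoothing question for a circulant to an inspection of its eigenvalues λ_j = p(ω^{j−1}). A Python sketch (the helper names and the tolerance are our own):

```python
import cmath

def eigenvalues(c):
    # lambda_j = c_1 + c_2 w^(j-1) + ... + c_n w^((j-1)(n-1)), w = exp(2 pi i/n).
    n = len(c)
    w = cmath.exp(2j * cmath.pi / n)
    return [sum(c[k] * w ** (j * k) for k in range(n)) for j in range(n)]

def is_smoothing_circulant(c, tol=1e-12):
    # Theorem 4.5.1: lambda = 1 occurs, and every eigenvalue different
    # from 1 satisfies |lambda| < 1.
    eigs = eigenvalues(c)
    has_one = any(abs(l - 1) <= tol for l in eigs)
    rest_inside = all(abs(l - 1) <= tol or abs(l) < 1 - tol for l in eigs)
    return has_one and rest_inside

print(is_smoothing_circulant([0.5, 0.5, 0.0, 0.0]))   # C = (I + pi)/2: True
print(is_smoothing_circulant([0.0, 1.0, 0.0, 0.0]))   # C = pi: False
```

The first example is the transform (6) of Section 3.5 with s = t = 1/2; the fundamental circulant π fails because its eigenvalues other than 1 lie on the unit circle.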
Since A*A is Hermitian and positive semidefinite, we have, for real μ_k ≥ 0, D = diag(μ_1, ..., μ_n), and unitary U, A*A = U*DU. Hence ηI − A*A = U*(ηI − D)U. So the eigenvalues of ηI − A*A are η − μ_k. Thus 0 ≤ μ_k ≤ η is necessary and sufficient.

Corollary. ||AZ|| ≤ η||Z|| for all Z if and only if ρ(A*A) ≤ η².

If 0 ≤ η ≤ 1, condition (4.5.6) may be described by saying that A is norm reducing (more strictly: norm nonincreasing). If 0 ≤ η < 1, A is a contraction. [A contraction generally means that (4.5.6) is valid with 0 ≤ η < 1, where || · || can be taken to be any vector norm.]
Lemma. Let M_k, k = 1, 2, ..., be a sequence of matrices. Then

(a) lim_{k→∞} M_kZ = 0 for all Z,

if and only if

(b) lim_{k→∞} M_k = 0.

Proof. Using a compatible matrix norm || · ||, one has ||M_kZ|| ≤ ||M_k|| ||Z||. Now lim_{k→∞} M_k = 0 if and only if lim_{k→∞} ||M_k|| = 0. Hence (b) → (a). Conversely, (b) follows from (a) if, in (a), one selects Z successively as all the unit vectors.
Theorem 4.5.2. Let M_k, k = 1, 2, ..., be a sequence of matrices and set σ_k = ρ(M_k*M_k), the spectral radius of M_k*M_k. Let

(4.5.8)   lim_{r→∞} Π_{k=1}^r σ_k = 0.

Then

(4.5.9)   lim_{r→∞} (M_rM_{r−1} ··· M_1)Z = 0 for all Z;

hence

(4.5.10)   Π_{k=1}^∞ M_k = 0.

Proof. From the previous corollary,

   ||M_rM_{r−1} ··· M_1Z|| ≤ (σ_rσ_{r−1} ··· σ_1)^{1/2} ||Z|| → 0.
If we wish to obtain a condition such as (4.5.7) or (4.5.8) directly on the eigenvalues of M (and not on those of M*M), it is convenient to hypothesize that M is normal. For in this case M = U*diag(λ_1, ..., λ_n)U, so that M*M = U*diag(λ̄_1λ_1, λ̄_2λ_2, ..., λ̄_nλ_n)U, and the eigenvalues of M*M are precisely |λ_1|², |λ_2|², ..., |λ_n|². In this way we are led to our next result.
Theorem 4.5.3. Let M_k, k = 1, 2, ..., be a sequence of normal matrices. Assume that

(4.5.11)   Π_{k=1}^∞ ρ(M_k) = 0.

Then

(4.5.12)   Π_{k=1}^∞ M_k = 0.

In the case of a sequence of circulants, see the corollary to Theorem 3.6.1 for a stronger statement.
We return now to the inequality ||AZ||² ≤ ||BZ||². We have already seen that a necessary and sufficient condition for this is that B*B − A*A be positive semidefinite. We should like to be able to "decouple" the matrices A and B. To this end, we make the hypothesis that A and B are normal and commute. (Recall that this means that A*A = AA*, B*B = BB*, AB = BA.) Such pairs of matrices are remarkable in that they are simultaneously unitarily diagonalizable. We shall now prove this basic fact.
Theorem 4.5.4. Let A and B be square matrices of the same order. Then A and B are normal and commute if and only if they are simultaneously diagonalizable by one and the same unitary matrix.
Proof. "If." Let A = U*D_1U, B = U*D_2U, where U is unitary and D_1, D_2 are diagonal. Then A*A = U*D̄_1UU*D_1U = U*D̄_1D_1U = U*D_1D̄_1U = AA*, so that A is normal. Similarly for B. Now AB = U*D_1UU*D_2U = U*D_1D_2U = U*D_2D_1U = BA.
"Only if." Assume that A, B are normal and commute. Since A is normal, we have for some unitary U and diagonal D, A = U*DU. Since AB = BA, we have U*DUB = BU*DU. Hence D (UBU*) = (UBU*)D. Set C = UBU*. Hence B = U*CU. Then DC = CD. Write
D = diag(μ_1 I_{a_1}, μ_2 I_{a_2}, ..., μ_s I_{a_s}),

where μ_1, μ_2, ..., μ_s are distinct and where μ_1 is repeated a_1 times, ..., μ_s is repeated a_s times, a_1 + a_2 + ··· + a_s = n. This displays the possible multiplicities of the eigenvalues of A. If now C = (c_{jk}), then DC = CD implies

μ_j c_{jk} = μ_k c_{jk},  j, k = 1, 2, ..., n.

Therefore

if μ_j ≠ μ_k, then c_{jk} = 0,

if μ_j = μ_k, then c_{jk} is arbitrary.
Therefore C must be of the form C = C_1 ⊕ C_2 ⊕ ··· ⊕ C_s, where C_r is of order a_r and is arbitrary. Since B is normal, so is C. Since C is normal, so is each C_k, k = 1, 2, ..., s (as is easily established). Hence for appropriate unitary V_k and diagonal Λ_k of order a_k, we have C_k = V_k*Λ_kV_k. Thus,

B = U*CU = U*V*ΛVU,

where

V = V_1 ⊕ V_2 ⊕ ··· ⊕ V_s,
Λ = Λ_1 ⊕ Λ_2 ⊕ ··· ⊕ Λ_s.
Now

A = U*DU = U*(μ_1V_1*V_1 ⊕ μ_2V_2*V_2 ⊕ ··· ⊕ μ_sV_s*V_s)U
  = U*(V_1* ⊕ V_2* ⊕ ··· ⊕ V_s*)(μ_1I_{a_1} ⊕ ··· ⊕ μ_sI_{a_s})(V_1 ⊕ V_2 ⊕ ··· ⊕ V_s)U
  = U*V*DVU.
Therefore VU diagonalizes A and B. It is easily verified that VU is unitary.
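For circulants the theorem can be seen concretely: any two circulants are normal and commute, and the single unitary may be taken to be the Fourier matrix. A numerical sketch (the helper names circ and fourier are ours):

```python
import numpy as np

def circ(c):
    # ordinary circulant: each row is the previous row shifted one place right
    return np.array([np.roll(c, k) for k in range(len(c))])

def fourier(n):
    w = np.exp(-2j * np.pi / n)     # one common convention; the sign is immaterial here
    j = np.arange(n)
    return w ** np.outer(j, j) / np.sqrt(n)

n = 5
A = circ(np.array([1.0, 2.0, 0.0, -1.0, 3.0]))
B = circ(np.array([0.5, 0.0, 1.0, 0.0, 0.0]))
F = fourier(n)

print(np.allclose(A @ A.conj().T, A.conj().T @ A))   # True: A is normal
print(np.allclose(A @ B, B @ A))                     # True: A and B commute

DA = F @ A @ F.conj().T                              # both become diagonal
DB = F @ B @ F.conj().T
print(np.allclose(DA, np.diag(np.diag(DA))))         # True
print(np.allclose(DB, np.diag(np.diag(DB))))         # True
```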
Theorem 4.5.5. Let A and B be normal and commute. Then ||AZ|| ≤ ||BZ|| for all Z if and only if there is an ordering of the eigenvalues of A and B,

λ_1, λ_2, ..., λ_n;  μ_1, μ_2, ..., μ_n

(under a simultaneous diagonalization) such that

(4.5.13)  |λ_k| ≤ |μ_k|,  k = 1, 2, ..., n.
Proof. Let A and B be normal and commute. Then we can find a unitary U such that A = U*diag(λ_1, ..., λ_n)U, B = U*diag(μ_1, ..., μ_n)U. Hence B*B − A*A = U*diag(|μ_1|² − |λ_1|², |μ_2|² − |λ_2|², ..., |μ_n|² − |λ_n|²)U. Condition (4.5.13) is now equivalent to the positive semidefiniteness of B*B − A*A.
Corollary. If A and B are circulants, then (4.5.13) is necessary and sufficient for ||AZ|| ≤ ||BZ|| for all Z.
Proof. Circulants are normal and commute. 
In dealing with pairs of matrices that are normal and commute, it is useful to assume that their eigenvalues have been ordered so as to be consistent with the simultaneous diagonalization by a unitary U.
Let M be a square matrix. We shall call a matrix A M-reducing if

(4.5.14)  ||MAZ|| ≤ ||MZ||  for all Z.
Theorem 4.5.6

(a) A is M-reducing if and only if M*M − (MA)*(MA) is positive semidefinite.

(b) Let A and M be normal and commute. Let λ_1, ..., λ_n; μ_1, ..., μ_n be the eigenvalues of A and M. Let J_M be the set of integers r = 1, 2, ..., n for which μ_r ≠ 0. Then a necessary and sufficient condition that A be M-reducing is that

(4.5.15)  |λ_k| ≤ 1  for k ∈ J_M.

Proof. Under the hypothesis, there is a unitary U such that A = U*diag(λ_1, ..., λ_n)U, M = U*diag(μ_1, ..., μ_n)U. Therefore T = M*M − (MA)*(MA) = U*diag(μ̄_kμ_k − λ̄_kλ_kμ̄_kμ_k)U. Hence the condition for positive semidefiniteness of T is |μ_k|²(1 − |λ_k|²) ≥ 0, k = 1, 2, ..., n. This is equivalent to (4.5.15).
Corollary. A is variation reducing [see (4.5.5)] if and only if (I − π)*(I − π) − ((I − π)A)*((I − π)A) is positive semidefinite.

Proof. Set M = I − π.
Corollary. Let A be a circulant with eigenvalues λ_1, ..., λ_n. Then a necessary and sufficient condition that A be variation reducing is that

|λ_k| ≤ 1,  k = 2, 3, ..., n.

Proof. The eigenvalues of M = I − π are 1 − ω^{j−1}, j = 1, ..., n. Hence J_M = {2, 3, ..., n}.
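The corollary lends itself to a quick numerical check. A sketch (circ is our helper; np.fft.fft supplies the circulant eigenvalues in some order): note that the first eigenvalue, the row sum, is unconstrained.

```python
import numpy as np

def circ(c):
    return np.array([np.roll(c, k) for k in range(len(c))])

a = np.array([1.0, 0.4, 0.2, 0.4])      # row sum 2.0: lambda_1 = 2 is permitted
A = circ(a)
lam = np.fft.fft(a)                     # eigenvalues of A, in some order
print(np.abs(lam))                      # one eigenvalue is 2; the others are <= 1

# direct test of variation reduction: M*M - (MA)*(MA) positive semidefinite,
# where M = I - pi and pi is the basic cyclic permutation
n = len(a)
pi = circ(np.eye(n)[1])
M = np.eye(n) - pi
T = M.T @ M - (M @ A).T @ (M @ A)
print(np.linalg.eigvalsh(T).min() >= -1e-10)   # True: A is variation reducing
```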
PROBLEM
1. Consider the nonautonomous system of difference equations Z_{n+1} = G_nZ_n. Show that the matrices G_n may be chosen with ρ(G_n) < 1 for every n, while the sequence Z_n diverges. (Markus-Yamabe, discretized.)
4.6 APPLICATIONS TO ELEMENTARY PLANE GEOMETRY: n-GONS AND K_r-GRAMS
We begin with two theorems from elementary plane geometry.
Theorem A. Let z_1, z_2, z_3, z_4 be the vertices of a quadrilateral. Connect the midpoints of the sides cyclically. Then the figure that results is always a parallelogram (Figure 4.6.1). Write P = (z_1, z_2, z_3, z_4)^T and C_{1/2} = circ(1/2, 1/2, 0, 0). This means that C_{1/2}P is always a parallelogram. Hence the transformation C_{1/2} is not invertible. (For if it were, there would be quadrilaterals whose midpoint quadrilaterals would be arbitrary.)

Figure 4.6.1
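Theorem A can be replayed in coordinates in a few lines (a sketch; circ is our helper): the midpoint quadrilateral satisfies the parallelogram identity z_1 − z_2 + z_3 − z_4 = 0 (established below), and C_{1/2} is singular.

```python
import numpy as np

def circ(c):
    return np.array([np.roll(c, k) for k in range(len(c))])

C_half = circ(np.array([0.5, 0.5, 0.0, 0.0]))
P = np.array([0 + 0j, 4 + 1j, 5 + 6j, -1 + 3j])   # an arbitrary quadrilateral
M = C_half @ P                                    # the midpoint quadrilateral
print(abs(M[0] - M[1] + M[2] - M[3]))             # ~0: a parallelogram
print(abs(np.linalg.det(C_half)))                 # ~0: C_{1/2} is not invertible
```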
Theorem B. Given any triangle, erect upon its sides outwardly (or inwardly) equilateral triangles. Then the centers of the three equilateral triangles form an equilateral triangle (see Figure 4.6.2). This is known as Napoleon's theorem.
Figure 4.6.2
Our object is now to unify and generalize these two theorems by means of circulant transforms and to derive extremal properties of certain familiar geometrical configurations by means of the MP inverses of relevant circulants.
Let us first find simple characterizations for equilateral triangles and parallelograms. Let z_1, z_2, z_3 be the vertices of a triangle T in counterclockwise order. Then T is equilateral if and only if

(4.6.1a)  z_1 + ωz_2 + ω²z_3 = 0,  ω = exp(2πi/3),

while

(4.6.1b)  z_1 + ω²z_2 + ωz_3 = 0

is necessary and sufficient for clockwise equilaterality. The proof is easily derived from the fact that z_1, z_2, z_3 are counterclockwise equilateral if and only if they are the images under z → a + bz of 1, ω, ω²; that is, if and only if for some a, b, z_1 = a + b, z_2 = a + bω, z_3 = a + bω². Of course, if b = 0, the three points degenerate to a single point. The center of the triangle is defined to be z = a = c.g.(z_1, z_2, z_3).
Let z_1, z_2, z_3, z_4 be a non-self-intersecting quadrilateral Q given counterclockwise. Then Q is a parallelogram if and only if

(4.6.2)  z_1 − z_2 + z_3 − z_4 = 0.

This is readily established.

For integer n ≥ 3 and integer r set ω = exp(2πi/n) and set

(4.6.3)  K_r = (1/n) circ(1, ω^r, ω^{2r}, ..., ω^{(n−1)r}).

Notice that the rows of K_r are identical to the first row (1/n)(1, ω^r, ..., ω^{(n−1)r}), each multiplied by some power of ω. In particular, one has
(4.6.4)  n = 3, r = 1:  K_1 = (1/3) circ(1, ω, ω²),  ω = exp(2πi/3),
(4.6.5)  n = 4, r = 2:  K_2 = (1/4) circ(1, −1, 1, −1),  ω = exp(2πi/4) = i.
We see from (4.6.1) and (4.6.2) that P is equilateral or a parallelogram (interpreted properly) if and only if K_rP = 0, that is, if and only if P lies in the nullspace of K_r. This leads to the definition
Definition. An n-gon P = (z_1, z_2, ..., z_n)^T will be called a K_r-gram if and only if

(4.6.6a)  K_rP = 0,

or equivalently if and only if

(4.6.6b)  z_1 + ω^rz_2 + ω^{2r}z_3 + ··· + ω^{(n−1)r}z_n = 0.
The representer polynomial for K_r is p(z) = (1/n)(1 + ω^rz + ω^{2r}z² + ··· + ω^{(n−1)r}z^{n−1}) = (1/n)((ω^rz)^n − 1)/(ω^rz − 1). The eigenvalues of K_r are p(ω^{j−1}), j = 1, 2, ..., n. Now for j − 1 ≠ n − r, p(ω^{j−1}) = 0, while p(ω^{n−r}) = 1. Thus if

(4.6.7)  j = n − r + 1,

then K_r = F*diag(0, 0, ..., 0, 1, 0, ..., 0)F, the 1 occurring in the jth position. This means that

(4.6.8)  K_r = F*Λ_jF = B_j  [see (3.4.9)].
The B_j are the principal idempotents of all circulants of order n. We have [see after (3.4.10)]

B_1 + B_2 + ··· + B_n = I,  B_jB_k = 0 (j ≠ k),  B_j² = B_j = B_j*.

If C is a circulant of rank n − 1, then by (3.3.13), for some integer j, 1 ≤ j ≤ n, the jth eigenvalue of C vanishes and, with r = n − j + 1,

(4.6.9)  C†C = CC† = I − K_r.

From (4.6.8), (4.6.9), and Section 2.8.2, properties (1) and (2),

(4.6.10)  K_rC = CK_r = K_rC† = C†K_r = 0.
Several more identities will be of use. Again, let K_r = (1/n) circ(1, ω^r, ω^{2r}, ..., ω^{(n−1)r}). Let Y be an arbitrary circulant so that one can write Y = F* diag(η_1, η_2, ..., η_n)F for appropriate η_j. Now K_rY = (F*Λ_jF)(F* diag(η_1, ..., η_n)F) = F* diag(0, ..., 0, η_j, 0, ..., 0)F = η_jF*Λ_jF = η_jK_r. Thus

(4.6.11)  K_rY = η_jK_r.

In particular, if Y is merely a column vector Y = (y_0, y_1, ..., y_{n−1})^T, then

(4.6.12)  K_rY = η_j fc(K_r),

where the notation fc(K_r) designates the first column of K_r, and where η_j is determined from circ(y_0, ..., y_{n−1}) = F* diag(η_1, ..., η_n)F. One also has

(4.6.13)  K_rY = σ(1/n)(1, ω^{(n−1)r}, ω^{(n−2)r}, ..., ω^r)^T,

where

(4.6.14)  σ = y_0 + y_1ω^r + ··· + y_{n−1}ω^{(n−1)r}.

Let Y be further specialized to Y = fc(K_r). Then Y = (1/n)(1, ω^{(n−1)r}, ω^{(n−2)r}, ..., ω^r)^T. Therefore from (4.6.14), σ = 1, and from (4.6.13)

(4.6.15)  K_r fc(K_r) = fc(K_r).
Each circulant C of rank n − 1 determines an integer j uniquely, and through (3.3.13) and (4.6.9) a matrix K_r, hence a class of K_r-grams. In the following theorems this determination will be assumed.
Theorem 4.6.1. Let P be an n-gon. Then there exists an n-gon P̂ such that CP̂ = P if and only if P is a K_r-gram.
Proof. The system of equations CP̂ = P has a solution if and only if P = CC†P. This is equivalent to P = (I − K_r)P = P − K_rP, or K_rP = 0 [by (4.6.9)].
Corollary. Let P be a K_r-gram. Then the general solution to CP̂ = P is given by

(4.6.16)  P̂ = C†P + τ fc(K_r)

for an arbitrary constant τ.

Proof. If P is a K_r-gram, then the general solution to CP̂ = P is given by P̂ = C†P + (I − C†C)Y = C†P + K_rY for an arbitrary column vector Y. From (4.6.12), K_rY = η_j fc(K_r), and the statement follows.
Corollary. P is a K_r-gram if and only if there is an n-gon Q such that P = CQ.

Proof. Let P = CQ. Then K_rP = K_rCQ. Since K_rC = 0, it follows that K_rP = 0, so that P is a K_r-gram. Conversely, let P be a K_r-gram. Now take for Q any P̂ whose existence is guaranteed by the previous corollary.
Corollary. Given an n-gon P which is a K_r-gram. Then, given an arbitrary complex number ẑ_1, we can find a unique n-gon P̂ = (ẑ_1, ẑ_2, ..., ẑ_n)^T, with ẑ_1 as its first vertex and such that CP̂ = P.

Proof. Since the general solution of CP̂ = P is P̂ = C†P + τ fc(K_r), given ẑ_1, we may solve uniquely for an appropriate τ, since the first component of fc(K_r) is 1/n (≠ 0).
Theorem 4.6.2. Let P be an n-gon which is a K_r-gram. Then there is a unique n-gon Q which is a K_r-gram and such that CQ = P. It is given by Q = C†P.

Proof. (a) Since P is a K_r-gram, it has the form P = CR for some R. Hence Q = C†P = C†CR = CC†R = C(C†R), so that, by the preceding corollary, Q is a K_r-gram.

(b) Q is a solution of CQ = P, as we can see by selecting τ = 0 in the above.

(c) All solutions are of the form P̂ = C†P + τ fc(K_r). Now P̂ is a K_r-gram if and only if K_rP̂ = 0, that is, if and only if K_rC†P + τK_r fc(K_r) = 0. Now K_rC† = 0. But K_r fc(K_r) = fc(K_r) ≠ 0. Therefore τ = 0.
Theorem 4.6.3. Let P be a K_r-gram. Among the infinitely many n-gons R for which CR = P, there is a unique one of minimum norm ||R||. It is given by R = C†P. Hence it coincides with the unique K_r-gram Q such that CQ = P.

Proof. Use the last theorem and the least-squares characterization of the MP inverse.
Suppose now that P is a general n-gon and we wish to approximate it by a K_r-gram R such that ||P − R|| = minimum. Every K_r-gram can be written as R = CQ for some n-gon Q, so that our problem is: given P, find a Q such that ||P − CQ|| = minimum. This problem has a solution, and the solution is unique if and only if the columns of C are linearly independent. This is not the case (the rank of C being n − 1); hence Q = C†P is the solution with minimum ||Q||. Thus, R = CQ = CC†P is the best approximation of the n-gon P by a K_r-gram with minimum ||Q||. We phrase this as follows.
Theorem 4.6.4. Given a general n-gon P = (z_1, ..., z_n)^T. The unique K_r-gram R = CQ for which ||P − R|| = minimum and ||Q|| = minimum is given by

(4.6.17)  R = CC†P = (I − K_r)P = P − K_rP
         = P − σ(1/n)(1, ω^{(n−1)r}, ω^{(n−2)r}, ..., ω^r)^T,

where σ = z_1 + z_2ω^r + ··· + z_nω^{(n−1)r}. Alternatively, this can be written as

R = P − η_j fc(K_r),

where η_j is determined from

circ(z_1, z_2, ..., z_n) = F* diag(η_1, η_2, ..., η_n)F.
Proof. As before, R = CC†P = (I − K_r)P = P − K_rP. By (4.6.12), K_rP = η_j fc(K_r). Notice that R is a K_r-gram because K_rR = K_r(P − η_j fc(K_r)) = K_rP − η_jK_r fc(K_r). Since by (4.6.15) K_r fc(K_r) = fc(K_r), K_rR = 0.
Notice also that if P is already a K_r-gram, σ = z_1 + z_2ω^r + ··· + z_nω^{(n−1)r} = 0. In this case, from (4.6.17), R = P; so, as expected, P is its own best approximation.
Generally, of course, the operation R(P) = CC†P is a projection onto the row or column space of C.
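For n = 3, r = 1 the theorem reads: the best equilateral approximation of a triangle P is R = (I − K_1)P. A numerical sketch (helper names are ours):

```python
import numpy as np

def circ(c):
    return np.array([np.roll(c, k) for k in range(len(c))])

w = np.exp(2j * np.pi / 3)
K1 = circ(np.array([1, w, w**2])) / 3            # Hermitian idempotent projector
P = np.array([0 + 0j, 3 + 0j, 1 + 1j])           # an arbitrary triangle
R = P - K1 @ P                                   # (I - K_1)P

print(abs(R[0] + w * R[1] + w**2 * R[2]))        # ~0: R is a K_1-gram (equilateral)
print(np.allclose(K1 @ K1, K1), np.allclose(K1, K1.conj().T))
```

Since I − K_1 is an orthogonal projection, R is automatically the closest point of the K_1-gram subspace to P.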
4.7 THE SPECIAL CASE: circ(s, t, 0, ..., 0)

An interesting class of cyclic transformations comes about from circ(s, t, 0, 0, ..., 0) of order n, where one assumes that s + t = 1, st ≠ 0, and that the rank is n − 1. Write

C_s = circ(s, 1 − s, 0, ..., 0).

The representer polynomial is p(z) = s + (1 − s)z, so that the eigenvalues of C_s are p(ω^k) = s + (1 − s)ω^k, k = 0, 1, ..., n − 1. Suppose that for a fixed j, 0 ≤ j ≤ n − 1, s + (1 − s)ω^j = 0. Thus, there will be a zero eigenvalue if and only if s = ω^j/(ω^j − 1), t = 1/(1 − ω^j). For such s, C_s can have no more than one zero eigenvalue, since s + (1 − s)ω^k = s + (1 − s)ω^j = 0 implies that ω^k = ω^j, or k = j. Thus we have
Theorem 4.7.1. The circulant C_s has rank n − 1 if and only if for some integer j, 0 < j ≤ n − 1,

(4.7.2)  s = ω^j/(ω^j − 1),  t = 1/(1 − ω^j).

In this case,

(4.7.3)  C_s†C_s = C_sC_s† = I − K_{n−j}.

If s is real, then C_s has rank n − 1 if and only if n is even and s = t = 1/2.

Proof. The (j + 1)st eigenvalue of C_s is zero. Hence (4.7.3) follows by (4.6.7), (4.6.9). If s is real, so is t = 1/(1 − ω^j), and hence ω^j is real. Since j = 0 is impossible (it would make s infinite), ω^j = −1. This can happen if and only if n is even. From (4.7.2), s = t = 1/2.
If s is real, the transformation induced by C_s is interesting visually, because the vertices of P̂ = C_sP lie on the sides (possibly extended) of P. Moreover, if s and t are limited by

0 < s < 1,  0 < t < 1,

that is, a convex combination, then P̂ is obtained from P in a simple manner: the vertices of P̂ divide the sides of P internally in the ratio s : 1 − s. (Cf. Section 1.2.)
If s and t are complex, we shall point out a geometric interpretation subsequently.
As seen, if n is even and s is real, then C_s is singular if and only if s = t = 1/2. In all other real cases, the circulant C_s is nonsingular, and hence, given an arbitrary n-gon P, it will have a unique preimage P̂ under C_s: C_sP̂ = P.
Example. Let n = 4, s = t = 1/2. If Q is any quadrilateral, then C_{1/2}Q is obtained from Q by joining successively the midpoints of the sides of Q. It is therefore a parallelogram. Hence, if one starts with a quadrilateral Q which is not a parallelogram, it can have no preimage under C_{1/2}.
Since in such a case the system of equations can be "solved" by the application of a generalized inverse, we seek a geometric interpretation of this process.
4.8 ELEMENTARY GEOMETRY AND THE MOORE-PENROSE INVERSE

Select n even, s = t = 1/2. Then C_s = circ(1/2, 1/2, 0, ..., 0). For simplicity designate C_{1/2} by D:

(4.8.1)  D = circ(1/2, 1/2, 0, ..., 0).

This corresponds to j = n/2 in (4.7.2). Hence by (4.7.3)

(4.8.2)  DD† = D†D = I − K_{n/2},

where by (4.6.3)

(4.8.3)  K_{n/2} = (1/n) circ(1, −1, 1, −1, ..., 1, −1).

For simplicity we write K = K_{n/2}.
It is of some interest to have the explicit expression for D†.

Theorem. Let D = circ(1/2, 1/2, 0, 0, ..., 0) be of order n, where n is even. Let

(4.8.4)  E = (1/n) circ(n − 1, −(n − 3), n − 5, ..., ±1, ±1, ..., n − 5, −(n − 3), n − 1),

the signs alternating from each end and the two middle entries carrying the sign (−1)^{(n/2)−1}. Then E = D†.

As particular instances note:

n = 4:  D† = (1/4) circ(3, −1, −1, 3),
n = 6:  D† = (1/6) circ(5, −3, 1, 1, −3, 5).
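The signs in (4.8.4) were reconstructed from a damaged printing; they can be checked against numpy's Moore-Penrose inverse (a sketch; circ is our helper):

```python
import numpy as np

def circ(c):
    return np.array([np.roll(c, k) for k in range(len(c))])

D4 = circ(np.array([0.5, 0.5, 0.0, 0.0]))
E4 = circ(np.array([3.0, -1.0, -1.0, 3.0])) / 4
print(np.allclose(np.linalg.pinv(D4), E4))        # True

D6 = circ(np.array([0.5, 0.5, 0.0, 0.0, 0.0, 0.0]))
E6 = circ(np.array([5.0, -3.0, 1.0, 1.0, -3.0, 5.0])) / 6
print(np.allclose(np.linalg.pinv(D6), E6))        # True
```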
Proof. (a) A simple computation shows that

DE = (1/n) circ(n − 1, 1, −1, 1, ..., −1, 1) = I − K.

Hence DED = (I − K)D = D − KD = D, since by (4.6.10) (or by a direct computation) KD = 0.

(b) On the other hand, EDE = DEE = (I − K)E = E − KE. An equally simple computation shows that KE = 0. Hence EDE = E. Thus by (2.8.2), properties (1)-(4), E = D†.
From (4.6.6b) or (4.6.6a), in the case under study, a K-gram is an n-gon whose vertices z_1, ..., z_n satisfy

(4.8.5)  z_1 − z_2 + z_3 − z_4 + ··· + z_{n−1} − z_n = 0.

It is easily verified that for n = 4 the condition holds if and only if z_1, z_2, z_3, z_4 (in that order) form a conventional parallelogram. Thus, an n-gon which satisfies (4.8.5) is a "generalized" parallelogram. The sequence of theorems of Section 4.6 can now be given specific content in terms of parallelograms or generalized parallelograms. We shall write it up in terms of parallelograms.
Theorem 4.8.2. Let P be a quadrilateral. Then there exists a quadrilateral P̂ such that DP̂ = P (the midpoint property) if and only if P is a parallelogram.

Corollary. Let P be a parallelogram. Then the general solution to DP̂ = P is given by

P̂ = D†P + τ fc(K)

for an arbitrary constant τ.

Corollary. P is a parallelogram if and only if there is a quadrilateral Q such that P = DQ.
Corollary. Let P be a parallelogram. Then, given an arbitrary number ẑ_1, we can find a unique quadrilateral P̂ with ẑ_1 as its first vertex such that DP̂ = P.

Theorem 4.8.3. Let P be a parallelogram. Then there is a unique parallelogram Q such that DQ = P. It is given by Q = D†P.

Notice what this is saying. DQ is the parallelogram formed from the midpoints of the sides of Q. Given a parallelogram P, we can find infinitely many quadrilaterals Q̂ such that DQ̂ = P. The first vertex may be chosen arbitrarily, and this fixes all other vertices uniquely. But there is a unique parallelogram Q such that DQ = P. It can be found from Q = D†P (see Figure 4.8.1).
Figure 4.8.1
Theorem 4.8.4. Let P be a parallelogram. Among the infinitely many quadrilaterals R for which DR = P, there is a unique one of minimum norm ||R||. It is given by R = D†P. Hence it coincides with the unique parallelogram Q such that DQ = P.

Theorem 4.8.5. Let P be a general quadrilateral. The unique parallelogram R = DQ for which ||P − R|| = minimum and ||Q|| = minimum is given by R = (I − K)P.
In the theorem of Section 4.7, select n = 3 and ω = exp(2πi/3), so that ω³ = 1. Select j = 1, so that s = ω/(ω − 1), 1 − s = 1/(1 − ω). In view of 1 + ω + ω² = 0, this simplifies to s = (1/3)(1 − ω), 1 − s = (1/3)(1 − ω²). On the other hand, the selection j = 2 leads to s = ω²/(ω² − 1) = (1/3)(1 − ω²), 1 − s = (1/3)(1 − ω). The corresponding circulants C_s we shall designate by N (in honor of Napoleon):

(4.8.7)  N_I = (1/3) circ(1 − ω, 1 − ω², 0),  j = 1,
         N_O = (1/3) circ(1 − ω², 1 − ω, 0),  j = 2,

the subscripts I, O standing for "inner" and "outer." For brevity we exhibit only the outer case, writing

(4.8.7′)  N = (1/3) circ(1 − ω², 1 − ω, 0).
We have

(4.8.8)  K_0 = (1/3) circ(1, 1, 1),
         K_1 = (1/3) circ(1, ω, ω²),
         K_2 = (1/3) circ(1, ω², ω),
         K_0 + K_1 + K_2 = I.

From (4.7.3) with n = 3, j = 2,

N†N = NN† = I − K_1.
Theorem 4.8.6. N† = K_0 − ωK_2.

Proof. Let E = K_0 − ωK_2. From (4.8.7′), N = K_0 − ω²K_2. Hence NE = (K_0 − ω²K_2)(K_0 − ωK_2) = K_0² + ω³K_2² = K_0 + K_2 = I − K_1, the cross terms vanishing since K_0K_2 = K_2K_0 = 0 [cf. after (4.6.8)]. Therefore NEN = (I − K_1)(K_0 − ω²K_2) = K_0 − ω²K_2 = N. Similarly, ENE = (I − K_1)(K_0 − ωK_2) = K_0 − ωK_2 = E. Thus, by Section 2.8.2, properties (1) to (4), E = N†.
It follows from (4.6.1a) and (4.6.1b) that a counterclockwise equilateral triangle is a K_1-gram, while a clockwise equilateral triangle is a K_2-gram.
Let now (z_1, z_2, z_3) be the vertices of an arbitrary triangle. On the sides of this triangle erect equilateral triangles outwardly. Let their vertices be z_1′, z_2′, z_3′. From (4.6.1a), applied to the counterclockwise triangle (z_{k+1}, z_k, z_k′),

z_{k+1} + ωz_k + ω²z_k′ = 0,  so that  z_k′ = −ω²z_k − ωz_{k+1},  k = 1, 2, 3

(subscripts taken mod 3). The centers of the equilateral triangles are therefore

c_k = (1/3)(z_k + z_{k+1} + z_k′) = (1/3)[(1 − ω²)z_k + (1 − ω)z_{k+1}],  k = 1, 2, 3.

This may be written as

(c_1, c_2, c_3)^T = N(z_1, z_2, z_3)^T,

providing us with a geometric interpretation of the transformation induced by Napoleon's matrix.
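The construction can be replayed in coordinates (a sketch; the apex formula z_k′ = −ω²z_k − ωz_{k+1} is the one derived above):

```python
import numpy as np

w = np.exp(2j * np.pi / 3)
z = np.array([0 + 0j, 4 + 0j, 1 + 2j])            # arbitrary triangle, counterclockwise

# apex of the outward equilateral triangle erected on side (z_k, z_{k+1}):
apex = np.array([-w**2 * z[k] - w * z[(k + 1) % 3] for k in range(3)])
centers = (z + np.roll(z, -1) + apex) / 3          # centroids of the erected triangles

row = np.array([(1 - w**2) / 3, (1 - w) / 3, 0], dtype=complex)
N = np.array([np.roll(row, k) for k in range(3)])  # Napoleon's matrix
print(np.allclose(N @ z, centers))                 # True: centers = N z
print(abs(centers[0] + w * centers[1] + w**2 * centers[2]))  # ~0: equilateral
```

The last line verifies Napoleon's theorem itself: the centers form a K_1-gram, hence a counterclockwise equilateral triangle.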
The sequence of theorems of Section 4.6 can now be given specific content in terms of the Napoleon operator. In what follows all figures are taken counterclockwise.
Theorem 4.8.7. Let T be a triangle. Then there exists a triangle T̂ such that NT̂ = T if and only if T is equilateral. (The "only if" part is Napoleon's theorem.)

Corollary. Let T be equilateral. Then the general solution to NT̂ = T is given by

T̂ = N†T + τ fc(K_1)

for an arbitrary constant τ.
Corollary. T is equilateral if and only if T = NQ for some triangle Q.

Corollary. Given an equilateral triangle T and an arbitrary complex number ẑ_1, there is a unique triangle T̂ with ẑ_1 as its first vertex such that NT̂ = T.
Theorem 4.8.8. Let T be an equilateral triangle. Then there is a unique equilateral triangle Q such that NQ = T. It is given by Q = N†T.

Theorem 4.8.9. Let T be equilateral. Let R be any triangle with NR = T. The unique such R of minimum norm ||R|| is the equilateral triangle R = N†T. It is identical to the unique equilateral triangle Q for which NQ = T. (See Figure 4.8.2.)
Figure 4.8.2
Finally, suppose we are given an arbitrary triangle T and we wish to approximate it optimally by an equilateral triangle. Here is the story.

Theorem 4.8.10. Let T be arbitrary; then the equilateral triangle NR for which ||T − NR|| = minimum and such that ||R|| = minimum is given by R = N†T and NR = NN†T = (I − K_1)T.
PROBLEMS
1. Discuss the matrix circ(1/3, 1/3, 1/3, 0, 0, 0) from the present points of view and derive geometrical theorems. To start: this matrix maps every 6-gon into a parahexagon, that is, a 6-gon whose opposite sides are parallel and of equal length.
Generalizations of Circulants
A matrix A of order n is called a g-circulant if each row is obtained from the row above it by a cyclic shift of g places to the right. Since the shift is cyclic, a shift of g + n places is the same as a shift of g mod n places. By convention, if g is negative, shifting to the right g places will be equivalent to shifting to the left (−g) places. Thus, for any integers g, g′ with g′ ≡ g (mod n), a g′-circulant and a g-circulant are synonymous.
Example 1. A 4-circulant of order 6 is

a_1 a_2 a_3 a_4 a_5 a_6
a_3 a_4 a_5 a_6 a_1 a_2
a_5 a_6 a_1 a_2 a_3 a_4
a_1 a_2 a_3 a_4 a_5 a_6
a_3 a_4 a_5 a_6 a_1 a_2
a_5 a_6 a_1 a_2 a_3 a_4
Example 2. A 1-circulant is an (ordinary) circulant.

Example 3. A 0-circulant is one in which all rows are identical.

Example 4. J = circ(1, 1, ..., 1) is a g-circulant for all g.
Example 5. A (−1)-circulant (or an (n − 1)-circulant) has each successive row moved one place to the left. It is sometimes called a left circulant or an anticirculant or a retrocirculant. Thus

Γ = (−1)-circ(1, 0, ..., 0),

with ones in positions (1, 1), (2, n), (3, n − 1), ..., (n, 2), is the anti-identity or the counteridentity.
Let A = (a_{ij}). Then, evidently, A is a g-circulant if and only if

(5.1.2)  a_{i,j} = a_{i+1, j+g},  i, j = 1, 2, ..., n,

the subscripts being taken mod n. Equivalently, if A = (a_{ij}) = g-circ(a_1, a_2, ..., a_n), then

(5.1.3)  a_{i,j} = a_{j−(i−1)g},  subscripts mod n.
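The shift characterization suggests a one-line g-circulant builder (the helper name g_circ is ours), checked here against Example 1 and against (5.1.2):

```python
import numpy as np

def g_circ(c, g):
    # each row is the previous row shifted g places to the right, cyclically
    return np.array([np.roll(c, k * g) for k in range(len(c))])

A = g_circ(np.arange(1, 7), 4)          # the 4-circulant of order 6 of Example 1
print(A[1])                             # [3 4 5 6 1 2]

# characterization (5.1.2): a[i, j] == a[i+1, (j+g) mod n]
n, g = 6, 4
ok = all(A[i, j] == A[(i + 1) % n, (j + g) % n]
         for i in range(n) for j in range(n))
print(ok)                               # True
```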
Take g > 0 and let (n, g) designate the greatest common divisor of n and g. The g-circulants split into two types depending on whether (n, g) = 1 or (n, g) > 1. The multiples kg, k = 1, 2, ..., n, run through a complete residue system mod n if and only if (n, g) = 1. Hence the rows of the general g-circulant are distinct if and only if (n, g) = 1. In this case, the rows of a g-circulant may be permuted so as to yield an ordinary circulant. Similarly for columns. Hence if A is a g-circulant, (n, g) = 1, then for appropriate permutation matrices P_1, P_2,

(5.1.4a)  A = P_1C,
(5.1.4b)  A = CP_2,

where in (5.1.4a) C is an ordinary circulant whose first row is identical to that of A. In a certain sense, then, if (n, g) = 1, a g-circulant is an ordinary circulant followed by a renumbering.

However, the details of the diagonalization, and so on, are considerable. If (n, g) > 1, this is a degenerate case, and naturally there are further complications.
Example. Making use of the geometric construction of Section 1.4, we shall illustrate this distinction by the two matrices of order 8, A_1 = 3-circ(1, 0, ..., 0) and A_2 = 2-circ(1, 0, ..., 0). In the first case, transformation of the vertices of a regular octagon by A_1 yields a regular octagon in permuted order (Figure 5.1.1). In the second case, it yields a square covered twice (Figure 5.1.2).
Theorem 5.1.1. A is a g-circulant if and only if

(5.1.5)  πA = Aπ^g.

Proof. In (2.4.6) take σ: 1 → 2, 2 → 3, ..., n → 1. Then P_σ = π, so that if A = (a_{ij}), then πA = (a_{i+1,j}). In (2.4.8) take τ: j → j + g (mod n); then P_τ = π^g. Hence πAπ^{−g} = (a_{i+1, j+g}). The result now follows from (5.1.2).

Figure 5.1.1 (g = 3, n = 8)

Figure 5.1.2 (g = 2, n = 8)
Corollary. Let A and B be g-circulants. Then AB* is a 1-circulant. In particular, if A is a g-circulant, AA* is a 1-circulant.

Proof. From (5.1.5), A = π*Aπ^g and B = π*Bπ^g, so that B* = π^{−g}B*π. Hence AB* = (π*Aπ^g)(π^{−g}B*π) = π*AB*π, that is, π(AB*) = (AB*)π.
Theorem 5.1.2. If A is a g-circulant and B is an h-circulant, then AB is a gh-circulant.

Proof. πA = Aπ^g and πB = Bπ^h. Now

π(AB) = (πA)B = Aπ^gB.

Since πB = Bπ^h, each factor of π may be moved through B at the cost of a factor π^h on the right. Doing this g times leads to

π(AB) = (AB)π^{gh}.

Now apply Theorem 5.1.1.
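Both (5.1.5) and Theorem 5.1.2 are easy to test numerically. A sketch (g_circ is our helper; n = 12, g = 5, h = 7 are arbitrary choices with gh > n, so the product shift is taken mod n):

```python
import numpy as np

def g_circ(c, g):
    return np.array([np.roll(c, k * g) for k in range(len(c))])

n = 12
rng = np.random.default_rng(1)
A = g_circ(rng.integers(0, 9, n), 5)     # a 5-circulant
B = g_circ(rng.integers(0, 9, n), 7)     # a 7-circulant

pi = g_circ(np.eye(n, dtype=int)[1], 1)  # the basic cyclic permutation
# (5.1.5): pi A = A pi^5
print(np.array_equal(pi @ A, A @ np.linalg.matrix_power(pi, 5)))   # True
# Theorem 5.1.2: AB is a 35-circulant, i.e. a (35 mod 12) = 11-circulant
AB = A @ B
print(np.array_equal(AB, g_circ(AB[0], 35 % n)))                   # True
```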
We require several facts from the elementary theory of numbers.

Lemma 5.1.3. Let g, n be integers not both 0. Then the equation

(5.1.6)  gx ≡ 1 (mod n)

has a solution if and only if (n, g) = 1.

Proof. It is well known that given integers g, n, not both 0, there exist integers x, y such that gx − ny = (n, g). Hence if (n, g) = 1, (5.1.6) has a solution. Conversely, if (5.1.6) holds, then for some integer k, gx − 1 = kn. If g and n had a common factor > 1, it would divide 1, which is impossible.

Corollary. For (n, g) = 1, the solution to gx ≡ 1 (mod n) is unique mod n.

Proof. Let gx_1 ≡ 1 (mod n) and gx_2 ≡ 1 (mod n); then g(x_1 − x_2) ≡ 0 (mod n). Since (n, g) = 1, x_1 − x_2 ≡ 0 (mod n).
For (n, g) = 1 we shall designate the unique solution of (5.1.6) by g^{−1}.

Theorem 5.1.4. Let A be a nonsingular g-circulant. Then A^{−1} is a g^{−1}-circulant.

Proof. Since A is nonsingular, it follows that (n, g) = 1, hence that g^{−1} exists with gg^{−1} ≡ 1 (mod n). Now, from (5.1.5), πA = Aπ^g, so that A^{−1}π = π^gA^{−1}. Iterating s times, we obtain

A^{−1}π^s = π^{sg}A^{−1}.

Now select s = g^{−1}; since sg ≡ 1 (mod n) and π^n = I, there is obtained A^{−1}π^{g^{−1}} = πA^{−1}, which tells us that A^{−1} is a g^{−1}-circulant.

Theorem 5.1.5. A is a g-circulant if and only if (A†)* is a g-circulant.
Proof. Let A be a g-circulant. Then A = π*Aπ^g. Hence (since π and π^g are unitary) A† = π^{−g}A†π. Thus (A†)* = (π^{−g}A†π)* = π*(A†)*π^g. Therefore (A†)* is a g-circulant.

Conversely, let (A†)* be a g-circulant. Then by what we have just shown, (((A†)*)†)* is also a g-circulant. But this is precisely A.
Corollary. If A is a g-circulant, then AA† is a 1-circulant.

Proof. In the corollary to Theorem 5.1.1, take B = (A†)*. This is a g-circulant by what we have just shown. Hence AB* = AA† is a 1-circulant.
If A is a g-circulant, then AA* is a 1-circulant. Hence it may be written as AA* = F*Λ_{AA*}F, where Λ_{AA*} is the diagonal of eigenvalues of AA*. Now by Problem 16 of Section 2.8.2, for any matrix M, M† = M*(MM*)†. Hence

Theorem 5.1.6. If A is a g-circulant, then

(5.1.7)  A† = A*(AA*)† = A*F*Λ†_{AA*}F.
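Theorem 5.1.6 and the corollary above can be checked with numpy, even in the degenerate case (n, g) > 1, where A is genuinely singular (a sketch; g_circ is our helper):

```python
import numpy as np

def g_circ(c, g):
    return np.array([np.roll(c, k * g) for k in range(len(c))])

n, g = 6, 2                                     # (n, g) = 2 > 1: repeated rows
A = g_circ(np.array([1.0, 2.0, 0.0, 0.0, 1.0, 0.0]), g)

pinvA = np.linalg.pinv(A)
formula = A.conj().T @ np.linalg.pinv(A @ A.conj().T)   # A* (A A*)^+
print(np.allclose(pinvA, formula))              # True

AAp = A @ pinvA                                 # should be an ordinary circulant
print(np.allclose(AAp, g_circ(AAp[0], 1)))      # True
```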
We now produce a generalization of the representation (3.1.4). Let

Q_g = g-circ(1, 0, ..., 0).

Notice that Q_g is a permutation matrix, and hence unitary, if and only if (n, g) = 1. (For in this case, and only in this case, will Q_g have precisely one 1 in each row and column.)
Theorem 5.1.7

g-circ(a_1, a_2, ..., a_n) = Q_g(a_1I + a_2π + ··· + a_nπ^{n−1}) = Q_g circ(a_1, a_2, ..., a_n).

Proof. The positions in A occupied by the symbol a_1 are precisely those occupied by a 1 in Q_g. The positions occupied by the symbol a_2 in A are one place to the right (with wraparound) of those occupied by a_1. Since right multiplication by π pushes all the elements of A one space to the right, it follows that the positions occupied by a_2 in A are precisely those occupied by 1 in Q_gπ. Similarly for a_3, ..., a_n.

Corollary. A is a g-circulant if and only if it is of the form Q_gC where C is a circulant.

Proof. Use (3.1.4).
Since

Q_{−1} = Γ,

one has

Corollary. A is a (−1)-circulant if and only if it has the form A = ΓC where C is a circulant and where the first rows of A and C are identical.

Corollary. A is a (−1)-circulant if and only if it has the form

(5.1.10)  A = F*ΓΛF,

where Λ is diagonal. In this case,

A^m = F*(ΓΛ)^mF

for integer values of m.

Proof. A = ΓC with circulant C. But such a C = F*ΛF, so that A = (ΓF*)ΛF. From the corollary to Theorem 2.5.2, F*² = Γ* = Γ, so that ΓF* = F*³ = F*Γ, and (5.1.10) follows.
If Λ = diag(λ_1, ..., λ_n), then

(5.1.11)  ΓΛΓ = diag(λ_1, λ_n, λ_{n−1}, ..., λ_2).

The eigenvalues of the (−1)-circulant A are identical to those of ΓΛ, and the latter are easily computed. (See Section 5.3.)

Note also that

(5.1.12)  (ΓΛ)² = (ΓΛΓ)Λ = diag(λ_1λ_1, λ_nλ_2, λ_{n−1}λ_3, ..., λ_2λ_n),

so that the even powers of ΓΛ are readily available.
PROBLEMS
1. Prove that the g-circulants form a linear space under matrix addition and scalar multiplication.

2. Let S denote the set of all matrices of order n that are of the form αA + βB where A is a circulant and B is a (−1)-circulant. Show that they form a ring under matrix addition and multiplication.

3. What conditions on n and g are sufficient to guarantee that the g-circulants form a ring?

4. Let A be a g-circulant. Then for integer k, π^kA = Aπ^{gk}. Hence if g | n, π^{n/g}A = A.

5. Let (n, g) = 1 and suppose that A is a g-circulant. Prove that there exists a minimum integer r ≥ 1 such that A^r is a circulant. Hint: use the Euler-Fermat theorem. See Section 5.4.2.

6. Let (n, g) = 1. Prove that if A is a g-circulant, each column can be obtained from the previous column by a downshift of g^{−1} places.
If g = 0, each row of A is the previous row "shifted" zero places. Hence all the rows are identical. Since the rows are identical, r(A) ≤ 1. If r(A) = 0, A = 0, and the work is trivial. Suppose, then, that r(A) = 1. Then, by a familiar theorem (see Lancaster [1], p. 56), A must have a zero eigenvalue of multiplicity ≥ n − 1. Its characteristic polynomial is therefore of the form λ^n − σλ^{n−1}. If we write A = 0-circ(a_1, a_2, ..., a_n) =