Linear Algebra Notes

Chapter 1

Systems of linear equations

1.1 Examples

1.1.1 Example

Consider the equation

x1 + 2x2 = 4

Let’s assume that x1 and x2 are real numbers. That will be the case for much of this course. We can solve for x1 in terms of x2 to get

x1 = 4 − 2x2

There are infinitely many solutions to this equation:

x1 = 4 − 2t, x2 = t, t ∈ R

1.1.2 Example

This time let’s consider the solution of two equations

x1 + 2x2 = 4

x1 − 4x2 = 7

We can multiply the first equation by -1 and add to the second equation to get

x1 + 2x2 = 4

− 6x2 = 3


We can now divide the second equation by -6 to solve for x2.

x1 + 2x2 = 4

x2 = − 1/2

We can now multiply the second equation by -2 and add to the first.

x1 = 5

x2 = − 1/2

Note that this time we get one unique solution and not infinitely many.

1.1.3 Example

x1 + 2x2 = 4

2x1 + 4x2 = 7

We can multiply the first equation by -2 and add to the second equation to get

x1 + 2x2 = 4

0 + 0 = − 1

This is a contradiction and we conclude that the system of equations has no solution.

1.2 Idea

Note how simple the idea is: we multiply an equation by a scalar and then add it to another equation. This simplifies the system. We keep simplifying until we get a solution or we get a contradiction.
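The elimination step described above can be sketched in a few lines of Python, applied to Example 1.1.2 (the function name is illustrative, not from the notes):

```python
# Each equation is stored as [coefficient of x1, coefficient of x2, right-hand side].
eq1 = [1, 2, 4]   # x1 + 2*x2 = 4
eq2 = [1, -4, 7]  # x1 - 4*x2 = 7

def add_multiple(source, target, c):
    """Replace `target` by c * source + target (the basic elimination move)."""
    return [c * s + t for s, t in zip(source, target)]

# Multiply the first equation by -1 and add it to the second.
eq2 = add_multiple(eq1, eq2, -1)
print(eq2)  # [0, -6, 3], i.e. -6*x2 = 3

# Back-substitute: x2 = 3 / -6, then x1 = 4 - 2*x2.
x2 = eq2[2] / eq2[1]
x1 = eq1[2] - eq1[1] * x2
print(x1, x2)  # 5.0 -0.5
```

This reproduces the unique solution found in Example 1.1.2.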

1.3 Nonlinear equations

The situation with nonlinear equations is very different.


1.3.1 Example

How do we solve

tan x = x

Do we know that this nonlinear equation has a solution? How many solutions does it have? It actually has infinitely many solutions.

1.3.2 Example

tan x = x

x² + y² = 1

How many solutions does this system have?

1.4 Solving systems of linear equations

1.4.1 Basic operations

The basic operations used to solve a system of linear equations are

• scaling a row

• interchanging rows

• replacing a row by the sum of the row and a scalar multiple of another row
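The three basic operations translate directly into code. A minimal sketch on rows stored as Python lists (helper names are illustrative, not from the notes):

```python
# The three elementary row operations on a matrix M stored as a list of rows.
def scale(M, i, c):
    """Scale row i by c."""
    M[i] = [c * x for x in M[i]]

def swap(M, i, j):
    """Interchange rows i and j."""
    M[i], M[j] = M[j], M[i]

def add_multiple(M, src, dst, c):
    """Replace row dst by c * (row src) + (row dst)."""
    M[dst] = [c * s + d for s, d in zip(M[src], M[dst])]

M = [[1, 2], [3, 4]]
swap(M, 0, 1)              # -> [[3, 4], [1, 2]]
scale(M, 0, 2)             # -> [[6, 8], [1, 2]]
add_multiple(M, 1, 0, -6)  # row 0 := -6*row 1 + row 0 -> [[0, -4], [1, 2]]
print(M)
```

Every reduction in this chapter is some sequence of these three moves.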

1.4.2 Matrix notation

Consider the system of linear equations

1x1 + 1x2 + 1x3 = 6
2x1 + 1x2 + 1x3 = 7
1x1 + 1x2 + 2x3 = 12

In matrix notation we would write this system

[ 1 1 1 ] [ x1 ]   [  6 ]
[ 2 1 1 ] [ x2 ] = [  7 ]
[ 1 1 2 ] [ x3 ]   [ 12 ]


1.4.3 Coefficient matrix

In the matrix equation

[ 1 1 1 ] [ x1 ]   [  6 ]
[ 2 1 1 ] [ x2 ] = [  7 ]
[ 1 1 2 ] [ x3 ]   [ 12 ]

the 3 × 3 matrix on the left is called the coefficient matrix.

1.4.4 Augmented matrix

We also often represent the system by an augmented matrix

[ 1 1 1 |  6 ]
[ 2 1 1 |  7 ]
[ 1 1 2 | 12 ]

1.4.5 Apply elementary row operations to solve the system

-2 * Row 1 + Row 2 replaces Row 2

[ 1  1  1 |  6 ]
[ 0 −1 −1 | −5 ]
[ 1  1  2 | 12 ]

-1 * Row 1 + Row 3 replaces Row 3

[ 1  1  1 |  6 ]
[ 0 −1 −1 | −5 ]
[ 0  0  1 |  6 ]

1 * Row 3 + Row 2 replaces Row 2

[ 1  1  1 | 6 ]
[ 0 −1  0 | 1 ]
[ 0  0  1 | 6 ]

-1 * Row 3 + Row 1 replaces Row 1

[ 1  1  0 | 0 ]
[ 0 −1  0 | 1 ]
[ 0  0  1 | 6 ]

-1 * Row 2 replaces Row 2

[ 1  1  0 |  0 ]
[ 0  1  0 | −1 ]
[ 0  0  1 |  6 ]

-1 * Row 2 + Row 1 replaces Row 1

[ 1  0  0 |  1 ]
[ 0  1  0 | −1 ]
[ 0  0  1 |  6 ]

1.4.6 Solution

We now read off the solution

x1 = 1, x2 = −1, x3 = 6

which we might also write as a vector

[  1 ]
[ −1 ]
[  6 ]

1.5 Reduced row echelon form

1.5.1 Leading coefficient or pivot of a nonzero row

The first nonzero element from the left end of the row.

1.5.2 Example

Consider the matrix

[ 1 1 1  6 ]
[ 0 2 1  5 ]
[ 1 1 2 12 ]

The leading coefficient or pivot of the second row is 2.

1.5.3 Row echelon form

A matrix is in row echelon form if

• All rows with at least one nonzero element are above any rows that contain only zeros; that is, rows of all zeros are at the bottom of the matrix.

• The leading coefficient (pivot) of a nonzero row is always strictly to the right of the leading coefficient of the row above it.

• All entries in a column below a leading coefficient are zeroes.


1.5.4 Example

Consider the matrix

[ 1 1 1  6 ]
[ 0 2 1  5 ]
[ 1 1 2 12 ]

This matrix is not in row echelon form because the leading coefficient 1 in the first row does not have all zeros below it; equivalently, the leading coefficient 1 in the third row is not strictly to the right of the leading coefficient 1 in the first row.

1.5.5 Example

Consider the matrix

[ 1 1 1 6 ]
[ 0 2 1 5 ]
[ 0 0 0 0 ]
[ 0 0 2 6 ]

This matrix is not in row echelon form because there is a row of zeros above a nonzero row. If we interchange the last two rows,

[ 1 1 1 6 ]
[ 0 2 1 5 ]
[ 0 0 2 6 ]
[ 0 0 0 0 ]

the matrix is now in row echelon form.

1.5.6 Reduced row echelon form

A matrix is in reduced row echelon form if

• It is in row echelon form

• The leading coefficient of each nonzero row is 1, and it is the only nonzero entry in its column

1.5.7 Example

Consider the matrix

[ 1 1 1 6 ]
[ 0 2 1 5 ]
[ 0 0 2 6 ]
[ 0 0 0 0 ]


This matrix is in row echelon form but not in reduced row echelon form because there are nonzero rows whose pivots are not 1 and there are nonzero entries above some of the pivots.

1.5.8 Example contd

Consider the matrix

[ 1 1 1 6 ]
[ 0 2 1 5 ]
[ 0 0 2 6 ]

This matrix is not in reduced row echelon form. We can scale the second and third rows to get

[ 1 1  1   6  ]
[ 0 1 1/2 5/2 ]
[ 0 0  1   3  ]

We can now multiply the third row by −1 and add it to the first row, and multiply the third row by −1/2 and add it to the second, to get:

[ 1 1 0 3 ]
[ 0 1 0 1 ]
[ 0 0 1 3 ]

We can now add −1 times the second row to the first to get:

[ 1 0 0 2 ]
[ 0 1 0 1 ]
[ 0 0 1 3 ]

This matrix is in reduced row echelon form.

1.6 Gauss-Jordan elimination

The process of using elementary row operations:

• scaling a row

• interchanging rows

• replacing a row by the sum of the row and a scalar multiple of another row

to put a matrix into row echelon form or reduced row echelon form is called Gaussian elimination or Gauss-Jordan elimination, respectively.
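The whole process can be sketched as one routine. This is an illustrative implementation (exact `Fraction` arithmetic, no special handling of free variables), applied to the augmented matrix from Section 1.4:

```python
from fractions import Fraction

def rref(M):
    """Reduce M (a list of rows) to reduced row echelon form using only
    the three elementary row operations."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        # Find a pivot in column c at or below row r.
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]          # interchange rows
        M[r] = [x / M[r][c] for x in M[r]]       # scale the pivot to 1
        for i in range(rows):                    # clear the rest of the column
            if i != r and M[i][c] != 0:
                M[i] = [x - M[i][c] * y for x, y in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return M

aug = [[1, 1, 1, 6], [2, 1, 1, 7], [1, 1, 2, 12]]
result = rref(aug)
print([[int(x) for x in row] for row in result])
# [[1, 0, 0, 1], [0, 1, 0, -1], [0, 0, 1, 6]] -- last column gives x1, x2, x3
```

The output matches the reduced row echelon form computed by hand in Section 1.4.5.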


1.7 Geometric interpretations

1.7.1 Hyperplanes

An equation of the form

c1x1 + c2x2 + · · ·+ cnxn = d

with the ci and d constants defines an (n − 1)-dimensional hyperplane in Rn.

1.7.2 Example

The equation

x + y + z = 7

defines a two dimensional plane in three (or perhaps higher) dimensional space.

1.7.3 Example

The equation

x + y = 7

defines a one-dimensional line in two (or perhaps higher) dimensional space.

1.7.4 Geometry and solutions

Solutions of systems of linear equations can then be interpreted as intersections of hyperplanes.

1.7.5 Example

Consider the system of equations

x + y = 3

x − y = −1

If we draw the lines defined by these two equations, they intersect at exactly one point, (x, y) = (1, 2), which is the solution to the system.

At how many points can two lines intersect?


1.7.6 Example

Consider the system of equations

x + y = 3

2x + 2y = 7

If we draw the one-dimensional lines defined by these two equations, they run parallel without ever touching. There is no point of intersection on the graph and no solution to this system of equations.

1.7.7 Example

Consider the system of equations

x + y + z = 1

x − y + z = 7

The two equations define two-dimensional planes which intersect in a line, and the solution of the system is

x = t, y = −3, z = 4− t, t ∈ R

How many ways can two planes intersect (or not intersect) in three dimensions? How about three planes?

1.8 Exercises

1.8.1 Exercise

Consider the system of linear equations

x1 + x2 + x3 = 5

2x1 + 3x2 + 5x3 = 8

4x1 + 5x3 = 2

Write the augmented matrix that represents this system. Put the augmented matrix in reduced row echelon form. Find the solution.

Answer: x1 = 3 , x2 = 4 , x3 = −2.


1.8.2 Exercise

Consider the system of linear equations

x1 + x2 + x3 = 5

2x1 + 3x2 + 5x3 = 8

Write the augmented matrix that represents this system. Put the augmented matrix in reduced row echelon form. Find the solution.

Answer: x1 = 7 + 2t, x2 = −2 − 3t, x3 = t, t ∈ R
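One way to check a parametric answer like this is to substitute it back into both equations for several values of the parameter; a quick sketch in Python (illustrative, not part of the exercise):

```python
# Substitute the stated family of solutions into both equations
# for a few values of the parameter t.
for t in [-2, 0, 1, 3]:
    x1, x2, x3 = 7 + 2 * t, -2 - 3 * t, t
    assert x1 + x2 + x3 == 5
    assert 2 * x1 + 3 * x2 + 5 * x3 == 8
print("answer checks out")
```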


Chapter 2

Matrices

2.1 Example

2.1.1 Problem

We consider solving a 3x3 linear system:

x + y + z = −1

2x + y + z = −2

x + 2y + 3z = −3

2.1.2 Gaussian elimination

Gaussian elimination is the process of putting the augmented matrix in row echelon form. We will use this first:

The augmented matrix is

[ 1 1 1 | −1 ]
[ 2 1 1 | −2 ]
[ 1 2 3 | −3 ]

We get zeros below the pivot of the first row by

-2*R1 + R2
-1*R1 + R3

[ 1  1  1 | −1 ]
[ 0 −1 −1 |  0 ]
[ 0  1  2 | −2 ]


We get zeros below the pivot of the second row by

1*R2 + R3

[ 1  1  1 | −1 ]
[ 0 −1 −1 |  0 ]
[ 0  0  1 | −2 ]

2.1.3 Row echelon form

The augmented matrix is now in row echelon form after performing Gaussian elimination, and the solutions can be found by scaling and back substitution:

The third row gives

z = − 2

The second row gives

−y = 0 + z ⇒ y = −z = −(−2) = 2

Then the first row gives

x = − 1− y − z = − 1− 2− (−2) = − 1

2.1.4 Gauss-Jordan elimination

Gauss-Jordan elimination puts the matrix in reduced row echelon form. We can think of it as continuing on with Gaussian elimination until we get to reduced row echelon form.

2.1.5 Continuing

We had

[ 1  1  1 | −1 ]
[ 0 −1 −1 |  0 ]
[ 0  0  1 | −2 ]

We scale so that the pivots are all 1:

[ 1 1 1 | −1 ]
[ 0 1 1 |  0 ]
[ 0 0 1 | −2 ]

We can get zeros above the pivot of the second row by


-1*R2 + R1

[ 1 0 0 | −1 ]
[ 0 1 1 |  0 ]
[ 0 0 1 | −2 ]

We can get zeros above the pivot of the third row by

-1*R3 + R2

[ 1 0 0 | −1 ]
[ 0 1 0 |  2 ]
[ 0 0 1 | −2 ]

The augmented matrix is now in reduced row echelon form, and the results can be read off directly from the matrix:

x = − 1, y = 2, z = − 2

2.2 Matrices

2.2.1 Definition

A matrix is a rectangular array of real numbers, as follows.

2.2.2 Example

A =

[ a11 a12 · · · a1n ]
[ a21 a22 · · · a2n ]
[  ⋮   ⋮        ⋮  ]
[ ai1 ai2 · · · ain ]
[  ⋮   ⋮        ⋮  ]
[ am1 am2 · · · amn ]

is an m × n matrix with m rows and n columns.

aij is the element of A that is in the ith row and the jth column.

2.2.3 Other examples

B =

[ 1 2 ]
[ 4 5 ]


is a 2 × 2 matrix and

C =

[ 1 ]
[ 2 ]
[ 3 ]
[ 4 ]
[ 5 ]

is a column matrix with five elements.

2.2.4 Equality of matrices

Two matrices Am×n and Bm′×n′ are equal if they have the same dimensions and each corresponding element is the same:

m = m′, n = n′

aij = bij

for all 1 ≤ i ≤ m, 1 ≤ j ≤ n.

2.2.5 Addition of matrices

If two matrices

A =

[ a11 a12 · · · a1n ]
[ a21 a22 · · · a2n ]
[  ⋮   ⋮        ⋮  ]
[ am1 am2 · · · amn ]

and

B =

[ b11 b12 · · · b1n ]
[ b21 b22 · · · b2n ]
[  ⋮   ⋮        ⋮  ]
[ bm1 bm2 · · · bmn ]

are of the same dimension, then their sum C = A + B is defined to be a matrix of the same dimension whose elements are given by adding the corresponding elements of A and B:


C =

[ c11 c12 · · · c1n ]
[ c21 c22 · · · c2n ]
[  ⋮   ⋮        ⋮  ]
[ cm1 cm2 · · · cmn ]

with

cij = aij + bij

2.2.6 Example

[ 1 1 1 ]   [ 1 1 1 ]   [ 2 2 2 ]
[ 2 2 2 ] + [ 1 1 1 ] = [ 3 3 3 ]

2.2.7 Subtraction of matrices

If two matrices

A =

[ a11 a12 · · · a1n ]
[ a21 a22 · · · a2n ]
[  ⋮   ⋮        ⋮  ]
[ am1 am2 · · · amn ]

and

B =

[ b11 b12 · · · b1n ]
[ b21 b22 · · · b2n ]
[  ⋮   ⋮        ⋮  ]
[ bm1 bm2 · · · bmn ]

are of the same dimension, then their difference C = A − B is defined to be a matrix of the same dimension whose elements are given by subtracting the corresponding elements of A and B:


C =

[ c11 c12 · · · c1n ]
[ c21 c22 · · · c2n ]
[  ⋮   ⋮        ⋮  ]
[ cm1 cm2 · · · cmn ]

with

cij = aij − bij

2.2.8 Example

[ 1 1 1 ]   [ 1 1 1 ]   [ 0 0 0 ]
[ 2 2 2 ] − [ 1 1 1 ] = [ 1 1 1 ]

2.2.9 Multiplication by a scalar multiple

If Am×n is a matrix

A =

[ a11 a12 · · · a1n ]
[ a21 a22 · · · a2n ]
[  ⋮   ⋮        ⋮  ]
[ am1 am2 · · · amn ]

and c ∈ R, then the scalar multiple (cA)m×n is given by multiplying each entry of A by c:

(cA)ij = c · aij

cA =

[ ca11 ca12 · · · ca1n ]
[ ca21 ca22 · · · ca2n ]
[  ⋮    ⋮          ⋮  ]
[ cam1 cam2 · · · camn ]

2.2.10 Example

If

A =

[ 1 1 1 ]
[ 2 2 2 ]


then 4A is

4A =

[ 4 4 4 ]
[ 8 8 8 ]
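The entrywise operations of Sections 2.2.5–2.2.10 are easy to sketch with list comprehensions over rows (function names are illustrative, not from the notes):

```python
# Entrywise matrix operations: addition, subtraction, scalar multiple.
def madd(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def msub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def smul(c, A):
    return [[c * a for a in row] for row in A]

A = [[1, 1, 1], [2, 2, 2]]
B = [[1, 1, 1], [1, 1, 1]]
print(madd(A, B))  # [[2, 2, 2], [3, 3, 3]]   (Example 2.2.6)
print(msub(A, B))  # [[0, 0, 0], [1, 1, 1]]   (Example 2.2.8)
print(smul(4, A))  # [[4, 4, 4], [8, 8, 8]]   (Example 2.2.10)
```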

2.3 Products of matrices

2.3.1 Definition

The product of matrices Am×k and Bk×n (note that the number of columns of A must be the same as the number of rows of B) is defined by

(AB)ij = ∑p aip bpj, where the sum runs over p = 1, ..., k

We can think of this as taking dot products of the ith row of A with the jth column of B.

2.3.2 Example

Suppose

A =

[ 1 2 2 ]
[ 0 5 1 ]

and

B =

[ 3 ]
[ 1 ]
[ 4 ]

A has three columns and B has three rows, so the product of A and B is defined. A has two rows and B has one column, so the product will have two rows and one column.

(AB)11 = ∑p a1p bp1 = 1 · 3 + 2 · 1 + 2 · 4 = 13

(AB)21 = ∑p a2p bp1 = 0 · 3 + 5 · 1 + 1 · 4 = 9

Then

AB =

[ 13 ]
[  9 ]
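The product formula translates directly into code; a sketch reproducing Example 2.3.2:

```python
# (AB)_ij = sum over p of a_ip * b_pj, as in 2.3.1.
def matmul(A, B):
    assert len(A[0]) == len(B)  # columns of A must match rows of B
    return [[sum(A[i][p] * B[p][j] for p in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2, 2], [0, 5, 1]]
B = [[3], [1], [4]]
print(matmul(A, B))  # [[13], [9]]
```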


2.3.3 Lemma

If A and B are matrices and the product AB is defined, then the jth column of AB is given by

A[jth col of B]

2.3.4 Lemma

If A and B are matrices and the product AB is defined then the ith row of AB is given by

[ith row of A]B

2.3.5 Multiplication is not commutative

Given two matrices A and B, we may have that AB is defined and BA is not defined. If both products are defined, there is no requirement that BA = AB.

2.3.6 Exercise

Given

A =

[ −1 0 ]
[  2 3 ]

B =

[ 1 2 ]
[ 3 0 ]

Show that AB ≠ BA.

2.4 Transpose of a matrix

2.4.1 Definition

If A is a matrix

A =

[ a11 a12 · · · a1n ]
[ a21 a22 · · · a2n ]
[  ⋮   ⋮        ⋮  ]
[ am1 am2 · · · amn ]

then the transpose At of A is


At =

[ a11 a21 · · · am1 ]
[ a12 a22 · · · am2 ]
[  ⋮   ⋮        ⋮  ]
[ a1n a2n · · · amn ]

or

(At)ij = (A)ji = aji

2.4.2 Example

Consider the matrix

A =

[ 1 2 2 ]
[ 0 5 1 ]

The transpose At of A is

At =

[ 1 0 ]
[ 2 5 ]
[ 2 1 ]

2.4.3 Transpose of a transpose

If A is a matrix then

(At)t = A

2.4.4 Transpose of a sum

If A and B are matrices of the same size then

(A+B)t = At +Bt

2.4.5 Transpose of a scalar multiple

If A is a matrix and c is a real number then

(cA)t = cAt


2.4.6 Transpose of a product

If A and B are matrices such that the product AB is defined then

(AB)t = BtAt
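The transpose rules in 2.4.3–2.4.6 can be spot-checked numerically; a small sketch (helper names illustrative):

```python
# T(A) is the transpose; matmul is the product from 2.3.1.
def T(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[1, 2, 2], [0, 5, 1]]
B = [[3, 1], [1, 0], [4, 2]]

assert T(T(A)) == A                           # (A^t)^t = A
assert T(matmul(A, B)) == matmul(T(B), T(A))  # (AB)^t = B^t A^t
print("transpose rules hold on this example")
```

Note the order reversal in the product rule: the check fails if we use matmul(T(A), T(B)), which is not even defined for these shapes.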

2.5 Trace of a square matrix

2.5.1 Definition

If A is a square matrix

A =

[ a11 a12 · · · a1n ]
[ a21 a22 · · · a2n ]
[  ⋮   ⋮        ⋮  ]
[ an1 an2 · · · ann ]

the trace tr(A) of A is the sum

tr A = ∑i aii

2.5.2 Example

A =

[ 1 0 2 ]
[ 2 5 7 ]
[ 2 1 4 ]

tr A = 1 + 5 + 4 = 10.

2.6 Some properties of matrices

Suppose that in this section A, B, and C are matrices and a, b, and c are real numbers. Suppose that the operations discussed in this section are defined.

2.6.1 Addition is commutative

A+B = B + A

2.6.2 Addition is associative

(A+B) + C = A+ (B + C)


2.6.3 Multiplication is associative

(AB)C = A(BC)

2.6.4 Distribution

A(B + C) = AB + AC

(B + C)A = BA+ CA

2.6.5 Scalar multiplication

a(B + C) = aB + aC

a(bC) = (ab)C

a(BC) = (aB)C = B(aC)

2.7 Proofs of the properties

In this section the elements of A are denoted aij, the elements of B are denoted bij, and the elements of C are denoted cij.

2.7.1 Proof

A+B = B + A

Proof.

(A+B)ij = aij + bij

= bij + aij

= (B + A)ij

2.7.2 Proof

Am×n(Bn×p + Cn×p) = Am×nBn×p + Am×nCn×p

Proof.


(A[B + C])ij = ∑k Aik (B + C)kj

             = ∑k (Aik Bkj + Aik Ckj)

             = ∑k Aik Bkj + ∑k Aik Ckj

             = (AB)ij + (AC)ij

where each sum runs over k = 1, ..., n.

2.7.3 Exercise

Using the previous two proofs as a model, give the proofs for the other properties presented in the previous section.

2.8 Identity matrix

2.8.1 Definition

In×n is the square matrix with all zeros except for ones on the main diagonal.

2.8.2 Example

I =

[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]

2.8.3 Property

If An×n is a square matrix then

In×nAn×n = An×nIn×n = An×n

2.8.4 Exercise

I =

[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]

A =

[ 2 3 5 ]
[ 7 1 0 ]
[ 4 9 1 ]

Confirm that IA = AI = A.


2.9 Elementary matrices

2.9.1 Definition

Any matrix formed from the identity matrix by an elementary row operation is called an elementary matrix.

2.9.2 Example

From the identity matrix

I =

[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]

we can form the elementary matrix E by multiplying the second row by 3:

E =

[ 1 0 0 ]
[ 0 3 0 ]
[ 0 0 1 ]

Now if

A =

[ 2 3 5 ]
[ 7 1 0 ]
[ 4 9 1 ]

then EA is

EA =

[  2 3 5 ]
[ 21 3 0 ]
[  4 9 1 ]

E multiplies the second row of A by 3.

2.9.3 Example

From the identity matrix

I =

[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]

we can form the elementary matrix E by interchanging the first and second rows:

E =

[ 0 1 0 ]
[ 1 0 0 ]
[ 0 0 1 ]

Now if


A =

[ 2 3 5 ]
[ 7 1 0 ]
[ 4 9 1 ]

then EA is

EA =

[ 7 1 0 ]
[ 2 3 5 ]
[ 4 9 1 ]

E exchanges the first and second rows of A.

2.9.4 Example

From the identity matrix

I =

[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]

we can form the elementary matrix E by multiplying the first row by −2 and adding to the third row:

E =

[  1 0 0 ]
[  0 1 0 ]
[ −2 0 1 ]

Now if

A =

[ 2 3 5 ]
[ 7 1 0 ]
[ 4 9 1 ]

then EA is

EA =

[ 2 3  5 ]
[ 7 1  0 ]
[ 0 3 −9 ]

E multiplies the first row of A by −2 and adds it to the third row of A, replacing the third row.
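The three examples can be reproduced by building E from I and multiplying; a sketch (helper names illustrative, `matmul` as in 2.3):

```python
# Elementary matrices act by left multiplication: EA performs on A the
# row operation that produced E from the identity.
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[2, 3, 5], [7, 1, 0], [4, 9, 1]]

E = identity(3)
E[1][1] = 3                  # scale row 2 by 3 (Example 2.9.2)
print(matmul(E, A))          # [[2, 3, 5], [21, 3, 0], [4, 9, 1]]

E = identity(3)
E[2][0] = -2                 # add -2 * row 1 to row 3 (Example 2.9.4)
print(matmul(E, A))          # [[2, 3, 5], [7, 1, 0], [0, 3, -9]]
```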

2.10 Exercises

2.10.1 Exercise

Solve the system of linear equations or show that it does not have a solution.


2x1 + x2 = 7

x1 − x2 = 5

2.10.2 Exercise

Solve the system of linear equations or show that it does not have a solution.

2x1 − 5x2 + 4x3 = 8

2x1 + 2x3 = 4

− x1 − 2x2 + x3 = 2

2.10.3 Exercise

Solve the system of linear equations or show that it does not have a solution.

2x1 − 5x2 + 4x3 = 8

2x1 + 2x3 = 4

2.10.4 Exercise

If

A =

[ 2 1 0 ]
[ 0 1 5 ]

then 3A = ?

2.10.5 Exercise

If

A =

[ 2 1 0 ]
[ 0 1 5 ]

B =

[ 1 1 1 ]
[ 0 1 1 ]
[ 0 0 1 ]

then AB =


2.10.6 Exercise

If

A =

[ 1 1 1 ]
[ 0 1 1 ]
[ 0 0 1 ]

then tr A =

2.10.7 Exercise

Prove the properties of matrices given in this chapter.

2.10.8 Exercise

What does an elementary matrix do when it multiplies (from the left) a matrix A? Can you prove this?

2.10.9 Exercise

Consider the system of linear equations

x+ y + z = 4

x− y + z = 2

2x− z = 0

Write the augmented matrix for this problem. What elementary matrices would be used to reduce the augmented matrix to reduced row echelon form?


Chapter 3

Inverse of a matrix

3.1 Inverse of a matrix

3.1.1 Definition

If An×n is a square matrix, then its inverse A−1 (also n × n, if such a matrix exists) is the matrix so that

AA−1 = A−1A = In×n

3.1.2 Example

Consider the matrix

A =

[ 3 1 ]
[ 1 1 ]

Then A has an inverse and

A−1 = (1/2) · [  1 −1 ]
              [ −1  3 ]

We can confirm that

AA−1 = [ 1 0 ] = A−1A
       [ 0 1 ]

3.1.3 Invertible

A matrix A is said to be invertible if it has an inverse A−1.

3.2 Inverse of a two by two matrix

3.2.1 Inverses don’t always exist

As an example of a matrix with no inverse, consider the zero matrix


[ 0 0 ]
[ 0 0 ]

No matter what you multiply this matrix by, you always get the zero matrix back, so you could never get the identity matrix.

3.2.2 Inverse of a two by two matrix

Suppose that

A =

[ a b ]
[ c d ]

is such that ad − bc ≠ 0. Then

A−1 = 1/(ad − bc) · [  d −b ]
                    [ −c  a ]
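The formula is easy to turn into a small function; a sketch using exact `Fraction` arithmetic (function name illustrative, not from the notes):

```python
from fractions import Fraction

def inv2(a, b, c, d):
    """Inverse of the 2x2 matrix [[a, b], [c, d]], assuming ad - bc != 0."""
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

# Matches Example 3.1.2: the inverse of [[3, 1], [1, 1]] is (1/2)*[[1, -1], [-1, 3]].
A_inv = inv2(3, 1, 1, 1)
assert A_inv == [[Fraction(1, 2), Fraction(-1, 2)],
                 [Fraction(-1, 2), Fraction(3, 2)]]
```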

3.2.3 Proof

The result can be proven by direct calculation:

AA−1 = [ a b ] · 1/(ad − bc) · [  d −b ]
       [ c d ]                 [ −c  a ]

     = 1/(ad − bc) · [ a b ] [  d −b ]
                     [ c d ] [ −c  a ]

     = 1/(ad − bc) · [ ad − bc     0    ]
                     [    0     ad − bc ]

     = [ 1 0 ]
       [ 0 1 ]

and

A−1A = 1/(ad − bc) · [  d −b ] [ a b ]
                     [ −c  a ] [ c d ]

     = 1/(ad − bc) · [ ad − bc     0    ]
                     [    0     ad − bc ]

     = [ 1 0 ]
       [ 0 1 ]


3.2.4 Example

The inverse of

A =

[ 1 4 ]
[ 8 5 ]

is

A−1 = (−1/27) · [  5 −4 ]
                [ −8  1 ]

3.3 Inverse of the inverse

3.3.1 The inverse of the inverse is the original matrix

If A is an invertible matrix then

(A−1)−1 = A

3.3.2 Proof

Follows directly from the definition of the inverse.

3.4 Inverse of a product of invertible matrices

3.4.1 Inverse of a product of invertible matrices

Suppose that A and B are both invertible. Then

(AB)−1 = B−1A−1

3.4.2 Proof

(AB)(B−1A−1) = A(BB−1)A−1 = AIA−1 = AA−1 = I

(B−1A−1)(AB) = B−1(A−1A)B = B−1IB = B−1B = I

3.4.3 Extension

If A1 , A2 , ... , An are invertible matrices then what is the inverse of A1A2 · · ·An?


3.5 Powers of matrices

3.5.1 Definition

If A is a square matrix (invertible or otherwise) then

A0 ≡ I

A1 = A

A2 = AA

An = A · A · · · A  (n times)

3.5.2 Inverse of a power of a matrix

Suppose that A is invertible. Then

(An)−1 = (A−1)n

3.5.3 Proof

By induction.

3.5.4 Notation

For an invertible matrix A and n a positive integer

A−n ≡ (A−1)n

3.5.5 Exponent rules

If A is an invertible square matrix and r, s ∈ Z

ArAs = Ar+s

(Ar)s = Ars

3.5.6 Proof

By induction.


3.5.7 Polynomials of matrices

If A is a square matrix and p is a polynomial function defined by

p(x) = a0 + a1x + a2x² + · · · + anxⁿ

then

p(A) = a0I + a1A + a2A² + · · · + anAⁿ

3.5.8 Example

If

A =

[  0 1 ]
[ −1 0 ]

and

f(x) = x³ + x² + x

then

f(A) = −I2×2
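Example 3.5.8 can be verified by evaluating the polynomial at the matrix term by term; a sketch (helper names illustrative, `matmul` as in Chapter 2):

```python
# p(A) = a0*I + a1*A + a2*A^2 + ..., as in 3.5.7.
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def poly_at(coeffs, A):
    """Evaluate p(A) for p(x) = coeffs[0] + coeffs[1]*x + ... at a square A."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # A^0 = I
    for c in coeffs:
        result = [[r + c * p for r, p in zip(rr, rp)]
                  for rr, rp in zip(result, power)]
        power = matmul(power, A)
    return result

A = [[0, 1], [-1, 0]]
print(poly_at([0, 1, 1, 1], A))  # [[-1, 0], [0, -1]], i.e. -I
```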

3.6 Inverse of a transpose

3.6.1 Inverse of a transpose of an invertible matrix

Suppose that A is an invertible matrix. Then At is also invertible and

(At)−1 = (A−1)t

3.6.2 Proof

(A−1)tAt = (AA−1)t = It = I

At(A−1)t = (A−1A)t = It = I

3.7 Inverses of elementary matrices

3.7.1 Inverse of scaling elementary matrix

One elementary row operation is to multiply the ith row by a real number c ≠ 0. The elementary matrix for this operation is found by multiplying the ith row of the identity matrix by c. The inverse of this elementary matrix is found by multiplying the ith row of the identity matrix by 1/c.

This can be confirmed by directly computing the product of the two matrices.

3.7.2 Inverse of an interchanging rows elementary matrix

One elementary row operation is to interchange the ith row and the jth row of a matrix. The elementary matrix for this operation is found by interchanging the ith row and jth row of the identity matrix. This matrix is its own inverse.

This can be confirmed by directly computing the product of the matrix with itself.

3.7.3 Inverse of an add-a-multiple-of-a-row-to-a-row elementary matrix

One elementary row operation is to multiply the ith row of a matrix by c ≠ 0 and add it to the jth row. The elementary matrix for this operation is found by doing the same operation to the identity matrix.

The inverse of this matrix is found by multiplying the ith row of the identity matrix by −c and adding it to the jth row of the identity matrix.

This can be confirmed by directly computing the product of the two matrices.

3.7.4 All elementary matrices are invertible

We see that all elementary matrices are invertible and that their inverses are also elementary matrices.

3.8 Inverses and solutions of systems

3.8.1 Example

Consider the system of equations

x1 + x2 = 3

x1 − x2 = −1

If we let

A =

[ 1  1 ]
[ 1 −1 ]

Then we can write


A [ x1 ]   [  3 ]
  [ x2 ] = [ −1 ]

The inverse of A is

A−1 = (−1/2) · [ −1 −1 ]   [ 1/2  1/2 ]
               [ −1  1 ] = [ 1/2 −1/2 ]

Then we can multiply from the left on both sides to get

A−1A [ x1 ]       [  3 ]
     [ x2 ] = A−1 [ −1 ]

And so

[ x1 ]   [ 1/2  1/2 ] [  3 ]   [ 1 ]
[ x2 ] = [ 1/2 −1/2 ] [ −1 ] = [ 2 ]

3.8.2 Generalizing

If An×n is an invertible square matrix, xn×1 is a column matrix of unknowns x1, x2, ..., xn, and bn×1 is a column of real numbers, then the equation

Ax = b

has the unique solution

x = A−1b

3.8.3 Proof

Multiply by A−1 on both sides.

3.9 Fundamental theorem of linear algebra

3.9.1 Name

It’s not actually called that. But it’s important enough that it should be. We will see at different times in this course quite a few statements that are all equivalent and are extremely important in linear algebra.

3.9.2 FTLA

The following are equivalent:

1) An×n is invertible.


2) The equation Ax = 0 has only the trivial solution (all entries of x are zero).

3) The reduced row echelon form of A is the identity matrix I.

4) A is a product of elementary matrices.

3.9.3 Example

We know that the equation

[ 1 2 ] [ x1 ]   [ 0 ]
[ 0 5 ] [ x2 ] = [ 0 ]

has only the trivial solution

[ x1 ]   [ 0 ]
[ x2 ] = [ 0 ]

because the matrix on the left of the equation is invertible.

3.10 Calculating the inverse by hand

3.10.1 Method

Suppose An×n is a square matrix and we want to find the inverse. We set up a matrix

(A|In×n)

and then we perform elementary row operations until we get

( I | A−1 )

If we can’t get I on the left, it means that the matrix A was not invertible.

3.10.2 Example

If we want to invert

A =

[ 1 1 1 ]
[ 0 1 1 ]
[ 0 0 1 ]

we start with the matrix

[ 1 1 1 | 1 0 0 ]
[ 0 1 1 | 0 1 0 ]
[ 0 0 1 | 0 0 1 ]

and perform elementary row operations until we get


[ 1 0 0 | 1 −1  0 ]
[ 0 1 0 | 0  1 −1 ]
[ 0 0 1 | 0  0  1 ]

and so the inverse is

A−1 =

[ 1 −1  0 ]
[ 0  1 −1 ]
[ 0  0  1 ]
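The (A | I) method is mechanical enough to code; an illustrative sketch with exact `Fraction` arithmetic that reproduces the example above:

```python
from fractions import Fraction

def inverse(A):
    """Invert A by row-reducing the augmented matrix (A | I).
    Raises ValueError when no pivot can be found, i.e. A is not invertible."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        pivot = next((i for i in range(c, n) if M[i][c] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")
        M[c], M[pivot] = M[pivot], M[c]      # interchange rows
        M[c] = [x / M[c][c] for x in M[c]]   # scale the pivot to 1
        for i in range(n):                   # clear the rest of the column
            if i != c and M[i][c] != 0:
                M[i] = [x - M[i][c] * y for x, y in zip(M[i], M[c])]
    return [row[n:] for row in M]            # right half is A^-1

inv = inverse([[1, 1, 1], [0, 1, 1], [0, 0, 1]])
print([[int(x) for x in row] for row in inv])
# [[1, -1, 0], [0, 1, -1], [0, 0, 1]]
```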

3.11 Proof of the FTLA

3.11.1 Theorem

The following are equivalent:

a) An×n is invertible

b) The equation Ax = 0 has only the trivial solution x = 0.

c) The reduced row echelon form of A is the identity matrix I.

d) A is a product of elementary matrices.

3.11.2 Proof a) implies b)

If A is invertible then

Ax = 0 ⇒
A−1(Ax) = A−10 ⇒
(A−1A)x = 0 ⇒
Ix = 0 ⇒
x = 0

3.11.3 Proof b) implies c)

Suppose that the equation Ax = 0 has only the solution x = 0. Then the augmented matrix in reduced row echelon form looks like


[ 1 0 0 · · · 0 | 0 ]
[ 0 1 0 · · · 0 | 0 ]
[ 0 0 1 · · · 0 | 0 ]
[        ⋱        ⋮ ]
[ 0 0 0 · · · 1 | 0 ]

and so the rref of A is I.

3.11.4 Proof c) implies d)

If the rref of A is I then A can be transformed to I by elementary row operations. Since the elementary row operations can be done by elementary matrices, we have

EnEn−1 · · · E2E1A = I

for some elementary matrices E1, E2, ..., En. Then inverting gives

A = E1⁻¹E2⁻¹ · · · En⁻¹

3.11.5 Proof d) implies a)

If we can show that d) implies a) then we have completed the proof.

If A is a product of elementary matrices, then A is a product of invertible matrices, since all elementary matrices are invertible. The product of invertible matrices is invertible, so A is invertible.

3.12 Even more FTLA

3.12.1 Theorem

The following are equivalent:

a) An×n is invertible

b) The equation Ax = 0 has only the trivial solution x = 0.

c) The reduced row echelon form of A is the identity matrix I.

d) A is a product of elementary matrices.

NEW

e) Ax = b is consistent for all n× 1 matrices b


f) Ax = b has exactly one solution for all n× 1 matrices b

3.12.2 Proof so far

We have a) implies b) implies c) implies d) implies a).

3.12.3 e) implies a)

Consider

e1 =

[ 1 ]
[ 0 ]
[ ⋮ ]
[ 0 ]

There is some column matrix v1 so that

Av1 = e1

Consider

e2 =

[ 0 ]
[ 1 ]
[ ⋮ ]
[ 0 ]

There is some column matrix v2 so that

Av2 = e2

If ej is the column vector of n rows which has jth entry 1 and all other entries zero, then there are column matrices vj so that

Avj = ej

for j = 1, 2, ..., n. If we collect the vj as the columns of a square matrix and the ej as the columns of the identity matrix, then we can rewrite the result as

A ( v1 v2 · · · vn ) = ( e1 e2 · · · en )

A ( v1 v2 · · · vn ) = I

and so

and so


A−1 = ( v1 v2 · · · vn )

3.12.4 Proof a) implies f)

Exercise for the student.

3.13 Exercises

3.13.1 Exercise

Consider the matrix

A =

[ 1 1 1 ]
[ 1 0 1 ]
[ 0 1 1 ]

Find the inverse of this matrix.

3.13.2 Exercise

Consider the system of linear equations

x1 + x2 + x3 = 3

x1 + x3 = 4

x2 + x3 = 5

What is the solution of this system?

3.13.3 Exercise

We proved that for invertible matrices A1 and A2 of the same size that

(A1A2)⁻¹ = A2⁻¹A1⁻¹

Given this fact show that

(A1A2A3)⁻¹ = A3⁻¹A2⁻¹A1⁻¹

Assume that for some positive integer k and any k invertible matrices of the same size that

(A1A2A3 · · · Ak−1Ak)⁻¹ = Ak⁻¹Ak−1⁻¹ · · · A2⁻¹A1⁻¹

Prove that for another invertible matrix Ak+1 of the same size that

Prove that for another invertible matrix Ak+1 of the same size that


(A1A2A3 · · · Ak−1AkAk+1)⁻¹ = Ak+1⁻¹Ak⁻¹Ak−1⁻¹ · · · A2⁻¹A1⁻¹

What can be concluded by mathematical induction?

3.13.4 Exercise

Consider the matrix

A =

[ 0 1 1 ]
[ 0 0 1 ]
[ 0 0 0 ]

Show that A³ = 0.

3.13.5 Exercise

Suppose that a square matrix A is such that A⁴ = 0. Prove that A does not have an inverse. Hint: assume that it does have an inverse and show that this assumption leads to a contradiction.

3.13.6 Exercise

Consider the matrix

A =

[ 1 1 1 ]
[ 1 0 1 ]
[ 0 1 1 ]

Write A as the product of elementary matrices. What does the FTLA tell you?

3.13.7 Exercise

A square matrix is said to be diagonal if all the entries of the matrix not on the main diagonal are zero. The entries on the main diagonal may or may not be zero.

Show that if a diagonal matrix has no zero entries on the main diagonal, then the diagonal matrix is a product of elementary matrices. Find the inverse of such a matrix.

3.13.8 Exercise

What is the formula for a positive integer power of a diagonal matrix?


3.13.9 Exercise

Suppose that A is a square matrix. Consider the two equations

Ax = b, Ax = c

Is it possible for the first equation to have exactly one solution and for the second equation to have more than one solution?

3.13.10 Exercise

Suppose that there is a polynomial function p such that

p(x) = a0 + a1x + a2x²

You are given that

p(0) = 1, p(1) = 3, p(−1) = 1

What is the polynomial function p?

3.13.11 Exercise

Consider the system of linear equations

x1 + x2 = 0

cx1 + x2 = 0

where c is some real number. This system obviously has the trivial solution. For what values of c does this system have only the trivial solution?

3.13.12 Exercise

Is this matrix A invertible?

[ 4 0 0 ]
[ 0 3 0 ]
[ 0 0 0 ]

3.13.13 Exercise

Go through the proof of the FTLA carefully.


Chapter 4

Some types of matrices

4.1 Diagonal matrices

4.1.1 Examples

The following square matrices are examples of diagonal matrices

I =

[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]

A =

[ 1 0 0 ]
[ 0 5 0 ]
[ 0 0 7 ]

B =

[ 0 0 0 ]
[ 0 1 0 ]
[ 0 0 0 ]

4.1.2 Definition

A square matrix which has the property that all entries not on the main diagonal are zero is said to be diagonal. The entries on the main diagonal can be zero or nonzero.

D =

[ d1  0   0  · · ·  0 ]
[  0 d2   0  · · ·  0 ]
[  0  0  d3  · · ·  0 ]
[  ⋮            ⋱   ⋮ ]
[  0  0  · · · · · dn ]


4.2 Diagonal and elementary matrices

4.2.1 Elementary matrices that scale rows

Let’s use 3x3 examples:

E1(d1) =

[ d1 0 0 ]
[  0 1 0 ]
[  0 0 1 ]

This elementary matrix performs the elementary row operation of multiplying the first row of a matrix by d1.

The inverse if d1 ≠ 0 is

E1(1/d1) =

[ 1/d1 0 0 ]
[  0   1 0 ]
[  0   0 1 ]

This elementary matrix performs the elementary row operation of multiplying the first row of a matrix by 1/d1.

Now consider

E2(d2) =

[ 1  0 0 ]
[ 0 d2 0 ]
[ 0  0 1 ]

This elementary matrix performs the elementary row operation of multiplying the second row of a matrix by d2.

The inverse if d2 ≠ 0 is

E2(1/d2) =

[ 1  0   0 ]
[ 0 1/d2 0 ]
[ 0  0   1 ]

This elementary matrix performs the elementary row operation of multiplying the second row of a matrix by 1/d2.

Now consider

E3(d3) =
[ 1 0 0  ]
[ 0 1 0  ]
[ 0 0 d3 ]

This elementary matrix performs the elementary row operation of multiplying the third row of a matrix by d3.

The inverse if d3 ≠ 0 is


E3(1/d3) =
[ 1 0 0    ]
[ 0 1 0    ]
[ 0 0 1/d3 ]

This elementary matrix performs the elementary row operation of multiplying the third row of a matrix by 1/d3.

4.2.2 3x3 diagonal matrix

Then we can get

D(d1, d2, d3) =
[ d1 0  0  ]
[ 0  d2 0  ]
[ 0  0  d3 ]
= E1(d1) E2(d2) E3(d3)
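This factorization is easy to check numerically. The sketch below (numpy; the helper name `scaling_matrix` is ours, not from the notes) builds three row-scaling elementary matrices and multiplies them:

```python
import numpy as np

def scaling_matrix(n, i, d):
    # Elementary matrix that multiplies row i (0-based) of an n x n matrix by d.
    E = np.eye(n)
    E[i, i] = d
    return E

E1, E2, E3 = (scaling_matrix(3, i, d) for i, d in enumerate((2.0, 5.0, 7.0)))
D = E1 @ E2 @ E3  # product of the three row-scaling elementary matrices
assert np.array_equal(D, np.diag([2.0, 5.0, 7.0]))
```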

4.2.3 General result

Exercise.

4.3 Inverses of diagonal matrices

4.3.1 Inverses

If none of the di are zero then

D−1 =
[ 1/d1 0    0    ··· 0    ]
[ 0    1/d2 0    ··· 0    ]
[ 0    0    1/d3 ··· 0    ]
[ ...                 ... ]
[ 0    0    ···     1/dn  ]

4.3.2 What if one of the di is zero?

Then the matrix is not invertible.

4.3.3 Example

If

A =
[ 1 0 0 ]
[ 0 5 0 ]
[ 0 0 7 ]

then


A−1 =
[ 1 0   0   ]
[ 0 1/5 0   ]
[ 0 0   1/7 ]
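A quick numerical check of this rule, as a numpy sketch (ours, not part of the notes):

```python
import numpy as np

A = np.diag([1.0, 5.0, 7.0])
A_inv = np.diag(1.0 / np.diag(A))  # invert a diagonal matrix entrywise
assert np.allclose(A @ A_inv, np.eye(3))
assert np.allclose(A_inv, np.linalg.inv(A))
```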

4.3.4 Product of elementary matrices

If a diagonal matrix has no zero entries on the main diagonal then it is the product of elementary matrices.

4.3.5 Proof

Exercise. Hint: Consider performing a set of row operations on the identity matrix that will eventually give a diagonal matrix with no zeros on the main diagonal.

4.3.6 Inverses

A diagonal matrix with no zeros on the main diagonal is invertible.

4.3.7 Proof

Previous result plus the FTLA.

4.4 Powers of diagonal matrices

4.4.1 Recall

When the diagonal matrix D(d1, d2, ..., dn) multiplies another matrix A from the left it multiplies the ith row of A by di.

4.4.2 Repeated multiplication by a diagonal matrix

If we apply D twice the ith row of A will be multiplied by di twice, that is, by di^2. If we apply D to A a total of k times then the ith row of A will be multiplied by di^k. Then

(D(d1, d2, . . . , dn))^k = D(d1^k, d2^k, . . . , dn^k)
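This rule can be checked numerically; a numpy sketch (ours, not from the notes):

```python
import numpy as np

D = np.diag([1.0, 2.0, 4.0])
k = 5
# Raising D to the k-th power agrees with raising each diagonal entry to the k-th power.
assert np.allclose(np.linalg.matrix_power(D, k), np.diag(np.diag(D) ** k))
```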

4.5 Multiplying from the right

4.5.1 Question

We said that if D is a diagonal matrix which multiplies A from the left, then it scales the rows of A. What if instead of DA we had AD?


4.5.2 Example

Consider

D =
[ 1 0 0 ]
[ 0 2 0 ]
[ 0 0 3 ]

and

A =
[ 1 0 1 ]
[ 1 1 0 ]
[ 1 0 1 ]

From what we have seen before, we can write

DA =
[ 1 0 1 ]
[ 2 2 0 ]
[ 3 0 3 ]

without even doing full matrix multiplication. We know that the rows of A will be scaled. Now consider that

AD =
[ 1 0 1 ]   [ 1 0 0 ]   [ 1 0 3 ]
[ 1 1 0 ] · [ 0 2 0 ] = [ 1 2 0 ]
[ 1 0 1 ]   [ 0 0 3 ]   [ 1 0 3 ]
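The row/column scaling pattern of the example can be verified directly; a numpy sketch (ours, not part of the notes):

```python
import numpy as np

D = np.diag([1.0, 2.0, 3.0])
A = np.array([[1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0]])
d = np.diag(D)
assert np.allclose(D @ A, A * d[:, None])  # left multiplication scales row i by d_i
assert np.allclose(A @ D, A * d[None, :])  # right multiplication scales column j by d_j
```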

4.5.3 Exercise

Show that multiplication by a diagonal matrix from the right scales columns.

4.6 Importance of diagonal matrices

It is very easy to find the inverses and powers of a diagonal matrix. We will see later in the course that this simplifies many calculations.

4.7 Symmetric matrices

4.7.1 Definition

A square matrix A is said to be symmetric if At = A.

4.7.2 Alternatively

If its entries aij are such that

aij = aji


4.7.3 Example

The identity matrix is symmetric.

4.7.4 Example

A =
[ 1 2 3  ]
[ 2 0 4  ]
[ 3 4 −7 ]

is a symmetric matrix.

4.7.5 Transpose of a symmetric matrix

If A is symmetric then At is symmetric.

4.7.6 Proof

If A is symmetric then At = A. So

(At)t = A = At

so At is symmetric.

4.7.7 Sum of symmetric matrices

If A and B are symmetric matrices then their sum A+B is symmetric.

4.7.8 Proof

A is symmetric so At = A. B is symmetric so Bt = B. Then (A + B)t = At + Bt by the rules for transposes. Then (A+B)t = A+B.

4.7.9 Difference of symmetric matrices

If A and B are symmetric matrices then their difference A−B is symmetric.

4.7.10 Proof

Exercise.

4.7.11 Scalar multiple of a symmetric matrix

If A is a symmetric matrix and k is a real number then the matrix kA is symmetric.

4.7.12 Proof

If the elements aij of A have the property that aji = aij then the elements bij = kaij have the property that bji = kaji = kaij = bij. Thus kA is symmetric.


4.7.13 Inverse of a symmetric matrix

If A is a symmetric matrix and is invertible then its inverse is also symmetric.

4.7.14 Proof

Suppose that A is symmetric. That is, At = A. Suppose that A is invertible. Then

(A−1)tAt = (AA−1)t = It = I

and

At(A−1)t = (A−1A)t = It = I

But as A = At

(A−1)tA = I

and

A(A−1)t = I

And so

(A−1)t = A−1

4.7.15 Product of a matrix and its transpose

Suppose that A is a general m×n matrix, so that At is n×m. Then their product AtA is symmetric.

4.7.16 Proof

(AtA)t = At(At)t = AtA
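A numerical spot check of this fact, as a numpy sketch (ours, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))  # a general non-square matrix
S = A.T @ A                      # the 3 x 3 product AtA
assert np.allclose(S, S.T)       # S equals its own transpose, i.e. S is symmetric
```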

4.8 Triangular matrices

4.8.1 Upper triangular matrices

A square matrix is upper triangular if all the entries below the main diagonal are zero. Entries on the main diagonal may be zero or not.

That is, A is upper triangular if its entries aij are such that aij = 0 whenever i > j.


4.8.2 Example

A =
[ 0 1 2  ]
[ 0 5 1  ]
[ 0 0 12 ]

is an example of an upper triangular matrix.

4.8.3 Lower triangular matrices

A square matrix is lower triangular if all the entries above the main diagonal are zero. Entries on the main diagonal may be zero or not.

That is, A is lower triangular if its entries aij are such that aij = 0 whenever i < j.

4.8.4 Example

A =
[ 1 0 0 ]
[ 2 3 0 ]
[ 4 5 6 ]

is an example of a lower triangular matrix.

4.8.5 Identity matrix

The identity matrix is both upper triangular and lower triangular. In fact, diagonal matrices are both lower triangular and upper triangular.

4.9 Properties of triangular matrices

4.9.1 Transpose of an upper triangular matrix

The transpose of an upper triangular matrix is lower triangular.

4.9.2 Proof

A is upper triangular if its entries aij are such that aij = 0 whenever i > j. Let the entries of At be called bij. Then bij = aji and are such that bij is zero whenever j > i. Thus At is lower triangular.

4.9.3 Transpose of a lower triangular matrix

The transpose of a lower triangular matrix is upper triangular.

4.9.4 Proof

Exercise.


4.9.5 The sum of upper triangular matrices

If A and B are upper triangular matrices of the same size then their sum is upper triangular.

4.9.6 Proof

A is upper triangular if its entries aij are such that aij = 0 whenever i > j. B is upper triangular if its entries bij are such that bij = 0 whenever i > j. The sum C = A + B has entries cij = aij + bij. If i > j then both aij and bij are zero and so cij = 0 + 0 = 0. Then C is upper triangular.

4.9.7 The sum of lower triangular matrices

If A and B are lower triangular matrices of the same size then their sum is lower triangular.

4.9.8 Proof

Exercise.

4.9.9 The difference of upper triangular matrices

If A and B are upper triangular matrices of the same size then their difference A − B is upper triangular.

4.9.10 Proof

Exercise.

4.9.11 The difference of lower triangular matrices

If A and B are lower triangular matrices of the same size then their difference A − B is lower triangular.

4.9.12 Proof

Exercise.

4.9.13 Invertibility of upper triangular matrices

This depends on the main diagonal.

If an upper triangular matrix has no zeros on the main diagonal then it is invertible.


4.9.14 Proof

We can use elementary matrices to scale the pivots of each row to 1. We can then use elementary matrices to produce zeros above the pivots. Thus the rref of the matrix is the identity matrix and then the matrix is invertible by the FTLA.

4.9.15 Invertibility of lower triangular matrices

A lower triangular matrix is invertible if and only if it has no zeros on the main diagonal.

4.9.16 Proof

Exercise.

4.9.17 Example

The matrix

A =
[ 1 1 1 ]
[ 0 2 2 ]
[ 0 0 3 ]

is invertible and the matrix

B =
[ 1 1 1 4 ]
[ 0 0 2 5 ]
[ 0 0 3 7 ]
[ 0 0 0 1 ]

is not invertible.
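For a triangular matrix the determinant (taken up in the next chapter) is the product of the diagonal entries, so the invertibility test reduces to checking the diagonal. A numpy sketch of the check (ours, not from the notes):

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0],
              [0.0, 2.0, 2.0],
              [0.0, 0.0, 3.0]])
# For a triangular matrix the determinant is the product of the diagonal entries.
assert np.isclose(np.linalg.det(A), np.prod(np.diag(A)))
assert np.prod(np.diag(A)) != 0  # no zero on the diagonal, so A is invertible
```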

4.9.18 Form of inverses of triangular matrices

The inverse of an invertible upper triangular matrix is upper triangular and the inverse of an invertible lower triangular matrix is lower triangular.

4.9.19 The upper triangular case

Suppose that Un×n is an upper triangular matrix that is invertible. We consider how to find the inverse using

(U | I)

The matrix I is upper triangular. Every operation that we will perform will not change the zeros below the main diagonal and so the resulting matrix will be upper triangular.


4.9.20 The lower triangular case

Exercise.

4.10 LU decomposition

4.10.1 Idea

Suppose that A is an invertible matrix and we want to solve

Ax = b

We decompose A as the product of a lower triangular matrix L and an upper triangular matrix U.

A = LU

Now we want to solve

LUx = b

If we write

y = Ux

then we can solve

Ly = b

for y and then solve

Ux = y

for x.

4.10.2 Benefits

In some applications, A will be fixed while b keeps changing. So one factors A only once, at a cost comparable to a single Gaussian elimination, and then reuses the factors to solve two cheap triangular systems for each new b.

4.10.3 Example

Suppose that we have the system

[ 2 −1 3  ] [ x1 ]   [ 1  ]
[ −4 5 0  ] [ x2 ] = [ −2 ]
[ 4  2 18 ] [ x3 ]   [ 0  ]

One LU decomposition could be of the form


L =
[ 1   0   0 ]
[ l21 1   0 ]
[ l31 l32 1 ]

U =
[ u11 u12 u13 ]
[ 0   u22 u23 ]
[ 0   0   u33 ]

Note that when we multiply L and U the first row of the result will be the first row of U, so the first row of U should be the first row of A

L =
[ 1   0   0 ]
[ l21 1   0 ]
[ l31 l32 1 ]

U =
[ 2 −1  3   ]
[ 0 u22 u23 ]
[ 0 0   u33 ]

Consider that l21 · 2 = a21 = −4. Then l21 = −2. So we have

L =
[ 1   0   0 ]
[ −2  1   0 ]
[ l31 l32 1 ]

U =
[ 2 −1  3   ]
[ 0 u22 u23 ]
[ 0 0   u33 ]

Also (−2)(−1) + (1)(u22) = 5. So u22 = 3.

L =
[ 1   0   0 ]
[ −2  1   0 ]
[ l31 l32 1 ]

U =
[ 2 −1 3   ]
[ 0 3  u23 ]
[ 0 0  u33 ]

Also l31(2) = 4 so l31 = 2.

L =
[ 1  0   0 ]
[ −2 1   0 ]
[ 2  l32 1 ]

U =
[ 2 −1 3   ]
[ 0 3  u23 ]
[ 0 0  u33 ]

Also (2)(−1) + l32(3) = 2 so l32 = 4/3.


L =
[ 1  0   0 ]
[ −2 1   0 ]
[ 2  4/3 1 ]

U =
[ 2 −1 3   ]
[ 0 3  u23 ]
[ 0 0  u33 ]

Then (−2)(3) + (1)u23 = 0 so u23 = 6.

L =
[ 1  0   0 ]
[ −2 1   0 ]
[ 2  4/3 1 ]

U =
[ 2 −1 3   ]
[ 0 3  6   ]
[ 0 0  u33 ]

Finally (2)(3) + (4/3)(6) + u33 = 18 so u33 = 4.

L =
[ 1  0   0 ]
[ −2 1   0 ]
[ 2  4/3 1 ]

U =
[ 2 −1 3 ]
[ 0 3  6 ]
[ 0 0  4 ]

It is left to the reader to solve the system.
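One way to do so: check the factorization and carry out the forward and back substitutions. A numpy sketch (ours, not part of the notes), using the L and U found above:

```python
import numpy as np

L = np.array([[1.0, 0.0, 0.0], [-2.0, 1.0, 0.0], [2.0, 4.0 / 3.0, 1.0]])
U = np.array([[2.0, -1.0, 3.0], [0.0, 3.0, 6.0], [0.0, 0.0, 4.0]])
b = np.array([1.0, -2.0, 0.0])
assert np.allclose(L @ U, [[2.0, -1.0, 3.0], [-4.0, 5.0, 0.0], [4.0, 2.0, 18.0]])

y = np.zeros(3)          # forward substitution: solve L y = b
for i in range(3):
    y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]

x = np.zeros(3)          # back substitution: solve U x = y
for i in reversed(range(3)):
    x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]

assert np.allclose(x, [7.0 / 4.0, 1.0, -1.0 / 2.0])
```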

4.10.4 Pivoting

A pivoting matrix P that exchanges rows is often used with a decomposition given by

A = PLU

We don’t want to go too deeply into numerical linear algebra at this point and leave furtherdiscussion to later courses.

4.11 Exercises

4.11.1 Exercise

Find the inverse of the diagonal matrix or show that the inverse does not exist.

D =
[ 1 0 0 ]
[ 0 2 0 ]
[ 0 0 4 ]


D =
[ 1 0 0 ]
[ 0 0 0 ]
[ 0 0 4 ]

4.11.2 Exercise

The exponential function exp : R → R is defined by

exp(x) ≡ 1 + x + x^2/2! + x^3/3! + ··· + x^k/k! + ···

The exponential function for square matrices A is defined by

exp(A) ≡ I + A + A^2/2! + A^3/3! + ··· + A^k/k! + ···

If

D =
[ 1 0 0 ]
[ 0 2 0 ]
[ 0 0 4 ]

then what is exp(D)?

4.11.3 Exercise

A square matrix A is said to be skew-symmetric if At = −A. Prove that a skew-symmetric matrix has all zeros on its main diagonal.

4.11.4 Exercise

Show that a square matrix A can always be written as the sum of a symmetric matrix and a skew-symmetric matrix.

4.11.5 Exercise

Consider the lower triangular matrix

L =
[ 1 0 0 ]
[ 2 1 0 ]
[ 2 2 1 ]

Find the inverse of this matrix.


4.11.6 Exercise

Consider the lower triangular matrix

L =
[ 1 0 0 ]
[ 2 1 0 ]
[ 2 2 1 ]

and the matrix

A =
[ 1 2 3  ]
[ 2 6 7  ]
[ 2 8 11 ]

Find an upper triangular matrix U so that A = LU .


Chapter 5

Determinants

5.1 Idea

The determinant is a measure of what a linear transformation represented by a square matrix does to a unit area. A determinant of two would mean that the area of the output of the transformation is twice the area of the input. The sign represents a change in orientation.

In this course, we will be interested in determinants for what they can tell us about matrices.

5.2 For 2x2 systems

5.2.1 Definition

Given

A2×2 =
[ a b ]
[ c d ]

we define

|A| ≡ ad− bc

or

det(A) ≡ ad− bc

5.2.2 Example

If

A =
[ 1 2 ]
[ 3 4 ]

then


|A| =
| 1 2 |
| 3 4 |
= 1 · 4 − 2 · 3 = −2
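In code the 2x2 determinant is one line; a small Python sketch (ours, with a helper name `det2` that is not from the notes):

```python
def det2(a, b, c, d):
    # Determinant of the 2 x 2 matrix [[a, b], [c, d]].
    return a * d - b * c

assert det2(1, 2, 3, 4) == -2  # the example above
```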

5.3 For general square matrices

What if the matrix is bigger than a 2x2?

5.3.1 Minors

If the square matrix An×n has entries aij let mij be the determinant of the matrix given by crossing out the row and column of aij.

5.3.2 Example

Suppose that we have a matrix

A =
[ 1 2 3 ]
[ 4 5 6 ]
[ 7 8 9 ]

Then

m11 =
| 5 6 |
| 8 9 |
= 5 · 9 − 6 · 8 = −3

m23 =
| 1 2 |
| 7 8 |
= 1 · 8 − 2 · 7 = −6

5.3.3 Cofactors

If A is a square matrix with entries aij and the associated minors are mij then the cofactor cij is given by

cij = (−1)i+jmij

5.3.4 Example

We calculated for

A =
[ 1 2 3 ]
[ 4 5 6 ]
[ 7 8 9 ]

the minors

m11 = − 3


m23 = − 6

The associated cofactors are

c11 = (−1)1+1m11 = (1)(−3) = − 3

c23 = (−1)2+3m23 = (−1)(−6) = 6

5.3.5 Determinants

For a square matrix An×n we define

|A| = a11c11 + a12c12 + · · ·+ a1nc1n

5.3.6 Example

For

A =
[ 1 2 3 ]
[ 4 5 6 ]
[ 7 8 9 ]

|A| = (1)|5 6; 8 9| + (−2)|4 6; 7 9| + (3)|4 5; 7 8|

(semicolons separate the rows of each 2x2 determinant)

5.3.7 General definition

You can follow whatever row or column you like, just make sure to get the alternating signs right. For a square matrix An×n expanding along the ith row, we have

|A| = Σ_{j=1}^{n} (−1)^{i+j} aij |Aij|

where Aij is the matrix resulting from crossing out the ith row and jth column of A.

For a square matrix An×n expanding along the jth column, we have

|A| = Σ_{i=1}^{n} (−1)^{i+j} aij |Aij|


5.3.8 Example

For

A =
[ 1 0 3 ]
[ 4 0 6 ]
[ 7 0 9 ]

we can use

|A| = a12c12 + a22c22 + a32c32 = 0 + 0 + 0 = 0

5.3.9 Example

What is the determinant of

A =
[ 1 2 3 4 ]
[ 0 5 6 7 ]
[ 0 0 1 2 ]
[ 0 0 0 6 ]

The result is immediate if you make the right choice for the rows and columns to use.
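The cofactor expansion translates directly into a recursive routine. A Python sketch (ours, fine for small matrices only, since the cost grows like n!):

```python
def det(M):
    # Determinant by cofactor expansion along the first row.
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]  # cross out row 0, column j
        total += (-1) ** j * M[0][j] * det(minor)
    return total

assert det([[1, 2], [3, 4]]) == -2
assert det([[1, 2, 3, 4], [0, 5, 6, 7], [0, 0, 1, 2], [0, 0, 0, 6]]) == 30
```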

5.4 Some properties of determinants

5.4.1 Determinants of matrices with rows/cols of zeros

The determinant of a matrix with a row of all zeros is zero. The determinant of a matrix with a column of all zeros is zero.

5.4.2 Examples

Consider a 2x2 matrix

[ a b ]
[ 0 0 ]

The determinant is clearly zero.

Consider a 3x3 matrix

A =
[ a b c ]
[ 0 0 0 ]
[ d e f ]

If we do the calculation using the second row we get

|A| = 0·| · | + 0·| · | + 0·| · | = 0


Similarly, if we have a row of zeros in any n × n matrix A we would get

|A| = 0·| · | + 0·| · | + 0·| · | + · · · + 0·| · | = 0

The same reasoning holds for a column of zeros.

5.4.3 Multiplying a single row by a real

Suppose that An×n is a square matrix and B is the matrix given by multiplying each element of a row by k. Then

|B| = k|A|

5.4.4 Example

| 1 2 0 |
| 2 4 0 |
| 3 5 0 |
= 0

and so

| 5 10 0 |
| 2 4  0 |
| 3 5  0 |
= 5 · 0 = 0

5.4.5 Proof

Suppose that An×n has elements aij. Let Aij be the (n − 1) × (n − 1) matrix formed by crossing out the ith row and jth column of A.

Suppose that we multiply row s of A by a real number k to get B which has elements bij. Let Bij be the (n − 1) × (n − 1) matrix formed by crossing out the ith row and jth column of B.

Note that

bsj = kasj

from the definition of B and

Bsj = Asj

since A and B only differ in row s which is being crossed out in Asj and Bsj.

Then calculating the determinant of B by expanding along the sth row of B:


|B| = (−1)^{s+1} bs1 |Bs1| + (−1)^{s+2} bs2 |Bs2| + · · · + (−1)^{s+n} bsn |Bsn|
    = (−1)^{s+1} k as1 |As1| + (−1)^{s+2} k as2 |As2| + · · · + (−1)^{s+n} k asn |Asn|
    = k { (−1)^{s+1} as1 |As1| + (−1)^{s+2} as2 |As2| + · · · + (−1)^{s+n} asn |Asn| }
    = k |A|

5.4.6 Multiplying all elements by a real

Suppose that An×n is a square matrix and B is the matrix given by multiplying each element of A by k. Then

|B| = |kA| = kn|A|

5.4.7 Proof

Apply the proof for a single row n times.

5.4.8 Result for multiplying a column by a real

Suppose that An×n is a square matrix and B is the matrix given by multiplying a column of A by k. Then

|B| = k|A|

5.4.9 Proof

Exercise

5.4.10 Determinant after interchanging two adjacent rows

Suppose that An×n is a square matrix and the matrix Bn×n is obtained from A by swapping two ADJACENT rows of A. Then

|B| = − |A|

5.4.11 Example

| 1 2 3 |
| 0 4 5 |
| 0 0 6 |
= 24

so


| 1 2 3 |
| 0 0 6 |
| 0 4 5 |
= −24

5.4.12 Proof

Let A be the original matrix with entries ai,j. Let Ai,j be the matrix obtained by crossing out the ith row of A and the jth column of A.

Let B be the matrix obtained by interchanging two adjacent rows of A with entries bi,j. Let Bi,j be the matrix obtained by crossing out the ith row of B and the jth column of B.

Let’s suppose that B is obtained from A by interchanging row i and i+ 1.

Note that

bi+1,k = ai,k

and that

Bi+1,k = Ai,k

from the definition of B.

We calculate the determinant of B by expansion along row i+ 1

|B| = (−1)^{i+1+1} bi+1,1 |Bi+1,1| + (−1)^{i+1+2} bi+1,2 |Bi+1,2| + · · · + (−1)^{i+1+n} bi+1,n |Bi+1,n|
    = (−1)^{i+1+1} ai,1 |Ai,1| + (−1)^{i+1+2} ai,2 |Ai,2| + · · · + (−1)^{i+1+n} ai,n |Ai,n|
    = (−1) { (−1)^{i+1} ai,1 |Ai,1| + (−1)^{i+2} ai,2 |Ai,2| + · · · + (−1)^{i+n} ai,n |Ai,n| }
    = (−1)|A|

5.4.13 Determinant after swapping ANY two rows

Suppose that An×n is a square matrix and Bn×n is the matrix obtained from A by interchanging any two rows of A. Then

|B| = − |A|

5.4.14 Example

| 1 2 3 |
| 0 4 5 |
| 0 0 6 |
= 24


so

| 0 0 6 |
| 0 4 5 |
| 1 2 3 |
= −24

5.4.15 Proof

Suppose that row r and row s are the two rows to be interchanged with 1 ≤ r < s ≤ n. We interchange row r with the row below it repeatedly, finishing with the interchange with row s, and leave it as the sth row of the new matrix. That takes s − r interchanges.

Row s is now one row above its original position. We repeatedly interchange it with the row above it until it is the rth row of the new matrix. That takes s − r − 1 interchanges.

Then by the result for interchanges of adjacent rows

|B| = (−1)^{s−r+s−r−1} |A| = (−1)^{2(s−r)−1} |A| = −|A|

5.4.16 Determinant after interchanging two columns

Suppose that An×n is a square matrix and Bn×n is the matrix obtained from A by interchanging any two columns of A. Then

|B| = − |A|

5.4.17 Proof

Exercise.

5.4.18 Determinant of a square matrix with two identical rows

The determinant of a square matrix with two identical rows is zero.

5.4.19 Example ∣∣∣∣∣∣∣∣1 1 1 11 2 3 44 3 2 11 1 1 1

∣∣∣∣∣∣∣∣ = 0


5.4.20 Proof

Suppose that A is a square matrix with two rows that are the same. From the result on row swaps, we can swap two rows and the determinant changes by a minus sign. Then if we swap the two identical rows

|A| = − |A| ⇒ |A| = 0

5.4.21 Determinant of a square matrix with two identical columns

The determinant of a square matrix with two identical columns is zero.

5.4.22 Proof

Exercise.

5.4.23 Determinants and the third row operation

Let An×n be a square matrix. Let k be a real number. Let B be the result of adding k times row r of A to another row s of A and replacing the original row s of A. Then

|B| = |A|

5.4.24 Example

(semicolons separate rows)

|1 1 1; 2 1 3; 3 2 5| = |1 1 1; 0 −1 1; 3 2 5| = |1 1 1; 0 −1 1; 0 −1 2| = |1 1 1; 0 −1 1; 0 0 1| = −1

5.4.25 Proof

Let An×n be a square matrix with entries aij. Let Aij be the matrix formed by crossing out the ith row and jth column of A.

Let k be a real number.

Let B be the result of adding k times row r of A to row s of A and replacing the original row s of A. Let the entries of B be bij. Let Bij be the matrix formed by crossing out the ith row and jth column of B.

Let C be the result of replacing row s of A by row r of A. Let the entries of C be cij. Let Cij be the matrix formed by crossing out the ith row and jth column of C.

Note that

Asj = Bsj = Csj


and

arj = csj

and as C has two identical rows

|C| = 0

We calculate the determinant of B by expanding along row s:

|B| = Σ_{j=1}^{n} (−1)^{s+j} bsj |Bsj|
    = Σ_{j=1}^{n} (−1)^{s+j} (asj + k arj) |Bsj|
    = k Σ_{j=1}^{n} (−1)^{s+j} arj |Bsj| + Σ_{j=1}^{n} (−1)^{s+j} asj |Bsj|
    = k Σ_{j=1}^{n} (−1)^{s+j} csj |Csj| + Σ_{j=1}^{n} (−1)^{s+j} asj |Asj|
    = k |C| + |A|
    = |A|

5.5 Determinants of matrices and their transposes

5.5.1 Example

Consider the 2x2 case

A =
[ a b ]
[ c d ]

At =
[ a c ]
[ b d ]

The determinants are

|A| = ad− bc


|At| = ad− cb

So in the 2x2 case the determinant of a matrix A and its transpose At are the same.

5.5.2 Example

Consider the 3x3 case

A =
[ a11 a12 a13 ]
[ a21 a22 a23 ]
[ a31 a32 a33 ]

At =
[ a11 a21 a31 ]
[ a12 a22 a32 ]
[ a13 a23 a33 ]

We can find the determinant of A by going along its first row to get

|A| = a11 |a22 a23; a32 a33| − a12 |a21 a23; a31 a33| + a13 |a21 a22; a31 a32|

Now to get the determinant of the transpose we can expand along the first column to get

At =
[ a11 a21 a31 ]
[ a12 a22 a32 ]
[ a13 a23 a33 ]

|At| = a11 |a22 a32; a23 a33| − a12 |a21 a31; a23 a33| + a13 |a21 a31; a22 a32|

and we see that the result is exactly the same. Note that the determinants in the expansion are the determinants of transposes of 2x2 matrices, for which we have already established the result.

5.5.3 Theorem

If An×n is a square matrix then

|At| = |A|

5.5.4 Proof

We proceed by induction.

Let Pn be the proposition to be proven for n × n matrices. We already have P2. Assume P2, ..., Pk; that is, assume that

|At| = |A|

holds for all square matrices of size 2 × 2, 3 × 3, ..., k × k.

Now let A be a square (k + 1)× (k + 1) matrix. Then

|A| = Σ_{j=1}^{k+1} (−1)^{i+j} aij |Aij|

where Aij is the k × k matrix that results from crossing out the ith row and jth column of A.

But by the induction hypothesis, |(Aij)t| = |Aij| so

|A| = Σ_{j=1}^{k+1} (−1)^{i+j} aij |(Aij)t|

Note that the terms of the transpose matrix are given by

a′ji = aij

and so

|A| = Σ_{j=1}^{k+1} (−1)^{i+j} a′ji |(At)ji|

since (At)ji = (Aij)t.

The RHS is the determinant of At expanding along the ith column so

|A| = |At|

Since P2 holds and Pk implies Pk+1, Pn holds for all n by mathematical induction.
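A quick numerical check of the theorem, as a numpy sketch (ours, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
# The determinant is unchanged under transposition.
assert np.isclose(np.linalg.det(A.T), np.linalg.det(A))
```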

5.6 Determinants, row operations, and elementary matrices

5.6.1 Identity

|I| = 1

5.6.2 Determinant of row swapping elementary matrix

We find the matrix for row swapping by swapping rows of the identity matrix. By a previous result the determinant of the elementary matrix will be −1.


5.6.3 Determinant of row multiplying elementary matrix

We find the matrix for row multiplying by multiplying a row of the identity matrix. By a previous result the determinant of the elementary matrix will be the multiple.

5.6.4 Determinant of elementary matrix that adds a multiple of a row to another row

We find the matrix for this operation by performing this operation on the identity matrix. The determinant of the new matrix must be the same as the determinant of the identity by a previous result, and so is 1.

5.6.5 Determinant of a product involving an elementary matrix and a square matrix

Suppose that E is an elementary matrix and A is a square matrix of the same size. Then

|EA| = |E||A|

5.6.6 Proof

We consider the three types of elementary matrices and show that the result holds for each type.

Suppose that E is an elementary matrix that interchanges two rows. Then

|EA| = − |A|

from a previous result. But |E| = −1 so

|EA| = − |A| = |E||A|

Suppose that E multiplies a row by a real number k. Then

|EA| = k|A| = |E||A|

Suppose that E adds a multiple of a row to another row. Then

|EA| = |A| = 1 · |A| = |E||A|

5.6.7 Determinant of an invertible matrix is nonzero

Suppose that a square matrix A is invertible. Then its determinant is non-zero.


5.6.8 Proof

By the FTLA A is invertible if and only if it is the product of elementary matrices

A = E1 E2 · · · Es

Then

|A| = |E1 E2 · · · Es| = |E1| |E2 · · · Es| = |E1| |E2| · · · |Es|

The determinants of elementary matrices are not zero by our previous discussion so for A invertible |A| ≠ 0.

5.6.9 Determinant of a non-invertible (singular) matrix is zero

If a square matrix A is not invertible then its determinant is zero.

5.6.10 Proof

We do row reduction on A using elementary matrices to get its rref B. Note that B cannot be I since A is not invertible, and so B has at least one row of zeros.

B = Ek · · ·E2E1A

Taking the determinant of both sides and using a previous result

|B| = |Ek · · ·E2E1A| = |Ek| · · · |E2||E1||A|

As B has at least one row of zeros by a previous result |B| = 0.

0 = |Ek| · · · |E2||E1||A|

As elementary matrices have nonzero determinant we must have that |A| = 0.

5.6.11 Product of matrices with singular factor

Suppose that A and B are matrices of the same size and A is not invertible. Then the product AB has determinant zero.

5.6.12 Proof

We apply a set of elementary matrices to reduce A to its rref

C = EkEk−1 · · ·E2E1A

Note that as A is not invertible its rref has at least one row of zeros. Now multiply by B on the right.

CB = EkEk−1 · · ·E2E1AB


Since C has a row of zeros, so does CB. Then the determinant of CB is zero and

|CB| = |EkEk−1 · · ·E2E1AB| = |Ek||Ek−1| · · · |E2||E1||AB|

0 = |Ek||Ek−1| · · · |E2||E1||AB|

As elementary matrices have nonzero determinants then |AB| = 0.

5.6.13 Product of matrices with singular factor

Suppose that A and B are matrices of the same size and B is not invertible. Then the product AB has determinant zero.

5.6.14 Proof

Exercise.

5.6.15 Determinant of a product of matrices

Suppose that A and B are square matrices of the same size. Then

|AB| = |A||B|

5.6.16 Proof

Suppose that A is invertible. Then A is the product of elementary matrices

A = E1 E2 · · · Ek

and we can write

AB = E1 E2 · · · Ek B

and then

|AB| = |E1 E2 · · · Ek B|
     = |E1| |E2| · · · |Ek| |B|
     = |E1 E2 · · · Ek| |B|
     = |A| |B|

The other possibility is that A is not invertible. Then by a previous result

|AB| = 0 = 0|B| = |A||B|
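Both cases of the product rule can be spot-checked numerically; a numpy sketch (ours, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
B = rng.standard_normal((4, 4))
assert np.isclose(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))

A[0] = 0.0  # force a row of zeros, making A singular
assert np.isclose(np.linalg.det(A @ B), 0.0)
```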


5.6.17 Corollary

If A is invertible then

|A−1| = 1/|A|

5.6.18 Proof

|I| = |AA−1| = |A||A−1|

and so

|A−1| = |I|/|A| = 1/|A|

5.6.19 Theorem

A square matrix A is invertible if and only if its determinant is not zero.

5.6.20 Proof

We saw that if A is invertible then its determinant is nonzero. Suppose that the determinant of A is nonzero. If we apply elementary matrices Ei to get the rref B then

B = EkEk−1 · · ·E2E1A

Taking the determinant of both sides

|B| = |Ek||Ek−1| · · · |E2||E1||A|

and so the determinant of the rref of A is not zero. If B had a row of zeros then |B| = 0, and so B must be I

I = EkEk−1 · · ·E2E1A

and then A is invertible.

5.6.21 Theorem

A square matrix A is singular if and only if its determinant is zero.

5.6.22 Proof

Follows logically from previous theorem.


5.7 FTLA

5.7.1 Before

The following are equivalent:

a) An×n is invertible

b) The equation Ax = 0 has only the trivial solution x = 0.

c) The reduced row echelon form of A is I.

d) A is a product of elementary matrices.

e) Ax = b is consistent for all n × 1 matrices b.

5.7.2 Now add

f) |A| ≠ 0

5.8 Adjoint

5.8.1 Definition - matrix of cofactors

Suppose that A is a square matrix with entries aij and cofactors cij. The matrix C whose entries are the cofactors cij of A is called the matrix of cofactors of A.

5.8.2 Definition - adjoint

The transpose of the matrix of cofactors of A is called the adjoint of A and is often denoted adj(A).

5.8.3 Example

For

A =
[ 3 2 −1 ]
[ 1 6 3  ]
[ 2 −4 0 ]

c11 = 12, c12 = 6, c13 = −16

c21 = 4, c22 = 2, c23 = 16


c31 = 12, c32 = − 10, c33 = 16

The matrix of cofactors is

C =
[ 12 6   −16 ]
[ 4  2   16  ]
[ 12 −10 16  ]

and then taking the transpose of the matrix of cofactors gives us adj(A):

adj(A) =
[ 12  4  12  ]
[ 6   2  −10 ]
[ −16 16 16  ]

5.8.4 Adjoint and inverse

If the determinant of A is not zero then

A−1 = (1/|A|) adj(A)

5.8.5 Proof

Will do later in class.

5.8.6 Example

For

A =
[ 3 2 −1 ]
[ 1 6 3  ]
[ 2 −4 0 ]

we found the cofactors and can calculate

|A| = 64

The adjoint matrix was

adj(A) =
[ 12  4  12  ]
[ 6   2  −10 ]
[ −16 16 16  ]

and so

A−1 = (1/64) ·
[ 12  4  12  ]
[ 6   2  −10 ]
[ −16 16 16  ]
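The whole computation can be scripted from the definitions. A Python sketch (ours; the helper names `minor`, `det`, and `adj` are not from the notes):

```python
def minor(M, i, j):
    # M with row i and column j crossed out.
    return [row[:j] + row[j + 1:] for k, row in enumerate(M) if k != i]

def det(M):
    # Cofactor expansion along the first row.
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det(minor(M, 0, j)) for j in range(len(M)))

def adj(M):
    # Transpose of the matrix of cofactors.
    n = len(M)
    C = [[(-1) ** (i + j) * det(minor(M, i, j)) for j in range(n)] for i in range(n)]
    return [[C[j][i] for j in range(n)] for i in range(n)]

A = [[3, 2, -1], [1, 6, 3], [2, -4, 0]]
assert det(A) == 64
assert adj(A) == [[12, 4, 12], [6, 2, -10], [-16, 16, 16]]
```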


5.9 Exercises

5.9.1 Exercise

Evaluate the determinant

| 1 2 |
| 2 1 |

5.9.2 Exercise

Evaluate the minors of

A =
[ 1 2 1 ]
[ 2 1 2 ]
[ 3 0 5 ]

5.9.3 Exercise

Calculate the determinant of A.

A =
[ 1 2 1 ]
[ 2 1 2 ]
[ 3 0 5 ]

5.9.4 Exercise

Calculate the determinant of A.

A =
[ 1 0 0 0 ]
[ 0 1 2 1 ]
[ 0 2 1 2 ]
[ 0 3 0 5 ]

5.9.5 Exercise

Calculate the determinant of A.

A =
[ 1 6 7 12 ]
[ 0 0 0 0  ]
[ 0 2 1 2  ]
[ 0 3 0 5  ]


5.9.6 Exercise

Calculate the determinant of A.

A =
[ 1 6  7  12 ]
[ 2 12 14 24 ]
[ 0 2  1  2  ]
[ 0 3  0  5  ]

5.9.7 Exercise

Calculate the determinant of A.

A =
[ 1 6 7  12 ]
[ 0 1 14 24 ]
[ 0 0 1  2  ]
[ 0 0 0  5  ]

5.9.8 Exercise

Calculate the determinant of A.

A =
[ 1 6 7  12 ]
[ 0 1 14 24 ]
[ 0 0 0  5  ]
[ 0 0 1  2  ]

5.9.9 Exercise

Calculate the determinant of A.

A =
[ 1 6 7  12 ]
[ 0 1 14 24 ]
[ 0 0 0  5  ]
[ 0 0 5  10 ]

5.9.10 Exercise

Calculate the determinant of 2A.

A =
[ 1 6 7  12 ]
[ 0 1 14 24 ]
[ 0 0 0  5  ]
[ 0 0 5  10 ]


5.9.11 Exercise

Calculate the determinant of 2A.

A =
[ 1 6 12 12 ]
[ 0 1 24 24 ]
[ 0 0 5  5  ]
[ 0 0 10 10 ]

5.9.12 Exercise

Given

A =
[ 1 6 7  12 ]
[ 0 1 14 24 ]
[ 0 0 0  5  ]
[ 0 0 5  10 ]

how many solutions can the equation Ax = 0 have?

5.9.13 Exercise

Suppose that for a given square matrix A the equation Ax = 0 has a single unique solution. What can you say about the determinant of A?

5.9.14 Exercise

Suppose that for a given square matrix A there is a column matrix b so that the equation Ax = b does not have any solution.

What can you say about the determinant of A?

5.9.15 Exercise

Find the adjoint of A.

A =
[ 1 6 7  12 ]
[ 0 1 14 24 ]
[ 0 0 0  5  ]
[ 0 0 1  2  ]

Use the adjoint to calculate the inverse of A.


Chapter 6

Vectors

6.1 A vector space

6.1.1 Defining the vector space

For the purposes of this course, the vector space R2 consists of column matrices of the form

[ x1 ]
[ x2 ]

where x1 and x2 are real numbers.

6.1.2 Examples

Some examples are

u =
[ 1 ]
[ 5 ]

and

v =
[ 2 ]
[ 1 ]

6.1.3 Zero vector

When we use 0 for a vector in R2 we mean

[ 0 ]
[ 0 ]

6.1.4 Addition

R2 consists of column matrices with addition being defined as for matrices as previously discussed. So for


u =
[ u1 ]
[ u2 ]

and

v =
[ v1 ]
[ v2 ]

the sum u + v is

u + v =
[ u1 + v1 ]
[ u2 + v2 ]

6.1.5 Scalar multiplication

R2 consists of column matrices with scalar multiplication being defined as for matrices as previously discussed. So for

u =
[ u1 ]
[ u2 ]

and k ∈ R

ku =
[ ku1 ]
[ ku2 ]
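These componentwise operations are exactly what numpy arrays provide; a short sketch (ours, not part of the notes):

```python
import numpy as np

u = np.array([1.0, 5.0])
v = np.array([2.0, 1.0])
assert np.array_equal(u + v, np.array([3.0, 6.0]))   # componentwise addition
assert np.array_equal(3 * u, np.array([3.0, 15.0]))  # scalar multiplication
```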

6.1.6 Coordinate geometry

We can associate a vector

r =
[ x ]
[ y ]

with the point (x, y) in the xy-plane. Or we can think of r as a directed line segment from the origin to the point in the plane.

6.2 Properties of the vector space

6.2.1 Vectors are column matrices

We will think of the vectors of the vector space R2 as being a special class of matrices. They inherit the properties of matrices that we have already discussed.


6.2.2 Properties

Then the following properties are immediate:

a) For u and v vectors in R2

u+ v = v + u

b) For u, v, and w vectors in R2

(u+ v) + w = u+ (v + w)

c) For u in R2 and 0 the zero vector

u+ 0 = u = 0 + u

d) For u in R2 we have

u+ (−u) = 0

e) For u and v in R2 and k a real number

k(u+ v) = ku+ kv

f) For u in R2

1u = u

6.3 Another vector space

6.3.1 Defining the vector space

R3 consists of column matrices of the form

[ x1 ]
[ x2 ]
[ x3 ]

where x1, x2 and x3 are real numbers.

6.3.2 Examples

Some examples are

u =
[ 1 ]
[ 5 ]
[ 0 ]

and

v =
[ 2 ]
[ 1 ]
[ 3 ]

6.3.3 Zero vector

When we use 0 for a vector in R3 we mean (0, 0, 0)^T.

6.3.4 Addition

R3 consists of column matrices, with addition defined as for matrices, as previously discussed. So for u = (u1, u2, u3)^T and v = (v1, v2, v3)^T, the sum u + v is

u + v = (u1 + v1, u2 + v2, u3 + v3)^T

6.3.5 Scalar multiplication

R3 consists of column matrices, with scalar multiplication defined as for matrices, as previously discussed. So for u = (u1, u2, u3)^T and k ∈ R,

ku = (ku1, ku2, ku3)^T


6.3.6 Coordinate geometry

We can associate a vector

r = (x, y, z)^T

with the point (x, y, z) in three-dimensional space. Or we can think of r as a directed line segment from the origin to the point in the space.

6.4 Properties of the vector space

6.4.1 Vectors are column matrices

We will think of the vectors of the vector space R3 as being a special class of matrices. They inherit the properties of matrices that we have already discussed.

6.4.2 Properties

Then the following properties are immediate:

a) For u and v vectors in R3: u + v = v + u

b) For u, v, and w vectors in R3: (u + v) + w = u + (v + w)

c) For u in R3 and the zero vector 0: u + 0 = u = 0 + u

d) For u in R3: u + (−u) = 0

e) For u and v in R3 and k a real number: k(u + v) = ku + kv

f) For u in R3: 1u = u


6.5 More vector spaces

6.5.1 Defining the vector space

Rn, for n = 2, 3, 4, ..., consists of column matrices of the form

(x1, x2, ..., xn)^T

where x1, x2, ..., xn are real numbers.

6.5.2 Zero vector

When we use 0 for a vector in Rn we mean (0, 0, ..., 0)^T.

6.5.3 Addition

Rn consists of column matrices, with addition defined as for matrices, as previously discussed. So for u = (u1, u2, ..., un)^T and v = (v1, v2, ..., vn)^T, the sum u + v is

u + v = (u1 + v1, u2 + v2, ..., un + vn)^T


6.5.4 Scalar multiplication

Rn consists of column matrices, with scalar multiplication defined as for matrices, as previously discussed. So for u = (u1, u2, ..., un)^T and k ∈ R,

ku = (ku1, ku2, ..., kun)^T

6.6 Properties of the vector space

6.6.1 Vectors are column matrices

We will think of the vectors of the vector space Rn as being a special class of matrices. They inherit the properties of matrices that we have already discussed.

6.6.2 Properties

Then the following properties are immediate:

a) For u and v vectors in Rn: u + v = v + u

b) For u, v, and w vectors in Rn: (u + v) + w = u + (v + w)

c) For u in Rn and the zero vector 0: u + 0 = u = 0 + u

d) For u in Rn: u + (−u) = 0

e) For u and v in Rn and k a real number: k(u + v) = ku + kv

f) For u in Rn: 1u = u

6.7 The euclidean inner product

6.7.1 Definition

For u and v in Rn we define

u · v = u1v1 + u2v2 + · · · + unvn

where ui is the ith entry of u and vi is the ith entry of v.

6.7.2 Example

Given

u = (−1, 0, 1)^T  and  v = (1, 0, −1)^T

the inner product of u and v is found by

u · v = (−1)(1) + (0)(0) + (1)(−1) = −2
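The definition translates directly into code; a minimal Python sketch (the helper name is our own):

```python
def dot(u, v):
    """Euclidean inner product: the sum of entry-wise products."""
    return sum(ui * vi for ui, vi in zip(u, v))

u = [-1, 0, 1]
v = [1, 0, -1]
print(dot(u, v))  # -2, matching the example above
```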

6.7.3 Magnitude of a vector

If we think of a vector with entries x1, x2, ..., xn as a directed line segment starting at the origin and going to the associated point, then we might also ask what the length of the vector is.

In the two-dimensional case the magnitude squared of u = (u1, u2)^T is

u · u = u1^2 + u2^2

In the three-dimensional case the magnitude squared of u = (u1, u2, u3)^T is

u · u = u1^2 + u2^2 + u3^2

In the n-dimensional case the magnitude squared of u = (u1, u2, ..., un)^T is

u · u = u1^2 + u2^2 + · · · + un^2

6.7.4 Norm

Often the term norm is used for the magnitude of a vector:

‖u‖ = √(u · u)

6.7.5 Example

For

u = (1/√2, 1/√2)^T

we have

‖u‖^2 = (1/√2)^2 + (1/√2)^2 = 1

and so

‖u‖ = 1
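The same computation numerically, using Python's math module (helper names are our own):

```python
import math

def dot(u, v):
    """Euclidean inner product."""
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    """Euclidean norm: the square root of u . u."""
    return math.sqrt(dot(u, u))

u = [1 / math.sqrt(2), 1 / math.sqrt(2)]
print(norm(u))  # 1.0 up to floating-point rounding
```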

6.7.6 Distance between two two-dimensional vectors

If we identify a vector r = (r1, r2)^T with the point (r1, r2) in the Cartesian plane and another vector s = (s1, s2)^T with the point (s1, s2), then we are used to using Pythagoras to find the distance between the two points:

√((s1 − r1)^2 + (s2 − r2)^2) = ‖s − r‖

We define the distance between the two vectors r and s by

‖s − r‖

6.7.7 Distance between two three-dimensional vectors

If we identify a vector r = (r1, r2, r3)^T with the point (r1, r2, r3) and another vector s = (s1, s2, s3)^T with the point (s1, s2, s3), then we are used to using Pythagoras to find the distance between the two points:

√((s1 − r1)^2 + (s2 − r2)^2 + (s3 − r3)^2) = ‖s − r‖

We define the distance between the two vectors r and s by

‖s − r‖

6.7.8 Distance between vectors

Given two vectors r and s in Rn we define the distance between the two vectors to be

‖s− r‖
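A quick Python sketch of the distance formula (the helper name is our own):

```python
import math

def distance(r, s):
    """Distance between vectors r and s: the norm of s - r."""
    return math.sqrt(sum((si - ri) ** 2 for ri, si in zip(r, s)))

print(distance([0, 0], [3, 4]))  # 5.0, the familiar 3-4-5 right triangle
```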

6.7.9 Cauchy–Schwarz inequality

For u and v in Rn we have

|u · v| ≤ ‖u‖ ‖v‖


6.8 Properties of the Euclidean norm

For u and v in Rn:

1) ‖u‖ ≥ 0

2) ‖u‖ = 0 iff u = 0

3) For k a real number, ‖ku‖ = |k| ‖u‖

4) (Triangle inequality) ‖u + v‖ ≤ ‖u‖ + ‖v‖

6.8.1 Proof of triangle inequality

‖u + v‖^2 = (u + v) · (u + v)

= u · u + 2 u · v + v · v = ‖u‖^2 + 2 u · v + ‖v‖^2

≤ ‖u‖^2 + 2|u · v| + ‖v‖^2

≤ ‖u‖^2 + 2‖u‖‖v‖ + ‖v‖^2   (by Cauchy–Schwarz)

= (‖u‖ + ‖v‖)^2

Taking square roots gives the result.
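Both the triangle inequality and Cauchy–Schwarz are easy to spot-check numerically; a Python sketch with arbitrarily chosen vectors (helper names are our own):

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

u = [1.0, 2.0, -3.0]
v = [4.0, 0.5, 2.0]

# Cauchy-Schwarz: |u . v| <= ||u|| ||v||
print(abs(dot(u, v)) <= norm(u) * norm(v))  # True
# Triangle inequality: ||u + v|| <= ||u|| + ||v||
w = [ui + vi for ui, vi in zip(u, v)]
print(norm(w) <= norm(u) + norm(v))         # True
```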

6.9 Orthogonality

6.9.1 Definition

Two vectors u and v in Rn are said to be orthogonal if their dot product is zero

u · v = 0

6.9.2 Example

The vectors

u = (1, 0)^T  and  v = (0, 1)^T

are orthogonal.

6.10 Vectors and systems of linear equations

6.10.1 Example

Recall that we had systems that looked like

2x1 + x2 = 5
x1 + x2 = 7

In the language of vectors and matrices we would write

[ 2  1 ] [ x1 ]   [ 5 ]
[ 1  1 ] [ x2 ] = [ 7 ]

If we then perform row reduction we get

[ 1  0 ] [ x1 ]   [ −2 ]
[ 0  1 ] [ x2 ] = [  9 ]

So there is a unique solution to the matrix-vector equation, which is

x = (−2, 9)^T

6.10.2 Example

Solve the equation Ax = b for

A = [ 1  1 ]      b = [ 1 ]
    [ 2  2 ]          [ 1 ]

Performing row reduction gives us

[ 1  1 ] [ x1 ]   [ 0 ]
[ 0  0 ] [ x2 ] = [ 1 ]

The bottom row is the contradiction 0 = 1, so we conclude that this matrix-vector equation has no solution.


6.10.3 Example

Solve the equation Ax = b for

A = [ 1  1 ]      b = [ 1 ]
    [ 2  2 ]          [ 2 ]

Performing row reduction gives us

[ 1  1 ] [ x1 ]   [ 1 ]
[ 0  0 ] [ x2 ] = [ 0 ]

So we conclude that x2 can be any real number and

x1 = 1 − x2

So the vector solution to this matrix-vector equation is

x = (1 − x2, x2)^T = (1, 0)^T + x2 (−1, 1)^T

6.10.4 Example

Consider the matrix-vector equation

[ 1  1  1 ] [ x1 ]   [ 1 ]
[ 0  1  1 ] [ x2 ] = [ 2 ]
[ 0  0  1 ] [ x3 ]   [ 3 ]

Performing row reduction gives

[ 1  0  0 ] [ x1 ]   [ −1 ]
[ 0  1  0 ] [ x2 ] = [ −1 ]
[ 0  0  1 ] [ x3 ]   [  3 ]

The solution vector to the matrix-vector equation is

x = (−1, −1, 3)^T

6.10.5 Example

Consider the matrix-vector equation

[ 1  1  1 ] [ x1 ]   [ 1 ]
[ 0  1  1 ] [ x2 ] = [ 2 ]
[ 0  2  2 ] [ x3 ]   [ 4 ]

Performing row reduction gives

[ 1  0  0 ] [ x1 ]   [ −1 ]
[ 0  1  1 ] [ x2 ] = [  2 ]
[ 0  0  0 ] [ x3 ]   [  0 ]

We see that x3 can be any real number and

x2 = 2 − x3,  x1 = −1

Then the solution vector to the matrix-vector equation is

x = (−1, 2 − x3, x3)^T = (−1, 2, 0)^T + x3 (0, −1, 1)^T

6.10.6 Example

Consider the matrix-vector equation

[ 0  1  1 ] [ x1 ]   [ 1 ]
[ 0  2  2 ] [ x2 ] = [ 2 ]
[ 0  3  3 ] [ x3 ]   [ 3 ]

Performing row reduction gives

[ 0  1  1 ] [ x1 ]   [ 1 ]
[ 0  0  0 ] [ x2 ] = [ 0 ]
[ 0  0  0 ] [ x3 ]   [ 0 ]

We see that x1 and x3 can be any real numbers and

x2 = 1 − x3

Then the solution vector to the matrix-vector equation is

x = (x1, 1 − x3, x3)^T = (0, 1, 0)^T + x1 (1, 0, 0)^T + x3 (0, −1, 1)^T


6.11 Exercises

6.11.1 Exercise

Consider the vectors

u = (1, 5)^T  and  v = (−1, 7)^T

Calculate the vector w = 3u + 7v.

6.11.2 Exercise

Give examples of three vectors in R2 which are of magnitude 1.

6.11.3 Exercise

Give two vectors in R3 which are orthogonal to

u = (1, 1, 0)^T

6.11.4 Exercise

What is the solution to the equation

( 1  1  0  0 ) (x1, x2, x3, x4)^T = 0


Chapter 7

Vector Spaces

7.1 The vector spaces already discussed

7.1.1 Examples of vector spaces

We have discussed some examples of vector spaces

R2, R3, Rn

which were column matrices with all real entries.

7.1.2 Zero vector

All of these vector spaces have a zero vector, which is the column vector with all zero entries:

0 = (0, 0, ..., 0)^T

7.1.3 Property of the zero vector

For any vector u ∈ Rn and the zero vector we have

u+ 0 = 0 + u = u

7.1.4 Addition

For Rn, addition was defined entry-wise in such a way that for u and v vectors in Rn, the sum u + v is again defined in Rn.


7.1.5 Scalar multiplication

A scalar multiplication was defined so that for any α ∈ R and any u ∈ Rn

αu ∈ Rn

7.2 Properties we have seen

From the way we defined addition and scalar multiplication for vectors in Rn we can show:

7.2.1 Properties

1. For u and v vectors in Rn

u+ v

is also a vector in Rn.

2. For u and v vectors in Rn

u+ v = v + u

3. For u, v, and w vectors in Rn

(u+ v) + w = u+ (v + w)

4. For u in Rn there is a zero vector 0 such that

u+ 0 = u = 0 + u

5. For every vector u in Rn there is a vector −u in Rn such that

u+ (−u) = 0

6. For every vector u in Rn and α in R there is a vector

αu

in Rn.

7. For every u and v in Rn and α a real number

α(u+ v) = αu+ αv

8. For every u in Rn and α and β real numbers

(α + β)u = αu+ βu

9. For every u in Rn and α and β real numbers


α(βu) = (αβ)u

10. For u in Rn

1u = u

7.3 Properties of a vector space

7.3.1 Definition

A set V, together with an addition and a scalar multiplication, is said to be a real vector space if the properties that we just recalled for the Rn vector spaces hold.

7.3.2 Properties

1. For u and v vectors in V

u+ v

is also a vector in V .

2. For u and v vectors in V

u+ v = v + u

3. For u, v, and w vectors in V

(u+ v) + w = u+ (v + w)

4. For u in V there is a zero vector 0 such that

u + 0 = u = 0 + u

5. For every vector u in V there is a vector −u in V such that

u + (−u) = 0

6. For every vector u in V and α in R there is a vector

αu

in V .

7. For every u and v in V and α a real number

α(u+ v) = αu+ αv

8. For every u in V and α and β real numbers


(α + β)u = αu+ βu

9. For every u in V and α and β real numbers

α(βu) = (αβ)u

10. For u in V

1u = u

7.3.3 Example

We checked previously that Rn is a vector space; it has all the properties required of a real vector space.

7.3.4 Example

Consider the set of all polynomial functions of finite degree. If we define addition and scalar multiplication for functions in the usual way, then this is a real vector space.

7.4 A longer example

7.4.1 Potential vector space

Consider the set V of all elements of R2 of the form

(0, α)^T

where α can be any real number.

7.4.2 Question

Is this subset of R2 a vector space?

7.4.3 Check the required properties one by one

Checking 1.

Is the sum of two elements of V in V? Consider

u = (0, α)^T ∈ V

and

v = (0, β)^T ∈ V

The sum

u + v = (0, α + β)^T

is also in V.

Checking 2.

For vectors u and v in V, is u + v = v + u?

Yes; we don't really need to check this, since it already holds for all of R2.

Checking 3.

For u, v, and w vectors in V, is addition associative?

Yes; again, this already holds for all of R2.

Checking 4.

Is the zero vector in V? Yes. We don't need to check that adding the zero vector to a vector leaves it unchanged, since this is already known for all of R2.

Checking 5.

For every vector u in V there is a vector −u in V such that

u + (−u) = 0

Consider

u = (0, α)^T ∈ V

Then

−u = (0, −α)^T ∈ V

and u + (−u) = 0.

Checking 6.

If u is in V, is αu in V? Consider

u = (0, b)^T ∈ V

Then

αu = (0, αb)^T ∈ V

Checking 7.

For every u and v in V and α a real number, is α(u + v) = αu + αv?

Consider

u = (0, a)^T ∈ V  and  v = (0, b)^T ∈ V

Then

α(u + v) = α (0, a + b)^T = (0, αa + αb)^T = (0, αa)^T + (0, αb)^T = α (0, a)^T + α (0, b)^T = αu + αv

Checking 8.

For every u in V and α and β real numbers, is it true that (α + β)u = αu + βu?

Consider

u = (0, a)^T ∈ V

Then for α and β real numbers

(α + β)u = (α + β) (0, a)^T = (0, αa + βa)^T = (0, αa)^T + (0, βa)^T = α (0, a)^T + β (0, a)^T = αu + βu

Checking 9.

For every u in V and α and β real numbers, is it true that α(βu) = (αβ)u? Consider

u = (0, a)^T ∈ V

Then for α and β real numbers

α(βu) = α (0, βa)^T = (0, αβa)^T = (αβ) (0, a)^T = (αβ)u


Checking 10.

For u in V, is 1u = u? We don't really need to check this, since we already know it is true for all vectors u in R2. All ten properties hold, so V is a real vector space.

7.5 Subspaces

7.5.1 Definition

If V is a real vector space, and W is a non-empty subset of V which is also itself a vector space (under the same operations), then W is said to be a subspace of V.

7.5.2 Example

The set V ⊂ R2 that we just discussed is a subspace of R2.

7.5.3 Theorem - Checking if a subset is a subspace

If W is a non-empty subset of a vector space V, then W is a subspace of V if and only if

a) u, v ∈ W implies u + v ∈ W

b) α ∈ R and u ∈ W implies αu ∈ W.

7.5.4 Proof

Exercise.

7.5.5 Example

Consider the set of vectors W ⊂ R2 which consists of all vectors whose first and second entries are the same. A typical element looks like

u = (α, α)^T ∈ W

This is a non-empty subset of R2. If we add two typical elements

u = (α, α)^T ∈ W  and  v = (β, β)^T ∈ W

we get

u + v = (α + β, α + β)^T

which has first and second entries the same, and so the sum of two elements of W is in W.

Next, consider

u = (α, α)^T ∈ W

and a scalar k. We get

ku = (kα, kα)^T

which has equal first and second entries. Then the scalar multiple of an element of W is in W.

Both required conditions hold and we conclude that W is a subspace of R2.

7.5.6 Example

Suppose that V is a real vector space. Consider the subset

W = {0}

which consists only of the zero vector. Then for two vectors u and v in W:

u + v = 0 + 0 = 0 ∈ W

For any real number α and any vector u in W:

αu = α0 = 0 ∈ W

Then the required conditions have been checked and we have found a trivial subspace of every vector space.

7.5.7 Example

Consider the vector space R2 and the subset

W = { (1, 1)^T }

Is this subset W a subspace of R2?

For the element u = (1, 1)^T in W we have

0u = 0

which is not in W. So some scalar multiples of elements of W are not in W, and W is not a subspace.


7.6 Subspaces and solutions of systems of equations

7.6.1 Theorem

If Ax = 0 is a system of m linear equations in n unknowns (A an m × n matrix), then the set of solutions is a subspace of Rn.

7.6.2 Proof

Let W be the set of all solutions of the equation Ax = 0, and let u and v be elements of W. As they are solutions of Ax = 0, we have Au = 0 and Av = 0.

Does u + v belong to W?

A(u + v) = Au + Av = 0 + 0 = 0

So if u is a solution and v is a solution, then the sum u + v is a solution.

Suppose that u is a solution and α is a real number. Is αu a solution?

A(αu) = α(Au) = α0 = 0

So if u is in W then αu is in W.

Both conditions have been checked, and W, the set of all solutions of the equation Ax = 0, is a subspace of Rn.

7.6.3 Example

Consider the system

[ 1  −2  3 ] [ x1 ]   [ 0 ]
[ 2  −4  6 ] [ x2 ] = [ 0 ]
[ 3  −6  9 ] [ x3 ]   [ 0 ]

We form the augmented matrix and do row reduction to get the rref

[ 1  −2  3 | 0 ]
[ 0   0  0 | 0 ]
[ 0   0  0 | 0 ]

from which we conclude that the system has infinitely many solutions. The next example finds them explicitly.

7.6.4 Example

Consider the equation

[ 1  −2  3 ] [ x1 ]   [ 0 ]
[ 2  −4  6 ] [ x2 ] = [ 0 ]
[ 3  −6  9 ] [ x3 ]   [ 0 ]

If we form the augmented matrix and do row reduction we get

[ 1  −2  3 | 0 ]
[ 0   0  0 | 0 ]
[ 0   0  0 | 0 ]

This gives us that x2 and x3 can be any real numbers (call them s and t) and

x1 = 2x2 − 3x3

Then the solution vector is

x = (x1, x2, x3)^T = (2x2 − 3x3, x2, x3)^T = (2s − 3t, s, t)^T = s (2, 1, 0)^T + t (−3, 0, 1)^T

with s and t in R.

According to the theorem, this set of solutions is in fact a subspace of R3.
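We can spot-check this on the example: every choice of the parameters s and t should give a solution of Ax = 0. A Python sketch (helper names are our own):

```python
def mat_vec(A, x):
    """Matrix-vector product, row by row."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, -2, 3], [2, -4, 6], [3, -6, 9]]

def solution(s, t):
    """The parametric solution x = s(2,1,0) + t(-3,0,1)."""
    return [2 * s - 3 * t, s, t]

# Every choice of s and t gives a solution of A x = 0
for s, t in [(1, 0), (0, 1), (2, -4)]:
    print(mat_vec(A, solution(s, t)))  # [0, 0, 0] each time
```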

7.7 Linear combinations

7.7.1 Linear combination of vectors

A vector v in a real vector space V is said to be a linear combination of vectors v1, v2, ..., vn if there are real numbers α1, α2, ..., αn such that

v = α1v1 + α2v2 + · · · + αnvn

7.7.2 Example

In R2 the vector

v = (α, β)^T

can be written as a linear combination of the vectors

e1 = (1, 0)^T,  e2 = (0, 1)^T

by

v = αe1 + βe2

7.7.3 Example

In R2 the vector

v = (α, β)^T

can be written as a linear combination of the vectors

v1 = (1, −1)^T,  v2 = (1, 1)^T

by

v = ((α − β)/2) v1 + ((α + β)/2) v2

7.7.4 Example

In R3 all vectors can be written as linear combinations of the vectors

e1 = (1, 0, 0)^T,  e2 = (0, 1, 0)^T,  e3 = (0, 0, 1)^T

7.7.5 Example

In Rn all vectors can be written as linear combinations of the vectors

e1 = (1, 0, ..., 0)^T,  e2 = (0, 1, ..., 0)^T,  ...,  en = (0, 0, ..., 1)^T

7.7.6 Theorem - linear combinations form a vector space

Suppose v1, v2, ... , vk are vectors in a vector space V .

a) Then the set W of all linear combinations of v1, v2, ... , vk is a subspace of V .

b) It is the smallest subspace that contains all the vi.

7.7.7 Proof of a)

Suppose that p and q are elements of W. Then p and q are both linear combinations of the vi:

p = α1v1 + α2v2 + · · · + αkvk

q = β1v1 + β2v2 + · · · + βkvk

The sum is

p + q = (α1 + β1)v1 + (α2 + β2)v2 + · · · + (αk + βk)vk

which is a linear combination of the vi and so is in W. So the sum of two vectors in W is also in W.

Now consider a scalar multiple of p:

γp = (γα1)v1 + (γα2)v2 + · · · + (γαk)vk

This is also a linear combination of the vi and so is in W. So a scalar multiple of a vector in W is in W. The two required conditions have been checked and so W is a subspace.


7.7.8 Proof of b)

Suppose a subspace contains the vectors v1, v2, ..., vk.

One of the conditions for a subspace is that all scalar multiples of vectors in the subspace are also in the subspace. So for all real numbers α1, α2, ..., αk we must have that α1v1, α2v2, ..., αkvk are also in the subspace.

The other condition for a subspace is that the sum of any two vectors in the subspace is also in the subspace. Then

α1v1 + α2v2

is in the subspace. Then

(α1v1 + α2v2) + α3v3

is in the subspace. And so on, until

α1v1 + α2v2 + · · · + αkvk

is also in the subspace.

Thus if v1, v2, ..., vk lie in a subspace, then all linear combinations of the vi are in the subspace, so no subspace containing the vi can be smaller than W.

7.7.9 Span

Suppose a vector space V contains vectors v1, v2, ..., vk. The subspace W consisting of all linear combinations of the vi is called the space spanned by v1, v2, ..., vk, and the vi are said to span W.

7.7.10 Example

e1, e2, and e3 span R3.

7.7.11 Example

The vectors

(1, 0, 0)^T,  (1, 1, 0)^T,  (1, 1, 1)^T

span R3.

7.7.12 Example

The vectors

(3, 2, 1)^T,  (2, 1, 0)^T,  (1, 1, 1)^T

do NOT span R3.

7.8 Linear independence and dependence

7.8.1 Linear independence

Suppose that v1, v2, ..., vk are vectors in a real vector space V. The vi are said to be linearly dependent if we can find real numbers α1, α2, ..., αk, not all zero, so that

α1v1 + α2v2 + · · · + αkvk = 0

Otherwise the set is said to be linearly independent.

7.8.2 Alternatively

Another way of putting this is that the vi are linearly independent if

α1v1 + α2v2 + · · · + αkvk = 0

holds only when all the αi = 0. If the vectors are not linearly independent, they are said to be linearly dependent.


7.8.3 Example

The vectors e1 and e2 in R2 are linearly independent because the only way to get

αe1 + βe2 = 0

is for both α and β to be zero.

7.8.4 Example

Are the vectors

u = (2, 1, 2)^T,  v = (2, 3, 6)^T,  w = (2, 2, 4)^T

linearly independent?

We need to check whether or not the equation

xu + yv + zw = 0

has only the trivial solution x = y = z = 0, or whether there are other solutions. Writing this out, we want to check whether there is a solution other than x = y = z = 0 for

x (2, 1, 2)^T + y (2, 3, 6)^T + z (2, 2, 4)^T = (0, 0, 0)^T

We can rewrite this as

[ 2  2  2 ] [ x ]   [ 0 ]
[ 1  3  2 ] [ y ] = [ 0 ]
[ 2  6  4 ] [ z ]   [ 0 ]

If we form the augmented matrix then we get the rref

[ 1  0  1/2 | 0 ]
[ 0  1  1/2 | 0 ]
[ 0  0   0  | 0 ]

so we get infinitely many solutions, and we conclude that the vectors considered are linearly dependent.
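Reading one particular nontrivial solution off the rref (take z = 2, so x = y = −1) gives the dependence −u − v + 2w = 0, which we can verify in Python (the helper is written for this note):

```python
def lin_comb(coeffs, vectors):
    """Form c1*v1 + c2*v2 + ... entry-wise."""
    n = len(vectors[0])
    return [sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(n)]

u = [2, 1, 2]
v = [2, 3, 6]
w = [2, 2, 4]

# Nontrivial solution (x, y, z) = (-1, -1, 2) of x u + y v + z w = 0
print(lin_comb([-1, -1, 2], [u, v, w]))  # [0, 0, 0]
```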


7.9 Back to the FTLA

7.9.1 Recall

The following are equivalent:

a) An×n is invertible.

b) The equation Ax = 0 has only the trivial solution x = 0.

c) The reduced row echelon form of A is I.

d) A is a product of elementary matrices.

e) Ax = b is consistent for all n × 1 matrices b.

f) |A| ≠ 0.

7.9.2 Add to the FTLA

g) The columns of A considered as vectors in Rn are linearly independent.

7.9.3 Proof

Suppose that A is invertible. Then the only solution to Ax = 0 is x = 0. Then the only linear combination of the columns of A that gives zero is the one where the scalar coefficients of the sum are all zero. Then the columns are linearly independent.

Conversely, if the columns are linearly independent, then the only linear combination that gives zero is the one with all scalar coefficients zero. Then Ax = 0 has only the trivial solution, and so by the FTLA A is invertible.

7.10 Dimension

7.10.1 Example

Consider the vectors

u = (2, 0)^T,  v = (1, 1)^T,  w = (−1, 1)^T

in R2. We can consider the vector space W which is spanned by all of these vectors. The elements of W look like

p = αu + βv + γw

where α, β and γ can be any real numbers. But the set of vectors {u, v, w} is not linearly independent, since

u − v + w = 0

So we can write

w = −u + v

Then vectors in the space W can be written

p = αu + βv + γw = αu + βv + γ(−u + v) = (α − γ)u + (β + γ)v = α′u + β′v

We can write every vector in W as a linear combination of u, v, and w, but we can also just write the vector as a linear combination of u and v.

It is helpful to use as few vectors as are actually needed to span the space that one is interested in.

7.10.2 General idea

Suppose that we are interested in a space W that we know is spanned by a set of vectors

B = {v1, v2, . . . , vm}

If the set of vectors is linearly independent, then none of the vectors can be written as a linear combination of the other vectors. If the set is linearly dependent, then some vector, say vm, can be written as a linear combination of the others. We can drop vm from the set and just use

B′ = {v1, v2, . . . , vm−1}

which spans the same space. We keep going until we get a linearly independent set.

7.10.3 Basis

Such a minimal set is called a basis.

7.10.4 Minimality of the basis

If a given vector space W has two bases, then the two bases contain the same number of vectors.


7.10.5 Example

The vectors

e1 = (1, 0)^T,  e2 = (0, 1)^T

are a basis for R2. So is

B = {e1, e1 + e2}

Both bases have two vectors.

7.10.6 Dimension

The dimension of a vector space V is the number of vectors in any basis of V; by the previous result this number does not depend on the choice of basis.

7.10.7 Example

Rn has dimension n.

7.10.8 Example

The space of polynomials on R is of infinite dimension.

7.11 Exercises

7.11.1 Exercise

Show that the set of polynomial functions is a vector space.

7.11.2 Exercise

Give an example of a one-dimensional subspace in R3.

7.11.3 Exercise

Give an example of a two-dimensional subspace in R3.


7.11.4 Exercise

The standard basis for R3 is

B = {e1, e2, e3}

where

e1 = (1, 0, 0)^T,  e2 = (0, 1, 0)^T,  e3 = (0, 0, 1)^T

Find another basis for R3 in which all the vectors have length 1.

7.11.5 Exercise

Can a set of five vectors in R4 be linearly independent?


Chapter 8

Euclidean vector spaces

8.1 Euclidean vector spaces

8.1.1 Definition

The vector spaces Rn, with the inner/scalar/dot product that we have already discussed, are called Euclidean vector spaces.

8.1.2 Review

If you don’t recall the dot product and its properties you should review them at this time.

8.2 Dot product and matrix multiplication

8.2.1 Transpose and dot product

Recall that for u and v in Rn we defined

u · v = u1v1 + u2v2 + · · · + unvn

Note that

u · v = v^T u

by the rules of matrix multiplication (the product of the row matrix v^T with the column matrix u is a 1 × 1 matrix, i.e. a scalar).

8.2.2 Example

Consider the vectors u = (1, 1, 1)^T and v = (1, 2, 3)^T. The dot product can be calculated by

u · v = v^T u = ( 1  2  3 ) (1, 1, 1)^T = 6

8.2.3 Matrix transpose and dot product

By the previous rule, for a square matrix An×n and vectors u and v in Rn

Au · v = v^T(Au) = (v^T A)u = (A^T v)^T u = u · A^T v

and

u · Av = (Av)^T u = (v^T A^T)u = v^T(A^T u) = A^T u · v

8.2.4 Example

Consider the vectors u = (1, 1)^T and v = (2, 5)^T and the matrix

A = [ 1  2 ]
    [ 3  4 ]

Then

A^T v = [ 1  3 ] [ 2 ]   [ 17 ]
        [ 2  4 ] [ 5 ] = [ 24 ]

and

Au · v = u · A^T v = (1, 1)^T · (17, 24)^T = 41

As a check, Au = (3, 7)^T, and Au · v = (3)(2) + (7)(5) = 41.

8.2.5 Matrix multiplication and vectors

One way of looking at the rule for matrix multiplication is to think of the first matrix as being made up of row vectors

Am×k with rows R1, R2, ..., Rm

and the second matrix as being made up of columns

Bk×n = ( C1  C2  · · ·  Cn )

and then the product looks like

(AB)m×n =
[ R1C1  R1C2  · · ·  R1Cn ]
[ R2C1  R2C2  · · ·  R2Cn ]
[  ...    ...          ... ]
[ RmC1  RmC2  · · ·  RmCn ]

8.3 Functions/maps

8.3.1 Domain of functions

Suppose that f : U → V is a function that maps elements of the set U to elements of the set V. We call U the domain of f.

8.3.2 Example

Suppose that a function f : (0, 1) → R is defined by

f(x) = 1/√(1 − x^2)

The domain of this function is (0, 1).

8.3.3 Example

Suppose that a function f : R2 → R is defined by

f((x, y)^T) = x^2 + y^2

The domain of this function is R2.

8.3.4 Range of a function

Suppose that f : U → V is a function that maps elements of the set U to elements of the set V. We call the subset of V consisting of all the elements that f maps to the range of f.


8.3.5 Example

Suppose that a function f : R2 → R is defined by

f((x, y)^T) = x^2 + y^2

The domain of this function is R2. The range is [0, ∞).

8.3.6 Example

Suppose that a function f : U → R3 is defined by

U = { (x, y)^T : x^2 + y^2 ≤ 1 } ⊂ R2

f((x, y)^T) = (x, y, x^2 + y^2)^T

The domain of this function is the unit disk U, and the range can be graphed as part of a paraboloid in 3-space.

8.4 Linear transformation

8.4.1 Linearity

Suppose that there is a function L : U → V that maps between real vector spaces U and V. Then L is said to be a linear transformation if the following conditions hold:

For any vectors x and y in U

L(x + y) = L(x) + L(y)

and for α a real scalar

L(αx) = αL(x)

8.4.2 Question

What do these two conditions remind you of?

8.4.3 Exercise

Show that the range of L is a subspace of V .


8.4.4 Exercise

Consider the function f : R2 → R2 defined by

f((x1, x2)^T) = (2x1 + x2, 3x1 + 3x2)^T

Confirm that

f((1, 1)^T) = (3, 6)^T

Show that this f is a linear transformation by checking the two requirements for a linear transformation.

8.5 Example

8.5.1 Transformation

Suppose that we have a linear transformation L : R2 → R3 defined by

L((x1, x2)^T) = (x1 + x2, x1 − x2, x1)^T

8.5.2 Linearity

We confirm that the transformation L is linear by checking, for vectors u and v in R2 and a scalar α, that

L(u + v) = L(u) + L(v)

L(αu) = αL(u)


8.5.3 First condition for linearity

L((x1, x2)^T + (y1, y2)^T) = L((x1 + y1, x2 + y2)^T)

= ((x1 + y1) + (x2 + y2), (x1 + y1) − (x2 + y2), x1 + y1)^T

= (x1 + x2, x1 − x2, x1)^T + (y1 + y2, y1 − y2, y1)^T

= L((x1, x2)^T) + L((y1, y2)^T)

8.5.4 Second condition for linearity

L(α (x1, x2)^T) = L((αx1, αx2)^T) = (αx1 + αx2, αx1 − αx2, αx1)^T = α (x1 + x2, x1 − x2, x1)^T = α L((x1, x2)^T)

8.5.5 Associated matrix

Note that L((x, y)^T) = A (x, y)^T, where

A = [ 1   1 ]
    [ 1  −1 ]
    [ 1   0 ]

So we can calculate the result of the linear transformation by doing a matrix-vector multiplication.
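The same computation in code: applying the standard matrix reproduces L. A Python sketch (the helper name is our own):

```python
def mat_vec(A, x):
    """Apply the matrix A to the vector x, row by row."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[1, 1], [1, -1], [1, 0]]  # standard matrix of the L above

print(mat_vec(A, [2, 3]))  # [5, -1, 2] = (x1 + x2, x1 - x2, x1)
```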

8.6 General linear transformation

8.6.1 Transformation

Suppose L : Rn → Rm is defined by

L((x1, x2, ..., xn)^T) = (a11x1 + a12x2 + · · · + a1nxn,  a21x1 + a22x2 + · · · + a2nxn,  ...,  am1x1 + am2x2 + · · · + amnxn)^T

We can make a connection between the linear transformation L and the matrix

A = [ a11  a12  · · ·  a1n ]
    [ a21  a22  · · ·  a2n ]
    [ ...                  ]
    [ am1  am2  · · ·  amn ]

which yields the same result as L:

L(x) = Ax

8.6.2 Standard matrix

A is often called the standard matrix of L.

8.7 Some linear transformations

8.7.1 Zero transformation

This is the transformation that maps every vector in Rn to the zero vector in Rm.

8.7.2 Exercise

What is the standard matrix for this linear transformation?

8.7.3 Identity operator

The identity operator L : Rn → Rn is defined by

L(x) = x

8.7.4 Exercise

What is the standard matrix for the identity operator?

8.7.5 Reflection operators

2D reflection in the y axis

Consider the operator L : R2 → R2 defined by

L(x) = (−x1, x2)^T


Example

L((1, 1)^T) = (−1, 1)^T

Exercise

What is the standard matrix for this reflection operator?

2D reflection about x axis

Consider the operator L : R2 → R2 defined by

L(x) = (x1, −x2)^T

Example

L((2, 6)^T) = (2, −6)^T

Exercise

What is the standard matrix for this operator?

8.7.6 Generally

Operators that reflect vectors in lines in 2-space and 3-space are called reflection operators.

8.7.7 Projection operators

In lower dimensions

In lower dimensions the projection operators are

L((x, y)^T) = (x, 0)^T

L((x, y)^T) = (0, y)^T

L((x, y, z)^T) = (x, y, 0)^T

L((x, y, z)^T) = (x, 0, z)^T

L((x, y, z)^T) = (0, y, z)^T

Example

The projection of the vector v = (2, 4, 7)^T onto the yz-plane is

Lv = (0, 4, 7)^T

Exercise

What are the standard matrices for these operators?

8.8 Rotation operators

8.8.1 Rotations in 2-space

Suppose that one identifies the vector r = (x, y)^T with the point (x, y) in the Cartesian plane, or with a directed line segment running from the origin to the point. Suppose that we want to keep the tail of the line segment at the origin and rotate the tip (without changing the magnitude of the line segment) through an angle θ in the counterclockwise direction.

Where does the tip of the line segment end up?

8.8.2 Rotation operators in 2-space

Rotation counterclockwise through an angle θ is given by

Rθ((x, y)^T) = (x cos θ − y sin θ, x sin θ + y cos θ)^T

8.8.3 Example

Let r = (1, 0)^T. Then

Rπ/4(r) = (cos π/4, sin π/4)^T = (1/√2, 1/√2)^T
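The rotation formula is easy to check numerically; a Python sketch (the function name is our own):

```python
import math

def rotate(theta, r):
    """Rotate the 2D vector r counterclockwise through angle theta."""
    x, y = r
    return [x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta)]

p = rotate(math.pi / 4, [1, 0])
print(p)  # approximately [0.7071, 0.7071], i.e. (1/sqrt(2), 1/sqrt(2))
```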

8.8.4 Exercise

What is the standard matrix for a rotation operator in 2-space?

8.8.5 Exercise

Show that this operator does not change the distance from the origin. That is

‖r‖ = ‖Rθ(r)‖

8.8.6 Rotation operators in 3-space

Rotation about the positive x axis:

L((x, y, z)^T) = (x, y cos θ − z sin θ, y sin θ + z cos θ)^T

Rotation about the positive y axis:

L((x, y, z)^T) = (x cos θ + z sin θ, y, −x sin θ + z cos θ)^T

Rotation about the positive z axis:

L((x, y, z)^T) = (x cos θ − y sin θ, x sin θ + y cos θ, z)^T


8.8.7 Exercise

What are the standard matrices for the rotations about the axes in 3-space?

8.9 Dilation/contraction operators

8.9.1 Idea

We think of a vector

r = (x, y)^T

as lying in the Cartesian plane between the origin and the point (x, y). We then consider stretching or contracting the vector while leaving its direction unchanged.

8.9.2 Example

Consider the vector

r = (2, 3)^T

The vector that has double the magnitude and points in the same direction as r is

2r = (4, 6)^T

8.9.3 Operator

The operator that dilates or contracts by a factor k > 0 without changing the direction of the vector is

Lr = \begin{pmatrix} kx \\ ky \end{pmatrix}

in 2-space and

Lr = \begin{pmatrix} kx \\ ky \\ kz \end{pmatrix}

in 3-space.
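A minimal sketch of this operator (the helper name `dilate` is ours); its standard matrix is k times the identity:

```python
def dilate(k, v):
    """Scale every component of v by k > 0; the standard matrix is k
    times the identity, so the direction of v is unchanged."""
    return [k * x for x in v]

# Doubling the example vector r = (2, 3) gives (4, 6).
w = dilate(2.0, [2.0, 3.0])
```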

8.9.4 Exercise

What is the standard matrix for this operator?


8.9.5 Exercise

What is the extension to n-space?

8.10 Composition of linear transformations

8.10.1 Composition of linear transformations

Suppose that there is a linear transformation P : Rn → Rk and another linear transformation Q : Rk → Rm. The composition is written

Q ◦ P

and is defined by

(Q ◦ P )(x) = Q(P (x))

8.10.2 The composition of linear functions is linear

Proof.

Q(P (x+ y)) = Q(P (x) + P (y)) = Q(P (x)) +Q(P (y))

Q(P (αx)) = Q(αP (x)) = αQ(P (x))

8.10.3 Exercise

What is the standard matrix for a composition of linear functions?

8.10.4 Exercise

Show that composition is not commutative.
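A small numerical illustration for both exercises (the matrices P and Q are our own choices, not from the notes): the standard matrix of Q ◦ P is the product of the standard matrices, and the order of the product matters:

```python
def matmul(A, B):
    """Product of two 2x2 matrices; the standard matrix of the
    composition Q o P is (matrix of Q) times (matrix of P)."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# P projects onto the x-axis; Q reflects across the line y = x.
P = [[1, 0], [0, 0]]
Q = [[0, 1], [1, 0]]

QP = matmul(Q, P)  # first project, then reflect
PQ = matmul(P, Q)  # first reflect, then project
```

Here QP ≠ PQ, so composition is not commutative in general.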


Chapter 9

Linear transformations of euclidean spaces

9.1 Recall

9.1.1 Linearity

The map L : Rn → Rm is said to be a linear transformation if for any u and v in the domain and α a real number we have

L(u+ v) = L(u) + L(v)

L(αu) = αL(u)

9.1.2 Significance

The range of a linear transformation is a vector space.

9.1.3 Standard matrix

A linear transformation L : Rn → Rm has a matrix Am×n associated with it such that for u in the domain of L

L(u) = Au

9.2 One-to-one functions

9.2.1 One-to-one

f(x) = f(y) =⇒ x = y


9.2.2 Example

Consider the function f : [−1, 1]→ R defined by

f(x) = x2

This function is NOT one-to-one (injective) since f(x) = f(y) does not necessarily imply that x = y. For example, if f(x) = 1 = f(y) it could be that x = 1 and y = −1.

9.2.3 Example

Consider the function f : [0, 1]→ R defined by

f(x) = x2

This function is one-to-one (injective) since f(x) = f(y) implies x² = y², and hence x = y, since all possible values of x and y are non-negative.

9.3 Onto functions

9.3.1 Onto functions

A function f : U → V is said to be onto V (surjective) if for any y in V there is an x in U so that

y = f(x)

9.3.2 Example

Consider the function f : (−1, 1)→ R defined by

f(x) = x2

This function is not onto R because 12 is in R and there is no x in (−1, 1) so that x2 = 12.

9.3.3 Example

Consider the function f : [0, 1]→ [0, 1] defined by

f(x) = x2

This function is onto [0, 1].

9.3.4 Exercise

Consider the identity operator I : R3 → R3. Show that this function is both one-to-one and onto.


9.4 Invertible functions

9.4.1 Condition

A function f : U → V has a well-defined inverse if f is both one-to-one and onto (injective and surjective). That is because for every v in V there is a u in U so that f(u) = v (and only one such u), so it makes sense to define the inverse function by

f−1(v) = u

9.5 FTLA

9.5.1 Recall

We had that the following are equivalent for a square matrix An×n:

a) A is invertible.

b) The equation Ax = 0 has only the trivial solution x = 0.

c) The rref of A is I.

d) A is a product of elementary matrices.

e) Ax = b is consistent for all n× 1 matrices b.

f) |A| ≠ 0.

g) The columns of A considered as vectors in Rn are linearly independent.

9.5.2 Linear transformation and standard matrix

Recall that a linear transformation L : Rn → Rn has a square matrix associated with it, and every square matrix can be thought of as the standard matrix for a linear transformation L : Rn → Rn.

What does the invertibility of A tell us about L and how does it relate to the FTLA?

9.5.3 Theorem

Suppose An×n is a square matrix and L : Rn → Rn is multiplication by A. If A is invertible then the range of L is Rn.


9.5.4 Proof

Suppose that A is invertible. We want to show that for any y in Rn there is an x in Rn so that

Ax = y

Since A is invertible,

x = A−1y

satisfies Ax = AA−1y = y, so it gives the desired x for any given y. Thus L is onto Rn.

9.5.5 Theorem

Suppose An×n is a square matrix and L : Rn → Rn is multiplication by A. If the range of L is Rn then A is invertible.

9.5.6 Proof

Consider the n standard basis vectors for Rn. The ith one is ei, which has all zeros except for a 1 in the ith entry. As L is onto, for each ei there is some vector xi so that

Axi = ei

Now consider the matrix B whose columns are the xi:

B = ( x1 x2 · · · xn−1 xn )

Then

AB = A( x1 x2 · · · xn−1 xn ) = ( e1 e2 · · · en−1 en ) = I

and so B is the inverse of A.

9.5.7 Theorem

Suppose An×n is a square matrix and L : Rn → Rn is multiplication by A. If A is invertible then L is one-to-one.

9.5.8 Proof

If x and y are vectors in Rn and

L(x) = L(y)

then

Ax = Ay


and multiplying by the inverse of A on both sides gives

x = y

Then

L(x) = L(y) =⇒ x = y

9.5.9 Theorem

Suppose An×n is a square matrix and L : Rn → Rn is multiplication by A. If L is one-to-one then A is invertible.

9.5.10 Proof

Suppose that L is one-to-one. Then

L(x) = L(0) =⇒ x = 0

Ax = 0 =⇒ x = 0

So the equation Ax = 0 has only the trivial solution and A must be invertible by the FTLA.

9.5.11 Additions to the FTLA

h) The range of the linear transformation which is multiplication by A is Rn.

i) The linear transformation which is multiplication by A is one-to-one.

9.5.12 Example

A projection operator P : R2 → R2 which projects onto the x-axis has standard matrix

A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}

A is not invertible, so this operator P is not one-to-one and the range of P is not R2.

9.6 Inverse of matrix and inverse of function

9.6.1 Recall

If An×n is invertible then the operator L : Rn → Rn which is multiplication by A is one-to-one and onto, and so L has an inverse.


9.6.2 Standard matrix for the inverse linear transformation

The standard matrix for the inverse linear transformation is A−1.

9.7 Standard matrix for a linear transformation

9.7.1 Question

Given a linear transformation L : Rn → Rm how do we find its standard matrix?

9.7.2 Consider how linear transformation acts on basis vectors

Suppose that L acts on the basis vectors of Rn as

L(ei) = xi

Once we have that, we know what happens to any vector in the domain, since it is a linear combination of the ei:

L(\sum_i \alpha_i e_i) = \sum_i \alpha_i x_i

Now construct Am×n as the matrix whose columns are the xi:

A = ( x1 x2 · · · xn )

Note that

Aei = xi = L(ei)

so multiplication by A has the same result as L on the basis vectors, and so the same is true for any vector in the domain of L. Then A is the standard matrix for L.

9.7.3 Example

Consider the linear operator L : R2 → R2 which is defined by

L \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x + y \\ x - y \end{pmatrix}

Then

L \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 + 0 \\ 1 - 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}

and

L \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 + 1 \\ 0 - 1 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \end{pmatrix}

Then the standard matrix for L is

A = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
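The recipe above — apply L to each standard basis vector and use the images as columns — translates directly into code. A minimal sketch (the helper name `standard_matrix` is ours):

```python
def standard_matrix(L, n):
    """Build the standard matrix of a linear map L: R^n -> R^m by applying
    L to each standard basis vector; column i is L(e_i)."""
    cols = []
    for i in range(n):
        e = [0.0] * n
        e[i] = 1.0
        cols.append(L(e))
    m = len(cols[0])
    # Assemble the matrix row by row from the stored columns.
    return [[cols[j][i] for j in range(n)] for i in range(m)]

# The operator from the example: L(x, y) = (x + y, x - y).
A = standard_matrix(lambda v: [v[0] + v[1], v[0] - v[1]], 2)
```

This reproduces the matrix A found in the example.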


Chapter 10

Least squares

10.1 Orthogonal bases

10.1.1 Given a subspace

Suppose that W is an m-dimensional subspace of Rn with a basis

S = { w1, w2, . . . , wm }

Every vector in W can be expressed as a linear combination of the basis vectors:

w = α1w1 + α2w2 + · · · + αmwm

What are the coefficients αi?

10.1.2 Orthogonal basis

A basis is said to be orthogonal if each basis vector is orthogonal to all of the other basis vectors.

10.1.3 Example

The standard basis for Rn is an example of an orthogonal basis.

10.1.4 Linear combinations with an orthogonal basis

Suppose that a subspace W of a euclidean vector space has an orthogonal basis. Then any vector w in W can be written as a linear combination of the basis vectors. If we take the dot product with the ith basis vector,

w · wi = αi (wi · wi)

and so

αi = (w · wi)/(wi · wi)


Then any vector in W can be written

w = ((w · w1)/(w1 · w1)) w1 + ((w · w2)/(w2 · w2)) w2 + · · · + ((w · wm)/(wm · wm)) wm

10.2 Example

10.2.1 Problem

Consider the subspace W spanned by the vectors

p = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}

and

q = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}

Any vector in W can be written as a linear combination of these vectors:

w = αp + βq

Note that

p · q = 1 ≠ 0

10.2.2 Finding an orthogonal basis

We will find a new orthogonal basis

O = { r, s }

for W . Let

r = p

be the first basis vector. Now let's find a new basis vector in W of the form

s = q + αr

that is orthogonal to r:

r · s = r · q + α r · r = 0

Then

α = −(r · q)/(r · r)


Then the new orthogonal basis vectors for W are

r = p = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}

and

s = q + αr = q − ((r · q)/(r · r)) r = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} − (1/1) \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}

10.3 Gram-Schmidt orthogonalization process

10.3.1 Problem

Given a basis

S = { w1, w2, . . . , wm }

of a subspace W of Rn, we would like to find an orthogonal basis

O = { u1, u2, . . . , um }

for W .

10.3.2 Gram-Schmidt orthogonalization process

1) Let

u1 = w1

2) Let

u2 = w2 − ((w2 · u1)/(u1 · u1)) u1

3) Let

u3 = w3 − ((w3 · u1)/(u1 · u1)) u1 − ((w3 · u2)/(u2 · u2)) u2

...

m) Let

um = wm − ((wm · u1)/(u1 · u1)) u1 − ((wm · u2)/(u2 · u2)) u2 − · · · − ((wm · um−1)/(um−1 · um−1)) um−1

This results in an orthogonal basis for W .
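The steps above translate directly into code. A minimal sketch (the helper names `dot` and `gram_schmidt` are ours):

```python
def dot(u, v):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(ws):
    """Orthogonalize a list of linearly independent vectors: each u_j is
    w_j minus its projections onto the previously built u_i."""
    us = []
    for w in ws:
        u = list(w)
        for prev in us:
            c = dot(w, prev) / dot(prev, prev)
            u = [a - c * b for a, b in zip(u, prev)]
        us.append(u)
    return us

# The example basis: w1 = (1, 0, 0), w2 = (1, 1, 1).
u1, u2 = gram_schmidt([[1.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
```

On the example basis this reproduces u1 = (1, 0, 0) and u2 = (0, 1, 1).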


10.3.3 Example again using Gram-Schmidt

Consider the subspace W of R3 spanned by the vectors

w1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}

and

w2 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}

These vectors are not orthogonal. We will use the Gram-Schmidt process to create an orthogonal basis for the same subspace.

10.3.4 Applying Gram-Schmidt

Let

u1 = w1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}

u2 = w2 − ((w2 · u1)/(u1 · u1)) u1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} − (1/1) \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}

Then an orthogonal basis for the subspace W is

O = { u1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, u2 = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} }

10.4 Using a basis to approximate a vector

10.4.1 A vector as a linear combination of basis vectors

Suppose that we have a vector v ∈ R3

v = \begin{pmatrix} 1 \\ 2 \\ 5 \end{pmatrix}

We can construct this vector (and every other vector in R3) as a linear combination of the standard basis vectors:

v = 1e1 + 2e2 + 5e3


10.4.2 Question

Can the vector v be constructed as a linear combination of only two of the standard basis vectors, e1 and e2? No.

10.4.3 What is the best that you could do?

Suppose that you wanted to find a vector

w∗ = α1e1 + α2e2

in the span of e1 and e2 closest (in some sense) to v.

Note that the basis vectors for this span are already orthogonal.

The distance squared between w∗ and v is given by

‖w∗ − v‖² = (α1 − 1)² + (α2 − 2)² + (0 − 5)²

Differentiating with respect to α1 and α2 and setting the partial derivatives to zero gives

2(α1 − 1) = 0

2(α2 − 2) = 0

implying that the closest vector to v in the span of e1 and e2 is

1e1 + 2e2 = \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix}

10.4.4 What about a different span of vectors?

Suppose that we have vectors

w1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}

w2 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}

and we want the closest vector w∗ in their span W to v.

We previously saw that the Gram-Schmidt orthogonalization process gives new basis vectors for the same subspace W :


u1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}

u2 = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}

Now we express the approximation w∗ for the vector v in terms of the new basis vectors:

w∗ = α1u1 + α2u2 = \begin{pmatrix} α1 \\ α2 \\ α2 \end{pmatrix}

The distance squared to v is

‖w∗ − v‖² = (α1 − 1)² + (α2 − 2)² + (α2 − 5)²

If we seek to minimize this distance then differentiation gives two equations

2(α1 − 1) = 0

2(α2 − 2) + 2(α2 − 5) = 0

and a bit of algebra gives

α1 = 1, α2 = 7/2

Then the closest vector in the span of w1 and w2 to the vector v is

w∗ = 1u1 + (7/2)u2 = \begin{pmatrix} 1 \\ 7/2 \\ 7/2 \end{pmatrix}

10.5 Geometric viewpoint

10.5.1 Geometric form of problem

We have a vector in R3

v = \begin{pmatrix} 1 \\ 2 \\ 5 \end{pmatrix}

which we identify with the point (1, 2, 5) in Cartesian 3-space.

We also have the span of the vectors

w1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}

and

w2 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}

which we view geometrically as a plane in Cartesian 3-space. We want to project v onto this plane.

Life is a little easier when working with an orthogonal basis, so we use the orthogonal basis vectors

u1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, u2 = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}

instead. After projecting onto the plane we get the 'shadow' of v in W

w∗ = \begin{pmatrix} 1 \\ 7/2 \\ 7/2 \end{pmatrix}

which we identify with the point (1, 7/2, 7/2) in the plane.

10.5.2 Orthogonality to the plane

We can imagine drawing line segments from the original point v to different points in the plane. Intuitively, the shortest line segment will be the one that is perpendicular to the plane. This corresponds to the difference between v and w∗.

We confirm that w∗ − v is orthogonal to any vector in W :

(w∗ − v) · (α1u1 + α2u2) = \begin{pmatrix} 0 \\ 3/2 \\ -3/2 \end{pmatrix} · \begin{pmatrix} α1 \\ α2 \\ α2 \end{pmatrix} = 0


10.6 The least squares problem

10.6.1 The least squares problem in Rn

Let W be an m-dimensional subspace of Rn. Given a vector v in Rn, find a vector w∗ in W so that

‖w∗ − v‖ ≤ ‖w − v‖ for all w ∈ W

10.6.2 Best least squares approximation

The vector w∗ in W is called the best least squares approximation to v.

10.6.3 Example

We have already done an example, with v ∈ R3 and w∗ in a vector space which is the span of two vectors w1 and w2. The vector that we found is the best least squares approximation to v.

10.7 Orthogonality result

10.7.1 Theorem

Suppose that v is a vector in Rn. Suppose that W is an m-dimensional subspace of Rn and w∗ in W is such that

(w∗ − v)tw = 0

for all w in W . Then w∗ is the best least-squares approximation to v.

10.7.2 In other words ..

The vector connecting v to w∗ is orthogonal to the subspace W .

10.7.3 Proof

Given the conditions of the theorem


‖w − v‖2 = ‖w − w∗ + w∗ − v‖2

= [(w − w∗) + (w∗ − v)]t[(w − w∗) + (w∗ − v)]

= [(w − w∗)t + (w∗ − v)t][(w − w∗) + (w∗ − v)]

= (w − w∗)t(w − w∗) + (w − w∗)t(w∗ − v) + (w∗ − v)t(w − w∗) + (w∗ − v)t(w∗ − v)

= (w − w∗)t(w − w∗) + 2(w∗ − v)t(w − w∗) + (w∗ − v)t(w∗ − v)

= ‖w − w∗‖2 + 2(w∗ − v)t(w − w∗) + ‖w∗ − v‖2

Now as w and w∗ are in W by the assumption of the theorem

(w∗ − v)tw = 0

and

(w∗ − v)tw∗ = 0

Then

‖w − v‖2 = ‖w − w∗‖2 + ‖w∗ − v‖2

Now as the magnitude of a vector is greater than or equal to zero we have

‖w − v‖2 ≥ ‖w∗ − v‖2

So

‖w∗ − v‖ ≤ ‖w − v‖

for all vectors w in W .

10.8 Testing for orthogonality

10.8.1 Lemma

Suppose that we have an m-dimensional subspace W of Rn and we want to find out if a given vector n of Rn is orthogonal to every vector in W . Then n is orthogonal to every vector w in W if and only if it is orthogonal to every basis vector of W .

10.8.2 Proof

Suppose that W has a set of basis vectors

S = {w1, w2, . . . wm}

and


n · wi = 0

for all the wi in S. Then, as S is a set of basis vectors for W , any vector w in W can be written as a linear combination of the vectors in S:

w = α1w1 + α2w2 + · · ·+ αmwm

Taking the dot product with n gives

n · w = n · (α1w1 + α2w2 + · · ·+ αmwm)

= n · α1w1 + n · α2w2 + · · ·+ n · αmwm= α1n · w1 + α2n · w2 + · · ·+ αmn · wm= 0 + 0 + · · · 0= 0

Conversely, if n is orthogonal to every vector in W then it is orthogonal in particular to the basis vectors of W .

10.9 Existence and uniqueness of best approximations

10.9.1 Recall

Any best approximation w∗ in a subspace W for v in Rn will have the property that for any w in W

(w∗ − v)tw = 0

and to find such a w∗ it is sufficient to make w∗ − v orthogonal to every basis vector of W .

10.9.2 Resulting equations

If W has a set of basis vectors

S = {w1, w2, . . . , wm}

then the best approximation w∗ for v in Rn has the property that the following equations hold

(w∗ − v)tw1 = 0

(w∗ − v)tw2 = 0

...

(w∗ − v)twm = 0


10.9.3 Uniqueness

If these equations have a unique solution then the best approximation exists and must be unique.

10.9.4 Solving the system of equations

Suppose that

S = {w1, w2, . . . , wm}

is a set of orthogonal basis vectors, i.e. each basis vector is orthogonal to the others. We can always find such a basis for any subspace of Rn.

We write w∗ as a linear combination of the orthogonal basis vectors for W :

w∗ = α1w1 + α2w2 + · · · + αmwm

Then the system of equations becomes

(α1w1 + α2w2 + · · ·+ αmwm − v)tw1 = 0

(α1w1 + α2w2 + · · ·+ αmwm − v)tw2 = 0

...

(α1w1 + α2w2 + · · ·+ αmwm − v)twm = 0

and using orthogonality gives

α1‖w1‖2 − vtw1 = 0

α2‖w2‖2 − vtw2 = 0

...

αm‖wm‖2 − vtwm = 0

and then

αi = (vtwi)/‖wi‖2

for i = 1, 2, . . . , m.


10.9.5 A best approximation

The best approximation for v is then

w∗ = ((vtw1)/‖w1‖2) w1 + ((vtw2)/‖w2‖2) w2 + · · · + ((vtwm)/‖wm‖2) wm = \sum_{i=1}^{m} ((vtwi)/‖wi‖2) wi
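This formula can be sketched directly in code (the helper names `dot` and `best_approximation` are ours); on the earlier example it reproduces the projection of v = (1, 2, 5):

```python
def dot(u, v):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(u, v))

def best_approximation(v, orthogonal_basis):
    """Project v onto the subspace spanned by an ORTHOGONAL basis:
    w* = sum_i (v . w_i / ||w_i||^2) w_i."""
    n = len(v)
    w_star = [0.0] * n
    for w in orthogonal_basis:
        c = dot(v, w) / dot(w, w)
        w_star = [a + c * b for a, b in zip(w_star, w)]
    return w_star

# The running example: project v = (1, 2, 5) onto span{(1,0,0), (0,1,1)}.
w_star = best_approximation([1.0, 2.0, 5.0],
                            [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]])
```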

10.9.6 Uniqueness

The best approximation w∗ is unique; no other vector in W can be a better least squares approximation.

10.9.7 Proof

Suppose that w∗ is the best approximation as we have just constructed. Let w+ be some other best approximation in W . Then

‖w+ − v‖2 = ‖w+ − w∗ + w∗ − v‖2

= [(w+ − w∗) + (w∗ − v)]t[(w+ − w∗) + (w∗ − v)]

= (w+ − w∗)t(w+ − w∗) + (w+ − w∗)t(w∗ − v) + (w∗ − v)t(w+ − w∗) + (w∗ − v)t(w∗ − v)

= ‖w+ − w∗‖2 + 2(w+ − w∗)t(w∗ − v) + ‖w∗ − v‖2

As w∗ − v was orthogonal to any vector in W

‖w+ − v‖2 = ‖w+ − w∗‖2 + ‖w∗ − v‖2

But w∗ and w+ are both best approximations so

‖w+ − v‖2 = ‖w∗ − v‖2

Then

‖w+ − w∗‖2 = 0

and these best approximations are in fact the same vector.

10.10 Example

10.10.1 Problem

Consider the subspace W spanned by the vectors

w1 = \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}

and

w2 = \begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix}

We want to find the best least squares approximation in W to the vector

v = \begin{pmatrix} 1 \\ -2 \\ 4 \end{pmatrix}

The spanning vectors are linearly independent, so

S = {w1, w2}

is a basis. As the basis is not orthogonal, we construct an orthogonal basis for W using the Gram-Schmidt process.

Let

u1 = w1 = \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}

u2 = w2 − ((w2 · u1)/(u1 · u1)) u1 = \begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix} − (−2/2) \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}

So an orthogonal basis for W is

O = { u1 = \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}, u2 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} }

Then the best least squares approximation in W is given by

w∗ = ((v · u1)/(u1 · u1)) u1 + ((v · u2)/(u2 · u2)) u2

= (−3/2) u1 + (3/3) u2

= (−3/2) \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}

= \begin{pmatrix} 5/2 \\ -1/2 \\ 1 \end{pmatrix}


10.10.2 Check

Recall that w∗ − v needs to be orthogonal to the basis vectors of W . We have

w∗ − v = \begin{pmatrix} 5/2 \\ -1/2 \\ 1 \end{pmatrix} − \begin{pmatrix} 1 \\ -2 \\ 4 \end{pmatrix} = \begin{pmatrix} 3/2 \\ 3/2 \\ -3 \end{pmatrix}

We see that

(w∗ − v)tw1 = 0

(w∗ − v)tw2 = 0
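The whole worked example can also be verified numerically; this sketch (helper name `dot` is ours) redoes the Gram-Schmidt step and the projection, then checks the orthogonality of w∗ − v:

```python
def dot(u, v):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(u, v))

w1 = [-1.0, 1.0, 0.0]
w2 = [2.0, 0.0, 1.0]
v = [1.0, -2.0, 4.0]

# Gram-Schmidt on {w1, w2}.
u1 = w1
c = dot(w2, u1) / dot(u1, u1)
u2 = [a - c * b for a, b in zip(w2, u1)]

# Project v onto span{u1, u2}.
c1 = dot(v, u1) / dot(u1, u1)
c2 = dot(v, u2) / dot(u2, u2)
w_star = [c1 * a + c2 * b for a, b in zip(u1, u2)]

# The residual should be orthogonal to the original basis vectors.
residual = [a - b for a, b in zip(w_star, v)]
```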

10.11 Finding a line that best fits data points

10.11.1 Experiment

I believe that a quantity y varies with time according to some law

y = mt+ b

and now wish to conduct an experiment to find m and b.

10.11.2 Data points

I measure y at various times t and the measurements are recorded as

t  y
1  1
2  5
4  7
5  11

10.11.3 Trying to find the line

Using my model of y = mt + b and assuming that if I plotted these points they would lie on the graph of this equation, I get the equations

1m + b = 1

2m + b = 5

4m + b = 7

5m + b = 11


This can be written

\begin{pmatrix} 1 & 1 \\ 2 & 1 \\ 4 & 1 \\ 5 & 1 \end{pmatrix} \begin{pmatrix} m \\ b \end{pmatrix} = \begin{pmatrix} 1 \\ 5 \\ 7 \\ 11 \end{pmatrix}

10.11.4 Inconsistency

If I form the augmented matrix and do row reduction it turns out that the system is inconsistent. Now what?

10.11.5 Strategy

We will have some measurement error in any experiment. So let's try to find an m and a b that give a best fit of the line to the data points.

10.11.6 Seek to minimize square errors

We then seek an m and a b so as to minimize the sum of the squared differences, for each t, between mt + b and the actual measured value:

I = (1/2)[(m + b − 1)² + (2m + b − 5)² + (4m + b − 7)² + (5m + b − 11)²]

Taking partial derivatives and setting them to zero gives

Im = (m+ b− 1) + 2(2m+ b− 5) + 4(4m+ b− 7) + 5(5m+ b− 11) = 0

Ib = (m+ b− 1) + (2m+ b− 5) + (4m+ b− 7) + (5m+ b− 11) = 0

or

(m+ b) + 2(2m+ b) + 4(4m+ b) + 5(5m+ b) = 1 + 2 · 5 + 4 · 7 + 5 · 11

and

(m+ b) + (2m+ b) + (4m+ b) + (5m+ b) = 1 + 5 + 7 + 11

This can be rewritten in terms of matrices as

\begin{pmatrix} 1 & 2 & 4 & 5 \\ 1 & 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 2 & 1 \\ 4 & 1 \\ 5 & 1 \end{pmatrix} \begin{pmatrix} m \\ b \end{pmatrix} = \begin{pmatrix} 1 & 2 & 4 & 5 \\ 1 & 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 5 \\ 7 \\ 11 \end{pmatrix}

Note that we now have the original system multiplied on both sides by the transpose of the original matrix. Then we get


\begin{pmatrix} 46 & 12 \\ 12 & 4 \end{pmatrix} \begin{pmatrix} m \\ b \end{pmatrix} = \begin{pmatrix} 94 \\ 24 \end{pmatrix}

and so

\begin{pmatrix} m \\ b \end{pmatrix} = \begin{pmatrix} 46 & 12 \\ 12 & 4 \end{pmatrix}^{-1} \begin{pmatrix} 94 \\ 24 \end{pmatrix} = \begin{pmatrix} 11/5 \\ -3/5 \end{pmatrix}
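The computation above can be reproduced numerically. A minimal sketch (the helper names `transpose`, `matmul`, and `solve2` are ours) that forms AtA and Atb and solves the 2 × 2 system:

```python
def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def solve2(M, y):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(y[0] * M[1][1] - M[0][1] * y[1]) / det,
            (M[0][0] * y[1] - y[0] * M[1][0]) / det]

# The data from the example: points (t, y) with t = 1, 2, 4, 5.
A = [[1.0, 1.0], [2.0, 1.0], [4.0, 1.0], [5.0, 1.0]]
b = [[1.0], [5.0], [7.0], [11.0]]

At = transpose(A)
AtA = matmul(At, A)          # the 46, 12 / 12, 4 matrix above
Atb = matmul(At, b)          # the (94, 24) right-hand side
m, bb = solve2(AtA, [Atb[0][0], Atb[1][0]])
```

This reproduces m = 11/5 and b = −3/5.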

10.12 Least squares solutions to inconsistent systems

10.12.1 Problem

Suppose that the equation

Ax = b

is inconsistent. We cannot find an x which satisfies the equation, so we will try to find one that is the best fit. That is, we want to minimize the magnitude of the vector

r = Ax− b

10.12.2 Solution

We seek to minimize

F(x) = (Ax − b)t(Ax − b)

Let h be a small deviation about x:

F(x + h) = [A(x + h) − b]t[A(x + h) − b]
= [(Ax − b) + Ah]t[(Ax − b) + Ah]
= (Ax − b)t(Ax − b) + 2(Ah)t(Ax − b) + O(h²)
= F(x) + 2htAt(Ax − b) + O(h²)

The term At(Ax − b) plays the role of a derivative, which we set equal to zero to get a minimum. So we want to solve

At(Ax − b) = 0 =⇒ AtAx = Atb

10.12.3 Normal equations

The equations

AtAx = Atb

are called the normal equations.


10.12.4 Properties

a) The normal equations are always consistent.

b) The solutions of the normal equations are the least-squares solutions of Ax = b.

c) If A is m × n, the solutions of the normal equations are unique if and only if A has rank n.

10.13 Exercises

10.13.1 Exercise

Fit a line with an equation of the form y = mx + b to the following data:

x  y
1  1
2  3
4  3
5  6

10.13.2 Exercise

Fit a curve with an equation of the form y = ax² + bx + c to the following data:

x  y
1  3
2  6
3  14
5  30

10.13.3 Exercise

Consider the vector v in R4

v = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix}

Let W ⊂ R4 be the span of the vector

w = \begin{pmatrix} 4 \\ 3 \\ 2 \\ 1 \end{pmatrix}

Find the best least-squares approximation to v in the subspace W .


Chapter 11

Vector Spaces

11.1 Examples

We will illustrate the properties of vector spaces using three example spaces.

11.1.1 Euclidean spaces

Note that linear combinations of two vectors in Rn give vectors in Rn.

11.1.2 Finite degree polynomials

Note that linear combinations of polynomials result in a polynomial.

11.1.3 Solutions to some differential equations

Consider the solutions to the differential equation

y′(x) = 0, x ∈ [0, 1]

Note that a linear combination of two solutions is another solution.

11.2 Zero vector

11.2.1 Zero vector in euclidean spaces

The Rn vector spaces have a zero vector which is a column vector with all zero entries.

0 = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}


11.2.2 Zero vector in the polynomial vector space

What would be the zero vector in the polynomial vector space?

11.2.3 Zero vector in solutions of the differential eqn

What would be the zero vector in the vector space of the solutions of

y′(x) = 0, x ∈ [0, 1]

11.2.4 Property of the zero vector

For any vector u in the vector space

u+ 0 = 0 + u = u

11.2.5 Exercise

Confirm that the zero vector in each of the three example spaces has the desired property.

11.3 Addition

11.3.1 Addition

If u and v are vectors in a vector space V then

u + v

is also in V .

11.3.2 Example

Euclidean spaces.

11.3.3 Example

Adding two polynomials gives another polynomial.

11.3.4 Example

If u and v are solutions to

y′(x) = 0 x ∈ [0, 1]

then

u′(x) = 0 x ∈ [0, 1]


v′(x) = 0 x ∈ [0, 1]

and

(u+ v)′(x) = u′(x) + v′(x) = 0 + 0 = 0 x ∈ [0, 1]

11.4 Scalar multiplication

11.4.1 Scalar multiplication

If u is a vector in a real vector space then for any real number α we have that αu is a member of the vector space.

11.4.2 Example

A scalar multiplication was defined so that for any α ∈ R and any u ∈ Rn

αu ∈ Rn

11.4.3 Example

The scalar multiplication of polynomials is defined as

α(c_0 + c_1x + c_2x^2 + · · · + c_nx^n) = αc_0 + αc_1x + αc_2x^2 + · · · + αc_nx^n

which is a vector in the space of polynomials.

11.4.4 Example

If u is a solution to

y′(x) = 0 x ∈ [0, 1]

then

u′(x) = 0 x ∈ [0, 1]

and

(αu)′(x) = α(u)′(x) = α · 0 = 0 x ∈ [0, 1]

Then αu is also a solution and a vector in the vector space of solutions of the differential equation

y′(x) = 0 x ∈ [0, 1]


11.5 Properties we have seen for Euclidean spaces

From the way we defined addition and scalar multiplication for vectors in Rn we can show:

11.5.1 Properties

1. For u and v vectors in Rn

u+ v

is also a vector in Rn.

2. For u and v vectors in Rn

u+ v = v + u

3. For u, v, and w vectors in Rn

(u+ v) + w = u+ (v + w)

4. For u in Rn there is a zero vector 0 such that

u+ 0 = u = 0 + u

5. For every vector u in Rn there is a vector −u in Rn such that

u+ (−u) = 0

6. For every vector u in Rn and α in R there is a vector

αu

in Rn.

7. For every u and v in Rn and α a real number

α(u+ v) = αu+ αv

8. For every u in Rn and α and β real numbers

(α + β)u = αu+ βu

9. For every u in Rn and α and β real numbers

α(βu) = (αβ)u

10. For u in Rn

1u = u


11.6 Properties of a vector space

11.6.1 Definition

A set V is said to be a real vector space if the properties that we just recalled for the Rn vector spaces hold.

11.6.2 Property 1

1. For u and v vectors in V

u+ v

is also a vector in V .

11.6.3 Example

We checked this for the space of polynomials and the space of solutions to the differential equation previously.

11.6.4 Property 2

2. For u and v vectors in V

u+ v = v + u

11.6.5 Example

Addition of functions is commutative, so in particular addition of polynomial functions is commutative.

11.6.6 Property 3

3. For u, v, and w vectors in V

(u+ v) + w = u+ (v + w)

11.6.7 Example

Addition of functions in general is associative, so this property holds in particular for polynomial functions and solutions of the differential equation.


11.6.8 Property 4

4. For u in V there is a zero vector 0 such that

u+ 0 = u = 0 + u

11.6.9 Example

We already checked this for the example spaces.

11.6.10 Property 5

5. For every vector u in V there is a vector −u in V such that

u+ (−u) = 0

11.6.11 Example

For a polynomial p defined by

p(x) = c_0 + c_1x + c_2x^2 + · · · + c_nx^n

the polynomial −p is defined by

(−p)(x) = −c_0 − c_1x − c_2x^2 − · · · − c_nx^n

which is another polynomial, and p + (−p) is such that

(p + (−p))(x) = (c_0 − c_0) + (c_1 − c_1)x + (c_2 − c_2)x^2 + · · · + (c_n − c_n)x^n ≡ 0

11.6.12 Example

If u is a solution to the differential equation then

u′(x) = 0

and

(−u)′(x) = −u′(x) = −0 = 0

so −u is also a vector in the space of solutions, and u + (−u) = 0.

11.6.13 Property 6

6. For every vector u in V and α in R there is a vector

αu

in V .


11.6.14 Example

Already checked these for the example spaces.

11.6.15 Property 7

7. For every u and v in V and α a real number

α(u+ v) = αu+ αv

11.6.16 Example

True for the Euclidean spaces because of the properties of the real numbers. True in general for functions, so specifically for the polynomial functions and the functions that are members of the solution space.

11.6.17 Property 8

8. For every u in V and α and β real numbers

(α + β)u = αu+ βu

11.6.18 Examples

True for Euclidean spaces by the properties of real numbers. True in general for functions, and so true in particular for the polynomial functions and the functions that are solutions to the differential equation.

11.6.19 Property 9

9. For every u in V and α and β real numbers

α(βu) = (αβ)u

11.6.20 Examples

True for the Euclidean spaces by the properties of real numbers. True for real valued functions in general, and hence true in particular for the polynomial functions and the functions that are solutions to the differential equation.

11.6.21 Property 10

10. For u in V

1u = u


11.6.22 Examples

True for Euclidean spaces by the properties of real numbers. True in general for real valued functions by the definition of scalar multiplication.

11.7 Subspaces

11.7.1 Definition

If V is a real vector space, and W is a non-empty subset of V which is also itself a vector space, then W is said to be a subspace of V .

11.7.2 Theorem - Checking if a subset is a subspace

If W is a non-empty subset of vectors of a vector space V then W is a subspace of V if and only if

a) u, v ∈ W implies u+ v ∈ W

b) α ∈ R and u ∈ W implies αu ∈ W .

11.7.3 Proof

Exercise.

11.7.4 Example

Consider the subset W of R2 which consists of all of the vectors of the form

\begin{pmatrix} x \\ x^2 \end{pmatrix}

Is this a subspace? Consider the addition of two vectors:

\begin{pmatrix} 1 \\ 1 \end{pmatrix} + \begin{pmatrix} 2 \\ 4 \end{pmatrix} = \begin{pmatrix} 3 \\ 5 \end{pmatrix} ∉ W

W is not a subspace since the sum of two vectors in W is not necessarily in W .

11.7.5 Example

Consider the subset W of the space of all polynomials which consists of all polynomials of degree 3 or less. This is a subspace because: 1) the sum of two polynomials of degree three or less is a polynomial of degree three or less; 2) a scalar multiple of a polynomial of degree three or less is a polynomial of degree three or less.


11.7.6 Example

Every vector space has the trivial subspace {0}, which is not very interesting.

11.7.7 Example

The solution space of the homogeneous equation Ax = 0 is a subspace.

11.8 Linear combinations

11.8.1 Linear combination of vectors

A vector v in a real vector space V is said to be a linear combination of vectors v1, v2, . . . , vn if there are real numbers α1, α2, . . . , αn such that

v = α1v1 + α2v2 + · · ·+ αnvn

11.8.2 Example

In R2 the vector

v = \begin{pmatrix} α \\ β \end{pmatrix}

can be written as a linear combination of the vectors

e1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}

e2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}

by

v = αe1 + βe2

11.8.3 Example

In R2 the vector

v = \begin{pmatrix} α \\ β \end{pmatrix}

can be written as a linear combination of the vectors

v1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}

v2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}

by

v = ((α − β)/2) v1 + ((α + β)/2) v2

11.8.4 Example

The polynomial function

p(x) = 2 + 5x+ 7x2

is a linear combination of the vectors

1, x, x2

11.8.5 Example

In Rn all vectors can be written as linear combinations of the vectors

e1 = \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, e2 = \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix}, . . . , en = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}

11.8.6 Theorem - linear combinations form a subspace

Suppose v1, v2, ... , vk are vectors in a vector space V .

a) Then the set W of all linear combinations of v1, v2, ... , vk is a subspace of V .

b) It is the smallest subspace that contains all the vi.

Page 163: Linear Algebra Notes

11.8. LINEAR COMBINATIONS 163

11.8.7 Proof of a)

Suppose that p and q are elements of W . Then p and q are both linear combinations of the vi:

p = α1v1 + α2v2 + · · ·+ αkvk

q = β1v1 + β2v2 + · · ·+ βkvk

The sum is

p+ q = (α1 + β1)v1 + (α2 + β2)v2 + · · ·+ (αk + βk)vk

which is a linear combination of the vi and so is in W . So the sum of two vectors in W is also in W .

Now consider a scalar multiple of p:

γp = (γα1)v1 + (γα2)v2 + · · ·+ (γαk)vk

This is also a linear combination of the vi and so is in W . So a scalar multiple of a vector in W is in W . The two required conditions have been checked, and so W is a subspace.

11.8.8 Proof of b)

Suppose a subspace contains vectors v1, v2 , ... , vk.

One of the conditions for a subspace is that all scalar multiples of vectors in the subspace must also be in the subspace. Then for all real numbers α1, α2, . . . , αk we must have that α1v1, α2v2, . . . , αkvk are also in the subspace.

The other condition for a subspace is that the sum of any two vectors in the subspace must also be in the subspace. Then

α1v1 + α2v2

is in the subspace. Then

(α1v1 + α2v2) + α3v3

is in the subspace. And so on till

α1v1 + α2v2 + · · ·+ αkvk

is also in the subspace.

Thus v1, v2, . . . , vk in a subspace implies that all linear combinations of the vi are in the subspace.


11.8.9 Example

The set of all of the linear combinations of

u = (1, 1)ᵀ  and  v = (−1, 1)ᵀ

is a subspace. In fact, it is all of R2.

11.8.10 Span

Suppose a vector space V contains vectors v1, v2, ... , vk. The subspace W consisting of all the linear combinations of the vi is called the space spanned by v1, v2, ... , vk, and v1, v2, ... , vk are said to span W.

11.8.11 Example

The span of the polynomial functions p, q, and r defined by

p(x) = 1,  q(x) = 1 + x,  r(x) = 1 + x + x²

is a subspace of the space of polynomial functions.

11.9 Linear independence and dependence

11.9.1 Linear independence

Suppose that v1, v2, ... , vk are vectors in a real vector space V. The vi are said to be linearly dependent if we can find real numbers α1, α2, ... , αk, not all zero, so that

α1v1 + α2v2 + · · ·+ αkvk = 0

Otherwise the set is said to be linearly independent.

11.9.2 Alternatively

Another way of putting this is that the vi are linearly independent if

α1v1 + α2v2 + · · ·+ αkvk = 0

holds only when all the αi = 0. If the vectors are not linearly independent they are said to be linearly dependent.


11.9.3 Example

The standard basis vectors in Rn are linearly independent.

11.9.4 Example

The vectors p , q , r in the space of polynomial functions defined by

p(x) = 1 + x

q(x) = 1− x

r(x) = x

are linearly dependent because

p− q − 2r = 0
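Identifying each polynomial with its coefficient vector (constant term first, then the x coefficient) turns this into an ordinary vector computation; a NumPy sketch, not part of the notes:

```python
import numpy as np

p = np.array([1.0, 1.0])    # p(x) = 1 + x
q = np.array([1.0, -1.0])   # q(x) = 1 - x
r = np.array([0.0, 1.0])    # r(x) = x

# The dependence relation p - q - 2r = 0:
assert np.allclose(p - q - 2 * r, 0.0)

# Stacking the three coefficient vectors as rows, the rank is less
# than 3, which is another way to see the dependence.
assert np.linalg.matrix_rank(np.vstack([p, q, r])) < 3
```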

11.10 Back to the FTLA

11.10.1 Recall

The following are equivalent:

a) An×n is invertible.

b) The equation Ax = 0 has only the trivial solution x = 0.

c) The reduced row echelon form of A is I.

d) A is a product of elementary matrices.

e) Ax = b is consistent for all n× 1 matrices b.

f) |A| ≠ 0.

11.10.2 Add to the FTLA

g) The columns of A considered as vectors in Rn are linearly independent.

11.10.3 Proof

Suppose that A is invertible. Then the only solution to Ax = 0 is x = 0. Then the only linear combination of the columns of A that gives zero is one where the scalar coefficients of the sum are zero. Then the columns are linearly independent.


If the columns are linearly independent then the only linear combination that gives zero is the one with all scalar coefficients zero. Then Ax = 0 has only the trivial solution, and so by the FTLA A is invertible.
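For concrete matrices the equivalence between invertibility and independent columns can be observed numerically. A NumPy sketch with two illustrative 2 × 2 matrices (chosen for this example, not from the notes):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])   # |A| = -2, so A is invertible
B = np.array([[1.0, 2.0],
              [2.0, 4.0]])   # second column = 2 * first, so B is singular

# Invertible <=> determinant nonzero <=> columns independent (rank n).
assert np.linalg.det(A) != 0 and np.linalg.matrix_rank(A) == 2
assert np.isclose(np.linalg.det(B), 0.0) and np.linalg.matrix_rank(B) == 1
```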

11.11 Dimension

11.11.1 Example

We saw that the vectors p , q , r in the space of polynomial functions defined by

p(x) = 1 + x

q(x) = 1− x

r(x) = x

are linearly dependent because

p− q − 2r = 0

Then

r = (1/2) p − (1/2) q

Then the span of p, q, and r is the same as the span of just p and q.

11.11.2 General idea

Suppose that we are interested in a space W that we know is spanned by a set of vectors

B = {v1, v2, . . . , vm}

If the set of vectors is linearly independent, then none of the vectors can be written as a linear combination of the other vectors. If the set is linearly dependent, then at least one vector can be written as a linear combination of the others. Say, vm. We can drop vm from the set and just use

B′ = {v1, v2, . . . , vm−1}

We keep going till we get a linearly independent set.
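This pruning procedure can be sketched directly: keep a vector only if it raises the rank of what has been kept so far. A NumPy sketch (the function name and the rank-based test are illustrative choices, not from the notes):

```python
import numpy as np

def prune_to_basis(vectors):
    """Greedily keep only the vectors that enlarge the span."""
    kept = []
    for v in vectors:
        candidate = np.vstack(kept + [v])
        if np.linalg.matrix_rank(candidate) > len(kept):
            kept.append(v)
    return kept

# p, q, r from the example above, as coefficient vectors (constant, x):
p = np.array([1.0, 1.0])
q = np.array([1.0, -1.0])
r = np.array([0.0, 1.0])

basis = prune_to_basis([p, q, r])
assert len(basis) == 2   # r is dropped; it lies in span{p, q}
```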

11.11.3 Basis

Such a minimal set is called a basis.


11.11.4 Minimality of the basis

For a given vector space W with two bases, the number of vectors in both bases is the same.

11.11.5 Example

B = {p, q}

and

B′ = {p, r}

are both bases for the same subspace of the space of polynomial functions.

11.11.6 Dimension

The dimension of a vector space V is the number of vectors in any basis for V.

11.11.7 Example

Rn has dimension n.

11.11.8 Example

The space of polynomial functions defined on R is of infinite dimension.

11.12 Exercises

11.12.1 Exercise

Give an example of a two-dimensional subspace in the space of polynomial functions.

11.12.2 Exercise

How many different bases are there for R3?

11.12.3 Exercise

Can a set of n+ 1 vectors in Rn be linearly independent?


Chapter 12

Some particular vector spaces

12.1 Domain and range

12.1.1 Recall

Recall that if f is a map from a set U to a set V then U is called the domain of f and the set of values that f takes is called the range of f.

12.1.2 Example

Suppose f : R→ R is defined by

f(x) = x2

then the domain of f is R and the range of f is all the non-negative reals.

12.2 Example

12.2.1 Question

Consider the matrix A3×3

A =
[ 1  4  7 ]
[ 2  5  8 ]
[ 3  6  9 ]

A acts on vectors in R3 and sends them to R3. Let L : R3 → R3 be the linear operator defined by

L(u) = Au

What is the range of L?


12.2.2 What does the matrix do to the standard basis vectors?

L(e1) = Ae1 = (1, 2, 3)ᵀ

L(e2) = Ae2 = (4, 5, 6)ᵀ

L(e3) = Ae3 = (7, 8, 9)ᵀ
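These three products illustrate a general fact: multiplying A by ei picks out the ith column of A. A NumPy sketch of the computation above (not part of the notes):

```python
import numpy as np

A = np.array([[1.0, 4.0, 7.0],
              [2.0, 5.0, 8.0],
              [3.0, 6.0, 9.0]])

# A e_i is exactly the ith column of A.
for i in range(3):
    e = np.zeros(3)
    e[i] = 1.0
    assert np.allclose(A @ e, A[:, i])
```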

12.2.3 What does the matrix do to a general vector?

Where does A send a general vector? A typical vector in R3 looks like

v = αe1 + βe2 + γe3

so

L(v) = Av = α (1, 2, 3)ᵀ + β (4, 5, 6)ᵀ + γ (7, 8, 9)ᵀ

12.2.4 Answer

The range of L is the set of vectors that look like

α (1, 2, 3)ᵀ + β (4, 5, 6)ᵀ + γ (7, 8, 9)ᵀ

where α, β, and γ are any real numbers.

12.2.5 Note

The range of L is a vector space spanned by the column vectors of A. Is this true in general?

12.3 Column space

12.3.1 Matrix and column vectors

Suppose that we have a matrix Am×n. Let L : Rn → Rm be the linear transformation defined by

L(u) = Au


We write

A =
[ a11  a12  · · ·  a1n ]
[ a21  a22  · · ·  a2n ]
[  :    :           :  ]
[ am1  am2  · · ·  amn ]

We can think of the matrix as being made up of n column vectors

v1 = (a11, a21, . . . , am1)ᵀ,  v2 = (a12, a22, . . . , am2)ᵀ,  . . . ,  vn = (a1n, a2n, . . . , amn)ᵀ

12.3.2 Action of matrix on basis vectors

The standard basis vectors for Rn are

e1 = (1, 0, . . . , 0)ᵀ,  e2 = (0, 1, . . . , 0)ᵀ,  . . . ,  en = (0, 0, . . . , 1)ᵀ

Now we consider the products

L(e1) = Ae1 = v1

L(e2) = Ae2 = v2

...

L(en) = Aen = vn

12.3.3 Matrix acting on a typical vector

Any vector u ∈ Rn can be written as

u = α1e1 + α2e2 + · · · + αnen

and so

L(u) = Au = α1v1 + α2v2 + · · ·+ αnvn

12.3.4 Column space

The range of L is a vector space, namely the span of the column vectors of A. This space is called the column space of A.

12.3.5 Column space and solutions

For a matrix Am×n and column vectors x and b, the matrix-vector equation

Ax = b

is consistent if and only if the vector b is in the column space of A.
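In computations this condition is usually tested by comparing ranks: b lies in the column space exactly when appending b to A does not raise the rank. A NumPy sketch with an illustrative matrix (not one of the examples in the notes):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])      # column space = span{(1, 2)}

def consistent(A, b):
    # Ax = b is solvable iff rank([A | b]) == rank(A).
    return (np.linalg.matrix_rank(np.column_stack([A, b]))
            == np.linalg.matrix_rank(A))

b_in = np.array([3.0, 6.0])     # a multiple of (1, 2)
b_out = np.array([1.0, 0.0])    # not in the column space

assert consistent(A, b_in)
assert not consistent(A, b_out)
```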


12.3.6 Proof

Suppose that the columns of A are

{v1, v2, v3, · · · , vn}

If b is in the column space of A then b is a linear combination of the columns of A

b = α1v1 + α2v2 + · · · + αnvn

Let

x = (α1, α2, . . . , αn)ᵀ

Then, writing vj^(i) for the ith component of the jth column vector vj, the ith component of Ax is

(Ax)^(i) = α1v1^(i) + α2v2^(i) + α3v3^(i) + · · · + αnvn^(i)

Collecting these components for i = 1, . . . , m and regrouping gives

Ax = α1v1 + α2v2 + · · · + αnvn = b


So b in the column space of A implies that Ax = b has a solution.

Now, suppose that Ax = b is consistent. Then there is a vector

x0 = (α1, α2, . . . , αn)ᵀ

so that Ax0 = b. But

Ax0 = α1v1 + α2v2 + · · ·+ αnvn = b

so b can be written as a linear combination of the vectors that span the column space of A. So b is in the column space of A.

12.4 Null space

12.4.1 Null space

Given a matrix Am×n, the set of all vectors x ∈ Rn so that

Ax = 0

is called the null space of A.

12.4.2 The null space is a subspace

The null space is a subspace of Rn. If there are two vectors u and v such that

Au = 0

and

Av = 0

then for any linear combination

w = αu+ βv

of u and v we have that

Aw = A(αu+ βv) = αAu+ βAv = 0 + 0 = 0

so any linear combination of two vectors in the null space is also in the null space. Then the null space is a subspace.


12.4.3 Example

Consider the matrix

A =
[ 1  2 ]
[ 3  4 ]

The null space of A is only the zero vector and nothing else. Why?

12.4.4 Example

Consider the matrix

A =
[ 1  2 ]
[ 2  4 ]

What is the null space of A?

The null space of A consists of all vectors of the form

(−2s, s)ᵀ = s (−2, 1)ᵀ

for any real number s. Then a basis for the null space of A is {(−2, 1)ᵀ}.
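Numerically, a null-space basis can be read off from the singular value decomposition: the right singular vectors attached to (near-)zero singular values span the null space. A NumPy sketch for the matrix above (the tolerance 1e-10 is an illustrative choice, not from the notes):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

# Rows of Vt whose singular values are ~0 span the null space of A.
_, s, Vt = np.linalg.svd(A)
null_basis = Vt[s < 1e-10]

assert null_basis.shape == (1, 2)      # a single basis vector
v = null_basis[0]
assert np.allclose(A @ v, 0.0)         # it really is in the null space
assert np.isclose(v[1] / v[0], -0.5)   # proportional to (-2, 1)
```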

12.5 Solutions of homogeneous and inhomogeneous systems

12.5.1 Homogeneous linear system

Given matrix Am×n, the linear system

Ax = 0

is called homogeneous.

12.5.2 Inhomogeneous linear system

Given matrix Am×n and vector b ≠ 0, the linear system

Ax = b

is called inhomogeneous.


12.5.3 Theorem - solutions of the inhomogeneous system

If the equation Ax = b is consistent then the solutions can be written in the form

x = xp + α1v1 + α2v2 + · · ·+ αkvk

where

B = {v1 , v2 , · · · , vk}

is a basis for the null space of A and xp is a particular solution to the equation Ax = b.

12.5.4 In other words

The general solution to Ax = b is any particular solution plus the null space of A.

12.5.5 Proof in one direction

Suppose that

B = {v1 , v2 , · · · , vk}

is a basis for the null space of A, that xp is a solution of Ax = b, and that x is any solution. Then

Axp = b

and

Ax = b

and so subtracting gives

A(x− xp) = 0

Then as x − xp is in the null space of A we must be able to write it as some linear combination of the basis vectors of the null space:

x− xp = α1v1 + α2v2 + · · ·+ αkvk

This implies that every solution of the equation can be written in the form

x = xp + α1v1 + α2v2 + · · ·+ αkvk


12.5.6 Proof in the other direction

Suppose that xp is a solution of Ax = b and that

B = {v1 , v2 , · · · , vk}

is a basis for the null space of A. Let

x = xp + α1v1 + α2v2 + · · ·+ αkvk

Then

Ax = Axp + α1Av1 + α2Av2 + · · · + αkAvk = b + α1 · 0 + α2 · 0 + · · · + αk · 0 = b

so a vector of the form

x = xp + α1v1 + α2v2 + · · ·+ αkvk

is a solution of the equation.

12.5.7 Particular solution

We call xp a particular solution of Ax = b.

12.5.8 General solution

The vector

x = xp + α1v1 + α2v2 + · · ·+ αkvk

is called the general solution of Ax = b.

12.5.9 General solution of the homogeneous equation

The linear combination

α1v1 + α2v2 + · · ·+ αkvk

is called the general solution to Ax = 0.

12.5.10 Example

Consider the solution of

[ 1  1  1  1 ] [ x1 ]   [ 10 ]
[ 0  1  1  1 ] [ x2 ] = [  9 ]
[ 0  0  1  1 ] [ x3 ]   [  7 ]
               [ x4 ]


The rref of the augmented matrix is

[ 1  0  0  0 | 1 ]
[ 0  1  0  0 | 2 ]
[ 0  0  1  1 | 7 ]

which gives a solution

x = (x1, x2, x3, x4)ᵀ = (1, 2, 7 − x4, x4)ᵀ
  = (1, 2, 7, 0)ᵀ + (0, 0, −x4, x4)ᵀ
  = (1, 2, 7, 0)ᵀ + t (0, 0, −1, 1)ᵀ

where t could be any real number. In the notation that we have been using, the particular solution of the inhomogeneous problem is

xp =

1270

and the general solution of the homogeneous problem is

xh = t

00−11

and the general solution of the inhomogeneous problem is

x = xp + xh
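The decomposition x = xp + xh for this example can be verified numerically (a NumPy sketch, not part of the notes, using the vectors found above):

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0, 1.0],
              [0.0, 0.0, 1.0, 1.0]])
b = np.array([10.0, 9.0, 7.0])

xp = np.array([1.0, 2.0, 7.0, 0.0])    # particular solution
vh = np.array([0.0, 0.0, -1.0, 1.0])   # basis of the null space of A

# Every xp + t * vh solves Ax = b, whatever t is.
for t in (-3.0, 0.0, 2.5):
    assert np.allclose(A @ (xp + t * vh), b)
```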

12.6 Null space and elementary row operations

12.6.1 Question

We might use elementary row operations on a matrix to find the null space of the matrix. There would be a problem if the null space of the original matrix and the null space of the row reduced matrix were not the same.

It turns out that the null spaces of a matrix and of the matrix multiplied from the left by an elementary matrix are the same.

12.6.2 Theorem

If E is an elementary matrix and A is a general matrix so that their product EA is defined, then the null space of EA and the null space of A are the same.


12.6.3 Proof

Suppose that u is in the null space of A. That is, Au = 0. Then

(EA)u = E(Au) = E · 0 = 0

So, every vector in the null space of A is in the null space of EA. So, the null space of A is a subset of the null space of EA.

Now, suppose that there is a vector v in the null space of EA. That is, (EA)v = 0. Then

(EA)v = E(Av) = 0

The elementary matrices are all invertible so

Av = E⁻¹0 = 0

So, every vector in the null space of EA is a vector in the null space of A. Then the null space of EA is a subset of the null space of A.

Since the null space of A is a subset of the null space of EA and the null space of EA is a subset of the null space of A, the two spaces are the same.
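A small numerical illustration of the theorem (a NumPy sketch, not from the notes; E adds −2 times row 1 to row 2, an elementary row operation chosen for this example):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])
E = np.array([[1.0, 0.0],      # elementary matrix:
              [-2.0, 1.0]])    # row2 <- row2 - 2 * row1

v = np.array([-2.0, 1.0])      # spans the null space of A

assert np.allclose(A @ v, 0.0)         # v is in the null space of A
assert np.allclose((E @ A) @ v, 0.0)   # and in the null space of EA
# Equal ranks => the two null spaces also have the same dimension.
assert np.linalg.matrix_rank(E @ A) == np.linalg.matrix_rank(A)
```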

12.7 Column space and elementary operations

12.7.1 Linear independence of column spaces

Suppose that we have a matrix Am×n and B is obtained from A through elementary row operations. Then the column vectors of A are linearly independent if and only if the column vectors of B are linearly independent.

12.7.2 Proof

It is sufficient to show this for B = EA with E an elementary matrix.

Suppose that the column vectors of A are linearly independent. Consider

u = (α1, α2, . . . , αn)ᵀ

Then the vector Au is


Au = α1c1 + α2c2 + · · ·+ αncn

where the ci are the column vectors of A.

If the column vectors of A are linearly independent then Au is not zero for any nonzero u. No elementary matrix can make a nonzero vector zero, so

(EA)u = E(Au) ≠ 0

Suppose that A has linearly dependent column vectors. Then there is a nonzero u so that Au = 0. Then

(EA)u = E(Au) = 0

and so the column vectors of EA are linearly dependent.

12.7.3 Example

Consider the matrix

A =
[ 1  2  5 ]
[ 3  8  6 ]
[ 2  1  9 ]
[ 5  3  1 ]

Its rref is

[ 1  0  0 ]
[ 0  1  0 ]
[ 0  0  1 ]
[ 0  0  0 ]

As the column vectors of the rref are linearly independent, the column vectors of the original matrix A are linearly independent.
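A rank computation gives the same conclusion (a NumPy sketch, not part of the notes): the matrix has full column rank, so its columns are linearly independent.

```python
import numpy as np

A = np.array([[1.0, 2.0, 5.0],
              [3.0, 8.0, 6.0],
              [2.0, 1.0, 9.0],
              [5.0, 3.0, 1.0]])

# Rank 3 equals the number of columns, so the columns are independent.
assert np.linalg.matrix_rank(A) == 3
```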

12.7.4 Related theorem

Suppose again that B is obtained from A by elementary row operations. A given set of column vectors of A is linearly independent if and only if the corresponding column vectors of B are linearly independent.

12.7.5 Proof

Apply the previous theorem to A modified by dropping any columns not in the set.


12.7.6 Basis of the column space

Suppose that A is a matrix and B is given by performing elementary row operations on A.

A given set of column vectors of A forms a basis for the column space of A if and only if the corresponding column vectors of B form a basis for the column space of B.

12.7.7 Proof

Sufficient to show this for B = EA where E is an elementary matrix.

Suppose that a given set of column vectors is a basis for the column space of A. Then that set of column vectors is linearly independent, by the definition of a basis. Then the corresponding column vectors in B are linearly independent by the previous theorem. Now, we would like to show that the corresponding column vectors are a basis.

We need to show that every vector in the column space of EA can be written as a linear combination of the corresponding vectors.

Suppose, without loss of generality, that the first k vectors are a basis for the column space of A and the remaining n − k can be written as linear combinations of the first k.

Then any vector in the column space is the image of a vector u mapped by the associated linear transformation:

Au = α1c1 + α2c2 + · · · + αkck + αk+1ck+1 + · · · + αncn

   = α1c1 + α2c2 + · · · + αkck + αk+1(β1^(k+1)c1 + · · · + βk^(k+1)ck) + · · · + αn(β1^(n)c1 + · · · + βk^(n)ck)

for some scalars βi^(j). Then a typical vector in the column space of EA is

(EA)u = E(Au) = α1Ec1 + α2Ec2 + · · · + αkEck + αk+1(β1^(k+1)Ec1 + · · · + βk^(k+1)Eck) + · · · + αn(β1^(n)Ec1 + · · · + βk^(n)Eck)

so the images of the original basis vectors also span the column space of EA.

For the proof in the opposite direction, B = EA means E⁻¹B = A, and the inverse of an elementary matrix is an elementary matrix. Then the same proof can be used again.


12.7.8 RREF and column space

If a matrix A is in rref, then the columns containing the leading ones of the row vectors form a basis for the column space of A.

12.7.9 Proof

Exercise.

12.7.10 Example

Consider the matrix

A =
[ 1  1  1  1  1 ]
[ 2  2  3  4  4 ]
[ 3  3  1  3  4 ]

The rref is

[ 1  1  0  0   1/4 ]
[ 0  0  1  0  -1/2 ]
[ 0  0  0  1   5/4 ]

So a basis of the column space of the rref is

c1 = (1, 0, 0)ᵀ,  c3 = (0, 1, 0)ᵀ,  c4 = (0, 0, 1)ᵀ

and the corresponding columns 1, 3, and 4 of A form a basis for the column space of A. Since this basis spans all of R3, the range of A is all of R3.
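The same conclusion checked via ranks (a NumPy sketch, not part of the notes): columns 1, 3, and 4 of A are independent, and the rank of A is 3, so they span the whole column space.

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0, 1.0, 1.0],
              [2.0, 2.0, 3.0, 4.0, 4.0],
              [3.0, 3.0, 1.0, 3.0, 4.0]])

pivot_cols = A[:, [0, 2, 3]]   # columns 1, 3, 4 (0-indexed)
assert np.linalg.matrix_rank(pivot_cols) == 3   # independent
assert np.linalg.matrix_rank(A) == 3            # they span col(A) = R^3
```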

12.8 Row space

12.8.1 Row space

The row space of a matrix A is the space spanned by the row vectors of A.


12.8.2 Example

Consider the matrix

A =
[ 1  1  1  1 ]
[ 0  0  1  1 ]
[ 0  0  0  2 ]

The row vectors of A are

r1 = (1 1 1 1),  r2 = (0 0 1 1),  r3 = (0 0 0 2)

The row space consists of all the linear combinations

αr1 + βr2 + γr3

12.9 Elementary operations and the row space

12.9.1 Elementary operations do not change the row space

If A is a matrix and B is obtained from A by multiplication by elementary matrices then the row space of A and the row space of B are the same.

12.9.2 Proof

Sufficient to show for B = EA where E is a single elementary matrix.

Suppose that E interchanges two rows. Then the rows of B are the same as the rows of A (in a different order), so the row space does not change.

Suppose that E multiplies one row of A by a non-zero scalar. Then again, the row space does not change.

Now, suppose that E multiplies the ith row of A by a non-zero scalar α and adds it to the jth row of A, replacing the jth row.

Then if the row space of A is spanned by

{r1, r2, · · · , ri, · · · , rj, · · · , rm}

the row space of B is spanned by


{r1, r2, · · · , ri, · · · , rj + αri, · · · , rm}

which will span the same space.

12.9.3 Row space and rref

If a matrix A is in rref, then the row vectors with the leading ones form a basis for the row space of A.

12.9.4 Example

Consider the matrix

A =
[ 1  2  5 ]
[ 3  8  6 ]
[ 2  1  9 ]
[ 5  3  1 ]

Its rref is

[ 1  0  0 ]
[ 0  1  0 ]
[ 0  0  1 ]
[ 0  0  0 ]

and so the row space of A is spanned by the basis vectors

r1 = (1 0 0),  r2 = (0 1 0),  r3 = (0 0 1)

