Date post: 04-Mar-2020
AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences)
Lecture 3: Matrix Norms; Singular Value Decomposition
Xiangmin Jiao
SUNY Stony Brook
Transcript
Page 1: AMS526: NumericalAnalysisI (NumericalLinearAlgebrafor ...jiao/teaching/ams526_fall16/lectures/lecture03.pdf · AMS526: NumericalAnalysisI (NumericalLinearAlgebrafor ComputationalandDataSciences)


Outline

1 Matrix Norms (NLA §3)

2 Singular Value Decomposition (NLA §4-5)


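Frobenius Norm

One can define a norm by viewing an $m \times n$ matrix as a vector in $\mathbb{R}^{mn}$.

One useful norm is the Frobenius norm (a.k.a. Hilbert-Schmidt norm):

\[
\|A\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|^2} = \sqrt{\sum_{j=1}^{n} \|a_j\|_2^2},
\]

where $a_j$ denotes the $j$th column of $A$; i.e., it is the 2-norm of the $(mn)$-vector of entries.

Furthermore,

\[
\|A\|_F = \sqrt{\operatorname{tr}(A^T A)},
\]

where $\operatorname{tr}(B)$ denotes the trace of $B$, the sum of its diagonal entries.

Note that for $A \in \mathbb{R}^{n \times \ell}$ and $B \in \mathbb{R}^{\ell \times m}$,

\[
\|AB\|_F \le \|A\|_F \|B\|_F,
\]

because, writing $a_i^T$ for the $i$th row of $A$ and $b_j$ for the $j$th column of $B$, the Cauchy-Schwarz inequality gives

\[
\|AB\|_F^2 = \sum_{i=1}^{n} \sum_{j=1}^{m} |a_i^T b_j|^2 \le \sum_{i=1}^{n} \sum_{j=1}^{m} \left( \|a_i\|_2 \|b_j\|_2 \right)^2 = \|A\|_F^2 \|B\|_F^2.
\]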
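The three expressions for the Frobenius norm and the product bound can be checked numerically. The following is a minimal sketch in plain Python (standard library only); the helper names `matmul`, `transpose`, `frobenius_entries`, `frobenius_columns`, and `frobenius_trace`, as well as the test matrices, are illustrative assumptions and not part of the lecture.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_entries(A):
    # Square root of the sum of squares of all entries.
    return math.sqrt(sum(a * a for row in A for a in row))

def frobenius_columns(A):
    # Square root of the sum of squared 2-norms of the columns.
    return math.sqrt(sum(sum(a * a for a in col) for col in zip(*A)))

def frobenius_trace(A):
    # Square root of tr(A^T A), forming A^T A explicitly.
    AtA = matmul(transpose(A), A)
    return math.sqrt(sum(AtA[j][j] for j in range(len(AtA))))

# Hypothetical example matrices chosen only for illustration.
A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # 3 x 2
B = [[1.0, 0.0, -1.0], [2.0, 1.0, 0.0]]    # 2 x 3

norm_A = frobenius_entries(A)
norm_B = frobenius_entries(B)
norm_AB = frobenius_entries(matmul(A, B))
print(norm_A, frobenius_columns(A), frobenius_trace(A))
print(norm_AB <= norm_A * norm_B)
```

For these matrices the three formulas agree up to rounding, and the product bound holds with room to spare, which is consistent with the inequality generally not being an equality.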


General Definition of Matrix Norms

However, viewing an $m \times n$ matrix as a vector in $\mathbb{R}^{mn}$ is not always useful, because matrix operations do not behave this way.

Similar to vector norms, general matrix norms have the following properties (for $A, B \in \mathbb{R}^{m \times n}$):

(1) $\|A\| \ge 0$, and $\|A\| = 0$ only if $A = 0$,
(2) $\|A + B\| \le \|A\| + \|B\|$,
(3) $\|\alpha A\| = |\alpha| \|A\|$.

In addition, a matrix norm for $A, B \in \mathbb{R}^{n \times n}$ typically satisfies

\[
\|AB\| \le \|A\| \|B\| \quad \text{(submultiplicativity)},
\]

which can be viewed as a generalization of the Cauchy-Schwarz inequality.
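Submultiplicativity is an extra requirement and does not follow from properties (1) to (3). As a hedged illustration that is not taken from the lecture, the largest absolute entry of a matrix satisfies (1) to (3) but fails the product bound for the all-ones $2 \times 2$ matrix. The helper names below are assumptions made for this sketch.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def max_abs_entry(A):
    # Entrywise maximum norm: a valid norm on R^{m x n}, but not submultiplicative.
    return max(abs(a) for row in A for a in row)

A = [[1.0, 1.0], [1.0, 1.0]]
lhs = max_abs_entry(matmul(A, A))   # largest entry of A*A
rhs = max_abs_entry(A) ** 2         # product of the two norms
print(lhs, rhs, lhs <= rhs)
```

Here `lhs` exceeds `rhs`, so the entrywise maximum is a norm in the sense of (1) to (3) without being submultiplicative.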


Norms Induced by Vector Norms

Matrix norms can be induced from vector norms, which can better capture the behavior of matrix-vector multiplication.

Definition. Given vector norms $\|\cdot\|_{(n)}$ and $\|\cdot\|_{(m)}$ on the domain and range of $A \in \mathbb{R}^{m \times n}$, respectively, the induced matrix norm $\|A\|_{(m,n)}$ is the smallest number $C \in \mathbb{R}$ for which the following inequality holds for all $x \in \mathbb{R}^n$:

\[
\|Ax\|_{(m)} \le C \|x\|_{(n)}.
\]

In other words, it is the supremum of $\|Ax\|_{(m)} / \|x\|_{(n)}$ over all $x \in \mathbb{R}^n \setminus \{0\}$, i.e., the maximum factor by which $A$ can "stretch" $x \in \mathbb{R}^n$:

\[
\|A\|_{(m,n)} = \sup_{x \in \mathbb{R}^n,\, x \ne 0} \frac{\|Ax\|_{(m)}}{\|x\|_{(n)}} = \sup_{x \in \mathbb{R}^n,\, \|x\|_{(n)} = 1} \|Ax\|_{(m)}.
\]

Is a vector norm consistent with the matrix norm of an $m \times 1$ matrix?
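The supremum in the definition can be explored by brute force. The sketch below samples random vectors and records the largest stretch ratio it sees, which gives a lower bound on the induced norm rather than its exact value. It uses the same vector $p$-norm on domain and range; the function names, sample count, and example matrices are assumptions for illustration only.

```python
import random

def vec_norm(x, p):
    """p-norm of a vector given as a list; p may be 1, 2, or float('inf')."""
    if p == float("inf"):
        return max(abs(v) for v in x)
    return sum(abs(v) ** p for v in x) ** (1.0 / p)

def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def sampled_induced_norm(A, p, trials=2000, seed=0):
    """Largest observed ||Ax||_p / ||x||_p over random x (a lower bound only)."""
    rng = random.Random(seed)
    n = len(A[0])
    best = 0.0
    for _ in range(trials):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        nx = vec_norm(x, p)
        if nx == 0.0:
            continue  # the supremum is taken over nonzero x
        best = max(best, vec_norm(matvec(A, x), p) / nx)
    return best

A = [[1.0, -2.0], [3.0, 4.0]]
estimate = sampled_induced_norm(A, 1)

# An m x 1 matrix: every nonzero scalar x gives the same ratio, the vector norm of the column.
column = [[3.0], [4.0]]
column_estimate = sampled_induced_norm(column, 2)
print(estimate, column_estimate)
```

For the $m \times 1$ case every sample returns the 2-norm of the column itself (here 5 up to rounding), which suggests the answer to the question above is yes.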


1-norm

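By definition,

\[
\|A\|_1 = \sup_{x \in \mathbb{R}^n,\, \|x\|_1 = 1} \|Ax\|_1.
\]

What is it equal to?

- Maximum of the 1-norms of the column vectors of $A$
- Or maximum column sum of absolute values of $A$, the "column-sum norm"

To show it, note that for $x \in \mathbb{R}^n$ with $\|x\|_1 = 1$,

\[
\|Ax\|_1 = \Bigl\| \sum_{j=1}^{n} x_j a_j \Bigr\|_1 \le \sum_{j=1}^{n} |x_j| \, \|a_j\|_1 \le \max_{1 \le j \le n} \|a_j\|_1 \, \|x\|_1.
\]

Let $k = \arg\max_{1 \le j \le n} \|a_j\|_1$; then $\|A e_k\|_1 = \|a_k\|_1$, so $\max_{1 \le j \le n} \|a_j\|_1$ is a tight upper bound.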
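The column-sum formula and the tightness argument can be sketched as follows: compute the largest column sum, then apply $A$ to the unit vector $e_k$ that selects that column. The function names and the example matrix are assumptions chosen for illustration.

```python
def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def vec_one_norm(x):
    return sum(abs(v) for v in x)

def one_norm(A):
    """Column-sum norm: the largest sum of absolute values over the columns of A."""
    return max(sum(abs(a) for a in col) for col in zip(*A))

A = [[1.0, -2.0, 0.5],
     [3.0, 4.0, -1.0]]

col_sums = [sum(abs(a) for a in col) for col in zip(*A)]
k = col_sums.index(max(col_sums))                     # index of the heaviest column
e_k = [1.0 if j == k else 0.0 for j in range(len(A[0]))]
attained = vec_one_norm(matvec(A, e_k))               # ||A e_k||_1

print(one_norm(A), k, attained)
```

Since `e_k` has 1-norm equal to one and `attained` equals `one_norm(A)`, the upper bound from the derivation is reached.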


∞-norm

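By definition,

\[
\|A\|_\infty = \sup_{x \in \mathbb{R}^n,\, \|x\|_\infty = 1} \|Ax\|_\infty.
\]

What is $\|A\|_\infty$ equal to?

- Maximum of the 1-norms of the column vectors of $A^T$
- Or maximum row sum of absolute values of $A$, the "row-sum norm"

To show it, note that for $x \in \mathbb{R}^n$ with $\|x\|_\infty = 1$,

\[
\|Ax\|_\infty = \max_{1 \le i \le m} |a_{i,:} x| \le \max_{1 \le i \le m} \|a_{i,:}^T\|_1 \, \|x\|_\infty,
\]

where $a_{i,:}$ denotes the $i$th row vector of $A$ and $\|a_{i,:}^T\|_1 = \sum_{j=1}^{n} |a_{ij}|$.

Furthermore, $\max_{1 \le i \le m} \|a_{i,:}^T\|_1$ is a tight bound. Which vector can we choose for $x$ for equality to hold?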
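One possible answer to the question above, offered as a sketch rather than as the lecture's own solution: take the row $i$ with the largest absolute sum and let $x$ hold the signs of its entries, so that every term in $a_{i,:} x$ is nonnegative. The function names and the example matrix are illustrative assumptions.

```python
def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def inf_norm(A):
    """Row-sum norm: the largest sum of absolute values over the rows of A."""
    return max(sum(abs(a) for a in row) for row in A)

A = [[1.0, -2.0, 0.5],
     [3.0, 4.0, -1.0]]

row_sums = [sum(abs(a) for a in row) for row in A]
i = row_sums.index(max(row_sums))                 # index of the heaviest row
x = [1.0 if a >= 0 else -1.0 for a in A[i]]       # signs of that row, so ||x||_inf = 1
attained = max(abs(v) for v in matvec(A, x))      # ||A x||_inf

print(inf_norm(A), x, attained)
```

With this choice `attained` matches `inf_norm(A)`, so the row-sum bound is reached by a vector of infinity-norm one.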


2-norm

What is the 2-norm of a matrix?
Answer: Its largest singular value, which we will explain in later lectures.

What is the 2-norm of a diagonal matrix $D$?
Answer: $\|D\|_2 = \max_{1 \le i \le n} \{|d_{ii}|\}$.

What is the 2-norm of a rank-one matrix $uv^T$? Hint: Use the Cauchy-Schwarz inequality.
Answer: $\|uv^T\|_2 = \|u\|_2 \|v\|_2$.
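The two answers above can be checked numerically. The sketch below estimates $\|A\|_2$ by power iteration on $A^T A$, which is a technique swapped in here for illustration and not a method presented in the lecture. It assumes the all-ones starting vector is not orthogonal to the dominant right singular vector; the function names and test matrices are likewise assumptions.

```python
import math

def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def two_norm_estimate(A, iters=200):
    """Estimate ||A||_2 by power iteration on A^T A, starting from the all-ones vector."""
    At = transpose(A)
    x = [1.0] * len(A[0])
    for _ in range(iters):
        y = matvec(At, matvec(A, x))
        ny = math.sqrt(sum(v * v for v in y))
        if ny == 0.0:
            return 0.0  # the start vector is mapped to zero; no information gained
        x = [v / ny for v in y]
    # x now has unit 2-norm, so ||A x||_2 approximates the largest stretch factor.
    return math.sqrt(sum(v * v for v in matvec(A, x)))

# Diagonal matrix: expect max |d_ii| = 5.
D = [[3.0, 0.0, 0.0], [0.0, -5.0, 0.0], [0.0, 0.0, 1.0]]
norm_D = two_norm_estimate(D)

# Rank-one matrix u v^T with ||u||_2 = 3 and ||v||_2 = 5: expect 15.
u, v = [1.0, 2.0, 2.0], [3.0, 4.0]
uvT = [[ui * vj for vj in v] for ui in u]
norm_uvT = two_norm_estimate(uvT)

print(norm_D, norm_uvT)
```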


Bounding Matrix-Matrix Multiplication

Let $A$ be an $l \times m$ matrix and $B$ an $m \times n$ matrix; then

\[
\|AB\|_{(l,n)} \le \|A\|_{(l,m)} \|B\|_{(m,n)}.
\]

To show it, note that for $x \in \mathbb{R}^n$,

\[
\|ABx\|_{(l)} \le \|A\|_{(l,m)} \|Bx\|_{(m)} \le \|A\|_{(l,m)} \|B\|_{(m,n)} \|x\|_{(n)}.
\]

In general, this inequality is not an equality. In particular, $\|A^p\| \le \|A\|^p$, but $\|A^p\| \ne \|A\|^p$ in general for $p \ge 2$.
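A small numerical illustration of both statements, using the induced 1-norm (maximum absolute column sum) as the concrete norm. The nilpotent matrix `N` shows how far $\|A^p\|$ can fall below $\|A\|^p$; all names and matrices here are assumptions picked for the sketch.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def one_norm(A):
    # Induced 1-norm: maximum absolute column sum.
    return max(sum(abs(a) for a in col) for col in zip(*A))

# Rectangular product: ||PQ||_1 <= ||P||_1 ||Q||_1.
P = [[1.0, -2.0, 0.5], [3.0, 4.0, -1.0]]      # 2 x 3
Q = [[1.0, 0.0], [0.0, 1.0], [2.0, -2.0]]     # 3 x 2
norm_PQ = one_norm(matmul(P, Q))
bound_PQ = one_norm(P) * one_norm(Q)

# Nilpotent matrix: N is nonzero but N^2 = 0, so ||N^2|| < ||N||^2.
N = [[0.0, 1.0], [0.0, 0.0]]
norm_N2 = one_norm(matmul(N, N))
norm_N_squared = one_norm(N) ** 2

print(norm_PQ, bound_PQ, norm_N2, norm_N_squared)
```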


Invariance under Orthogonal Transformation

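Consider a matrix $Q \in \mathbb{R}^{\ell \times m}$ with $\ell \ge m$. If $Q^T Q = I$, then $Qx$ for $x \in \mathbb{R}^m$ corresponds to an orthogonal transformation to a coordinate system in $\mathbb{R}^\ell$.

If $Q \in \mathbb{R}^{m \times m}$, then $Q$ is said to be an orthogonal matrix.

Theorem. For any $A \in \mathbb{R}^{m \times n}$ and $Q \in \mathbb{R}^{\ell \times m}$ with $Q^T Q = I$ and $\ell \ge m$, we have

\[
\|QA\|_2 = \|A\|_2 \quad \text{and} \quad \|QA\|_F = \|A\|_F.
\]

In other words, the 2-norm and the Frobenius norm are invariant under orthogonal transformation.

Proof for 2-norm: $\|Qy\|_2^2 = y^T Q^T Q y = y^T y = \|y\|_2^2$ for $y \in \mathbb{R}^m$, and therefore $\|QAx\|_2 = \|Ax\|_2$ for $x \in \mathbb{R}^n$. The result then follows from the definition of the 2-norm.

Proof for Frobenius norm:

\[
\|QA\|_F^2 = \operatorname{tr}\bigl((QA)^T QA\bigr) = \operatorname{tr}\bigl(A^T Q^T Q A\bigr) = \operatorname{tr}\bigl(A^T A\bigr) = \|A\|_F^2.
\]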
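The theorem can be checked on a small rectangular example. The sketch below uses a hypothetical $3 \times 2$ matrix `Q` with orthonormal columns and verifies $Q^T Q = I$, $\|Qy\|_2 = \|y\|_2$, and $\|QA\|_F = \|A\|_F$ up to rounding. The helper names and the specific matrices are assumptions for illustration.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def two_norm(x):
    return math.sqrt(sum(v * v for v in x))

def frobenius(A):
    return math.sqrt(sum(a * a for row in A for a in row))

Q = [[0.6, 0.0],
     [0.8, 0.0],
     [0.0, 1.0]]               # 3 x 2, columns are orthonormal
A = [[1.0, 2.0],
     [3.0, 4.0]]               # 2 x 2

QtQ = matmul(transpose(Q), Q)  # should be the 2 x 2 identity
fro_A, fro_QA = frobenius(A), frobenius(matmul(Q, A))

y = [3.0, 4.0]
norm_y, norm_Qy = two_norm(y), two_norm(matvec(Q, y))

print(QtQ, fro_A, fro_QA, norm_y, norm_Qy)
```

Note that `Q` is not square here, so $Q Q^T \ne I$; the theorem only needs $Q^T Q = I$.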


Singular Value Decomposition (SVD)

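Given $A \in \mathbb{R}^{m \times n}$, its SVD is

\[
A = U \Sigma V^T,
\]

where $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ are orthogonal, and $\Sigma \in \mathbb{R}^{m \times n}$ is diagonal.

If $A \in \mathbb{C}^{m \times n}$, then its SVD is $A = U \Sigma V^H$, where $U \in \mathbb{C}^{m \times m}$ and $V \in \mathbb{C}^{n \times n}$ are unitary, and $\Sigma \in \mathbb{R}^{m \times n}$ is diagonal.

Singular values are the diagonal entries of $\Sigma$, with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n \ge 0$ (for $m \ge n$; in general there are $\min\{m,n\}$ singular values).

Left singular vectors of $A$ are the column vectors of $U$.

Right singular vectors of $A$ are the column vectors of $V$ and are the preimages of the principal semiaxes of $AS$, the image of the unit sphere $S \subset \mathbb{R}^n$ under $A$.

SVD plays a prominent role in data analysis and matrix analysis.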
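As a small worked example (an illustration added here, not taken from the lecture), consider a $2 \times 2$ diagonal matrix with a negative entry. Its singular values are the absolute values of the diagonal entries, and the sign is absorbed into one of the orthogonal factors. The block assumes the `amsmath` package for `pmatrix`.

```latex
\[
A = \begin{pmatrix} 3 & 0 \\ 0 & -2 \end{pmatrix}
  = \underbrace{\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}}_{U}
    \underbrace{\begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix}}_{\Sigma}
    \underbrace{\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}^{T}}_{V^T},
\qquad \sigma_1 = 3 \ge \sigma_2 = 2 \ge 0 .
\]
```

The factorization is not unique: the sign could equally be placed in $U$ by taking $U = \operatorname{diag}(1, -1)$ and $V = I$.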


Geometric Observation

The image of the unit sphere under any $m \times n$ matrix is a hyperellipse.

Given the unit sphere $S$ in $\mathbb{R}^n$, $AS$ denotes its shape after transformation by $A$.

Singular values are the lengths of the principal semiaxes of the hyperellipse.

Left singular vectors are oriented in the directions of the principal semiaxes of $AS$.

Right singular vectors are the preimages of the principal semiaxes of $AS$: $A v_j = \sigma_j u_j$ for $1 \le j \le n$.
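The geometric picture can be sketched in two dimensions. The code below builds $A = U \Sigma V^T$ from two hypothetical rotation matrices and chosen singular values, checks $A v_j = \sigma_j u_j$, and samples the unit circle to see that the longest and shortest images have lengths close to $\sigma_1$ and $\sigma_2$. The angles, singular values, and helper names are assumptions for illustration.

```python
import math

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def column(A, j):
    return [row[j] for row in A]

U = rotation(0.7)          # orthogonal, columns u_1, u_2
V = rotation(-0.3)         # orthogonal, columns v_1, v_2
sigma = [3.0, 1.0]         # chosen singular values, sigma_1 >= sigma_2 >= 0
Sigma = [[sigma[0], 0.0], [0.0, sigma[1]]]
A = matmul(matmul(U, Sigma), transpose(V))

# Check A v_j = sigma_j u_j for each j.
residuals = []
for j in range(2):
    Av = matvec(A, column(V, j))
    target = [sigma[j] * u for u in column(U, j)]
    residuals.append(max(abs(p - q) for p, q in zip(Av, target)))

# Sample the unit circle S and measure the lengths of the images in AS.
lengths = []
for k in range(3600):
    t = 2.0 * math.pi * k / 3600
    y = matvec(A, [math.cos(t), math.sin(t)])
    lengths.append(math.hypot(y[0], y[1]))
longest, shortest = max(lengths), min(lengths)

print(residuals, longest, shortest)
```

Because the circle is only sampled, `longest` and `shortest` approach the semiaxis lengths from inside the interval $[\sigma_2, \sigma_1]$ without necessarily hitting them exactly.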


Two Different Types of SVD

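Full SVD: with $U \in \mathbb{R}^{m \times m}$, $\Sigma \in \mathbb{R}^{m \times n}$, and $V \in \mathbb{R}^{n \times n}$,

\[
A = U \Sigma V^T.
\]

Thin SVD (reduced SVD): with $U \in \mathbb{R}^{m \times n}$, $\Sigma \in \mathbb{R}^{n \times n}$, and $V \in \mathbb{R}^{n \times n}$ (assume $m \ge n$),

\[
A = U \Sigma V^T.
\]

Furthermore, notice that

\[
A = \sum_{i=1}^{\min\{m,n\}} \sigma_i u_i v_i^T,
\]

so we can keep only the columns of $U$ and $V$ corresponding to nonzero $\sigma_i$.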
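The rank-one expansion can be sketched with hand-picked thin SVD factors for a $3 \times 2$ matrix whose second singular value is zero. The code forms $A = U \Sigma V^T$ and then rebuilds it from only the term with nonzero $\sigma_i$. The singular vectors, singular values, and helper names are assumptions chosen so the arithmetic is easy to follow.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def outer(u, v):
    # Rank-one matrix u v^T.
    return [[ui * vj for vj in v] for ui in u]

# Hypothetical thin SVD factors (m = 3, n = 2).
left = [[0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]    # u_1, u_2 (orthonormal)
right = [[0.8, 0.6], [-0.6, 0.8]]            # v_1, v_2 (orthonormal)
sigma = [5.0, 0.0]                           # sigma_1 >= sigma_2 = 0

U = transpose(left)                          # 3 x 2, columns u_i
V = transpose(right)                         # 2 x 2, columns v_i
Sigma = [[sigma[0], 0.0], [0.0, sigma[1]]]
A = matmul(matmul(U, Sigma), transpose(V))   # thin SVD product

# Sum of rank-one terms, skipping zero singular values.
A_sum = [[0.0, 0.0] for _ in range(3)]
kept = 0
for s, u_i, v_i in zip(sigma, left, right):
    if s != 0.0:
        term = outer(u_i, v_i)
        A_sum = [[a + s * t for a, t in zip(ra, rt)] for ra, rt in zip(A_sum, term)]
        kept += 1

max_diff = max(abs(a - b) for ra, rb in zip(A, A_sum) for a, b in zip(ra, rb))
print(kept, max_diff)
```

Only one term is kept, so this matrix is fully described by $\sigma_1$, $u_1$, and $v_1$.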
