Dominant feature extraction — Francqui Lecture, 7-5-2010 — Paul Van Dooren, Université catholique de Louvain, CESAME, Louvain-la-Neuve, Belgium
Transcript

1 / 82

Dominant feature extraction

Francqui Lecture 7-5-2010

Paul Van Dooren, Université catholique de Louvain

CESAME, Louvain-la-Neuve, Belgium


2 / 82

Goal of this lecture

Develop basic ideas for large scale dense matrices

Recursive procedures for

- Dominant singular subspace
- Multipass iteration
- Subset selection
- Dominant eigenspace of positive definite matrix
- Possible extensions

which are all based on solving cheap subproblems

Show accuracy and complexity results


3 / 82

Dominant singular subspaces

Given A of size m × n, approximate it by a rank k factorization B_{m×k} C_{k×n} by solving

min ‖A − BC‖₂,  k ≪ m, n

This has several applications in image compression, information retrieval and model reduction (POD)
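For reference, the optimal rank-k factorization comes from the truncated SVD (Eckart–Young); a minimal numpy sketch, not part of the lecture (the function name and test sizes are illustrative):

```python
import numpy as np

def best_rank_k(A, k):
    """Best rank-k factorization B C of A (optimal in the 2-norm by
    Eckart-Young): B = first k left singular vectors scaled by the
    singular values, C = first k right singular vectors."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] * s[:k], Vt[:k, :]

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 30))
B, C = best_rank_k(A, 5)
# The 2-norm error of the best rank-k approximation equals sigma_{k+1}.
assert np.isclose(np.linalg.norm(A - B @ C, 2),
                  np.linalg.svd(A, compute_uv=False)[5])
```

Computing this full SVD costs O(mn·min(m, n)) flops; the recursive procedures in this lecture approximate it in O(mnk).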


4 / 82

Information retrieval

- Low memory requirement: O(k(m + n))

- Fast queries: Ax ≈ L(Ux), in O(k(m + n)) time

- Easy to obtain: O(kmn) flops


5 / 82

Proper Orthogonal decomposition (POD)

Compute a state trajectory for one “typical” input

Collect the principal directions to project on


6 / 82

Recursivity

We pass once over the data with a window of length k and perform along the way a set of windowed SVDs of dimension m × (k + ℓ)

Step 1: expand by appending ℓ columns (Gram–Schmidt)
Step 2: contract by deleting the ℓ least important columns (SVD)
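The expand/contract recursion can be sketched in a few lines of numpy; this is a minimal one-pass version for illustration, not the authors' code (`incremental_svd`, `ell`, and the initialization from the first k columns are my choices):

```python
import numpy as np

def incremental_svd(A, k, ell=1):
    """One pass over the columns of A with a sliding window of size k + ell:
    expand the basis with ell new columns (Gram-Schmidt), then contract by
    discarding the ell smallest singular values of the small core matrix R."""
    m, n = A.shape
    U, R = np.linalg.qr(A[:, :k])            # start from the first k columns
    for j in range(k, n, ell):
        block = A[:, j:j + ell]
        r = U.T @ block                       # coefficients in the current basis
        resid = block - U @ r                 # components orthogonal to U
        Q, rho = np.linalg.qr(resid)
        U = np.hstack([U, Q])                 # expand
        R = np.block([[R, r], [np.zeros((rho.shape[0], R.shape[1])), rho]])
        Gu, s, _ = np.linalg.svd(R)           # SVD of the small core only
        U = U @ Gu[:, :k]                     # contract: keep k dominant directions
        R = np.diag(s[:k])
    return U, np.diag(R)                      # basis and approximate singular values

rng = np.random.default_rng(1)
A = rng.standard_normal((40, 8)) @ rng.standard_normal((8, 60))   # exact rank 8
U, s = incremental_svd(A, k=8)
s_true = np.linalg.svd(A, compute_uv=False)[:8]
assert np.allclose(np.sort(s)[::-1], s_true, rtol=1e-6)           # exact when rank <= k
```

When the matrix has rank at most k, the deleted singular values are all zero, so one pass is exact; in general the deleted values drive the error bounds given later.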


7 / 82

Expansion (G-S)

Append column a₊ to the current approximation URVᵀ to get

[ URVᵀ  a₊ ] = [ U  a₊ ] [ R  0 ; 0  1 ] [ Vᵀ  0 ; 0  1 ]

Update with Gram–Schmidt to recover a new decomposition URVᵀ:

using r = Uᵀa₊,  ã = a₊ − Ur,  ã = uρ  (since a₊ = Ur + uρ)
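The expansion identity can be checked numerically; a small sketch (the dimensions and variable names are illustrative, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(2)
m, k = 12, 3
# Current factorization U R V^T with orthonormal U, V.
U, _ = np.linalg.qr(rng.standard_normal((m, k)))
V, _ = np.linalg.qr(rng.standard_normal((7, k)))
R = np.triu(rng.standard_normal((k, k)))
a = rng.standard_normal(m)

# Gram-Schmidt expansion: a = U r + rho u, with u a new unit direction.
r = U.T @ a
u = a - U @ r
rho = np.linalg.norm(u)
u /= rho

U_new = np.column_stack([U, u])
R_new = np.block([[R, r[:, None]], [np.zeros((1, k)), rho]])
V_new = np.block([[V, np.zeros((7, 1))], [np.zeros((1, k)), 1.0]])

# The expanded factorization reproduces [U R V^T, a] exactly.
lhs = np.column_stack([U @ R @ V.T, a])
assert np.allclose(U_new @ R_new @ V_new.T, lhs)
```

The cost is dominated by the two products Uᵀa₊ and Ur, i.e. O(mk) per appended column, which is where the 4mk flop count below comes from.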


8 / 82

Contraction (SVD)

Now remove the ℓ smallest singular values of this new URVᵀ via

URVᵀ = (UG_u)(G_uᵀ R G_v)(G_vᵀ Vᵀ)

and keep U₊R₊V₊ᵀ as the best approximation of URVᵀ

(just delete the ℓ smallest singular values)


9 / 82

Complexity of one pair of steps

The Gram–Schmidt update (expansion) requires 4mk flops per column (essentially for the products r = Uᵀa₊, ã = a₊ − Ur)

For G_uᵀ R G_v = [ R₊  0 ; 0  diag(μᵢ) ] one requires the left and right singular vectors of R, which can be obtained in O(k²) flops per singular value (using inverse iteration)

Multiplying UG_u and VG_v requires 4mk flops per deflated column

The overall procedure requires 8mk flops per processed column and hence 8mnk flops for a rank k approximation to an m × n matrix A

One shows that A = U [ R  A₁₂ ; 0  A₂₂ ] Vᵀ where ‖[ A₁₂ ; A₂₂ ]‖²_F is known


10 / 82

Error estimates

Let E := A − Â = UΣVᵀ − Û Σ̂ V̂ᵀ and μ̂ := ‖E‖₂

Let μ := max μᵢ, where μᵢ is the neglected singular value at step i

One shows that the error norm satisfies

μ ≤ σ_{k+1} ≤ μ̂ ≤ √(n − k) μ ≈ cμ

σ̂ᵢ ≤ σᵢ ≲ σ̂ᵢ + μ²/(2σ̂ᵢ)

tan θ_k ≲ μ²/(σ̂_k² − μ²),  tan φ_k ≲ μσ₁/(σ̂_k² − μ²)

where θ_k, φ_k are the canonical angles of dimension k:

cos θ_k := ‖U(:, 1:k)ᵀ Û‖₂,  cos φ_k := ‖V(:, 1:k)ᵀ V̂‖₂


11 / 82

Examples

The bounds get much better when the gap σk − σk+1 is large


12 / 82

Convergence

How quickly do we track the subspaces?

How cos θ_k^{(i)} evolves with the time step i


13 / 82

Example

Find the dominant behavior in an image sequence

Images can have up to 10⁶ pixels


14 / 82

Multipass iteration

Low-rank incremental SVD can be applied in several passes, say to

(1/√k) [ A  A  …  A ]

After the first block (or “pass”) a good approximation of the dominant space U has already been constructed

Going over to the next block (second “pass”) will improve it, etc.

Theorem. Convergence of the multipass method is linear, with approximate ratio of convergence ψ/(1 − κ²) < 1, where

- ψ measures orthogonality of the residual columns of A
- κ is the singular value ratio σ_{k+1}/σ_k of A


15 / 82

Convergence behavior

[Plot: convergence for increasing gap between “signal” and “noise”; x-axis: number of INCSVD steps]


16 / 82

Convergence behavior

[Plot: convergence for increasing orthogonality between “residual vectors”; x-axis: number of INCSVD steps]


17 / 82

Eigenfaces analysis

Ten dominant left singular vectors of the ORL Database of Faces
(40 subjects, 10 images each, 92×112 pixels = 10304×400 matrix)

Using MATLAB's SVD function

Using one pass of incremental SVD

Maximal angle: 16.3°, maximum relative error in sing. values: 4.8%


18 / 82

Conclusions Incremental SVD

A useful and economical SVD approximation of an m × n matrix A

For matrices with columns that are very large or “arrive” with time

Complexity is proportional to mnk and the number of “passes”

Algorithms due to
[1] Manjunath–Chandrasekaran–Wang (95)
[2] Levy–Lindenbaum (00)
[3] Chahlaoui–Gallivan–Van Dooren (01)
[4] Brand (03)
[5] Baker–Gallivan–Van Dooren (09)

Convergence analysis and accuracy in refs [3],[4],[5]


19 / 82

Subset selection

We want a “good approximation” of A_{m×n} by a product B_{m×k} Pᵀ where P_{n×k} is a “selection matrix”, i.e. a submatrix of the identity Iₙ

This seems connected to

min ‖A − BPᵀ‖₂

and maybe similar techniques can be used as for incremental SVD

Clearly, if B = AP, we just select a subset of the columns of A

Rather than minimizing ‖A − BPᵀ‖₂ we maximize vol(B), where

vol(B) = det(BᵀB)^{1/2} = ∏_{i=1}^{k} σᵢ(B),  m ≥ k

There are (n choose k) possible choices and the problem is NP-hard,

and there is no polynomial time approximation algorithm
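For tiny instances the NP-hard problem can still be solved by exhaustive search, which also checks that the two definitions of the volume agree; a small numpy sketch (sizes and seed are arbitrary):

```python
import numpy as np
from itertools import combinations

def vol(B):
    """vol(B) = det(B^T B)^(1/2) = product of the singular values of B."""
    return float(np.prod(np.linalg.svd(B, compute_uv=False)))

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 5))
k = 2
# Exhaustive search over all C(n, k) column subsets -- feasible only for
# tiny n, which is exactly why heuristics are needed in general.
subsets = list(combinations(range(A.shape[1]), k))
best = max(subsets, key=lambda c: vol(A[:, list(c)]))
B = A[:, list(best)]
# The determinant and singular-value definitions of vol agree.
assert np.isclose(vol(B), np.sqrt(np.linalg.det(B.T @ B)))
```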


20 / 82

Heuristics

Gu and Eisenstat show that the Strong Rank Revealing QR factorization (SRRQR) solves the following simpler problem:

B is locally optimal if there is no swap of a single column of A (yielding a new B) that gives a larger volume (a constrained maximum)

Here, we propose a simpler “recursive updating” algorithm that has complexity O(mnk) rather than O(mn²) for Gu–Eisenstat

The idea is again based on a sliding window of size k + 1 (or k + ℓ)

Sweep through the columns of A while maintaining a “best” subset B

- Append a column of A to B, yielding B₊
- Contract B₊ to B by deleting the “weakest” column of B₊


21 / 82

Deleting the weakest column

Let B = A(:, 1:k) to start with and let B = QR where R is k × k

Append the next column a₊ of A to form B₊ and update its decomposition using Gram–Schmidt:

B₊ := [ QR  a₊ ] = [ Q  a₊ ] [ R  0 ; 0  1 ] = [ Q  q ] [ R  r ; 0  ρ ] = Q₊R₊

with r = Qᵀa₊,  ã = a₊ − Qr,  ã = qρ  (since a₊ = Qr + qρ)

Contract B₊ to B by deleting the “weakest” column of R₊

This can be done in O(mk²) using Gu–Eisenstat's SRRQR method, but an even simpler heuristic uses only O((m + k)k) flops


22 / 82

Golub-Klema-Stewart heuristic

Let R₊v = σ_{k+1}u be the singular vector pair corresponding to the smallest singular value σ_{k+1} of R₊, and let vᵢ be the components of v

Let Rᵢ be the submatrix obtained by deleting column i from R₊; then

σ²_{k+1}/σ₁² + (1 − σ²_{k+1}/σ₁²)|vᵢ|²  ≤  vol²(Rᵢ) / ∏_{j=1}^{k} σⱼ²  ≤  σ²_{k+1}/σ_k² + (1 − σ²_{k+1}/σ_k²)|vᵢ|²

Maximizing |vᵢ| thus maximizes a lower bound on vol²(Rᵢ)

In practice this is almost always optimal, and guaranteed to be so if

σ²_{k+1}/σ_k² + (1 − σ²_{k+1}/σ_k²)|vⱼ|²  ≤  σ²_{k+1}/σ₁² + (1 − σ²_{k+1}/σ₁²)|vᵢ|²  ∀ j ≠ i


23 / 82

GKS method

Start with B = A(:, 1:k) = QR where R is k × k

For j = k + 1 : n
- append column a₊ := A(:, j) to get B₊
- update its QR decomposition to B₊ = Q₊R₊
- contract B₊ to yield a new B using the GKS heuristic
- update its QR decomposition to B = QR

One can verify the optimality by performing a second pass

Notice that GKS is optimal when σ_{k+1} = 0 since then

vol(Rᵢ) = |vᵢ| ∏_{j=1}^{k} σⱼ
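A minimal numpy sketch of this sweep; for clarity it recomputes a small SVD per step instead of maintaining the QR factorization as the lecture proposes, so it is a naive O(n · k³)-per-step illustration, not the O(mnk) algorithm:

```python
import numpy as np

def gks_select(A, k):
    """Sweep through the columns of A keeping a subset of k columns; at each
    step the window B+ holds k+1 candidates, and the column with the largest
    component |v_i| in the right singular vector of sigma_{k+1} is dropped
    (the GKS heuristic)."""
    idx = list(range(k))
    for j in range(k, A.shape[1]):
        cand = idx + [j]
        _, _, Vt = np.linalg.svd(A[:, cand])
        v = Vt[-1]                       # right sing. vector of sigma_{k+1}
        del cand[int(np.argmax(np.abs(v)))]
        idx = cand
    return idx

# Rank-k matrix: sigma_{k+1} = 0, where the heuristic is provably optimal.
rng = np.random.default_rng(5)
k = 3
A = rng.standard_normal((8, k)) @ rng.standard_normal((k, 10))
idx = gks_select(A, k)
s_sel = np.linalg.svd(A[:, idx], compute_uv=False)
assert len(idx) == k and s_sel[k - 1] > 1e-8   # selected columns are independent
```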


24 / 82

Dominant eigenspace of PSD matrix

Typical applications

- Kernel Matrices (Machine Learning)
- Spectral Methods (Image Analysis)
- Correlation Matrices (Statistics and Signal Processing)
- Principal Component Analysis
- Karhunen–Loève
- …

We use KN to denote the full N × N positive definite matrix


25 / 82

Sweeping through K

I Suppose a rank m approximation of the dominant eigenspace ofKn ∈ Rn×n, n� m, is known,

Kn ≈ An := UnMmUTn ,

Mm ∈ Rm×m an spd matrix and Un ∈ Rn×m with UTn Un = Im

I Obtain the (n + 1)×m eigenspace Un+1 of the (n + 1)× (n + 1)kernel matrix Kn+1

Kn+1 =

[Kn aaT b

]≈ Un+1,m+2Mm+2UT

n+1,m+2

I Downdate Mm+2 to get back to rank mI Downsize Un+1 to get back to size n



26 / 82

Sweeping through K

We show only the columns and rows of U and K that are involved

[Diagram: a 10 × 4 matrix U and a 10 × 10 matrix K, with the active window highlighted]

Window size n = 5, rank k = 4

[Pages 30–45 (27 / 82 – 43 / 82), steps 1–16: animation frames showing the sliding window moving down the diagonal of K — start with the leading n × n subproblem, then alternately “Expand” (append the next row and column of K and a column of U) and “Downdate and downsize”; diagrams only]


43 / 82

Downdating K to fixed rank m

- Suppose a rank m approximation of the dominant eigenspace of Kₙ ∈ ℝⁿˣⁿ, n ≫ m, is known:

  Kₙ ≈ Aₙ := Uₙ Mₘ Uₙᵀ,

  with Mₘ ∈ ℝᵐˣᵐ an spd matrix and Uₙ ∈ ℝⁿˣᵐ with UₙᵀUₙ = Iₘ

- Obtain the (n + 1) × m eigenspace U_{n+1} of the (n + 1) × (n + 1) kernel matrix K_{n+1}:

  K_{n+1} = [ Kₙ  a ; aᵀ  b ] ≈ U_{n+1,m+2} M_{m+2} U_{n+1,m+2}ᵀ

- Downdate M_{m+2} to delete the “smallest” two eigenvalues


44 / 82

Updating: Proposed algorithm

Since

a = Uₙ Uₙᵀ a + (Iₙ − Uₙ Uₙᵀ) a = Uₙ r + ρ u⊥,

with r = Uₙᵀ a, q = (Iₙ − Uₙ Uₙᵀ) a, ρ = ‖q‖₂, u⊥ = q/ρ, we can write

A_{n+1} = [ Aₙ  a ; aᵀ  b ]
        = [ Uₙ  u⊥  0 ; 0ᵀ  0  1 ] [ Mₘ  0  r ; 0ᵀ  0  ρ ; rᵀ  ρ  b ] [ Uₙ  u⊥  0 ; 0ᵀ  0  1 ]ᵀ
        = [ Uₙ  u⊥  0 ; 0ᵀ  0  1 ] M_{m+2} [ Uₙ  u⊥  0 ; 0ᵀ  0  1 ]ᵀ


45 / 82

Updating: Proposed algorithm

- Let Mₘ = Qₘ Λₘ Qₘᵀ where Λₘ = diag(μ₁, μ₂, …, μₘ), μ₁ ≥ ⋯ ≥ μₘ > 0, QₘᵀQₘ = Iₘ

- Let M_{m+2} = Q_{m+2} Λ_{m+2} Q_{m+2}ᵀ where Λ_{m+2} = diag(λ₁, λ₂, …, λ_{m+1}, λ_{m+2}), Q_{m+2}ᵀQ_{m+2} = I_{m+2}

- By the interlacing property, we have

  λ₁ ≥ μ₁ ≥ λ₂ ≥ μ₂ ≥ ⋯ ≥ μₘ ≥ λ_{m+1} ≥ 0 ≥ λ_{m+2}



46 / 82

Updating: Proposed algorithm

We want a simple orthogonal transformation H such that

H [ v_{m+1}  v_{m+2} ] has zeros in its first m rows,  H M_{m+2} Hᵀ = diag( Mₘ, λ_{m+1}, λ_{m+2} ),

with Mₘ ∈ ℝᵐˣᵐ. Therefore

A_{n+1} = [ Uₙ  u⊥  0 ; 0ᵀ  0  1 ] Hᵀ diag( Mₘ, λ_{m+1}, λ_{m+2} ) H [ Uₙ  u⊥  0 ; 0ᵀ  0  1 ]ᵀ

and the new updated decomposition is given by

A_{n+1} = U_{n+1} Mₘ U_{n+1}ᵀ,

with U_{n+1} given by the first m columns of [ Uₙ  u⊥  0 ; 0ᵀ  0  1 ] Hᵀ


47 / 82

Cholesky factorization

- It is not needed to compute the whole spectral decomposition of the matrix M_{m+2}

- To compute H, only the eigenvectors v_{m+1}, v_{m+2} corresponding to λ_{m+1}, λ_{m+2} are needed

- To compute these vectors cheaply, one needs to maintain (and update) the Cholesky factorization Mₘ = Lₘ Lₘᵀ

- The eigenvectors are then obtained via inverse iteration

- H can then be computed as a product of Householder or Givens transformations



48 / 82

Updating Cholesky

M_{m+2} = [ Mₘ  0  r ; 0ᵀ  0  ρ ; rᵀ  ρ  b ] = [ LₘLₘᵀ  0  r ; 0ᵀ  0  ρ ; rᵀ  ρ  b ]

= [ Lₘ  0  0 ; 0ᵀ  1  0 ; tᵀ  0  1 ] [ Iₘ  0 ; 0  S_c ] [ Lₘᵀ  0  t ; 0ᵀ  1  0 ; 0ᵀ  0  1 ] = L_{m+2} D_{m+2} L_{m+2}ᵀ,

where

t = Lₘ⁻¹ r  and  S_c = [ 0  ρ ; ρ  b − tᵀt ]
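This LDLᵀ update of the bordered core can be checked numerically; a small numpy sketch (the size m and the values of ρ and b are arbitrary test data):

```python
import numpy as np

rng = np.random.default_rng(6)
m = 4
Lm = np.tril(rng.standard_normal((m, m))) + m * np.eye(m)   # Cholesky factor of M_m
r = rng.standard_normal(m)
rho, b = 0.7, 2.5

# Bordered core M_{m+2} = [[M_m, 0, r], [0, 0, rho], [r^T, rho, b]].
Mm2 = np.zeros((m + 2, m + 2))
Mm2[:m, :m] = Lm @ Lm.T
Mm2[:m, m + 1] = Mm2[m + 1, :m] = r
Mm2[m, m + 1] = Mm2[m + 1, m] = rho
Mm2[m + 1, m + 1] = b

# Update: t = L_m^{-1} r, S_c = [[0, rho], [rho, b - t.t]], giving
# M_{m+2} = L_{m+2} D_{m+2} L_{m+2}^T with D_{m+2} = diag(I_m, S_c).
t = np.linalg.solve(Lm, r)
Sc = np.array([[0.0, rho], [rho, b - t @ t]])
L = np.eye(m + 2)
L[:m, :m] = Lm
L[m + 1, :m] = t
D = np.eye(m + 2)
D[m:, m:] = Sc
assert np.allclose(L @ D @ L.T, Mm2)
```

Only the triangular solve for t (O(m²) flops) and the 2 × 2 Schur complement S_c are needed, which is what keeps the downdating step cheap.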

[Pages 58–66 (49 / 82 onward), steps 1–10: animation frames showing the rotations that build H being applied one at a time to the eigenvector pair, to the triangular factor L_{m+2} (restoring its triangular form), and to the U factor; diagrams only]

1

11

1 × ×× × ×× × ×

→→

×××××⊗××××

×××××××××

,

××××××......

......

......

××××××

Page 69: Dominant feature extraction · -6pt-6pt Dominant feature extraction-6pt-6pt 3 / 82 Dominant singular subspaces Given A m n, approximate it by a rank k factorization B m kC k n by

-6pt-6pt Dominant featureextraction

-6pt-6pt

60 / 82

step 11

→→

⊗×××

1

,→→

×××××××××××××××

×××

1

11

1 × ×× × ×× × ×

↓ ↓××××××××××××××××××

,

↓ ↓××××××......

......

......

××××××

Page 70: Dominant feature extraction · -6pt-6pt Dominant feature extraction-6pt-6pt 3 / 82 Dominant singular subspaces Given A m n, approximate it by a rank k factorization B m kC k n by

-6pt-6pt Dominant featureextraction

-6pt-6pt

61 / 82

step 12

×××

1

,

↓ ↓×××⊗××××××××××××

×××

↓ ↓

→→

1

11

1 × ×× × ×× × ×

→→

×××××××××⊗×××

××××××

,

××××××......

......

......

××××××

Page 71: Dominant feature extraction · -6pt-6pt Dominant feature extraction-6pt-6pt 3 / 82 Dominant singular subspaces Given A m n, approximate it by a rank k factorization B m kC k n by

-6pt-6pt Dominant featureextraction

-6pt-6pt

62 / 82

step 13

→→

⊗××

1

, →→

×××××××××××××××

×××

1

11

1 × ×× × ×× × ×

↓ ↓××××××××××××××××××

,

↓ ↓××××××......

......

......

××××××

Page 72: Dominant feature extraction · -6pt-6pt Dominant feature extraction-6pt-6pt 3 / 82 Dominant singular subspaces Given A m n, approximate it by a rank k factorization B m kC k n by

-6pt-6pt Dominant featureextraction

-6pt-6pt

63 / 82

step 14

××

1

,

↓ ↓××××××⊗×××××××××

×××

↓ ↓

→→

1

11

1 × ×× × ×× × ×

→→

××××××××××××⊗×××

×××

,

××××××......

......

......

××××××

Page 73: Dominant feature extraction · -6pt-6pt Dominant feature extraction-6pt-6pt 3 / 82 Dominant singular subspaces Given A m n, approximate it by a rank k factorization B m kC k n by

-6pt-6pt Dominant featureextraction

-6pt-6pt

64 / 82

step 15

→→

⊗×

1

,→→

×××××××××××××××

××××

1

11 × × ×× × × ×× × × ×× × × ×

↓ ↓×××××××××××××××××××

,

↓ ↓××××××......

......

......

××××××

Page 74: Dominant feature extraction · -6pt-6pt Dominant feature extraction-6pt-6pt 3 / 82 Dominant singular subspaces Given A m n, approximate it by a rank k factorization B m kC k n by

-6pt-6pt Dominant featureextraction

-6pt-6pt

65 / 82

step 16

11

,

↓ ↓××××××××××⊗

×××××××

↓ ↓

→→

1

11 × × ×× × × ×× × × ×× × × ×

→→

××××××××××××××⊗××

×

,

××××××......

......

......

××××××

Page 75: Dominant feature extraction · -6pt-6pt Dominant feature extraction-6pt-6pt 3 / 82 Dominant singular subspaces Given A m n, approximate it by a rank k factorization B m kC k n by

-6pt-6pt Dominant featureextraction

-6pt-6pt

66 / 82

step 17

11

,

××××××××××

×××××××

1

11 × × ×× × × ×× × × ×× × × ×

××××××××× ××× ×××××

,

×××× ××...

......

......

...×××× ××

Downsizing algorithm

I Let Hv be orthogonal such that

\[
H_v v = \upsilon\, e_{1,m}, \qquad \upsilon = \mp\|v\|_2, \qquad
U_{n+1} = \begin{bmatrix} v^T \\ V \end{bmatrix}.
\]

I Then

\[
U_{n+1} H_v = \begin{bmatrix} \upsilon \ \ 0 \ \cdots \ 0 \\ V H_v \end{bmatrix}.
\]

I To restore the orthonormality of V Hv, it is sufficient to divide its first column by √(1 − υ²) and therefore to multiply the first column and row of Mm by the same quantity

I If the matrix Mm is factored as Lm LmT, this reduces to multiplying the first entry of Lm by √(1 − υ²)

I Any row of Un+1 can be chosen to be removed this way
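A minimal NumPy sketch of this row-removal step (the function name and the restriction to the first row are my own choices; the slide notes that any row can be treated this way). The companion rescaling of Mm, or of the first entry of Lm, is only indicated in a comment.

```python
import numpy as np

def downsize_first_row(U):
    """Remove the first row v^T of an (n+1) x m matrix U with orthonormal
    columns, restoring orthonormality of the remaining block with one
    Householder reflector H_v such that H_v v = upsilon * e_1."""
    v = U[0, :].copy()
    nrm = np.linalg.norm(v)          # |upsilon| = ||v||_2
    if nrm == 0.0:
        return U[1:, :].copy()       # nothing to correct
    s = 1.0 if v[0] >= 0 else -1.0   # sign choice for numerical stability
    w = v.copy()
    w[0] += s * nrm
    w /= np.linalg.norm(w)
    H = np.eye(len(v)) - 2.0 * np.outer(w, w)   # symmetric orthogonal H_v
    VH = U[1:, :] @ H
    # First column of V H_v has norm sqrt(1 - upsilon^2); rescale it.
    # (Requires ||v|| < 1; in the slide's algorithm the first column/row of
    # M_m -- or the first entry of L_m -- is multiplied by the same factor.)
    VH[:, 0] /= np.sqrt(1.0 - nrm**2)
    return VH
```

On a random orthonormal-column matrix the result again has orthonormal columns, which is the property the downsizing step must preserve.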

Accuracy bounds

When there is no downsizing

\[
\|K_n - A_n\|_F^2 \;\le\; \eta_n := \sum_{i=m+1}^{n}\big(\lambda_i^{(n)}\big)^2
+ \sum_{i=m+1}^{n}\big(\delta_i^{(+)}\big)^2
+ \sum_{i=m+1}^{n}\big(\delta_i^{(-)}\big)^2,
\]

\[
\|K_n - A_n\|_2 \;\le\; \zeta_n := \lambda_{m+1}^{(n)}
+ \sum_{i=m+1}^{n}\max\{\delta_i^{(+)},\,\delta_i^{(-)}\},
\]

where

\[
A_n := U_n M_m U_n^T,\qquad
\delta_i^{(+)} = \lambda_{m+1}^{(i)},\qquad
\delta_i^{(-)} = \lambda_{m+2}^{(i)},
\]

\[
\|K_n - A_n\|_F^2 = \sum_{i=m+1}^{n}\big(\lambda_i^{(n)}\big)^2,\qquad
\|K_n - A_n\|_2 = \lambda_{m+1}^{(n)}.
\]
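The last two equalities are the standard errors of an exact rank-m spectral truncation; a quick NumPy check (my own illustration, independent of the slides' updating algorithm) confirms them on a random positive semidefinite matrix.

```python
import numpy as np

# For A built from the top-m eigenpairs of a symmetric PSD matrix K:
#   ||K - A||_F^2 = sum_{i>m} lambda_i^2   and   ||K - A||_2 = lambda_{m+1}.
rng = np.random.default_rng(0)
n, m = 50, 5
B = rng.standard_normal((n, n))
K = B @ B.T                        # symmetric positive semidefinite
lam, Q = np.linalg.eigh(K)         # eigh returns ascending eigenvalues
lam, Q = lam[::-1], Q[:, ::-1]     # reorder to descending
A = Q[:, :m] @ np.diag(lam[:m]) @ Q[:, :m].T

errF2 = np.linalg.norm(K - A, 'fro')**2
err2 = np.linalg.norm(K - A, 2)
assert np.isclose(errF2, np.sum(lam[m:]**2))
assert np.isclose(err2, lam[m])    # lambda_{m+1} in 0-based indexing
```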

Accuracy bounds

In relation to the original spectrum of KN one obtains the approximate bounds

\[
\sum_{i=m+1}^{N} \lambda_i^2 \;\le\; \|K_N - A_N\|_F^2 \;\lesssim\; (N-m)\,\lambda_{m+1}^2
\]

and

\[
\lambda_{m+1} \;\le\; \|K_N - A_N\|_2 \;\lesssim\; c\,\lambda_{m+1}.
\]

When downsizing the matrix as well, there are no guaranteed bounds

Example 1 (no downsizing)

I The matrix considered in this example is a kernel matrix constructed from the Abalone benchmark data set http://archive.ics.uci.edu/ml/support/Abalone, with radial basis kernel function

\[
k(x,y) = \exp\!\Big(-\frac{\|x-y\|_2^2}{100}\Big),
\]

I This data set has 4177 training instances
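Such a kernel matrix is straightforward to form; the sketch below is my own (the function name is invented, the denominator 100 is my reading of the slide's formula, and loading the 4177 Abalone instances into `X` is left out).

```python
import numpy as np

def rbf_kernel_matrix(X, denom=100.0):
    """K[i, j] = exp(-||x_i - x_j||_2^2 / denom) for the rows of X."""
    sq = np.sum(X * X, axis=1)
    # Pairwise squared distances via ||a-b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)    # clip tiny negatives from rounding
    return np.exp(-d2 / denom)

# e.g. with X holding the 4177 Abalone instances, K is 4177 x 4177 and SPD.
```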

Figure: Distribution of the largest 100 eigenvalues of the Abalone matrix in logarithmic scale.

Table: Largest 9 eigenvalues of the Abalone matrix KN (second column), of AN obtained with updating with n = 500, m = 9 (third column) and with n = 500, m = 20 (fourth column), respectively.

λi                   µi, n = 500, m = 9    µi, n = 500, m = 20
4.14838108255808e+3  4.14838108255812e+3   4.14838108255805e+3
2.77142467123926e+1  2.77142467123935e+1   2.77142467123908e+1
3.96946486354603e-1  3.96946485174339e-1   3.96946486354575e-1
2.82827838600384e-1  2.82827838240747e-1   2.82827838601794e-1
8.76354938729571e-2  8.76354893664714e-2   8.76354938730078e-2
4.48191766538717e-2  4.48191002296202e-2   4.48191766537462e-2
3.95005821149249e-2  3.95005033082028e-2   3.95005821145827e-2
3.44916594206443e-2  3.44915746496473e-2   3.44916594206963e-2
1.22751950123456e-2  1.22750932394003e-2   1.22751950116852e-2

Figure: Plot of the sequences of δ(+)n (blue line), δ(−)n (green line), ηn (red solid line), λm+1 (cyan solid line) and ‖KN − AN‖F (magenta solid line).

Table: Angles between the eigenvectors corresponding to the largest 9 eigenvalues of KN, computed by the function eigs of matlab, and those computed by the proposed algorithm for n = 500, m = 9 (second column) and n = 500, m = 20 (third column).

i   ∠(xi, x̂i), n = 500, m = 9   ∠(xi, x̂i), n = 500, m = 20
1   3.6500e-08                  8.4294e-08
2   3.9425e-08                  2.9802e-08
3   2.3774e-06                  5.1619e-08
4   2.5086e-06                  2.9802e-08
5   3.0084e-05                  1.1151e-07
6   2.0446e-04                  4.2147e-08
7   2.0213e-04                  1.4901e-08
8   3.4670e-04                  8.1617e-08
9   5.9886e-04                  2.1073e-08
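Angles like those in the table can be computed sign-invariantly (eigensolvers are free to flip the sign of an eigenvector); the small helper below is my own, not the slides' code.

```python
import numpy as np

def eigvec_angle(u, v):
    """Acute angle between two eigenvectors, insensitive to the
    arbitrary sign flip an eigensolver may introduce."""
    c = abs(np.dot(u, v)) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(min(c, 1.0))   # clamp guards rounding above 1
```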

Example 2 (downsizing)

The following matrix has rank 3:

\[
F(i,j) = \sum_{k=1}^{3} \exp\!\Big(-\frac{(i-\mu_k)^2 + (j-\mu_k)^2}{2\sigma_k}\Big),
\qquad i,j = 1,\dots,100,
\]

with

\[
\mu = \begin{bmatrix} 4 & 18 & 76 \end{bmatrix}, \qquad
\sigma = \begin{bmatrix} 10 & 20 & 5 \end{bmatrix}.
\]

Let F = QΛQT be its spectral decomposition, let ∆ ∈ R100×100 be a matrix of random numbers generated by the matlab function randn, and define ∆ = ∆/‖∆‖2. For this example, the considered SPD matrix is

\[
K_N = F + \varepsilon \Delta\Delta^T, \qquad \varepsilon = 10^{-5}.
\]
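This construction is easy to reproduce; the NumPy sketch below (the seed and RNG stand in for matlab's randn and are my own choices) builds F and KN exactly as defined. Each Gaussian term factors as an outer product of a vector with itself, which is why F has rank 3.

```python
import numpy as np

n = 100
mu = np.array([4.0, 18.0, 76.0])
sigma = np.array([10.0, 20.0, 5.0])

i = np.arange(1, n + 1)[:, None]   # row index i = 1..100
j = np.arange(1, n + 1)[None, :]   # column index j = 1..100

# F(i,j) = sum_k exp(-((i - mu_k)^2 + (j - mu_k)^2) / (2 sigma_k)):
# each term is exp(-(i-mu_k)^2/(2 s_k)) * exp(-(j-mu_k)^2/(2 s_k)),
# a rank-1 outer product, so F has rank 3.
F = sum(np.exp(-((i - m)**2 + (j - m)**2) / (2.0 * s))
        for m, s in zip(mu, sigma))

rng = np.random.default_rng(0)     # stand-in for randn; seed is arbitrary
D = rng.standard_normal((n, n))
D /= np.linalg.norm(D, 2)          # normalize by the spectral norm
K = F + 1.0e-5 * (D @ D.T)         # SPD perturbation of the rank-3 matrix
```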

Example 2

Figure: Graph of the size of the entries of the matrix KN.

Figure: Distribution of the eigenvalues of the matrix KN in logarithmic scale.

Figure: Plot of the three dominant eigenvectors of KN.

Figure: Plot of the sequences of δ(+)n (blue dash dotted line), δ(−)n (green dotted line), ηn (red solid line), λm+1 (cyan solid line) and ‖KN − AN‖F (black solid line).

Example 2

Table: Largest three eigenvalues of the matrix KN (first column). Largest three eigenvalues computed with the downdating procedure with minimal norm, with m = 3 and n = 30, 40, 50 (second, third and fourth column), respectively. Largest three eigenvalues of KN computed with the "former" downdating procedure (fifth column).

λi          µi, n = 30   µi, n = 40   µi, n = 50   µi, n = 50 (former)
7.949478e0  7.375113e0   7.820407e0   7.947127e0   3.963329e0
5.261405e0  5.255163e0   5.260243e0   5.261384e0   5.417202e-6
3.963329e0  3.948244e0   3.963213e0   3.963329e0   4.824060e-6

Conclusions

I A fast algorithm to compute incrementally the dominant eigenspace of a positive definite matrix

I Improvement on Hoegaerts, L., De Lathauwer, L., Goethals, I., Suykens, J.A.K., Vandewalle, J., & De Moor, B. Efficiently updating and tracking the dominant kernel principal components. Neural Networks, 20, 220–229, 2007.

I The overall complexity of the incremental updating technique to compute an N × m basis matrix UN for the dominant eigenspace of KN is reduced from (m + 4)N²m + O(Nm³) to 6N²m + O(Nm²).

I When using both incremental updating and downsizing to compute the dominant eigenspace of Kn (an n × n principal submatrix of KN), the complexity is reduced from (12m + 4)Nnm + O(Nm³) to 16Nnm + O(Nm²).

I This is in both cases essentially a reduction by a factor m.

References

Gu, Eisenstat, An efficient algorithm for computing a strong rank revealing QR factorization, SIAM SISC, 1996

Chahlaoui, Gallivan, Van Dooren, An incremental method for computing dominant singular subspaces, SIMAX, 2001

Hoegaerts, De Lathauwer, Goethals, Suykens, Vandewalle, De Moor, Efficiently updating and tracking the dominant kernel principal components, Neural Networks, 2007

Mastronardi, Tyrtishnikov, Van Dooren, A fast algorithm for updating and downsizing the dominant kernel principal components, SIMAX, 2010

Baker, Gallivan, Van Dooren, Low-rank incremental methods for computing dominant singular subspaces, submitted, 2010

Ipsen, Van Dooren, Polynomial Time Subset Selection Via Updating, in preparation, 2010

