
Low Rank Tucker Approximation of a Tensor from Streaming Data

Madeleine Udell

Operations Research and Information Engineering, Cornell University

Based on joint work with Yiming Sun (Cornell), Yang Guo (UW Madison),

Charlene Luo (Columbia), and Joel Tropp (Caltech)

April 1, 2019

Madeleine Udell, Cornell. Streaming Tucker Approximation. 1


Outline

Applications

Tucker factorization

Sketching

Reconstruction

Numerics


Big data, small laptop

X = H1 + · · · + HT


Distributed data

X = H1 + · · · + HT

[Diagram: updates H1, . . . , Ht, . . . , HT held on separate machines, summing to X.]


Streaming data

X(t) = H1 + · · · + Ht


Streaming multilinear algebra

turnstile model:

X = H1 + · · ·+ HT

I tensor X presented as sum of smaller, simpler tensors Ht

I must discard Ht after it is processed

I Goal: without storing X, approximate X after seeing all updates (with guaranteed accuracy)

applications:

I scientific simulation

I sensor measurements

I memory- or communication-limited computing

I low memory optimization


Linear sketch

X = H1 + · · ·+ HT

L(X) = L(H1) + · · ·+ L(HT )

I select a linear map L independent of X

I sketch L(X) is much smaller than input tensor X

I use randomness so sketch works for an arbitrary input

I essentially the only way to handle the turnstile model [Li, Nguyen & Woodruff 2014]

examples:

I L(X) = X×n Ω for some matrix Ω

I L(X) = {X ×n Ωn}n∈[N] for some matrices {Ωn}n∈[N]
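These maps are linear, so sketching commutes with the stream of updates: the sketch of the sum is the sum of the sketches. A minimal numpy sketch of this fact (the mode-0 map and all shapes are illustrative, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

def mode_n_product(X, A, n):
    # X ×n A: contract mode n of X with A, where A has shape k × I_n
    return np.moveaxis(np.tensordot(A, X, axes=(1, n)), 0, n)

# stream of T updates whose sum is the full tensor X
updates = [rng.standard_normal((30, 40, 50)) for _ in range(5)]
X = sum(updates)

Omega = rng.standard_normal((10, 30))  # random map applied along mode 0

# sketch each update as it arrives, discard it, keep only the running sum
L_stream = sum(mode_n_product(H, Omega, 0) for H in updates)

# same result as sketching the full tensor (never materialized when streaming)
L_full = mode_n_product(X, Omega, 0)
assert np.allclose(L_stream, L_full)
```

The running sum `L_stream` is all the algorithm ever keeps: 10 × 40 × 50 numbers instead of 30 × 40 × 50.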


Main idea

sketch suffices for (Tucker) approximation:

I compute (randomized) linear sketch of tensor

I recover low rank (Tucker) approximation from sketch

I (optional) improve approximation by revisiting data


Big data, small laptop: sketch

X = H1 + · · · + HT

[Diagram: the laptop now stores only the sketch L(X).]

I (+) reduced communication

I (+) sketch of data fits on laptop


Distributed data: sketch

L(X(T)) = L(H1 + · · · + HT−1) + L(HT)

I (+) reduced communication

I (+) no PITI (personally identifiable toast information)

I (+) sketch of data fits on laptop


Streaming data: sketch

L(X(t)) = L(H1 + · · · + Ht−1) + L(Ht)

I (+) even a toaster can form sketch


Outline

Applications

Tucker factorization

Sketching

Reconstruction

Numerics


Notation

tensor to compress:

I tensor X ∈ RI1×···×IN with N modes

I sometimes assume I1 = · · · = IN = I for simplicity

indexing:

I [N] = {1, . . . , N}

I I(−n) = I1 × · · · × In−1 × In+1 × · · · × IN

tensor operations:

I mode n product: for A ∈ Rk×In, X ×n A ∈ RI1×···×In−1×k×In+1×···×IN

I unfolding X(n) ∈ RIn×I(−n) stacks mode-n fibers of X as columns of a matrix
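A hedged numpy illustration of these two operations (helper names are mine; this `unfold` orders the remaining modes row-major, which matches the fiber-stacking definition up to a permutation of columns):

```python
import numpy as np

def mode_n_product(X, A, n):
    # X ×n A: contract mode n of X with A, where A has shape k × I_n
    return np.moveaxis(np.tensordot(A, X, axes=(1, n)), 0, n)

def unfold(X, n):
    # X_(n): mode-n fibers of X become the columns of an I_n × I_(-n) matrix
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

X = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)
A = np.ones((5, 3))                     # k = 5 acting on mode 1 (I_1 = 3)

Y = mode_n_product(X, A, 1)
assert Y.shape == (2, 5, 4)             # I_1 replaced by k

# the mode-n product is matrix multiplication against the unfolding
assert np.allclose(unfold(Y, 1), A @ unfold(X, 1))
```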


Tucker factorization

rank r = (r1, . . . , rN) Tucker factorization of X ∈ RI1×···×IN :

X = G×1 U1 · · · ×N UN =: JG; U1, . . . ,UNK

where

I G ∈ Rr1×···×rN is the core tensor

I Un ∈ RIn×rn is the factor matrix for each mode n ∈ [N]

(sometimes assume r1 = · · · = rN = r for simplicity)
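Evaluating JG; U1, . . . , UNK is just a chain of mode-n products; a small numpy sketch under arbitrary illustrative shapes:

```python
import numpy as np

def mode_n_product(X, A, n):
    # X ×n A for A of shape k × I_n
    return np.moveaxis(np.tensordot(A, X, axes=(1, n)), 0, n)

def tucker_to_tensor(G, Us):
    # evaluate [[G; U1, ..., UN]] = G ×1 U1 ··· ×N UN
    X = G
    for n, U in enumerate(Us):
        X = mode_n_product(X, U, n)
    return X

rng = np.random.default_rng(1)
r, I, N = 3, 10, 3
G = rng.standard_normal((r, r, r))
Us = [rng.standard_normal((I, r)) for _ in range(N)]

X = tucker_to_tensor(G, Us)
assert X.shape == (I, I, I)

# storage: core of r^N numbers plus N factor matrices of I*r numbers each,
# versus I^N numbers for the dense tensor
stored = r**N + N * I * r
assert stored < I**N
```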

Tucker is useful for compression: when N is small,

I Tucker stores O(r^N + NrI) numbers for rank (r, . . . , r) approximation

I CP stores O(NrI) numbers for rank r approximation

future work: one pass ST-HOSVD / tensor train?


Computing Tucker: HOSVD

Algorithm: Higher order singular value decomposition (HOSVD) [De Lathauwer, De Moor & Vandewalle 2000, Tucker 1966]

Given: tensor X, rank r = (r1, . . . , rN)

1. Factors. Compute top rn left singular vectors Un of the unfolding X(n) for each n ∈ [N].

2. Core. Contract these with X to form the core

G = X ×1 U1^T · · · ×N UN^T.

Return: Tucker approximation XHOSVD = JG; U1, . . . ,UNK
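A compact numpy sketch of HOSVD as stated above (helper names are mine; the test tensor is exactly low rank, so the approximation is exact):

```python
import numpy as np

def unfold(X, n):
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def mode_n_product(X, A, n):
    return np.moveaxis(np.tensordot(A, X, axes=(1, n)), 0, n)

def hosvd(X, ranks):
    # 1. Factors: top r_n left singular vectors of each unfolding
    Us = []
    for n, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(X, n), full_matrices=False)
        Us.append(U[:, :r])
    # 2. Core: G = X ×1 U1^T ··· ×N UN^T
    G = X
    for n, U in enumerate(Us):
        G = mode_n_product(G, U.T, n)
    return G, Us

# an exactly rank-(2,2,2) tensor is recovered exactly
rng = np.random.default_rng(2)
G0 = rng.standard_normal((2, 2, 2))
Us0 = [np.linalg.qr(rng.standard_normal((8, 2)))[0] for _ in range(3)]
X = G0
for n, U in enumerate(Us0):
    X = mode_n_product(X, U, n)

G, Us = hosvd(X, (2, 2, 2))
Xhat = G
for n, U in enumerate(Us):
    Xhat = mode_n_product(Xhat, U, n)
assert np.allclose(Xhat, X)
```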


Two pass HOSVD

HOSVD can be computed in two passes over the tensor:

I Factors. use randomized linear algebra; need to find the span of the fibers of X along the nth mode:

range(Un) ≈ range(X(n))

I if rank(Ω) ≥ rank(X(n)), then whp for random Ω,

range(X(n)) = range(X(n)Ω)

algorithm:
1. compute sketch L(X) = {X(n)Ωn}n∈[N]
2. use QR on the sketch to approximate range(X(n))

I Core. Computation is linear in X:

G = X ×1 U1^T · · · ×N UN^T.

Source: [Halko, Martinsson & Tropp 2011, Zhou, Cichocki & Xie 2014, Battaglino, Ballard & Kolda 2019]
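The factor step for a single mode can be sketched with a randomized range finder; the rank, oversampling, and sizes below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

# a low rank unfolding: I_n × I_(-n) matrix of rank 4
Xn = rng.standard_normal((50, 4)) @ rng.standard_normal((4, 400))

k = 4 + 5                                  # target rank plus oversampling
Omega = rng.standard_normal((400, k))      # random DRM
Y = Xn @ Omega                             # one pass over the unfolding
Q, _ = np.linalg.qr(Y)                     # orthonormal basis for range(Y)

# whp range(Q) captures range(Xn): projecting onto it changes nothing
assert np.allclose(Q @ (Q.T @ Xn), Xn)
```

With noisy data the equality becomes an approximation, which is what the error bounds later in the talk quantify.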


Computing Tucker: HOOI

Algorithm: Higher order orthogonal iteration (HOOI) [De Lathauwer et al. 2000]

Given: tensor X, rank r = (r1, . . . , rN)
Initialize: compute X ≈ JG; U1, . . . , UNK using HOSVD
Repeat:

1. Factors. For n ∈ [N],

Un ← argmin_{Un} ‖JG; U1, . . . , UNK − X‖_F^2,

2. Core.

G ← argmin_G ‖JG; U1, . . . , UNK − X‖_F^2.

Return: Tucker approximation XHOOI = JG; U1, . . . ,UNK

I core update has closed form G ← X ×1 U1^T · · · ×N UN^T

Previous work: one pass algorithm via HOOI

[Malik & Becker 2018]:

I (+) sketch design matrix to reduce size of HOOI subproblems

I (+) exploit Tucker structure of design matrix

I (-) expensive, slow reconstruction (via iterative optimization)

I (-) no error guarantees for one pass algorithm


Outline

Applications

Tucker factorization

Sketching

Reconstruction

Numerics


Background: randomized sketches

idea: random matrix Ω is not orthogonal to range of interest (whp)

range(X(n)) = range(X(n)Ω)

a dimension reduction map (DRM) (approximately) preserves the range of its argument

examples of DRMs: multiplication by a random matrix Ω that is

I Gaussian

I sparse [Achlioptas 2003, Li, Hastie & Church 2006]

I SSRFT [Woolfe, Liberty, Rokhlin & Tygert 2008]

I tensor random projection (TRP) [Sun, Guo, Tropp & Udell 2018]

I . . .


The sketch

approximate factor matrices and core:

I Factor sketch (k). For each n ∈ [N], fix a random DRM Ωn ∈ RI(−n)×kn and compute the sketch

Vn = X(n)Ωn ∈ RIn×kn.

I Core sketch (s). For each n ∈ [N], fix a random DRM Φn ∈ RIn×sn and compute the sketch

H = X ×1 Φ1^T · · · ×N ΦN^T ∈ Rs1×···×sN.

I Rule of thumb. Pick k as big as you can afford, pick s = 2k.

I define (H, V1, . . . , VN) = Sketch(X; {Φn, Ωn}n∈[N])
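A hedged numpy version of the sketch map (Gaussian DRMs and helper names are my assumptions; the talk also allows structured DRMs such as TRP):

```python
import numpy as np

def unfold(X, n):
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def mode_n_product(X, A, n):
    return np.moveaxis(np.tensordot(A, X, axes=(1, n)), 0, n)

def sketch(X, Phis, Omegas):
    # factor sketches V_n = X_(n) Omega_n and core sketch H = X ×n Phi_n^T
    Vs = [unfold(X, n) @ Om for n, Om in enumerate(Omegas)]
    H = X
    for n, Phi in enumerate(Phis):
        H = mode_n_product(H, Phi.T, n)
    return H, Vs

rng = np.random.default_rng(5)
I, k, s, N = 20, 5, 11, 3          # rule of thumb: s ≈ 2k (here s = 2k + 1)
X = rng.standard_normal((I, I, I))
Omegas = [rng.standard_normal((I * I, k)) for _ in range(N)]
Phis = [rng.standard_normal((I, s)) for _ in range(N)]

H, Vs = sketch(X, Phis, Omegas)
assert H.shape == (s, s, s)
assert all(V.shape == (I, k) for V in Vs)
```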


Low memory DRMs

factor sketch DRMs are big!

I I(−n) × kn for each n ∈ [N]

how to store?

I don’t store DRMs; instead, use a pseudorandom number generator to generate (parts of) DRMs as needed.

I use structured DRM:
I TRP generates the DRM as a Khatri-Rao product of simpler, smaller DRMs
I behaves approximately like a Gaussian sketch

Source: [Sun et al. 2018, Rudelson 2012]
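A minimal sketch of the TRP idea for two modes (the `khatri_rao` helper and sizes are mine): the Khatri-Rao product assembles a tall DRM from two small factors, so only the small factors need to be stored.

```python
import numpy as np

rng = np.random.default_rng(4)

I1, I2, k = 30, 40, 10

# two small random maps replace one (I1*I2) × k Gaussian matrix
A = rng.standard_normal((I1, k))
B = rng.standard_normal((I2, k))

def khatri_rao(A, B):
    # column-wise Kronecker product: column j is kron(A[:, j], B[:, j])
    return (A[:, None, :] * B[None, :, :]).reshape(A.shape[0] * B.shape[0], -1)

Omega = khatri_rao(A, B)
assert Omega.shape == (I1 * I2, k)

# storage: (I1 + I2) * k numbers instead of I1 * I2 * k
assert A.size + B.size < Omega.size
```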


Outline

Applications

Tucker factorization

Sketching

Reconstruction

Numerics


Recovery: factor matrices

I compute QR factorization of each factor sketch Vn:

Vn = QnRn

where Qn has orthonormal columns and Rn is upper triangular


Two pass algorithm

Algorithm Two Pass Sketch and Low Rank Recovery

Given: tensor X, rank r = (r1, . . . , rN), DRMs {Φn, Ωn}n∈[N]

I Sketch. (H, V1, . . . , VN) = Sketch(X; {Φn, Ωn}n∈[N])

I Recover factor matrices. For n ∈ [N],

(Qn,∼)← QR(Vn)

I Recover core.

W ← X ×1 Q1^T · · · ×N QN^T

Return: Tucker approximation X = JW; Q1, . . . ,QNK

accesses X twice: once to sketch, once to recover the core


Intuition: one pass core recovery

I we want to know W: the compression of X using the factor range approximations Qn

I we observe H: the compression of X using the random projections Φn

how to approximate W?

X ≈ X ×1 Q1Q1^T · · · ×N QNQN^T

= (X ×1 Q1^T · · · ×N QN^T) ×1 Q1 · · · ×N QN

= W ×1 Q1 · · · ×N QN

H = X ×1 Φ1^T · · · ×N ΦN^T ≈ W ×1 (Φ1^T Q1) · · · ×N (ΦN^T QN)

we can solve for W: since s > k, each Φn^T Qn has a left inverse (whp):

W ≈ H ×1 (Φ1^T Q1)† · · · ×N (ΦN^T QN)†


One pass algorithm

Algorithm One Pass Sketch and Low Rank Recovery

Given: tensor X, rank r = (r1, . . . , rN), DRMs {Φn, Ωn}n∈[N]

I Sketch. (H, V1, . . . , VN) = Sketch(X; {Φn, Ωn}n∈[N])

I Recover factor matrices. For n ∈ [N],

(Qn,∼)← QR(Vn)

I Recover core.

W ← H ×1 (Φ1^T Q1)† · · · ×N (ΦN^T QN)†

Return: Tucker approximation X = JW; Q1, . . . ,QNK

accesses X only once, to sketch

Source: [Sun, Guo, Luo, Tropp & Udell 2019]
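Putting the pieces together, a hedged end-to-end numpy sketch of the one pass algorithm (Gaussian DRMs and helper names are mine; recovery is exact here because the test tensor is exactly low rank):

```python
import numpy as np

def unfold(X, n):
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def mode_prod(X, A, n):
    return np.moveaxis(np.tensordot(A, X, axes=(1, n)), 0, n)

def one_pass(X, k, s):
    N = X.ndim
    rng = np.random.default_rng(0)
    Omegas = [rng.standard_normal((unfold(X, n).shape[1], k)) for n in range(N)]
    Phis = [rng.standard_normal((X.shape[n], s)) for n in range(N)]
    # Sketch (the only pass over X)
    Vs = [unfold(X, n) @ Om for n, Om in enumerate(Omegas)]
    H = X
    for n, Phi in enumerate(Phis):
        H = mode_prod(H, Phi.T, n)
    # Recover factor matrices
    Qs = [np.linalg.qr(V)[0] for V in Vs]
    # Recover core: W = H ×n (Phi_n^T Q_n)^dagger
    W = H
    for n, (Phi, Q) in enumerate(zip(Phis, Qs)):
        W = mode_prod(W, np.linalg.pinv(Phi.T @ Q), n)
    return W, Qs

rng = np.random.default_rng(6)
G = rng.standard_normal((3, 3, 3))
Us = [np.linalg.qr(rng.standard_normal((15, 3)))[0] for _ in range(3)]
X = G
for n, U in enumerate(Us):
    X = mode_prod(X, U, n)

W, Qs = one_pass(X, k=5, s=11)
Xhat = W
for n, Q in enumerate(Qs):
    Xhat = mode_prod(Xhat, Q, n)
assert np.allclose(Xhat, X)
```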


Fixed rank approximation

to truncate reconstruction to rank r, truncate core:

Lemma. For a tensor W ∈ Rk1×···×kN and orthogonal matrices Qn ∈ RIn×kn,

JW ×1 Q1 · · · ×N QNKr = JWKr ×1 Q1 · · · ×N QN,

where J·Kr denotes the best rank r Tucker approximation.

=⇒ compute the fixed rank approximation using, e.g., HOOI on the (small) core approximation W
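Mechanically, only the small core is truncated; a hedged numpy sketch using HOSVD on W as the truncation (HOOI would refine this further):

```python
import numpy as np

def unfold(X, n):
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def mode_prod(X, A, n):
    return np.moveaxis(np.tensordot(A, X, axes=(1, n)), 0, n)

def truncate_core(W, ranks):
    # rank-r Tucker truncation of the small k1 × ··· × kN core via HOSVD
    Ps = []
    for n, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(W, n), full_matrices=False)
        Ps.append(U[:, :r])
    G = W
    for n, P in enumerate(Ps):
        G = mode_prod(G, P.T, n)
    return G, Ps

rng = np.random.default_rng(7)
W = rng.standard_normal((6, 6, 6))          # small recovered core
Qs = [np.linalg.qr(rng.standard_normal((20, 6)))[0] for _ in range(3)]

G, Ps = truncate_core(W, (2, 2, 2))
# updated factors Q_n P_n stay orthonormal, so only small SVDs were needed
Qnew = [Q @ P for Q, P in zip(Qs, Ps)]
assert all(np.allclose(Q.T @ Q, np.eye(2)) for Q in Qnew)
```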


Tail energy

For each unfolding X(n), define its ρth tail energy as

(τ^{(n)}_ρ)^2 := ∑_{k=ρ+1}^{min(In, I(−n))} σ_k^2(X(n)),

where σk(X(n)) is the kth largest singular value of X(n).
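A hedged numpy helper that computes this quantity directly from the singular values of the unfolding:

```python
import numpy as np

def unfold(X, n):
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def tail_energy_sq(X, n, rho):
    # (tau_rho^(n))^2: squared singular values of X_(n) beyond the top rho
    s = np.linalg.svd(unfold(X, n), compute_uv=False)
    return float(np.sum(s[rho:] ** 2))

rng = np.random.default_rng(8)
X = rng.standard_normal((8, 9, 10))

# the full tail (rho = 0) is the squared Frobenius norm of X
assert np.isclose(tail_energy_sq(X, 0, 0), np.linalg.norm(X) ** 2)
# tails are nonincreasing in rho
assert tail_energy_sq(X, 0, 3) <= tail_energy_sq(X, 0, 2)
```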


Guarantees (I)

Theorem (Recommended parameters [Sun et al. 2019])

Sketch X with Gaussian DRMs of parameters k, s = 2k + 1. Form a rank r Tucker approximation X̂ using the one pass algorithm. Then

E ‖X − X̂‖_F^2 ≤ 4 ∑_{n=1}^{N} (τ^{(n)}_{r_n})^2.

If X is truly rank r, we obtain the true Tucker factorization!


Guarantees (II)

Theorem (Detailed guarantee [Sun et al. 2019])

Sketch X with Gaussian DRMs of parameters k, s. Form a rank r Tucker approximation X̂ using the one pass algorithm. Then

E ‖X − X̂‖_F^2 ≤ (1 + Δ) min_{1 ≤ ρ_n < k_n − 1} ∑_{n=1}^{N} (1 + ρ_n / (k_n − ρ_n − 1)) (τ^{(n)}_{ρ_n})^2,

where Δ = max_{n ∈ [N]} k_n / (s_n − k_n − 1)


Outline

Applications

Tucker factorization

Sketching

Reconstruction

Numerics


Different DRMs perform similarly

[Figure: difference in relative error vs. compression factor δ1 = k/I, for SSRFT, Gaussian TRP, and Sparse TRP sketches. Five panels, all with I = 600: Low Rank γ = 0.01, Sparse Low Rank γ = 0.01, Polynomial Decay, Low Rank γ = 0.1, Low Rank γ = 1.]

Comments: Synthetic data, I = 600 and r = (5, 5, 5). k/I = 0.4 corresponds to 20× compression.


Sensible reconstruction at practical compression level

[Figure: difference in relative error vs. memory use, for Two Pass, One Pass, and TS. Five panels, all with I = 300: Low Rank γ = 0.01, Sparse Low Rank γ = 0.01, Polynomial Decay, Low Rank γ = 0.1, Low Rank γ = 1.]

Comments: Error of fixed-rank approximation relative to HOOI for r = 10, I = 300, using TRP. Total memory use is ((2k + 1)N + kIN) and (Kr2N + K · r2N−2). Low-rank data uses γ = 0.01, 0.1, 1.


Combustion simulation

[Figure: 128 × 128 slices (x vs. y) of the combustion field: Original, HOOI, Two Pass, and One Pass reconstructions.]

Comments: 1408 × 128 × 128 simulated combustion data from [Lapointe, Savard & Blanquart 2015].


Video scene classification

[Figure: frame-by-frame scene classification over 2000 frames for: Linear Sketch (k = 20); Two-Pass Tucker (k = 20, r = 10); One-Pass Tucker (k = 20, r = 10); One-Pass Tucker (k = 300, r = 10).]

Comments: Video data 2200 × 1080 × 1980. Classify scenes using k-means on: 1) the linear sketch along the time dimension, k = 20 (Row 1); 2) the Tucker factor along the time dimension, computed via our two pass (Row 2) and one pass (Row 3) sketching algorithms, (r, k, s) = (10, 20, 41); 3) the Tucker factor along the time dimension, computed via our one pass (Row 4) sketching algorithm, (r, k, s) = (10, 300, 601).


Summary

Streaming Tucker approximation compresses a tensor without storing it.

useful for:

I streaming data

I distributed data

I low memory compute

key ideas:

I form linear sketch of tensor and recover from sketch

I random projection of tensor preserves dominant information


Future work + references

let’s talk!

I bigger tensors to compress?

I streaming compression for 〈your research〉?

references:

I Sun, Y., Guo, Y., Tropp, J. A., and Udell, M. (2018). Tensor random projection for low memory dimension reduction. In NeurIPS Workshop on Relational Representation Learning.

I Sun, Y., Guo, Y., Luo, C., Tropp, J. A., and Udell, M. (2019). Low rank Tucker approximation of a tensor from streaming data. In preparation.

I Tropp, J. A., Yurtsever, A., Udell, M., and Cevher, V. (2019). Streaming low-rank matrix approximation with an application to scientific simulation. Submitted to SISC.


Outline

Applications

Tucker factorization

Sketching

Reconstruction

Numerics


References

Achlioptas, D. (2003). Database-friendly random projections: Johnson–Lindenstrauss with binary coins. Journal of Computer and System Sciences, 66(4), 671–687.

Battaglino, C., Ballard, G., & Kolda, T. G. (2019). Faster parallel Tucker tensor decomposition using randomization.

De Lathauwer, L., De Moor, B., & Vandewalle, J. (2000). A multilinear singular value decomposition. SIAM Journal on Matrix Analysis and Applications, 21(4), 1253–1278.

Halko, N., Martinsson, P.-G., & Tropp, J. A. (2011). Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Review, 53(2), 217–288.

Lapointe, S., Savard, B., & Blanquart, G. (2015). Differential diffusion effects, distributed burning, and local extinctions in high Karlovitz premixed flames. Combustion and Flame, 162(9), 3341–3355.

Li, P., Hastie, T. J., & Church, K. W. (2006). Very sparse random projections. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, (pp. 287–296). ACM.

Li, Y., Nguyen, H. L., & Woodruff, D. P. (2014). Turnstile streaming algorithms might as well be linear sketches. In Proceedings of the Forty-Sixth Annual ACM Symposium on Theory of Computing, (pp. 174–183). ACM.

Malik, O. A. & Becker, S. (2018). Low-rank Tucker decomposition of large tensors using TensorSketch. In Advances in Neural Information Processing Systems, (pp. 10116–10126).

Rudelson, M. (2012). Row products of random matrices. Advances in Mathematics, 231(6), 3199–3231.

Sun, Y., Guo, Y., Luo, C., Tropp, J. A., & Udell, M. (2019). Low rank Tucker approximation of a tensor from streaming data. In preparation.

Sun, Y., Guo, Y., Tropp, J. A., & Udell, M. (2018). Tensor random projection for low memory dimension reduction. In NeurIPS Workshop on Relational Representation Learning.

Tucker, L. R. (1966). Some mathematical notes on three-mode factor analysis. Psychometrika, 31(3), 279–311.

Woolfe, F., Liberty, E., Rokhlin, V., & Tygert, M. (2008). A fast randomized algorithm for the approximation of matrices. Applied and Computational Harmonic Analysis, 25(3), 335–366.

Zhou, G., Cichocki, A., & Xie, S. (2014). Decomposition of big tensors with low multilinear rank. arXiv preprint arXiv:1412.1885.


