Low rank tensor methods for high-dimensional problems

Transcript
Page 1

Low rank tensor methods for high-dimensional problems

Daniel Kressner

MATHICSE / SMA / SB / EPF Lausanne, daniel.kressner@epfl.ch

http://anchp.epfl.ch

Based on joint work with Michael Steinlechner (EPFL), Christine Tobler (MathWorks), André Uschmajew (Bonn), Bart Vandereycken (Princeton)

iTWIST’14

Page 2

Background

- Numerical linear algebra: analysis and numerical solution of matrix eigenvalue problems, matrix equations, ... More recently: tensors.

- High-performance implementations of solvers on shared- and distributed-memory architectures.

- Contributions to LAPACK, ScaLAPACK, ...

Page 3

Goal of this talk

Give a flavor of low-rank tensor techniques for solving problems in data analysis and scientific computing.

Page 4

What is a tensor?

Page 5

What is a tensor?
A tensor is a multi-dimensional array of numbers.

Page 6

What is a tensor?
A tensor is a multi-dimensional array of numbers.

A d-th order tensor X of size n1 × n2 × · · · × nd is a d-dimensional array with entries

X_{i1,i2,...,id},  iµ ∈ {1, . . . , nµ} for µ = 1, . . . , d,

so that X ∈ R^{n1×n2×···×nd}.

Multi-index notation:

I = {1, . . . , n1} × {1, . . . , n2} × · · · × {1, . . . , nd}.

Then i ∈ I is a tuple of d indices:

i = (i1, i2, . . . , id).

This allows writing the entries of X as X_i for i ∈ I.
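In numpy terms, a minimal illustration (note that numpy indices are 0-based, unlike the 1-based convention above):

```python
import numpy as np

# A 4th-order tensor of size 3 x 4 x 5 x 6.
X = np.arange(3 * 4 * 5 * 6, dtype=float).reshape(3, 4, 5, 6)

i = (2, 0, 4, 1)   # a multi-index i = (i1, i2, i3, i4), 0-based
print(X[i])        # the entry X_i
```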

Page 7

What is a tensor?
A tensor is a multi-dimensional array of numbers.

A hyperspectral image is an Nx × Ny × Nfrequencies tensor.

Page 8

What is a tensor?
A tensor is a multi-dimensional array of numbers.

A CT scan is an Nx × Ny × Nslices tensor.

Page 9

What is a tensor?
A tensor is a multi-dimensional array of numbers.

An uncompressed video is an Nx × Ny × Nframes tensor.

Page 10

What is a tensor?
A tensor is a multi-dimensional array of numbers.

Adding contextual information (e.g., the day of the week when a rating was made) leads to (highly incomplete) tensors.

Page 11

What is a tensor?
A tensor is a multi-dimensional array of numbers.

A regular discretization of a function f(ξ1, . . . , ξd) on a hypercube yields a tensor of order d.

Page 12

Data-related tensors: Summary
- Order (dimensionality) of tensor is modest (3, 4, 5).
- Mode sizes can be huge.
- Entries of the tensor are (at least partially) explicitly given.
- Can often afford to store the full tensor.

Cichocki'2014. Era of Big Data Processing: A New Approach via Tensor Networks and Tensor Decompositions. http://arxiv.org/abs/1403.2048

Page 13

Function-related tensors

Page 14

Discretization of a bivariate function
- Bivariate function f(x, y): [xmin, xmax] × [ymin, ymax] → R.
- Function values on the tensor grid [x1, . . . , xm] × [y1, . . . , yn].
- Collected in one long vector:

f = [ f(x1, y1), f(x2, y1), . . . , f(xm, y1), f(x1, y2), f(x2, y2), . . . , f(xm, y2), . . . , f(x1, yn), f(x2, yn), . . . , f(xm, yn) ]^T.

Page 15

Discretization of a bivariate function
- Bivariate function f(x, y): [xmin, xmax] × [ymin, ymax] → R.
- Function values on the tensor grid [x1, . . . , xm] × [y1, . . . , yn].
- Collected in a matrix:

F = [ f(x1, y1)  f(x1, y2)  · · ·  f(x1, yn)
      f(x2, y1)  f(x2, y2)  · · ·  f(x2, yn)
        ...        ...              ...
      f(xm, y1)  f(xm, y2)  · · ·  f(xm, yn) ]
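A minimal numpy sketch connecting the two layouts: the long vector of the previous slide is exactly the column-major vectorization of F (the sampled function here is an arbitrary choice):

```python
import numpy as np

# Sample f(x, y) = exp(-(x^2 + y^2)) on a tensor grid (illustrative choice).
m, n = 50, 40
x = np.linspace(-1.0, 1.0, m)
y = np.linspace(-1.0, 1.0, n)
F = np.exp(-(x[:, None]**2 + y[None, :]**2))   # m x n matrix, F[i, j] = f(x_i, y_j)

# The long vector stacks the columns of F: f(x1,y1), ..., f(xm,y1), f(x1,y2), ...
f = F.reshape(-1, order="F")                   # column-major (Fortran) vectorization
assert f[m] == F[0, 1]                         # f(x1, y2) sits right after the first column
```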

Page 16

Discretization of a function f(ξ1, . . . , ξd) on a hypercube yields a tensor of order d.

Page 17

High-dimensional elliptic PDEs: 3D model problem
- Consider

−∆u = f in Ω,  u|∂Ω = 0,

on the unit cube Ω = [0,1]³.
- Discretize on a tensor grid; a uniform grid for simplicity:

ξµ^(j) = jh,  h = 1/(n + 1),  for µ = 1, 2, 3.

- Approximate solution tensor U ∈ R^{n×n×n}:

U_{i1,i2,i3} ≈ u(ξ1^(i1), ξ2^(i2), ξ3^(i3)).

Page 18

High-dimensional elliptic PDEs: Arbitrary dimensions
The finite difference discretization of the model problem

−∆u = f in Ω,  u|∂Ω = 0,

for Ω = [0,1]^d takes the form

( Σ_{j=1}^{d} I ⊗ · · · ⊗ I ⊗ A ⊗ I ⊗ · · · ⊗ I ) u = f.

To obtain such Kronecker structure in general, one needs:
- a tensorized domain;
- a highly structured grid;
- coefficients that can be written/approximated as a sum of separable functions.
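For small n and d this Kronecker sum can be formed explicitly; a minimal numpy sketch (the helper names laplace_1d/laplace_kron are illustrative, not from the talk):

```python
import numpy as np

def laplace_1d(n, h):
    """Standard 1D second-difference matrix (Dirichlet), scaled by 1/h^2."""
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

def laplace_kron(n, d):
    """d-dimensional Laplacian as a Kronecker sum: sum_j I x ... x A x ... x I."""
    h = 1.0 / (n + 1)
    A, I = laplace_1d(n, h), np.eye(n)
    L = np.zeros((n**d, n**d))
    for j in range(d):
        term = np.eye(1)
        for k in range(d):
            term = np.kron(term, A if k == j else I)
        L += term
    return L

L = laplace_kron(n=10, d=3)   # only feasible for tiny n and d - which is the point
```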

Page 19

High-dimensional PDE-eigenvalue problems
PDE-eigenvalue problem:

−∆u(ξ) + V(ξ) u(ξ) = λ u(ξ) in Ω = [0,1]^d,  u(ξ) = 0 on ∂Ω.

Example: Ω = [−10, 2]^d and the Henon-Heiles potential ([Meyer et al. 1990; Raab et al. 2000; Faou et al. 2009]):

V(ξ) = (1/2) Σ_{j=1}^{d} σ_j ξ_j² + Σ_{j=1}^{d−1} ( σ∗ (ξ_j ξ_{j+1}² − ξ_j³/3) + (σ∗²/16) (ξ_j² + ξ_{j+1}²)² ),

with σ_j ≡ 1, σ∗ = 0.2.

Discretization with n = 128 degrees of freedom per dimension for d = 20 dimensions ⇒ the eigenvector has n^d ≈ 10^42 entries.

Page 20

Quantum many-body problems
- Spin-1/2 particles: proton, neutron, electron, quark.
- Two states: spin-up, spin-down.
- The quantum state of each spin is represented by a vector in C² (a spinor).
- The quantum state of a system of d spins is represented by a vector in C^{2^d}.
- Quantum mechanical operators are expressed in terms of the Pauli matrices

Px = [0 1; 1 0],  Py = [0 −i; i 0],  Pz = [1 0; 0 −1].

- Spin Hamiltonian: a sum of Kronecker products of Pauli matrices and identities; each term describes a physical (inter)action of spins.
- The interaction of spins is described by a graph.
- Goal: Compute the ground state of the spin Hamiltonian.

Page 21

Quantum many-body problems
Example: 1D chain of 5 spins with periodic boundary conditions (sites 1–2–3–4–5).

The Hamiltonian describing pairwise interaction between nearest neighbors:

H = Pz ⊗ Pz ⊗ I ⊗ I ⊗ I
  + I ⊗ Pz ⊗ Pz ⊗ I ⊗ I
  + I ⊗ I ⊗ Pz ⊗ Pz ⊗ I
  + I ⊗ I ⊗ I ⊗ Pz ⊗ Pz
  + Pz ⊗ I ⊗ I ⊗ I ⊗ Pz
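A minimal numpy sketch of assembling this H (chain_term is a hypothetical helper; at d = 5 the 32 × 32 matrix is still small enough to form explicitly):

```python
import numpy as np

Pz = np.array([[1.0, 0.0], [0.0, -1.0]])
I2 = np.eye(2)

def chain_term(ops, d):
    """Kronecker product of local 2x2 operators placed on sites 0..d-1."""
    out = np.eye(1)
    for k in range(d):
        out = np.kron(out, ops.get(k, I2))
    return out

d = 5
# Five nearest-neighbor ZZ terms; the modulo closes the periodic chain.
H = sum(chain_term({k: Pz, (k + 1) % d: Pz}, d) for k in range(d))
print(H.shape)  # (32, 32)
```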

Page 22

Quantum many-body problems
- Ising (ZZ) model for a 1D chain of d spins with open boundary conditions:

H = Σ_{k=1}^{d−1} I ⊗ · · · ⊗ I ⊗ Pz ⊗ Pz ⊗ I ⊗ · · · ⊗ I + λ Σ_{k=1}^{d} I ⊗ · · · ⊗ I ⊗ Px ⊗ I ⊗ · · · ⊗ I,

where λ is the ratio between the strength of the magnetic field and the pairwise interactions.
- 1D Heisenberg (XY) model.
- Current research: 2D models.
- More details in:
  - Huckle/Waldherr/Schulte-Herbrüggen: Computations in Quantum Tensor Networks.
  - Schollwöck: The density-matrix renormalization group in the age of matrix product states.

Page 23

Stochastic Automata Networks (SANs)

- 3 stochastic automata A1, A2, A3 with 3 states each.
- The vector x_t^(i) ∈ R³ describes the probabilities of the states (1), (2), (3) of A_i at time t.
- No coupling between automata ⇒ each local transition x_t^(i) ↦ x_{t+1}^(i) is described by a Markov chain:

x_{t+1}^(i) = E_i x_t^(i),

with a stochastic matrix E_i.
- Stationary distribution of A_i = Perron vector of E_i (eigenvector for eigenvalue 1).

Page 24

Stochastic Automata Networks (SANs)

- 3 stochastic automata A1, A2, A3 with 3 states each.
- With coupling between automata, the local transition x_t^(i) ↦ x_{t+1}^(i) is not described by a Markov chain.
- Need to consider all possible combinations of states in (A1, A2, A3):

(1,1,1), (1,1,2), (1,1,3), (1,2,1), (1,2,2), . . .

- The vector x_t ∈ R^{3³} (or the tensor X(t) ∈ R^{3×3×3}) describes the probabilities of the combined states.

Page 25

Stochastic Automata Networks (SANs)
- Transition x_t ↦ x_{t+1} described by a Markov chain:

x_{t+1} = E x_t,

with a large stochastic matrix E.
- Oversimplified example:

E = I ⊗ I ⊗ E1 + I ⊗ E2 ⊗ I + E3 ⊗ I ⊗ I   (local transitions)
  + I ⊗ E21 ⊗ E12                           (interaction between A1, A2)
  + E32 ⊗ E23 ⊗ I                           (interaction between A2, A3)

- Goal: Compute the stationary distribution = Perron vector of E (a sketch follows below).
- More details in:
  - Stewart: Introduction to the Numerical Solution of Markov Chains. Chapter 9.
  - DK and F. Macedo. Low-rank tensor methods for communicating Markov processes. QEST 2014.
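The Kronecker structure allows applying E without ever forming it. A hedged numpy sketch (random stochastic factors E1, E2, E3 as stand-ins for real local transition matrices, the interaction terms dropped, and the sum rescaled by 1/3 so it stays stochastic; plain power iteration stands in for the solvers in the references):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_stochastic(n):
    """Random column-stochastic n x n matrix (illustrative stand-in)."""
    M = rng.random((n, n))
    return M / M.sum(axis=0)

E1, E2, E3 = (random_stochastic(3) for _ in range(3))

def apply_kron3(A, B, C, x):
    """Compute (A kron B kron C) x without forming the Kronecker product."""
    X = x.reshape(A.shape[1], B.shape[1], C.shape[1])
    return np.einsum('ia,jb,kc,abc->ijk', A, B, C, X).reshape(-1)

def apply_E(x):
    I = np.eye(3)
    # Local transitions only; divided by 3 so the result is again stochastic.
    return (apply_kron3(I, I, E1, x) + apply_kron3(I, E2, I, x)
            + apply_kron3(E3, I, I, x)) / 3.0

x = np.full(27, 1.0 / 27)          # uniform starting distribution
for _ in range(500):               # power iteration toward the Perron vector
    x = apply_E(x)
print(x.sum())                     # remains 1: E is column-stochastic
```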

Page 26

High-order correlations of random fields
Karhunen-Loève expansion of a random field on a bounded domain Ω:

f(x, ω) = E[f](x) + Σ_{i=1}^{∞} √λi Yi(ω) Φi(x).

d-point correlation µ_f^d:

µ_f^d(x1, . . . , xd) := E[ Π_{η=1}^{d} ( f(xη, ·) − E[f](xη) ) ].

Combined with the KL expansion:

µ_f^d(x1, . . . , xd) = Σ_{i∈N^d} C_{i1,...,id} ⊗_{η=1}^{d} Φ_{iη}(xη),  C_{i1,...,id} = Π_{ℓ=1}^{∞} λ_ℓ^{m_i(ℓ)/2} E[ Y_ℓ^{m_i(ℓ)} ].

DK, R. Kumar, F. Nobile, and C. Tobler. Low-rank tensor approximation for high-order correlation functions of Gaussian random fields. 2014.

Page 27

Function-related tensors: Summary
- Order (dimensionality) of tensor may become arbitrarily large.
- Mode sizes usually modest.
- Tensor NOT explicitly given.
- Can rarely afford to store the full tensor.

Page 28

Function-related tensors: Summary
- Order (dimensionality) of tensor may become arbitrarily large.
- Mode sizes usually modest.
- Tensor NOT explicitly given.
- Can rarely afford to store the full tensor.

A tensor of order 30 with mode size 10 has 10^30 entries = 8 × 10^12 exabytes of storage!

Global data storage was calculated at 295 exabytes; see http://www.bbc.co.uk/news/technology-12419672.

Page 29

Function-related tensors: Summary
- Order (dimensionality) of tensor may become arbitrarily large.
- Mode sizes usually modest.
- Tensor NOT explicitly given.
- Can rarely afford to store the full tensor.

Alternatives:
- Smarter discretizations (sparse grids, adaptive sparse discretizations, . . .).
- Smarter sampling strategies.
- Model reduction.

[Figure: 4 × 4 grid of index pairs over levels l1, l2.]

Page 30

Low-rank matrices

Page 31

Low-rank approximation
Setting: a matrix X ∈ R^{n×m}, with m and n too large to compute/store X explicitly.
Idea: Replace X by RS^T with R ∈ R^{n×r}, S ∈ R^{m×r}, and r ≪ m, n.

Memory: nm for X versus nr + rm for RS^T.

min{ ‖X − RS^T‖₂ : R ∈ R^{n×r}, S ∈ R^{m×r} } = σ_{r+1},

with the singular values σ1 ≥ σ2 ≥ · · · ≥ σ_{min{m,n}} of X.

Page 32

Construction from the singular value decomposition
SVD: Let X ∈ R^{n×m} and k = min{m, n}. Then there exist orthonormal matrices

U = [u1, u2, . . . , uk] ∈ R^{n×k},  V = [v1, v2, . . . , vk] ∈ R^{m×k},

such that

X = UΣV^T,  Σ = diag(σ1, σ2, . . . , σk).

Choose r ≤ k and partition

X = [U1, U2] [Σ1 0; 0 Σ2] [V1, V2]^T = U1Σ1V1^T + U2Σ2V2^T,  with R := U1Σ1, S^T := V1^T.

Then ‖X − RS^T‖₂ = ‖Σ2‖₂ = σ_{r+1}.

A good low-rank approximation is possible if the singular values decay sufficiently fast.

Also: span(X) ≈ span(R), span(X^T) ≈ span(S^T).
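A minimal numpy illustration of this truncation and of the error identity ‖X − RS^T‖₂ = σ_{r+1} (the test matrix is an arbitrary construction with decaying singular values):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, r = 200, 150, 10

# Matrix with quickly decaying singular values (illustrative construction).
U0, _ = np.linalg.qr(rng.standard_normal((n, n)))
V0, _ = np.linalg.qr(rng.standard_normal((m, m)))
sigma = 2.0 ** -np.arange(min(n, m))
X = (U0[:, :m] * sigma) @ V0.T

U, s, Vt = np.linalg.svd(X, full_matrices=False)
R = U[:, :r] * s[:r]          # R = U1 * Sigma1
St = Vt[:r, :]                # S^T = V1^T

err = np.linalg.norm(X - R @ St, ord=2)
print(err, s[r])              # both equal sigma_{r+1}
```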

Page 33

Singular values of ground state

- Computed the ground state for the modified Henon-Heiles potential for d = 2.
- Reshaped the ground state into a matrix.

[Figure: ground state (left); its singular values (right), decaying from 10^0 toward 10^−20 over indices 0–300.]

⇒ An excellent rank-10 approximation is possible.

Page 34

When to expect good low-rank approximations
Rule of thumb: smoothness helps, but is not always needed.

Page 35

Low-rank tensors

Page 36

CP decomposition
CANDECOMP/PARAFAC (CP):

X = a1 ∘ b1 ∘ c1 + a2 ∘ b2 ∘ c2 + · · · + aR ∘ bR ∘ cR.

[Figure: a third-order tensor as a sum of R rank-one terms a_k ∘ b_k ∘ c_k.]
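In code, a CP representation is just d factor matrices; a minimal numpy sketch of evaluating the sum of R rank-one terms:

```python
import numpy as np

rng = np.random.default_rng(2)
n1, n2, n3, R = 20, 30, 40, 5

# Factor matrices: column k holds a_k, b_k, c_k.
A = rng.standard_normal((n1, R))
B = rng.standard_normal((n2, R))
C = rng.standard_normal((n3, R))

# X = sum_k a_k o b_k o c_k, expanded here to n1*n2*n3 entries...
X = np.einsum('ik,jk,lk->ijl', A, B, C)

# ...but represented by only (n1 + n2 + n3) * R numbers: linear in the order d.
print(X.shape, A.size + B.size + C.size)
```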

Page 37

CP decomposition
- The CP decomposition offers low data complexity; for constant R: linear complexity in d.

Theoretical issues, for tensors of order d ≥ 3:
- the tensor rank R is not upper semi-continuous ⇒ lack of closedness;
- successive rank-1 approximations fail;
- all algorithms are based on optimization techniques (ALS, Gauss-Newton).

[Figure omitted; picture taken from Kolda/Bader'2009.]

Practical issues:
- no SVD-based compression possible;
- CP ignores locality of interactions!

Page 38

Tensor network diagrams

- Introduced by Roger Penrose.
- Heavily used in quantum mechanics (spin networks).

Page 39

This is a scalar γ ∈ R

Page 40

This is a vector x ∈ R^n

Page 41

These are two vectors x, y ∈ R^n

Page 42

This is the inner product of x, y ∈ R^n:

〈x, y〉 = Σ_{i=1}^{n} x_i y_i

Page 43

These are two matrices A, B

Page 44

This is the matrix product C = AB:

C_{ij} = Σ_{k=1}^{r} A_{ik} B_{kj}

Page 45

This is the matrix product C = UΣV^T:

C_{ij} = Σ_{k=1}^{r} Σ_{ℓ=1}^{r} U_{ik} Σ_{kℓ} V_{jℓ}

If r ≪ n: implicit representation of C via the smaller matrices U, V, Σ.

Page 46

This is a tensor X of order 3

Page 47

This is a tensor X of order 3 in Tucker decomposition:

X_{ijk} = Σ_{ℓ1=1}^{r1} Σ_{ℓ2=1}^{r2} Σ_{ℓ3=1}^{r3} C_{ℓ1ℓ2ℓ3} U_{iℓ1} V_{jℓ2} W_{kℓ3}

Implicit representation of X via
- an r1 × r2 × r3 core tensor C;
- an n1 × r1 matrix U spanning the first mode;
- an n2 × r2 matrix V spanning the second mode;
- an n3 × r3 matrix W spanning the third mode.

Page 48

Tucker decomposition & multilinear rank

Reshape the tensor into a matrix by slicing, e.g. for the first dimension:

X(1) ∈ R^{n1×(n2·n3)}

The multilinear rank of a tensor X ∈ R^{n1×n2×n3} is defined by the tuple

r = (r1, r2, r3),  with r_i = rank(X(i)).

Representation of a rank-r tensor (Tucker decomposition):

X = C ×1 U ×2 V ×3 W

with U ∈ R^{n1×r1}, V ∈ R^{n2×r2}, W ∈ R^{n3×r3}, and core tensor C ∈ R^{r1×r2×r3}.

[Diagram: X as the core C contracted with U, V, W.]
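A minimal numpy sketch of the truncated higher-order SVD (HOSVD), one standard way to compute such a Tucker representation from the mode-µ unfoldings; it assumes the full tensor fits in memory, and unfold/hosvd are illustrative helpers:

```python
import numpy as np

def unfold(X, mu):
    """Mode-mu matricization: rows indexed by mode mu, columns by the rest."""
    return np.moveaxis(X, mu, 0).reshape(X.shape[mu], -1)

def hosvd(X, ranks):
    """Truncated HOSVD: factors from unfoldings, core by projection."""
    factors = []
    for mu, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(X, mu), full_matrices=False)
        factors.append(U[:, :r])
    C = X
    for mu, U in enumerate(factors):
        # Apply U^T along mode mu: C <- C x_mu U^T.
        C = np.moveaxis(np.tensordot(U.T, np.moveaxis(C, mu, 0), axes=1), 0, mu)
    return C, factors

X = np.random.default_rng(3).standard_normal((10, 11, 12))
C, (U, V, W) = hosvd(X, ranks=(10, 11, 12))      # full ranks: exact
Xr = np.einsum('abc,ia,jb,kc->ijk', C, U, V, W)  # X = C x1 U x2 V x3 W
print(np.linalg.norm(X - Xr))                    # ~ machine precision
```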

Page 49

This is a tensor X of order 6 in TT decomposition

- TT = tensor train.
- X is implicitly represented by six r_{i−1} × n_i × r_i tensors (cores), with r0 = r6 = 1.
- Quantum mechanics: MPS (matrix product states).
- Introduced in numerical analysis by Oseledets and Tyrtyshnikov.
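A compact numpy sketch of the TT-SVD construction (sequential SVDs of reshapes; a fixed rank cap rmax stands in for the usual tolerance-based truncation):

```python
import numpy as np

def tt_svd(X, rmax):
    """Decompose X (shape n1 x ... x nd) into TT cores via sequential SVDs."""
    dims = X.shape
    cores, r_prev = [], 1
    M = X.reshape(r_prev * dims[0], -1)
    for k in range(len(dims) - 1):
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        r = min(rmax, len(s))
        cores.append(U[:, :r].reshape(r_prev, dims[k], r))
        M = (s[:r, None] * Vt[:r]).reshape(r * dims[k + 1], -1)
        r_prev = r
    cores.append(M.reshape(r_prev, dims[-1], 1))
    return cores

def tt_full(cores):
    """Contract TT cores back to a full tensor (for checking small examples)."""
    T = cores[0]
    for G in cores[1:]:
        T = np.tensordot(T, G, axes=([-1], [0]))
    return T.squeeze(axis=(0, -1))

X = np.random.default_rng(4).standard_normal((4, 5, 6, 7))
cores = tt_svd(X, rmax=max(X.shape) ** 2)   # cap large enough: exact
print(np.linalg.norm(X - tt_full(cores)))   # ~ machine precision
```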

Page 50

Ranks of a tensor in TT decomposition

Splitting the modes into {1, 2, 3} and {4, 5, 6} corresponds to the low-rank factorization

X^(1,2,3) = UV^T,  X^(1,2,3) ∈ R^{n1n2n3 × n4n5n6},  U ∈ R^{n1n2n3 × r3},  V ∈ R^{n4n5n6 × r3}.

Here X^(1,2,3) is a matricization/unfolding/flattening/reshape of X: merge the multi-indices (1,2,3) into row indices and the multi-indices (4,5,6) into column indices.

The ranks of the unfoldings X^(1,...,µ) for µ = 1, . . . , d − 1 are the TT ranks of X.

Page 51

Inner product of two tensors in TT decomposition

- Carrying out the contractions requires O(dnr^4) instead of O(n^d) operations for tensors of order d.
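A minimal sketch of this core-by-core contraction, reusing tt_svd from the sketch above to produce test cores:

```python
import numpy as np

def tt_inner(cores_x, cores_y):
    """Inner product <X, Y> of two tensors given by TT cores (r x n x r')."""
    # M accumulates the contraction over all modes processed so far.
    M = np.ones((1, 1))
    for Gx, Gy in zip(cores_x, cores_y):
        T = np.tensordot(M, Gx, axes=([1], [0]))        # (ry, n, rx')
        M = np.tensordot(Gy, T, axes=([0, 1], [0, 1]))  # (ry', rx')
    return M.item()

# Check against the full inner product on a small example.
rng = np.random.default_rng(5)
X = rng.standard_normal((4, 5, 6))
Y = rng.standard_normal((4, 5, 6))
cx, cy = tt_svd(X, rmax=100), tt_svd(Y, rmax=100)  # tt_svd as sketched above
print(tt_inner(cx, cy), np.sum(X * Y))             # agree
```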

Page 52

Operations with tensors in TT decomposition
Easy:
- inner product, 2-norm;
- multiplication with Kronecker-structured A;
- recompression/truncation;
- (partial) contractions.

Page 53

Operations with tensors in TT decomposition
Easy:
- inner product, 2-norm;
- multiplication with Kronecker-structured A;
- recompression/truncation;
- (partial) contractions.

Hard:
- almost everything else.

Page 54

Operations with tensors in TT decomposition
Easy:
- inner product, 2-norm;
- multiplication with Kronecker-structured A;
- recompression/truncation;
- (partial) contractions.

Hard:
- almost everything else.

Two classes of algorithms for solving (linear algebra) problems:
- iterate and truncate;
- constrain and optimize.

All under the assumption that ranks stay small!

Page 55

TT decomposition of function-related tensors
- When to expect good low-rank approximations?
- Approximation error from separation with respect to x1, . . . , xa:

f(x1, . . . , xa, xa+1, . . . , xd) ≈ Σ_{k=1}^{r} g_k(x1, . . . , xa) h_k(xa+1, . . . , xd)

for a = 1, . . . , d − 1.
- [DK/Tobler'2011]: For analytic functions,

error ≲ exp(−r^{max{1/a, 1/(d−a)}}).

- [Temlyakov'1992, Uschmajew/Schneider'2013]: For f ∈ B^{s,mix},

error ≲ r^{−2s} (log r)^{2s(max{a, d−a}−1)}.

⇒ Smoothness is neither sufficient nor necessary in high dimensions!

A purely algebraic construction [DK/Uschmajew'2014] yields dimension-independent error bounds for linear systems/eigenvalue problems with nearest-neighbor interaction.

Page 56

This is a tensor X of order 16 in PEPS

- PEPS = projected entangled pair states.
- Schuch/Wolf/Verstraete/Cirac'2007: computing the inner product of two PEPS is NP-hard.
- Landsberg/Qi/Ye'2012: PEPS are not Zariski closed.

Page 57

Constrain and optimize: Eigenvalue computation

Page 58

Rayleigh quotients w.r.t. low-rank matrices
Consider a symmetric n² × n² matrix A. Then

λmin(A) = min_{x ≠ 0} 〈x, Ax〉 / 〈x, x〉.

We now...
- reshape the vector x into an n × n matrix X;
- reinterpret Ax as a linear operator A: X ↦ A(X);
- for example, if A = Σ_{k=1}^{s} B_k ⊗ A_k, then

A(X) = Σ_{k=1}^{s} B_k X A_k^T.

Page 59

Rayleigh quotients w.r.t. low-rank matrices
Consider a symmetric n² × n² matrix A. Then

λmin(A) = min_{X ≠ 0} 〈X, A(X)〉 / 〈X, X〉

with the matrix inner product 〈·, ·〉. We now...
- restrict X to low-rank matrices.

Page 60

Rayleigh quotients w.r.t. low-rank matrices
Consider a symmetric n² × n² matrix A. Then

λmin(A) ≈ min_{X = UV^T ≠ 0} 〈X, A(X)〉 / 〈X, X〉.

- Approximation error governed by the low-rank approximability of X.
- Solved by Riemannian optimization techniques or ALS.

Page 61

ALS
ALS for solving

λmin(A) ≈ min_{X = UV^T ≠ 0} 〈X, A(X)〉 / 〈X, X〉.

Initially:
- fix a target rank r;
- choose U ∈ R^{n×r} and V ∈ R^{n×r} randomly, with V having orthonormal columns.

Eigenvalue error λ̃ − λ = 6 × 10³ (λ̃: current approximation); residual = 3 × 10³.

Page 62

ALS
ALS for solving

λmin(A) ≈ min_{X = UV^T ≠ 0} 〈X, A(X)〉 / 〈X, X〉.

Fix V, optimize for U:

〈X, A(X)〉 = vec(UV^T)^T A vec(UV^T) = vec(U)^T (V ⊗ I)^T A (V ⊗ I) vec(U).

⇒ Compute the smallest eigenvalue of the reduced (rn × rn) matrix

(V ⊗ I)^T A (V ⊗ I).

Note: The computation of the reduced matrix benefits from the Kronecker structure of A.
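A small numpy sketch of one such half-step, for an illustrative Kronecker-structured A = B1 ⊗ A1 + B2 ⊗ A2 (A is formed explicitly here only to verify the identity; a real implementation would never build it):

```python
import numpy as np

rng = np.random.default_rng(6)
n, r = 30, 4

def sym(n):
    M = rng.standard_normal((n, n))
    return M + M.T

A1, B1, A2, B2 = sym(n), sym(n), sym(n), sym(n)
A = np.kron(B1, A1) + np.kron(B2, A2)              # symmetric n^2 x n^2 matrix

V, _ = np.linalg.qr(rng.standard_normal((n, r)))   # V with orthonormal columns

# Reduced (rn x rn) matrix (V kron I)^T A (V kron I); the Kronecker structure
# gives (V kron I)^T (B kron A) (V kron I) = (V^T B V) kron A, term by term.
A_red = np.kron(V.T @ B1 @ V, A1) + np.kron(V.T @ B2 @ V, A2)

P = np.kron(V, np.eye(n))
print(np.allclose(A_red, P.T @ A @ P))             # True: identity verified

lam, W = np.linalg.eigh(A_red)                     # ascending eigenvalues
U = W[:, 0].reshape(n, r, order="F")               # smallest Ritz vector -> new U
print(lam[0])
```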

Page 63

ALS
ALS for solving

λmin(A) ≈ min_{X = UV^T ≠ 0} 〈X, A(X)〉 / 〈X, X〉.

Fix V, optimize for U.

λ̃ − λ = 2 × 10³; residual = 2 × 10³.

Page 64

ALS
ALS for solving

λmin(A) ≈ min_{X = UV^T ≠ 0} 〈X, A(X)〉 / 〈X, X〉.

Orthonormalize U; fix U, optimize for V:

〈X, A(X)〉 = vec(UV^T)^T A vec(UV^T) = vec(V^T)^T (I ⊗ U)^T A (I ⊗ U) vec(V^T).

⇒ Compute the smallest eigenvalue of the reduced (rn × rn) matrix

(I ⊗ U)^T A (I ⊗ U).

Note: The computation of the reduced matrix benefits from the Kronecker structure of A.

Page 65

ALS
ALS for solving

λmin(A) ≈ min_{X = UV^T ≠ 0} 〈X, A(X)〉 / 〈X, X〉.

Orthonormalize U; fix U, optimize for V.

λ̃ − λ = 1.5 × 10^−7; residual = 7.7 × 10^−3.

Page 66

ALS
ALS for solving

λmin(A) ≈ min_{X = UV^T ≠ 0} 〈X, A(X)〉 / 〈X, X〉.

Orthonormalize V; fix V, optimize for U.

λ̃ − λ = 1 × 10^−12; residual = 6 × 10^−7.

Page 67

ALS
ALS for solving

λmin(A) ≈ min_{X = UV^T ≠ 0} 〈X, A(X)〉 / 〈X, X〉.

Orthonormalize U; fix U, optimize for V.

λ̃ − λ = 7.6 × 10^−13; residual = 7.2 × 10^−8.

Page 68

ALS for TT decompositions

Originates from quantum mechanics: equivalent to one-site DMRG.

Goal:

min{ 〈X, A(X)〉 / 〈X, X〉 : X ∈ M_r, X ≠ 0 },

where M_r is the set of TT tensors of fixed TT rank r.

ALS: Choose one node t, fix all other nodes, and set the new tensor at node t to minimize the Rayleigh quotient 〈X, A(X)〉 / 〈X, X〉. This is done for all nodes (a sweep), and sweeps are continued until convergence.

Page 69

Numerical experiments: sine potential, d = 10

[Figure: ALS convergence. Left: eigenvalue error (err_lambda) and residual (res) versus execution time, 0–500 s, on a log scale from 10^5 down to 10^−15. Right: number of iterations (nr_iter), between 15 and 45.]

Size = 128^10 ≈ 10^21. Maximal TT rank 40.

Page 70

Numerical experiments: Henon-Heiles, d = 20

[Figure: ALS convergence. Left: eigenvalue error (err_lambda) and residual (res) versus execution time, 0–2500 s, on a log scale from 10^5 down to 10^−15. Right: number of iterations (nr_iter), between 0 and 60.]

Size = 128^20 ≈ 10^42. Maximal TT rank 40.

Page 71

Numerical experiments: 1/‖ξ‖² potential, d = 20

[Figure: ALS convergence. Left: eigenvalue error (err_lambda) and residual (res) versus execution time, 0–1500 s, on a log scale from 10^5 down to 10^−15. Right: number of iterations (nr_iter), between 0 and 30.]

Size = 128^20 ≈ 10^42. Maximal TT rank 30.

Page 72

Summary of ALS

- ALS-based methods are currently state-of-the-art for high-dimensional linear systems/eigenvalue problems.
- Variation 1: (two-site) DMRG joins and optimizes two neighboring nodes simultaneously.
- Variation 2: AMEn [Dolgov/Savostyanov'2013] injects (preconditioned) gradient information in every step of ALS.
- Local convergence analysis of ALS: [Uschmajew'2011], [Rohwedder/Uschmajew'2013].

Page 73

Constrain and optimize: Tensor completion

Page 74

Robust low-rank tensor completion

minimize_X  f(X) := (1/2) ‖P_Ω(X − A)‖²  subject to  X ∈ M_r.

Here P_Ω retains the entries in the sampling set Ω and zeroes the rest (a minimal illustration follows after the list).

Applications:
- completion of multidimensional data, e.g. hyperspectral images, CT scans;
- compression of multivariate functions with singularities;
- non-intrusive methods for stochastic PDEs;
- context-aware recommender systems;
- . . .
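A hedged numpy sketch of this objective (random data tensor and random sampling set as stand-ins):

```python
import numpy as np

rng = np.random.default_rng(7)
shape = (20, 30, 40)
A = rng.standard_normal(shape)          # data tensor (known on Omega only)
mask = rng.random(shape) < 0.1          # Omega: 10% of entries observed

def f(X):
    """Completion objective 0.5 * ||P_Omega(X - A)||_F^2."""
    R = np.where(mask, X - A, 0.0)      # P_Omega zeroes unobserved entries
    return 0.5 * np.sum(R**2)

print(f(np.zeros(shape)))
```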

Page 75

Manifold of low-rank tensors

M_r := { X ∈ R^{n1×···×nd} : rank(X) = r },

dim(M_r) = Π_{j=1}^{d} r_j + Σ_{i=1}^{d} ( r_i n_i − r_i(r_i − 1)/2 ).

- M_r is a smooth manifold; discussed for more general formats in [Holtz/Rohwedder/Schneider'2012], [Uschmajew/Vandereycken'2012].
- Riemannian with the metric induced by the standard inner product 〈X, Y〉 = 〈X(1), Y(1)〉 (sum of element-wise products).

The manifold structure is used in:
- dynamical low-rank approximation [Koch/Lubich'2010], [Arnold/Jahnke'2012], [Lubich/Rohwedder/Schneider/Vandereycken'2012], [Khoromskij/Oseledets/Schneider'2012], . . .
- best multilinear approximation [Eldén/Savas'2009], [Ishteva/Absil/Van Huffel/De Lathauwer'2011], [Curtef/Dirr/Helmke'2012];
- robust tensor completion [da Silva/Herrmann'2013], [DK/Steinlechner/Vandereycken'2013].

Page 76

Riemannian optimization in a nutshell

- Optimize in the direction of the Riemannian gradient.
- Combine different directions using vector transport.

[Figures: retraction; vector transport.]

Page 77

Geometric nonlinear CG for tensor completion

Input: initial guess X0 ∈ M_r.
η0 ← −grad f(X0)
α0 ← argmin_α f(X0 + α η0)
X1 ← R_{X0}(α0 η0)
for i = 1, 2, . . . do
  Compute the gradient: ξi ← grad f(Xi).
  Conjugate direction by the PR+ updating rule: ηi ← −ξi + βi T_{Xi−1→Xi}(ηi−1).
  Initial step size from a linearized line search: αi ← argmin_α f(Xi + α ηi).
  Armijo backtracking for sufficient decrease: find the smallest integer m ≥ 0 such that
    f(Xi) − f(R_{Xi}(2^{−m} αi ηi)) ≥ −10^{−4} · 〈ξi, 2^{−m} αi ηi〉.
  Obtain the next iterate: Xi+1 ← R_{Xi}(2^{−m} αi ηi).
end for

Cost per iteration: O(n r^d + |Ω| r^{d−1}) ops.

Page 78

Reconstruction of a CT scan
199 × 199 × 150 tensor from the scaled CT data set "INCISIX" (taken from the OSIRIX MRI/CT database, www.osirix-viewer.com/datasets/).

[Figures: slice of the original tensor; HOSVD approximation of rank 21; sampled tensor (6.7%); low-rank completion of rank 21.]

Compares very well with existing results w.r.t. low-rank recovery and speed, e.g., [Gandy/Recht/Yamada'2011].

Page 79

Hyperspectral image

A set of photographs (204 × 268 px) taken across a large range of wavelengths; 33 samples from ultraviolet to infrared [image data: Foster et al.'2004]. Stacked into a tensor of size 204 × 268 × 33.

[Figures: 10% sampling of the original hyperspectral image tensor, 16th slice; completed tensor, 16th slice, final rank k = [50, 50, 6].]

Page 80

How many samples?

Matrix case: O(n · log^β n) samples suffice! [Candès/Tao'2009] ⇒ Completion of a tensor by applying matrix completion to a matricization: O(n² log n). Gives an upper bound!

Tensor case: certainly |Ω| ≪ O(n²). In all cases of convergence: exact reconstruction.

For CP: analysis by [Rauhut/Stojanac'2014].

[Figure: log(n) versus log(smallest |Ω| needed to converge); the measurements follow y = 1.2x + 4, with O(n), O(n log n), and O(n²) reference lines for comparison.]

Page 81

Application to parameter-dependent PDEs

Stationary heat equation with piecewise constant heat conductivity σ(x, α):

−∇ · (σ(x, α)∇u) = s in Ω,  u = 0 on ∂Ω,

with
- σ(baking tray) = 1;
- σ(cookie_j) = 1 + α(j).

[Figure: 'cookie' geometry on [0,6]²; mesh with 3900 vertices, 7542 elements, 11441 edges.]

Parameter study:
- n samples per parameter;
- interested in a functional f: u ↦ f(u) ∈ R ⇒ a tensor with n^d entries.

Page 82

Application to parameter-dependent PDEs

[Figure: "Cookie problem, d = 9, n = 10, |Ω| = 3224, final rank = 4"; relative error on the sampling set Ω and on the test set Γ versus iterations (0–150), on a log scale from 10^0 down to 10^−10.]

- 3224 random parameter samples.
- Nonlinear CG on the manifold of TT tensors of rank 4.

[Steinlechner'2014]

Page 83

Summary
- Low-rank tensor techniques emerged during the last five years in numerical analysis.
- Successfully applied to:
  - parameter-dependent / multi-dimensional integrals;
  - electronic structure calculations: Hartree-Fock / DFT;
  - stochastic and parametric PDEs;
  - high-dimensional Boltzmann / chemical master / Fokker-Planck / Schrödinger equations;
  - micromagnetism;
  - rational approximation problems;
  - computational homogenization;
  - computational finance;
  - multivariate regression and machine learning;
  - queuing models;
  - . . .
- For references on these applications, see:
  - L. Grasedyck, DK, Ch. Tobler (2013). A literature survey of low-rank tensor approximation techniques. GAMM-Mitteilungen, 36(1).
  - W. Hackbusch (2012). Tensor Spaces and Numerical Tensor Calculus. Springer.

Page 84

Open problems
- Better a priori approximation results.
- Better understanding of (global) convergence of optimization-based techniques.
- Better sense of the scope of applications.
- Combination with other model reduction techniques.
- Efficient implementation, parallelization.
- . . .

