A t-SVD-based Nuclear Norm with Imaging Applications
Oguz Semerci (1), Ning Hao (2), Misha E. Kilmer (2), Eric Miller (2),
Shuchin Aeron (2), Gregory Ely (2), Zemin Zhang (2)
(1) Schlumberger-Doll Research
(2) Tufts University
Kilmer and Hao’s work supported by NSF-DMS 0914957
Misha E. Kilmer (Tufts University) Tensor Nuclear Norm June 2013 1 / 28
Notation
$\mathcal{A}(i,j,k)$ = element of $\mathcal{A}$ in row $i$, column $j$, tube $k$
Pictured examples: the entry $\mathcal{A}_{4,7,1}$, the column fiber $\mathcal{A}_{:,3,1}$, and the frontal slice $\mathcal{A}_{:,:,3}$.
(Graphics thanks to K. Braman.)
Motivation
The application drives the choice of factorization (e.g. CP, Tucker) and the constraints. Today we are concerned with imaging applications, which are orientation dependent.
Talk builds on:
Closed multiplication operation between two tensors, with factorizations reminiscent of matrix factorizations [K., Martin, Perrone, 2008; K., Martin, 2010; Martin et al., 2012].
View of third-order tensors as operators on matrices [K., Braman, Hoover, Hao, 2013].
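Toward Defining Tensor-Tensor Multiplication
For $\mathcal{A} \in \mathbb{R}^{m \times p \times n}$, let $A_i = \mathcal{A}_{:,:,i}$.
\[
\mathrm{unfold}(\mathcal{A}) =
\begin{bmatrix} A_1 \\ A_2 \\ A_3 \\ \vdots \\ A_n \end{bmatrix}
\in \mathbb{R}^{mn \times p},
\qquad
\mathrm{fold}\left(
\begin{bmatrix} A_1 \\ A_2 \\ A_3 \\ \vdots \\ A_n \end{bmatrix}
\right) = \mathcal{A} \in \mathbb{R}^{m \times p \times n}
\]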
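The unfold and fold operations can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the tensor is stored as an array of shape (m, p, n) with the tube index last; the function names `unfold` and `fold` mirror the slide notation and are not taken from any library.

```python
import numpy as np

def unfold(A):
    """Stack the n frontal slices of an m x p x n tensor into an (m*n) x p matrix."""
    m, p, n = A.shape
    # Move the tube index first so the slices are stacked as A_1, A_2, ..., A_n.
    return A.transpose(2, 0, 1).reshape(n * m, p)

def fold(M, n):
    """Inverse of unfold: split an (m*n) x p matrix back into n frontal slices."""
    mn, p = M.shape
    m = mn // n
    return M.reshape(n, m, p).transpose(1, 2, 0)

A = np.arange(24, dtype=float).reshape(2, 3, 4)   # m = 2, p = 3, n = 4
U = unfold(A)                                      # shape (8, 3)
A_back = fold(U, 4)                                # shape (2, 3, 4)
```

Here `fold` needs the number of slices `n` as an argument, since it cannot be inferred from the matrix alone.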
Block Circulant Matrix
The block circulant matrix generated by $\mathrm{unfold}(\mathcal{A})$ is
\[
\mathrm{circ}(\mathcal{A}) =
\begin{bmatrix}
A_1 & A_n & A_{n-1} & \cdots & A_2 \\
A_2 & A_1 & A_n & \cdots & A_3 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
A_n & A_{n-1} & \cdots & A_2 & A_1
\end{bmatrix}
\]
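As a small sketch of this construction, the block in block row i and block column j (0-based) is frontal slice (i - j) mod n. The function name `circ` follows the slide; the array layout (m, p, n) is an assumption of this example.

```python
import numpy as np

def circ(A):
    """Block circulant matrix of size (m*n) x (p*n) whose first block column is unfold(A)."""
    m, p, n = A.shape
    C = np.zeros((m * n, p * n), dtype=A.dtype)
    for i in range(n):          # block row
        for j in range(n):      # block column
            # Block (i, j) holds frontal slice (i - j) mod n (0-based indexing).
            C[i * m:(i + 1) * m, j * p:(j + 1) * p] = A[:, :, (i - j) % n]
    return C

A = np.arange(12, dtype=float).reshape(2, 2, 3)   # three 2 x 2 frontal slices
C = circ(A)                                        # shape (6, 6)
```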
Block Circulants
A block circulant can be block-diagonalized by a (normalized) DFT in the 2nd dimension:
\[
(F \otimes I)\,\mathrm{circ}(\mathcal{A})\,(F^* \otimes I) =
\begin{bmatrix}
\hat{A}_1 & 0 & \cdots & 0 \\
0 & \hat{A}_2 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & \cdots & 0 & \hat{A}_n
\end{bmatrix}
\]
Conveniently, an FFT along tube fibers of $\mathcal{A}$ gives $\hat{\mathcal{A}}$, whose frontal slices are the $\hat{A}_i$.
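The block-diagonalization can be checked numerically on a small random tensor. The sketch below assumes NumPy's FFT sign convention and builds the normalized DFT matrix explicitly; the sizes are arbitrary and chosen only for illustration.

```python
import numpy as np

def circ(A):
    """Block circulant matrix whose first block column stacks the frontal slices of A."""
    m, p, n = A.shape
    C = np.zeros((m * n, p * n), dtype=A.dtype)
    for i in range(n):
        for j in range(n):
            C[i * m:(i + 1) * m, j * p:(j + 1) * p] = A[:, :, (i - j) % n]
    return C

rng = np.random.default_rng(0)
m, p, n = 2, 3, 4
A = rng.standard_normal((m, p, n))

F = np.fft.fft(np.eye(n)) / np.sqrt(n)            # normalized (unitary) DFT matrix
left = np.kron(F, np.eye(m))                      # F (x) I_m
right = np.kron(F.conj().T, np.eye(p))            # F^* (x) I_p
D = left @ circ(A) @ right                        # expected to be block diagonal

A_hat = np.fft.fft(A, axis=2)                     # FFT along the tube fibers
blocks = [D[k * m:(k + 1) * m, k * p:(k + 1) * p] for k in range(n)]
```

Each entry of `blocks` should agree with the corresponding frontal slice `A_hat[:, :, k]`, so the FFT replaces the explicit Kronecker products in practice.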
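Tensor-Tensor Multiplication
[K., Martin, Perrone '08]: For $\mathcal{A} \in \mathbb{R}^{m \times p \times n}$ and $\mathcal{B} \in \mathbb{R}^{p \times q \times n}$, define the t-product
\[
\mathcal{A} * \mathcal{B} \equiv \mathrm{fold}\bigl(\mathrm{circ}(\mathcal{A}) \cdot \mathrm{unfold}(\mathcal{B})\bigr).
\]
Result is $m \times q \times n$.
Example: $\mathcal{A} \in \mathbb{R}^{m \times p \times 3}$ and $\mathcal{B} \in \mathbb{R}^{p \times q \times 3}$,
\[
\mathcal{A} * \mathcal{B} = \mathrm{fold}\left(
\begin{bmatrix}
A_1 & A_3 & A_2 \\
A_2 & A_1 & A_3 \\
A_3 & A_2 & A_1
\end{bmatrix}
\begin{bmatrix} B_1 \\ B_2 \\ B_3 \end{bmatrix}
\right).
\]
This tensor-tensor multiplication generalizes to higher-order tensors through recursion; see Martin et al., 2012.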
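A minimal sketch of the t-product in NumPy, computed two ways: directly from the block-circulant definition, and slice by slice in the Fourier domain (which relies on the block-diagonalization property). The function names are illustrative and not from any library.

```python
import numpy as np

def t_product_def(A, B):
    """t-product straight from the definition fold(circ(A) @ unfold(B))."""
    m, p, n = A.shape
    q = B.shape[1]
    C = np.zeros((m, q, n))
    for k in range(n):
        for j in range(n):
            # Block row k of circ(A) times block j of unfold(B).
            C[:, :, k] += A[:, :, (k - j) % n] @ B[:, :, j]
    return C

def t_product_fft(A, B):
    """Same product: FFT along tubes, multiply matching frontal slices, inverse FFT."""
    A_hat = np.fft.fft(A, axis=2)
    B_hat = np.fft.fft(B, axis=2)
    C_hat = np.einsum('ipk,pqk->iqk', A_hat, B_hat)
    return np.real(np.fft.ifft(C_hat, axis=2))

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4, 5))
B = rng.standard_normal((4, 2, 5))
C1 = t_product_def(A, B)      # shape (3, 2, 5)
C2 = t_product_fft(A, B)      # agrees with C1 up to rounding
```

The FFT version avoids forming circ(A) and is the one reused in the later sketches.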
More Definitions
Definition
A $1 \times 1 \times n$ tensor is called a tubal scalar. The t-product between tubal scalars is commutative ⇒ the t-product resembles the matrix-matrix product with scalar multiplication replaced by t-product multiplication among tubal scalars.
Definition
The $\ell \times \ell \times n$ identity tensor $\mathcal{I}$ is the tensor whose first frontal slice is the $\ell \times \ell$ identity matrix, and whose other frontal slices are all zeros.
Transpose
Definition
If $\mathcal{A}$ is $\ell \times m \times n$, then $\mathcal{A}^T$ is the $m \times \ell \times n$ tensor obtained by transposing each of the frontal slices and then reversing the order of transposed frontal slices 2 through $n$.
Example
If $\mathcal{A} \in \mathbb{R}^{\ell \times m \times 4}$, then
\[
\mathcal{A}^T = \mathrm{fold}\left(
\begin{bmatrix} A_1^T \\ A_4^T \\ A_3^T \\ A_2^T \end{bmatrix}
\right)
\]
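The tensor transpose can be sketched as follows, again assuming an (l, m, n) array with the tube index last. The name `t_transpose` is illustrative.

```python
import numpy as np

def t_transpose(A):
    """Transpose each frontal slice, then reverse the order of slices 2 through n."""
    At = np.transpose(A, (1, 0, 2))                 # slice-wise transpose
    # Keep slice 1 in place; append slices n, n-1, ..., 2.
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)

rng = np.random.default_rng(2)
A = rng.standard_normal((2, 3, 4))
At = t_transpose(A)                                 # shape (3, 2, 4)
```

For real tensors, each frontal slice of the FFT of `At` is the conjugate transpose of the corresponding slice of the FFT of `A`, which is why this definition pairs naturally with the Fourier-domain computations.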
Orthogonality
Definition
$\mathcal{U} \in \mathbb{R}^{m \times m \times n}$ is orthogonal if $\mathcal{U}^T * \mathcal{U} = \mathcal{I} = \mathcal{U} * \mathcal{U}^T$.
Can show Frobenius norm invariance: $\|\mathcal{U} * \mathcal{A}\|_F = \|\mathcal{A}\|_F$.
The t-SVD
Theorem (K. and Martin, 2011)
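Let $\mathcal{A} \in \mathbb{R}^{\ell \times m \times n}$. Then $\mathcal{A}$ can be factored as
\[
\mathcal{A} = \mathcal{U} * \mathcal{S} * \mathcal{V}^T
\]
where $\mathcal{U}, \mathcal{V}$ are orthogonal $\ell \times \ell \times n$ and $m \times m \times n$, and $\mathcal{S}$ is an $\ell \times m \times n$ f-diagonal tensor. Also, $\mathcal{B} = \mathcal{U}_{:,1:k,:} * \mathcal{S}_{1:k,1:k,:} * (\mathcal{V}_{:,1:k,:})^T$ satisfies
\[
\mathcal{B} = \arg\min_{\tilde{\mathcal{B}} \in M} \|\mathcal{A} - \tilde{\mathcal{B}}\|_F,
\qquad
M = \{\, \mathcal{X} * \mathcal{Y} : \mathcal{X} \in \mathbb{R}^{\ell \times k \times n},\ \mathcal{Y} \in \mathbb{R}^{k \times m \times n} \,\}.
\]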
t-SVD Example
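Let $\mathcal{A}$ be $2 \times 2 \times 2$.
\[
(F \otimes I)\,\mathrm{circ}(\mathcal{A})\,(F^* \otimes I) =
\begin{bmatrix} \hat{A}_1 & 0 \\ 0 & \hat{A}_2 \end{bmatrix}
\]
\[
\begin{bmatrix} \hat{A}_1 & 0 \\ 0 & \hat{A}_2 \end{bmatrix}
=
\begin{bmatrix} \hat{U}_1 & 0 \\ 0 & \hat{U}_2 \end{bmatrix}
\begin{bmatrix}
\hat{\sigma}^{(1)}_1 & 0 & & \\
0 & \hat{\sigma}^{(1)}_2 & & \\
 & & \hat{\sigma}^{(2)}_1 & 0 \\
 & & 0 & \hat{\sigma}^{(2)}_2
\end{bmatrix}
\begin{bmatrix} \hat{V}_1^* & 0 \\ 0 & \hat{V}_2^* \end{bmatrix}
\]
The $\mathcal{U}, \mathcal{S}, \mathcal{V}^T$ are formed by putting the hat matrices as frontal slices, then applying an inverse FFT along tubes.
e.g. $\hat{s}_1 = \begin{bmatrix} \hat{\sigma}^{(1)}_1 \\ \hat{\sigma}^{(2)}_1 \end{bmatrix}$, oriented into the screen.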
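The recipe above (FFT along tubes, a matrix SVD per frontal slice, inverse FFT) can be sketched in NumPy. This is an illustrative implementation under stated assumptions, not the authors' code: it takes SVDs only for the first half of the Fourier slices and fills the rest by complex conjugacy so that the factors come out real, and the helper names `t_product`, `t_transpose`, `t_svd`, and `truncate` are hypothetical.

```python
import numpy as np

def t_product(A, B):
    """t-product computed slice by slice in the Fourier domain."""
    C_hat = np.einsum('ipk,pqk->iqk', np.fft.fft(A, axis=2), np.fft.fft(B, axis=2))
    return np.real(np.fft.ifft(C_hat, axis=2))

def t_transpose(A):
    """Transpose each frontal slice and reverse the order of slices 2..n."""
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)

def t_svd(A):
    """Return real U (l x l x n), f-diagonal S (l x m x n), V (m x m x n) with A = U * S * V^T."""
    l, m, n = A.shape
    r = min(l, m)
    A_hat = np.fft.fft(A, axis=2)
    U_hat = np.zeros((l, l, n), dtype=complex)
    S_hat = np.zeros((l, m, n), dtype=complex)
    V_hat = np.zeros((m, m, n), dtype=complex)
    for k in range(n // 2 + 1):
        Ak = A_hat[:, :, k]
        if k == 0 or 2 * k == n:
            Ak = Ak.real            # these Fourier slices are real for real input
        u, s, vh = np.linalg.svd(Ak)
        U_hat[:, :, k] = u
        V_hat[:, :, k] = vh.conj().T
        S_hat[np.arange(r), np.arange(r), k] = s
    for k in range(n // 2 + 1, n):
        # Remaining slices follow from conjugate symmetry of the FFT of real data.
        U_hat[:, :, k] = U_hat[:, :, n - k].conj()
        S_hat[:, :, k] = S_hat[:, :, n - k]
        V_hat[:, :, k] = V_hat[:, :, n - k].conj()
    to_real = lambda T: np.real(np.fft.ifft(T, axis=2))
    return to_real(U_hat), to_real(S_hat), to_real(V_hat)

def truncate(U, S, V, k):
    """Keep the first k singular tubes: U[:, :k] * S[:k, :k] * V[:, :k]^T."""
    return t_product(t_product(U[:, :k, :], S[:k, :k, :]), t_transpose(V[:, :k, :]))

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3, 5))
U, S, V = t_svd(A)
A_rebuilt = t_product(t_product(U, S), t_transpose(V))
errors = [np.linalg.norm(A - truncate(U, S, V, k)) for k in range(1, 4)]
```

The list `errors` holds the Frobenius-norm error of the truncated factorization for k = 1, 2, 3; it should not increase with k and should vanish (up to rounding) at k = min(l, m).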
Multi-rank
Definition (K.,Braman,Hoover,Hao, 2013)
Let $\mathcal{A} \in \mathbb{R}^{\ell \times m \times n}$. The multi-rank of $\mathcal{A}$ is the length-$n$ vector consisting of the ranks of all the $\hat{A}^{(i)}$ (the frontal slices of $\hat{\mathcal{A}}$), which must be symmetric about the "middle".
Example
$\mathcal{A} \in \mathbb{R}^{2 \times 2 \times 4}$, multi-rank possible: $[i, j, k, j]^T$, $1 \le i, j, k \le 2$.
Example
$\mathcal{A} \in \mathbb{R}^{5 \times 4 \times 3}$, multi-rank possible: $[i, j, j]^T$, $1 \le i \le 4$, $1 \le j \le 4$.
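A short sketch of how the multi-rank could be computed numerically: take the FFT along tubes and compute the matrix rank of each frontal slice. The tolerance `tol` is an assumption of this example, since numerical rank always depends on a cutoff.

```python
import numpy as np

def multi_rank(A, tol=1e-10):
    """Ranks of the frontal slices of the tube-wise FFT of A, as a length-n integer vector."""
    A_hat = np.fft.fft(A, axis=2)
    n = A.shape[2]
    return np.array([np.linalg.matrix_rank(A_hat[:, :, k], tol=tol) for k in range(n)])

rng = np.random.default_rng(0)

# A generic 2 x 2 x 4 tensor: entries 2 and 4 of the multi-rank must coincide.
A = rng.standard_normal((2, 2, 4))
r_generic = multi_rank(A)

# A 5 x 4 x 3 tensor built as a t-product of 5 x 1 x 3 and 1 x 4 x 3 factors,
# formed here directly in the Fourier domain.
X_hat = np.fft.fft(rng.standard_normal((5, 1, 3)), axis=2)
Y_hat = np.fft.fft(rng.standard_normal((1, 4, 3)), axis=2)
B = np.real(np.fft.ifft(np.einsum('ipk,pqk->iqk', X_hat, Y_hat), axis=2))
r_low = multi_rank(B)
```

The symmetry follows from the conjugate symmetry of the FFT of real data: slices k and n - k (0-based) are complex conjugates and so share a rank.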
Tensor Nuclear Norm
If $A$ is an $\ell \times m$, $\ell \ge m$ matrix with singular values $\sigma_i$, the nuclear norm is $\|A\|_\circledast = \sum_{i=1}^{m} \sigma_i$.
However, in the t-SVD, we have singular tubes (the entries of which need not be positive), which sum to a singular tube!
The entries in the $j$th singular tube are the inverse Fourier coefficients of the length-$n$ vector of the $j$th singular values of $\hat{\mathcal{A}}_{:,:,i}$, $i = 1, \dots, n$.
Definition
For $\mathcal{A} \in \mathbb{R}^{\ell \times m \times n}$, our tensor nuclear norm is
\[
\|\mathcal{A}\|_\circledast
= \sum_{i=1}^{\min(\ell,m)} \|\sqrt{n}\, F s_i\|_1
= \sum_{i=1}^{\min(\ell,m)} \sum_{j=1}^{n} \hat{\mathcal{S}}_{i,i,j}.
\]
(Same as the matrix nuclear norm of $\mathrm{circ}(\mathcal{A})$.)
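The definition can be evaluated directly: sum the singular values of every frontal slice of the (unnormalized) tube-wise FFT. The sketch below also builds circ(A) to illustrate the stated equivalence with the matrix nuclear norm; the function names are illustrative.

```python
import numpy as np

def circ(A):
    """Block circulant matrix whose first block column stacks the frontal slices of A."""
    m, p, n = A.shape
    C = np.zeros((m * n, p * n), dtype=A.dtype)
    for i in range(n):
        for j in range(n):
            C[i * m:(i + 1) * m, j * p:(j + 1) * p] = A[:, :, (i - j) % n]
    return C

def tnn(A):
    """Tensor nuclear norm: sum of singular values over all Fourier-domain frontal slices."""
    A_hat = np.fft.fft(A, axis=2)       # unnormalized FFT, i.e. sqrt(n) * F applied to each tube
    return sum(np.linalg.svd(A_hat[:, :, k], compute_uv=False).sum()
               for k in range(A.shape[2]))

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 3, 5))
B = rng.standard_normal((4, 3, 5))
norm_fft = tnn(A)
norm_circ = np.linalg.norm(circ(A), 'nuc')   # matrix nuclear norm of circ(A)
```

The two values `norm_fft` and `norm_circ` should agree up to rounding, and the FFT route is far cheaper than forming the block circulant.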
Tensor Nuclear Norm
Theorem (Semerci,Hao,Kilmer,Miller)
The tensor nuclear norm is a valid norm.
Since the t-SVD extends to higher-order tensors [Martin et al., 2012], the norm does as well.
TNN in Regularization and Optimization
Yes, the t-SVD is orientation dependent, as is the norm. There are applications where this is particularly useful!
Collection of "structurally similar" $m \times n$ images
Video frames (3D, 4D = color); completing missing data
Multi-energy XRay CT
k energy bins.
$\mu(r, E_k) \to X_k \in \mathbb{R}^{N_1 \times N_2}$
$x_k = \mathrm{vec}(X_k)$
$(\phi, t)$ space is discretized into $N_m$ source/detector pairs.
Then $A \in \mathbb{R}^{N_m \times N_p}$, where $[A]_{ij}$ represents the length of the segment of ray $i$ passing through pixel $j$.
Multi-energy XRay CT
[Figure: attenuation µ (1/cm) versus energy (MeV) for cotton, wax, nylon, ethanol, soap, plexiglass, and rubber.]
The log-likelihood function to be optimized, assuming Poisson noise:
\[
L_k(x_k) = \|D_k^{-1/2}(A x_k - m_k)\|_2^2
\]
where $D_k$ is diagonal and $m_k$ is the log of the scaled projection data.
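For concreteness, the data term can be evaluated as a weighted least-squares residual. In this sketch the diagonal of $D_k$ is passed as a vector `d` with positive entries; the toy system matrix and data below are made up purely to exercise the function and do not model a real scanner.

```python
import numpy as np

def data_term(A, x, m, d):
    """Evaluate || D^{-1/2} (A x - m) ||_2^2 with D = diag(d), d > 0."""
    r = (A @ x - m) / np.sqrt(d)     # scale each residual entry by d_i^{-1/2}
    return float(r @ r)

# Hypothetical toy problem: 3 rays, 2 pixels.
A = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0]])
x = np.array([2.0, 4.0])
m = np.array([2.0, 2.0, 5.0])
d = np.array([1.0, 4.0, 1.0])
value = data_term(A, x, m, d)
```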
Regularized Problem
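Let $\mathcal{X}$ be such that $\mathcal{X}_{:,:,k} = X_k$; recall $x_k = \mathrm{vec}(X_k)$.
\[
\min_{\mathcal{X}} \ \sum_{k=1}^{N_3} \bigl( L_k(x_k) + \alpha_k R(x_k) \bigr) + \gamma \|\mathcal{Z}\|_\circledast
\quad \text{subject to } \mathcal{Z} = \mathcal{X}
\]
where $R(\cdot)$ denotes possible additional regularization.
Optimized via an Alternating Direction Method of Multipliers (ADMM) (see Boyd et al. 2011, Bertsekas 1999).
Let $L_\eta(\mathcal{X}, \mathcal{Z}, \mathcal{Y})$ denote the augmented Lagrangian. The updates are
\[
\begin{aligned}
\mathcal{X}^{n+1} &:= \arg\min_{\mathcal{X}} L_\eta(\mathcal{X}, \mathcal{Z}^n, \mathcal{Y}^n), \\
\mathcal{Z}^{n+1} &:= \arg\min_{\mathcal{Z}} L_\eta(\mathcal{X}^{n+1}, \mathcal{Z}, \mathcal{Y}^n), \\
\mathcal{Y}^{n+1} &:= \mathcal{Y}^n + \eta(\mathcal{X}^{n+1} - \mathcal{Z}^{n+1}).
\end{aligned}
\]
The first update decouples; the second is computed via t-SVD "shrinkage".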
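The slides do not write out the augmented Lagrangian. Assuming the standard unscaled ADMM form, which is consistent with the dual update $\mathcal{Y}^{n+1} := \mathcal{Y}^n + \eta(\mathcal{X}^{n+1} - \mathcal{Z}^{n+1})$, it would read as follows; the inner product $\langle \cdot, \cdot \rangle$ is the entrywise one.

```latex
\[
L_\eta(\mathcal{X}, \mathcal{Z}, \mathcal{Y})
= \sum_{k=1}^{N_3} \bigl( L_k(x_k) + \alpha_k R(x_k) \bigr)
+ \gamma \|\mathcal{Z}\|_\circledast
+ \langle \mathcal{Y},\, \mathcal{X} - \mathcal{Z} \rangle
+ \frac{\eta}{2} \|\mathcal{X} - \mathcal{Z}\|_F^2
\]
% Completing the square in the last two terms:
\[
\langle \mathcal{Y},\, \mathcal{X} - \mathcal{Z} \rangle
+ \frac{\eta}{2} \|\mathcal{X} - \mathcal{Z}\|_F^2
= \frac{\eta}{2} \Bigl\| \mathcal{Z} - \bigl( \mathcal{X} + \tfrac{1}{\eta} \mathcal{Y} \bigr) \Bigr\|_F^2
- \frac{1}{2\eta} \|\mathcal{Y}\|_F^2
\]
% so, under this assumed form, the Z-update is a proximal step on the tensor nuclear norm:
\[
\mathcal{Z}^{n+1}
= \arg\min_{\mathcal{Z}} \ \gamma \|\mathcal{Z}\|_\circledast
+ \frac{\eta}{2} \Bigl\| \mathcal{Z} - \bigl( \mathcal{X}^{n+1} + \tfrac{1}{\eta} \mathcal{Y}^n \bigr) \Bigr\|_F^2
\]
```

Under this form the $\mathcal{Z}$-update depends on the data only through $\mathcal{X}^{n+1} + \tfrac{1}{\eta}\mathcal{Y}^n$, which is the tensor whose t-SVD is shrunk.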
t-SVD Shrinkage
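\[
\mathcal{Z}^{n+1} := \arg\min_{\mathcal{Z}} L_\eta(\mathcal{X}^{n+1}, \mathcal{Z}, \mathcal{Y}^n)
\]
Compute $\tfrac{1}{\eta}\mathcal{Y}^n + \mathcal{X}^{n+1} = \mathcal{U} * \mathcal{S} * \mathcal{V}^T$. Recall $\hat{\mathcal{S}}$ contains the $\hat{\Sigma}_k$ for the frontal slices of the transformed tensor.
$\mathcal{Z}^{n+1} := \mathcal{U} * \rho(\mathcal{S}) * \mathcal{V}^T$, where $\rho(\cdot)$ takes the nonzeros in $\hat{\mathcal{S}}$ and replaces each with the difference between it and the threshold $\rho$ if that difference is greater than 0, and with 0 otherwise. The threshold $\rho$ depends on the parameter in the problem.
It has been shown that $\mathcal{Z}^{n+1} := \mathcal{U} * (\mathcal{S} * \mathcal{D}) * \mathcal{V}^T$.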
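The shrinkage step can be sketched as soft-thresholding of the singular values of each Fourier-domain frontal slice. In this illustration `rho` is the threshold applied directly to those Fourier-domain singular values; how it is derived from the problem parameters is not specified here, so it is left as an input. The name `tsvd_shrink` is hypothetical.

```python
import numpy as np

def tsvd_shrink(W, rho):
    """Soft-threshold the Fourier-domain singular values of W by rho and transform back."""
    n = W.shape[2]
    W_hat = np.fft.fft(W, axis=2)
    Z_hat = np.empty_like(W_hat)
    for k in range(n // 2 + 1):
        u, s, vh = np.linalg.svd(W_hat[:, :, k], full_matrices=False)
        # Replace each singular value sigma by max(sigma - rho, 0).
        Z_hat[:, :, k] = (u * np.maximum(s - rho, 0.0)) @ vh
    for k in range(n // 2 + 1, n):
        # Remaining slices follow from conjugate symmetry (real input tensor).
        Z_hat[:, :, k] = Z_hat[:, :, n - k].conj()
    return np.real(np.fft.ifft(Z_hat, axis=2))

rng = np.random.default_rng(4)
W = rng.standard_normal((4, 3, 5))
rho = 0.5
Z = tsvd_shrink(W, rho)
```

Only about half of the slice SVDs are computed; the rest are filled in by conjugacy, which is one way to exploit the Fourier-domain structure for real data.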
Numerical Results
[Figure: two image panels, labeled 25 keV and 85 keV.]
Numerical Results
Figure: Reconstruction results for 25 keV. Panels: FBP, TNN, TV, TNN+TV.
Numerical Results
Figure: Reconstruction results for 85 keV.
Tensor Completion
Given an unknown tensor $\mathcal{M}$ of size $n_1 \times n_2 \times n_3$ and a subset of entries $\{\mathcal{M}_{ijk} : (i,j,k) \in \Omega\}$, where $\Omega$ is an indicator tensor of size $n_1 \times n_2 \times n_3$, recover the entire $\mathcal{M}$:
\[
\min \|\mathcal{X}\|_\circledast \quad \text{subject to } P_\Omega(\mathcal{X}) = P_\Omega(\mathcal{M})
\]
The $(i,j,k)$th component of $P_\Omega(\mathcal{X})$ is equal to $\mathcal{X}_{ijk}$ if $(i,j,k) \in \Omega$ and zero otherwise.
Similar to the previous problem, this can be solved by ADMM with 3 update steps: one that decouples and one that is a shrinkage/thresholding step, followed by the multiplier update.
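A minimal ADMM sketch for the completion problem, under several assumptions that are not from the slides: the splitting $\mathcal{X} = \mathcal{Z}$ with the data constraint enforced in the $\mathcal{X}$-update, a fixed penalty `eta`, a fixed iteration count, and a Fourier-domain threshold of `n3 / eta` (the factor `n3` reflects the unnormalized FFT in the norm definition; any positive threshold only rescales the objective). It is meant to show the three update steps, not to reproduce the reported results, and `eta` and `n_iter` would need tuning in practice.

```python
import numpy as np

def tsvd_shrink(W, rho):
    """Soft-threshold the Fourier-domain singular values of W by rho."""
    W_hat = np.fft.fft(W, axis=2)
    Z_hat = np.empty_like(W_hat)
    for k in range(W.shape[2]):
        u, s, vh = np.linalg.svd(W_hat[:, :, k], full_matrices=False)
        Z_hat[:, :, k] = (u * np.maximum(s - rho, 0.0)) @ vh
    return np.real(np.fft.ifft(Z_hat, axis=2))

def tnn_complete(M_obs, mask, eta=1.0, n_iter=100):
    """ADMM sketch: minimize the TNN of Z subject to X = Z and X matching M_obs on the mask."""
    n3 = M_obs.shape[2]
    Z = np.zeros_like(M_obs)
    Y = np.zeros_like(M_obs)
    for _ in range(n_iter):
        # X-update: decouples entrywise; observed entries are fixed to the data.
        X = np.where(mask, M_obs, Z - Y / eta)
        # Z-update: t-SVD shrinkage of X + Y / eta.
        Z = tsvd_shrink(X + Y / eta, n3 / eta)
        # Multiplier update.
        Y = Y + eta * (X - Z)
    return X

# Toy data: a tensor of low tubal rank built in the Fourier domain, with ~70% of entries observed.
rng = np.random.default_rng(5)
P_hat = np.fft.fft(rng.standard_normal((10, 2, 4)), axis=2)
Q_hat = np.fft.fft(rng.standard_normal((2, 10, 4)), axis=2)
M = np.real(np.fft.ifft(np.einsum('ipk,pqk->iqk', P_hat, Q_hat), axis=2))
mask = rng.random(M.shape) < 0.7
X_rec = tnn_complete(np.where(mask, M, 0.0), mask, eta=1.0, n_iter=50)
```

By construction the returned tensor agrees with the data on the observed entries; the quality of the filled-in entries depends on the sampling rate, the tubal rank, and the parameter choices.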
Numerical Results
TNN minimization, Low Rank Tensor Completion (LRTC) [Liu et al., 2013] based on tensor-n-rank [Gandy et al., 2011], and nuclear norm minimization on the vectorized video data [Cai et al., 2010].
MERL video (with thanks to Dr. Amit Agrawal), Basketball video
Numerical Results
Basketball video data of size 144× 256× 3× 80.
Conclusions
Introduced the notion of a tensor nuclear norm built around the concept of the t-SVD for tensors
Discussed in the 3rd-order case, but generalizes to higher order
The tensor nuclear norm is useful in imaging applications where we can exploit certain features that are orientation dependent
Efficiency in implementations (parallelism); exploit complex conjugacy in the Fourier domain
A different t-SVD and associated factorizations and norm based on a fast trig-transform [Kernfeld et al., 2013]