
JOURNAL OF LATEX CLASS FILES, VOL. XX, NO. X, MONTH YEAR 1

Low Rank Coupled Tensor Ring Completion

Huyan Huang, Yipeng Liu, Senior Member, IEEE, Ce Zhu, Fellow, IEEE

Abstract

Coupled tensor decomposition can reveal the latent data structure with shared factors. In this paper, because of its outstanding performance in some multi-way data processing applications, the tensor ring is used for coupled tensor decomposition by sharing parts of the tensor ring factors of two coupled tensors. The corresponding optimization model for low rank coupled tensor ring completion is developed to let the two tensors effectively help each other in estimating missing components. It is solved by a block coordinate descent algorithm which efficiently solves a series of quadratic problems resulting from the sampling pattern. The excess risk bound for this optimization model shows the theoretical performance enhancement in comparison with other coupled nuclear norm based methods. The proposed method is validated in numerical experiments on synthetic data, and experimental results on real-world data demonstrate its superiority over state-of-the-art methods in terms of recovery accuracy.

Index Terms

tensor network, coupled tensor factorization, block coordinate descent, excess risk bound, permutational Rademacher complexity

I. INTRODUCTION

The tensor is a natural form for representing multi-dimensional data. Multi-way data processing performance can be effectively enhanced by tensor based techniques in comparison with their matrix counterparts [1]. For instance, a color image can be regarded as a 3-order tensor with two spatial modes and one channel mode, and tensor methods can better exploit the coherence in all the modes simultaneously [2], [3].

As Fig. 1 shows, multi-way data can be generated from different sources yet share some of the same modes, and coupled tensors are a good representation for such data. These kinds of coupled tensors widely exist in bioinformatics [4], [5], recommendation systems [6], link prediction [7], [8] and chemometrics [9].

Fig. 1: Illustration of three coupled tensors on mode-1.

During acquisition and transmission, multi-way data can be partly corrupted, and tensor completion can recover the missing entries by low-rank approximation based on various tensor decompositions [3]. Coupled tensor decomposition gives an equivalent representation of multi-way data by a set of small factors, and parts of the factors are shared between coupled signals [10]. The corresponding coupled tensor completion can achieve better performance than individual completion by further exploiting the latent structures on the coupled modes, which indicates that the degrees of freedom of the coupled system are decreased.

Completion methods are mainly divided into two categories. One is the convex approach based on the optimization of low rank inducing nuclear norms, and the other is the non-convex approach based on the optimization of latent factors with a pre-defined tensor rank. Most current coupled tensor completion methods are based on the CANDECOMP/PARAFAC (CP) decomposition and the Tucker (TK) decomposition [11]–[14]. Generalizing the singular value decomposition (SVD) of matrices, the CP decomposition factorizes a D-order tensor into a linear combination of R rank-1 tensors, resulting in DIR parameters, where I is the dimensional size and R is the CP rank [15]. The TK decomposition gives a core tensor mode-multiplied by a number of matrices [16].

The recently proposed tensor ring (TR) decomposition represents a D-order tensor by cyclically contracted 3-order tensor factors of size R×I×R using the matrix product state expression [17]. As shown in Fig. 2a, it has DIR² parameters, where [R; ···; R] is the TR rank. The TR decomposition allows a cyclical shift of factors due to the nature of the trace operator, and reordering the tensor dimensions makes no difference to the decomposition. As a quantum-inspired decomposition, it outperforms the CP and TK decompositions in many applications due to its powerful representation ability [18], [19]. Though

This research is supported by National Natural Science Foundation of China (NSFC, No. 61602091, No. 61571102) and the Sichuan Science and Technology program (No. 2019YFH0008, No. 2018JY0035). The corresponding author is Yipeng Liu.

All the authors are with the School of Information and Communication Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu, 611731, China (email: [email protected], [email protected], [email protected]).

arXiv:2001.02810v3 [cs.LG] 23 Apr 2020


the TR rank is a vector, it is approximately as effective to let all its components take the same value [19], which alleviates the burden of parameter tuning. In [20], TR is used for coupled tensor fusion at different scales. There has been no TR completion with factors directly coupled, as illustrated in Fig. 2b.

(a) A graphical TR representation of a 6-order tensor.

(b) A graphical coupled TR representation where the first 3 modes are shared.

Fig. 2: Illustration of the coupled TR decomposition.

To the best of our knowledge, this is the first attempt to use TR for coupled tensor completion. Given pre-defined TR ranks, the coupled TR decomposition can be formulated as an optimization with respect to the factors such that the deviation of the approximation from the given tensor is minimized. In this paper, we propose low rank coupled tensor ring completion (CTRC), which can be regarded as the coupled TR decomposition for incomplete tensors. The block coordinate descent (BCD) algorithm is used to solve the problem. The computation and storage complexities are analyzed. In numerical experiments, the proposed CTRC-BCD is tested on synthetic data to verify the theoretical analysis, and experiments on real-world data demonstrate that the proposed method outperforms the state-of-the-art ones. The main contributions of this paper are as follows:

1) We propose a coupling model based on the tensor ring. Two tensors are coupled by sharing parts of their TR factors, as shown in Fig. 2b. The coupled TR decomposition calculates its factors by solving a non-linear fitting problem that minimizes the squared error between the estimate and the given tensor. The coupled factors can share either some or all of their entries. The TR ranks need to be pre-defined.

2) The block coordinate descent algorithm is used to solve the coupled TR completion problem. It alternately solves a series of quadratic problems with respect to the latent factors. The Hessian matrix depends on the corresponding sampling pattern, resulting in a simple update scheme.

3) With the newly defined coupled F-norm for coupled tensors, we derive an excess risk bound using the recently proposed permutational Rademacher complexity [21] together with a modern mathematical tool called the brackets method [22], [23]. It indicates that coupled tensors can be recovered at a lower sampling rate than each tensor's individual sampling bound.

4) The proposed method is benchmarked on the user-centered collaborative location and activity filtering (UCLAF), short-wave near-infrared spectrum (SW-NIR) and visual-infrared image datasets. It shows improved performance compared with the existing methods.

The rest of this paper is organized as follows. In Section II, we introduce basic notations and preliminaries of tensors, the TR decomposition and its relevant operations. In Section III, we state the CTRC problem and propose our algorithm, along with its computational complexity. We provide an excess risk bound for the coupled TR F-norm model in Section IV. In Section V, we perform a series of numerical experiments to compare the proposed method with the existing ones. Finally, we conclude our work in Section VI.

II. NOTATIONS AND PRELIMINARIES

A. Notations

Throughout the paper, a scalar, a vector, a matrix and a tensor are denoted by a normal letter, a boldfaced lower-case letter, a boldfaced upper-case letter and a calligraphic letter, respectively. For instance, a D-order tensor is denoted as X ∈ R^(I1×···×ID), where Id is the dimensional size of the d-th mode, d = 1, ..., D.

The Frobenius norm of X is defined as the square root of the inner product of the tensor with itself:

‖X‖_F = √⟨X, X⟩ = √( Σ_{i1=1}^{I1} ··· Σ_{iD=1}^{ID} x²_{i1···iD} ). (1)

The projection P_O : R^(I1×···×ID) → R^m projects a tensor onto the support (observation) set O, where

O := {(i1, ..., iD) | the entry at (i1, ..., iD) is observed}. (2)


For example, the formulation for a D-order tensor X is

P_O(X)_{i1···iD} = x_{i1···iD} if (i1, ..., iD) ∈ O, and 0 if (i1, ..., iD) ∉ O. (3)

The Hadamard product ⊛ is an element-wise product. For D-order tensors X and Y, it is given by

(X ⊛ Y)_{i1···iD} = x_{i1···iD} · y_{i1···iD}. (4)
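The projection and the Hadamard product are closely connected: applying P_O is the same as taking a Hadamard product with a binary tensor supported on O, which is exactly the substitution the method section uses later. A minimal NumPy sketch (array sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 5, 6))   # a 3-order tensor
W = rng.random((4, 5, 6)) < 0.3      # binary observation tensor encoding the set O

P_O_X = np.where(W, X, 0.0)          # P_O(X): keep observed entries, zero the rest
assert np.allclose(P_O_X, W * X)     # identical to the Hadamard product W ⊛ X
```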

The d-shifting H-unfolding yields a matrix X_{d,H} ∈ R^(Id×Jd) by permuting X with the order [d, ..., D, 1, ..., d−1] and unfolding along its first H dimensions, where Jd = Π_{n=1, n≠d}^{D} In.
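The d-shifting H-unfolding amounts to a cyclic mode permutation followed by a reshape. A sketch under the definition above (assuming row-major flattening, which the text does not specify):

```python
import numpy as np

def shifting_unfold(X, d, H):
    """d-shifting H-unfolding: permute the modes to [d, ..., D, 1, ..., d-1]
    (0-based: [d-1, ..., D-1, 0, ..., d-2]) and flatten the first H permuted
    modes into rows, the remaining modes into columns."""
    D = X.ndim
    order = list(range(d - 1, D)) + list(range(0, d - 1))
    Xp = np.transpose(X, order)
    rows = int(np.prod(Xp.shape[:H]))
    return Xp.reshape(rows, -1)

X = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)
M = shifting_unfold(X, d=3, H=1)     # X_{3,1} in the paper's notation
assert M.shape == (4, 2 * 3 * 5)     # I_3 rows, J_3 columns
```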

As a natural extension of the traditional F-norm, the coupled F-norm for coupled tensors can be defined as

‖X, Y‖_CF ≜ √( ‖X‖²_F + ‖Y‖²_F ), (5)

whose dual norm is itself. An upper bound of this norm is

Π_{l=1}^{L} ‖U^(l)‖_F · √( Π_{d=L+1}^{D1} ‖U^(d)‖²_F + Π_{d=L+1}^{D2} ‖V^(d)‖²_F ).

The newly defined coupled F-norm can be used as a measure of the estimation accuracy of coupled tensors.
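Definition (5) transcribes directly; the toy inputs below are chosen so the value is easy to check by hand:

```python
import numpy as np

def coupled_fnorm(X, Y):
    """Coupled F-norm ||X, Y||_CF = sqrt(||X||_F^2 + ||Y||_F^2), eq. (5)."""
    return np.sqrt(np.linalg.norm(X) ** 2 + np.linalg.norm(Y) ** 2)

X = np.ones((2, 3))   # ||X||_F^2 = 6
Y = np.ones((5, 2))   # ||Y||_F^2 = 10
assert np.isclose(coupled_fnorm(X, Y), 4.0)   # sqrt(6 + 10) = 4
```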

B. Preliminaries of tensor ring decomposition

The non-canonical TR decomposition factorizes X ∈ R^(I1×···×ID) into D cyclically contracted 3-order tensors [24]:

X(i1, ..., iD) = tr( U^(1)(:, i1, :) ··· U^(D)(:, iD, :) ), (6)

where U^(d) ∈ R^(Rd×Id×Rd+1).

Two methods are introduced for TR decomposition in [25]. The first one is based on the density matrix renormalization group [26]. It first reshapes X into X_{1,1} and applies the SVD to derive X_{1,1} = UΣV. It then reshapes U into the first TR factor and applies the SVD to ΣV; D−1 SVDs are performed in total. This method does not require a pre-defined TR rank and is fast. The second method alternately optimizes one of the TR factors while keeping the others fixed. It repeats the optimization until the relative change ‖X^k − X^(k−1)‖_F / ‖X^(k−1)‖_F or the relative error ‖X^k − X‖_F / ‖X‖_F decreases below a pre-defined threshold. This method requires a pre-defined TR rank, which affects the performance, and it is slower than the first one.
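Definition (6) can be transcribed almost literally with einsum: chain the cores along the bond dimension and close the ring with a trace. A small NumPy sketch (core sizes are arbitrary and `tr_contract` is a hypothetical helper name):

```python
import numpy as np

def tr_contract(cores):
    """Contract TR cores U^(d) of shape (R_d, I_d, R_{d+1}) into the full
    tensor X(i_1,...,i_D) = tr(U^(1)(:,i_1,:) ... U^(D)(:,i_D,:)), eq. (6)."""
    out = cores[0]                          # shape (R_1, I_1, R_2)
    for core in cores[1:]:
        out = np.einsum('a...b,bic->a...ic', out, core)
    return np.einsum('a...a->...', out)     # the trace closes the ring

rng = np.random.default_rng(1)
R, dims = 3, (4, 5, 6)
ranks = [R] * (len(dims) + 1)               # cyclic: R_1 = R_{D+1}
cores = [rng.standard_normal((ranks[d], dims[d], ranks[d + 1]))
         for d in range(len(dims))]
X = tr_contract(cores)
assert X.shape == dims

# spot-check one entry against the trace formula
i = (1, 2, 3)
entry = np.trace(cores[0][:, i[0], :] @ cores[1][:, i[1], :] @ cores[2][:, i[2], :])
assert np.isclose(X[i], entry)
```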

III. METHOD

We use R to denote a tensor ring, and assume the first L TR factors of R1 and R2 are coupled. The operator R(·) denotes the TR contraction, which yields a tensor from a given set of TR factors. Supposing the TR factors of R1 and R2 are {U} = {U^(1), ..., U^(D1)} and {V} = {V^(1), ..., V^(D2)}, respectively, we have R({U}) ∈ R^(I1×···×I_D1) and R({V}) ∈ R^(I′1×···×I′_D2). Given the coupled measurements T1 and T2, the optimization model for CTRC can be formulated as follows:

min_{{U},{V}} (1/2)‖P_O1(R({U})) − P_O1(T1)‖²_2 + (1/2)‖P_O2(R({V})) − P_O2(T2)‖²_2
s. t. U^(l)(1:Γl, :, 1:Γl+1) = V^(l)(1:Γl, :, 1:Γl+1), l = 1, ..., L, (7)

where Γl ∈ [1, min{Rl, R′l}], l = 1, ..., L are the coupled distances, in which [R1; ···; R_D1] and [R′1; ···; R′_D2] are the TR ranks of R({U}) and R({V}), respectively.

A. Algorithm

To solve problem (7), we use the block coordinate descent algorithm [27]. Specifically, it alternately optimizes one block variable U^(d1) (or V^(d2)) while keeping the others fixed, so problem (7) is decomposed into D1 + D2 − L sub-problems.


1) Update of the uncoupled factors of R1: we reformulate problem (7) as

min_{U^(d), d=L+1,...,D1} (1/2)‖P_O1(R({U})) − P_O1(T1)‖²_2. (8)

Substituting W1 ⊛ (·) for P_O1(·), let Ad ∈ R^(Id×RdRd+1), Bd ∈ R^(RdRd+1×Jd) and Cd ∈ R^(Id×Jd) be the unfoldings of U^(d), B_d and T1, respectively, where B_d is computed by contracting all D1 TR factors of R1 except the d-th one. Then we get an equivalent form of (8):

min_{Ad, d=L+1,...,D1} (1/2)‖W1_{d,1} ⊛ (AdBd) − W1_{d,1} ⊛ Cd‖²_F. (9)

Define w^(d)_id = W1_{d,1}(id, :) and a permutation matrix P^(d)_id = e_{S^(d)_id} ∈ R^(I≠d × ‖w^(d)_id‖_0), where e_k is a vector of length I≠d = Π_{t=1}^{D1} It / Id whose entries are all zero except for a one in the k-th position, k ∈ S^(d)_id, and S^(d)_id = { jd | w^(d)_id(jd) = 1 }.

Note that the d-th sub-problem in (9) can be divided into Id sub-sub-problems, in which the row vectors a^(d)_id = Ad(id, :) are taken as the block variables. Reformulating the id-th sub-sub-problem as a quadratic form and setting its first-order derivative to zero, we obtain the optimal solution

a^(d)*_id = −g^(d)_id H^(d)†_id, (10)

where † denotes the Moore-Penrose pseudoinverse, H^(d)_id = B̄^(d)_id B̄^(d)T_id, g^(d)_id = −c̄^(d)_id B̄^(d)T_id, c̄^(d)_id = c^(d)_id P^(d)_id and B̄^(d)_id = Bd P^(d)_id.

The TR factor U^(d) is optimized by performing (10) Id times to solve the d-th sub-problem of (9). The uncoupled TR factors of R1 are updated by optimizing all D1 − L such factors.
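Each row update (10) is an ordinary least-squares problem restricted to the observed columns of that row. A toy sketch with hypothetical small sizes (column selection by a boolean mask plays the role of the permutation matrix P^(d)_id):

```python
import numpy as np

rng = np.random.default_rng(2)
RR, J = 4, 30                       # R_d * R_{d+1} and J_d (hypothetical sizes)
B = rng.standard_normal((RR, J))    # subchain unfolding B_d (all factors but the d-th)
c = rng.standard_normal(J)          # the i_d-th row of the unfolded tensor C_d
w = rng.random(J) < 0.5             # sampling pattern w for this row

B_sel = B[:, w]                     # B̄ = B_d P: keep only observed columns
c_sel = c[w]                        # c̄ = c P

H = B_sel @ B_sel.T                 # Hessian H = B̄ B̄^T
g = -c_sel @ B_sel.T                # linear term g = -c̄ B̄^T
a_opt = -g @ np.linalg.pinv(H)      # closed-form row update a* = -g H†, as in (10)

# sanity check: a_opt solves the masked least-squares problem
a_ref, *_ = np.linalg.lstsq(B_sel.T, c_sel, rcond=None)
assert np.allclose(a_opt, a_ref)
```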

2) Update of the uncoupled factors of R2: this optimization is similar to the update (10) of the uncoupled TR factors of R1, so we omit the derivation and give the solution directly:

a′^(d)*_i′d = −g′^(d)_i′d H′^(d)†_i′d, (11)

where H′^(d)_i′d = B̄′^(d)_i′d B̄′^(d)T_i′d, g′^(d)_i′d = −c̄′^(d)_i′d B̄′^(d)T_i′d, z^(d)_i′d = c̄′^(d)_i′d c̄′^(d)T_i′d, and the symbols with superscript ′ denote the corresponding terms derived from R2.

3) Update of the coupled factors of R1 and R2: we can rewrite problem (7) as follows:

min_{U^(d),V^(d), d=1,...,L} (1/2)‖P_O1(R({U})) − P_O1(T1)‖²_2 + (1/2)‖P_O2(R({V})) − P_O2(T2)‖²_2
s. t. U^(d)(1:Γd, :, 1:Γd+1) = V^(d)(1:Γd, :, 1:Γd+1), d = 1, ..., L. (12)

Let A′d ∈ R^(I′d×R′dR′d+1) be the unfolding of V^(d), C′d ∈ R^(I′d×J′d) be the {d, 1} unfolding of T2 and W′ be the tensor form of P_O2, and define the index sets

Cd = {1, ..., Γd+1, Rd+1 + 1, ..., Rd+1 + Γd+1, ..., ΓdRd+1 + 1, ..., ΓdRd+1 + Γd+1}, d = 1, ..., L,
C′d = {1, ..., Γd+1, R′d+1 + 1, ..., R′d+1 + Γd+1, ..., ΓdR′d+1 + 1, ..., ΓdR′d+1 + Γd+1}, d = 1, ..., L.

Page 5: Low Rank Coupled Tensor Ring Completion · Low Rank Coupled Tensor Ring Completion Huyan Huang, Yipeng Liu, Senior Member, IEEE ... recommendation system [6], link prediction [7],

JOURNAL OF LATEX CLASS FILES, VOL. XX, NO. X, MONTH YEAR 5

An equivalent form of (12) can be obtained as follows:

min_{Ad, A′d, d=1,...,L} (1/2)‖W_{d,1} ⊛ (AdBd) − W_{d,1} ⊛ Cd‖²_F + (1/2)‖W′_{d,1} ⊛ (A′dB′d) − W′_{d,1} ⊛ C′d‖²_F
s. t. Ad(:, Cd) = A′d(:, C′d), d = 1, ..., L, (13)

where the index sets Cd ∈ R^(ΓdΓd+1) and C′d ∈ R^(ΓdΓd+1) indicate which columns are coupled in Ad and A′d, respectively. The id-th sub-sub-problem of the d-th sub-problem of (13) can be treated in the same way as for optimization (9). We split the variables of the coupled factors into three new blocks, defined as

α^(d)_id ≜ Ad(id, Cd) = A′d(id, C′d),
β^(d)_id ≜ Ad(id, {1, ..., RdRd+1} \ Cd),
γ^(d)_id ≜ A′d(id, {1, ..., R′dR′d+1} \ C′d),

and the following equivalent relationships hold:

Ad(id, :) = [α^(d)_id, β^(d)_id] Pd^T,
A′d(id, :) = [α^(d)_id, γ^(d)_id] P′d^T,

where Pd = [e_Cd; e_{{1,...,RdRd+1}\Cd}] and P′d = [e_C′d; e_{{1,...,R′dR′d+1}\C′d}] are permutation matrices. Accordingly, we can get

min_{α^(d)_id, β^(d)_id, γ^(d)_id} (1/2)‖w^(d)_id ⊛ ([α^(d)_id, β^(d)_id] Pd^T Bd) − w^(d)_id ⊛ c^(d)_id‖²_2 + (1/2)‖w′^(d)_id ⊛ ([α^(d)_id, γ^(d)_id] P′d^T B′d) − w′^(d)_id ⊛ c′^(d)_id‖²_2. (14)

Letting H̄^(d)_id = Pd^T H^(d)_id Pd and H̄′^(d)_id = P′d^T H′^(d)_id P′d, we write the Hessian matrices in block form

H̄^(d)_id ≜ [ H^(d)11_id, H^(d)12_id ; H^(d)21_id, H^(d)22_id ],  H̄′^(d)_id ≜ [ H′^(d)11_id, H′^(d)12_id ; H′^(d)21_id, H′^(d)22_id ]

such that

H^(d)11_id ∈ R^(ΓdΓd+1 × ΓdΓd+1),
H^(d)12_id ∈ R^(ΓdΓd+1 × (RdRd+1 − ΓdΓd+1)),
H^(d)21_id ∈ R^((RdRd+1 − ΓdΓd+1) × ΓdΓd+1),
H^(d)22_id ∈ R^((RdRd+1 − ΓdΓd+1) × (RdRd+1 − ΓdΓd+1)),

and similar sizes hold for H′^(d)11_id, H′^(d)12_id, H′^(d)21_id and H′^(d)22_id. Define

ḡ^(d)_id ≜ −c̄^(d)_id B̄^(d)T_id Pd = [ξ^(d)_id, η^(d)_id],
ḡ′^(d)_id ≜ −c̄′^(d)_id B̄′^(d)T_id P′d = [ξ′^(d)_id, η′^(d)_id],

such that ξ^(d)_id, ξ′^(d)_id ∈ R^(ΓdΓd+1), η^(d)_id ∈ R^(RdRd+1 − ΓdΓd+1) and η′^(d)_id ∈ R^(R′dR′d+1 − ΓdΓd+1).


We can deduce the solution (see Appendix A for details) as follows:

[α^(d)_id, β^(d)_id, γ^(d)_id]* = argmin_{α^(d)_id, β^(d)_id, γ^(d)_id} (1/2)[α^(d)_id, β^(d)_id, γ^(d)_id] Ĥ^(d)_id [α^(d)_id, β^(d)_id, γ^(d)_id]^T + [α^(d)_id, β^(d)_id, γ^(d)_id][ξ^(d)_id + ξ′^(d)_id, η^(d)_id, η′^(d)_id]^T + (1/2)z^(d)_id + (1/2)z′^(d)_id
= −ĝ^(d)_id Ĥ^(d)†_id, (15)

where ĝ^(d)_id = [ξ^(d)_id + ξ′^(d)_id, η^(d)_id, η′^(d)_id] and the Hessian matrix is

Ĥ^(d)_id = [ H^(d)11_id + H′^(d)11_id,  H^(d)12_id + H^(d)21T_id,  H′^(d)12_id + H′^(d)21T_id ;
             H^(d)21_id + H^(d)12T_id,  H^(d)22_id,                0 ;
             H′^(d)21_id + H′^(d)12T_id, 0,                        H′^(d)22_id ].

This block coordinate descent method for coupled TR completion is outlined in Algorithm 1.

Algorithm 1 BCD for CTRC

Input: two zero-filled tensors T1 and T2, two binary tensors W1 and W2, the maximal number of iterations K
Output: two recovered tensors X and Y, two sets of TR factors {U} and {V}

1: Initialize the TR factors {U} and {V}
2: for k = 1 to K do
3:   Update the uncoupled TR factors of R1 according to (10)
4:   Update the uncoupled TR factors of R2 according to (11)
5:   Update the coupled TR factors of R1 and R2 according to (15)
6:   Update X = R({U}), Y = R({V})
7:   if converged then
8:     break
9:   end if
10: end for
11: return X, Y, {U}, {V}
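The alternating structure of Algorithm 1 can be illustrated on a heavily simplified analogue: two coupled rank-R matrix factorizations X1 ≈ U A1 and X2 ≈ U A2 that share the factor U, completed from masked observations by exact blockwise least squares. All names and sizes here are hypothetical; this sketches only the BCD pattern (uncoupled updates, then a coupled update that stacks both residual systems), not the tensor-ring algorithm itself:

```python
import numpy as np

rng = np.random.default_rng(3)
I, J, R = 20, 15, 3
U_true = rng.standard_normal((I, R))
X1 = U_true @ rng.standard_normal((R, J))   # two matrices coupled through U
X2 = U_true @ rng.standard_normal((R, J))
W1 = rng.random((I, J)) < 0.6               # observation masks
W2 = rng.random((I, J)) < 0.6

U = rng.standard_normal((I, R))
A1 = rng.standard_normal((R, J))
A2 = rng.standard_normal((R, J))

def loss(U, A1, A2):
    # masked squared residuals, the analogue of the objective in (7)
    return 0.5 * np.sum((W1 * (U @ A1 - X1)) ** 2) + \
           0.5 * np.sum((W2 * (U @ A2 - X2)) ** 2)

def ls_cols(X, W, U):
    """Masked least squares for the columns of A in U @ A ≈ X on mask W."""
    A = np.zeros((U.shape[1], X.shape[1]))
    for j in range(X.shape[1]):
        rows = W[:, j]
        A[:, j] = np.linalg.lstsq(U[rows], X[rows, j], rcond=None)[0]
    return A

init_loss = loss(U, A1, A2)
for _ in range(30):
    A1 = ls_cols(X1, W1, U)                 # update the uncoupled factor of X1
    A2 = ls_cols(X2, W2, U)                 # update the uncoupled factor of X2
    for i in range(I):                      # coupled update: stack both systems
        F = np.concatenate([A1[:, W1[i]], A2[:, W2[i]]], axis=1)
        m = np.concatenate([X1[i, W1[i]], X2[i, W2[i]]])
        U[i] = np.linalg.lstsq(F.T, m, rcond=None)[0]
final_loss = loss(U, A1, A2)                # exact BCD steps never increase the loss
```

Since every block update is an exact minimizer given the other blocks, the objective is monotonically non-increasing, mirroring the convergence behavior of Algorithm 1.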

Note that this algorithm can be easily extended to the case where more than two tensor rings are coupled; only the scheme for updating the coupled components changes in the generalized case. The Hessian matrix is then defined in block form as

Ĥ^(d)_id{1, 1} = Σ_{n=1}^{N} H(n)^(d)11_id,  Ĥ^(d)_id{n, n} = H(n)^(d)22_id (n = 2, ..., N),
Ĥ^(d)_id{1, n} = Ĥ^(d)_id{n, 1}^T = H(n)^(d)12_id + H(n)^(d)21T_id (n = 2, ..., N),
Ĥ^(d)_id{m, n} = 0 (m ≠ n, m ≠ 1, n ≠ 1),

and ĝ^(d)_id = [ Σ_{n=1}^{N} ξ(n)^(d)_id, η(1)^(d)_id, ..., η(N)^(d)_id ], where H̄(n)^(d)_id = P(n)d^T H(n)^(d)_id P(n)d and ḡ(n)^(d)_id = −c̄(n)^(d)_id B̄(n)^(d)T_id P(n)d = [ξ(n)^(d)_id, η(n)^(d)_id].


B. Computational Complexity

Assume all the tensors X1, ..., XN have the same size I1 × ··· × ID and TR rank [R, ..., R], and suppose the number of samples for the n-th tensor is m. The computation of the Hessian matrix H^(d)_id costs O(SR · R⁴ Π_{k=1, k≠d}^{D} Ik) = O(mR⁴/Id), where SR is the sampling rate of Xn. Thus updating the d-th TR factor costs O(mR⁴) and one iteration costs O(mNDR⁴). The computation of H^(d)†_id costs O(R⁶) and the corresponding update of the d-th TR factor costs O(IdR⁶); hence one iteration costs O(NR⁶ Σ_{d=1}^{D} Id). The total computational cost of one iteration of BCD for CTRC is therefore max{ O(mNDR⁴), O(NR⁶ Σ_{d=1}^{D} Id) } = O(mNDR⁴).

IV. EXCESS RISK BOUND

We define lT (·, ·) as the average of the perfect square trinomial l (·, ·) computed on a finite training set T. For conciseexpression of average test error, we use notation lT ({X ,Y} , {T1, T2}) to denote the average training error over T, where werefer to T ⊆ O as the union of T1 ⊆ O1 and T1 ⊆ O2. Similarly, we can define lS (X ,Y) as the average test error measuredby l (·, ·) over S ⊆ O⊥. As in [21], we assume that |Si| = |Ti| for any i ∈ {1, 2}.

Given an assumption that X = R ({U}) with TR rank [R, . . . , R] and each TR factor is a independent Gaussian randomtensor with zero mean and variance of σ2, we can define a hypothesis classH ,

{X ,Y | U (d) ∼ N

(0, σ2

),V(d) ∼ N

(0, σ2

)}.

Without loss of generality, we assume l (·, ·) is Λ-Lipschitz continuous since the F-norms of two tensors are centralized withoverwhelming probability.

By leveraging the recently proposed permutational Rademacher complexity [21], the following theorem characterizes theexcess risk of coupled TR completion.

Theorem 1. Under the hypothesis class H mentioned before, the excess risk of the coupled TR completion (7) is bounded as

lS({X, Y}, {T1, T2}) − lT({X, Y}, {T1, T2}) ≤ Λ (1 + 2/(√(2π|T|) − 2)) · ((2σ)^(D1/2) / √|T1|) · (Γ^D1((IR²+1)/2) / Γ^D1(IR²/2)) · _{D2+1−L}F_{D1−L}( −1/2, IR²/2, ..., IR²/2 ; 1−IR²/2, ..., 1−IR²/2 ; (−1)^(D1+1−L) |T1|(2σ)^D2 / (|T2|(2σ)^D1) ) + √( 2|T ∪ S| ln(1/δ) / (|T ∪ S| − 1/2)² ) (16)

for D1 ≥ D2, and

lS({X, Y}, {T1, T2}) − lT({X, Y}, {T1, T2}) ≤ Λ (1 + 2/(√(2π|T|) − 2)) · ((2σ)^(D2/2) / √|T2|) · (Γ^D2((IR²+1)/2) / Γ^D2(IR²/2)) · _{D1+1−L}F_{D2−L}( −1/2, IR²/2, ..., IR²/2 ; 1−IR²/2, ..., 1−IR²/2 ; (−1)^(D2+1−L) |T2|(2σ)^D1 / (|T1|(2σ)^D2) ) + √( 2|T ∪ S| ln(1/δ) / (|T ∪ S| − 1/2)² ) (17)

for D2 ≥ D1, each with probability at least 1 − δ, where Γ(·) is the gamma function and pFq is the generalized hypergeometric function. Moreover, with the same probability, the excess risk of each individual TR completion is bounded by

Λ (1 + 2/(√(2π|Tn|) − 2)) · ((2σ)^(Dn/2) / √|Tn|) · (Γ^Dn((IR²+1)/2) / Γ^Dn(IR²/2)) + √( 2|Tn ∪ Sn| ln(1/δ) / (|Tn ∪ Sn| − 1/2)² ), n = 1, 2. (18)

Theorem 1 shows that the risk bound of coupled completion can be much lower than that of individual completion. Since these expressions are multiplicative, it suffices to illustrate the relationship by comparing each term. The term √|T| in the denominator is larger than each single part. Note that the hypergeometric function approaches 1 if


its input argument approaches 0. Supposing D1 ≥ D2 without loss of generality, this function can yield a number close to 1 by choosing |T1| and |T2| such that |T1| is sufficiently smaller than |T2|. Thus the risk bound (16) or (17) is less than the sum of the bounds in (18).

To discuss the effect of these parameters on the risk bound, note that E[√X] ≤ √(E[X]); a supremum of the risk bound (following from Appendix B) is

Λ (1 + 2/(√(2π|T|) − 2)) ·  (19)
(2σ)^(Dn/2) · (Γ^L((IR²+1)/2) / Γ^L(IR²/2)) · √( (σIR²)^(D1−L)/|T1| + (σIR²)^(D2−L)/|T2| ). (20)

Therefore, increasing |T1| or |T2| or both is beneficial to the recovery performance, while increments of I, R, D1 and D2 are unfavorable. To examine the effect of L, let f(L) be the term in (20); we have

f′(L) = (1/2) f(L) [ ln( 2 Γ²((k+1)/2) / Γ²(k/2) ) − ln(k) ],

where k = IR². For a Chi-distributed variable X with k degrees of freedom, we have E²[X] ≤ E[X²], and thus f′(L) < 0. This means that increasing the number of coupled dimensions improves the recovery performance.

This coupled completion can also be regarded from the viewpoint of mutual information. Two tensor rings have no mutual information if they are not coupled, so they cannot help each other's recovery. On the other hand, the mutual information becomes the differential entropy if they are totally coupled; the amount of information then reaches its maximum, which results in the best recovery performance. The information transfer takes place in the summation of the Hessian matrices in Algorithm 1, since the amplitude ratio of the matrices is |T1|/|T2|.

As a comparison, the bound for our coupled F-norm is on the order of I^(D/2), which is lower than the bound O( I^((D+1)/2) ln^(D−1/2)(I) ) in [28] obtained by bounding the nuclear norm of the coupled tensor.

V. NUMERICAL EXPERIMENTS

In this section, the proposed algorithm is evaluated on two kinds of datasets, i.e., synthetic data and real-world data. The synthetic dataset is employed to verify the theoretical results, and real-world experiments are used to test the empirical performance of the proposed CTRC against three other state-of-the-art algorithms, including the coupled nuclear norm minimization for coupled tensor completion (CNN) [28], advanced coupled matrix and tensor factorization (ACMTF) [12], [13] and structured data fusion by nonlinear least squares (SDF) [14]. The low rank tensor completion via alternating least squares (TR-ALS) [19] is also compared as a baseline, since it can be regarded as individual tensor ring completion.

The root mean square error (RMSE), defined as RMSE = ‖X̂ − X‖_F / √|X|, is used to measure the completion accuracy, where X is the ground truth, X̂ is the estimate of X and |X| is the number of entries. We use CPU time (in seconds) as a measure of algorithmic complexity.

The sampling rate (SR) is defined as the ratio of the number of samples to the total number of elements of the tensor X, i.e., SR = |O|/|X|. For fair comparison, the parameters of each algorithm are tuned to give optimal performance. For the proposed BCD for CTRC, one of the stopping criteria is that the relative change RC = ‖X^k − X^(k−1)‖_F / ‖X^(k−1)‖_F falls below a tolerance set to 1 × 10⁻⁸. We set the maximal number of iterations to K = 200 in the experiments on synthetic data and K = 100 in the experiments on real-world data.
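The three evaluation quantities of this section, the RMSE, the sampling rate and the relative-change stopping criterion, can be transcribed directly (here X̂ is passed as `X_hat` and the support O as a boolean mask; these names are our own):

```python
import numpy as np

def rmse(X_hat, X):
    """RMSE = ||X_hat - X||_F / sqrt(|X|), with |X| the number of entries."""
    return np.linalg.norm(X_hat - X) / np.sqrt(X.size)

def sampling_rate(O_mask):
    """SR = |O| / |X| for a boolean observation mask."""
    return O_mask.sum() / O_mask.size

def relative_change(X_k, X_prev):
    """Stopping criterion RC = ||X^k - X^(k-1)||_F / ||X^(k-1)||_F."""
    return np.linalg.norm(X_k - X_prev) / np.linalg.norm(X_prev)

X = np.ones((10, 10, 10))
assert np.isclose(rmse(X + 0.1, X), 0.1)
assert np.isclose(sampling_rate(np.ones((4, 5), dtype=bool)), 1.0)
assert np.isclose(relative_change(1.1 * X, X), 0.1)
```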

All the experiments are conducted in MATLAB 9.7.0 on a computer with a 2.8 GHz Intel Core i7 CPU and 16 GB RAM.

A. Synthetic Data

In this section, we test our algorithm on randomly generated tensor data. We generate two tensors of the same size 20 × 20 × 20 × 20 using the TR decomposition (6). The TR factors are randomly sampled from the standard normal distribution, i.e., U^(d)(rd, id, rd+1) ∼ N(0, 1) and V^(d)(rd, id, rd+1) ∼ N(0, 1), d = 1, ..., 4. Then we couple the two tensor rings by setting U^(d) = V^(d), d = 1, ..., 3, and compute the tensors T1 and T2 from these factors.
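The synthetic setup above can be sketched as follows, reusing a minimal TR contraction of (6); coupling the rings amounts to aliasing the first L cores (`tr_contract` is an assumed helper name):

```python
import numpy as np

rng = np.random.default_rng(4)
D, I, R, L = 4, 20, 4, 3                     # 4-order tensors of size 20^4, L = 3 coupled factors

def tr_contract(cores):
    """Minimal TR contraction of eq. (6): chain the cores, close with a trace."""
    out = cores[0]
    for core in cores[1:]:
        out = np.einsum('a...b,bic->a...ic', out, core)
    return np.einsum('a...a->...', out)

U = [rng.standard_normal((R, I, R)) for _ in range(D)]   # U^(d) ~ N(0, 1)
V = [rng.standard_normal((R, I, R)) for _ in range(D)]
V[:L] = U[:L]                                 # couple the first L TR factors

T1, T2 = tr_contract(U), tr_contract(V)
assert T1.shape == T2.shape == (I,) * D
assert not np.allclose(T1, T2)                # the uncoupled 4th factor differs
```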

The proposed algorithm's performance is evaluated by the phase transition on TR rank versus the sampling rate of tensor T1 under different settings of the sampling rate of tensor T2 and the number of coupled TR factors. The sampling rate of T1 ranges from 0.005 to 0.1 with interval 0.005, and the sampling rate of T2 ranges from 0.05 to 0.2 with interval 0.05. The TR rank varies from 2 to 8, and the number of coupled TR factors is 1, 2 or 3.

The results are shown in Fig. 3, where Dimc denotes the number of coupled TR factors, and SR1 and SR2 denote the sampling rates of $\mathcal{T}_1$ and $\mathcal{T}_2$, respectively. In the phase transition maps, a white patch indicates a successful recovery, i.e., an RMSE less than $1\times 10^{-6}$, and a black patch indicates a failure. The successful area grows as the sampling rate of $\mathcal{T}_2$ or the number of coupled TR factors increases. This is because the first tensor ring benefits from the second one as the latter's sampling rate or number of coupled factors grows, even when the recovery of $\mathcal{T}_1$ is beyond its own sampling limit. Numerically, the magnitude of the singular values of the Hessian matrix $\mathbf{H}'_{i_d}$ is $\mathrm{SR2}/\mathrm{SR1}$ times that of the singular values of $\mathbf{H}_{i_d}$, hence $\mathbf{H}'_{i_d}$ dominates the updating scheme.
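The phase-transition maps in Fig. 3 are obtained by thresholding a grid of RMSE values (a sketch with made-up numbers):

```python
import numpy as np

# hypothetical RMSE grid indexed by (TR rank, SR1); real values come from repeated runs
rmse_grid = np.array([[1e-9, 2e-3],
                      [5e-7, 1e-1]])
success = rmse_grid < 1e-6   # True -> white patch (exact recovery), False -> black
```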

(a) The phase transition of an individual completion. (b) The phase transition on TR rank versus sampling rate of tensor T1 with various sampling rates of tensor T2 and numbers of coupled TR factors.

Fig. 3: The exact recovery result of randomly generated data with random sampling.

B. The UCLAF Data

In this subsection, the user-centered collaborative location and activity filtering (UCLAF) data [29] is used. It comprises 164 users' GPS trajectories over 168 locations with 5 activity annotations. In the experiment, we only use the link activity, i.e., any entry greater than 0 is set to 1. Users with no data are discarded, which results in a tensor of size $144\times 168\times 5$. The user-location matrix of size $144\times 168$ is coupled with this tensor as side information. We randomly choose 50% of the samples from the tensor and the matrix independently, and each algorithm is run 10 times to avoid fortuitous results.
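The preprocessing can be sketched as follows (synthetic stand-in data; the Poisson counts only imitate the sparsity of the real UCLAF tensor):

```python
import numpy as np

rng = np.random.default_rng(0)
# synthetic stand-in for the UCLAF user-location-activity counts (164 x 168 x 5)
raw = rng.poisson(0.05, size=(164, 168, 5))
T = (raw > 0).astype(float)                # keep only the link activity (binarize)
keep = T.reshape(164, -1).sum(axis=1) > 0  # discard users with no data
T = T[keep]
omega = rng.random(T.shape) < 0.5          # 50% of entries observed, independently
```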


Fig. 4: The completion result of the UCLAF data derived by five algorithms. Panel (a) shows the RMSE comparison and panel (b) the elapsed CPU time.

Fig. 4 shows the average completion accuracy versus tensor rank for the five algorithms, where the accuracy is measured by RMSE. The label "TR" denotes the individual tensor ring completion method, which serves as a baseline for the coupled tensor ring completion methods. The results show that the coupled completion methods perform better than the individual one. The proposed CTRC yields lower RMSEs in both user-loc-act tensor completion and user-loc matrix completion, which illustrates that the coupled TR's F-norm can lead to better performance than the other coupled norms.


C. The SW-NIR Data

A dataset consisting of short-wave near-infrared (SW-NIR) spectra measured on an HP diode array spectrometer is used in this subsection [30]. It was originally collected to trace the influence of temperature on the vibrational spectrum and the consequences for the predictive ability of multivariate calibration models. The spectra of 19 mixtures of ethanol, water and isopropanol, together with the pure compounds, are recorded in a 1 cm cuvette at different temperatures, i.e., 30, 40, 50, 60 and 70 degrees Celsius. We stack the spectra recorded at different temperatures along the third dimension, forming a tensor of size $512\times 22\times 5$. The coupled matrix is derived by stacking the temperature records in a similar way, which results in a matrix of size $22\times 5$. We randomly choose 50% of the entries from the tensor and the matrix independently, and each algorithm is repeated 10 times.
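The stacking step can be sketched as follows (random stand-in spectra; only the shapes match the description):

```python
import numpy as np

rng = np.random.default_rng(1)
temps = [30.0, 40.0, 50.0, 60.0, 70.0]
# random stand-in spectra: 512 wavelengths x 22 samples per temperature
spectra = {t: rng.random((512, 22)) for t in temps}
X = np.stack([spectra[t] for t in temps], axis=2)  # spectrum tensor, 512 x 22 x 5
Y = np.tile(np.asarray(temps), (22, 1))            # coupled temperature matrix, 22 x 5
```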


Fig. 5: The completion result of the SW-NIR data derived by five algorithms. Panel (a) shows the RMSE comparison and panel (b) the elapsed CPU time.

Fig. 5 provides the completion results. The proposed CTRC yields the lowest RMSE for both the spectrum tensor and the temperature matrix when the TR rank is 2. The performance deteriorates when the TR rank becomes larger, which may imply that overfitting occurs [19]. The individual TR completion method performs worse than the coupled ones, which indicates the effectiveness of our method.

D. The Visual-Infrared Images

The dataset called "Reek" is used in this subsection [31]. It consists of a visual image and its infrared copy, as shown in Fig. 6, where the left one is an RGB image and the right one is an infrared image. Both are downsampled to a resolution of $240\times 320$, since the original size is too large for the experiment. We randomly select 50% of the entries from each image, and each algorithm is run 10 times independently.

Fig. 6: Display of the Reek data. The left figure is the RGB image and the right one is its infrared observation.

It can be seen from Fig. 7 that CTRC yields lower RMSE than the other coupled completion methods, though the uncoupled completion method produces a similar RMSE, which suggests that this dataset lacks a strong latent coupled structure.

VI. CONCLUSION

This paper proposes to use the tensor ring for coupled tensor completion, and a block coordinate descent algorithm is developed for it. We provide an excess risk bound for the proposed method, which implies that the sampling complexity can be reduced. The numerical results confirm the theoretical analysis and demonstrate the completion performance improvement in comparison with the state-of-the-art methods.


Fig. 7: The completion result of the Reek visual-infrared data derived by five algorithms. Panel (a) shows the RMSE comparison and panel (b) the elapsed CPU time.

APPENDIX A
OPTIMIZATION ON COUPLED TR FACTORS OF R1 AND R2

To solve problem (14), we calculate the second-order partial derivatives of the objective function with respect to $\alpha^{(d)}_{i_d}$, $\beta^{(d)}_{i_d}$ and $\gamma^{(d)}_{i_d}$, respectively. We write the objective function as
$$
f^{(d)}_{i_d} = \frac{1}{2}\left[\alpha^{(d)}_{i_d},\beta^{(d)}_{i_d}\right]\mathbf{H}^{(d)}_{i_d}\left[\alpha^{(d)}_{i_d},\beta^{(d)}_{i_d}\right]^{\mathrm{T}} - \left[\alpha^{(d)}_{i_d},\beta^{(d)}_{i_d}\right]\mathbf{P}^{\mathrm{T}}_{d}\mathbf{B}_{d}\mathbf{c}^{(d)\mathrm{T}}_{i_d} + \frac{1}{2}\mathbf{c}^{(d)}_{i_d}\mathbf{c}^{(d)\mathrm{T}}_{i_d} + \frac{1}{2}\left[\alpha^{(d)}_{i_d},\gamma^{(d)}_{i_d}\right]\mathbf{H}'^{(d)}_{i_d}\left[\alpha^{(d)}_{i_d},\gamma^{(d)}_{i_d}\right]^{\mathrm{T}} - \left[\alpha^{(d)}_{i_d},\gamma^{(d)}_{i_d}\right]\mathbf{P}'^{\mathrm{T}}_{d}\mathbf{B}'_{d}\mathbf{c}'^{(d)\mathrm{T}}_{i_d} + \frac{1}{2}\mathbf{c}'^{(d)}_{i_d}\mathbf{c}'^{(d)\mathrm{T}}_{i_d},
$$
then there is
$$
\begin{aligned}
\frac{\partial f_{i_d}}{\partial \alpha^{(d)}_{i_d}} = \frac{\partial}{\partial \alpha^{(d)}_{i_d}}\Big\{ &\frac{1}{2}\alpha^{(d)}_{i_d}\big(\mathbf{H}^{(d)11}_{i_d}+\mathbf{H}'^{(d)11}_{i_d}\big)\alpha^{(d)\mathrm{T}}_{i_d} + \alpha^{(d)}_{i_d}\Big[\beta^{(d)}_{i_d}\big(\mathbf{H}^{(d)21}_{i_d}+\mathbf{H}^{(d)12\mathrm{T}}_{i_d}\big) + \gamma^{(d)}_{i_d}\big(\mathbf{H}'^{(d)21}_{i_d}+\mathbf{H}'^{(d)12\mathrm{T}}_{i_d}\big) + \xi^{(d)}_{i_d} + \xi'^{(d)}_{i_d}\Big]^{\mathrm{T}} \\
&+ \frac{1}{2}\beta^{(d)}_{i_d}\mathbf{H}^{(d)22}_{i_d}\beta^{(d)\mathrm{T}}_{i_d} + \frac{1}{2}\gamma^{(d)}_{i_d}\mathbf{H}'^{(d)22}_{i_d}\gamma^{(d)\mathrm{T}}_{i_d} + \beta^{(d)}_{i_d}\eta^{(d)\mathrm{T}}_{i_d} + \gamma^{(d)}_{i_d}\eta'^{(d)\mathrm{T}}_{i_d} + \frac{1}{2}z^{(d)}_{i_d} + \frac{1}{2}z'^{(d)}_{i_d}\Big\} \\
=\; &\alpha^{(d)}_{i_d}\big(\mathbf{H}^{(d)11}_{i_d}+\mathbf{H}'^{(d)11}_{i_d}\big) + \varrho^{(d)}_{i_d},
\end{aligned}
$$
where
$$
\varrho^{(d)}_{i_d} = \beta^{(d)}_{i_d}\big(\mathbf{H}^{(d)21}_{i_d}+\mathbf{H}^{(d)12\mathrm{T}}_{i_d}\big) + \gamma^{(d)}_{i_d}\big(\mathbf{H}'^{(d)21}_{i_d}+\mathbf{H}'^{(d)12\mathrm{T}}_{i_d}\big) + \xi^{(d)}_{i_d} + \xi'^{(d)}_{i_d}.
$$

Thus we have
$$
\frac{\partial^2 f_{i_d}}{\partial \alpha^{(d)2}_{i_d}} = \mathbf{H}^{(d)11}_{i_d}+\mathbf{H}'^{(d)11}_{i_d}, \quad
\frac{\partial^2 f_{i_d}}{\partial \alpha^{(d)}_{i_d}\partial \beta^{(d)}_{i_d}} = \mathbf{H}^{(d)12}_{i_d}+\mathbf{H}^{(d)21\mathrm{T}}_{i_d}, \quad
\frac{\partial^2 f_{i_d}}{\partial \alpha^{(d)}_{i_d}\partial \gamma^{(d)}_{i_d}} = \mathbf{H}'^{(d)12}_{i_d}+\mathbf{H}'^{(d)21\mathrm{T}}_{i_d}. \tag{21}
$$

Then we deduce
$$
\begin{aligned}
\frac{\partial f_{i_d}}{\partial \beta^{(d)}_{i_d}} = \frac{\partial}{\partial \beta^{(d)}_{i_d}}\Big\{ &\frac{1}{2}\beta^{(d)}_{i_d}\mathbf{H}^{(d)22}_{i_d}\beta^{(d)\mathrm{T}}_{i_d} + \beta^{(d)}_{i_d}\Big[\alpha^{(d)}_{i_d}\big(\mathbf{H}^{(d)12}_{i_d}+\mathbf{H}^{(d)21\mathrm{T}}_{i_d}\big) + \eta^{(d)}_{i_d}\Big]^{\mathrm{T}} + \frac{1}{2}\gamma^{(d)}_{i_d}\mathbf{H}'^{(d)22}_{i_d}\gamma^{(d)\mathrm{T}}_{i_d} \\
&+ \gamma^{(d)}_{i_d}\Big[\alpha^{(d)}_{i_d}\big(\mathbf{H}'^{(d)12}_{i_d}+\mathbf{H}'^{(d)21\mathrm{T}}_{i_d}\big) + \eta'^{(d)}_{i_d}\Big]^{\mathrm{T}} + \frac{1}{2}\alpha^{(d)}_{i_d}\big(\mathbf{H}^{(d)11}_{i_d}+\mathbf{H}'^{(d)11}_{i_d}\big)\alpha^{(d)\mathrm{T}}_{i_d} + \alpha^{(d)}_{i_d}\big(\xi^{(d)}_{i_d}+\xi'^{(d)}_{i_d}\big)^{\mathrm{T}} + \frac{1}{2}z^{(d)}_{i_d} + \frac{1}{2}z'^{(d)}_{i_d}\Big\} \\
=\; &\beta^{(d)}_{i_d}\mathbf{H}^{(d)22}_{i_d} + \vartheta^{(d)}_{i_d},
\end{aligned}
$$
where $\vartheta^{(d)}_{i_d} = \alpha^{(d)}_{i_d}\big(\mathbf{H}^{(d)12}_{i_d}+\mathbf{H}^{(d)21\mathrm{T}}_{i_d}\big) + \eta^{(d)}_{i_d}$. Hence there is
$$
\frac{\partial^2 f_{i_d}}{\partial \beta^{(d)2}_{i_d}} = \mathbf{H}^{(d)22}_{i_d}, \quad
\frac{\partial^2 f_{i_d}}{\partial \beta^{(d)}_{i_d}\partial \alpha^{(d)}_{i_d}} = \mathbf{H}^{(d)21}_{i_d}+\mathbf{H}^{(d)12\mathrm{T}}_{i_d}, \quad
\frac{\partial^2 f_{i_d}}{\partial \beta^{(d)}_{i_d}\partial \gamma^{(d)}_{i_d}} = \mathbf{0}. \tag{22}
$$

Similarly, we derive
$$
\frac{\partial^2 f_{i_d}}{\partial \gamma^{(d)2}_{i_d}} = \mathbf{H}'^{(d)22}_{i_d}, \quad
\frac{\partial^2 f_{i_d}}{\partial \gamma^{(d)}_{i_d}\partial \alpha^{(d)}_{i_d}} = \mathbf{H}'^{(d)21}_{i_d}+\mathbf{H}'^{(d)12\mathrm{T}}_{i_d}, \quad
\frac{\partial^2 f_{i_d}}{\partial \gamma^{(d)}_{i_d}\partial \beta^{(d)}_{i_d}} = \mathbf{0}. \tag{23}
$$

Incorporating (21)–(23), we derive the Hessian matrix
$$
\mathbf{H}_{i_d} =
\begin{bmatrix}
\mathbf{H}^{(d)11}_{i_d}+\mathbf{H}'^{(d)11}_{i_d} & \mathbf{H}^{(d)12}_{i_d}+\mathbf{H}^{(d)21\mathrm{T}}_{i_d} & \mathbf{H}'^{(d)12}_{i_d}+\mathbf{H}'^{(d)21\mathrm{T}}_{i_d} \\
\mathbf{H}^{(d)21}_{i_d}+\mathbf{H}^{(d)12\mathrm{T}}_{i_d} & \mathbf{H}^{(d)22}_{i_d} & \mathbf{0} \\
\mathbf{H}'^{(d)21}_{i_d}+\mathbf{H}'^{(d)12\mathrm{T}}_{i_d} & \mathbf{0} & \mathbf{H}'^{(d)22}_{i_d}
\end{bmatrix}.
$$

APPENDIX B

PROOF OF THEOREM 1

A. The expectation of a linear combination of products of independent variables

Suppose $X_i$, $i = 1, \ldots, m$, and $Y_j$, $j = 1, \ldots, n$, are independent Gamma variables with the same shape parameter $k/2$ and scale parameter $2\sigma$, where $\sigma$ is the standard deviation of all elements of each TR factor. It follows that the density function of $X_i$ is $p(x_i) = x_i^{k/2-1}\exp\left(-x_i/(2\sigma)\right) / \left((2\sigma)^{k/2}\Gamma(k/2)\right)$, where $\Gamma(\cdot)$ is the Gamma function. We assume $0 < \alpha < 1$ and $0 < \beta < 1$; then the expectation of $\sqrt{\alpha\prod_{i=1}^{m} X_i + \beta\prod_{j=1}^{n} Y_j}$ is given by the multiple integral
$$
\int_{0}^{+\infty}\cdots\int_{0}^{+\infty} \sqrt{\alpha\prod_{i=1}^{m} x_i + \beta\prod_{j=1}^{n} y_j}\;\prod_{i=1}^{m} p(x_i)\prod_{j=1}^{n} p(y_j)\,\mathrm{d}x_1\cdots\mathrm{d}x_m\,\mathrm{d}y_1\cdots\mathrm{d}y_n.
$$
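This expectation can be sanity-checked by Monte Carlo simulation before evaluating the integral in closed form (a sketch with arbitrary parameter values; the bound in the comment is our own check based on $\mathbb{E}[X_i] = k\sigma$ and Jensen's inequality):

```python
import numpy as np

rng = np.random.default_rng(0)
k, sigma = 4.0, 0.5                  # assumed shape k/2 and scale 2*sigma
m, n, alpha, beta = 2, 3, 0.3, 0.7   # arbitrary illustration values
X = rng.gamma(shape=k / 2, scale=2 * sigma, size=(200_000, m))
Y = rng.gamma(shape=k / 2, scale=2 * sigma, size=(200_000, n))
est = np.sqrt(alpha * X.prod(axis=1) + beta * Y.prod(axis=1)).mean()
# by Jensen's inequality, est stays below sqrt(alpha*(k*sigma)**m + beta*(k*sigma)**n)
```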


The calculation of this integral is done with the help of the method of brackets, which expands a definite integral over the half line $[0, +\infty)$ as a series consisting of brackets. The notation $\langle a\rangle$ stands for the divergent integral $\int_{0}^{+\infty} x^{a-1}\,\mathrm{d}x$. If a function $f(x)$ admits the formal power series $\sum_{n=0}^{+\infty} a_n x^{\alpha n+\beta-1}$, then the improper integral of $f$ is formalized as $\int_{0}^{+\infty} f(x)\,\mathrm{d}x = \sum_{n=0}^{+\infty} a_n\langle\alpha n+\beta\rangle$. The indicator $\phi_n \triangleq (-1)^n/\Gamma(n+1)$ is used in the series expressions, and the Pochhammer symbol, defined as $(b)_n \triangleq \Gamma(n+b)/\Gamma(b)$, gives a systematic procedure for simplifying the series. In this framework, the exponential function $\exp(-x)$ is represented as $\sum_{n}\phi_n x^n$. Another useful rule is that a multinomial $(x_1+\cdots+x_m)^a$ is expanded as $\sum_{\{n\}}\phi_{\{n\}} x_1^{n_1}\cdots x_m^{n_m}\langle n_1+\cdots+n_m-a\rangle/\Gamma(-a)$.

We start with these two rules: slinging out the terms that do not contain the integration variables, merging the remaining terms and substituting the integrals with brackets, the integral is transformed into
$$
\frac{(2\sigma)^{-\frac{(m+n)k}{2}}}{\Gamma\left(-\frac{1}{2}\right)\Gamma^{m+n}\left(\frac{k}{2}\right)}\sum_{w_1, w_2=0}^{+\infty}\phi_{w_1 w_2}\,\alpha^{w_1}\beta^{w_2}\left\langle w_1+w_2-\frac{1}{2}\right\rangle\prod_{i=1}^{m}\prod_{j=1}^{n}\sum_{p_i}\sum_{q_j}\phi_{p_i q_j}\,(2\sigma)^{p_i+q_j}\left\langle p_i+w_1+\frac{k}{2}\right\rangle\left\langle q_j+w_2+\frac{k}{2}\right\rangle.
$$

Afterwards, we choose $w_1$ and $w_2$ as free variables and eliminate the other brackets. The result shown below follows from the rule that the value assigned to $\sum_{n}\phi_n f(n)\langle cn+d\rangle$ is $f(n^*)\,\Gamma(-n^*)/|c|$, where $n^*$ is obtained from the vanishing of the bracket:
$$
\frac{1}{\Gamma\left(-\frac{1}{2}\right)\Gamma^{m+n}\left(\frac{k}{2}\right)}\sum_{w_1, w_2=0}^{+\infty}\phi_{w_1 w_2}\left[\alpha(2\sigma)^{m}\right]^{w_1}\left[\beta(2\sigma)^{n}\right]^{w_2}\Gamma^{m}\left(w_1+\frac{k}{2}\right)\Gamma^{n}\left(w_2+\frac{k}{2}\right)\left\langle w_1+w_2-\frac{1}{2}\right\rangle.
$$

The matrix of coefficients that remains has rank 1; thus the procedure produces two series as candidates for the value of the integral, one per free variable. The simplified formulation derives from the Pochhammer symbols and the transformation $(b)_{-n} = (-1)^n/(1-b)_n$, which can be proved by repeatedly using the recurrence relation $\Gamma(x) = \Gamma(x+1)/x = (x-1)\Gamma(x-1)$. The final result is obtained by introducing the hypergeometric function ${}_pF_q$.

1) Case 1: the variable $w_1$ is free. Plugging $w_2^* = 1/2 - w_1$ into the rule gives
$$
T_1 = \sqrt{\beta(2\sigma)^{n}}\,\frac{\Gamma^{n}\left(\frac{k+1}{2}\right)}{\Gamma^{n}\left(\frac{k}{2}\right)}\;{}_{m+1}F_{n}\left(\left.\begin{matrix}-\frac{1}{2},\frac{k}{2},\ldots,\frac{k}{2}\\ \frac{1-k}{2},\ldots,\frac{1-k}{2}\end{matrix}\,\right|\,(-1)^{n+1}\,\frac{\alpha(2\sigma)^{m}}{\beta(2\sigma)^{n}}\right).
$$

2) Case 2: the variable $w_2$ is free. Plugging $w_1^* = 1/2 - w_2$ into the rule yields
$$
T_2 = \sqrt{\alpha(2\sigma)^{m}}\,\frac{\Gamma^{m}\left(\frac{k+1}{2}\right)}{\Gamma^{m}\left(\frac{k}{2}\right)}\;{}_{n+1}F_{m}\left(\left.\begin{matrix}-\frac{1}{2},\frac{k}{2},\ldots,\frac{k}{2}\\ \frac{1-k}{2},\ldots,\frac{1-k}{2}\end{matrix}\,\right|\,(-1)^{m+1}\,\frac{\beta(2\sigma)^{n}}{\alpha(2\sigma)^{m}}\right).
$$

The selection between the two expressions depends on the convergence condition of the hypergeometric function: the first one is employed if $n \ge m$; otherwise the second one is considered.
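As a sanity check of the bracket rules (our own textbook illustration, not part of the proof), expanding $\exp(-x)$ as $\sum_n \phi_n x^n$ in the Gamma integral gives
$$
\int_{0}^{+\infty} x^{s-1} e^{-x}\,\mathrm{d}x = \sum_{n=0}^{+\infty}\phi_n\langle n+s\rangle = \left.f(n^*)\,\Gamma(-n^*)/|c|\right|_{n^*=-s,\,c=1} = \Gamma(s),
$$
since here $f(n) = 1$ and the bracket $\langle n+s\rangle$ vanishes at $n^* = -s$, recovering the usual Gamma function and showing how the assignment rule replaces a divergent bracket series with a finite value.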

B. Bounding the expectation of the F-norm of two coupled tensors

Without loss of generality, suppose $\mathcal{X}$ and $\mathcal{Y}$ are $D_1$-order and $D_2$-order tensors coupled on their first $L$ modes, with the same TR rank and dimensional size. To calculate $\mathbb{E}\sqrt{\alpha\|\mathcal{X}\|_F^2 + \beta\|\mathcal{Y}\|_F^2}$, we first note that $\|\cdot\|_F$ is submultiplicative; thus
$$
\begin{aligned}
\mathbb{E}\sqrt{\alpha\|\mathcal{X}\|_F^2 + \beta\|\mathcal{Y}\|_F^2}
&\le \mathbb{E}\prod_{l=1}^{L}\left\|\mathcal{U}^{(l)}\right\|_F\sqrt{\alpha\prod_{d_1=L+1}^{D_1}\left\|\mathcal{U}^{(d_1)}\right\|_F^2 + \beta\prod_{d_2=L+1}^{D_2}\left\|\mathcal{V}^{(d_2)}\right\|_F^2} \\
&= \sqrt{\alpha}\,(2\sigma)^{\frac{D_1}{2}}\,\frac{\Gamma^{D_1}\left(\frac{k+1}{2}\right)}{\Gamma^{D_1}\left(\frac{k}{2}\right)}\cdot{}_{D_2+1-L}F_{D_1-L}\left(\left.\begin{matrix}-\frac{1}{2},\frac{k}{2},\ldots,\frac{k}{2}\\ \frac{1-k}{2},\ldots,\frac{1-k}{2}\end{matrix}\,\right|\,(-1)^{D_1+1-L}\,\frac{\beta(2\sigma)^{D_2}}{\alpha(2\sigma)^{D_1}}\right)
\end{aligned} \tag{24}
$$
holds for $D_1 \ge D_2$, and
$$
\mathbb{E}\sqrt{\alpha\|\mathcal{X}\|_F^2 + \beta\|\mathcal{Y}\|_F^2} \le \sqrt{\beta}\,(2\sigma)^{\frac{D_2}{2}}\,\frac{\Gamma^{D_2}\left(\frac{k+1}{2}\right)}{\Gamma^{D_2}\left(\frac{k}{2}\right)}\cdot{}_{D_1+1-L}F_{D_2-L}\left(\left.\begin{matrix}-\frac{1}{2},\frac{k}{2},\ldots,\frac{k}{2}\\ \frac{1-k}{2},\ldots,\frac{1-k}{2}\end{matrix}\,\right|\,(-1)^{D_2+1-L}\,\frac{\alpha(2\sigma)^{D_1}}{\beta(2\sigma)^{D_2}}\right) \tag{25}
$$
holds for $D_2 \ge D_1$.

C. Bounding the excess risk

A subset $\mathbf{x}_{m_1}$ containing $m_1 = |\mathbb{S}_1\cup\mathbb{T}_1|$ elements is sampled uniformly without replacement from $\mathrm{vec}(\mathcal{X})$. We concatenate $\mathbf{x}_{m_1}$ and $\mathbf{y}_{m_2}$ as a vector $\mathbf{z}_m \triangleq [\mathbf{x}_{m_1}; \mathbf{y}_{m_2}]$, where $m = |\mathbb{S}\cup\mathbb{T}|$, and define
$$
Q_{m,n}\left(l_{\mathbb{T}}, \mathbf{z}_m\right) = \mathbb{E}_{\mathbf{z}_n}\left[\sup_{\mathcal{X},\mathcal{Y}\in\mathcal{H}} l_{\mathbb{T}}\left(\mathbf{z}_k, \mathbf{t}_k\right) - l_{\mathbb{T}}\left(\mathbf{z}_n, \mathbf{t}_n\right)\right],
$$
where $\mathbf{z}_n$, $n\in\{1,\ldots,m-1\}$, is a random subset of $\mathbf{z}_m$ containing $n$ elements sampled uniformly without replacement and $\mathbf{z}_k \triangleq \mathbf{z}_m\setminus\mathbf{z}_n$.

Under the hypothesis $\mathcal{H}$ mentioned before, letting $m = 2n = |\mathbb{T}_1\cup\mathbb{T}_2|$, the expectation of the permutational Rademacher complexity is bounded as follows:
$$
\begin{aligned}
\mathbb{E}_{\mathbf{z}_m}\left[Q_{m,m/2}\left(l_{\mathbb{T}}, \mathbf{z}_m\right)\right]
&\le \mathbb{E}_{\mathbf{z}_m}\left\{\left(1+\frac{2\sqrt{2\pi}}{|\mathbb{T}|-2}\right)\mathbb{E}_{\varepsilon}\left[\sup_{\mathcal{X},\mathcal{Y}\in\mathcal{H}}\frac{2}{|\mathbb{T}|}\,\varepsilon^{\mathrm{T}} l_{\mathbb{T}}\left(\mathbf{z}_m, \mathbf{t}_m\right)\right]\right\} \\
&\le \Lambda\,\mathbb{E}_{\mathbf{z}_m}\left\{\left(1+\frac{2\sqrt{2\pi}}{|\mathbb{T}|-2}\right)\frac{2}{|\mathbb{T}|}\,\mathbb{E}_{\varepsilon}\left[\sup_{\mathcal{X},\mathcal{Y}\in\mathcal{H}}\varepsilon^{\mathrm{T}}\mathbf{z}_m\right]\right\} \\
&\le \Lambda\left(1+\frac{2\sqrt{2\pi}}{|\mathbb{T}|-2}\right)\frac{2}{|\mathbb{T}|}\,\mathbb{E}_{\varepsilon,\mathbf{z}_m}\left[\sup_{\mathcal{X},\mathcal{Y}\in\mathcal{H}}\|\varepsilon\|_F\|\mathbf{z}_m\|_F\right] \\
&= \Lambda\left(1+\frac{2\sqrt{2\pi}}{|\mathbb{T}|-2}\right)\frac{2}{\sqrt{|\mathbb{T}|}}\,\mathbb{E}_{\mathbf{z}_m}\left[\sup_{\mathcal{X},\mathcal{Y}\in\mathcal{H}}\|\mathbf{z}_m\|_F\right] \\
&= \Lambda\left(1+\frac{2\sqrt{2\pi}}{|\mathbb{T}|-2}\right)\frac{2}{\sqrt{|\mathbb{T}|}}\,\frac{1}{m+1}\,\mathbb{E}\sum_{i=0}^{m}\frac{1}{\binom{|\mathbb{T}_1\cup\mathbb{S}_1|}{i}\binom{|\mathbb{T}_2\cup\mathbb{S}_2|}{m-i}}\sum_{\mathbf{z}^{(1)}_m\subseteq\mathbb{T}_1\cup\mathbb{S}_1}\;\sum_{\mathbf{z}^{(2)}_m\subseteq\mathbb{T}_2\cup\mathbb{S}_2}\sqrt{\sum_{x_j\in\mathbf{z}^{(1)}_m} x_j^2 + \sum_{y_k\in\mathbf{z}^{(2)}_m} y_k^2} \\
&\le \Lambda\left(1+\frac{2\sqrt{2\pi}}{|\mathbb{T}|-2}\right)\frac{2}{\sqrt{|\mathbb{T}|}}\,\frac{1}{m+1}\,\mathbb{E}\sum_{i=0}^{m}\sqrt{\frac{\binom{|\mathbb{T}_1\cup\mathbb{S}_1|-1}{i-1}}{\binom{|\mathbb{T}_1\cup\mathbb{S}_1|}{i}}\left\|\mathbf{x}_{\mathbb{T}_1\cup\mathbb{S}_1}\right\|_2^2 + \frac{\binom{|\mathbb{T}_2\cup\mathbb{S}_2|-1}{m-i-1}}{\binom{|\mathbb{T}_2\cup\mathbb{S}_2|}{m-i}}\left\|\mathbf{y}_{\mathbb{T}_2\cup\mathbb{S}_2}\right\|_2^2} \\
&= \sqrt{2}\,\Lambda\left(1+\frac{2\sqrt{2\pi}}{|\mathbb{T}|-2}\right)\mathbb{E}\sqrt{\frac{\left\|\mathbf{x}_{\mathbb{T}_1\cup\mathbb{S}_1}\right\|_2^2}{|\mathbb{T}_1\cup\mathbb{S}_1|} + \frac{\left\|\mathbf{y}_{\mathbb{T}_2\cup\mathbb{S}_2}\right\|_2^2}{|\mathbb{T}_2\cup\mathbb{S}_2|}} \\
&\le \sqrt{2}\,\Lambda\left(1+\frac{2\sqrt{2\pi}}{|\mathbb{T}|-2}\right)\mathbb{E}\sqrt{\frac{\|\mathcal{X}\|_F^2}{|\mathbb{T}_1|} + \frac{\|\mathcal{Y}\|_F^2}{|\mathbb{T}_2|}},
\end{aligned}
$$
where the first inequality follows from Theorem 3 in [21], the second inequality is a result of the Rademacher contraction principle, the third inequality comes from Hölder's inequality, and the fourth inequality is a consequence of the arithmetic mean-quadratic mean inequality. Due to the hypothesis $\mathcal{H}$, the final bound is derived by plugging in (24) and (25) with $\alpha = 1/|\mathbb{T}_1|$ and $\beta = 1/|\mathbb{T}_2|$.
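The binomial ratio $\binom{N-1}{i-1}\big/\binom{N}{i}$ appearing in the derivation above simplifies to $i/N$, since $\binom{N}{i} = \frac{N}{i}\binom{N-1}{i-1}$; a quick numerical check (our own verification sketch):

```python
from math import comb

# C(N-1, i-1) / C(N, i) == i / N for all 1 <= i <= N
N = 12
for i in range(1, N + 1):
    assert abs(comb(N - 1, i - 1) / comb(N, i) - i / N) < 1e-12
```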


REFERENCES

[1] A. Cichocki, D. Mandic, L. De Lathauwer, G. Zhou, Q. Zhao, C. Caiafa, and H. A. Phan, "Tensor decompositions for signal processing applications: from two-way to multiway component analysis," IEEE Signal Processing Magazine, vol. 32, no. 2, pp. 145-163, 2015.
[2] J. Liu, P. Musialski, P. Wonka, and J. Ye, "Tensor completion for estimating missing values in visual data," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 208-220, 2012.
[3] Z. Long, Y. Liu, L. Chen, and C. Zhu, "Low rank tensor completion for multiway visual data," Signal Processing, vol. 155, pp. 301-316, 2019.
[4] S. Gandy, B. Recht, and I. Yamada, "Tensor completion and low-n-rank tensor recovery via convex optimization," Inverse Problems, vol. 27, no. 2, p. 025010, 2011.
[5] J. A. Bazerque, G. Mateos, and G. B. Giannakis, "Rank regularization and Bayesian inference for tensor completion and extrapolation," IEEE Transactions on Signal Processing, vol. 61, no. 22, pp. 5689-5703, 2013.
[6] P. Symeonidis, "Matrix and tensor decomposition in recommender systems," in Proceedings of the 10th ACM Conference on Recommender Systems, pp. 429-430, ACM, 2016.
[7] Y. Liu, F. Shang, H. Cheng, J. Cheng, and H. Tong, "Factor matrix trace norm minimization for low-rank tensor completion," in Proceedings of the 2014 SIAM International Conference on Data Mining, pp. 866-874, SIAM, 2014.
[8] B. Ermis, E. Acar, and A. T. Cemgil, "Link prediction in heterogeneous data via generalized coupled tensor factorization," Data Mining and Knowledge Discovery, vol. 29, no. 1, pp. 203-236, 2015.
[9] A. Narita, K. Hayashi, R. Tomioka, and H. Kashima, "Tensor factorization using auxiliary information," Data Mining and Knowledge Discovery, vol. 25, no. 2, pp. 298-324, 2012.
[10] K. Y. Yılmaz, A. T. Cemgil, and U. Simsekli, "Generalised coupled tensor factorisation," in Advances in Neural Information Processing Systems, pp. 2151-2159, 2011.
[11] E. Acar, T. G. Kolda, and D. M. Dunlavy, "All-at-once optimization for coupled matrix and tensor factorizations," arXiv preprint arXiv:1105.3422, 2011.
[12] E. Acar, A. J. Lawaetz, M. A. Rasmussen, and R. Bro, "Structure-revealing data fusion model with applications in metabolomics," in The 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2013), pp. 6023-6026, IEEE, 2013.
[13] E. Acar, E. E. Papalexakis, G. Gurdeniz, M. A. Rasmussen, A. J. Lawaetz, M. Nilsson, and R. Bro, "Structure-revealing data fusion," BMC Bioinformatics, vol. 15, no. 1, p. 239, 2014.
[14] L. Sorber, M. Van Barel, and L. De Lathauwer, "Structured data fusion," IEEE Journal of Selected Topics in Signal Processing, vol. 9, no. 4, pp. 586-600, 2015.
[15] M. Zhou, Y. Liu, Z. Long, L. Chen, and C. Zhu, "Tensor rank learning in CP decomposition via convolutional neural network," Signal Processing: Image Communication, vol. 73, pp. 12-21, 2019.
[16] Y. Liu, Z. Long, H. Huang, and C. Zhu, "Low CP rank and Tucker rank tensor completion for estimating missing components in image data," IEEE Transactions on Circuits and Systems for Video Technology, vol. 30, no. 4, pp. 944-954, 2020.
[17] H. Huang, Y. Liu, J. Liu, and C. Zhu, "Provable tensor ring completion," Signal Processing, vol. 171, p. 107486, 2020.
[18] J. A. Bengua, H. N. Phien, H. D. Tuan, and M. N. Do, "Efficient tensor completion for color image and video recovery: low-rank tensor train," IEEE Transactions on Image Processing, vol. 26, no. 5, pp. 2466-2479, 2017.
[19] W. Wang, V. Aggarwal, and S. Aeron, "Efficient low rank tensor ring completion," in 2017 IEEE International Conference on Computer Vision (ICCV), IEEE, 2017.
[20] Y. Xu, Z. Wu, J. Chanussot, and Z. Wei, "Hyperspectral images super-resolution via learning high-order coupled tensor ring representation," IEEE Transactions on Neural Networks and Learning Systems, 2020.
[21] I. Tolstikhin, N. Zhivotovskiy, and G. Blanchard, "Permutational Rademacher complexity," in International Conference on Algorithmic Learning Theory, pp. 209-223, Springer, 2015.
[22] I. Gonzalez, V. Moll, and A. Straub, "The method of brackets. Part 2: examples and applications," Gems in Experimental Mathematics, vol. 517, pp. 157-172, 2010.
[23] I. Gonzalez, K. Kohl, L. Jiu, and V. H. Moll, "An extension of the method of brackets. Part 1," Open Mathematics, vol. 15, no. 1, pp. 1181-1211, 2017.
[24] R. Orus, "A practical introduction to tensor networks: matrix product states and projected entangled pair states," Annals of Physics, vol. 349, pp. 117-158, 2014.
[25] Q. Zhao, M. Sugiyama, and A. Cichocki, "Learning efficient tensor representations with ring structure networks," arXiv preprint arXiv:1705.08286, 2017.
[26] I. V. Oseledets, "Tensor-train decomposition," SIAM Journal on Scientific Computing, vol. 33, no. 5, pp. 2295-2317, 2011.
[27] Y. Xu and W. Yin, "A block coordinate descent method for regularized multiconvex optimization with applications to nonnegative tensor factorization and completion," SIAM Journal on Imaging Sciences, vol. 6, no. 3, pp. 1758-1789, 2013.
[28] K. Wimalawarne and H. Mamitsuka, "Efficient convex completion of coupled tensors using coupled nuclear norms," in Advances in Neural Information Processing Systems, pp. 6902-6910, 2018.
[29] V. W. Zheng, B. Cao, Y. Zheng, X. Xie, and Q. Yang, "Collaborative filtering meets mobile recommendation: a user-centered approach," in Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010.
[30] F. Wulfert, W. T. Kok, and A. K. Smilde, "Influence of temperature on vibrational spectra and consequences for the predictive ability of multivariate models," Analytical Chemistry, vol. 70, no. 9, pp. 1761-1767, 1998.
[31] A. Toet, M. J. de Jong, M. A. Hogervorst, and I. T. Hooge, "Perceptual evaluation of color transformed multispectral imagery," Optical Engineering, vol. 53, no. 4, p. 043101, 2014.

Huyan Huang received the B.S. degree from Wuhan University of Technology, Wuhan, China, in 2018. He is currently a graduate student with the University of Electronic Science and Technology of China, Chengdu, China. His research interests include tensor decomposition and its machine learning applications.


Yipeng Liu (S'09-M'13-SM'17) received the B.S. degree in biomedical engineering and the Ph.D. degree in information and communication engineering from the University of Electronic Science and Technology of China (UESTC), Chengdu, China, in 2006 and 2011, respectively. In 2011, he was a Research Engineer at Huawei Technologies. From 2011 to 2014, he was a Research Fellow at the University of Leuven, Leuven, Belgium. Since 2014, he has been an Associate Professor with UESTC, Chengdu, China.

His research interests include compressive sensing, tensor signal processing, and deep neural networks. He has been an Associate Editor of IEEE Signal Processing Letters and a Lead Guest Editor of Signal Processing: Image Communication.

Ce Zhu (M'03-SM'04-F'17) received the B.S. degree in communication engineering from Sichuan University, Chengdu, China, in 1989, and the M.Eng. and Ph.D. degrees from Southeast University, Nanjing, China, in 1992 and 1994, respectively, all in electronic and information engineering. He held a post-doctoral research position with the Chinese University of Hong Kong in 1995, and with the City University of Hong Kong and the University of Melbourne, Australia, from 1996 to 1998. He was with Nanyang Technological University, Singapore, for 14 years from 1998 to 2012, where he was a Research Fellow, a Program Manager, an Assistant Professor, and then promoted to Associate Professor in 2005. He has been a Professor with the University of Electronic Science and Technology of China, Chengdu, China, since 2012.

His research interests include video coding and communications, video analysis and processing, 3D video, visual perception and applications. He has served on the editorial boards of several journals, including as an Associate Editor of IEEE Transactions on Image Processing, IEEE Transactions on Broadcasting, IEEE Transactions on Circuits and Systems for Video Technology and IEEE Signal Processing Letters, an Editor of IEEE Communications Surveys and Tutorials, and an Area Editor of Signal Processing: Image Communication.

