# Note: low-rank tensor train completion with side information based on Riemannian optimization

We consider the low-rank tensor train completion problem when additional side information is available in the form of subspaces that contain the mode-k fiber spans. We propose an algorithm based on Riemannian optimization to solve the problem. Numerical experiments show that the proposed algorithm requires far fewer known entries to recover the tensor compared to standard tensor train completion methods.


## 1 Introduction

Let A∈RN1×⋯×Nd be a d-dimensional tensor with low tensor train (TT) ranks [1]. We consider the problem of recovering A from a limited number of its entries. Put more formally, let

 Ω⊂[N1]×…×[Nd]

be the indices of the given entries and let

 PΩ:RN1×⋯×Nd→RN1×⋯×Nd

be the projection operator defined by

 PΩA(i1,…,id) = A(i1,…,id) if (i1,…,id)∈Ω, and PΩA(i1,…,id) = 0 if (i1,…,id)∉Ω.
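As a concrete illustration, the projection PΩ can be implemented with a boolean mask (a minimal NumPy sketch with hypothetical sizes, not code from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 5, 6))      # a hypothetical 3-way tensor
mask = rng.random(A.shape) < 0.3        # True exactly where the entry is in Omega

def P_Omega(X, mask):
    """Projection onto the observed entries: keeps X on Omega, zeros elsewhere."""
    out = np.zeros_like(X)
    out[mask] = X[mask]
    return out

B = P_Omega(A, mask)
# Entries outside Omega are zero, entries inside agree with A.
assert np.all(B[~mask] == 0)
assert np.allclose(B[mask], A[mask])
```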

If the TT ranks of A are equal to r, the TT completion problem can be written down as

 f(X) = ½∥PΩX − PΩA∥²F → minX  s.t.  rankTT(X) = r. (1)

It is known that the set

 Nr={X∈RN1×⋯×Nd:rankTT(X)=r}

is a smooth embedded submanifold of RN1×⋯×Nd, hence the completion problem can be solved via Riemannian optimization [2]. Other approaches include alternating least squares [3, 4] and nuclear norm minimization [5].

Sometimes, additional side information about A is available that could be incorporated into the algorithm to reduce the amount of data needed for successful reconstruction. This side information can come in the form of orthogonal matrices

 Qk∈RNk×Mk,QTkQk=IMk,k=1,…,d,

that constrain the mode-k subspaces of A as

 col(A(k))⊂col(Qk),k=1,…,d, (2)

where A(k) is the mode-k flattening of A and col(⋅) is the column span of the matrix argument. Put differently,

 QkQTkA(k)=A(k),k=1,…,d,

or

 A×1(Q1QT1)×2…×d(QdQTd)=A,

where ×k is the k-mode product.
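The equivalence of the two formulations of (2) can be checked numerically. The sketch below (with hypothetical sizes N = 6, M = 3, d = 3; `unfold`, `fold`, and `mode_product` are illustrative helpers, not code from the paper) builds a tensor whose fibers lie in the prescribed subspaces and verifies QkQTkA(k)=A(k):

```python
import numpy as np

rng = np.random.default_rng(1)

def unfold(T, k):
    """Mode-k flattening: mode k becomes the rows."""
    return np.moveaxis(T, k, 0).reshape(T.shape[k], -1)

def fold(M, k, shape):
    """Inverse of unfold for a tensor of the given shape."""
    full = [shape[k]] + [s for i, s in enumerate(shape) if i != k]
    return np.moveaxis(M.reshape(full), 0, k)

def mode_product(T, M, k):
    """k-mode product T ×_k M."""
    shape = list(T.shape)
    shape[k] = M.shape[0]
    return fold(M @ unfold(T, k), k, tuple(shape))

N, M, d = 6, 3, 3
Qs = [np.linalg.qr(rng.standard_normal((N, M)))[0] for _ in range(d)]

# Build a tensor whose mode-k fibers lie in col(Q_k) by construction.
A = rng.standard_normal((M,) * d)
for k, Q in enumerate(Qs):
    A = mode_product(A, Q, k)

# The side-information condition Q_k Q_k^T A_(k) = A_(k) holds for every mode.
for k, Q in enumerate(Qs):
    assert np.allclose(Q @ (Q.T @ unfold(A, k)), unfold(A, k))
```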

To the best of our knowledge, low-rank completion problems with side information have been analyzed only for the matrix case [6, 7, 8]; the corresponding algorithms have been applied in bioinformatics [9]. For the nuclear norm minimization algorithm of [7], it was proved that far fewer known entries suffice to recover the matrix than are required by standard nuclear norm minimization [10]. Note that this is an upper bound for a particular algorithm; a lower bound on the number of samples necessary for any matrix completion method is given in [11].

The goal of this paper is to develop a Riemannian optimization algorithm to solve the low-rank TT completion problem (1) given the side information (2) and to examine how the side information affects the number of known entries needed for successful recovery.

## 2 Structure of the feasible set

Let

 Nr(Q)={X∈Nr:QkQTkX(k)=X(k),k=1,…,d}

be the set of low-rank TTs that conform to the side information. Each X∈Nr(Q) can be represented as

 X=QY≡Y×1Q1×2…×dQd

where Y is a smaller tensor with the same TT ranks:

 Y∈Mr={Y∈RM1×⋯×Md:rankTT(Y)=r}.

Owing to the orthogonality of the matrices Qk, the mapping

 Q:RM1×⋯×Md→RN1×⋯×Nd

establishes a one-to-one correspondence between Mr and Nr(Q). Its inverse acts according to

 Q−1X=QTX≡X×1QT1×2…×dQTd.
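The mapping Q and its inverse are just sequences of mode products, and the round trip Q−1(QY)=Y can be verified directly (a minimal sketch with hypothetical sizes; `multi_mode_product` is an illustrative helper):

```python
import numpy as np

rng = np.random.default_rng(2)

N, M, d = 6, 3, 3
Qs = [np.linalg.qr(rng.standard_normal((N, M)))[0] for _ in range(d)]

def multi_mode_product(T, mats):
    """Apply one matrix per mode: T ×_1 mats[0] ×_2 ... ×_d mats[d-1]."""
    for k, Mk in enumerate(mats):
        # Contract mode k of T with Mk, keeping the axis order.
        T = np.moveaxis(np.tensordot(Mk, T, axes=(1, k)), 0, k)
    return T

Y = rng.standard_normal((M,) * d)
X = multi_mode_product(Y, Qs)                      # Q Y, lives in R^{N x ... x N}
Y_back = multi_mode_product(X, [Q.T for Q in Qs])  # Q^{-1} X = Q^T X

# Orthogonality of the Q_k makes the correspondence one-to-one.
assert np.allclose(Y_back, Y)
```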

## 3 Two ways to use the one-to-one correspondence

This correspondence lies at the heart of the methods in the matrix case [6, 7]. With its help, we can formulate the TT completion problem with side information as

 g(Y) = ½∥PΩQY − PΩA∥²F → minY  s.t.  Y∈Mr. (3)

Since the set Mr is a smooth embedded submanifold of RM1×⋯×Md, the optimization problem can be solved with Riemannian gradient descent or Riemannian conjugate gradient methods [2]. At each iteration, Riemannian optimization methods perform a series of steps:

1. compute the usual Euclidean gradient ∇g(Y),

2. compute the Riemannian gradient by projecting the Euclidean gradient onto the tangent space TYMr,

3. choose the step direction η and the step size α,

4. map back onto the manifold with a retraction.
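The four steps above can be sketched on the simplest instance of this framework, the manifold of fixed-rank matrices (the d = 2 analogue of the TT manifold). This is an illustration of the generic scheme under assumed small sizes, not the TT algorithm of [2]:

```python
import numpy as np

rng = np.random.default_rng(3)

N, r = 20, 2
A = rng.standard_normal((N, r)) @ rng.standard_normal((r, N))  # rank-r target
mask = rng.random((N, N)) < 0.5                                # observed entries

def truncated_svd(Z, r):
    """Retraction: nearest rank-r matrix via truncated SVD."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def project_tangent(X, G, r):
    """Orthogonal projection of G onto the tangent space of rank-r matrices at X."""
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    PU, PV = U[:, :r] @ U[:, :r].T, Vt[:r].T @ Vt[:r]
    return PU @ G + G @ PV - PU @ G @ PV

def masked_loss(X):
    return 0.5 * np.sum(np.where(mask, X - A, 0.0) ** 2)

X = truncated_svd(rng.standard_normal((N, N)), r)  # random rank-r start
f0 = masked_loss(X)
for _ in range(500):
    G = np.where(mask, X - A, 0.0)          # 1. Euclidean gradient of f
    xi = project_tangent(X, G, r)           # 2. Riemannian gradient
    eta, alpha = -xi, 0.5                   # 3. direction and step size (fixed here)
    X = truncated_svd(X + alpha * eta, r)   # 4. retraction back to the manifold
f_final = masked_loss(X)
```

The fixed step size keeps the sketch short; practical implementations such as RTTC choose α by line search.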

A computationally efficient way to evaluate the Riemannian gradient for the low-rank TT completion problem was proposed in [2]. It exploits the sparsity of the Euclidean gradient, so its cost scales with the number of known entries. However, the Euclidean gradient of (3) is no longer sparse due to the presence of the mapping Q. Namely,

 ∇g(Y)=QT(PΩQY−PΩA)≡(PΩQY−PΩA)×1QT1×2…×dQTd.

While the tensor PΩQY−PΩA is indeed sparse, its sparsity is broken by the mode products; the projection onto the tangent space thus suffers from the curse of dimensionality and becomes unacceptably inefficient.

This is why we suggest using the correspondence between Mr and Nr(Q) in a different manner. Since Mr is a smooth submanifold of RM1×⋯×Md and Q is a one-to-one linear mapping, Nr(Q) is also a smooth submanifold, but of the larger space RN1×⋯×Nd. This means that we can pose the low-rank TT completion problem with side information as

 f(X) = ½∥PΩX − PΩA∥²F → minX  s.t.  X∈Nr(Q). (4)

## 4 Riemannian optimization scheme

To apply Riemannian optimization methods to problem (4), we need to describe the tangent space TXNr(Q), the projection operator onto the tangent space, and the retraction operator

 R:TNr(Q)→Nr(Q)

from the tangent bundle into the submanifold. In the case of the Riemannian conjugate gradient method, the vector transport

 TX→X+:TXNr(Q)→TX+Nr(Q)

needs to be provided too.

To describe TXNr(Q) we will use the correspondence between Mr and Nr(Q). Since the mapping Q is linear, the tangent space is

 TXNr(Q)=QTQ−1XMr.

We can further prove that

 TXNr(Q)=QQTTXNr.

This immediately gives the projection operator onto TXNr(Q) as

 PTXNr(Q)=QQTPTXNr, (5)

and the description of PTXNr together with an efficient way to apply it are presented in [2].

In the absence of side information, TT-SVD [1] defines a retraction [2] as

 R:TNr∋(X,η)↦PTTr(X+η)∈Nr,

where PTTr maps a tensor to its rank-r TT approximation. The TT-SVD algorithm and its TT-rounding variant perform sequences of QR decompositions and truncated SVDs that cannot enlarge the fiber spans, i.e. if

 QkQTkZ(k)=Z(k),k=1,…,d,

then QkQTk(PTTrZ)(k)=(PTTrZ)(k). This means that if X∈Nr(Q) and η∈TXNr(Q), then X+η conforms to the side information and PTTr(X+η)∈Nr(Q). Thus the function

 R(Q):TNr(Q)∋(X,η)↦PTTr(X+η)∈Nr(Q)

defines a retraction since it inherits the properties of R.
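For reference, TT-SVD itself fits in a few lines. The sketch below is an illustrative NumPy implementation of the classical algorithm of [1] (not the rounding code used by RTTC), checked on a tensor that is exactly of TT rank r:

```python
import numpy as np

rng = np.random.default_rng(4)

def tt_svd(A, r):
    """Compress a full tensor into TT cores of rank at most r
    via a sweep of truncated SVDs."""
    shape = A.shape
    cores, rk = [], 1
    C = A.reshape(rk * shape[0], -1)
    for k in range(len(shape) - 1):
        U, s, Vt = np.linalg.svd(C, full_matrices=False)
        rk_next = min(r, len(s))
        cores.append(U[:, :rk_next].reshape(rk, shape[k], rk_next))
        C = (s[:rk_next, None] * Vt[:rk_next]).reshape(rk_next * shape[k + 1], -1)
        rk = rk_next
    cores.append(C.reshape(rk, shape[-1], 1))
    return cores

def tt_full(cores):
    """Contract TT cores back into a full tensor."""
    T = cores[0]
    for G in cores[1:]:
        T = np.tensordot(T, G, axes=(-1, 0))
    return T.reshape([G.shape[1] for G in cores])

# A tensor of exact TT rank 2 is reproduced exactly by TT-SVD with r = 2.
d, N, r = 4, 5, 2
cores = [rng.standard_normal((1 if k == 0 else r, N, 1 if k == d - 1 else r))
         for k in range(d)]
A = tt_full(cores)
A2 = tt_full(tt_svd(A, r))
assert np.allclose(A, A2)
```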

As in [2], the orthogonal projection PTX+Nr(Q) can be used as the vector transport for the nonlinear conjugate gradient scheme. But since the projection PTX+Nr, just like TT-SVD, does not enlarge the fiber spans and tensors in TXNr(Q) already conform to the side information, it suffices to apply PTX+Nr.

This being said, we can solve the low-rank TT completion problem with side information (4) with the RTTC algorithm of [2] by changing the projection operator onto the tangent space as in (5) (provided that the initial approximation for the iterations has correct fiber spans). We call this algorithm RTTC with side information (RTTC-SI).

## 5 Numerical experiments

We test the algorithm on random synthetic data. First, a tensor ~A∈RN×…×N of order d is generated as a TT of rank r with standard Gaussian TT cores. For each mode, we generate a random standard Gaussian N×M matrix and orthogonalize it via Gram–Schmidt to get Qk∈RN×M, k=1,…,d. We then form the tensor to be recovered as

 A=QQT~A∈RN×…×N.

The set Ω of indices for which the entries are known is generated uniformly at random. The initial approximation X0 is generated at random similarly to A:

 X0=QQT~X0∈RN×…×N

with a standard Gaussian rank-r TT ~X0.
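The synthetic setup can be sketched as follows (hypothetical small sizes; `tt_random`, `tt_full`, and `multi_mode_product` are illustrative helpers, not the code used in the experiments):

```python
import numpy as np

rng = np.random.default_rng(5)

N, M, d, r = 6, 3, 3, 2

def tt_random(dims, r, rng):
    """Random TT with standard Gaussian cores."""
    ranks = [1] + [r] * (len(dims) - 1) + [1]
    return [rng.standard_normal((ranks[k], dims[k], ranks[k + 1]))
            for k in range(len(dims))]

def tt_full(cores):
    """Contract TT cores back into a full tensor."""
    T = cores[0]
    for G in cores[1:]:
        T = np.tensordot(T, G, axes=(-1, 0))
    return T.reshape([G.shape[1] for G in cores])

def multi_mode_product(T, mats):
    """Apply one matrix per mode: T ×_1 mats[0] ×_2 ... ×_d mats[d-1]."""
    for k, Mk in enumerate(mats):
        T = np.moveaxis(np.tensordot(Mk, T, axes=(1, k)), 0, k)
    return T

# Orthogonal factors Q_k and a random Gaussian TT tensor ~A.
Qs = [np.linalg.qr(rng.standard_normal((N, M)))[0] for _ in range(d)]
A_tilde = tt_full(tt_random((N,) * d, r, rng))

# A = Q Q^T ~A: the tensor to be recovered, conforming to the side information.
A = multi_mode_product(multi_mode_product(A_tilde, [Q.T for Q in Qs]), Qs)

Omega = rng.random(A.shape) < 0.5  # known entries, uniform at random
```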

We let the algorithm take 250 iterations and call the recovery convergent if the relative error is small on a test set:

 ∥PΓX250−PΓA∥F∥PΓA∥F<10−4

for a test set Γ of indices generated uniformly at random. For each combination of parameters we run 5 experiments and report the frequency of convergent runs.

In Figure 1, we compare the phase plots obtained with RTTC and RTTC-SI. The phase plot of RTTC is identical to that of [2], which shows that the algorithm is oblivious to the side information even when the initial approximation has the correct fiber spans. At the same time, RTTC-SI converges even when the available data are scarce, and the number of entries sufficient for recovery does not depend on N.

The phase plots in Figure 2 illustrate that the phase transition happens at a sample size comparable to the matrix case. Figure 3 shows how the phase transition curve changes when the problem parameters are doubled (compare with Figure 1 (b)).

Finally, we consider the situation when the side information is available for a limited number of modes. Let

 Mk = M if k≤l, and Mk = N if k>l, for k=1,…,d.

In Figure 4 we present the phase plots for several values of l.

## 6 Conclusion

In this paper we presented the RTTC-SI algorithm for low-rank TT completion with side information and numerically studied the effects of side information on the phase transition curve.

## References

• [1] I. V. Oseledets. Tensor-Train Decomposition. SIAM Journal on Scientific Computing, 33(5):2295–2317, January 2011.
• [2] Michael Steinlechner. Riemannian Optimization for High-Dimensional Tensor Completion. SIAM Journal on Scientific Computing, 38(5):S461–S484, January 2016.
• [3] Lars Grasedyck, Melanie Kluge, and Sebastian Krämer. Variants of Alternating Least Squares Tensor Completion in the Tensor Train Format. SIAM Journal on Scientific Computing, 37(5):A2424–A2450, January 2015.
• [4] Lars Grasedyck and Sebastian Krämer. Stable ALS approximation in the TT-format for rank-adaptive tensor completion. Numerische Mathematik, 143(4):855–904, December 2019.
• [5] Johann A. Bengua, Ho N. Phien, Hoang Duong Tuan, and Minh N. Do. Efficient Tensor Completion for Color Image and Video Recovery: Low-Rank Tensor Train. IEEE Transactions on Image Processing, 26(5):2466–2479, May 2017.
• [6] Prateek Jain and Inderjit S. Dhillon. Provable Inductive Matrix Completion. arXiv:1306.0626 [cs, math, stat], June 2013.
• [7] Miao Xu, Rong Jin, and Zhi-Hua Zhou. Speedup Matrix Completion with Side Information: Application to Multi-Label Learning. In C. J. C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 26, pages 2301–2309. Curran Associates, Inc., 2013.
• [8] Kai-Yang Chiang, Cho-Jui Hsieh, and Inderjit S Dhillon. Matrix Completion with Noisy Side Information. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, editors, Advances in Neural Information Processing Systems 28, pages 3447–3455. Curran Associates, Inc., 2015.
• [9] Nagarajan Natarajan and Inderjit S. Dhillon. Inductive matrix completion for predicting gene–disease associations. Bioinformatics, 30(12):i60–i68, June 2014.
• [10] Benjamin Recht. A Simpler Approach to Matrix Completion. Journal of Machine Learning Research, 12(Dec):3413–3430, 2011.
• [11] Emmanuel J. Candes and Terence Tao. The Power of Convex Relaxation: Near-Optimal Matrix Completion. IEEE Transactions on Information Theory, 56(5):2053–2080, May 2010.