# Tensor Ring Decomposition with Rank Minimization on Latent Space: An Efficient Approach for Tensor Completion

In tensor completion tasks, traditional low-rank tensor decomposition models suffer from a laborious model selection problem due to their high sensitivity to the chosen rank. In particular, for tensor ring (TR) decomposition, the number of candidate models grows exponentially with the tensor order, which makes it challenging to find the optimal TR decomposition. In this paper, by exploiting the low-rank structure of the TR latent space, we propose a novel tensor completion method that is robust to model selection. In contrast to imposing a low-rank constraint on the data space, we introduce nuclear norm regularization on the latent TR factors, so that the optimization step based on singular value decomposition (SVD) can be performed at a much smaller scale. By leveraging the alternating direction method of multipliers (ADMM) scheme, the latent TR factors with optimal rank and the recovered tensor can be obtained simultaneously. Our proposed algorithm effectively alleviates the burden of TR-rank selection, and the computational cost is greatly reduced. Extensive experimental results on synthetic and real-world data demonstrate the high performance and efficiency of the proposed approach against state-of-the-art algorithms.

## Introduction

Tensor decompositions aim to find latent factors in tensor-valued data (i.e., the generalization of multi-dimensional arrays), thereby casting large-scale and intractable tensor problems into a multilinear tensor latent space of low dimensionality (very few degrees of freedom designated by the rank). The latent factors within a tensor decomposition can be considered as latent features of the data, which makes them an ideal set of bases to predict missing entries when the acquired data is incomplete. The specific forms and operations among latent factors determine the type of tensor decomposition. The most classical and successful tensor decomposition models are the Tucker decomposition (TKD) and the CANDECOMP/PARAFAC (CP) decomposition [Kolda and Bader2009]. More recently, the matrix product state/tensor-train (MPS/TT) decomposition has become very attractive, owing to its super-compression and computational efficiency properties [Oseledets2011]. Most recently, a generalization of TT decomposition, termed the tensor ring (TR) decomposition, has been studied across scientific disciplines [Zhao et al.2016a, Zhao et al.2018]. These tensor decomposition models have found application in various fields such as machine learning [Wang et al.2018, Novikov et al.2015, Anandkumar et al.2014, Kanagawa et al.2016], signal processing [Cong et al.2015], image/video completion [Liu et al.2013, Zhao et al.2016b], and compressed sensing [Gandy, Recht, and Yamada2011], to name but a few. Tensor completion is one of the most important applications of tensor decompositions, with the goal of recovering an incomplete tensor from partially observed entries. The theoretical lynchpin in tensor completion problems is the tensor low-rank assumption, and the methods can mainly be categorized into two types: (i) tensor-decomposition-based approaches and (ii) rank-minimization-based approaches.

Tensor-decomposition-based methods find the latent factors of the incomplete tensor, and then use these factors to predict the missing entries. Many completion algorithms have been proposed based on the alternating least squares (ALS) method [Grasedyck, Kluge, and Kramer2015, Wang, Aggarwal, and Aeron2017] and gradient-based methods [Yuan, Zhao, and Cao2017, Acar et al.2011], to mention but a few. Though ALS and gradient-based algorithms are free from burdensome hyper-parameter tuning, their performance is rather sensitive to model selection, i.e., the rank selection of the tensor decomposition. Moreover, since the optimal rank is generally data-dependent, it is very challenging to specify it beforehand. This is especially the case for Tucker, TT, and TR decompositions, for which the rank is defined as a vector; finding the optimal ranks by cross-validation is therefore impractical due to the immense number of possibilities.

Rank-minimization-based methods employ convex surrogates to minimize the tensor rank. One of the most commonly used surrogates is the nuclear norm (a.k.a. Schatten norm or trace norm), defined as the sum of the singular values of a matrix; it is the most popular convex surrogate for rank regularization. Based on different definitions of tensor rank, various nuclear-norm-regularized algorithms have been proposed [Liu et al.2013, Imaizumi, Maehara, and Hayashi2017, Liu et al.2014, Liu et al.2015]. Rank-minimization-based methods do not need the rank of the employed tensor decomposition to be specified beforehand; the rank of the recovered tensor is automatically learned from the limited observations. However, when employing the nuclear norm these algorithms face multiple large-scale singular value decomposition (SVD) operations on the 2D unfoldings of the tensor, as well as extensive hyper-parameter tuning, which leads to high computational cost and low efficiency.

To address the problems of high sensitivity to rank selection and low computational efficiency inherent in traditional tensor completion methods, in this paper we propose a new algorithm named tensor ring low-rank factors (TRLRF), which effectively alleviates the burden of rank selection and reduces the computational cost. By employing both nuclear norm regularization and tensor decomposition, our model provides performance stability and high computational efficiency. The proposed TRLRF is efficiently solved by the ADMM algorithm and simultaneously achieves both the underlying tensor decomposition and the completion based on TR decomposition. Our main contributions in this paper are:

• A theoretical relationship between the multilinear tensor rank and the rank of the TR factors is established, which allows the low-rank constraint to be imposed implicitly on the TR latent space. This leads to fast SVD calculation on small-sized factors.

• The nuclear norm is further imposed to regularize the TR-ranks, which enables our algorithm to always obtain a stable solution, even if the TR-rank is given inappropriately. This highlights the rank-robustness of the proposed TRLRF algorithm.

• An efficient ADMM-based algorithm is developed to optimize the proposed model, obtaining the TR factors and the recovered tensor simultaneously.

## Preliminaries and Related Works

### Notations

The notations in [Kolda and Bader2009] are adopted in this paper. A scalar is denoted by a standard lowercase or uppercase letter, e.g., x, X; a vector is denoted by a boldface lowercase letter, e.g., **x**; and a matrix is denoted by a boldface capital letter, e.g., **X**. A tensor of order N is denoted by a calligraphic letter, e.g., X. The set {G(1), G(2), …, G(N)} denotes a tensor sequence, with G(n) being the n-th tensor of the sequence. Where appropriate, a tensor sequence can also be written as [G]. The representations of matrix sequences and vector sequences are designated in the same way. An element of a tensor X at index (i1, i2, …, iN) is denoted by X(i1, i2, …, iN). The inner product of two tensors X, Y of the same size is defined as ⟨X, Y⟩ = ∑_{i1,…,iN} X(i1,…,iN) Y(i1,…,iN). Furthermore, the Frobenius norm of X is defined by ∥X∥_F = √⟨X, X⟩.

We employ two types of tensor unfolding (matricization) operations in this paper. The standard mode-n unfolding [Kolda and Bader2009] of tensor X is denoted by X(n). Another mode-n unfolding, often used in TR operations [Zhao et al.2016a], is denoted by X&lt;n&gt;. Furthermore, the inverse operation of unfolding is matrix folding (tensorization), which transforms matrices back to higher-order tensors. In this paper, we only define the folding operation for the first type of mode-n unfolding, i.e., for a tensor X, we have fold_n(X(n)) = X.
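As an illustration, the standard mode-n unfolding and its inverse folding can be sketched in NumPy. This is a minimal sketch; the column ordering below follows NumPy's default reshape rather than any particular convention in the paper:

```python
import numpy as np

def unfold(X, n):
    """Mode-n unfolding X_(n): bring mode n to the front, flatten the rest."""
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def fold(M, n, shape):
    """Inverse of unfold: reshape and move the leading axis back to mode n."""
    rest = [s for i, s in enumerate(shape) if i != n]
    return np.moveaxis(M.reshape([shape[n]] + rest), 0, n)
```

For any mode n, `fold(unfold(X, n), n, X.shape)` recovers X exactly.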

### Tensor ring decomposition

The tensor ring (TR) decomposition is a more general decomposition model than the tensor-train (TT) decomposition. It represents a higher-order tensor by circular multilinear products over a sequence of low-order latent core tensors, i.e., the TR factors. For n = 1, …, N, the TR factors are denoted by G(n), and each consists of two rank-modes (i.e., mode-1 and mode-3) and one dimension-mode (i.e., mode-2). The vector [R1, R2, …, RN] denotes the TR-rank, which controls the model complexity of the TR decomposition. The TR decomposition applies trace operations and all of the TR factors are constrained to be 3rd-order; thus the TR decomposition relaxes the rank constraint on the first and last cores of TT to R1 = RN+1. Moreover, the TR decomposition scales linearly with the order of the tensor, and in this way it overcomes the 'curse of dimensionality'. In this sense, TR can be considered as a linear combination of TTs and hence offers a powerful and generalized representation ability. The element-wise relation between the TR factors and the generated tensor is given by:

 X(i1, i2, …, iN) = Trace{ G(1)(i1) G(2)(i2) ⋯ G(N)(iN) },    (1)

where Trace{·} is the matrix trace operation, and G(n)(in) is the in-th mode-2 slice matrix of G(n), which can also be denoted by G(n)(:, in, :) according to Matlab notation.
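For concreteness, the trace formula in (1) can be evaluated directly. The sketch below assumes a 3rd-order tensor with cores stored as arrays of shape (R_n, I_n, R_{n+1}); the names are illustrative:

```python
import numpy as np

def tr_element(cores, idx):
    """Eq. (1): X(i1,...,iN) = Trace( G(1)(i1) @ ... @ G(N)(iN) )."""
    prod = np.eye(cores[0].shape[0])
    for G, i in zip(cores, idx):
        prod = prod @ G[:, i, :]   # mode-2 slice G(n)(i_n), shape (R_n, R_{n+1})
    return np.trace(prod)

# a random 3rd-order TR tensor with TR-rank (2, 3, 2); note R_{N+1} = R_1
ranks, dims = [2, 3, 2], [4, 5, 6]
cores = [np.random.rand(ranks[n], dims[n], ranks[(n + 1) % 3]) for n in range(3)]
```

Contracting all three cores at once with `np.einsum('aib,bjc,cka->ijk', *cores)` reproduces every element computed by `tr_element`.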

### Tensor completion

#### Completion by TR decomposition

Tensor-decomposition-based algorithms do not directly employ a rank constraint on the target tensor. Instead, they try to find a low-rank representation (i.e., a tensor decomposition) of the incomplete data from the observed entries, and the obtained latent factors are then used to predict the missing entries. For model formulation, the tensor completion problem is cast as a weighted least squares (WLS) model. Based on different tensor decompositions, various tensor completion algorithms have been proposed, e.g., weighted CP [Acar et al.2011], weighted Tucker [Filipović and Jukić2015], TRWOPT [Yuan et al.2018], and TRALS [Wang, Aggarwal, and Aeron2017]. To the best of our knowledge, two TR-based tensor completion algorithms have been proposed: TRALS and TRWOPT. They apply the same optimization model, formulated as:

 min_[G] ∥PΩ(T − Ψ([G]))∥²_F,    (2)

where the optimization variable is the set of TR factors [G], PΩ(T) denotes the observed entries w.r.t. the set Ω of indices of observed entries, and Ψ([G]) denotes the tensor approximated from [G]; every element of Ψ([G]) is calculated by equation (1). The two algorithms are both based on the model in (2): TRALS applies the alternating least squares (ALS) method, while TRWOPT uses a gradient-based algorithm to solve the model. They perform well for both low-order and high-order tensors due to the high representation ability and flexibility of TR decomposition. However, these algorithms are shown to suffer from high sensitivity to rank selection, which leads to high computational cost.
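The objective in (2) restricted to the observed entries can be sketched as follows, using a toy 3rd-order case where a binary 0/1 mask stands in for PΩ (the helper names are illustrative, not from the papers):

```python
import numpy as np

def tr_to_full(cores):
    """Contract three TR cores of shape (R_n, I_n, R_{n+1}) into the full tensor."""
    return np.einsum('aib,bjc,cka->ijk', *cores)

def wls_objective(T, mask, cores):
    """|| P_Omega(T - Psi([G])) ||_F^2 with a 0/1 mask acting as P_Omega."""
    diff = mask * (T - tr_to_full(cores))
    return float(np.sum(diff ** 2))
```

ALS and gradient-based solvers differ only in how they decrease this quantity with respect to the cores.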

#### Completion by nuclear norm regularization

The model of rank minimization-based tensor completion can be formulated as:

 min_X  Rank(X) + (λ/2) ∥PΩ(T − X)∥²_F,    (3)

where X is the recovered low-rank tensor and Rank(·) is a rank regularizer. The model can therefore find the low-rank structure of the data and approximate the recovered tensor. Because determining the tensor rank is an NP-hard problem [Hillar and Lim2013, Kolda and Bader2009], the works in [Liu et al.2013] and [Signoretto et al.2014] extend the concept of low-rank matrix completion and define the tensor rank as a sum of the ranks of the mode-n unfoldings of the target tensor. Moreover, the convex surrogate named the nuclear norm is applied to the tensor low-rank model, simultaneously regularizing all the mode-n unfoldings of the target tensor. In this way, the model in (3) can be reformulated as:

 min_X  ∑_{n=1}^{N} ∥X(n)∥_* + (λ/2) ∥PΩ(T − X)∥²_F,    (4)

where ∥·∥_* denotes the nuclear norm, i.e., the sum of the singular values of the matrix. The model is usually solved by ADMM algorithms and shows fast convergence and good performance when the data size is small. However, when dealing with large-scale data, the multiple SVD operations in the optimization step become intractable due to their high computational cost.
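The regularizer in (4) sums the nuclear norms of all mode-n unfoldings; a direct NumPy sketch makes the cost concern visible, since each term requires an SVD of a large unfolding:

```python
import numpy as np

def unfold(X, n):
    """Mode-n unfolding: bring mode n to the front, flatten the rest."""
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def overlapped_nuclear_norm(X):
    """Sum over modes of the nuclear norm of each mode-n unfolding."""
    return sum(np.linalg.norm(unfold(X, n), ord='nuc') for n in range(X.ndim))
```

For a rank-1 tensor a∘b∘c, every unfolding is rank-1, so the value is simply 3·∥a∥∥b∥∥c∥.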

## Tensor Ring Low-rank Factors

To solve the issues of traditional tensor completion methods, we impose low-rankness on each of the TR factors, so that our basic tensor completion model is formulated as follows:

 min_{[G],X}  ∑_{n=1}^{N} ∥G(n)∥_* + (λ/2) ∥X − Ψ([G])∥²_F,    (5)
 s.t.  PΩ(X) = PΩ(T).

To solve (5), we first need to deduce the relation between the rank of the tensor and the rank of the corresponding core tensors, which is given by the following theorem.

###### Theorem 1.

Given an N-th order tensor X which can be represented by equation (1), the following inequality holds for all n = 1, …, N:

 Rank(G(n)(2)) ≥ Rank(X(n)).    (6)
###### Proof.

For the n-th core tensor G(n), according to the work in [Zhao et al.2016a], we have:

 X&lt;n&gt; = G(n)(2) (G(≠n)&lt;2&gt;)^T,    (7)

where G(≠n) is a subchain tensor generated by merging all but the n-th core tensor. Hence, the rank relation satisfies:

 Rank(X&lt;n&gt;) ≤ min{ Rank(G(n)(2)), Rank(G(≠n)&lt;2&gt;) } ≤ Rank(G(n)(2)).    (8)

The proof is completed by

 Rank(X(n)) = Rank(X&lt;n&gt;) ≤ Rank(G(n)(2)).    (9)
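The inequality of Theorem 1 is easy to check numerically; the sketch below builds a random 3rd-order TR tensor and verifies the rank bound for every mode (the sizes and ranks are illustrative):

```python
import numpy as np

# build a random 3rd-order TR tensor from cores of shape (R_n, I_n, R_{n+1})
ranks, dims = [2, 3, 2], [6, 7, 8]
cores = [np.random.rand(ranks[n], dims[n], ranks[(n + 1) % 3]) for n in range(3)]
X = np.einsum('aib,bjc,cka->ijk', *cores)   # full tensor, as in eq. (1)

for n in range(3):
    X_n = np.moveaxis(X, n, 0).reshape(dims[n], -1)           # mode-n unfolding X(n)
    G_n2 = np.moveaxis(cores[n], 1, 0).reshape(dims[n], -1)   # dimension-mode unfolding G(n)(2)
    # Theorem 1: Rank(X(n)) <= Rank(G(n)(2))
    assert np.linalg.matrix_rank(X_n) <= np.linalg.matrix_rank(G_n2)
```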

This theorem establishes the relation between the tensor rank and the rank of the TR factors: the rank of the mode-n unfolding of the tensor is upper bounded by the rank of the dimension-mode unfolding of the corresponding core tensor G(n), which allows us to impose the low-rank constraint on G(n)(2) instead. With this surrogate, our model (5) is reformulated as:

 min_{[G],X}  ∑_{n=1}^{N} ∥G(n)(2)∥_* + (λ/2) ∥X − Ψ([G])∥²_F,    (10)
 s.t.  PΩ(X) = PΩ(T).

The above model imposes nuclear norm regularization on the dimension-mode unfoldings of the TR factors, which greatly decreases the computational complexity compared to algorithms based on model (4). Moreover, we also impose low-rank constraints on the two rank-modes of the TR factors, i.e., the unfoldings of the TR factors along mode-1 and mode-3, expressed as ∥G(n)(1)∥_* + ∥G(n)(3)∥_*. When the model is optimized, the nuclear norms of the rank-mode unfoldings and the fitting error of the approximated tensor are minimized simultaneously, so that the initially given TR-rank becomes an upper bound of the real TR-rank of the tensor, equipping our model with robustness to rank selection. The tensor ring low-rank factors (TRLRF) model can finally be expressed as:

 min_{[G],X}  ∑_{n=1}^{N} ∑_{i=1}^{3} ∥G(n)(i)∥_* + (λ/2) ∥X − Ψ([G])∥²_F,    (11)
 s.t.  PΩ(X) = PΩ(T).

Our TRLRF model has two distinctive advantages. Firstly, the low-rank assumption is placed on the tensor factors instead of the original tensor, which greatly reduces the computational complexity of the SVD operations. Secondly, the low-rankness of the tensor factors enhances the robustness to rank selection, which alleviates the burden of searching for the optimal TR-rank and reduces the computational cost in practice.

### Solving scheme

To solve the model in (11), we apply the alternating direction method of multipliers (ADMM), which is efficient and widely used [Boyd et al.2011]. Moreover, because the variables of the TRLRF model are inter-dependent, we introduce auxiliary variables to simplify the optimization. The TRLRF model can thus be rewritten as

 min_{[M],[G],X}  ∑_{n=1}^{N} ∑_{i=1}^{3} ∥M(n,i)(i)∥_* + (λ/2) ∥X − Ψ([G])∥²_F,    (12)
 s.t.  M(n,i)(i) = G(n)(i),  n = 1, …, N,  i = 1, 2, 3,
       PΩ(X) = PΩ(T),

where {M(n,i)} are the auxiliary variables of {G(n)}. By merging the equality constraints on the auxiliary variables into the Lagrangian, the augmented Lagrangian function of the TRLRF model becomes

 L([G], X, [M], [Y]) = ∑_{n=1}^{N} ∑_{i=1}^{3} ( ∥M(n,i)(i)∥_* + ⟨Y(n,i), M(n,i) − G(n)⟩ + (μ/2) ∥M(n,i) − G(n)∥²_F ) + (λ/2) ∥X − Ψ([G])∥²_F,    (13)
 s.t.  PΩ(X) = PΩ(T),

where {Y(n,i)} are the Lagrangian multipliers and μ > 0 is a penalty parameter. For n = 1, …, N, the variables G(n), M(n,i), and X are each independent of one another, so we can update them by the scheme below.

#### Update of G(n).

Using (13), the augmented Lagrangian function w.r.t. G(n) can be simplified as

 L(G(n)) = ∑_{i=1}^{3} (μ/2) ∥M(n,i) − G(n) + (1/μ) Y(n,i)∥²_F + (λ/2) ∥X − Ψ([G])∥²_F + C_G,    (14)

where the constant C_G consists of the parts of the Lagrangian function that are irrelevant to updating G(n). This is a least squares problem, so for n = 1, …, N, G(n) can be updated by

 G(n) = fold2( ( ∑_{i=1}^{3} ( μ M(n,i)(2) + Y(n,i)(2) ) + λ X&lt;n&gt; G(≠n)&lt;2&gt; ) ( λ G(≠n)T&lt;2&gt; G(≠n)&lt;2&gt; + 3μ I )^{−1} ),    (15)

where I denotes the identity matrix.

#### Update of M(n,i).

For n = 1, …, N and i = 1, 2, 3, the augmented Lagrangian function w.r.t. M(n,i) is expressed as

 L(M(n,i)) = (μ/2) ∥M(n,i) − G(n) + (1/μ) Y(n,i)∥²_F + ∥M(n,i)(i)∥_* + C_M.    (16)

The above formulation has a closed-form solution [Cai, Candès, and Shen2010], which is given by

 M(n,i) = foldi( D_{1/μ}( G(n)(i) − (1/μ) Y(n,i)(i) ) ),    (17)

where D_β(·) is the singular value thresholding (SVT) operation, i.e., if UΣV^T is the singular value decomposition of a matrix A, then D_β(A) = U max(Σ − βI, 0) V^T.
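The SVT operator used in (17) has a simple direct implementation: shrink each singular value by β and truncate at zero.

```python
import numpy as np

def svt(A, beta):
    """Singular value thresholding: D_beta(A) = U max(S - beta, 0) V^T."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # scale the columns of U by the shrunken singular values, then recompose
    return (U * np.maximum(s - beta, 0.0)) @ Vt
```

SVT is the proximal operator of β∥·∥_*, which is exactly why (16) admits this closed form.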

#### Update of X.

The augmented Lagrangian function w.r.t. X is given by

 L(X) = (λ/2) ∥X − Ψ([G])∥²_F + C_X,    (18)
 s.t.  PΩ(X) = PΩ(T),

which is equivalent to the tensor-decomposition-based model in (2). X is updated by keeping the observed values in the corresponding entries and approximating the missing entries by the updated TR factors at every iteration, i.e.,

 X = PΩ(T) + PΩ̄(Ψ([G])),    (19)

where Ω̄ is the set of indices of the missing entries, i.e., the complement of Ω.
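With a binary 0/1 mask standing in for PΩ, the update (19) is a one-liner:

```python
import numpy as np

def update_X(T, mask, approx):
    """Eq. (19): keep observed entries of T, fill the rest from Psi([G])."""
    return mask * T + (1.0 - mask) * approx
```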

#### Update of Y(n,i).

For n = 1, …, N and i = 1, 2, 3, the Lagrangian multipliers are updated as

 Y(n,i) = Y(n,i) + μ ( M(n,i) − G(n) ).    (20)

In addition, the penalty parameter μ of the Lagrangian function is restricted by a maximum value μ_max and is updated at every iteration by μ = min(ρμ, μ_max), where ρ is a tuning hyper-parameter.

The ADMM-based solving scheme is iterated based on the above equations. Moreover, we consider two stopping conditions: (i) a maximum number of iterations, and (ii) the relative difference between two iterations (i.e., ∥X_{k+1} − X_k∥_F / ∥X_k∥_F), thresholded by a tolerance. The implementation process and hyper-parameter selection of TRLRF are summarized in Algorithm 1. It should be noted that our TRLRF model is non-convex, so convergence to the global minimum cannot be theoretically guaranteed. However, the convergence of our algorithm can be verified empirically (see experiment details in Figure 1). Moreover, the extensive experimental results in the next section also illustrate the stability and effectiveness of TRLRF.
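The stopping rule and penalty schedule above can be sketched as follows; `rho`, `mu_max`, `tol`, and `max_iter` are illustrative names and values, not the paper's exact settings:

```python
import numpy as np

def relative_change(X_new, X_old):
    """Relative difference between two iterates, used as the stopping criterion."""
    return np.linalg.norm(X_new - X_old) / np.linalg.norm(X_old)

# illustrative outer-loop control for the ADMM scheme
mu, rho, mu_max, tol, max_iter = 1.0, 1.05, 100.0, 1e-6, 300
# for k in range(max_iter):
#     ... update [G], [M], X, [Y] as in eqs. (15), (17), (19), (20) ...
#     if relative_change(X, X_prev) < tol:
#         break
#     mu = min(rho * mu, mu_max)
```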

### Computational complexity

We analyze the computational complexity of our TRLRF algorithm as follows. For a tensor of order N with equal mode dimensions and equal TR-ranks, the cost of updating the auxiliary variables {M(n,i)} is dominated by the SVD operations, which act only on the small factor unfoldings, while the cost of updating {G(n)} and X is dominated by the subchain contraction and the matrix inverse in (15). The overall complexity of our proposed algorithm is therefore polynomial in the mode dimension and the TR-rank, with the SVDs no longer performed at the scale of the full tensor.

Compared to HaLRTC and TRALS, which are representative of the nuclear-norm-based and the tensor-decomposition-based algorithms respectively, the computational complexity of our TRLRF is similar. However, the rank-selection robustness of our algorithm relieves the workload of model selection in practice, and thus the overall computational cost is reduced. Moreover, though the computational complexity of TRLRF is a high-order polynomial in the TR-rank, the high representation ability and flexibility of TR decomposition mean that the TR-rank can always be set to a small value. In addition, our experiments show that the algorithm works efficiently for high-order tensors, so we can tensorize the data to a higher-order tensor and choose a small TR-rank to reduce the computational complexity.

## Experimental Results

### Synthetic data

We first conducted experiments to verify the rank robustness of our algorithm, comparing TRALS, TRWOPT, and our TRLRF. We tested two tensors of different sizes, generated from TR factors with known TR-ranks whose values were drawn from an i.i.d. Gaussian distribution. The entries of the tensors were randomly removed according to a specified missing rate, calculated as 1 − S/num(T), where S is the number of sampled (i.e., observed) entries and num(T) is the total number of entries. We recorded the completion performance of the three algorithms under different choices of TR-rank. The evaluation index was the RSE, defined by RSE = ∥T_real − X∥_F / ∥T_real∥_F, where T_real is the known tensor with full observations and X is the recovered tensor produced by each tensor completion algorithm. The hyper-parameters of our TRLRF were set according to Algorithm 1, while all the hyper-parameters of TRALS and TRWOPT were set according to the recommended settings in the corresponding papers, to obtain their best results.
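The RSE metric used throughout the experiments is straightforward to compute:

```python
import numpy as np

def rse(T_real, X_rec):
    """Relative squared error: ||T_real - X_rec||_F / ||T_real||_F."""
    return np.linalg.norm(T_real - X_rec) / np.linalg.norm(T_real)
```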

Figure 2 shows the final RSE results, averaged over 100 independent runs for each case. From the figure, we can see that all three algorithms achieved their lowest RSE values when the real TR-ranks of the tensors were chosen, and the best performance was obtained by our TRLRF. Moreover, when the TR-rank increased, the performance of TRLRF remained stable while the performance of the other two algorithms fell drastically. This indicates that imposing the low-rankness assumption on the TR factors brings robustness to rank selection, which largely alleviates the model selection problem.

### Benchmark images inpainting

In this section, we tested our TRLRF against the state-of-the-art algorithms on eight benchmark images, shown in Figure 3. Each RGB image can be considered as a third-order tensor. For the first experiment, we continued to verify the TR-rank robustness of TRLRF on the image named "Lena". Figure 4 shows the completion results of TRLRF, TRALS, and TRWOPT when different TR-ranks are selected for each algorithm. The missing rate of the image was set high, a regime in which TR decompositions are prone to overfitting. From the figure, we can see that our TRLRF gives better results than the other two TR-based algorithms in every case, and the highest performance was obtained when the TR-rank was set to 12. When the TR-rank increases, the completion performance of TRALS and TRWOPT decreases due to redundant model complexity and overfitting, while our TRLRF maintains good results even when the selected TR-rank is larger than the desired TR-rank.

In the next experiment, we compared our TRLRF with the two other TR-based algorithms, TRALS and TRWOPT, and with other state-of-the-art algorithms, i.e., TenALS [Jain and Oh2014], FBCP [Zhao, Zhang, and Cichocki2015], HaLRTC [Liu et al.2013], TMac [Xu et al.2013], and t-SVD [Zhang et al.2014]. We tested these algorithms on all eight benchmark images under different missing rates. The relative squared error (RSE) and the peak signal-to-noise ratio (PSNR) were adopted to evaluate the completion performance. For RGB image data, PSNR is defined as PSNR = 10 log10(255²/MSE), where the MSE is calculated by MSE = ∥T_real − X∥²_F / num(T), and num(T) denotes the number of elements of the fully observed tensor.
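The PSNR computation for 8-bit image data can be sketched as follows (the `peak` value of 255 assumes 8-bit channels):

```python
import numpy as np

def psnr(T_real, X_rec, peak=255.0):
    """PSNR = 10 * log10(peak^2 / MSE) for 8-bit image data."""
    mse = np.mean((T_real - X_rec) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```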

For the three TR-based algorithms, we assumed the TR-ranks to be equal for every core tensor (i.e., R1 = R2 = ⋯ = RN). The best completion results for each TR-based algorithm were obtained by selecting the best TR-ranks via cross-validation. In practice, finding the TR-rank that yields the best completion results is very tedious; however, this is much easier for our proposed algorithm, because the performance of TRLRF is fairly stable even when the TR-rank is selected from a wide range. For the other five compared algorithms, we tuned the hyper-parameters according to the suggestions of each paper to obtain the best completion results. Finally, we show the average performance over the eight images for each algorithm under different missing rates as line graphs. Figure 5 shows the RSE and PSNR results of each algorithm; smaller RSE and larger PSNR values indicate better performance. Our TRLRF performed the best among all the considered algorithms in most cases. As the missing rate increased, the completion results of all the algorithms degraded, and the performance of most algorithms fell drastically at the highest missing rate. However, the performance of TRLRF, HaLRTC, and FBCP remained stable, and the best performance was obtained by our TRLRF.

### Hyperspectral image

A hyperspectral image (HSI) recording an area of urban landscape was tested in this section. In order to test the performance of TRLRF on higher-order tensors, the HSI data was reshaped to higher-order tensors, which is an easy way to expose more low-rank features of the data. We compared our TRLRF with the other seven tensor completion algorithms in the 3rd-order, 5th-order, and 8th-order tensor cases. The higher-order tensors were generated from the original HSI data by directly reshaping it to the specified size and order.

This experiment aims to verify the completion performance of the eight algorithms under different model selections, whereby the experimental variables are the tensor order and the tensor rank. The missing rates were set identically for all the cases, and all the tuning parameters of every algorithm were set as in the previous experiments. Besides, for the algorithms which need the rank to be set manually, we chose two different tensor ranks: a high rank and a low rank. It should be noted that the CP-rank of TenALS and the Tucker-rank of TMac were set to the same values as the TR-rank. The completion performance in terms of RSE and the visual results are listed in Table 1 and shown in Figure 6, respectively. The results of FBCP, HaLRTC, and t-SVD are not affected by the tensor rank, so the cases of the same order with different ranks are left blank in Table 1. TenALS cannot deal with tensors of order higher than three, so the high-order tensor cases for TenALS are also left blank. As shown in Table 1, our TRLRF gives the best recovery performance for the HSI image. In the 3rd-order cases, the best performance was obtained when the TR-rank was 12; however, when the rank was set to 8, the performance of TRLRF, TRALS, TRWOPT, TMac, and TenALS suffered because of the underfitting of the selected models. For the 5th-order cases, when the rank increased from 18 to 22, the performance of TRLRF remained steady while the performance of TRALS, TRWOPT, and TMac decreased. This is because the high rank makes the models overfit, while our TRLRF performs without any issues owing to its inherent TR-rank robustness. In the 8th-order tensor cases, similar behavior can be observed and our TRLRF again performed the best.

## Conclusion

We have proposed an efficient, high-performance tensor completion algorithm based on the TR decomposition, which employs low-rank constraints on the TR latent space. The model is efficiently solved by the ADMM algorithm and has been shown to effectively deal with model selection, a common problem in most traditional tensor completion methods, thus incurring much lower computational cost. Extensive experiments on both synthetic and real-world data have demonstrated that our algorithm outperforms the state-of-the-art algorithms. Furthermore, the proposed method is general enough to be extended to various other tensor decompositions, in order to develop more efficient and robust algorithms.

## References

• [Acar et al.2011] Acar, E.; Dunlavy, D. M.; Kolda, T. G.; and Mørup, M. 2011. Scalable tensor factorizations for incomplete data. Chemometrics and Intelligent Laboratory Systems 106(1):41–56.
• [Anandkumar et al.2014] Anandkumar, A.; Ge, R.; Hsu, D.; Kakade, S. M.; and Telgarsky, M. 2014. Tensor decompositions for learning latent variable models. The Journal of Machine Learning Research 15(1):2773–2832.
• [Boyd et al.2011] Boyd, S.; Parikh, N.; Chu, E.; Peleato, B.; Eckstein, J.; et al. 2011. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends® in Machine learning 3(1):1–122.
• [Cai, Candès, and Shen2010] Cai, J.-F.; Candès, E. J.; and Shen, Z. 2010. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization 20(4):1956–1982.
• [Cong et al.2015] Cong, F.; Lin, Q.-H.; Kuang, L.-D.; Gong, X.-F.; Astikainen, P.; and Ristaniemi, T. 2015. Tensor decomposition of EEG signals: a brief review. Journal of Neuroscience Methods 248:59–69.
• [Filipović and Jukić2015] Filipović, M., and Jukić, A. 2015. Tucker factorization with missing data with application to low-n-rank tensor completion. Multidimensional Systems and Signal Processing 26(3):677–692.
• [Gandy, Recht, and Yamada2011] Gandy, S.; Recht, B.; and Yamada, I. 2011. Tensor completion and low-n-rank tensor recovery via convex optimization. Inverse Problems 27(2):025010.
• [Grasedyck, Kluge, and Kramer2015] Grasedyck, L.; Kluge, M.; and Kramer, S. 2015. Variants of alternating least squares tensor completion in the tensor train format. SIAM Journal on Scientific Computing 37(5):A2424–A2450.
• [Hillar and Lim2013] Hillar, C. J., and Lim, L.-H. 2013. Most tensor problems are NP-hard. Journal of the ACM (JACM) 60(6):45.
• [Imaizumi, Maehara, and Hayashi2017] Imaizumi, M.; Maehara, T.; and Hayashi, K. 2017. On tensor train rank minimization: Statistical efficiency and scalable algorithm. In Advances in Neural Information Processing Systems, 3933–3942.
• [Jain and Oh2014] Jain, P., and Oh, S. 2014. Provable tensor factorization with missing data. In Advances in Neural Information Processing Systems, 1431–1439.
• [Kanagawa et al.2016] Kanagawa, H.; Suzuki, T.; Kobayashi, H.; Shimizu, N.; and Tagami, Y. 2016. Gaussian process nonparametric tensor estimator and its minimax optimality. In International Conference on Machine Learning, 1632–1641.
• [Kolda and Bader2009] Kolda, T. G., and Bader, B. W. 2009. Tensor decompositions and applications. SIAM Review 51(3):455–500.
• [Liu et al.2013] Liu, J.; Musialski, P.; Wonka, P.; and Ye, J. 2013. Tensor completion for estimating missing values in visual data. IEEE Transactions on Pattern Analysis and Machine Intelligence 35(1):208–220.
• [Liu et al.2014] Liu, Y.; Shang, F.; Fan, W.; Cheng, J.; and Cheng, H. 2014. Generalized higher-order orthogonal iteration for tensor decomposition and completion. In Advances in Neural Information Processing Systems, 1763–1771.
• [Liu et al.2015] Liu, Y.; Shang, F.; Jiao, L.; Cheng, J.; and Cheng, H. 2015. Trace norm regularized CANDECOMP/PARAFAC decomposition with missing data. IEEE Transactions on Cybernetics 45(11):2437–2448.
• [Novikov et al.2015] Novikov, A.; Podoprikhin, D.; Osokin, A.; and Vetrov, D. P. 2015. Tensorizing neural networks. In Advances in Neural Information Processing Systems, 442–450.
• [Oseledets2011] Oseledets, I. V. 2011. Tensor-train decomposition. SIAM Journal on Scientific Computing 33(5):2295–2317.
• [Signoretto et al.2014] Signoretto, M.; Dinh, Q. T.; De Lathauwer, L.; and Suykens, J. A. 2014. Learning with tensors: a framework based on convex optimization and spectral regularization. Machine Learning 94(3):303–351.
• [Wang, Aggarwal, and Aeron2017] Wang, W.; Aggarwal, V.; and Aeron, S. 2017. Efficient low rank tensor ring completion. In IEEE International Conference on Computer Vision (ICCV), 5698–5706. IEEE.
• [Wang et al.2018] Wang, W.; Sun, Y.; Eriksson, B.; Wang, W.; and Aggarwal, V. 2018. Wide compression: Tensor ring nets. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 9329–9338. IEEE.
• [Xu et al.2013] Xu, Y.; Hao, R.; Yin, W.; and Su, Z. 2013. Parallel matrix factorization for low-rank tensor completion. arXiv preprint arXiv:1312.1254.
• [Yuan et al.2018] Yuan, L.; Cao, J.; Wu, Q.; and Zhao, Q. 2018. Higher-dimension tensor completion via low-rank tensor ring decomposition. arXiv preprint arXiv:1807.01589.
• [Yuan, Zhao, and Cao2017] Yuan, L.; Zhao, Q.; and Cao, J. 2017. Completion of high order tensor data with missing entries via tensor-train decomposition. In International Conference on Neural Information Processing, 222–229. Springer.
• [Zhang et al.2014] Zhang, Z.; Ely, G.; Aeron, S.; Hao, N.; and Kilmer, M. 2014. Novel methods for multilinear data completion and de-noising based on tensor-SVD. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3842–3849.
• [Zhao et al.2016a] Zhao, Q.; Zhou, G.; Xie, S.; Zhang, L.; and Cichocki, A. 2016a. Tensor ring decomposition. arXiv preprint arXiv:1606.05535.
• [Zhao et al.2016b] Zhao, Q.; Zhou, G.; Zhang, L.; Cichocki, A.; and Amari, S.-I. 2016b. Bayesian robust tensor factorization for incomplete multiway data. IEEE Transactions on Neural Networks and Learning Systems 27(4):736–748.
• [Zhao et al.2018] Zhao, Q.; Sugiyama, M.; Yuan, L.; and Cichocki, A. 2018. Learning efficient tensor representations with ring structure networks. In Sixth International Conference on Learning Representations (ICLR workshop).
• [Zhao, Zhang, and Cichocki2015] Zhao, Q.; Zhang, L.; and Cichocki, A. 2015. Bayesian CP factorization of incomplete tensors with automatic rank determination. IEEE Transactions on Pattern Analysis and Machine Intelligence 37(9):1751–1763.