1 Introduction
A tensor can be regarded as a multidimensional array whose entries are indexed by several continuous or discrete variables. A tensor is a natural way to represent high-dimensional data; it preserves more intrinsic information than a matrix when dealing with high-order data
kolda2009tensor ; cichocki2015tensor ; cichocki2016tensor . In practice, some of the tensor entries are missing during data acquisition and transformation; tensor completion estimates the missing entries based on the assumption that most elements are correlated
gandy2011tensor . This correlation can be modeled by low-rank data structures, which are useful in a series of applications, including signal processing cichocki2015tensor ; sidiropoulos2017tensor , remote sensing signoretto2011tensor ; liu2013tensor , etc. There are two main frameworks for tensor completion, namely, variational energy minimization and tensor rank minimization lubasch2014algorithms ; long2018low , where the energy is usually a recovery error in the context of tensor completion and the definition of rank varies with the tensor decomposition. The first framework is realized by means of alternating least squares (ALS), in which each core tensor is updated in turn while the others are fixed lubasch2014algorithms . The ALS-based method requires a predefined tensor rank, while rank minimization does not.
Common forms of tensor decomposition are summarized as follows. The CANDECOMP/PARAFAC (CP) decomposition factorizes an $N$th-order tensor into a linear combination of rank-one tensors kolda2009tensor , thus the storage requirement of CP decomposition is $\mathcal{O}(NIR)$, where $R$ is the CP-rank cichocki2015tensor . However, the determination of the CP-rank is an NP-hard problem hillar2013most , and low CP-rank approximation may involve numerical problems cichocki2015tensor . Therefore, low CP-rank tensor completion usually recovers the missing data by iteratively updating the factors with a predefined CP-rank liu2015trace ; liu2016generalized ; yang2016iterative ; bazerque2013rank ; xu2013block ; zhao2015bayesian ; yang2015rank ; yang2016rank ; yokota2016smooth . Tucker decomposition decomposes a tensor into a set of factor matrices and one core tensor that models a potential pattern of mutual interaction between components in different modes kolda2009tensor . Low Tucker-rank tensor completion either minimizes the Tucker-rank, a vector whose entries are the ranks of the mode unfoldings, or optimizes the factors with a fixed Tucker-rank
liu2013efficient ; liu2013tensor ; yang2013fixed ; ng2017adaptive ; gandy2011tensor ; signoretto2011tensor ; romera2013new ; kressner2014low ; mu2014square ; xu2013parallel ; liu2019low . However, the storage complexity of Tucker decomposition is $\mathcal{O}(NIR + R^N)$ for an $N$th-order tensor with Tucker-rank $R$, which still grows exponentially with the dimension. Tensor singular value decomposition (t-SVD) factorizes a
3-way tensor into two orthogonal tensors and an f-diagonal tensor kilmer2013third , and the tubal rank is defined as the number of non-vanishing singular tubes in the f-diagonal tensor. Minimization of the tubal rank is often used for tensor completion zhang2017exact ; zhang2014novel ; liu2016tensor . The storage of the low tubal-rank representation grows only linearly with the tubal rank of a third-order tensor. However, t-SVD can only deal with third-order data from a strict viewpoint, which seriously limits its application. In tensor train (TT) decomposition, a higher-order tensor is decomposed into a set of third-order core tensors with two border factor matrices oseledets2011tensor . The number of parameters in TT decomposition is $\mathcal{O}(NIR^2)$, where $R$ is the TT-rank. Completion through the two frameworks can be found in wang2016tensor ; yuan2017completion ; bengua2017efficient ; grasedyck2015variants . TT decomposition has some drawbacks: the intermediate ranks are much larger than the border ones, and it highly depends on the permutation of the tensor dimensions, which makes it hard to find the optimal representation. These shortcomings limit its practical applications wang2017efficient . The hierarchical Tucker (HT) decomposition factorizes a tensor like a tree in which the degree of each node is less than or equal to three kolda2009tensor . Its storage requirement is $\mathcal{O}(NIR + NR^3)$, where $R$ is the HT-rank. An HT-rank minimization model for tensor completion can be found in Liu2018Image . The recently proposed tensor ring (TR) decomposition represents a high-order tensor as a sequence of cyclically contracted third-order tensors zhao2017learning ; ye2018tensor . The cycle forces the TR-rank to be small and balanced, and TR is invariant under cyclic permutation zhao2017learning . The storage complexity is $\mathcal{O}(NIR^2)$, where $R$ is the TR-rank. wang2017efficient proposes an alternating fitting scheme that cyclically updates the TR-factors.
The authors report that it suffers from overfitting when the TR-rank is large and only a small number of observations are available. yu2018effective proposes a balanced matricization scheme that shows great improvement in time complexity and recovery performance over the methods in wang2017efficient . However, yu2018effective does not provide a theoretical analysis. On the other hand, it is not difficult to see that the rank of a TT-unfolding is given by the corresponding TT-rank, but what the rank of a TR-unfolding is poses a question. This lack of comprehension spurs us to investigate the TR decomposition more deeply. Our contributions are as follows:

We thoroughly analyze the TR decomposition and find that it is exactly a singular value decomposition under the assumption that the TR is (sub)critical. Following the analysis of matrix completion, we define the strong TR-incoherence condition and provide a theoretical guarantee for successful recovery. The study of the sample complexity of TR completion gives a sampling lower bound similar to that of matrix completion. The mathematical analysis shows that TR completion inherits the balance of matrix completion.

Based on these results, we propose a method for TR completion built on minimization of the sum of TR-ranks. We then relax the model using the nuclear norm as a surrogate and formulate its corresponding augmented Lagrangian function. The problem is solved by the alternating direction method of multipliers (ADMM). The experiments demonstrate the superiority of our method over the existing ones.
The remainder of this paper is organized as follows. Section 2 introduces the basic notations and preliminaries of TR decomposition. Section 3 provides the theoretical analyses of TR completion. The method for TR completion and its algorithmic analysis are discussed in Section 4. Section 5 exhibits the experimental results. Finally, we conclude our work in Section 6.
2 Notations and Preliminaries
2.1 Notations about TR decomposition
This subsection introduces some basic notations for tensors and the TR decomposition. A scalar, a vector, a matrix and a tensor are denoted by a normal letter, a boldface lowercase letter, a boldface uppercase letter and a calligraphic letter, respectively. Specifically, an $N$th-order tensor is denoted as $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, where $I_n$ is the size corresponding to mode $n$. An entry of $\mathcal{X}$ is denoted as $x_{i_1 i_2 \cdots i_N}$, where $i_n$ is the index of mode $n$ and $1 \leq i_n \leq I_n$. A mode-$n$ fiber of $\mathcal{X}$ is obtained by fixing every index but $i_n$, and a slice is the two-dimensional section obtained by fixing all but two indices.
We regard $\mathbf{I}$ as the identity matrix.
$\|\mathbf{X}\|$ denotes the spectral norm of $\mathbf{X}$, which is equal to its maximal singular value. The Frobenius norm of a tensor is defined as $\|\mathcal{X}\|_F = \sqrt{\langle \mathcal{X}, \mathcal{X} \rangle}$. The inner product of two tensors $\mathcal{X}$ and $\mathcal{Y}$ of the same size is defined as $\langle \mathcal{X}, \mathcal{Y} \rangle = \sum_{i_1, \dots, i_N} x_{i_1 \cdots i_N}\, y_{i_1 \cdots i_N}$. If $\mathbf{A} \in \mathbb{R}^{m \times n}$ and $\mathbf{B} \in \mathbb{R}^{p \times q}$, the Kronecker product is written as $\mathbf{A} \otimes \mathbf{B} \in \mathbb{R}^{mp \times nq}$, whose $(i, j)$th block is $a_{ij} \mathbf{B}$. The Hadamard product is an element-wise product, e.g., $(\mathbf{A} \circledast \mathbf{B})_{ij} = a_{ij} b_{ij}$ for matrices of the same size. The rank of a matrix is written as $\operatorname{rank}(\mathbf{X})$, where $\operatorname{rank}(\cdot)$ is the rank function. Mode-$n$ unfolding maps a tensor to a matrix $\mathbf{X}_{(n)} \in \mathbb{R}^{I_n \times \prod_{k \neq n} I_k}$ by rearranging the mode-$n$ fibers as the columns of the matrix. Mode-$n$ matricization unfolds a tensor along its first $n$ modes bengua2017efficient , i.e., $\mathbf{X}_{[n]} \in \mathbb{R}^{(I_1 \cdots I_n) \times (I_{n+1} \cdots I_N)}$.
In TR decomposition, the shifting matricization of the tensor $\mathcal{X}$ first permutes the tensor cyclically so that mode $n$ comes first and then performs matricization along the first $k$ modes; its row indices are thus $(i_n, \dots, i_{n+k-1})$ and its column indices are the remaining ones, with mode numbers taken modulo $N$.
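As a concrete illustration, the two unfoldings above can be sketched in a few lines of NumPy. The helper names `mode_unfold` and `shift_matricize` are ours, and the exact ordering of indices inside the rows and columns may differ from the paper's convention:

```python
import numpy as np

def mode_unfold(T, n):
    """Classical mode-n unfolding: mode-n fibers become the columns."""
    return np.moveaxis(T, n, 0).reshape(T.shape[n], -1)

def shift_matricize(T, n, k):
    """Shifting matricization: cyclically permute so mode n comes first,
    then flatten the first k modes into the rows."""
    N = T.ndim
    perm = [(n + i) % N for i in range(N)]
    Tp = np.transpose(T, perm)
    return Tp.reshape(int(np.prod(Tp.shape[:k])), -1)

T = np.arange(24).reshape(2, 3, 4)
print(mode_unfold(T, 1).shape)         # (3, 8)
print(shift_matricize(T, 1, 2).shape)  # (12, 2)
```

For a third-order tensor, shifting with $n = 0$ and $k = 1$ reduces to the classical mode-1 unfolding, which is a quick way to sanity-check the index bookkeeping.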
Let $\mathcal{G}_1, \dots, \mathcal{G}_N$ denote the factors of TR decomposition, with $\mathcal{G}_k \in \mathbb{R}^{R_k \times I_k \times R_{k+1}}$, and let the TR-rank be $(R_1, \dots, R_N)$, where $R_{N+1} = R_1$. Then the scalar form of TR decomposition is $x_{i_1 i_2 \cdots i_N} = \sum_{r_1, \dots, r_N} \prod_{k=1}^{N} \mathcal{G}_k(r_k, i_k, r_{k+1})$, with $r_{N+1} = r_1$. Equivalently, it can be represented in the more compact way $x_{i_1 i_2 \cdots i_N} = \operatorname{Tr}\left( \mathbf{G}_1(i_1) \mathbf{G}_2(i_2) \cdots \mathbf{G}_N(i_N) \right)$, where $\mathbf{G}_k(i_k) \in \mathbb{R}^{R_k \times R_{k+1}}$ is the $i_k$th mode-2 slice of core $\mathcal{G}_k$, and $\operatorname{Tr}(\cdot)$ is the trace of the matrix. A tensorized representation as a sum of outer products of the mode-2 fibers of the cores also exists.
We use the tensor connection product to contract several TR-cores into a new core; it multiplies the corresponding slices of consecutive cores and merges their middle modes.
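The trace-of-slices evaluation described above can be sketched directly in code. This is a minimal illustration of ours (the name `tr_to_tensor` is hypothetical, and the entry-wise loop favors clarity over speed):

```python
import numpy as np

def tr_to_tensor(cores):
    """Contract TR cores G_k of shape (R_k, I_k, R_{k+1}) into the full
    tensor via x(i_1,...,i_N) = Tr(G_1[:,i_1,:] ... G_N[:,i_N,:])."""
    shape = tuple(G.shape[1] for G in cores)
    X = np.zeros(shape)
    for idx in np.ndindex(*shape):
        M = np.eye(cores[0].shape[0])
        for G, i in zip(cores, idx):
            M = M @ G[:, i, :]   # multiply the mode-2 slices around the ring
        X[idx] = np.trace(M)     # the trace closes the ring
    return X

rng = np.random.default_rng(0)
ranks, dims = [2, 3, 2], [4, 5, 6]
cores = [rng.standard_normal((ranks[k], dims[k], ranks[(k + 1) % 3]))
         for k in range(3)]
X = tr_to_tensor(cores)
print(X.shape)  # (4, 5, 6)
```

In practice one would vectorize this contraction (e.g., with `numpy.einsum`); the loop above merely mirrors the scalar definition.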
2.2 Preliminaries
This subsection reviews the existing tensor rank minimization methods for tensor completion. At present, these approaches are based on unfolding matrix completion. The formulation for matrix completion is fazel2002matrix
$$\min_{\mathbf{X}} \ \operatorname{rank}(\mathbf{X}) \quad \text{s.t.} \quad \mathbf{X}_{\Omega} = \mathbf{M}_{\Omega}, \qquad (1)$$
where $\mathbf{M}$ is the partially observed matrix and $\Omega$ is the set of observed indices.
Owing to the fact that the rank of a matrix takes discrete values (its combinatorial nature bengua2017efficient ), problem (1) is NP-hard. As a result, a convex relaxation of problem (1) is often considered as an efficient surrogate candes2010power :
$$\min_{\mathbf{X}} \ \|\mathbf{X}\|_{*} \quad \text{s.t.} \quad \mathbf{X}_{\Omega} = \mathbf{M}_{\Omega}, \qquad (2)$$
in which $\|\mathbf{X}\|_{*} = \sum_{i} \sigma_{i}(\mathbf{X})$ is the nuclear norm of the matrix $\mathbf{X}$, and $\sigma_{i}(\mathbf{X})$ is its $i$th singular value.
Following this routine, the tensor rank (Tucker-rank) minimization is conventionally formulated as a weighted sum of mode-$n$ ranks liu2013tensor ; yang2013fixed
$$\min_{\mathcal{X}} \ \sum_{n=1}^{N} \alpha_{n} \operatorname{rank}(\mathbf{X}_{(n)}) \quad \text{s.t.} \quad \mathcal{X}_{\Omega} = \mathcal{T}_{\Omega}, \qquad (3)$$
whose solution is given by the following relaxation liu2013tensor ; yang2013fixed :
$$\min_{\mathcal{X}} \ \sum_{n=1}^{N} \alpha_{n} \|\mathbf{X}_{(n)}\|_{*} \quad \text{s.t.} \quad \mathcal{X}_{\Omega} = \mathcal{T}_{\Omega}, \qquad (4)$$
where $\mathbf{X}_{(n)}$ is the classical mode-$n$ unfolding and the weights $\alpha_{n}$ sum to one.
The TT-rank is defined as the ranks of the matricizations along the first $n$ modes, whose minimization is bengua2017efficient
$$\min_{\mathcal{X}} \ \sum_{n=1}^{N-1} \alpha_{n} \operatorname{rank}(\mathbf{X}_{[n]}) \quad \text{s.t.} \quad \mathcal{X}_{\Omega} = \mathcal{T}_{\Omega}. \qquad (5)$$
Problem (5) is solved via optimization of TT nuclear norms bengua2017efficient , whose representation is
$$\min_{\mathcal{X}} \ \sum_{n=1}^{N-1} \alpha_{n} \|\mathbf{X}_{[n]}\|_{*} \quad \text{s.t.} \quad \mathcal{X}_{\Omega} = \mathcal{T}_{\Omega}. \qquad (6)$$
As to completion through HT-rank minimization Liu2018Image , the HT-rank is represented by the ranks of the matricizations along the nodes of the tensor tree. The rank optimization can be characterized as
$$\min_{\mathcal{X}} \ \sum_{t=1}^{T} \alpha_{t} \operatorname{rank}(\mathbf{X}_{(v_t)}) \quad \text{s.t.} \quad \mathcal{X}_{\Omega} = \mathcal{T}_{\Omega}, \qquad (7)$$
if we disregard the total variation term, where $\mathbf{X}_{(v_t)}$ is the matricization along node $v_t$ and $T$ is the number of nodes. The convex relaxation of (7) is Liu2018Image
$$\min_{\mathcal{X}} \ \sum_{t=1}^{T} \alpha_{t} \|\mathbf{X}_{(v_t)}\|_{*} \quad \text{s.t.} \quad \mathcal{X}_{\Omega} = \mathcal{T}_{\Omega}. \qquad (8)$$
One may find that all the above tensor completions via rank optimization are based on particular unfolding matrices. Following this routine, yuan2018rank proposes a method that simultaneously optimizes the TR-factors and the predicted tensor by minimizing the sum of the nuclear norms of the factors' unfoldings. However, this method requires a predefined TR-rank, which causes an inconvenience similar to that of the ALS-based methods. In the next sections, we argue that there is no need to use all these latent factors directly and that there exists a more efficient way to deal with the TR completion task. We will show that the most informative TR-unfoldings are the balanced ones, which capture more global correlations than the others.
3 Sampling for LowRank TR Completion
This section develops the theory needed for the sampling scheme of TR completion. ye2018tensor classifies TR into three states: subcritical, critical and supercritical, distinguished by inequalities between the TR-rank and the tensor dimensions (with equality for the critical state). A supercritical TR can degenerate to a (sub)critical TR on condition that the mode fibers of the TR-factors are orthonormal. ye2018tensor states a proposition that a supercritical TR can be reduced to a (sub)critical one by a surjective birational map. So in this paper we focus on the study of the (sub)critical TR; in the following, a (sub)critical TR is abbreviated to TR for concision wherever it appears. Fig. 1 gives a conceptual exhibition of the TR-unfolding's rank derived from our analysis: if the TR-rank is $(R_1, \dots, R_N)$, the rank of the TR unfolding that separates modes $n, \dots, n+k-1$ from the rest is $R_n R_{n+k}$ (indices taken modulo $N$).
The following Lemma plays a key role in our analysis which provides a means of estimating the probability of inequality.
Lemma 1 (McDiarmid inequality warnke2016method ).
Let $X_1, \dots, X_n$ be independent random variables such that for all $i$, $X_i$ takes values in a set $\Lambda_i$. Let $f = f(X_1, \dots, X_n)$ be an arbitrary (implicit) function of the variables, e.g., the sum function; then for any $t > 0$ there is
$$\mathbb{P}\left( \left| f - \mathbb{E}[f] \right| \geq t \right) \leq 2 \exp\left( - \frac{2 t^2}{\sum_{i=1}^{n} c_i^2} \right), \qquad (9)$$
as long as this function changes in a bounded way, i.e., if $X_i$ is changed, the value of this function changes by at most $c_i$.
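A quick numerical sanity check of the bounded-differences bound (a simulation of ours, not taken from the paper): for a sum of $n$ independent Uniform$[0,1]$ variables, changing one variable moves the sum by at most $c_i = 1$, so the empirical tail probability should stay below the McDiarmid bound:

```python
import numpy as np

rng = np.random.default_rng(1)
n, t, trials = 50, 6.0, 20000

# f = sum of n independent Uniform[0,1] variables; changing one X_i
# moves f by at most c_i = 1, so sum c_i^2 = n and E[f] = n / 2
samples = rng.random((trials, n)).sum(axis=1)
empirical = np.mean(np.abs(samples - n / 2) >= t)
bound = 2 * np.exp(-2 * t**2 / n)
print(empirical <= bound)  # the McDiarmid bound should hold
```

The bound is loose for small deviations but captures the exponential decay of the tail, which is exactly what the incoherence argument below needs.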
Inspired by the strong incoherence condition in the matrix case, we build the strong TR-incoherence condition depicted in Definition 1, which is crucial for the exact recovery guarantee.
Definition 1 (Strong TRincoherence property).
An $N$th-order tensor obeys the strong TR-incoherence property with parameter $\mu$ if for any $i, j$,
(10) 
Theorem 1 (TR completion, uniformly bounded model).
Remark.
Note that when $N = 2$, the results of TR completion are consistent with those of matrix completion; one can regard the product of the two TR-rank entries as the matrix rank.
4 LowRank TR Completion Method
Throughout this paper, we consider the general tensor completion in the noiseless case. The first subsection will introduce our completion model. The algorithm is given in the second subsection and the third subsection contains the algorithmic analysis.
4.1 Balanced unfolding scheme for TR completion
Motivated by the result of Theorem 1, which implies that balanced unfoldings are easy to recover, a good strategy for improving the sample complexity is to choose the balanced splits using algebraic knowledge, though this choice mildly decreases the probability of successful recovery. To make the most of the abundant information hidden in the TR, a natural consideration is to incorporate all the balanced unfolding matrices. From another perspective, these balanced TR-unfoldings capture more global correlations than the unbalanced ones, according to the technique introduced in bengua2017efficient . Based on the conventional tensor rank minimization models liu2013tensor ; liu2015trace ; yang2013fixed ; zhang2017exact ; bengua2017efficient ; liu2016tensor , here we simply consider the TR completion model as the weighted sum of the balanced TR-unfoldings' ranks:
$$\min_{\mathcal{X}} \ \sum_{n=1}^{N} \alpha_{n} \operatorname{rank}\big(\mathbf{X}_{\langle n, \lceil N/2 \rceil \rangle}\big) \quad \text{s.t.} \quad \mathcal{X}_{\Omega} = \mathcal{T}_{\Omega}, \qquad (12)$$
where the subscript $\Omega$ denotes the observed entries. This model still cannot be solved in practice, owing to the NP-hardness caused by the fact that the rank of a matrix takes discrete values. One reasonable method is to resort to convex relaxation.
Since the nuclear norm is the tightest convex envelope of the matrix rank, we simplify the notation of the $n$th balanced unfolding to $\mathbf{X}_{\langle n \rangle}$ and derive the following relaxed model:
$$\min_{\mathcal{X}} \ \sum_{n=1}^{N} \alpha_{n} \|\mathbf{X}_{\langle n \rangle}\|_{*} \quad \text{s.t.} \quad \mathcal{X}_{\Omega} = \mathcal{T}_{\Omega}. \qquad (13)$$
The proposed model has two main advantages. From the model perspective, it has the lowest sample complexity. From an algorithmic point of view, it results in the lowest computational complexity.
One may argue that model (13) is somewhat similar to (3), (5) and (7), which merely seem to act on different unfolding matrices. Specifically, FP-LRTC yang2013fixed utilizes classical mode unfoldings and SiLRTC-TT bengua2017efficient adopts mode matricizations, while the shifting balanced unfoldings proposed in this paper are used instead. However, there is an essential difference behind this intuitive nuance. Concretely, model (13) is founded on the theoretical analysis proving that TR decomposition is equivalent to an SVD; model (3) is based on the low-rankness of Tucker decomposition; model (5) is valid owing to its capture of more global correlations than the conventional mode unfolding; and the effectiveness of model (7) stems from the low-rankness of the hierarchical Tucker decomposition.
In order to solve model (13), we introduce an auxiliary matrix for each nuclear-norm term to get rid of the interdependence, and derive the following model:
$$\min_{\mathcal{X}, \{\mathbf{M}_{n}\}} \ \sum_{n=1}^{N} \alpha_{n} \|\mathbf{M}_{n}\|_{*} \quad \text{s.t.} \quad \mathbf{M}_{n} = \mathbf{X}_{\langle n \rangle}, \ n = 1, \dots, N, \quad \mathcal{X}_{\Omega} = \mathcal{T}_{\Omega}, \qquad (14)$$
where $\mathbf{M}_{1}, \dots, \mathbf{M}_{N}$ are auxiliary matrix variables. The augmented Lagrangian function of (14) is
(15) 
and rewrite (15) in tensor form
(16) 
in which .
According to the framework of ADMM, the updating scheme is determined by
(17) 
where is the objective function in (16).
The details about the ADMM algorithm for TR rank minimization are summarized in Algorithm 1.
In Algorithm 1, $\mathcal{D}_{\tau}$ is a matrix shrinkage operator that truncates the singular value matrix by the threshold $\tau$; its definition is $\mathcal{D}_{\tau}(\mathbf{X}) = \mathbf{U} \operatorname{diag}\left( \max(\sigma_{i} - \tau, 0) \right) \mathbf{V}^{\top}$ ma2011fixed , where $\mathbf{X} = \mathbf{U} \operatorname{diag}(\sigma_{i}) \mathbf{V}^{\top}$ is the SVD; in scalar form each singular value becomes $\sigma_{i} - \tau$ if $\sigma_{i} > \tau$ and equals $0$ otherwise.
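The shrinkage operator admits a direct implementation. This is a generic singular value thresholding sketch of ours, not the paper's exact routine:

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding:
    D_tau(X) = U diag(max(sigma_i - tau, 0)) V^T."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s = np.maximum(s - tau, 0.0)      # soft-threshold each singular value
    return (U * s) @ Vt               # scale columns of U, then recombine

rng = np.random.default_rng(2)
X = rng.standard_normal((6, 4))
Y = svt(X, 0.5)
# each singular value of Y is the corresponding one of X shrunk by 0.5
print(np.linalg.svd(Y, compute_uv=False))
```

`svt` is the proximal operator of the nuclear norm, which is why it appears in every ADMM sub-step that minimizes a nuclear-norm term plus a quadratic penalty.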
4.2 Algorithmic complexity
For an $N$th-order hypercubic tensor, the complexity of the TRBU algorithm mainly comes from thresholding the unfoldings: every iteration contains $N$ soft thresholdings, each requiring an SVD of a balanced unfolding. To alleviate this computational burden, one strategy is to find a substitute for the computation of the SVD. References liu2016generalized and yang2013fixed mention that the Lanczos algorithm is a good counterpart, having linear time complexity in the matrix size; the overall computational complexity of TRBU is reduced accordingly.
Unlike the ALS method, which compresses a tensor into a TR with few parameters, the TRBU algorithm has a higher storage complexity, since the outcomes of the SVDs are stored instead.
4.3 Algorithmic convergence
The ADMM algorithm has a linear rate of convergence when one of the objective terms is strongly convex nishihara2015general . We adopt a rather simple but efficient strategy to improve convergence, provided by lin2010augmented , in which the penalty coefficient increases geometrically with the iterations, i.e., $\mu_{k+1} = \rho \mu_{k}$, where $\rho > 1$ is a constant.
5 Numerical Experiments
In this section, three groups of datasets are used for the tensor completion experiments: synthetic data, real-world images and videos. Seven algorithms are used to test the performance on real-world data: alternating least squares for low-rank tensor completion via tensor ring (TR-ALS) wang2017efficient , simple low-rank tensor completion via tensor train (SiLRTC-TT) bengua2017efficient , the high accuracy low-rank tensor completion algorithm (HaLRTC) liu2013tensor , low-rank tensor completion via tensor nuclear norm minimization (LRTC-TNN) lu2016libadmm , Bayesian CP factorization (FBCP) for image recovery zhao2015bayesian , smooth low-rank tensor tree completion (STTC) Liu2018Image and the proposed one. These methods come from different tensor decompositions, including the CP, Tucker, t-SVD, HT, TT and TR decompositions. All the algorithms are based on rank minimization except FBCP and TR-ALS. Note that FBCP is powerful among the CP-based methods, as it uses fully Bayesian inference to determine the CP-rank automatically, and the uncertainty of the latent factors is also taken into account
zhao2015bayesian . TR-ALS is selected for comparison because it is the first TR-based method. All the experiments are conducted in MATLAB 9.3.0 on a computer with a 2.8 GHz Intel Core i7 CPU and 16 GB of RAM. There are several evaluations of the quality of visual data. The relative error (RE), short for the root of the relative squared error, is a common indicator of recovery accuracy, defined as $\mathrm{RE} = \|\hat{\mathcal{T}} - \mathcal{T}\|_F / \|\mathcal{T}\|_F$, where $\mathcal{T}$ is the ground truth and $\hat{\mathcal{T}}$ is the recovered tensor. The second quality metric, the peak signal-to-noise ratio (PSNR), is the ratio between the maximum possible power of a signal and the power of the corrupting noise barnsley1993fractal . Given the ground truth $\mathcal{T}$ and the estimate $\hat{\mathcal{T}}$, the mean squared error (MSE) is defined as $\mathrm{MSE} = \|\hat{\mathcal{T}} - \mathcal{T}\|_F^2 / \#(\mathcal{T})$, and the PSNR (in dB) is defined as $\mathrm{PSNR} = 10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$, where $\mathrm{MAX}$ is the maximal pixel value ($255$ for the RGB images and videos) and $\#(\cdot)$ represents the number of elements in a set. A higher PSNR usually indicates a higher-quality reconstruction. The third assessment is the structural similarity index (SSIM), which measures the similarity between the recovered image and the original image wang2003multi . It is calculated on various windows of an image. The measure between two windows $x$ and $y$ of common size is $\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)}$, where $\mu_x$ and $\mu_y$ are the averages of $x$ and $y$, $\sigma_x^2$ and $\sigma_y^2$ are the variances of $x$ and $y$, $\sigma_{xy}$ is their covariance, and $c_1 = (k_1 L)^2$ and $c_2 = (k_2 L)^2$ are two variables stabilizing the denominator (default values $k_1 = 0.01$ and $k_2 = 0.03$), with $L$ the dynamic range of the pixel values. The last metric, quantifying the algorithmic complexity, is the computational CPU time (in seconds). The sampling ratio (SR) is defined as the ratio of the number of sampled entries to the number of elements in the tensor, noted as $\mathrm{SR} = |\Omega| / \#(\mathcal{T})$.
For fair comparisons, the parameters in each algorithm are tuned to give optimal performance. In our algorithm, the weight parameters are set uniformly. Convergence is determined by the relative change (RC) between successive iterates falling below a fixed tolerance, and the maximal number of iterations is fixed in advance.
In the rest of this section, we firstly verify the theoretic analysis using synthetic data. Then the experiments on realworld data including image and video are used to test the performance of the proposed TRBU algorithm and other methods.
5.1 Synthetic data
We first consider a tensor whose TR-rank entries are distinct prime numbers, so that every product of two entries is different from the others and successful identification is guaranteed. The tensor is generated by TR decomposition, where the core tensors are filled with i.i.d. standard Gaussian random variables. Subsequently, it is unfolded into different matrices, and we calculate their theoretical ranks and real ranks to validate the conclusion about the unfolding's rank. The results are given in Table 1, which shows the correctness of the theoretical analysis.
Table 1: Theoretical ranks and real ranks of the unfolding matrices.
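The rank claim verified in Table 1 can be reproduced numerically. In this sketch of ours the tensor sizes and TR-ranks are illustrative; the rank of an unfolding should equal the product of the two bond dimensions that the split cuts:

```python
import numpy as np

rng = np.random.default_rng(3)
r = [2, 3, 2, 3]          # TR-rank (R_1, R_2, R_3, R_4), with R_5 = R_1
I = [8, 8, 8, 8]
G = [rng.standard_normal((r[k], I[k], r[(k + 1) % 4])) for k in range(4)]

# contract the ring: X(i,j,k,l) = Tr(G1[:,i,:] G2[:,j,:] G3[:,k,:] G4[:,l,:])
X = np.einsum('aib,bjc,ckd,dla->ijkl', *G)

# balanced unfolding with rows (i1,i2): the split cuts bonds R_1 and R_3
M = X.reshape(I[0] * I[1], I[2] * I[3])
print(np.linalg.matrix_rank(M))   # expected R_1 * R_3 = 4

# shifted balanced unfolding with rows (i2,i3): cuts bonds R_2 and R_4
M2 = np.transpose(X, (1, 2, 3, 0)).reshape(I[1] * I[2], I[3] * I[0])
print(np.linalg.matrix_rank(M2))  # expected R_2 * R_4 = 9
```

With generic Gaussian cores the rank bound is attained with probability one, which is why a single random draw suffices for the check.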
To test Theorem 1, we simulate another two tensors with prescribed TR-ranks, whose core-tensor entries are sampled independently from the standard normal distribution. Their SRs range over a grid with linear increments. For each tensor and each SR, we run the TRBU algorithm repeatedly to recover its unfolding matrices. The averaged results are shown in (a) and (b) of Fig. 2, which gives the recovery probabilities with respect to the various SRs. In this experiment, a recovery is considered successful when the RE falls below a preset threshold. It can be seen from the results that a balanced unfolding matrix is easier to recover than an unbalanced one, and the more balanced the unfolding, the easier its completion. This verifies the correctness of Theorem 1. In order to verify the exact recovery guarantee in Theorem 1, we randomly generate a set of tensors by contracting cores whose entries are sampled from an i.i.d. standard normal distribution. The degrees of freedom (df) of a balanced TR-unfolding and of the TR itself can be counted from the ranks and mode sizes. We vary the TR-rank so that the sampling margin is ensured to be positive. For each grid point, we repeat the completion several times. The phase transition of this tensor completion is given in Fig. 2(c), in which a recovery is regarded as a successful completion if the RE falls below the threshold. In Fig. 2(c), a large region is successfully recovered, which is a convincing validation of the recovery guarantee in Theorem 1. The phenomenon in Fig. 2(c) also implies that there is a tighter bound for TR completion, since the starting point of our bound is the TR-unfolding instead of the TR itself. Finally, Fig. 2(d) gives the logarithm of the RE as the SR increases linearly. Note that the tensor for TRBU is generated by TR decomposition and the tensor for SiLRTC-TT is generated by TT decomposition; both tensors have the same size. The results are averaged over repeated trials. TRBU always has a lower RE than SiLRTC-TT. With increasing SR, the RE of TRBU drops drastically while that of SiLRTC-TT decreases slowly, which demonstrates the inefficiency of SiLRTC-TT. The results also indicate the correctness of Theorem 1, as it asserts that the most “useful” unfolding matrices are the balanced ones. Many unfoldings of TT have “low values”, which results in worse performance compared with the usage of balanced TR-unfoldings.
5.2 Color images
Eight RGB images are tested in this subsection's experiments, including “kodim04” (http://r0k.us/graphics/kodim04.html), “peppers”, “sailboat”, “lena”, “barbara”, “house”, “airplane” and “Einstein” wang2017efficient . The original images are shown in Fig. 3. The displayed result for each image is an average over repeated runs (not the best ones). The SRs for all the considered images range over the same interval. In image recovery, the algorithm parameters are set as described above.
The visual data tensorization (VDT) method introduced in bengua2017efficient and latorre2005image can improve the performance, since a higher-order tensor makes it more efficient to exploit the local structure of the original tensor and, if a tensor is only slightly correlated, the tensorized one is more likely to have low rank bengua2017efficient . VDT first transforms an image into a real ket of a Hilbert space by casting the image to a higher-order tensor with an appropriate block-structured addressing. Then VDT permutes and reshapes the resulting tensor into a higher-order tensor with small mode sizes. Using MATLAB notation, the first image is reshaped and permuted in this way; the second to the seventh images are first reshaped into higher-order tensors and then reordered; and the last image is handled analogously during the VDT processing. Note that the tensorizations are manually chosen, and different choices will lead to different results.
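A minimal VDT/ket-augmentation sketch for a square image with power-of-two sides (our illustration; the paper's exact block addressing and final mode sizes may differ, and MATLAB's column-major ordering would change the index bookkeeping):

```python
import numpy as np

def vdt(img, n):
    """Ket-augmentation sketch: reshape a 2^n x 2^n x 3 image into an
    n-th-order tensor of size 4 x ... x 4 x 3 by interleaving the binary
    row/column addresses (block-structured addressing)."""
    T = img.reshape((2,) * n + (2,) * n + (3,))
    # interleave row bit k with column bit k, coarsest blocks first
    perm = [b for k in range(n) for b in (k, n + k)] + [2 * n]
    T = np.transpose(T, perm)
    return T.reshape((4,) * n + (3,))

img = np.random.default_rng(4).random((256, 256, 3))   # 256 = 2^8
print(vdt(img, 8).shape)   # (4, 4, 4, 4, 4, 4, 4, 4, 3)
```

Each size-4 mode corresponds to one spatial scale (a 2x2 block of the scale above it), which is what lets the tensorized image expose multi-scale correlations to the completion algorithm.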
After the VDT manipulation, we compare the proposed method with the state-of-the-art algorithms. The recovery results (REs, PSNRs, SSIMs and CPU times) of the algorithms, averaged over repetitions, are exhibited in Fig. 4. The FBCP method needs a predefined maximal CP-rank, which severely constrains the computational resources: the given maximum CP-ranks for “kodim04” and “Einstein” must be limited, otherwise the algorithm runs out of memory. The TR-ranks for all images are kept the same in TR-ALS. The performance of TR-ALS does not improve much with increasing SR because of the fixed TR-rank. However, the computational complexity of TR-ALS (one can refer to wang2017efficient ) increases rapidly with a linearly enlarged TR-rank, which results in a large amount of time and greatly limits the performance of TR-ALS. From these results, the TRBU method prevails over all the other tested algorithms on all evaluation metrics except CPU time, which suggests the efficiency of the proposed method. Aside from the high CPU time of TR-ALS, its other three indicators are also much worse than TRBU's. This implies an advantage of TRBU: it considers the completion problem from an information-theoretic point of view. Although the closed loop of TR increases the difficulty of TR-rank minimization, it provides richer information than other tensor decomposition formats, which is a main reason for the better performance of TRBU.
Besides, we use a text-masked “house” image and a palm-masked “llama” image to test the seven methods' performance under non-uniform sampling. The maximal CP-ranks for the two images are set to the same value. The corresponding results are presented in Fig. 5; it can be concluded that TRBU is still superior to the other six methods in terms of recovery quality, i.e., RE, PSNR and SSIM, while keeping an acceptable CPU time.
5.3 Realworld videos
In this group of experiments, two videos are used to test the algorithms, and each video recovery is repeated several times. The first is a color video called “explosion” (http://www.newcger.com/shipinsucai/5786.html). We use the frames from the 81st to the 180th; each frame is downsampled, and the whole video is reshaped into a higher-order tensor by the VDT method. The second is a color video named “cock” (https://pixabay.com/videos/id10685/) downloaded from the web. We downsample each frame and finally obtain the input tensor. The SRs of the two videos are identical, and the other parameters are set as in the image experiments. The TR-ranks for the two videos are set equal in TR-ALS.
Fig. 6 gives the completion results of the seven methods. The maximal CP-ranks are limited in these completions, which may deteriorate the performance of FBCP; this also reflects the disadvantage of FBCP's huge storage requirement. TR-ALS cannot effectively handle these resolutions, as the contractions of the core tensors and the calculations of the inverse matrices cost too much time when the tensor is large. Among all the methods, the proposed TRBU has much better recovery quality in both video recovery experiments.
6 Conclusion
We rigorously study the TR decomposition and propose a sampling condition for TR completion. Based on this sampling scheme, we use the balanced TR-unfoldings to build a weighted sum of nuclear norms minimization model for tensor completion, and solve it with an ADMM-based TRBU algorithm. The numerical experiments demonstrate the improvement in recovery quality achieved by the proposed method.
Appendix A Proof of Definition 1
Proof.
Consider the core tensor , according to the identity , there is , suppose that . Let and , obviously if and, if we have .
The proof of the assertion (Definition 1) is as follows. From the above deductions it is clear that . According to the union bound , the bound of is . Incorporating Lemma 1, we have ; letting be a proportion of , we prove (10) with probability at least (say). Additionally, there is .
Note that the above result is only for one core of the TR; the total probability is obtained by accounting for all cores.
Appendix B Proof of Theorem 1
Proof.
Consider a generic (sub)critical TR representation . Note that every mode slice of core has the same status when interacting with and , so a substitution for the aforementioned representation is , where holds for all mode slices of . We use matrix to denote the slice of for convenience.
Suppose an $N$th-order tensor is considered, and we first aim to calculate the SVD representation of the TR unfolding. For simplicity, we denote by the th mode slice of and the th mode slice of , respectively. Consequently the norm of the mode fiber of is determined by , using the orthonormality of and . Thus the norm of the mode fibers of is , and the reformulation of the unfolding is , where is the slice-wise matrix product acting on corresponding mode slices, and the operators and unfold a TR’s core tensor into a matrix with permuted order and . Thus , where , and
here the operator rearranges a matrix into a vector column by column, and rearranges a matrix into a vector row by row. Concerning the rank of , we have .
The next step is to verify the orthogonality of the bases. We use a modified version of the previous representation , where or , which means the two pairs of slices cannot be the same at the same time. With this expression it is clear that both and are orthogonal.
Generally there are , where , and . The rank is given by .
To calculate the norm of the mode fiber of