1 Introduction
Tensors are high-order generalizations of vectors and matrices and are well suited to natural data that are inherently multidimensional. For example, an RGB image can be represented as a three-way tensor (height × width × channel), and a video sequence can be represented as a four-way tensor (height × width × channel × time). When the original data are flattened into a matrix or a vector, the structural information and the adjacency relations of the data are lost. A tensor is a natural representation that retains the high-dimensional structure of the data. In recent decades, tensor methodologies have attracted a lot of interest and have been applied to various fields such as image and video completion [1, 2], signal processing [3, 4], brain-computer interfaces [5], and image classification [6, 7]. Many theories, algorithms and applications of tensor methods have been proposed and studied; see the comprehensive review [8].
Most tensor decomposition methods assume that the tensor is complete, with no missing entries. In practice, however, transmission or device problems can leave the collected data with missing or unknown entries. High-order tensor decomposition/factorization with missing entries is therefore significant and has promising applications. The goal of tensor decomposition of missing data is to find the latent factors of the observed tensor, which can then be used to predict the missing entries. The two most popular tensor decomposition methods in recent years are CANDECOMP/PARAFAC (CP) decomposition [9, 10] and Tucker decomposition [11]. Many methods use CP decomposition to complete data with missing entries. CP weighted optimization (CP-WOPT) [1] uses optimization to find the optimal CP factor matrices from the observed data. Bayesian CP factorization [2] exploits a Bayesian probabilistic model to determine the CP rank automatically while finding the best factor matrices. The method in [12] recovers low-n-rank tensor data through a convex relaxation solved by the alternating direction method of multipliers (ADM).
However, owing to the nature of the CP and Tucker models, they reach relatively high accuracy only for low-order tensors. For tensors of very high order, their performance in missing data completion decreases rapidly. As mentioned above, much natural data is originally in the form of a high-order tensor, so models that are not sensitive to dimensionality should be used for the decomposition. In this paper, we use tensor-train decomposition [13], which is free from the curse of dimensionality, to perform tensor data completion. Our contributions are as follows: (a) We develop an optimization algorithm named tensor-train weighted optimization (TT-WOPT) to find the factor core tensors of the tensor-train decomposition. (b) With the TT-WOPT algorithm, the tensor-train decomposition model is applied to incomplete tensor data; the factor core tensors are then calculated and used to predict the missing entries of the original data. (c) We conduct simulation experiments to verify the accuracy of our algorithm and compare it with other algorithms. In addition, we carry out several real-world experiments by applying our algorithm and other state-of-the-art algorithms to a set of images with missing entries. The experimental results show that our method performs better in image inpainting than other state-of-the-art approaches. Moreover, by converting the image to a much higher-order tensor, our algorithm can successfully recover images with 99% missing entries, while the other algorithms fail at this missing rate. These results demonstrate that tensor-train decomposition with high-order tensorization can achieve high compression and representation ability.
2 Notations and Tensor-train Decomposition
2.1 Notations
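In this paper, vectors are denoted by boldface lowercase letters, e.g., $\mathbf{x}$. Matrices are denoted by boldface capital letters, e.g., $\mathbf{X}$. Tensors of order $N \geq 3$ are denoted by Euler script letters, e.g., $\mathcal{X}$. $\mathbf{X}^{(n)}$ denotes the $n$th matrix of a matrix sequence, and sequences of vectors and tensors are denoted in the same way. When the tensor $\mathcal{X}$ is in the space $\mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, $\mathbf{X}_{(n)} \in \mathbb{R}^{I_n \times I_1 \cdots I_{n-1} I_{n+1} \cdots I_N}$ denotes the $n$-mode matricization of $\mathcal{X}$, see [8]. The $(i_1, i_2, \ldots, i_N)$th element of $\mathcal{X}$ is denoted by $x_{i_1 i_2 \cdots i_N}$ or $\mathcal{X}(i_1, i_2, \ldots, i_N)$.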
2.2 Tensor-train Decomposition
The most important feature of tensor-train decomposition is that, no matter how high the order of a tensor is, it decomposes the tensor into a sequence of three-way tensors. This is a great advantage in modeling high-order tensors because the number of model parameters does not grow exponentially with the tensor order. For example, the number of parameters in the Tucker model is $\mathcal{O}(NIR + R^N)$, where $N$ is the number of dimensions, $R$ is the size of the Tucker core tensor and $I$ is the size of each dimension of the tensor. For tensor-train decomposition, the number of parameters is $\mathcal{O}(NIr^2)$, where $r$ is the rank of the TT-tensor. Therefore, the TT model needs far fewer model parameters than the Tucker model.
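As a rough illustration of this comparison, the sketch below counts parameters under two assumptions that are made here for concreteness: a Tucker model with $N$ factor matrices of size $I \times R$ plus an $R^N$ core, and a TT model whose cores have size $r \times I \times r$ except for the two boundary cores, whose outer rank is 1. The function names are hypothetical.

```python
def tucker_param_count(N, I, R):
    """Parameters of a Tucker model: N factor matrices (I x R) plus an R^N core."""
    return N * I * R + R ** N


def tt_param_count(N, I, r):
    """Parameters of a TT model with all interior TT-ranks equal to r."""
    if N == 1:
        return I  # a single 1 x I x 1 core
    boundary = 2 * I * r            # first (1 x I x r) and last (r x I x 1) cores
    interior = (N - 2) * I * r * r  # remaining r x I x r cores
    return boundary + interior


if __name__ == "__main__":
    # The Tucker count grows exponentially with the order, the TT count linearly.
    for order in (3, 7, 15):
        print(order, tucker_param_count(order, 4, 4), tt_param_count(order, 4, 4))
```

With the mode size and ranks held fixed, only the Tucker core term `R ** N` depends exponentially on the order, which is the effect the paragraph above describes.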
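Tensor-train decomposition decomposes a tensor into a sequence of tensor cores, all of which are three-way tensors. In particular, the TT decomposition of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ is expressed as follows:
$$\mathcal{X} = \ll \mathcal{G}^{(1)}, \mathcal{G}^{(2)}, \ldots, \mathcal{G}^{(N)} \gg, \qquad (1)$$
where $\mathcal{G}^{(1)}, \mathcal{G}^{(2)}, \ldots, \mathcal{G}^{(N)}$ is a sequence of three-way tensor cores with sizes $1 \times I_1 \times r_1,\; r_1 \times I_2 \times r_2,\; \ldots,\; r_{N-1} \times I_N \times 1$. The sequence $\{1, r_1, r_2, \ldots, r_{N-1}, 1\}$ is named the TT-ranks, which limit the size of every core tensor. Each element of tensor $\mathcal{X}$ can be written in the following index form:
$$x_{i_1 i_2 \cdots i_N} = \mathbf{G}^{(1)}_{i_1} \mathbf{G}^{(2)}_{i_2} \cdots \mathbf{G}^{(N)}_{i_N}, \qquad (2)$$
where $\mathbf{G}^{(n)}_{i_n} \in \mathbb{R}^{r_{n-1} \times r_n}$ is the $i_n$th slice of the $n$th core tensor. See the concept of slice in [8].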
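The two views of a TT-tensor, the full contraction of all cores and the element-wise product of core slices, can be checked against each other numerically. The following NumPy sketch assumes each core is stored as an array of shape $(r_{n-1}, I_n, r_n)$ with boundary ranks equal to 1; the helper names are hypothetical.

```python
import numpy as np


def tt_to_full(cores):
    """Contract TT cores of shape (r_{n-1}, I_n, r_n) into the full tensor."""
    full = cores[0]  # shape (1, I_1, r_1)
    for core in cores[1:]:
        # Sum over the rank index shared by neighbouring cores.
        full = np.tensordot(full, core, axes=([-1], [0]))
    # Drop the two boundary rank axes of size 1.
    return full.reshape(full.shape[1:-1])


def tt_element(cores, index):
    """Evaluate one entry as a product of core slices (one matrix per mode)."""
    out = np.eye(1)
    for core, i in zip(cores, index):
        out = out @ core[:, i, :]  # (1 x r_{n-1}) times (r_{n-1} x r_n)
    return out[0, 0]


def random_tt_cores(dims, ranks, seed=0):
    """Random cores; `ranks` lists the interior TT-ranks r_1, ..., r_{N-1}."""
    rng = np.random.default_rng(seed)
    r = [1] + list(ranks) + [1]
    return [rng.standard_normal((r[n], dims[n], r[n + 1]))
            for n in range(len(dims))]
```

For example, with `cores = random_tt_cores((3, 4, 2, 5), (2, 3, 2))`, the value `tt_to_full(cores)[2, 1, 0, 4]` agrees with `tt_element(cores, (2, 1, 0, 4))` up to floating-point rounding.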
Currently, there are few studies on how to compute TT-ranks efficiently. In [13], where tensor-train decomposition is proposed, the author presents an algorithm named TT-SVD to calculate the core tensors and TT-ranks. Although it has the advantages of high accuracy and high efficiency, the TT-ranks of the middle core tensors must be very high to compensate for the low TT-ranks of the border core tensors, which leads to an unbalanced distribution of TT-ranks and redundant model parameters. Therefore, the TT-ranks calculated by TT-SVD may not be optimal. In this paper, we manually set the TT-ranks to a smooth distribution and use the TT-WOPT algorithm to calculate the core tensors. Although we do not have a good strategy for choosing TT-ranks, far fewer model parameters are needed, and the simulation and experimental results still show high accuracy and performance.
3 TT-WOPT Algorithm
Most tensor decomposition methods for finding latent factors are designed for fully observed data, so they cannot be used to predict missing entries. The weighted optimization method instead minimizes the distance between the weighted real data and the weighted model, so that only the observed entries contribute to the error. Once the optimization has finished, the obtained decomposition factors match the observed data well, and they can be converted back to the original data structure to predict the missing entries.
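In our algorithm, TT-WOPT is applied to a real-valued tensor $\mathcal{T} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ with missing entries. The indices of the missing entries are recorded by a weight tensor $\mathcal{W}$ of the same size as $\mathcal{T}$. Every entry of $\mathcal{W}$ satisfies:
$$w_{i_1 i_2 \cdots i_N} = \begin{cases} 0 & \text{if } t_{i_1 i_2 \cdots i_N} \text{ is a missing entry}, \\ 1 & \text{if } t_{i_1 i_2 \cdots i_N} \text{ is an observed entry}. \end{cases} \qquad (3)$$
In the optimization algorithm, the objective variables are the elements of all the core tensors. Define $\mathcal{Y} = \mathcal{W} * \mathcal{T}$ and $\mathcal{Z} = \mathcal{W} * \ll \mathcal{G}^{(1)}, \mathcal{G}^{(2)}, \ldots, \mathcal{G}^{(N)} \gg$ ($*$ is the Hadamard product, see [8]); then the objective function can be written as:
$$f(\mathcal{G}^{(1)}, \mathcal{G}^{(2)}, \ldots, \mathcal{G}^{(N)}) = \frac{1}{2} \| \mathcal{Y} - \mathcal{Z} \|_F^2. \qquad (4)$$
The relation between the tensor $\mathcal{X}$ and its core tensors can be expressed by the following equation [14]:
$$\mathbf{X}_{(n)} = \mathbf{G}^{(n)}_{(2)} \left( \mathbf{G}^{>n}_{(1)} \otimes \mathbf{G}^{<n}_{(n)} \right), \qquad (5)$$
where for $n = 1, \ldots, N$,
$$\mathcal{G}^{>n} = \ll \mathcal{G}^{(n+1)}, \mathcal{G}^{(n+2)}, \ldots, \mathcal{G}^{(N)} \gg \in \mathbb{R}^{r_n \times I_{n+1} \times \cdots \times I_N}, \qquad (6)$$
$$\mathcal{G}^{<n} = \ll \mathcal{G}^{(1)}, \mathcal{G}^{(2)}, \ldots, \mathcal{G}^{(n-1)} \gg \in \mathbb{R}^{I_1 \times \cdots \times I_{n-1} \times r_{n-1}}, \qquad (7)$$
with $\mathcal{G}^{>N} = \mathcal{G}^{<1} = 1$.
For $n = 1, \ldots, N$, the partial derivatives of the objective function w.r.t. the $n$th core tensor are given by:
$$\frac{\partial f}{\partial \mathbf{G}^{(n)}_{(2)}} = \left( \mathbf{Z}_{(n)} - \mathbf{Y}_{(n)} \right) \left( \mathbf{G}^{>n}_{(1)} \otimes \mathbf{G}^{<n}_{(n)} \right)^{T}. \qquad (8)$$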
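The weight tensor and the weighted objective are simple to evaluate directly. The NumPy sketch below is illustrative only: it assumes a 0/1 mask drawn uniformly at random, cores stored with shape $(r_{n-1}, I_n, r_n)$, and finite placeholder values (for example zeros) at the missing positions of the data tensor. All function names are hypothetical.

```python
import numpy as np


def random_weight_tensor(shape, missing_rate, seed=0):
    """0/1 weight tensor with roughly `missing_rate` of the entries set to 0."""
    rng = np.random.default_rng(seed)
    return (rng.random(shape) >= missing_rate).astype(float)


def tt_to_full(cores):
    """Contract TT cores of shape (r_{n-1}, I_n, r_n) into the full tensor."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.reshape(full.shape[1:-1])


def weighted_objective(cores, T, W):
    """f = 0.5 * ||Y - Z||_F^2 with Y = W * T and Z = W * (TT approximation)."""
    Y = W * T                  # observed data, missing entries zeroed
    Z = W * tt_to_full(cores)  # model restricted to the observed entries
    return 0.5 * np.sum((Y - Z) ** 2)
```

Because both terms are multiplied by the same mask, whatever value is stored at a missing position of `T` has no effect on the objective.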
After the objective function and its gradient are obtained, the optimization problem can be solved by any gradient-based optimization algorithm [15]. The optimization procedure is listed in Algorithm 1.
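Algorithm 1 Tensor-train Weighted Optimization (TT-WOPT)
Input: an $N$-way incomplete tensor $\mathcal{T}$ and a weight tensor $\mathcal{W}$.
Initialization: core tensors $\mathcal{G}^{(1)}, \mathcal{G}^{(2)}, \ldots, \mathcal{G}^{(N)}$ of tensor $\mathcal{X}$.
1. Compute $\mathcal{Y} = \mathcal{W} * \mathcal{T}$.
For each optimization iteration,
2. Compute $\mathcal{Z} = \mathcal{W} * \ll \mathcal{G}^{(1)}, \mathcal{G}^{(2)}, \ldots, \mathcal{G}^{(N)} \gg$.
3. Compute objective function: $f = \frac{1}{2} \| \mathcal{Y} - \mathcal{Z} \|_F^2$.
4. Compute all $\partial f / \partial \mathbf{G}^{(n)}_{(2)}$, $n = 1, \ldots, N$.
5. Use optimization algorithm to update $\mathcal{G}^{(1)}, \mathcal{G}^{(2)}, \ldots, \mathcal{G}^{(N)}$.
Until reach optimization stopping condition.
Return core tensors $\mathcal{G}^{(1)}, \mathcal{G}^{(2)}, \ldots, \mathcal{G}^{(N)}$.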
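The procedure can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the update step is plain gradient descent with a backtracking line search (any gradient-based optimizer could be substituted), the gradient of each core is obtained by contracting the masked residual with the cores to its left and right rather than through explicit matricizations, and the initialization scale, iteration count and function names are hypothetical.

```python
import numpy as np


def tt_to_full(cores):
    """Contract TT cores of shape (r_{n-1}, I_n, r_n) into the full tensor."""
    full = cores[0]
    for core in cores[1:]:
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full.reshape(full.shape[1:-1])


def objective_and_grads(cores, Y, W):
    """Return f = 0.5 * ||Y - Z||_F^2 and its gradient w.r.t. every core."""
    E = W * tt_to_full(cores) - Y      # Z - Y
    f = 0.5 * np.sum(E ** 2)
    E = W * E                          # df/dX; equals Z - Y for a 0/1 mask
    grads = []
    for n in range(len(cores)):
        # Product of the cores left of n, flattened to (I_1...I_{n-1}, r_{n-1}).
        left = np.ones((1, 1))
        for c in cores[:n]:
            left = np.tensordot(left, c, axes=([-1], [0])).reshape(-1, c.shape[2])
        # Product of the cores right of n, flattened to (r_n, I_{n+1}...I_N).
        right = np.ones((1, 1))
        for c in reversed(cores[n + 1:]):
            right = np.tensordot(c, right, axes=([-1], [0])).reshape(c.shape[0], -1)
        E3 = E.reshape(left.shape[0], -1, right.shape[1])
        grads.append(np.einsum("la,lir,br->aib", left, E3, right))
    return f, grads


def tt_wopt(T, W, ranks, iters=300, seed=0):
    """Gradient descent with backtracking; `ranks` are the interior TT-ranks."""
    rng = np.random.default_rng(seed)
    r = [1] + list(ranks) + [1]
    cores = [0.5 * rng.standard_normal((r[n], T.shape[n], r[n + 1]))
             for n in range(T.ndim)]
    Y = W * T                                        # step 1
    f, grads = objective_and_grads(cores, Y, W)      # steps 2-4
    history, step = [f], 1.0
    for _ in range(iters):
        g2 = sum(np.sum(g ** 2) for g in grads)
        f_new = np.inf
        while step > 1e-12:                          # step 5: backtracking update
            trial = [c - step * g for c, g in zip(cores, grads)]
            f_new, g_new = objective_and_grads(trial, Y, W)
            if f_new <= f - 1e-4 * step * g2:        # sufficient decrease
                break
            step *= 0.5
        if not f_new < f:                            # stopping condition
            break
        cores, f, grads = trial, f_new, g_new
        history.append(f)
        step *= 2.0                                  # try a longer step next time
    return cores, history
```

After `cores, history = tt_wopt(T, W, ranks)`, the completed tensor is `tt_to_full(cores)`, and its values at positions where `W` is zero are the predictions for the missing entries. The entries of `T` at those positions must be finite placeholders, since a NaN would propagate through `W * T`.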
4 Experiments
In [1], where the CP-WOPT method is proposed, only three-way data are tested. For high-order data, the performance of CP-WOPT drops. This is due not to the optimization method but to the inherent limitation of CP decomposition. In this paper, we test TT-WOPT on synthetic data of different orders and then on real-world image data. We also compare the performance of TT-WOPT with several state-of-the-art methods. The weight tensor is created by randomly setting a given percentage of entries to zero while the remaining entries are one.
4.1 Simulation Data
We use synthetic data to validate the effectiveness of our algorithm. So far there are few studies on applying tensor-train decomposition to data completion, so we compare our algorithm with two other state-of-the-art methods: CP weighted optimization (CP-WOPT) [1] and Fully Bayesian CP Factorization (FBCP) [2]. We randomly initialize the factor matrices of a tensor with a specified CP rank and then create the synthetic data from the factor matrices. As the evaluation index, we use the relative square error (RSE), defined as $RSE = \| \mathcal{T}_{real} - \mathcal{X} \|_F / \| \mathcal{T}_{real} \|_F$, where $\mathcal{T}_{real}$ is the tensor of full entries generated by core tensors or factor matrices. Table 1 shows the simulation results of a three-way tensor and a seven-way tensor. The tensor sizes of synthetic data are and , and the CP ranks are set to 10 in both cases.
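The data generation and the error measure can be sketched as follows. The code assumes that RSE is the Frobenius norm of the error divided by the Frobenius norm of the ground truth, and that the CP factor matrices are stored with shape $(I_n, R)$; the sizes and rank in the usage note are arbitrary illustrations, not the settings of the experiment, and the function names are hypothetical.

```python
import numpy as np


def cp_to_full(factors):
    """Build a full tensor from CP factor matrices of shape (I_n, R)."""
    full = factors[0]  # shape (I_1, R)
    for A in factors[1:]:
        # Append one mode while keeping the shared rank index last.
        full = np.einsum("...r,ir->...ir", full, A)
    # Sum the R rank-one terms.
    return full.sum(axis=-1)


def rse(T_real, X):
    """Relative square error between ground truth T_real and estimate X."""
    return np.linalg.norm(T_real - X) / np.linalg.norm(T_real)
```

For instance, `cp_to_full([rng.standard_normal((I, 3)) for I in (5, 6, 7)])` with `rng = np.random.default_rng(0)` produces a 5 × 6 × 7 tensor of CP rank at most 3, and `rse(T_real, X)` is zero exactly when the estimate reproduces every entry, including the ones that were masked out during fitting.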
Although the three algorithms are tested on data generated by the CP model, our TT-WOPT algorithm shows good results. As Table 1 shows, on the three-way tensor TT-WOPT fits better than CP-WOPT and FBCP at low missing rates but is slightly weaker at high missing rates. On the seven-way tensor, however, TT-WOPT outperforms the other two algorithms. We also find that the performance of TT-WOPT is sensitive to the setting of the TT-ranks: different TT-ranks lead to very different model accuracies. It should be noted that there is so far no good strategy for setting TT-ranks, so in our experiments we set all TT-ranks to the same value. This is an aspect in which our algorithm needs to improve. Furthermore, the initial values of the core tensors also influence the performance of TT-WOPT.
Table 1. Simulation results for the three-way tensor and the seven-way tensor at missing rates of 0%, 50% and 95%; rows: TT-WOPT, CP-WOPT, FBCP (table values not available).
4.2 Image Data
In this section, we compare our algorithm with CP-WOPT and FBCP on image completion experiments. The size of every image data is . We use a set of images with missing rates from 85% to 99% to compare the performance of the algorithms. In this experiment, we do not fix the tensor ranks and tensor orders to be identical across methods, but use the best ranks for each algorithm to see its best possible result. For TT-WOPT, we first reshape the original data to a seventeen-way tensor of size and permute the tensor according to the order of . Then we reshape the tensor to a nine-way tensor of size . This nine-way tensor is a better structure for describing the image data: the first order of the nine-way tensor contains the data of a pixel block of the image, and the following orders describe successively larger pixel blocks. Furthermore, we set all TT-ranks to 16 according to our testing experience. As the image evaluation index, we use PSNR (peak signal-to-noise ratio) to measure the quality of the reconstructed image data. Table 2 shows the testing results of one image. Figure 1 visualizes the image inpainting results.
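One way to realize such a tensorization in NumPy is sketched below. Several details are assumptions made for illustration: the image is taken to be square with a side length that is a power of two and with the channel as the last axis (for example 256 × 256 × 3, which gives exactly a seventeen-way and then a nine-way tensor), folding is done in column-major order so that the first mode of the result holds a 2 × 2 pixel block, row and column bits are interleaved by the permutation, and PSNR uses a peak value of 255 for 8-bit data. The function names are hypothetical.

```python
import numpy as np


def tensorize_image(img):
    """Fold a (2^k, 2^k, C) image into a (k+1)-way tensor of size 4 x ... x 4 x C."""
    k = int(round(np.log2(img.shape[0])))
    assert img.shape[:2] == (2 ** k, 2 ** k), "side length must be a power of two"
    channels = img.shape[2]
    # Column-major folding: modes 0..k-1 are row bits, modes k..2k-1 are
    # column bits, least-significant bit first.
    t = img.reshape((2,) * (2 * k) + (channels,), order="F")
    # Interleave row and column bits: (row bit 0, col bit 0, row bit 1, ...).
    perm = [m for j in range(k) for m in (j, k + j)] + [2 * k]
    t = t.transpose(perm)
    # Merge each (row bit, col bit) pair into one mode of size 4.
    return t.reshape((4,) * k + (channels,), order="F")


def untensorize_image(t):
    """Inverse of tensorize_image."""
    k = t.ndim - 1
    channels = t.shape[-1]
    perm = [m for j in range(k) for m in (j, k + j)] + [2 * k]
    u = t.reshape((2,) * (2 * k) + (channels,), order="F")
    u = u.transpose(np.argsort(perm).tolist())
    return u.reshape((2 ** k, 2 ** k, channels), order="F")


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite if the images are identical."""
    diff = np.asarray(reference, dtype=float) - np.asarray(estimate, dtype=float)
    mse = np.mean(diff ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```

With this layout, the first mode indexes the four pixels of the smallest 2 × 2 block and each later mode selects one of four sub-blocks at the next larger scale, so neighbouring modes describe neighbouring spatial scales. The completed tensor can be mapped back to image form with `untensorize_image` before computing PSNR.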
The experimental results show that our TT-WOPT algorithm outperforms the other algorithms for image data completion. In particular, when the missing rate reaches 98% and 99%, our algorithm can still recover the image successfully while the other algorithms fail completely. The RSE and PSNR values of TT-WOPT are always better than those of CP-WOPT and FBCP, and the visual quality of the images recovered by our method is always the best.
Table 2. Image completion results at missing rates of 85%, 90%, 95%, 98% and 99%; rows: TT-WOPT, CP-WOPT, FBCP (table values not available).
5 Conclusion
In this paper, we first introduce the basics of tensors and the tensor-train decomposition. We then use a gradient-based first-order optimization method to find the factors of the tensor-train decomposition when the tensor has missing entries, and propose the TT-WOPT algorithm, which can solve the completion problem for high-order tensors. The simulation and image experiments show that our algorithm outperforms other state-of-the-art methods in many situations, especially when the missing rate is extremely high. Our study also indicates that high-order tensorization is an effective and efficient way to represent data. It should be noted that the accuracy of the TT model is sensitive to the selection of TT-ranks; hence, we will study how to choose TT-ranks automatically in future work.
Acknowledgement
This work is supported by JSPS KAKENHI (Grant No. 17K00326) and KAKENHI (Grant No. 15H04002).
References
 [1] Acar, E., Dunlavy, D.M., Kolda, T.G., Mørup, M., Scalable Tensor Factorizations for Incomplete Data, Chemometrics and Intelligent Laboratory Systems, vol. 106, no. 1, pp. 41–56, (2011).
 [2] Zhao, Q., Zhang, L., Cichocki, A., Bayesian CP Factorization of Incomplete Tensors with Automatic Rank Determination, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, no. 9, pp. 1751–1763, (2015).
 [3] De Lathauwer, L., Castaing, J., Blind Identification of Underdetermined Mixtures by Simultaneous Matrix Diagonalization, IEEE Transactions on Signal Processing, vol. 56, no. 3, pp. 1096–1105, (2008).
 [4] Muti, D., Bourennane, S., Multidimensional Filtering Based on a Tensor Approach, Signal Processing, vol. 85, no. 12, pp. 2338–2353, (2005).
 [5] Mocks, J., Topographic Components Model for Event-related Potentials and Some Biophysical Considerations, IEEE Transactions on Biomedical Engineering, vol. 35, no. 6, pp. 482–484, (1988).
 [6] Shashua, A., Levin, A., Linear Image Coding for Regression and Classification Using the Tensor-rank Principle, in Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference on, vol. 1. IEEE, pp. I–I, (2001).
 [7] Vasilescu, M.A.O., Terzopoulos, D., Multilinear Image Analysis for Facial Recognition, in Pattern Recognition, 2002. Proceedings. 16th International Conference on, vol. 2. IEEE, pp. 511–514, (2002).
 [8] Kolda, T.G., Bader, B.W., Tensor Decompositions and Applications, SIAM Review, vol. 51, no. 3, pp. 455–500, (2009).
 [9] Harshman, R.A., Foundations of the PARAFAC Procedure: Models and Conditions for an "Explanatory" Multimodal Factor Analysis, (1970).
 [10] Sorensen, M., De Lathauwer, L., Comon, P., Icart, S., Deneire, L., Canonical Polyadic Decomposition with Orthogonality Constraints, SIAM Journal on Matrix Analysis and Applications, (2012).
 [11] Tucker, L.R., Some Mathematical Notes on Three-mode Factor Analysis, Psychometrika, vol. 31, no. 3, pp. 279–311, (1966).
 [12] Gandy, S., Recht, B., Yamada, I., Tensor Completion and Low-n-rank Tensor Recovery via Convex Optimization, Inverse Problems, vol. 27, no. 2, p. 025010, (2011).
 [13] Oseledets, I.V., Tensor-train Decomposition, SIAM Journal on Scientific Computing, vol. 33, no. 5, pp. 2295–2317, (2011).
 [14] Cichocki, A., Lee, N., Oseledets, I.V., Phan, A.H., Zhao, Q., Mandic, D.P., et al., Tensor Networks for Dimensionality Reduction and Large-scale Optimization: Part 1 Low-rank Tensor Decompositions, Foundations and Trends® in Machine Learning, vol. 9, no. 4-5, pp. 249–429, (2016).
 [15] Nocedal, J., Wright, S., Numerical Optimization, Springer Science & Business Media, (2006).