Scaled Nuclear Norm Minimization for Low-Rank Tensor Completion

07/25/2017, by Morteza Ashraphijuo, et al.

Minimizing the nuclear norm of a matrix has been shown to be very efficient in reconstructing a low-rank sampled matrix. Furthermore, minimizing the sum of nuclear norms of the matricizations of a tensor has been shown to be very efficient in recovering a low-Tucker-rank sampled tensor. In this paper, we propose to recover a low-TT-rank sampled tensor by minimizing a weighted sum of nuclear norms of the unfoldings of the tensor. We provide numerical results showing that our proposed method requires significantly fewer samples to recover the original tensor than simply minimizing the sum of nuclear norms, since the structure of the unfoldings in the TT tensor model is fundamentally different from that of the matricizations in the Tucker tensor model.


I Introduction

Tensors are generalizations of vectors and matrices to higher dimensions. Due to recent advances in machine learning, multi-dimensional analysis of data has become indispensable for fully exploiting high-dimensional representations of data, as conventional matrix analysis has only a limited capability to exploit correlations across different attributes of a multi-way representation. The low-rank tensor completion problem refers to completing a tensor given a subset of its entries and the corresponding rank constraints. There exists an extensive literature on the low-rank matrix completion problem, which is the two-dimensional special case of the low-rank tensor completion problem. In general, there are many applications of low-rank tensor completion in various areas, including image and signal processing [1, 2], data mining [3], network coding [4], compressed sensing [5, 6, 7], reconstruction of visual data [8, 9, 10], seismic data processing [11, 12, 13], etc. Tensors representing real-world datasets usually exhibit a low-rank structure, and effectively exploiting such structure for analyzing large-scale high-dimensional datasets has become a central topic in machine learning and data mining.

The majority of the literature on low-rank tensor completion is based on convex relaxation of the matrix rank [14, 15, 16, 17, 1] or different convex relaxations of tensor ranks [7, 18, 19, 20, 21, 11]. In addition, other approaches have been proposed that are based on alternating minimization [13, 9, 10], algebraic geometric analyses [22, 23, 24, 25, 26, 27, 28, 29], and other heuristics [30, 31, 32, 33]. There are several well-known tensor decompositions, including the tensor-train (TT) decomposition [34, 35], the Tucker decomposition [36, 37], the canonical polyadic (CP) decomposition [38, 39], the tubal rank decomposition [40], etc. In this paper, we focus on the TT-rank and the TT decomposition. The TT decomposition was originally proposed in the field of quantum physics [41, 42] and was later used in the area of machine learning [34, 43, 44]. A comprehensive survey on the TT decomposition and the manifold of tensors of fixed TT-rank can be found in [45], which also includes a comparison between the TT and Tucker decompositions that highlights the advantages of the TT decomposition.

Nuclear norm minimization for the matrix completion problem, proposed in [15], can recover the original low-rank sampled matrix under some mild assumptions. Minimizing the sum of nuclear norms of the matricizations of a tensor, proposed in [8], can recover the original low-Tucker-rank sampled tensor under some mild assumptions [21]. One natural extension is to use the sum of nuclear norms of the unfoldings to recover a low-TT-rank sampled tensor. In this paper, we propose to use a weighted sum of nuclear norms of the unfoldings, which outperforms the simple sum of nuclear norms of the unfoldings. The reason behind this performance gain is the difference between the structure of the matricizations in the Tucker model and that of the unfoldings in the TT model.

II Background on Low-TT-Rank Tensor Completion

Assume that a $d$-way tensor $\mathcal{U} \in \mathbb{R}^{n_1 \times n_2 \times \cdots \times n_d}$ is sampled. Denote by $\Omega$ the binary sampling pattern tensor that is of the same size as $\mathcal{U}$, with $\Omega(\vec{x}) = 1$ if $\mathcal{U}(\vec{x})$ is observed and $\Omega(\vec{x}) = 0$ otherwise, where $\mathcal{U}(\vec{x})$ represents the entry of the tensor $\mathcal{U}$ with coordinate $\vec{x} = (x_1, \ldots, x_d)$.

Define the matrix $\mathcal{U}^{(i)} \in \mathbb{R}^{(n_1 \cdots n_i) \times (n_{i+1} \cdots n_d)}$ as the $i$-th unfolding of the tensor $\mathcal{U}$, such that $\mathcal{U}(\vec{x}) = \mathcal{U}^{(i)}\big(M_i(x_1, \ldots, x_i), \widetilde{M}_i(x_{i+1}, \ldots, x_d)\big)$, where $M_i$ and $\widetilde{M}_i$ are two bijective mappings from the corresponding index tuples to $\{1, \ldots, n_1 \cdots n_i\}$ and $\{1, \ldots, n_{i+1} \cdots n_d\}$, respectively.
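
To make the unfolding concrete, the following is a minimal NumPy sketch, assuming that the bijective mappings $M_i$ and $\widetilde{M}_i$ are taken to be the usual row-major (C-order) index mappings; any consistent choice of bijections works. The function name and the example dimensions are illustrative, not taken from the paper.

```python
import numpy as np

def unfolding(U, i):
    """Return the i-th unfolding of a d-way tensor U as a matrix of size
    (n_1*...*n_i) x (n_{i+1}*...*n_d), using row-major index mappings."""
    dims = U.shape
    rows = int(np.prod(dims[:i]))
    cols = int(np.prod(dims[i:]))
    return U.reshape(rows, cols)

# Example: a 4-way tensor and its second unfolding.
U = np.random.randn(3, 4, 5, 6)
print(unfolding(U, 2).shape)  # (12, 30)
```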

The separation rank, or TT-rank, of a tensor $\mathcal{U}$ is defined as $\operatorname{rank}_{\mathrm{TT}}(\mathcal{U}) = (u_1, \ldots, u_{d-1})$, where $u_i = \operatorname{rank}(\mathcal{U}^{(i)})$, $i = 1, \ldots, d-1$. Note that in general $u_i \leq \min\{n_1 \cdots n_i,\ n_{i+1} \cdots n_d\}$, and also that the TT-rank is simply the conventional matrix rank when $d = 2$. The TT decomposition of a tensor $\mathcal{U}$ is defined as

$$\mathcal{U}(x_1, x_2, \ldots, x_d) \;=\; \sum_{k_1 = 1}^{u_1} \cdots \sum_{k_{d-1} = 1}^{u_{d-1}} \mathcal{U}^{1}(x_1, k_1)\, \mathcal{U}^{2}(k_1, x_2, k_2) \cdots \mathcal{U}^{d}(k_{d-1}, x_d), \tag{1}$$

or in short,

$$\mathcal{U} \;=\; \mathcal{U}^{1} \ast \mathcal{U}^{2} \ast \cdots \ast \mathcal{U}^{d}, \tag{2}$$

where the $3$-way tensors $\mathcal{U}^{i} \in \mathbb{R}^{u_{i-1} \times n_i \times u_i}$ for $i = 2, \ldots, d-1$ and the matrices $\mathcal{U}^{1} \in \mathbb{R}^{n_1 \times u_1}$ and $\mathcal{U}^{d} \in \mathbb{R}^{u_{d-1} \times n_d}$ are the components of this decomposition.
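
As an illustration of (1), the following minimal NumPy sketch contracts a list of TT components into the full tensor; the helper name `tt_to_full` and the example sizes are hypothetical, not from the paper.

```python
import numpy as np

def tt_to_full(cores):
    """Contract TT components into the full tensor as in (1): cores[0] has shape
    (n_1, u_1), cores[-1] has shape (u_{d-1}, n_d), and each intermediate 3-way
    core has shape (u_{i-1}, n_i, u_i)."""
    full = cores[0]
    for core in cores[1:]:
        # sum over the shared TT index k_i linking consecutive components
        full = np.tensordot(full, core, axes=([-1], [0]))
    return full

# Example: a 4-way tensor of TT-rank (2, 3, 2).
cores = [np.random.randn(5, 2),
         np.random.randn(2, 6, 3),
         np.random.randn(3, 7, 2),
         np.random.randn(2, 8)]
print(tt_to_full(cores).shape)  # (5, 6, 7, 8)
```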

Let $\widetilde{\mathcal{U}}_{(i)} \in \mathbb{R}^{n_i \times \prod_{k \neq i} n_k}$ be the $i$-th matricization of the tensor $\mathcal{U}$, i.e., the matrix such that $\mathcal{U}(\vec{x}) = \widetilde{\mathcal{U}}_{(i)}\big(x_i, \widehat{M}_i(x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_d)\big)$, where $\widehat{M}_i$ is a bijective mapping. Observe that for any arbitrary tensor $\mathcal{U}$, the first matricization and the first unfolding are the same, i.e., $\widetilde{\mathcal{U}}_{(1)} = \mathcal{U}^{(1)}$. The Tucker-rank of a tensor $\mathcal{U}$ is defined as $\operatorname{rank}_{\mathrm{Tucker}}(\mathcal{U}) = (m_1, \ldots, m_d)$, where $m_i = \operatorname{rank}(\widetilde{\mathcal{U}}_{(i)})$.
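
A matching sketch of the $i$-th matricization, again assuming row-major index mappings, together with a check that the first matricization coincides with the first unfolding; the helper name is hypothetical.

```python
import numpy as np

def matricization(U, i):
    """Return the i-th matricization: an n_i x (product of the other n_k) matrix,
    obtained by bringing mode i to the front and flattening the remaining modes."""
    return np.moveaxis(U, i - 1, 0).reshape(U.shape[i - 1], -1)

U = np.random.randn(3, 4, 5, 6)
print(matricization(U, 2).shape)                              # (4, 90)
# For i = 1 no permutation is needed, so the first matricization equals
# the first unfolding, consistent with the observation above.
print(np.array_equal(matricization(U, 1), U.reshape(3, -1)))  # True
```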

Define $\mathcal{U}_\Omega$ as the tensor obtained from sampling $\mathcal{U}$ according to $\Omega$, i.e.,

$$\mathcal{U}_\Omega(\vec{x}) \;=\; \begin{cases} \mathcal{U}(\vec{x}) & \text{if } \Omega(\vec{x}) = 1, \\ 0 & \text{otherwise.} \end{cases} \tag{3}$$

Assume that a tensor $\mathcal{U}$ with $\operatorname{rank}_{\mathrm{TT}}(\mathcal{U}) = (u_1, \ldots, u_{d-1})$ is sampled according to the sampling pattern $\Omega$. Then, the following NP-hard problem, known as the rank feasibility problem or the tensor completion problem, aims to find a completion satisfying the given rank constraints:

$$\text{find} \quad \mathcal{X} \in \mathbb{R}^{n_1 \times \cdots \times n_d} \quad \text{subject to} \quad \mathcal{X}_\Omega = \mathcal{U}_\Omega, \quad \operatorname{rank}_{\mathrm{TT}}(\mathcal{X}) = (u_1, \ldots, u_{d-1}). \tag{4}$$

III Optimization Formulations

As mentioned earlier, in the matrix case, by relaxing the rank constraint and minimizing the nuclear norm of the matrix, we can obtain the original low-rank matrix under some mild assumptions [15]. Following this idea, the same problem for low-Tucker-rank tensors is studied in [8], where, by relaxing the Tucker-rank constraints and minimizing the sum of nuclear norms of the matricizations of the tensor, the low-Tucker-rank sampled tensor can be obtained [21]. This formulation can be written as

$$\min_{\mathcal{X}} \;\; \sum_{i=1}^{d} \big\|\widetilde{\mathcal{X}}_{(i)}\big\|_{*} \quad \text{subject to} \quad \mathcal{X}_\Omega = \mathcal{U}_\Omega, \tag{5}$$

where $\|\mathbf{X}\|_{*}$ denotes the nuclear norm of the matrix $\mathbf{X}$, i.e., the sum of the singular values of $\mathbf{X}$. Intuitively, this optimization formulation is not efficient for solving (4), since a tensor chosen generically from the corresponding low-TT-rank manifold is, with high probability, not a low-Tucker-rank tensor. In the numerical experiments, we show that this method performs very poorly. Now, given that the TT-rank is defined through the unfoldings, a natural alternative formulation is to minimize the sum of nuclear norms of the unfoldings as

$$\min_{\mathcal{X}} \;\; \sum_{i=1}^{d-1} \big\|\mathcal{X}^{(i)}\big\|_{*} \quad \text{subject to} \quad \mathcal{X}_\Omega = \mathcal{U}_\Omega. \tag{6}$$

In the numerical experiments, we show that this method performs much better than the previous formulation (5), which is reasonable since we are minimizing the tightest convex relaxation of each TT-rank component. On the other hand, since the dimensions of the different unfoldings are different, i.e., $\mathcal{X}^{(i)} \in \mathbb{R}^{(n_1 \cdots n_i) \times (n_{i+1} \cdots n_d)}$, an even more efficient formulation is to use the following weighted sum of nuclear norms:

$$\min_{\mathcal{X}} \;\; \sum_{i=1}^{d-1} \alpha_i \big\|\mathcal{X}^{(i)}\big\|_{*} \quad \text{subject to} \quad \mathcal{X}_\Omega = \mathcal{U}_\Omega, \tag{7}$$

where the weights $\alpha_1, \ldots, \alpha_{d-1} > 0$ are chosen to account for the differing dimensions of the unfoldings.

Note that all three optimization formulations (5), (6) and (7) are convex programs and easy to solve.
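
As a rough illustration of how such a convex program can be set up, the following CVXPY sketch formulates the weighted objective (7); passing a vector of ones as the weights reduces it to (6), and applying the same objective to matricizations instead of unfoldings gives (5). The function name, the choice of weights, and the default solver are assumptions for illustration, not the paper's implementation.

```python
import numpy as np
import cvxpy as cp

def weighted_unfolding_completion(U_Omega, mask, weights):
    """Minimize a weighted sum of nuclear norms of the d-1 unfoldings of X,
    subject to X agreeing with the observed entries of U_Omega, as in (7)."""
    dims = U_Omega.shape
    d = len(dims)
    x = cp.Variable(int(np.prod(dims)))            # vectorized tensor variable
    obs = np.flatnonzero(mask.ravel())             # indices of observed entries
    constraints = [x[obs] == U_Omega.ravel()[obs]]
    objective = 0
    for i in range(1, d):
        rows = int(np.prod(dims[:i]))
        cols = int(np.prod(dims[i:]))
        X_i = cp.reshape(x, (rows, cols), order='C')   # i-th unfolding of X
        objective = objective + weights[i - 1] * cp.normNuc(X_i)
    cp.Problem(cp.Minimize(objective), constraints).solve()
    return x.value.reshape(dims)
```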

IV Numerical Results

In our numerical experiments, we first generate a generic tensor of a given TT-rank as follows. We consider a TT-rank vector $(u_1, \ldots, u_{d-1})$, generate completely random two- and three-way tensor components, and construct $\mathcal{U}$ according to (1). Hence, $\mathcal{U}$ is generically chosen from the manifold of tensors of TT-rank $(u_1, \ldots, u_{d-1})$. Moreover, we sample the entries of the obtained tensor independently, each with some probability $p$.
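
The sketch below, with illustrative (assumed) dimensions, TT-rank, and sampling probability, shows this generation and sampling procedure; the actual sizes behind the figures are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
dims, tt_rank, p = (8, 8, 8, 8), (2, 3, 2), 0.5   # assumed example values

# Random two- and three-way TT components, contracted as in (1)
# (same convention as the tt_to_full sketch above).
cores = [rng.standard_normal((dims[0], tt_rank[0]))]
cores += [rng.standard_normal((tt_rank[i - 1], dims[i], tt_rank[i]))
          for i in range(1, len(dims) - 1)]
cores += [rng.standard_normal((tt_rank[-1], dims[-1]))]
U = cores[0]
for core in cores[1:]:
    U = np.tensordot(U, core, axes=([-1], [0]))   # generic tensor of TT-rank (2, 3, 2)

mask = rng.random(dims) < p          # Bernoulli(p) sampling pattern Omega
U_Omega = np.where(mask, U, 0.0)     # sampled tensor as in (3)
```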

For the first example, we construct a generic low-TT-rank tensor and sample each entry independently with probability $p$. Then, we solve each of the optimization problems (5)-(7) for the sampled tensor to reconstruct the original tensor. We define the error as the distance between the obtained solution $\hat{\mathcal{U}}$ and the original sampled tensor $\mathcal{U}$. In Figure 1, we plot the errors obtained from (5), (6) and (7) as functions of the sampling probability. For this experiment, we repeated each trial a number of times for each value of the sampling probability, and the error curves represent the average over these trials.

Fig. 1: Comparison of formulations (5), (6) and (7) for the sampled tensor in the first example.

For example, according to Figure 1, using our proposed weighted sum of nuclear norms of unfoldings, a given error level can be reached at a noticeably lower sampling probability than is needed for the same error using the unweighted sum of nuclear norms of unfoldings. In other words, our proposed method outperforms the method using the sum of nuclear norms of unfoldings in terms of the required sampling probability. Moreover, as the sampling probability decreases, the sum of nuclear norms results in a much greater error in comparison with our proposed objective function. Note that the formulation based on the sum of nuclear norms of matricizations performs very poorly.

As the second example, in Figure 2, we report the errors obtained from (5), (6) and (7) for a sampled tensor of a different TT-rank. Using our proposed weighted sum of nuclear norms of unfoldings, a given error level can again be reached at a lower sampling probability than is needed for the same error using the unweighted sum of nuclear norms of unfoldings. Hence, our proposed method again outperforms the method using the sum of nuclear norms of unfoldings in terms of the required sampling probability. Again, the sum of nuclear norms of matricizations performs very poorly. For this experiment, we repeated each trial a number of times for each value of the sampling probability, and the error curves represent the average over these trials.

Fig. 2: Comparison of formulations (5), (6) and (7) for the sampled tensor in the second example.

Finally, in Figure 3, we report the errors obtained from (5), (6) and (7) for a sampled tensor of yet another TT-rank. Using our proposed weighted sum of nuclear norms of unfoldings, a given error level can be reached at a lower sampling probability than is needed for the same error using the unweighted sum of nuclear norms of unfoldings, so our proposed method again outperforms the method using the sum of nuclear norms of unfoldings in terms of the required sampling probability. For this experiment, we repeated each trial a number of times for each value of the sampling probability, and the error curves represent the average over these trials.

Fig. 3: Comparison of formulations (5), (6) and (7) for the sampled tensor in the third example.

V Conclusions

Minimizing the nuclear norm of a matrix is a well-known and efficient approach to the low-rank matrix completion problem. However, the nuclear norm of a tensor is not well defined, and therefore one way to approach the low-rank tensor completion problem is to minimize the sum of nuclear norms of the matricizations or unfoldings of the tensor. In fact, minimizing the sum of nuclear norms of the matricizations of a tensor is efficient for recovering a low-Tucker-rank sampled tensor. In order to recover a low-TT-rank sampled tensor, we proposed to minimize a weighted sum of nuclear norms of the unfoldings of the tensor instead of the plain sum of nuclear norms of the unfoldings. Through numerical results, we showed that our proposed optimization formulation significantly outperforms the formulations using the sum of nuclear norms of the unfoldings or matricizations in terms of the number of samples required to recover the original tensor.

References

  • [1] E. J. Candès, Y. C. Eldar, T. Strohmer, and V. Voroninski, “Phase retrieval via matrix completion,” SIAM Journal on Imaging Sciences, vol. 6, no. 1, pp. 199–225, 2013.
  • [2] H. Ji, C. Liu, Z. Shen, and Y. Xu, “Robust video denoising using low rank matrix completion,” in Conference on Computer Vision and Pattern Recognition, 2010, pp. 1791–1798.
  • [3] L. Eldén, Matrix methods in data mining and pattern recognition.   Society for Industrial and Applied Mathematics, 2007, vol. 4.
  • [4] N. J. Harvey, D. R. Karger, and K. Murota, “Deterministic network coding by matrix completion,” in ACM-SIAM symposium on Discrete algorithms, 2005, pp. 489–498.
  • [5] L.-H. Lim and P. Comon, “Multiarray signal processing: Tensor decomposition meets compressed sensing,” Comptes Rendus Mecanique, vol. 338, no. 6, pp. 311–320, 2010.
  • [6] N. D. Sidiropoulos and A. Kyrillidis, “Multi-way compressed sensing for sparse low-rank tensors,” IEEE Signal Processing Letters, vol. 19, no. 11, pp. 757–760, 2012.
  • [7] S. Gandy, B. Recht, and I. Yamada, “Tensor completion and low-n-rank tensor recovery via convex optimization,” Inverse Problems, vol. 27, no. 2, pp. 1–19, 2011.
  • [8] J. Liu, P. Musialski, P. Wonka, and J. Ye, “Tensor completion for estimating missing values in visual data,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 208–220, 2013.
  • [9] X.-Y. Liu, S. Aeron, V. Aggarwal, and X. Wang, “Low-tubal-rank tensor completion using alternating minimization,” arXiv preprint:1610.01690, 2016.
  • [10] ——, “Low-tubal-rank tensor completion using alternating minimization,” in International Society for Optics and Photonics, 2016, pp. 984 809–984 809.
  • [11] N. Kreimer, A. Stanton, and M. D. Sacchi, “Tensor completion based on nuclear norm minimization for 5D seismic data reconstruction,” Geophysics, vol. 78, no. 6, pp. V273–V284, 2013.
  • [12] G. Ely, S. Aeron, N. Hao, M. E. Kilmer et al., “5D and 4D pre-stack seismic data completion using tensor nuclear norm (TNN),” in Society of Exploration Geophysicists, 2013.
  • [13] W. Wang, V. Aggarwal, and S. Aeron, “Tensor completion by alternating minimization under the tensor train (TT) model,” arXiv preprint:1609.05587, Sep. 2016.
  • [14] E. J. Candès and B. Recht, “Exact matrix completion via convex optimization,” Foundations of Computational Mathematics, vol. 9, no. 6, pp. 717–772, 2009.
  • [15] E. J. Candès and T. Tao, “The power of convex relaxation: Near-optimal matrix completion,” IEEE Transactions on Information Theory, vol. 56, no. 5, pp. 2053–2080, 2010.
  • [16] J. F. Cai, E. J. Candès, and Z. Shen, “A singular value thresholding algorithm for matrix completion,” SIAM Journal on Optimization, vol. 20, no. 4, pp. 1956–1982, 2010.
  • [17] M. Ashraphijuo, R. Madani, and J. Lavaei, “Characterization of rank-constrained feasibility problems via a finite number of convex programs,” in 2016 IEEE 55th Conference on Decision and Control (CDC), 2016, pp. 6544–6550.
  • [18] R. Tomioka, K. Hayashi, and H. Kashima, “Estimation of low-rank tensors via convex optimization,” arXiv preprint:1010.0789, 2010.
  • [19] M. Signoretto, Q. T. Dinh, L. De Lathauwer, and J. A. Suykens, “Learning with tensors: a framework based on convex optimization and spectral regularization,” Machine Learning, vol. 94, no. 3, pp. 303–351, 2014.
  • [20] B. Romera-Paredes and M. Pontil, “A new convex relaxation for tensor completion,” in Advances in Neural Information Processing Systems, 2013, pp. 2967–2975.
  • [21] B. Huang, C. Mu, D. Goldfarb, and J. Wright, “Provable models for robust low-rank tensor completion,” Pacific Journal of Optimization, vol. 11, no. 2, pp. 339–364, 2015.
  • [22] D. Pimentel-Alarcón, N. Boston, and R. Nowak, “A characterization of deterministic sampling patterns for low-rank matrix completion,” IEEE Journal of Selected Topics in Signal Processing, vol. 10, no. 4, pp. 623–636, 2016.
  • [23] M. Ashraphijuo and X. Wang, “Fundamental conditions for low-cp-rank tensor completion,” arXiv preprint:1703.10740, 2017.
  • [24] M. Ashraphijuo, V. Aggarwal, and X. Wang, “Deterministic and probabilistic conditions for finite completability of low rank tensor,” arXiv preprint:1612.01597, 2016.
  • [25] M. Ashraphijuo, X. Wang, and V. Aggarwal, “Deterministic and probabilistic conditions for finite completability of low-rank multi-view data,” arXiv preprint:1701.00737, 2017.
  • [26] M. Ashraphijuo, V. Aggarwal, and X. Wang, “A characterization of sampling patterns for low-Tucker-rank tensor completion problem,” in IEEE International Symposium on Information Theory (ISIT), 2017, pp. 531–535.
  • [27] M. Ashraphijuo, X. Wang, and V. Aggarwal, “A characterization of sampling patterns for low-rank multi-view data completion problem,” in IEEE International Symposium on Information Theory (ISIT), 2017, pp. 1147–1151.
  • [28] M. Ashraphijuo and X. Wang, “Characterization of deterministic and probabilistic sampling patterns for finite completability of low tensor-train rank tensor,” arXiv preprint:1703.07698, 2017.
  • [29] M. Ashraphijuo, X. Wang, and V. Aggarwal, “Rank determination for low-rank data completion,” arXiv preprint:1707.00622, 2017.
  • [30] X. Y. Liu, S. Aeron, V. Aggarwal, X. Wang, and M. Y. Wu, “Adaptive sampling of RF fingerprints for fine-grained indoor localization,” IEEE Transactions on Mobile Computing, vol. 15, no. 10, pp. 2411–2423, 2016.
  • [31] D. Kressner, M. Steinlechner, and B. Vandereycken, “Low-rank tensor completion by Riemannian optimization,” BIT Numerical Mathematics, vol. 54, no. 2, pp. 447–468, 2014.
  • [32] A. Krishnamurthy and A. Singh, “Low-rank matrix and tensor completion via adaptive sampling,” in Advances in Neural Information Processing Systems, 2013, pp. 836–844.
  • [33] D. Goldfarb and Z. Qin, “Robust low-rank tensor recovery: Models and algorithms,” SIAM Journal on Matrix Analysis and Applications, vol. 35, no. 1, pp. 225–253, 2014.
  • [34] I. V. Oseledets, “Tensor-train decomposition,” SIAM Journal on Scientific Computing, vol. 33, no. 5, pp. 2295–2317, 2011.
  • [35] S. Holtz, T. Rohwedder, and R. Schneider, “On manifolds of tensors of fixed TT-rank,” Numerische Mathematik, vol. 120, no. 4, pp. 701–731, 2012.
  • [36] T. G. Kolda, “Orthogonal tensor decompositions,” SIAM Journal on Matrix Analysis and Applications, vol. 23, no. 1, pp. 243–255, 2001.
  • [37] D. Kressner, M. Steinlechner, and B. Vandereycken, “Low-rank tensor completion by Riemannian optimization,” BIT Numerical Mathematics, vol. 54, no. 2, pp. 447–468, 2014.
  • [38] R. A. Harshman, “Foundations of the PARAFAC procedure: Models and conditions for an explanatory multi-modal factor analysis,” University of California at Los Angeles, 1970.
  • [39] A. Stegeman and N. D. Sidiropoulos, “On Kruskal’s uniqueness condition for the CANDECOMP/PARAFAC decomposition,” Linear Algebra and its Applications, vol. 420, no. 2, pp. 540–552, 2007.
  • [40] M. E. Kilmer, K. Braman, N. Hao, and R. C. Hoover, “Third-order tensors as operators on matrices: A theoretical and computational framework with applications in imaging,” SIAM Journal on Matrix Analysis and Applications, vol. 34, no. 1, pp. 148–172, 2013.
  • [41] M. H. Beck, A. Jäckle, G. Worth, and H.-D. Meyer, “The multiconfiguration time-dependent hartree (MCTDH) method: a highly efficient algorithm for propagating wavepackets,” Physics Reports, vol. 324, no. 1, pp. 1–105, 2000.
  • [42] U. Schollwöck, “The density-matrix renormalization group,” Reviews of Modern Physics, vol. 77, no. 1, p. 259, 2005.
  • [43] I. V. Oseledets and E. E. Tyrtyshnikov, “Breaking the curse of dimensionality, or how to use SVD in many dimensions,” SIAM Journal on Scientific Computing, vol. 31, no. 5, pp. 3744–3759, 2009.
  • [44] ——, “Tensor tree decomposition does not need a tree,” Linear Algebra Applications, vol. 8, 2009.
  • [45] S. Holtz, T. Rohwedder, and R. Schneider, “On manifolds of tensors of fixed TT-rank,” Numerische Mathematik, vol. 120, no. 4, pp. 701–731, 2012.