# A novel nonconvex approach to recover the low-tubal-rank tensor data: when t-SVD meets PSSV

In this paper we fix attention on a recently developed tensor decomposition scheme named the tensor SVD (t-SVD). The t-SVD not only provides properties similar to the matrix case, but also converts tensor tubal-rank minimization into matrix rank minimization in the Fourier domain. Generally, minimizing the tensor nuclear norm (TNN) may cause some bias. In this paper, to alleviate this bias, we propose to minimize the partial sum of the tensor nuclear norm (PSTNN) in place of the TNN. The novel PSTNN is applied to the problems of tensor completion (TC) and tensor robust principal component analysis (TRPCA). Experiments are conducted on synthetic and real-world data, and the results reveal that the proposed algorithms outperform TNN based methods.


## 1 Introduction

The tensor is an important format for multidimensional data, which plays an increasingly significant role in a wide range of real-world applications, e.g., color image and video processing (1; 2; 3; 4; 5), hyperspectral data processing (6; 7; 8; 9), personalized web search (10; 11), high-order web link analysis (12), magnetic resonance imaging (MRI) data recovery (13), and seismic data reconstruction (14; 15). How to characterize and utilize the internal structural information of these multidimensional data is of crucial importance.

In matrix processing, low-rank models can robustly and efficiently handle two-dimensional data from various sources (16; 17; 18; 19; 20; 21; 22; 23; 24). Generalizing the matrix format, a tensor can contain more essential structural information, making it a powerful tool for dealing with multi-modal and multi-relational data (25; 26; 27). Unfortunately, it is not easy to directly extend low-rankness from matrices to tensors. More precisely, there is no exact (or unique) definition of tensor rank. The most popular rank definitions are the CANDECOMP/PARAFAC (CP) rank and the Tucker rank (28).

Actually, the CP rank and the Tucker rank are both defined through their corresponding decompositions. For a third-order tensor $\mathcal{X} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, its CP decomposition can be written as

$$\mathcal{X} \approx \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r, \qquad (1)$$

where the symbol "$\circ$" represents the vector outer product, $R$ is a positive integer, and $\mathbf{a}_r \in \mathbb{R}^{n_1}$, $\mathbf{b}_r \in \mathbb{R}^{n_2}$, $\mathbf{c}_r \in \mathbb{R}^{n_3}$ for $r = 1, \ldots, R$. Then, the smallest positive integer $R$ such that outer products of 3 vectors (denoted as rank-one tensors in (28)) generate $\mathcal{X}$ is denoted as the CP rank of $\mathcal{X}$. Meanwhile, the Tucker decomposition of a tensor $\mathcal{X}$ is as follows:

$$\mathcal{X} \approx \mathcal{G} \times_1 \mathbf{A} \times_2 \mathbf{B} \times_3 \mathbf{C} = \sum_{p=1}^{P}\sum_{q=1}^{Q}\sum_{r=1}^{R} g_{pqr}\, \mathbf{a}_p \circ \mathbf{b}_q \circ \mathbf{c}_r, \qquad (2)$$

where the symbol "$\times_n$" stands for the mode-$n$ product (please see details in Section 2), $\mathcal{G} \in \mathbb{R}^{P \times Q \times R}$ is called the core tensor, and $\mathbf{A} \in \mathbb{R}^{n_1 \times P}$, $\mathbf{B} \in \mathbb{R}^{n_2 \times Q}$, $\mathbf{C} \in \mathbb{R}^{n_3 \times R}$ are factor matrices. Then, the Tucker rank (also denoted as "$n$-rank" in some literature) is defined as the vector $\big(\operatorname{rank}(\mathbf{X}_{(1)}), \operatorname{rank}(\mathbf{X}_{(2)}), \operatorname{rank}(\mathbf{X}_{(3)})\big)$. The Tucker decomposition and CP decomposition are illustrated in Fig. 1.

In this paper, we fix attention on a recently developed novel tensor decomposition scheme named the tensor singular value decomposition (t-SVD), which has been well studied in (29; 30; 31; 32; 33). Furthermore, in (34; 35), the bounds and conditions for recovery of corrupted tensors have been well analyzed for the tensor completion and tensor robust principal component analysis problems, respectively. The t-SVD is based on a new definition of the tensor-tensor product, which enjoys many properties similar to the matrix case (please see Section 2.2 for details). For a tensor $\mathcal{X} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, its t-SVD is given by

$$\mathcal{X} = \mathcal{U} * \mathcal{S} * \mathcal{V}^\top, \qquad (3)$$

where the symbol "$*$" denotes the tensor-tensor product (see more details in Sec. 2.2), $\mathcal{U} \in \mathbb{R}^{n_1 \times n_1 \times n_3}$, $\mathcal{S} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, and $\mathcal{V} \in \mathbb{R}^{n_2 \times n_2 \times n_3}$.

Figure 2 exhibits the t-SVD scheme. Then, the tensor tubal-rank is defined as the number of non-zero singular tubes of $\mathcal{S}$. Hence, the tensor nuclear norm (TNN, defined in Sec. 2.2) is adopted by (34; 35) as a convex relaxation of the tensor tubal-rank.

The relationship between the tubal-rank and the CP rank is that a low CP rank tensor is indeed a low tubal-rank tensor. As analyzed in (34), if we take the FFT along the third dimension of a low CP rank tensor, each frontal slice in the Fourier domain inherits the CP factorization. It implies that if a tensor is of CP rank $r$, its tubal-rank is at most $r$. Thus, for a third-order tensor with low CP rank, we can recover it using the t-SVD structure. The relationship between the tubal-rank and the Tucker rank is not explicit; therefore, the performance of a Tucker rank based method is brought into comparison in our numerical experiments. The experimental results reveal the superiority of the tubal-rank over the Tucker rank.

It should be noted that the t-SVD not only provides properties similar to the matrix case but also converts tensor tubal-rank minimization into matrix rank minimization in the Fourier domain. Meanwhile, though the matrix nuclear norm selected in the Fourier domain (34; 35) is tractable, it causes some unavoidable biases (23; 24). First, the nuclear norm minimizes not only the rank of an underlying matrix $\mathbf{X}$, but also the variance of $\mathbf{X}$, by simultaneously minimizing all the singular values of $\mathbf{X}$. Second, if the ground truth matrix has a large variance but a sparse distribution within the ground truth subspace, some inliers can be regarded as outliers in order to reduce the singular values within the target rank. For a more detailed analysis, please refer to (24). Therefore, there is still room to further enhance the potential capacity and efficiency of these t-SVD methods.

To alleviate this bias caused by a convex surrogate, nonconvex relaxations of the matrix nuclear norm (36; 37) are reasonable options. In this paper, we propose to minimize the partial sum of the tensor nuclear norm (PSTNN) in place of the tensor nuclear norm.

The main contribution of this paper is threefold. First, on the foundation of a nonconvex surrogate of the matrix rank, we propose a novel nonconvex approximation of the tensor tubal-rank, the PSTNN, with superior performance over the TNN. To the best of our knowledge, it is the first nonconvex approach under the t-SVD scheme. Second, to minimize the proposed PSTNN, we extend the partial singular value thresholding (PSVT) operator, originally proposed in (23), to matrices in the complex field, and demonstrate that it is the exact solution of the PSTNN minimization problem. Third, we apply the PSTNN to two typical tensor recovery problems and propose the PSTNN based tensor completion (PSTNN-TC) model and the PSTNN based tensor robust principal component analysis (PSTNN-RPCA) model. Two efficient alternating direction method of multipliers (ADMM) algorithms are designed to solve the models using the PSVT solver. Moreover, numerical experiments are conducted on synthetic and real-world data, and the results demonstrate the effectiveness and robustness of the proposed PSTNN based models.

The outline of this paper is given as follows. In Section 2, some preliminary background on tensors is given. In Section 3, the main result is presented. Experimental results are reported in Section 4. Finally, we draw some conclusions in Section 5.

## 2 Notation and preliminaries

In this section, we briefly introduce the basic notations and definitions about tensors, and then give the detailed definitions related to the t-SVD scheme.

### 2.1 Basic tensor notations and definition

Following (28), we use lowercase letters for scalars, e.g., $x$, boldface lowercase letters for vectors, e.g., $\mathbf{x}$, boldface uppercase letters for matrices, e.g., $\mathbf{X}$, and boldface calligraphic letters for tensors, e.g., $\mathcal{X}$. Generally, an $N$-mode tensor is defined as $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, and $x_{i_1 \cdots i_N}$ is its $(i_1, \ldots, i_N)$-th component.

Fibers are defined by fixing every index but one. Third-order tensors have column, row, and tube fibers, denoted by $\mathbf{x}_{:jk}$, $\mathbf{x}_{i:k}$, and $\mathbf{x}_{ij:}$, respectively. When extracted from the tensor, fibers are always assumed to be oriented as column vectors.

Slices are two-dimensional sections of a tensor, defined by fixing all but two indices. The horizontal, lateral, and frontal slices of a third-order tensor $\mathcal{X}$ are denoted by $\mathcal{X}(i,:,:)$, $\mathcal{X}(:,j,:)$, and $\mathcal{X}(:,:,k)$, respectively. The $k$-th frontal slice of a third-order tensor, $\mathcal{X}(:,:,k)$, may alternatively be denoted as $\mathcal{X}^{(k)}$ in this paper.

The inner product of two same-sized tensors $\mathcal{X}$ and $\mathcal{Y}$ is defined as $\langle \mathcal{X}, \mathcal{Y} \rangle = \sum_{i_1, \ldots, i_N} x_{i_1 \cdots i_N}\, y_{i_1 \cdots i_N}$. The corresponding norm (Frobenius norm) is then defined as $\|\mathcal{X}\|_F = \sqrt{\langle \mathcal{X}, \mathcal{X} \rangle}$.

The mode-$n$ unfolding of a tensor $\mathcal{X}$ is denoted as $\mathbf{X}_{(n)} \in \mathbb{R}^{I_n \times \prod_{j \neq n} I_j}$, where the tensor element $(i_1, \ldots, i_N)$ maps to the matrix element $(i_n, j)$ satisfying $j = 1 + \sum_{k=1, k \neq n}^{N} (i_k - 1) J_k$ with $J_k = \prod_{m=1, m \neq n}^{k-1} I_m$. The inverse operator of unfolding is denoted as "fold", i.e., $\mathcal{X} = \operatorname{fold}_n(\mathbf{X}_{(n)})$.

The $n$-mode (matrix) product of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times \cdots \times I_N}$ with a matrix $\mathbf{A} \in \mathbb{R}^{J \times I_n}$ is denoted by $\mathcal{X} \times_n \mathbf{A}$ and is of size $I_1 \times \cdots \times I_{n-1} \times J \times I_{n+1} \times \cdots \times I_N$. Elementwise, we have

$$(\mathcal{X} \times_n \mathbf{A})_{i_1 \cdots i_{n-1}\, j\, i_{n+1} \cdots i_N} = \sum_{i_n=1}^{I_n} x_{i_1 i_2 \cdots i_n \cdots i_N}\, a_{j i_n}. \qquad (4)$$

Each mode-$n$ fiber is multiplied by the matrix $\mathbf{A}$. This idea can also be expressed in terms of unfolded tensors:

$$\mathcal{Y} = \mathcal{X} \times_n \mathbf{A} \;\Leftrightarrow\; \mathbf{Y}_{(n)} = \mathbf{A}\, \mathbf{X}_{(n)}.$$

Please refer to (28) for a more extensive overview.
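The unfolding identity $\mathbf{Y}_{(n)} = \mathbf{A}\,\mathbf{X}_{(n)}$ can be sketched in a few lines of NumPy. This is an illustrative sketch, not the paper's Matlab code; the function names are ours, and any consistent column ordering of the unfolding works as long as `fold` inverts `unfold`:

```python
import numpy as np

def unfold(X, n):
    """Mode-n unfolding: bring axis n to the front, flatten the remaining axes."""
    return np.moveaxis(X, n, 0).reshape(X.shape[n], -1)

def fold(M, n, shape):
    """Inverse of unfold: reshape, then move the first axis back to position n."""
    rest = [s for i, s in enumerate(shape) if i != n]
    return np.moveaxis(M.reshape([shape[n]] + rest), 0, n)

def mode_n_product(X, A, n):
    """Y = X x_n A, computed via the identity Y_(n) = A @ X_(n)."""
    shape = list(X.shape)
    shape[n] = A.shape[0]
    return fold(A @ unfold(X, n), n, shape)
```

For instance, multiplying a $3 \times 4 \times 5$ tensor along mode 2 by a $6 \times 4$ matrix yields a $3 \times 6 \times 5$ tensor, matching the elementwise definition in (4).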

### 2.2 Notations and definition corresponding to t-SVD

For a tensor $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, by using the Matlab command fft, we denote $\hat{\mathcal{A}}$ as the result of the discrete Fourier transform of $\mathcal{A}$ along the third dimension, i.e., $\hat{\mathcal{A}} = \operatorname{fft}(\mathcal{A}, [\,], 3)$. Meanwhile, the inverse FFT gives $\mathcal{A} = \operatorname{ifft}(\hat{\mathcal{A}}, [\,], 3)$.

###### Definition 2.1 (tensor conjugate transpose (30))

The conjugate transpose of a tensor $\mathcal{A} \in \mathbb{C}^{n_1 \times n_2 \times n_3}$ is the tensor $\mathcal{A}^\top \in \mathbb{C}^{n_2 \times n_1 \times n_3}$ obtained by conjugate transposing each of the frontal slices and then reversing the order of the transposed frontal slices 2 through $n_3$:

$$(\mathcal{A}^\top)^{(1)} = \big(\mathcal{A}^{(1)}\big)^\top \quad \text{and} \quad (\mathcal{A}^\top)^{(i)} = \big(\mathcal{A}^{(n_3+2-i)}\big)^\top, \quad i = 2, \ldots, n_3.$$
###### Definition 2.2 (t-product (30))

The t-product $\mathcal{C} = \mathcal{A} * \mathcal{B}$ of $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ and $\mathcal{B} \in \mathbb{R}^{n_2 \times n_4 \times n_3}$ is a tensor of size $n_1 \times n_4 \times n_3$, where the $(i,j)$-th tube is given by

$$\mathbf{c}_{ij} := \mathcal{C}(i,j,:) = \sum_{k=1}^{n_2} \mathcal{A}(i,k,:) \ast \mathcal{B}(k,j,:), \qquad (5)$$

where $\ast$ denotes the circular convolution between two tubes of the same size.

Interpreted in another way, a 3-D tensor of size $n_1 \times n_2 \times n_3$ can be viewed as an $n_1 \times n_2$ matrix of fibers (tubes), with each entry a tube lying in the third dimension. So the t-product of two tensors can be regarded as a matrix-matrix multiplication, except that the multiplication between scalars is replaced by circular convolution between tubes.
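This slice-wise view also yields the standard fast implementation: by the convolution theorem, circular convolution of tubes becomes entrywise multiplication after an FFT along the third mode, so the t-product reduces to $n_3$ frontal-slice matrix products in the Fourier domain. A minimal NumPy sketch (the function name is ours):

```python
import numpy as np

def t_product(A, B):
    """t-product of A (n1 x n2 x n3) and B (n2 x n4 x n3).

    Tube-wise circular convolutions become frontal-slice matrix
    products after an FFT along the third dimension."""
    Af = np.fft.fft(A, axis=2)
    Bf = np.fft.fft(B, axis=2)
    Cf = np.einsum('ikt,kjt->ijt', Af, Bf)  # one matrix product per slice
    return np.real(np.fft.ifft(Cf, axis=2))
```

Multiplying by the identity tensor (first frontal slice the identity matrix, the rest zero) returns the other factor unchanged, as Definition 2.3 requires.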

###### Definition 2.3 (identity tensor(30))

The identity tensor $\mathcal{I} \in \mathbb{R}^{n \times n \times n_3}$ is the tensor whose first frontal slice is the $n \times n$ identity matrix and whose other frontal slices are all zeros.

###### Definition 2.4 (orthogonal tensor(30))

A tensor $\mathcal{Q} \in \mathbb{R}^{n \times n \times n_3}$ is orthogonal if it satisfies

$$\mathcal{Q}^\top * \mathcal{Q} = \mathcal{Q} * \mathcal{Q}^\top = \mathcal{I}. \qquad (6)$$
###### Definition 2.5 (block diagonal form(33))

Let $\bar{\mathbf{A}}$ denote the block-diagonal matrix of the tensor $\mathcal{A}$ in the Fourier domain, i.e.,

$$\bar{\mathbf{A}} \triangleq \operatorname{blockdiag}(\hat{\mathcal{A}}) = \operatorname{diag}\big(\hat{\mathbf{A}}^{(1)}, \hat{\mathbf{A}}^{(2)}, \ldots, \hat{\mathbf{A}}^{(n_3)}\big). \qquad (7)$$

It is easy to verify that the block-diagonal matrix of $\mathcal{A}^\top$ is equal to the conjugate transpose of the block-diagonal matrix of $\mathcal{A}$, i.e., $\overline{\mathbf{A}^\top} = \bar{\mathbf{A}}^H$. Furthermore, for any tensors $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ and $\mathcal{B} \in \mathbb{R}^{n_2 \times n_4 \times n_3}$, we have

$$\mathcal{A} * \mathcal{B} = \mathcal{C} \;\Leftrightarrow\; \bar{\mathbf{A}}\bar{\mathbf{B}} = \bar{\mathbf{C}}.$$
###### Definition 2.6 (f-diagonal tensor(30))

A tensor is called f-diagonal if each frontal slice is a diagonal matrix.

###### Theorem 2.1 (t-SVD(30; 32))

For $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, the t-SVD of $\mathcal{A}$ is given by

$$\mathcal{A} = \mathcal{U} * \mathcal{S} * \mathcal{V}^\top, \qquad (8)$$

where $\mathcal{U} \in \mathbb{R}^{n_1 \times n_1 \times n_3}$ and $\mathcal{V} \in \mathbb{R}^{n_2 \times n_2 \times n_3}$ are orthogonal tensors, and $\mathcal{S} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ is an f-diagonal tensor.

The illustration of the t-SVD decomposition is in Figure 2. Note that one can efficiently obtain this decomposition by computing matrix SVDs in the Fourier domain as shown in Algorithm 1.
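Following that idea, a compact NumPy version of the factorization (function name ours, a sketch rather than the paper's Algorithm 1) computes one matrix SVD per Fourier-domain frontal slice; the second half of the slices is filled by the conjugate symmetry of the FFT of a real tensor, which keeps the recovered factors real:

```python
import numpy as np

def t_svd(A):
    """t-SVD A = U * S * V^T via matrix SVDs of the Fourier-domain frontal slices."""
    n1, n2, n3 = A.shape
    Af = np.fft.fft(A, axis=2)
    Uf = np.zeros((n1, n1, n3), dtype=complex)
    Sf = np.zeros((n1, n2, n3), dtype=complex)
    Vf = np.zeros((n2, n2, n3), dtype=complex)
    for k in range(n3 // 2 + 1):
        u, s, vh = np.linalg.svd(Af[:, :, k])
        Uf[:, :, k] = u
        Vf[:, :, k] = vh.conj().T
        for i in range(min(n1, n2)):
            Sf[i, i, k] = s[i]
    for k in range(n3 // 2 + 1, n3):
        # conjugate symmetry of the FFT of a real tensor
        Uf[:, :, k] = Uf[:, :, n3 - k].conj()
        Sf[:, :, k] = Sf[:, :, n3 - k].conj()
        Vf[:, :, k] = Vf[:, :, n3 - k].conj()
    U, S, V = (np.real(np.fft.ifft(T, axis=2)) for T in (Uf, Sf, Vf))
    return U, S, V
```

Multiplying the three factors back together (slice-wise in the Fourier domain, with $\hat{\mathbf{V}}^{(k)H}$ on the right) reproduces the input tensor, and each frontal slice of $\mathcal{S}$ is diagonal.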

###### Definition 2.7 (tensor tubal-rank and multi-rank(33))

The tubal-rank of a tensor $\mathcal{A}$, denoted as $\operatorname{rank}_r(\mathcal{A})$, is defined to be the number of non-zero singular tubes of $\mathcal{S}$, where $\mathcal{S}$ comes from the t-SVD of $\mathcal{A}$: $\mathcal{A} = \mathcal{U} * \mathcal{S} * \mathcal{V}^\top$. That is,

$$\operatorname{rank}_r(\mathcal{A}) = \#\{i : \mathcal{S}(i,:,:) \neq 0\}. \qquad (9)$$

The tensor multi-rank of $\mathcal{A}$ is a vector $\mathbf{r} \in \mathbb{R}^{n_3}$ with the $i$-th element equal to the rank of the $i$-th frontal slice of $\hat{\mathcal{A}}$.

###### Definition 2.8 (tensor-nuclear-norm (TNN))

The tubal nuclear norm of a tensor $\mathcal{A}$, denoted as $\|\mathcal{A}\|_{TNN}$, is defined as the sum of the singular values of all the frontal slices of $\hat{\mathcal{A}}$.

In particular,

$$\|\mathcal{A}\|_{TNN} \triangleq \|\bar{\mathbf{A}}\|_* = \sum_{i=1}^{n_3} \|\hat{\mathbf{A}}^{(i)}\|_*. \qquad (10)$$
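Definition 2.8 translates directly into code: sum the nuclear norms of the Fourier-domain frontal slices. (Some later t-SVD works rescale the TNN by $1/n_3$; this sketch follows (10) as written, and the function name is ours.)

```python
import numpy as np

def tnn(A):
    """Tensor nuclear norm (10): nuclear norms of the Fourier-domain frontal slices."""
    Af = np.fft.fft(A, axis=2)
    return sum(np.linalg.svd(Af[:, :, k], compute_uv=False).sum()
               for k in range(A.shape[2]))
```

As a quick check of the convention: for a tensor whose frontal slices all equal one matrix $\mathbf{M}$, only the zero-frequency Fourier slice is nonzero and equals $n_3\mathbf{M}$, so the TNN is $n_3\|\mathbf{M}\|_*$.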

## 3 Main results

In this section, we first present the definition of the proposed PSTNN. Then the PSVT based solver of the PSTNN minimization model is presented. Furthermore, we propose the PSTNN based TC model and TRPCA model and their corresponding algorithms, respectively.

### 3.1 Partial sum of the tensor nuclear norm (PSTNN)

In (33; 34), the TNN is selected to characterize the low-tubal-rank structure of a tensor for the tensor completion problem. The TNN is also chosen to approximate the low-rank part in the RPCA problem (35) and the outlier-RPCA problem (38). It is noteworthy that, for a tensor $\mathcal{A} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, there is a link between its tensor tubal-rank and multi-rank $\mathbf{r}$:

$$\operatorname{rank}_r(\mathcal{A}) = \|\mathbf{r}\|_\infty. \qquad (11)$$

Meanwhile, according to Definition 2.7, the $i$-th element of the multi-rank is $r_i = \operatorname{rank}(\hat{\mathbf{A}}^{(i)})$, and Definition 2.5 implies $\operatorname{rank}(\bar{\mathbf{A}}) = \sum_{i=1}^{n_3} \operatorname{rank}(\hat{\mathbf{A}}^{(i)})$. Thus the $\ell_1$ norm of $\mathcal{A}$'s multi-rank equals the rank of its block-diagonal matrix in the Fourier domain $\bar{\mathbf{A}}$, i.e.,

$$\|\mathbf{r}\|_1 = \operatorname{rank}(\bar{\mathbf{A}}). \qquad (12)$$

More precisely, the TNN defined in (10) is herein a convex relaxation of the $\ell_1$ norm of a third-order tensor's multi-rank, i.e., of $\|\mathbf{r}\|_1$.

Although the nuclear norm minimization problem can be easily solved by singular value thresholding (SVT) (39), nuclear norm based methods treat every singular value equally. However, the larger singular values are generally associated with the major information, and hence they should be shrunk less so as to preserve the major data information (40). Recent advances show that low-rank matrix factorization (41; 42) and the MCP function (43) outperform the nuclear norm. Therefore, we turn to a nonconvex relaxation instead of the nuclear norm.

We first give our novel nonconvex tensor tubal-rank approximation, which is derived from the partial sum of singular values (PSSV) (23; 24). The PSTNN of a third-order tensor $\mathcal{A}$ is defined as follows:

$$\|\mathcal{A}\|_{PSTNN} \triangleq \sum_{i=1}^{n_3} \|\hat{\mathbf{A}}^{(i)}\|_{p=N}. \qquad (13)$$

In (13), $\|\cdot\|_{p=N}$ is the PSSV (23; 24), which is defined as $\|\mathbf{X}\|_{p=N} = \sum_{i=N+1}^{\min(m,n)} \sigma_i(\mathbf{X})$ for a matrix $\mathbf{X} \in \mathbb{R}^{m \times n}$, where $\sigma_i(\mathbf{X})$ denotes the $i$-th largest singular value of $\mathbf{X}$. It is notable that, as illustrated in Figure 3, there is a link between the PSTNN of a tensor and the PSSV of a matrix. From Figure 3, we can find that the definition of PSTNN maintains a distinct meaning in the t-SVD scheme.
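In code, the PSSV and the PSTNN are straightforward (function names ours; setting N = 0 recovers the TNN of (10)):

```python
import numpy as np

def pssv(X, N):
    """Partial sum of singular values: sigma_{N+1} + ... + sigma_{min(m,n)}."""
    s = np.linalg.svd(X, compute_uv=False)  # descending order
    return s[N:].sum()

def pstnn(A, N):
    """PSTNN (13): sum of the PSSV of every Fourier-domain frontal slice."""
    Af = np.fft.fft(A, axis=2)
    return sum(pssv(Af[:, :, k], N) for k in range(A.shape[2]))
```

By construction, a tensor of tubal-rank $r$ has Fourier-domain slices of rank at most $r$, so its PSTNN with $N = r$ vanishes while its TNN does not; this is exactly why the PSTNN avoids penalizing the dominant singular values.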

### 3.2 The PSTNN minimization model

The fundamental PSTNN-based tensor recovery model aims at restoring a tensor from its observation with PSTNN regularization. For an observed tensor $\mathcal{Y}$, the PSTNN regularized tensor recovery model can be written as follows:

$$\mathcal{X} = \arg\min_{\mathcal{X}} \; \lambda \|\mathcal{X}\|_{PSTNN} + \frac{\beta}{2}\|\mathcal{X} - \mathcal{Y}\|_F^2, \qquad (14)$$

where $\mathcal{X}, \mathcal{Y} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ and $\lambda, \beta > 0$.

If we take the FFT of $\mathcal{X}$ and $\mathcal{Y}$ along the third mode, it is easy to see that solving the above optimization problem (14) is equivalent to solving $n_3$ matrix optimization problems in the Fourier domain,

$$\hat{\mathbf{X}}^{(k)} = \arg\min_{\hat{\mathbf{X}}^{(k)}} \; \lambda \|\hat{\mathbf{X}}^{(k)}\|_{p=N} + \frac{\beta}{2}\|\hat{\mathbf{X}}^{(k)} - \hat{\mathbf{Y}}^{(k)}\|_F^2, \qquad (15)$$

for $k = 1, \ldots, n_3$. Thus, the tensor optimization problem (14) is transformed into the matrix optimization problems (15) in the Fourier domain. It should be noted that Oh et al. proposed the closed-form solution of (15) in (23; 24) for real matrices. Hence, we restate the results of (23; 24) and generalize them to complex matrices in the following.

To minimize (15) in the real matrix case, Oh et al. (23; 24) defined the PSVT operator $\mathcal{P}_{N,\tau}$. Before extending the PSVT operator to matrices in the complex field, we first restate von Neumann's lemma (44; 45; 46).

###### Lemma 3.0 (von Neumann)

If $\mathbf{X}, \mathbf{Y} \in \mathbb{C}^{m \times n}$ are complex matrices with singular values

$$\sigma^X_1 \leq \cdots \leq \sigma^X_{\min(m,n)}, \qquad \sigma^Y_1 \leq \cdots \leq \sigma^Y_{\min(m,n)},$$

respectively, then

$$|\langle \mathbf{X}, \mathbf{Y} \rangle| = |\operatorname{Tr}(\mathbf{X}^H \mathbf{Y})| \leq \sum_{r=1}^{\min(m,n)} \sigma^X_r \sigma^Y_r. \qquad (16)$$

Moreover, equality holds in (16) if and only if there exists a simultaneous singular value decomposition of $\mathbf{X}$ and $\mathbf{Y}$ of the following form:

$$\mathbf{X} = \mathbf{U} \operatorname{diag}(\sigma(\mathbf{X})) \mathbf{V}^H \quad \text{and} \quad \mathbf{Y} = \mathbf{U} \operatorname{diag}(\sigma(\mathbf{Y})) \mathbf{V}^H, \qquad (17)$$

where $\mathbf{U} \in \mathbb{C}^{m \times m}$ and $\mathbf{V} \in \mathbb{C}^{n \times n}$ are unitary matrices.

Von Neumann's lemma shows that $|\operatorname{Tr}(\mathbf{X}^H \mathbf{Y})|$ is always bounded by the inner product of the singular values of $\mathbf{X}$ and $\mathbf{Y}$. Notice that the maximum value can only be achieved when $\mathbf{X}$ has the same singular vector matrices $\mathbf{U}$ and $\mathbf{V}$ as $\mathbf{Y}$. This fact is useful for deriving the PSVT.

###### Theorem 3.3 (PSVT)

Let $\tau > 0$, $l = \min(m,n)$, and $\mathbf{Y} \in \mathbb{C}^{m \times n}$, which can be decomposed by SVD. $\mathbf{Y}$ can be considered as the sum of two matrices, $\mathbf{Y} = \mathbf{Y}_1 + \mathbf{Y}_2 = \mathbf{U}_{Y_1}\mathbf{D}_{Y_1}\mathbf{V}_{Y_1}^H + \mathbf{U}_{Y_2}\mathbf{D}_{Y_2}\mathbf{V}_{Y_2}^H$, where $\mathbf{Y}_1$ corresponds to the $N$ largest singular values and $\mathbf{Y}_2$ to the $(N+1)$-th through the last singular values. Define a complex minimization problem for the PSSV as

$$\arg\min_{\mathbf{X}} \; \tau\|\mathbf{X}\|_{p=N} + \frac{1}{2}\|\mathbf{X} - \mathbf{Y}\|_F^2. \qquad (18)$$

Then, the optimal solution of (18) can be expressed by the PSVT operator defined as:

$$\mathcal{P}_{N,\tau}(\mathbf{Y}) = \mathbf{U}_Y\big(\mathbf{D}_{Y_1} + \mathcal{S}_\tau[\mathbf{D}_{Y_2}]\big)\mathbf{V}_Y^H = \mathbf{Y}_1 + \mathbf{U}_{Y_2}\mathcal{S}_\tau[\mathbf{D}_{Y_2}]\mathbf{V}_{Y_2}^H, \qquad (19)$$

where $\mathbf{D}_{Y_1} = \operatorname{diag}(\sigma_1, \ldots, \sigma_N, 0, \ldots, 0)$, $\mathbf{D}_{Y_2} = \operatorname{diag}(0, \ldots, 0, \sigma_{N+1}, \ldots, \sigma_l)$, and $\mathcal{S}_\tau[x] = \operatorname{sign}(x)\max(|x| - \tau, 0)$ is the soft-thresholding operator.

###### Proof 3.1

Let us consider $\mathbf{Y} = \mathbf{U}_Y \mathbf{D}_Y \mathbf{V}_Y^H$ and $\mathbf{X} = \mathbf{U}_X \mathbf{D}_X \mathbf{V}_X^H$, where the singular values are sorted in non-increasing order. We also define the function $J$ as the objective function of (18). The data-fidelity term of (18) can be derived as follows:

$$\frac{1}{2}\|\mathbf{X} - \mathbf{Y}\|_F^2 = \frac{1}{2}\Big(\|\mathbf{Y}\|_F^2 - 2\langle \mathbf{X}, \mathbf{Y} \rangle + \|\mathbf{X}\|_F^2\Big) = \frac{1}{2}\Big(\|\mathbf{Y}\|_F^2 - 2\sum_{i=1}^{l}\sigma_i(\mathbf{X})\,\mathbf{u}_i^H \mathbf{Y} \mathbf{v}_i + \sum_{i=1}^{l}\sigma_i(\mathbf{X})^2\Big). \qquad (20)$$

In the minimization of (20) with respect to $\mathbf{X}$, $\|\mathbf{Y}\|_F^2$ is regarded as a constant and thus can be ignored. For a more detailed representation, we change the parameterization of $\mathbf{X}$ to $(\mathbf{U}_X, \mathbf{V}_X, \mathbf{D}_X)$ and minimize the function:

$$J(\mathbf{U}_X, \mathbf{V}_X, \mathbf{D}_X) = \frac{1}{2}\sum_{i=1}^{l}\Big(-2\sigma_i(\mathbf{X})\,\mathbf{u}_i^H \mathbf{Y} \mathbf{v}_i + \sigma_i(\mathbf{X})^2\Big) + \tau\sum_{i=N+1}^{l}\sigma_i(\mathbf{X}). \qquad (21)$$

From von Neumann's lemma, the upper bound of $\mathbf{u}_i^H \mathbf{Y} \mathbf{v}_i$ is given as $\sigma_i(\mathbf{Y})$ for all $i$ when $\mathbf{U}_X = \mathbf{U}_Y$ and $\mathbf{V}_X = \mathbf{V}_Y$. Then (21) becomes a function depending only on $\mathbf{D}_X$ as follows:

$$J(\mathbf{U}_Y, \mathbf{V}_Y, \mathbf{D}_X) = \frac{1}{2}\sum_{i=1}^{l}\Big(-2\sigma_i(\mathbf{X})\sigma_i(\mathbf{Y}) + \sigma_i(\mathbf{X})^2\Big) + \tau\sum_{i=N+1}^{l}\sigma_i(\mathbf{X}) = \frac{1}{2}\sum_{i=1}^{N}\Big(-2\sigma_i(\mathbf{X})\sigma_i(\mathbf{Y}) + \sigma_i(\mathbf{X})^2\Big) + \frac{1}{2}\sum_{i=N+1}^{l}\Big(-2\sigma_i(\mathbf{X})\sigma_i(\mathbf{Y}) + \sigma_i(\mathbf{X})^2 + 2\tau\sigma_i(\mathbf{X})\Big). \qquad (22)$$

Since (22) consists of simple quadratic equations in each $\sigma_i(\mathbf{X})$ independently, it is trivial to show, by the first-order optimality condition, that the minimum of (22) is obtained at $\hat{\sigma}_i(\mathbf{X})$ in the feasible domain, where $\hat{\sigma}_i(\mathbf{X})$ is defined as

$$\hat{\sigma}_i(\mathbf{X}) = \begin{cases} \sigma_i(\mathbf{Y}), & \text{if } i \leq N, \\ \max\big(\sigma_i(\mathbf{Y}) - \tau,\, 0\big), & \text{if } i > N. \end{cases} \qquad (23)$$

Hence, the solution of (18) is $\mathbf{X}^* = \mathbf{U}_Y \operatorname{diag}(\hat{\sigma}(\mathbf{X})) \mathbf{V}_Y^H$. This result exactly corresponds to the PSVT operator where a feasible solution exists.
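The closed-form solution derived above gives a direct implementation of the PSVT operator: keep the $N$ largest singular values untouched and soft-threshold the rest. A sketch with our own function name; it also covers complex inputs, since NumPy's SVD returns $\mathbf{V}^H$:

```python
import numpy as np

def psvt(Y, N, tau):
    """P_{N,tau}(Y): preserve the N leading singular values, soft-threshold the rest."""
    U, s, Vh = np.linalg.svd(Y, full_matrices=False)
    s_hat = np.concatenate([s[:N], np.maximum(s[N:] - tau, 0.0)])
    return (U * s_hat) @ Vh  # U @ diag(s_hat) @ Vh
```

With N = 0 this reduces to the standard singular value thresholding (SVT) operator; with N >= min(m, n) it returns Y unchanged.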

Therefore, the solution of (15) is

$$\hat{\mathbf{X}}^{*(k)} = \mathcal{P}_{N,\tau}\big(\hat{\mathbf{Y}}^{(k)}\big), \qquad (24)$$

with $\tau = \lambda/\beta$.

Moreover, the pseudocode of the proposed algorithm to solve (14) is given in Algorithm 2.

In the following subsections, based on the proposed rank approximation, we can easily give our proposed tensor completion model and tensor RPCA model.

### 3.3 Tensor completion using PSTNN

A tensor completion model using PSTNN can be formulated as

$$\min_{\mathcal{X}} \; \|\mathcal{X}\|_{PSTNN} \quad \text{s.t.} \quad \mathcal{X}_\Omega = \mathcal{O}_\Omega, \qquad (25)$$

where $\mathcal{O}$ is the observed tensor and $\Omega$ indexes the observed entries. Let

$$\mathbb{I}_\Phi(\mathcal{X}) = \begin{cases} 0, & \text{if } \mathcal{X} \in \Phi, \\ \infty, & \text{otherwise}, \end{cases} \qquad (26)$$

where $\Phi = \{\mathcal{X} : \mathcal{X}_\Omega = \mathcal{O}_\Omega\}$. Thus, the problem (25) can be rewritten as the following unconstrained problem:

$$\min_{\mathcal{X}} \; \mathbb{I}_\Phi(\mathcal{X}) + \|\mathcal{X}\|_{PSTNN}. \qquad (27)$$

Then, the problem (27) can be solved efficiently using ADMM (47; 7; 48; 49; 22).

After introducing an auxiliary tensor $\mathcal{Y}$, the problem (27) can be rewritten as follows:

$$\min_{\mathcal{X},\mathcal{Y}} \; \mathbb{I}_\Phi(\mathcal{Y}) + \|\mathcal{X}\|_{PSTNN} \quad \text{s.t.} \quad \mathcal{Y} = \mathcal{X}. \qquad (28)$$

The augmented Lagrangian function of (28) is given by

$$\mathcal{L}_\beta(\mathcal{X}, \mathcal{Y}, \mathcal{M}) = \mathbb{I}_\Phi(\mathcal{Y}) + \|\mathcal{X}\|_{PSTNN} + \langle \mathcal{M}, \mathcal{X} - \mathcal{Y} \rangle + \frac{\beta}{2}\|\mathcal{X} - \mathcal{Y}\|_F^2 = \mathbb{I}_\Phi(\mathcal{Y}) + \|\mathcal{X}\|_{PSTNN} + \frac{\beta}{2}\Big\|\mathcal{X} - \mathcal{Y} + \frac{\mathcal{M}}{\beta}\Big\|_F^2 + C, \qquad (29)$$

where $\mathcal{M}$ is the Lagrangian multiplier, $\beta$ is the penalty parameter for the violation of the linear constraint, and $C = -\frac{1}{2\beta}\|\mathcal{M}\|_F^2$ is a constant.

Then, the variables in (29) can be updated as:

$$\begin{cases} \mathcal{X}^{k+1} = \arg\min_{\mathcal{X}} \|\mathcal{X}\|_{PSTNN} + \frac{\beta}{2}\Big\|\mathcal{X} - \mathcal{Y}^k + \frac{\mathcal{M}^k}{\beta}\Big\|_F^2, \\[4pt] \mathcal{Y}^{k+1} = \frac{1}{\beta}\big(\beta\mathcal{X}^{k+1} + \mathcal{M}^k\big)_{\Omega^C} + \mathcal{O}_\Omega, \\[4pt] \mathcal{M}^{k+1} = \mathcal{M}^k + \beta\big(\mathcal{X}^{k+1} - \mathcal{Y}^{k+1}\big). \end{cases} \qquad (30)$$

Algorithm 3 shows the pseudocode for the proposed PSTNN based tensor completion method.
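The updates (30) can be sketched end-to-end in NumPy. The X-update is the PSTNN proximal step, which splits into one PSVT per Fourier slice; the threshold carries a factor $n_3$ because the unnormalized FFT satisfies $\|\mathcal{X}\|_F^2 = \frac{1}{n_3}\|\hat{\mathcal{X}}\|_F^2$. The function names and the increasing-penalty schedule for $\beta$ are our assumptions (a standard ADMM heuristic), not the paper's exact Algorithm 3:

```python
import numpy as np

def psvt(Y, N, tau):
    """Partial singular value thresholding (19)."""
    U, s, Vh = np.linalg.svd(Y, full_matrices=False)
    s_hat = np.concatenate([s[:N], np.maximum(s[N:] - tau, 0.0)])
    return (U * s_hat) @ Vh

def prox_pstnn(T, N, tau):
    """argmin_X tau*||X||_PSTNN + 0.5*||X - T||_F^2, slice-wise in the Fourier domain."""
    Tf = np.fft.fft(T, axis=2)
    n3 = T.shape[2]
    for k in range(n3):
        Tf[:, :, k] = psvt(Tf[:, :, k], N, tau * n3)  # n3: Parseval scaling
    return np.real(np.fft.ifft(Tf, axis=2))

def pstnn_tc(O, mask, N, beta=1e-2, rho=1.1, iters=300):
    """ADMM updates (30) for PSTNN tensor completion."""
    X = O * mask
    Y, M = X.copy(), np.zeros_like(X)
    for _ in range(iters):
        X = prox_pstnn(Y - M / beta, N, 1.0 / beta)  # X-update
        Y = X + M / beta                             # Y-update off Omega ...
        Y[mask] = O[mask]                            # ... observed entries kept
        M = M + beta * (X - Y)                       # multiplier update
        beta = min(beta * rho, 1e8)
    return X
```

A useful property of the PSVT step is that the $N$ leading singular values of every Fourier slice pass through unshrunk, so choosing $N$ equal to the target tubal-rank leaves the dominant structure untouched while the small singular values are suppressed.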

### 3.4 Tensor RPCA using PSTNN

A tensor RPCA model using PSTNN can be formulated as

$$\min_{\mathcal{L},\mathcal{E}} \; \|\mathcal{L}\|_{PSTNN} + \lambda\|\mathcal{E}\|_1 \quad \text{s.t.} \quad \mathcal{O} = \mathcal{L} + \mathcal{E}. \qquad (31)$$

Its augmented Lagrangian function is

$$\mathcal{L}_\beta(\mathcal{L}, \mathcal{E}, \mathcal{M}) = \|\mathcal{L}\|_{PSTNN} + \lambda\|\mathcal{E}\|_1 + \langle \mathcal{M}, \mathcal{O} - \mathcal{L} - \mathcal{E} \rangle + \frac{\beta}{2}\|\mathcal{O} - \mathcal{L} - \mathcal{E}\|_F^2 = \|\mathcal{L}\|_{PSTNN} + \lambda\|\mathcal{E}\|_1 + \frac{\beta}{2}\Big\|\mathcal{O} - \mathcal{L} - \mathcal{E} + \frac{\mathcal{M}}{\beta}\Big\|_F^2 + C, \qquad (32)$$

where $\mathcal{M}$ is the Lagrangian multiplier, $\beta$ is the penalty parameter for the violation of the linear constraint, and $C$ is a constant.

Then, the variables in (32) can be updated as:

$$\begin{cases} \mathcal{L}^{k+1} = \arg\min_{\mathcal{L}} \|\mathcal{L}\|_{PSTNN} + \frac{\beta}{2}\Big\|\mathcal{O} - \mathcal{L} - \mathcal{E}^k + \frac{\mathcal{M}^k}{\beta}\Big\|_F^2, \\[4pt] \mathcal{E}^{k+1} = \operatorname{Shrink}_{\lambda/\beta}\Big(\mathcal{O} - \mathcal{L}^{k+1} + \frac{\mathcal{M}^k}{\beta}\Big), \\[4pt] \mathcal{M}^{k+1} = \mathcal{M}^k + \beta\big(\mathcal{O} - \mathcal{L}^{k+1} - \mathcal{E}^{k+1}\big), \end{cases} \qquad (33)$$

where the tensor non-negative soft-thresholding operator is defined as $\operatorname{Shrink}_v(\mathcal{B}) = \bar{\mathcal{B}}$ with

$$\bar{b}_{i_1 i_2 \cdots i_N} = \begin{cases} b_{i_1 i_2 \cdots i_N} - v, & \text{if } b_{i_1 i_2 \cdots i_N} > v, \\ 0, & \text{otherwise}. \end{cases}$$

Algorithm 4 shows the pseudocode for the proposed PSTNN based tensor robust component analysis method.
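A runnable sketch of the updates (33) follows (function names ours). Two labeled assumptions: we use the standard two-sided soft-thresholding for the sparse term rather than the one-sided form written above, and we set $\lambda = 1/\sqrt{n_3\max(n_1,n_2)}$, the usual TNN-TRPCA default, since the paper does not fix it here; the $n_3$ factor in the proximal threshold again comes from the unnormalized FFT:

```python
import numpy as np

def psvt(Y, N, tau):
    """Partial singular value thresholding (19)."""
    U, s, Vh = np.linalg.svd(Y, full_matrices=False)
    return (U * np.concatenate([s[:N], np.maximum(s[N:] - tau, 0.0)])) @ Vh

def prox_pstnn(T, N, tau):
    """PSTNN proximal step: slice-wise PSVT in the Fourier domain."""
    Tf = np.fft.fft(T, axis=2)
    for k in range(T.shape[2]):
        Tf[:, :, k] = psvt(Tf[:, :, k], N, tau * T.shape[2])
    return np.real(np.fft.ifft(Tf, axis=2))

def pstnn_rpca(O, N, lam=None, beta=1e-2, rho=1.1, iters=300):
    """ADMM updates (33) for PSTNN tensor RPCA."""
    n1, n2, n3 = O.shape
    if lam is None:
        lam = 1.0 / np.sqrt(n3 * max(n1, n2))  # assumed default, not from the paper
    L, E, M = np.zeros_like(O), np.zeros_like(O), np.zeros_like(O)
    for _ in range(iters):
        L = prox_pstnn(O - E + M / beta, N, 1.0 / beta)         # L-update
        T = O - L + M / beta
        E = np.sign(T) * np.maximum(np.abs(T) - lam / beta, 0)  # two-sided shrink
        M = M + beta * (O - L - E)                              # multiplier update
        beta = min(beta * rho, 1e8)
    return L, E
```

On a synthetic low-tubal-rank tensor corrupted by sparse spikes, the returned $\mathcal{L}$ should approach the clean component while $\mathcal{E}$ absorbs the spikes.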

## 4 Experimental results

To validate the effectiveness and efficiency of the proposed method, we compare its performance with that of the tensor nuclear norm based methods on both synthetic data sets and real-world application examples. To measure reconstruction accuracy, we employ the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM) (50). The PSNR is defined as

$$\mathrm{PSNR} = 10\log_{10}\frac{\bar{Y}_{\mathrm{true}}^2}{\frac{1}{n_1 n_2 n_3}\|\mathcal{Y} - \mathcal{Y}_{\mathrm{true}}\|_F^2},$$

where $\mathcal{Y}_{\mathrm{true}}$, $\bar{Y}_{\mathrm{true}}$, and $\mathcal{Y}$ are the original tensor, the maximum pixel value of the original tensor, and the estimated tensor, respectively. SSIM measures the structural similarity of two images; please see (50) for details. Better completion results correspond to larger values of PSNR and SSIM. All algorithms are implemented on the platform of Windows 10 and Matlab (R2017b) with an Intel(R) Core(TM) i5-4590 CPU at 3.30 GHz and 16 GB RAM. Our Matlab code is available at https://github.com/uestctensorgroup/PSTNN.
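As a sanity check, the PSNR above is a one-liner in NumPy (function name ours; the peak is taken as the maximum value of the ground truth, following the formula):

```python
import numpy as np

def psnr(Y_est, Y_true):
    """PSNR in dB, with the ground truth's maximum value as the peak."""
    peak = Y_true.max()
    mse = np.mean((Y_est - Y_true) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```

For example, an estimate that is off by exactly 1 everywhere on an 8-bit-scale ground truth gives MSE = 1 and hence PSNR = $20\log_{10} 255 \approx 48.13$ dB.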

### 4.1 Synthetic data

To synthesize a ground-truth low-tubal-rank tensor $\mathcal{X}$ of tubal-rank $r$, we perform a t-product $\mathcal{X} = \mathcal{P} * \mathcal{Q}$, where $\mathcal{P} \in \mathbb{R}^{n_1 \times r \times n_3}$ and $\mathcal{Q} \in \mathbb{R}^{r \times n_2 \times n_3}$ are independently sampled from an i.i.d. Gaussian distribution $\mathcal{N}(0,1)$.

#### 4.1.1 Tensor completion

For the tensor completion task, we try to recover $\mathcal{X}$ from a partial observation consisting of randomly sampled entries of $\mathcal{X}$. To verify the robustness of the TNN based TC method and the proposed PSTNN based TC method, we conducted experiments with respect to the data size, the tubal-rank $r$, and the sampling rate, respectively. We examine the performance by counting the number of successes. If the relative squared error between the recovered $\hat{\mathcal{X}}$ and the ground truth $\mathcal{X}$ is less than a given threshold, we claim that the recovery is successful. We repeat each case 10 times, and each cell in Figure 4 reflects the success percentage, computed as the number of successful runs divided by 10. Figure 4 illustrates that the proposed PSTNN based TC method is more robust than the TNN based TC method, as shown by the larger brown areas.

#### 4.1.2 Tensor robust principal components analysis

For the tensor robust principal component analysis task, $\mathcal{X}$ is corrupted by a sparse noise tensor with a given sparsity and uniformly distributed values. We try to recover $\mathcal{X}$ using Algorithm 4 and the TNN based TRPCA method. The setting of the experiments in this part is similar to that in Section 4.1.1. We conducted experiments with respect to the data size, the tubal-rank $r$, and the sparsity, respectively. We examine the performance by counting the number of successes. We repeat each case 10 times, and each cell in Figure 5 reflects the success percentage, computed as the number of successful runs divided by 10. Figure 5 illustrates that the proposed PSTNN based TRPCA method is more robust than the TNN based TRPCA method, as shown by the smaller blue areas.

#### 4.1.3 Sensitivity to initialization

The converged solution may differ under different initializations, since the proposed objective function is non-convex. To study the sensitivity of the optimization to the initialization, we conducted 1000 experiments with random initialization on a tensor with tubal-rank 5 and with missing entries for the TC task. The distribution of the rooted relative squared error is shown in Figure 6. While the convergence of a non-convex problem to a global optimum is hard to guarantee, most solutions are concentrated in regions near the ground-truth solution with small errors.

### 4.2 Tensor completion for the real-world data

In this subsection, we compare our PSTNN based TC method with HaLRTC (2) and the TNN based TC method (34) on real-world data, including video data, MRI data, and multispectral image (MSI) data. The ratio of missing entries is set as 80%. Figure