 # Low Rank Triple Decomposition and Tensor Recovery

A simple approach for matrix completion and recovery is via low rank matrix decomposition. Recently, low rank matrix decomposition was extended to third order tensors, by T-product, then used in tensor completion and recovery. In this approach, a third order tensor is decomposed as the T-product of two low rank third order tensors. The three modes of the third order tensor are not balanced in this approach. This may cause some problems in applications where three modes have different meanings. The innovation of this paper is to decompose a third order tensor to three low rank third tensors in a balanced way. We call such a decomposition the triple decomposition, and the corresponding rank the triple rank. Numerical tests show that third order tensor data from practical applications such as internet traffic and video image are of low triple ranks. A tensor recovery method is proposed based on such low rank triple decomposition. Comparing with some main methods for tensor recovery, this method has its own merits, and can be a new choice for practical users.

## Authors

##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1 Introduction

A tensor is a multi-dimensional array of numbers, and a higher order generalization of vectors and matrices. It is a natural form of the real world multi-way data. The low rank tensor completion and recovery problem is an extension of the low rank matrix completion and recovery problem.

One approach for matrix completion and recovery is via low rank matrix decomposition. Suppose that we have an real valued matrix . Then we may decompose as the product of two low rank matrices and , where is an matrix, and is an matrix, such that

 X=AB (1.1)

and

 r≤min{m,n}. (1.2)

See Figure 1.

The smallest integer value such that (1.1) holds is called the rank of . Such a decomposition is especially attractive for data matrices arising from real world applications, where, even though the dimensions and may be huge, often the rank may be well smaller than , i.e.,

 r≪min{m,n}. (1.3)

Suppose that is observed as an matrix , which is only known in . Let be the projection operator for . Assume that the rank of takes a value satisfying (1.3). Then the matrix may be recovered as , where and are solutions of

 min12{∥PΩ(AB−M)∥2F:A∈Rm×r,B∈Rr×n}, (1.4)

where is the Frobenius norm for matrices.

Recently, Zhou, Lu, Lin and Zhang  extended the above decomposition based approach to third order tensors. Third order tensors are the most useful higher order tensors in applications [1, 4, 5, 9, 10, 12, 13, 14, 15, 16, 17]. Suppose that we have an real valued tensor . The authors of  showed that we may decompose as the T-product of two low rank third order tensors and , where is an tensor, and is an tensor, such that

 X=A∗B, (1.5)

where denotes the T-product , and

 r≤min{n1,n2}. (1.6)

See Figure 2.

The smallest integer value such that (1.5) holds is called the tubal rank of . For a third order data tensor arising from real applications, we may have,

 r≪min{n1,n2}. (1.7)

Suppose that is observed as an tensor , which is only known in . Let be the projection operator for . Assume that the tubal rank of takes a value satisfying (1.7). Then the tensor may be recovered by , where and are solutions of

 min12{∥PΩ(A∗B−M)∥2F:A∈Rn1×r×n3,B∈Rr×n2×n3}, (1.8)

is the Frobenius norm for tensors. The authors of 

showed that for image and video data recovery, their method clearly demonstrated the superior performance and efficiency over state of the arts including the Tensor Nuclear Norm (TNN) based and matricization based methods.

One may argue, however, that this approach treats the three modes somewhat unequally. In particular, structural properties of the third mode (i.e. corresponding to ) is not exploited in this approach. For many applications, the roles of the three modes have important physical meanings. For example, in the tensor recovery problem for internet traffic data, the three modes of a third order tensor have explicit space and time meanings and their spatio-temporal relations should be considered in its tensor recovery [1, 10, 16]. If we apply the approach in  to such problems, it is not sure which two modes should be used as the first two modes in the T-product.

In this paper, we decompose a third order tensor to three low rank third tensors in a balanced way. We call such a decomposition the triple decomposition, and the corresponding rank the triple rank. Numerical tests show that third order tensor data from practical applications such as internet traffic and video image are of low triple ranks. A tensor recovery method is proposed based on such low rank triple decomposition. Numerically, this method is shown to be efficient.

The rest of this paper is distributed as follows. We introduce triple decomposition and triple rank in the next section. In Section 3, we show that practical data of third order tensors from internet traffic and video image are of low triple ranks. A tensor recovery method is proposed in Section 4, based on such low rank triple decomposition. Numerical comparisons of our method with some main methods for tensor recovery are also presented in that section. Further discussions are made in Section 5.

Comparing with some main methods for tensor recovery, this method has its own merits, and can be a new choice for practical users. This is the contribution of our paper.

## 2 Low Rank Triple Decomposition

Let . As in [5, 6], we use to denote the -th horizontal slice, to denote the -th lateral slice; to denote the -th frontal slice. We say that is a third order horizontally square tensor if all of its horizontal slices are square, i.e., . Similarly, is a third order laterally square tensor (resp. frontally square tensor) if all of its lateral slices (resp. frontal slices) are square, i.e., (resp. ).

Let be a non-zero tensor. We say that is the triple product of a third order horizontally square tensor , a third order laterally square tensor and a third order frontally square tensor , and denote

 X=ABC, (2.9)

if for , and , we have

 xijt=r∑p,q,s=1aiqsbpjscpqt. (2.10)

If

 r≤mid{n1,n2,n3}, (2.11)

then we call (2.9) a low rank triple decomposition of . See Figure 3.

The smallest value of such that (2.10) holds is called the triple rank of , and is denoted as Rank. For a zero tensor, we define its triple rank as zero. Note that Rank is zero if and only if it is a zero tensor. This is analogous to the matrix case.

###### Theorem 2.1

Low rank triple decomposition and triple ranks are well-defined. A third order nonzero tensor always has a low rank triple decomposition (2.9), satisfying (2.11).

Proof Without loss of generality, we may assume that a third order nonzero tensor and . Then mid. Let , if , if , and for , , and , where and are the Kronecker symbol such that and if . Then (2.10) holds. .

Let . By the above argument, we know that . However, in general, has independent entries. By (2.10), we have

 n1n2n3≤r2(n1+n2+n3).

Hence,

 √n1n2n3n1+n2+n3≤r≤n2. (2.12)

One cannot change (2.11) to

 r≤min{n1,n2,n3}. (2.13)

Let and . Then has independent entries. If (2.13) is required, then , , and have only independent entries. We cannot find , and , satisfying (2.9), (2.10) and (2.13).

In the next two sections, we demonstrate that low rank triple decomposition and triple ranks are useful in applications. This is enough to justify their introduction. In the appendix, we study some of their properties and raise some theoretical questions. For practical users, these may be skipped.

## 3 Practical Data of Third Order Tensors

The formula (2.12) holds only for general third order tensors. For a sparse third order tensor , we may have

 r<√n1n2n3n1+n2+n3. (3.14)

The practical data of third order tensors from applications are sparse. In this section, we investigate practical data from applications and show that they can be approximated by triple decomposition of low triple ranks very well.

### 3.1 A Method for Testing The Low Triple Rank Property

We are going to study an alternating least squares (ALS) algorithm for the triple decomposition of third order tensors in this subsection. Consider a given third order tensor with and a fixed positive integer . The following cost function will be minimized

 f(A,B,C):=∥X−ABC∥2F=n1∑i=1n2∑j=1n3∑t=1(xijt−r∑p=1r∑q=1r∑s=1aiqsbpjscpqt)2,

where , , are unknown. In this way, we will obtain a triple decomposition of triple rank not greater than , to approximate .

ALS is an iterative approach starting from an initial points . Then we set and perform the following steps until the iterative sequence converges.

Update . Fixing and , we solve a subproblem

 minA∈Rn1×r×r ∥ABkCk−X∥2F+λ∥A−Ak∥2F,

where is a constant. Let be the mode-1 unfolding of the tensor and be the mode-1 unfolding of the tensor . By introducing a matrix with elements

 Fkℓm=r∑p=1bkpjsckpqt where ℓ=q+(s−1)r,m=j+(t−1)n2, (3.15)

the -subproblem could be represented as

 argminA(1)∈Rn1×r2 ∥A(1)Fk−X(1)∥2F+λ∥A(1)−Ak(1)∥2F (3.16) = (X(1)FkT+λAk(1))(FkFkT+λIr2)−1.

Then, we obtain from which has a closed-form solution (3.16).

Update . Consider the following subproblem

 minB∈Rr×n2×r ∥Ak+1BCk−X∥2F+λ∥B−Bk∥2F,

where and are known. Unfolding in mode-2, we get with entries

 ¯xjm=xijt, where m=i+(t−1)n1. (3.17)

In a similar way, we have . Define with entries

 Gkℓm=r∑q=1ak+1iqsckpqt where ℓ=p+(s−1)r,m=i+(t−1)n1. (3.18)

Then, the -subproblem is rewritten as

 argmin ∥B(2)Gk−X(2)∥2F+λ∥B(2)−Bk(2)∥2F (3.19) = (X(2)GkT+λBk(2))(GkGkT+λIr2)−1.

Hence, could be derived from (3.19).

Update . Using and at hand, we minimize

 minC∈Rr×r×n3 ∥Ak+1Bk+1C−X∥2F+λ∥C−Ck∥2F.

Let be a matrix with entries

 Hkℓm=r∑s=1ak+1iqsbk+1pjs where ℓ=i+(j−1)n1,m=p+(q−1)r. (3.20)

Then, we derive

 argmin ∥HkC(3)−X(3)∥2F+λ∥C(3)−Ck(3)∥2F (3.21) = (HkTHk+λIr2)−1(HkTX(3)+λCk(3)),

where and are the 3-mode unfolding of and , respectively. The third order tensor is a tensor-form of (3.21).

Set and repeat the process. See Algorithm 1 for a complete algorithm. Here, we use the extrapolation technique with step size to deal with the swamp effect. ALS may terminates if the difference between two iterates are tiny

 max{∥Ak+1−Ak∥F∥Ak+1∥F,∥Bk+1−Bk∥F∥Bk+1∥F,∥Ck+1−Ck∥F∥Ck+1∥F}≤ε,

or the iteration arrives a preset maximal iterative number. Convergence analysis of Algorithm 1 could be obtained by a similar approach presented in  using the semialgebraic property of tensors.

### 3.2 Abilene Internet Traffic Data

The first application we consider is the internet traffic data. The data set is the Abilene data set 111The abilene observatory data collections. http://abilene.internet2.edu/observatory/data-collections.html . Figure 4: Relative errors of low triple rank approximations of the 11×11×2016 internet traffic tensor from Abilene dataset.

The Abilene data is arising from the backbone network located in North America. There are 11 routers: Atlanta GA, Chicago IL, Denver CO, Houston TX, Indianapolis, Kansas City MO, Los Angeles CA, New York NY, Sunnyvale CA, Seattle WA, and Washington DC. These routers send and receive data. Thus we get 121 original-destination (OD) pairs. For each OD pair, we record the internet traffic data of every 5 minutes in a week from Dec 8, 2003 to Dec 14, 2003. Hence, there are numbers for each OD pairs. In this way, we get a third order tensor with size -by--by-. This model was used in [1, 16] for internet traffic data recovery.

Now, we examine the triple decomposition approximation of the tensor with different triple rank upper bound among to . For each triple rank upper bound, we compute the triple decomposition approximation by Algorithm 1 and calculate the relative error of low triple rank approximation

 RelativeError=∥XAbil−ABC∥F∥XAbil∥F.

Figure 4 illustrates the relative error of the low rank approximations via triple rank upper bound . When we takes and , the relative error is about and , respectively. This shows that the Abilene data can ba approximated by triple decomposition of low triple rank well. Obviously, the relative error is zero if as this is an upper bound on the rank as shown in (2.12).

A similar conclusion is obtained if we view the Abilene traffic data as a third order tensor arranged differently as which is indexed by source-destination pairs, time slots for each day, and days. This is the model used in . Figure 5 shows the relative error of the low rank approximations obtained by Algorithm 1 as a function of the target ranks upto 30. Actually, this is more illustrative as here and the low rank triple decomposition is very good when . Figure 5: Relative errors of low triple rank approximations of the 121×96×21 internet traffic tensor from Abilene dataset.

### 3.3 ORL Face Data

We now investigate the ORL face data in AT & T Laboratories Cambridge [2, 8, 11]. Figure 6: Low rank approximations of a third order ORL face data tensor of size 112×92×10.

The ORL dataset of faces contains images of 40 persons. Each image has pixels. For each person, there are 10 images taken at different times, varying the lighting, facial expressions and facial details. For instance, the first line of Figure 7 illustrates ten images of a person. Hence, there is a -by--by- tensor . Using Algorithm 1, we compute best low triple rank approximations of the tensor . The relative error of approximations via triple ranks are illustrated in Figure 6. When the triple rank upper bound , the relative error of low triple rank approximations are , respectively. Corresponding images of low triple rank approximations are illustrated in lines 2–4 of Figure 7.

This result shows clearly the ORL data can be approximated by low rank triple decomposition very well. Figure 7: Illustration of faces from the ORL dataset. Original images are illustrated in the first line. Approximations with rank 4, 10, 16 are shown in lines two, three, four, respectively.

## 4 Tensor Recovery

In this section, we consider the tensor recovery problem:

 min ∥P(ABC)−d∥2F, (4.22)

where is a linear operator, is a given vector, and , , and are unknown. To solve (4.22), we introduce a surrogate tensor and transform (4.22) to the closely related optimization problem

 min ∥ABC−T∥2F, (4.23) s.t. P(T)=d.

An alternating least squares algorithm could be proposed for solving the tensor recovery problem (4.23). For a fixed positive integer , we choose , , , and set . Using a similar approach introduced in subsection 3.1, we perform the following steps.

Update . We solve a subproblem

 minT∈Rn1×n2×n3 ∥T−AkBkCk∥2F+λ∥T−Tk∥2F s.t. P(T)=d.

That is

 minT∈Rn1×n2×n3 s.t. P(T)=d.

Define an operator that maps to where . Then, the equality constraint may be rewritten as , where, is the matrix corresponding to the application of the operator

when viewed as a linear transformation from

to . Here we assume that is invertible. Then, the above optimization problem may be represented as

 min s.t. Pvec(T)=d,

which has a closed-form solution

 (I−PT(PPT)−1P)11+λvec(AkBkCk+λTk)+PT(PPT)−1d. (4.24)

This is defined as . Then, we set

 Tk+1=γ˜Tk+(1−γ)Tk.

Update . To solve

 minA∈Rn1×r×r ∥ABkCk−Tk+1∥2F+λ∥A−Ak∥2F,

we obtain by calculating

 argminA(1)∈Rn1×r2 ∥A(1)Fk−Tk+1(1)∥2F+λ∥A(1)−Ak(1)∥2F (4.25) = (Tk+1(1)FkT+λAk(1))(FkFkT+λIr2)−1,

where is defined in (3.15) using and , and , , are 1-mode unfolding of tensors , , , respectively. Then, we set

 Ak+1=γ˜Ak+(1−γ)Ak.

Update . To solve

 minB∈Rr×n2×r ∥Ak+1BCk−Tk+1∥2F+λ∥B−Bk∥2F,

we obtain by calculating

 argminB(2)∈Rn2×r2 ∥B(2)Gk−Tk+1(2)∥2F+λ∥B(2)−Bk(2)∥2F (4.26) = (Tk+1(2)GkT+λBk(2))(GkGkT+λIr2)−1,

where is defined in (3.18) using and , and , , are 2-mode unfolding (3.17) of tensors , , , respectively. Then, the extrapolation is to set

 Bk+1=γ˜Bk+(1−γ)Bk.

Update . To solve

 minC∈Rr×r×n3 ∥Ak+1Bk+1C−Tk+1∥2F+λ∥C−Ck∥2F,

we obtain by calculating

 argminC(3)∈Rr2×n3 ∥HkC(3)−Tk+1(3)∥2F+λ∥C(3)−Ck(3)∥2F (4.27) = (HkTHk+λIr2)−1(HkTTk+1(3)+λCk(3)),

where is defined in (3.20) using and , and , , are 3-mode unfolding of tensors , , , respectively. Then, the extrapolation is to set

 Ck+1=γ˜Ck+(1−γ)Ck.

Subsequently, we set and repeat this process until convergence. The detailed algorithm are illustrated in Algorithm 2.

Next, we are going to compare the triple tensor recovery model (4.23) with the canonical polyadic (CP) decomposition tensor recovery model and the Tucker decomposition tensor recovery model. Let , , and . The CP tensor has entries

 (A,B,C)ijk=r∑p=1aipbjpckp.

In addition, let be a core tensor. The Tucker tensor has entries

 (A,B,C;D)ijk=r∑p,q,s=1aipbjqcksdpqs.

Then, the CP tensor recovery model and the Tucker tensor recovery model could be represented by

 min ∥(A,B,C)−T∥2F s.t. P(T)=d

and

 min ∥(A,B,C;D)−T∥2F s.t. P(T)=d,

respectively. These two models are solved by variants of ALS in Algorithm 2.

### 4.1 ORL Face Data

Next, we apply the triple decomposition tensor recovery method, the CP decomposition tensor recovery method, and the Tucker decomposition tensor recovery method for the ORL dataset of faces. Original images of a person are illustrated in the first line of Figure 8. We sample fifty percent of pixels of these images as shown in the second line of Figure 8. Figure 8: Original images are illustrated in the first row. Samples of 50 percent pixels are illustrated in the second row. The third to last rows report the recovered images by the proposed method, CP tensor recovery, and Tucker tensor recovery, respectively.

Once a tensor is recovered, we calculate the relative error

 RE=∥Trec−Ttruth∥F∥Ttruth∥F.

By varying rank from one to seven, the relative error of recovered tensor by the triple decomposition tensor recovery method, the CP decomposition tensor recovery method, and the Tucker decomposition tensor recovery method are reported in Table 1. Obviously, when the rank is one, all the models are equivalent. As the rank increases, the relative error of the recovered tensor by each method decreases. It is easy to see that the relative error corresponding to the proposed triple tensor recovery method decreases quickly. Hence, the new method performs better than the CP decomposition tensor recovery method and the Tucker decomposition tensor recovery method. We take rank for instance. The recovered tensors by the triple tensor recovery method, the CP decomposition tensor recovery method, and the Tucker decomposition tensor recovery method are illustrated in lines 3–5 of Figure 8. Clearly, the quality of images recovered by proposed triple decomposition tensor recovery method is better than the others.

### 4.2 McGill Calibrated Colour Image Data

We now investigate the McGill Calibrated Colour Image Data [2, 7]. We choose three colour images: butterfly, flower, and grape, which are illustrated in the first column of Figure 9. By resizing the colour image, we get a -by--by- tensor. We randomly choose fifty percent entries of the colour image tensor. Tensors with missing entries are shown in the second column of Figure 9. We choose rank

, the proposed triple decomposition tensor recovery method generated an estimated tensor with relative error

. Estimated color images are illustrated in the third column. Colour images estimated by the CP decomposition tensor recovery method and the Tucker decomposition tensor recovery method are shown in the fourth and the last columns with relative error and , respectively. Obviously, the proposed triple tensor recovery method generates unambiguous colour images. Figure 9: Original images are illustrated in the first column. Samples of 50 percent pixels are illustrated in the second column. The third to last columns report the recovered images by the proposed method, CP tensor recovery, and Tucker tensor recovery, respectively.

## 5 Further Discussion

In this section, we further discuss possible advantages of the tensor recovery method based on low rank triple decomposition with some tensor recovery methods based on low rank decomposition.

In the literature, tensor recovery methods based on low rank decomposition are methods based on low rank CP decomposition , low rank Tucker decomposition [2, 11], low rank T-product decomposition .

Comparing with the CP decomposition methods, our method is a simple algorithm as we only need to consider tensors of smaller dimension to form the triple product. The CP rank is usually higher, may be higher than mid. See Appendix. In applications, practical data may have nonnegativity constraints. It would be hard for CP-decomposition based methods to apply the nonnegativity constraints, but it is not difficult to do this for our method.

Tensor recovery methods based upon low rank Tucker are easy to implement the nonnegativity constraints [2, 11]. However, the Tucker rank is a vector rank . It is not very sure what appropriate values of such and should take . The choices of such and will add the working loads . Our method only involves a triple rank with its value not greater than mid. This is the advantage of our method.

Tensor recovery methods based upon low rank T-product decomposition  are not balanced for the three modes. Further, the computations involve the complex Fourier domain; hence non-negativity are difficult to incorporate. Our method treats three modes in a balanced way, and is not difficult to incorporate the nonnegativity constraints.

We do not think that our method is superior to these methods in every aspects. However, our method has its own merits as discussed above, and thus would be a new choice of the practical applications.

Acknowledgment We are thankful to Dr. Qun Wang, who drew Figures 1-3 for us, and Prof. Ziyan Luo for her comments.

## 6 Appendix: Some Properties of Low Rank Triple Decomposition

We consider some special cases of low rank triple decomposition. If , , and , by (2.10), then elements of the tensor are

 xijk=r∑p,q,s=1aiqsbpjscpqk=aibjckr∑p,q,s=1αqsβpsγpqa scalar.

This means that the product tensor is a rank-one tensor.

Let , where and . Let , where and . Let , where and . Then, by letting , we similarly have

 xijk=r∑p,q,s=1aiqsbpjscpqk=r1∑ur2∑vr3∑w˜Aiu˜Bjv˜Ckwr∑p,q,s=1FuqsGpvsHpqwa core tensor FGH. (6.28)

Then, this is a formulation of the Tucker decomposition. In addition, if the core tensor is a diagonal tensor, we get the CP decomposition.

We have the following proposition.

###### Proposition 6.1

Suppose that ,

 Rank(X)≤RankCP(X)≤Rank(X)3.

Here is the triple rank of , and denotes the CP-rank of tensor .

Proof Suppose that with the CP-rank decomposition with , , for all . Here “” is the outer product, that is,

 Xijk=r∑t=1λtx(t)iy(t)jz(t)k,i∈[n1],  j∈[n2