 # Triple Decomposition and Tensor Recovery of Third Order Tensors

In this paper, we introduce a new tensor decomposition for third order tensors, which decomposes a third order tensor into a product of three third order low rank tensors in a balanced way. We call such a decomposition the triple decomposition, and the corresponding rank the triple rank. We show that the triple rank of a third order tensor is not greater than the CP rank and the middle value of the Tucker rank, is strictly less than the CP rank with a substantial probability, and is strictly less than the middle value of the Tucker rank for an essential class of examples. This indicates that practical data can be approximated by low rank triple decomposition as long as it can be approximated by low rank CP or Tucker decomposition. This theoretical discovery is confirmed numerically. Numerical tests show that third order tensor data from practical applications such as internet traffic and video images are of low triple rank. A tensor recovery method based on low rank triple decomposition is proposed. Its convergence and convergence rate are established. Numerical experiments confirm the efficiency of this method.


## 1 Introduction

Higher order tensors have found many applications in recent years. Third order tensors are the most useful higher order tensors in applications [1, 9, 14, 15, 17, 18, 19, 20, 21, 22]. Tensor decomposition has emerged as a valuable tool for analyzing and computing with such tensors. For example, a key idea behind tensor recovery algorithms is that many practical datasets are highly structured, in the sense that the corresponding tensors can be approximately represented through a low rank decomposition.

The two most well-known tensor decompositions are the CANDECOMP/PARAFAC (CP) decomposition and the Tucker decomposition. Their corresponding ranks are called the CP rank and the Tucker rank, respectively. In the next section, we will review their definitions.

Suppose that we have a third order tensor $\mathcal{X} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, where $n_1, n_2$ and $n_3$ are positive integers. The CP rank of $\mathcal{X}$ may be higher than $\max\{n_1, n_2, n_3\}$. For example, the CP rank of a $9 \times 9 \times 9$ tensor given by Kruskal is between $18$ and $23$. It is known that an upper bound of the CP rank is $\min\{n_1 n_2, n_2 n_3, n_3 n_1\}$.

The Tucker decomposition decomposes $\mathcal{X}$ into a core tensor $\mathcal{D} \in \mathbb{R}^{r_1 \times r_2 \times r_3}$ multiplied by three factor matrices $U \in \mathbb{R}^{n_1 \times r_1}$, $V \in \mathbb{R}^{n_2 \times r_2}$ and $W \in \mathbb{R}^{n_3 \times r_3}$ along the three modes, i.e.,

$$\mathcal{X} = \mathcal{D} \times_1 U \times_2 V \times_3 W.$$

The minimum possible values of $r_1$, $r_2$ and $r_3$ are called the Tucker rank of $\mathcal{X}$. Then $r_i \le n_i$ for $i = 1, 2, 3$. Thus, the Tucker rank is relatively smaller.

In this paper, we introduce a new tensor decomposition for third order tensors, which decomposes a third order tensor into a product of three third order low rank tensors in a balanced way. We call such a decomposition the triple decomposition, and the corresponding rank the triple rank. We show that the triple rank of a third order tensor is not greater than the CP rank and the middle value of the Tucker rank, is strictly less than the CP rank with a substantial probability, and is strictly less than the middle value of the Tucker rank for an essential class of examples. This indicates that practical data can be approximated by low rank triple decomposition as long as it can be approximated by low rank CP or Tucker decomposition. This theoretical discovery is confirmed numerically. Numerical tests show that third order tensor data from practical applications such as internet traffic and video images are of low triple rank. A tensor recovery method based on low rank triple decomposition is proposed. Its convergence and convergence rate are established. Numerical experiments confirm the efficiency of this method.

The rest of this paper is organized as follows. Preliminary knowledge on CP decomposition, Tucker decomposition, and related tensor ranks is presented in the next section. In Section 3, we introduce triple decomposition and triple rank, and prove the above key properties along with some further theoretical properties. In particular, we show that the triple rank of a third order tensor is not greater than the triple rank of the Tucker core of that tensor; if the factor matrices of the Tucker decomposition of that tensor are of full column rank, then the triple ranks of the tensor and its Tucker core are equal. In Section 4, we present an algorithm to check whether a given third order tensor can be approximated by a third order tensor of low triple rank with a reasonably small relative error, and carry out convergence analysis for this algorithm. In Section 5, we show that practical third order tensor data from internet traffic and video images are of low triple rank. A tensor recovery method based on such low rank triple decomposition is proposed in Section 6; its convergence and convergence rate are also established in that section. Numerical comparisons of our method with tensor recovery based upon CP and Tucker decompositions are presented in Section 7. Some concluding remarks are made in Section 8.

## 2 CP Decomposition, Tucker Decomposition and Related Tensor Ranks

We use small letters to denote scalars, small bold letters to denote vectors, capital letters to denote matrices, and calligraphic letters to denote tensors. In this paper, we only study third order tensors.

Perhaps the most well-known tensor decomposition is the CP decomposition. Its corresponding tensor rank is called the CP rank.

###### Definition 2.1

Suppose that $\mathcal{X} = (x_{ijt}) \in \mathbb{R}^{n_1 \times n_2 \times n_3}$. Let $A = (a_{ip}) \in \mathbb{R}^{n_1 \times r}$, $B = (b_{jp}) \in \mathbb{R}^{n_2 \times r}$ and $C = (c_{tp}) \in \mathbb{R}^{n_3 \times r}$. Here, $n_1, n_2, n_3$ and $r$ are positive integers. If

$$x_{ijt} = \sum_{p=1}^{r} a_{ip} b_{jp} c_{tp} \qquad (2.1)$$

for $i = 1, \dots, n_1$, $j = 1, \dots, n_2$ and $t = 1, \dots, n_3$, then $\mathcal{X}$ has a CP decomposition. The smallest integer $r$ such that (2.1) holds is called the CP rank of $\mathcal{X}$, and is denoted as CPRank$(\mathcal{X})$.

As shown in the literature, CPRank$(\mathcal{X}) \le \min\{n_1 n_2, n_2 n_3, n_3 n_1\}$. Tensor recovery methods via CP decomposition have also been proposed.
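As a concrete illustration (our own, not from the paper), the CP model (2.1) can be evaluated with NumPy's `einsum`, or equivalently as a sum of $r$ rank-one outer products; all variable names here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, n3, r = 4, 5, 6, 3
A = rng.standard_normal((n1, r))
B = rng.standard_normal((n2, r))
C = rng.standard_normal((n3, r))

# (2.1): x_{ijt} = sum_p a_{ip} b_{jp} c_{tp}
X = np.einsum('ip,jp,tp->ijt', A, B, C)

# Equivalently, a sum of r rank-one (outer-product) terms.
X_sum = sum(np.multiply.outer(np.multiply.outer(A[:, p], B[:, p]), C[:, p])
            for p in range(r))
assert np.allclose(X, X_sum)
```

The two formulations agree entry by entry, which is exactly the statement that a CP decomposition expresses the tensor as a sum of rank-one terms.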

Another well-known tensor decomposition is the Tucker decomposition. Its corresponding tensor rank is called the Tucker rank. The higher order singular value decomposition (HOSVD) can be regarded as a special variant of the Tucker decomposition.

###### Definition 2.2

Suppose that $\mathcal{X} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, where $n_1, n_2$ and $n_3$ are positive integers. We may unfold $\mathcal{X}$ to a matrix $X_{(1)} \in \mathbb{R}^{n_1 \times n_2 n_3}$, or a matrix $X_{(2)} \in \mathbb{R}^{n_2 \times n_1 n_3}$, or a matrix $X_{(3)} \in \mathbb{R}^{n_3 \times n_1 n_2}$. Denote the matrix ranks of $X_{(1)}$, $X_{(2)}$ and $X_{(3)}$ as $r_1$, $r_2$ and $r_3$, respectively. Then the triplet $(r_1, r_2, r_3)$ is called the Tucker rank of $\mathcal{X}$, and is denoted as TucRank$(\mathcal{X})$, with TucRank$(\mathcal{X})_i = r_i$ for $i = 1, 2, 3$.
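The unfoldings and the resulting Tucker rank are straightforward to compute numerically. The following NumPy sketch (our own, with illustrative names) uses `reshape`/`moveaxis` for the mode-$k$ unfoldings and `matrix_rank` for their ranks:

```python
import numpy as np

rng = np.random.default_rng(3)
n1, n2, n3 = 4, 5, 6
X = rng.standard_normal((n1, n2, n3))

# Mode-k unfolding: bring mode k to the front, then flatten the rest.
X1 = X.reshape(n1, n2 * n3)
X2 = np.moveaxis(X, 1, 0).reshape(n2, n1 * n3)
X3 = np.moveaxis(X, 2, 0).reshape(n3, n1 * n2)

tucker_rank = (np.linalg.matrix_rank(X1),
               np.linalg.matrix_rank(X2),
               np.linalg.matrix_rank(X3))
# A generic random tensor has full Tucker rank (n1, n2, n3).
```

Different papers order the unfolded columns differently; the ranks are unaffected by the column ordering.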

The CP rank and the Tucker rank are called the rank and the $n$-rank, respectively, in some papers. Here, we keep the names CP rank and Tucker rank to distinguish them from other tensor ranks.

###### Definition 2.3

Suppose that $\mathcal{X} = (x_{ijt}) \in \mathbb{R}^{n_1 \times n_2 \times n_3}$. Let $U = (u_{ip}) \in \mathbb{R}^{n_1 \times r_1}$, $V = (v_{jq}) \in \mathbb{R}^{n_2 \times r_2}$, $W = (w_{ts}) \in \mathbb{R}^{n_3 \times r_3}$ and $\mathcal{D} = (d_{pqs}) \in \mathbb{R}^{r_1 \times r_2 \times r_3}$. Here, $r_1, r_2$ and $r_3$ are positive integers. If

$$x_{ijt} = \sum_{p=1}^{r_1} \sum_{q=1}^{r_2} \sum_{s=1}^{r_3} u_{ip} v_{jq} w_{ts} d_{pqs} \qquad (2.2)$$

for $i = 1, \dots, n_1$, $j = 1, \dots, n_2$ and $t = 1, \dots, n_3$, then $\mathcal{X}$ has a Tucker decomposition. The matrices $U$, $V$ and $W$ are called factor matrices of the Tucker decomposition, and the tensor $\mathcal{D}$ is called the Tucker core. We may also denote the Tucker decomposition as

$$\mathcal{X} = \mathcal{D} \times_1 U \times_2 V \times_3 W. \qquad (2.3)$$

The Tucker ranks of $\mathcal{X}$ are the smallest integers $r_1, r_2$ and $r_3$ such that (2.2) holds. Nonnegative tensor recovery methods via Tucker decomposition can be found in [16, 5].
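A minimal NumPy sketch (ours, not the paper's code) of evaluating (2.2) for a given core and factor matrices; the bound TucRank$(\mathcal{X})_k \le r_k$ can then be checked on the unfoldings:

```python
import numpy as np

rng = np.random.default_rng(2)
n1, n2, n3 = 4, 5, 6
r1, r2, r3 = 2, 3, 2
D = rng.standard_normal((r1, r2, r3))   # Tucker core
U = rng.standard_normal((n1, r1))       # factor matrices
V = rng.standard_normal((n2, r2))
W = rng.standard_normal((n3, r3))

# (2.2): x_{ijt} = sum_{p,q,s} u_{ip} v_{jq} w_{ts} d_{pqs}
X = np.einsum('ip,jq,ts,pqs->ijt', U, V, W, D)
```

Each mode-$k$ unfolding of this $\mathcal{X}$ has matrix rank at most $r_k$, since its rows (columns) lie in the column space of the corresponding factor matrix.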

## 3 Triple Decomposition, Triple Rank and Their Properties

Let $\mathcal{X} = (x_{ijt}) \in \mathbb{R}^{n_1 \times n_2 \times n_3}$. We use $X_{i::}$ to denote the $i$-th horizontal slice, $X_{:j:}$ to denote the $j$-th lateral slice, and $X_{::t}$ to denote the $t$-th frontal slice. We say that $\mathcal{X}$ is a third order horizontally square tensor if all of its horizontal slices are square, i.e., $n_2 = n_3$. Similarly, $\mathcal{X}$ is a third order laterally square tensor (resp. frontally square tensor) if all of its lateral slices (resp. frontal slices) are square, i.e., $n_1 = n_3$ (resp. $n_1 = n_2$).

###### Definition 3.1

Let $\mathcal{X} = (x_{ijt}) \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ be a non-zero tensor. We say that $\mathcal{X}$ is the triple product of a third order horizontally square tensor $\mathcal{A} = (a_{iqs}) \in \mathbb{R}^{n_1 \times r \times r}$, a third order laterally square tensor $\mathcal{B} = (b_{pjs}) \in \mathbb{R}^{r \times n_2 \times r}$ and a third order frontally square tensor $\mathcal{C} = (c_{pqt}) \in \mathbb{R}^{r \times r \times n_3}$, and denote

$$\mathcal{X} = \mathcal{A}\mathcal{B}\mathcal{C}, \qquad (3.4)$$

if for $i = 1, \dots, n_1$, $j = 1, \dots, n_2$ and $t = 1, \dots, n_3$, we have

$$x_{ijt} = \sum_{p,q,s=1}^{r} a_{iqs} b_{pjs} c_{pqt}. \qquad (3.5)$$

If

$$r \le \operatorname{mid}\{n_1, n_2, n_3\}, \qquad (3.6)$$

then we call (3.4) a low rank triple decomposition of $\mathcal{X}$. See Figure 1 for a visualization.

The smallest value of $r$ such that (3.5) holds is called the triple rank of $\mathcal{X}$, and is denoted as TriRank$(\mathcal{X})$. For a zero tensor, we define its triple rank as zero.

Note that TriRank$(\mathcal{X})$ is zero if and only if $\mathcal{X}$ is a zero tensor. This is analogous to the matrix case.
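Formula (3.5) is a single `einsum` contraction in NumPy. The following sketch (with our own helper name `triple_product`) evaluates it and cross-checks the contraction against explicit loops:

```python
import numpy as np

def triple_product(A, B, C):
    # (3.5): x_{ijt} = sum_{p,q,s} a_{iqs} b_{pjs} c_{pqt}
    return np.einsum('iqs,pjs,pqt->ijt', A, B, C)

rng = np.random.default_rng(0)
n1, n2, n3, r = 4, 5, 6, 2           # r <= mid{n1, n2, n3}, so (3.6) holds
A = rng.standard_normal((n1, r, r))  # horizontally square
B = rng.standard_normal((r, n2, r))  # laterally square
C = rng.standard_normal((r, r, n3))  # frontally square
X = triple_product(A, B, C)

# Cross-check the einsum contraction against explicit loops.
X_loop = np.zeros((n1, n2, n3))
for i in range(n1):
    for j in range(n2):
        for t in range(n3):
            X_loop[i, j, t] = sum(A[i, q, s] * B[p, j, s] * C[p, q, t]
                                  for p in range(r) for q in range(r)
                                  for s in range(r))
assert np.allclose(X, X_loop)
```

Note how the three factors play symmetric roles, one carrying each of the three outer dimensions, which is the "balanced" aspect of the decomposition.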

###### Theorem 3.2

Low rank triple decomposition and triple ranks are well-defined. A third order nonzero tensor always has a low rank triple decomposition (3.4), satisfying (3.6).

Proof Without loss of generality, we may assume that we have a third order nonzero tensor $\mathcal{X} = (x_{ijt}) \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ with $n_1 \ge n_2 \ge n_3$. Thus, $\operatorname{mid}\{n_1, n_2, n_3\} = n_2$. Let $r = n_2$. Let $\mathcal{A} = (a_{iqs}) \in \mathbb{R}^{n_1 \times r \times r}$, $\mathcal{B} = (b_{pjs}) \in \mathbb{R}^{r \times n_2 \times r}$ and $\mathcal{C} = (c_{pqt}) \in \mathbb{R}^{r \times r \times n_3}$ be such that $a_{iqs} = x_{isq}$ if $q \le n_3$, $a_{iqs} = 0$ otherwise, $b_{pjs} = \delta_{pj}\delta_{js}$, and $c_{pqt} = \delta_{qt}$, for $i = 1, \dots, n_1$, $p, q, s = 1, \dots, r$ and $t = 1, \dots, n_3$, where $\delta$ is the Kronecker symbol such that $\delta_{pq} = 1$ if $p = q$ and $\delta_{pq} = 0$ otherwise. Then, for all valid $i$, $j$ and $t$,
$$\sum_{p,q,s=1}^{r} a_{iqs} b_{pjs} c_{pqt} = \sum_{q=1}^{r} a_{iqj} \delta_{qt} = a_{itj} = x_{ijt},$$
i.e., (3.5) holds for the above choices of $\mathcal{A}$, $\mathcal{B}$ and $\mathcal{C}$. Thus, the triple decomposition always exists with $r = \operatorname{mid}\{n_1, n_2, n_3\}$. $\square$
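The existence argument can be verified numerically. The construction below, one valid choice of $\mathcal{A}, \mathcal{B}, \mathcal{C}$ written by us for illustration and assuming $n_1 \ge n_2 \ge n_3$, reproduces an arbitrary tensor exactly with $r = n_2 = \operatorname{mid}\{n_1, n_2, n_3\}$:

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, n3 = 4, 3, 2          # n1 >= n2 >= n3, so mid{n1, n2, n3} = n2
r = n2
X = rng.standard_normal((n1, n2, n3))

# One explicit choice realizing X = ABC with r = mid{n1, n2, n3}:
#   a_{iqs} = x_{isq} for q <= n3 (0 otherwise),
#   b_{pjs} = delta_{pj} delta_{js},  c_{pqt} = delta_{qt}.
A = np.zeros((n1, r, r))
for q in range(n3):
    for s in range(n2):
        A[:, q, s] = X[:, s, q]
B = np.zeros((r, n2, r))
for j in range(n2):
    B[j, j, j] = 1.0
C = np.zeros((r, r, n3))
for t in range(n3):
    C[:, t, t] = 1.0

X_rec = np.einsum('iqs,pjs,pqt->ijt', A, B, C)   # triple product (3.5)
assert np.allclose(X, X_rec)
```

Here $\mathcal{B}$ and $\mathcal{C}$ act as Kronecker selectors, so the contraction simply reads the entries of $\mathcal{X}$ back out of $\mathcal{A}$.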

Note that one cannot change (3.6) to

$$r \le \min\{n_1, n_2, n_3\}. \qquad (3.7)$$

The above assertion can be seen through the following parameter-counting argument. Let $n_1 = n$ and $n_2 = n_3 = n^2$ with $n \ge 3$. Suppose that $\mathcal{X} \in \mathbb{R}^{n \times n^2 \times n^2}$ is chosen to have $n^5$ independent entries. If (3.7) is required, then with $r \le n$, the decomposition consists of $\mathcal{A} \in \mathbb{R}^{n \times r \times r}$, $\mathcal{B} \in \mathbb{R}^{r \times n^2 \times r}$ and $\mathcal{C} \in \mathbb{R}^{r \times r \times n^2}$, which can have only a maximum of $n^3 + 2n^4 < n^5$ independent entries in total. Thus, we cannot find $\mathcal{A}$, $\mathcal{B}$ and $\mathcal{C}$ satisfying (3.4), (3.5) and (3.7).

Suppose that $\mathcal{X} = \mathcal{A}\mathcal{B}\mathcal{C}$, where $\mathcal{A} = \mathcal{F} \times_1 \tilde{A}$, $\mathcal{B} = \mathcal{G} \times_2 \tilde{B}$ and $\mathcal{C} = \mathcal{H} \times_3 \tilde{C}$, with $\tilde{A} \in \mathbb{R}^{n_1 \times r_1}$, $\tilde{B} \in \mathbb{R}^{n_2 \times r_2}$, $\tilde{C} \in \mathbb{R}^{n_3 \times r_3}$, $\mathcal{F} \in \mathbb{R}^{r_1 \times r \times r}$, $\mathcal{G} \in \mathbb{R}^{r \times r_2 \times r}$ and $\mathcal{H} \in \mathbb{R}^{r \times r \times r_3}$. Then, we have

$$x_{ijk} = \sum_{p,q,s=1}^{r} a_{iqs} b_{pjs} c_{pqk} = \sum_{u=1}^{r_1} \sum_{v=1}^{r_2} \sum_{w=1}^{r_3} \tilde{A}_{iu} \tilde{B}_{jv} \tilde{C}_{kw} \underbrace{\sum_{p,q,s=1}^{r} f_{uqs}\, g_{pvs}\, h_{pqw}}_{\text{a core tensor } \mathcal{F}\mathcal{G}\mathcal{H}}. \qquad (3.8)$$

Thus, this is a formulation of the Tucker decomposition, with core tensor $\mathcal{F}\mathcal{G}\mathcal{H}$. In addition, if the core tensor is a diagonal tensor, we get the CP decomposition.
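Identity (3.8) can be checked numerically: compressing the three triple factors by matrices $\tilde A, \tilde B, \tilde C$ gives the same tensor as first forming the triple product of the small tensors and then applying the matrices as Tucker factors. A NumPy sketch with illustrative names:

```python
import numpy as np

rng = np.random.default_rng(2)
n1, n2, n3 = 4, 5, 6
r1, r2, r3, r = 2, 3, 2, 2
At = rng.standard_normal((n1, r1))   # the "tilde" matrices
Bt = rng.standard_normal((n2, r2))
Ct = rng.standard_normal((n3, r3))
F = rng.standard_normal((r1, r, r))
G = rng.standard_normal((r, r2, r))
H = rng.standard_normal((r, r, r3))

A = np.einsum('iu,uqs->iqs', At, F)   # A = F x_1 At
B = np.einsum('jv,pvs->pjs', Bt, G)   # B = G x_2 Bt
C = np.einsum('kw,pqw->pqk', Ct, H)   # C = H x_3 Ct

lhs = np.einsum('iqs,pjs,pqk->ijk', A, B, C)            # X = ABC
core = np.einsum('uqs,pvs,pqw->uvw', F, G, H)           # core FGH, cf. (3.8)
rhs = np.einsum('uvw,iu,jv,kw->ijk', core, At, Bt, Ct)  # core x_1 At x_2 Bt x_3 Ct
assert np.allclose(lhs, rhs)
```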

We now study the relation between triple decomposition and CP decomposition. We have the following theorem.

###### Theorem 3.3

Suppose that $\mathcal{X} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$. Then we have

$$\operatorname{TriRank}(\mathcal{X}) \le \operatorname{CPRank}(\mathcal{X}) \le \left(\operatorname{TriRank}(\mathcal{X})\right)^3.$$

Proof Suppose that $x_{ijt} = \sum_{p=1}^{r} a_{ip} b_{jp} c_{tp}$ with $r = \operatorname{CPRank}(\mathcal{X})$ is a CP decomposition of $\mathcal{X}$. Denote $\bar{\mathcal{A}} = (\bar{a}_{iqs}) \in \mathbb{R}^{n_1 \times r \times r}$, $\bar{\mathcal{B}} = (\bar{b}_{pjs}) \in \mathbb{R}^{r \times n_2 \times r}$ and $\bar{\mathcal{C}} = (\bar{c}_{pqt}) \in \mathbb{R}^{r \times r \times n_3}$ with

$$\bar{a}_{iqs} = a_{iq}\delta_{qs}, \quad \bar{b}_{pjs} = b_{js}\delta_{ps}, \quad \bar{c}_{pqt} = c_{tp}\delta_{pq}.$$

Then for all $i = 1, \dots, n_1$, $j = 1, \dots, n_2$ and $t = 1, \dots, n_3$, there holds

$$(\bar{\mathcal{A}}\bar{\mathcal{B}}\bar{\mathcal{C}})_{ijt} = \sum_{p,q,s=1}^{r} \bar{a}_{iqs}\bar{b}_{pjs}\bar{c}_{pqt} = \sum_{p=1}^{r} a_{ip} b_{jp} c_{tp} = x_{ijt}.$$

This means that $\mathcal{X} = \bar{\mathcal{A}}\bar{\mathcal{B}}\bar{\mathcal{C}}$ and $\operatorname{TriRank}(\mathcal{X}) \le r = \operatorname{CPRank}(\mathcal{X})$ from the definition of the triple rank.

On the other hand, suppose that $\mathcal{X}$ is of the form (3.4)-(3.5) with $r = \operatorname{TriRank}(\mathcal{X})$. Then, by (3.5), $\mathcal{X}$ can be represented as a sum of $r^3$ rank-one tensors, one for each index triplet $(p, q, s)$. Hence, the last inequality in the theorem holds. $\square$
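The diagonal embedding of the CP factors used in the first half of the proof can be verified directly; the sketch below is our own illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2, n3, r = 4, 5, 6, 3
Afac = rng.standard_normal((n1, r))
Bfac = rng.standard_normal((n2, r))
Cfac = rng.standard_normal((n3, r))
X_cp = np.einsum('ip,jp,tp->ijt', Afac, Bfac, Cfac)   # CP model (2.1)

# Diagonal embedding of the CP factors into triple factors:
#   abar_{iqs} = a_{iq} delta_{qs},  bbar_{pjs} = b_{js} delta_{ps},
#   cbar_{pqt} = c_{tp} delta_{pq}.
Abar = np.zeros((n1, r, r))
Bbar = np.zeros((r, n2, r))
Cbar = np.zeros((r, r, n3))
for p in range(r):
    Abar[:, p, p] = Afac[:, p]
    Bbar[p, :, p] = Bfac[:, p]
    Cbar[p, p, :] = Cfac[:, p]

X_tri = np.einsum('iqs,pjs,pqt->ijt', Abar, Bbar, Cbar)   # triple product (3.5)
assert np.allclose(X_cp, X_tri)
```

The three Kronecker deltas force $p = q = s$ in the contraction, collapsing (3.5) back to the single-index CP sum (2.1).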

This theorem indicates that the triple rank is not greater than the CP rank. As the CP rank may be greater than $\max\{n_1, n_2, n_3\}$, while the triple rank is not greater than $\operatorname{mid}\{n_1, n_2, n_3\}$, there is a good chance that the triple rank is strictly smaller than the CP rank. Monte Carlo experiments reported in the literature reveal that $2 \times 2 \times 2$ tensors are of CP rank three with probability about $0.21$. Since the triple rank is not greater than two in this case, with a substantial probability the triple rank is strictly less than the CP rank.

Next, we study the relation between triple decomposition and Tucker decomposition. We have the following theorem.

###### Theorem 3.4

Suppose that $\mathcal{X} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ and

$$\mathcal{X} = \mathcal{D} \times_1 U \times_2 V \times_3 W$$

is a Tucker decomposition of $\mathcal{X}$ with core tensor $\mathcal{D} \in \mathbb{R}^{r_1 \times r_2 \times r_3}$ and factor matrices $U \in \mathbb{R}^{n_1 \times r_1}$, $V \in \mathbb{R}^{n_2 \times r_2}$ and $W \in \mathbb{R}^{n_3 \times r_3}$. Then

$$\operatorname{TriRank}(\mathcal{X}) \le \operatorname{TriRank}(\mathcal{D}) \le \operatorname{mid}\{r_1, r_2, r_3\}. \qquad (3.9)$$

Furthermore, if $\operatorname{TucRank}(\mathcal{X}) = (r_1, r_2, r_3)$, then we have

$$\operatorname{TriRank}(\mathcal{X}) = \operatorname{TriRank}(\mathcal{D}). \qquad (3.10)$$

Thus, we always have

$$\operatorname{TriRank}(\mathcal{X}) \le \operatorname{mid}\{\operatorname{TucRank}(\mathcal{X})_1, \operatorname{TucRank}(\mathcal{X})_2, \operatorname{TucRank}(\mathcal{X})_3\}. \qquad (3.11)$$

Proof For convenience of notation, let $r = \operatorname{TriRank}(\mathcal{D})$. By (3.6), we have the second inequality of (3.9).

We first show that $\operatorname{TriRank}(\mathcal{X}) \le r$. Assume that $\mathcal{D} = \bar{\mathcal{A}}\bar{\mathcal{B}}\bar{\mathcal{C}}$ with $\bar{\mathcal{A}} \in \mathbb{R}^{r_1 \times r \times r}$, $\bar{\mathcal{B}} \in \mathbb{R}^{r \times r_2 \times r}$ and $\bar{\mathcal{C}} \in \mathbb{R}^{r \times r \times r_3}$. Then

$$\mathcal{X} = (\bar{\mathcal{A}}\bar{\mathcal{B}}\bar{\mathcal{C}}) \times_1 U \times_2 V \times_3 W = (\bar{\mathcal{A}} \times_1 U)(\bar{\mathcal{B}} \times_2 V)(\bar{\mathcal{C}} \times_3 W).$$

Clearly, $\bar{\mathcal{A}} \times_1 U \in \mathbb{R}^{n_1 \times r \times r}$, $\bar{\mathcal{B}} \times_2 V \in \mathbb{R}^{r \times n_2 \times r}$ and $\bar{\mathcal{C}} \times_3 W \in \mathbb{R}^{r \times r \times n_3}$. Hence, $\operatorname{TriRank}(\mathcal{X}) \le r$ from the definition of TriRank. This proves the first inequality of (3.9).

Now we assume that $\operatorname{TucRank}(\mathcal{X}) = (r_1, r_2, r_3)$, and show that $\operatorname{TriRank}(\mathcal{D}) \le \operatorname{TriRank}(\mathcal{X})$. By $\operatorname{TucRank}(\mathcal{X}) = (r_1, r_2, r_3)$, we know that the factor matrices $U$, $V$ and $W$ are of full column rank. Then $U^{\mathsf T}U$, $V^{\mathsf T}V$ and $W^{\mathsf T}W$ are invertible. From $\mathcal{X} = \mathcal{D} \times_1 U \times_2 V \times_3 W$, we have that

$$\mathcal{X} \times_1 (U^{\mathsf T}U)^{-1}U^{\mathsf T} \times_2 (V^{\mathsf T}V)^{-1}V^{\mathsf T} \times_3 (W^{\mathsf T}W)^{-1}W^{\mathsf T} = \mathcal{D} \times_1 (U^{\mathsf T}U)^{-1}(U^{\mathsf T}U) \times_2 (V^{\mathsf T}V)^{-1}(V^{\mathsf T}V) \times_3 (W^{\mathsf T}W)^{-1}(W^{\mathsf T}W) = \mathcal{D} \times_1 I_{r_1} \times_2 I_{r_2} \times_3 I_{r_3} = \mathcal{D}.$$

Hence, writing $\mathcal{X} = \mathcal{A}\mathcal{B}\mathcal{C}$ with $r' = \operatorname{TriRank}(\mathcal{X})$, it holds that

$$\mathcal{D} = (\mathcal{A}\mathcal{B}\mathcal{C}) \times_1 (U^{\mathsf T}U)^{-1}U^{\mathsf T} \times_2 (V^{\mathsf T}V)^{-1}V^{\mathsf T} \times_3 (W^{\mathsf T}W)^{-1}W^{\mathsf T} = \left(\mathcal{A} \times_1 (U^{\mathsf T}U)^{-1}U^{\mathsf T}\right)\left(\mathcal{B} \times_2 (V^{\mathsf T}V)^{-1}V^{\mathsf T}\right)\left(\mathcal{C} \times_3 (W^{\mathsf T}W)^{-1}W^{\mathsf T}\right).$$

It is easy to see that $\mathcal{A} \times_1 (U^{\mathsf T}U)^{-1}U^{\mathsf T} \in \mathbb{R}^{r_1 \times r' \times r'}$, $\mathcal{B} \times_2 (V^{\mathsf T}V)^{-1}V^{\mathsf T} \in \mathbb{R}^{r' \times r_2 \times r'}$ and $\mathcal{C} \times_3 (W^{\mathsf T}W)^{-1}W^{\mathsf T} \in \mathbb{R}^{r' \times r' \times r_3}$. From the definition of TriRank, we have $\operatorname{TriRank}(\mathcal{D}) \le r' = \operatorname{TriRank}(\mathcal{X})$. Therefore $\operatorname{TriRank}(\mathcal{X}) = \operatorname{TriRank}(\mathcal{D})$ and (3.10) holds.

Note that the condition $\operatorname{TucRank}(\mathcal{X}) = (r_1, r_2, r_3)$ can always be realized. For example, in HOSVD, all factor matrices are orthogonal, and we always have $\operatorname{TucRank}(\mathcal{X}) = (r_1, r_2, r_3)$. This shows that (3.11) always holds. $\square$
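The core-recovery step of the proof, multiplying each mode by the left inverse $(M^{\mathsf T}M)^{-1}M^{\mathsf T}$ of the corresponding factor matrix, can be checked numerically; `mode_mult` below is our own helper for the mode-$k$ product:

```python
import numpy as np

def mode_mult(T, M, mode):
    # Mode-k product T x_k M: contract M's columns with T's k-th axis.
    return np.moveaxis(np.tensordot(M, T, axes=(1, mode)), 0, mode)

rng = np.random.default_rng(5)
n1, n2, n3, r1, r2, r3 = 5, 6, 7, 2, 3, 4
U = rng.standard_normal((n1, r1))   # full column rank with probability one
V = rng.standard_normal((n2, r2))
W = rng.standard_normal((n3, r3))
D = rng.standard_normal((r1, r2, r3))
X = np.einsum('pqs,ip,jq,ts->ijt', D, U, V, W)   # X = D x_1 U x_2 V x_3 W

# Left inverses (M^T M)^{-1} M^T undo each mode multiplication.
Pu = np.linalg.inv(U.T @ U) @ U.T
Pv = np.linalg.inv(V.T @ V) @ V.T
Pw = np.linalg.inv(W.T @ W) @ W.T
D_rec = mode_mult(mode_mult(mode_mult(X, Pu, 0), Pv, 1), Pw, 2)
assert np.allclose(D, D_rec)
```

For orthonormal factor matrices, as in HOSVD, the left inverse reduces to the transpose $M^{\mathsf T}$.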

The condition $\operatorname{TucRank}(\mathcal{X}) = (r_1, r_2, r_3)$ holds if all factor matrices are of full column rank. A Tucker decomposition whose factor matrices are all of full column rank is called independent in the literature.

We now give an example in which $\operatorname{TriRank}(\mathcal{X}) < \min\{\operatorname{TucRank}(\mathcal{X})_1, \operatorname{TucRank}(\mathcal{X})_2, \operatorname{TucRank}(\mathcal{X})_3\}$.

###### Example 3.5

Let and . Consider , , and such that and otherwise, and otherwise, and and otherwise. Then TucRank TucRank TucRank. Let . Then TriRank and . We have and otherwise. We may easily check that TucRank TucRank TucRank. Thus, TriRank TucRank TucRank TucRank.

Taking the conclusion of the above example further, the following probabilistic argument shows that, in fact, the triple rank is smaller than the smallest Tucker rank for an essential class of examples. Let and , , and . Then and are matrices. With probability one, these three matrices are nonsingular, i.e., TucRank TucRank TucRank. Let . Then and and are matrices. With probability one, these three matrices are also nonsingular, i.e., TucRank TucRank TucRank. Then, with probability one, we have TriRank TucRank TucRank TucRank. This shows that there is a substantial chance that TriRank mid TucRank, TucRank, TucRank.

The above two theorems indicate that practical data can be approximated by low rank triple decomposition as long as it can be approximated by low rank CP or Tucker decomposition. This theoretical discovery will be confirmed numerically in the later sections.

Next, we have the following proposition relating the triple rank to the Tucker rank.

###### Proposition 3.6

Suppose that $\mathcal{X} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ and $\mathcal{X} = \mathcal{A}\mathcal{B}\mathcal{C}$ with $\mathcal{A} \in \mathbb{R}^{n_1 \times r \times r}$, $\mathcal{B} \in \mathbb{R}^{r \times n_2 \times r}$, $\mathcal{C} \in \mathbb{R}^{r \times r \times n_3}$ and $r = \operatorname{TriRank}(\mathcal{X})$. Then

$$\operatorname{TucRank}(\mathcal{X})_1 \le \operatorname{TucRank}(\mathcal{A})_1 \le (\operatorname{TriRank}(\mathcal{A}))^2 \le (\operatorname{TriRank}(\mathcal{X}))^2, \qquad (3.12)$$

$$\operatorname{TucRank}(\mathcal{X})_2 \le \operatorname{TucRank}(\mathcal{B})_2 \le (\operatorname{TriRank}(\mathcal{B}))^2 \le (\operatorname{TriRank}(\mathcal{X}))^2, \qquad (3.13)$$

and

$$\operatorname{TucRank}(\mathcal{X})_3 \le \operatorname{TucRank}(\mathcal{C})_3 \le (\operatorname{TriRank}(\mathcal{C}))^2 \le (\operatorname{TriRank}(\mathcal{X}))^2. \qquad (3.14)$$

Proof Let $r = \operatorname{TriRank}(\mathcal{X})$, $\rho_1 = \operatorname{TucRank}(\mathcal{A})_1$, $\rho_2 = \operatorname{TucRank}(\mathcal{B})_2$ and $\rho_3 = \operatorname{TucRank}(\mathcal{C})_3$. From a rank factorization of the mode-1 unfolding of $\mathcal{A}$, there exist $U \in \mathbb{R}^{n_1 \times \rho_1}$ and $\bar{\mathcal{A}} \in \mathbb{R}^{\rho_1 \times r \times r}$ such that $\mathcal{A} = \bar{\mathcal{A}} \times_1 U$.

Similarly, there exist $V \in \mathbb{R}^{n_2 \times \rho_2}$, $W \in \mathbb{R}^{n_3 \times \rho_3}$, $\bar{\mathcal{B}} \in \mathbb{R}^{r \times \rho_2 \times r}$ and $\bar{\mathcal{C}} \in \mathbb{R}^{r \times r \times \rho_3}$ such that

$$\mathcal{A} = \bar{\mathcal{A}} \times_1 U, \quad \mathcal{B} = \bar{\mathcal{B}} \times_2 V, \quad \mathcal{C} = \bar{\mathcal{C}} \times_3 W.$$

Hence $\mathcal{X} = \mathcal{A}\mathcal{B}\mathcal{C} = (\bar{\mathcal{A}}\bar{\mathcal{B}}\bar{\mathcal{C}}) \times_1 U \times_2 V \times_3 W$ according to (3.8). From the definition of the Tucker rank, we have the first inequalities of (3.12)-(3.14).

Assume that $\operatorname{TriRank}(\mathcal{A}) = \bar{r}$. Then there are tensors $\mathcal{A}' \in \mathbb{R}^{n_1 \times \bar{r} \times \bar{r}}$, $\mathcal{B}' \in \mathbb{R}^{\bar{r} \times r \times \bar{r}}$ and $\mathcal{C}' \in \mathbb{R}^{\bar{r} \times \bar{r} \times r}$ such that $\mathcal{A} = \mathcal{A}'\mathcal{B}'\mathcal{C}'$. Replacing $\mathcal{X}$ and $\mathcal{A}$ in the first inequality of (3.12) by $\mathcal{A}$ and $\mathcal{A}'$, we have $\operatorname{TucRank}(\mathcal{A})_1 \le \operatorname{TucRank}(\mathcal{A}')_1$. Note that $\mathcal{A}' \in \mathbb{R}^{n_1 \times \bar{r} \times \bar{r}}$. By the definition of the Tucker rank, $\operatorname{TucRank}(\mathcal{A}')_1$ is the matrix rank of an $n_1 \times \bar{r}^2$ matrix. Hence, $\operatorname{TucRank}(\mathcal{A})_1 \le \bar{r}^2 = (\operatorname{TriRank}(\mathcal{A}))^2$. This proves the second inequality of (3.12).

Since $\mathcal{A} \in \mathbb{R}^{n_1 \times r \times r}$ and $r = \operatorname{TriRank}(\mathcal{X})$, by (3.6), $\operatorname{TriRank}(\mathcal{A}) \le \operatorname{mid}\{n_1, r, r\} = r = \operatorname{TriRank}(\mathcal{X})$. Then the third inequality of (3.12) holds.

The second and third inequalities of (3.13) and (3.14) hold similarly. $\square$
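Proposition 3.6 can be observed numerically: the matrix ranks of the three unfoldings of a random triple product with parameter $r$ never exceed $r^2$. A short NumPy check (ours):

```python
import numpy as np

rng = np.random.default_rng(4)
n1, n2, n3, r = 5, 6, 7, 2
A = rng.standard_normal((n1, r, r))
B = rng.standard_normal((r, n2, r))
C = rng.standard_normal((r, r, n3))
X = np.einsum('iqs,pjs,pqt->ijt', A, B, C)   # triple product (3.5)

ranks = (np.linalg.matrix_rank(X.reshape(n1, -1)),
         np.linalg.matrix_rank(np.moveaxis(X, 1, 0).reshape(n2, -1)),
         np.linalg.matrix_rank(np.moveaxis(X, 2, 0).reshape(n3, -1)))
assert all(rk <= r * r for rk in ranks)   # TucRank(X)_k <= (TriRank(X))^2
```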

## 4 A Method for Checking The Triple Rank of a Third Order Tensor

In this section, we present an algorithm for checking the triple rank of a third order tensor and establish its convergence. Strictly speaking, our algorithm is not guaranteed to find the triple rank of a third order tensor exactly. Instead, it gives an upper bound on the relative error obtainable by approximating $\mathcal{X}$ with a third order tensor of triple rank not higher than a given integer $r$. This algorithm will be useful in the next section to verify that third order tensors from several practical datasets can be approximated by low triple rank tensors.

### 4.1 A Modified Alternating Least Squares Method

We are going to present a modified alternating least squares (MALS) algorithm for the triple decomposition of third order tensors in this subsection. Consider a given third order tensor $\mathcal{X} \in \mathbb{R}^{n_1 \times n_2 \times n_3}$ and a fixed positive integer $r \le \operatorname{mid}\{n_1, n_2, n_3\}$. The following cost function will be minimized:

$$f(\mathcal{A}, \mathcal{B}, \mathcal{C}) := \|\mathcal{X} - \mathcal{A}\mathcal{B}\mathcal{C}\|_F^2 = \sum_{i=1}^{n_1}\sum_{j=1}^{n_2}\sum_{t=1}^{n_3}\Big(x_{ijt} - \sum_{p=1}^{r}\sum_{q=1}^{r}\sum_{s=1}^{r} a_{iqs} b_{pjs} c_{pqt}\Big)^2, \qquad (4.15)$$

where $\mathcal{A} \in \mathbb{R}^{n_1 \times r \times r}$, $\mathcal{B} \in \mathbb{R}^{r \times n_2 \times r}$ and $\mathcal{C} \in \mathbb{R}^{r \times r \times n_3}$ are unknown. In this way, we will obtain a triple decomposition of triple rank not greater than $r$, to approximate $\mathcal{X}$.

MALS is an iterative approach starting from an initial point $(\mathcal{A}^0, \mathcal{B}^0, \mathcal{C}^0)$. We initialize $k = 0$ and perform the following steps until the iterative sequence converges.

Update $\mathcal{A}$. Fixing $\mathcal{B}^k$ and $\mathcal{C}^k$, we solve the subproblem

$$\operatorname*{arg\,min}_{\mathcal{A} \in \mathbb{R}^{n_1 \times r \times r}} \|\mathcal{A}\mathcal{B}^k\mathcal{C}^k - \mathcal{X}\|_F^2 + \lambda\|\mathcal{A} - \mathcal{A}^k\|_F^2,$$

where $\lambda \ge 0$ is a constant in this algorithm. If $\lambda = 0$, then this is the classical ALS algorithm. We take $\lambda > 0$. Hence, our method is a modified ALS algorithm. Let $X_{(1)} \in \mathbb{R}^{n_1 \times n_2 n_3}$ be the mode-1 unfolding of the tensor $\mathcal{X}$ and $A_{(1)} \in \mathbb{R}^{n_1 \times r^2}$ be the mode-1 unfolding of the tensor $\mathcal{A}$. By introducing a matrix $F^k \in \mathbb{R}^{r^2 \times n_2 n_3}$ with elements

$$F^k_{\ell m} = \sum_{p=1}^{r} b^k_{pjs} c^k_{pqt}, \quad \text{where } \ell = q + (s-1)r, \; m = j + (t-1)n_2, \qquad (4.16)$$

the $\mathcal{A}$-subproblem may be represented as

$$\operatorname*{arg\,min}_{A_{(1)} \in \mathbb{R}^{n_1 \times r^2}} \|A_{(1)}F^k - X_{(1)}\|_F^2 + \lambda\|A_{(1)} - A^k_{(1)}\|_F^2 = \left[X_{(1)}(F^k)^{\mathsf T} + \lambda A^k_{(1)}\right]\left[F^k(F^k)^{\mathsf T} + \lambda I_{r^2}\right]^{-1}. \qquad (4.17)$$

Then, we obtain $\mathcal{A}^{k+1}$ from $A^{k+1}_{(1)}$, which is the closed-form solution (4.17).
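The $\mathcal{A}$-update (4.16)-(4.17) can be sketched in NumPy as follows; the function names are ours, and the flattening order of $(q,s)$ and $(j,t)$ merely has to be used consistently on both sides (here NumPy's C order rather than the paper's column-major indexing):

```python
import numpy as np

def triple(A, B, C):
    # triple product (3.5): x_{ijt} = sum_{p,q,s} a_{iqs} b_{pjs} c_{pqt}
    return np.einsum('iqs,pjs,pqt->ijt', A, B, C)

def update_A(X, A_prev, B, C, lam):
    """One proximal-regularized least-squares update of A, cf. (4.16)-(4.17)."""
    n1, n2, n3 = X.shape
    r = B.shape[0]
    # Coefficient matrix F with entries sum_p b_{pjs} c_{pqt},
    # rows indexed by (q, s), columns by (j, t).
    F = np.einsum('pjs,pqt->qsjt', B, C).reshape(r * r, n2 * n3)
    X1 = X.reshape(n1, n2 * n3)        # mode-1 unfolding of X
    A1 = A_prev.reshape(n1, r * r)     # mode-1 unfolding of A^k
    A1_new = (X1 @ F.T + lam * A1) @ np.linalg.inv(F @ F.T + lam * np.eye(r * r))
    return A1_new.reshape(n1, r, r)

# If X = ABC exactly, A is a fixed point of its own update.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 2, 2))
B = rng.standard_normal((2, 5, 2))
C = rng.standard_normal((2, 2, 6))
X = triple(A, B, C)
assert np.allclose(update_A(X, A, B, C, lam=0.1), A)
```

The $\mathcal{B}$- and $\mathcal{C}$-updates below have the same structure with the roles of the modes permuted; by optimality of the closed-form solution, each update never increases the residual $\|\mathcal{A}\mathcal{B}\mathcal{C} - \mathcal{X}\|_F$.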

Update $\mathcal{B}$. Consider the following subproblem

$$\operatorname*{arg\,min}_{\mathcal{B} \in \mathbb{R}^{r \times n_2 \times r}} \|\mathcal{A}^{k+1}\mathcal{B}\mathcal{C}^k - \mathcal{X}\|_F^2 + \lambda\|\mathcal{B} - \mathcal{B}^k\|_F^2,$$

where $\mathcal{A}^{k+1}$ and $\mathcal{C}^k$ are known. Let $X_{(2)} \in \mathbb{R}^{n_2 \times n_1 n_3}$ and $B_{(2)} \in \mathbb{R}^{n_2 \times r^2}$ be the mode-2 unfoldings of the tensors $\mathcal{X}$ and $\mathcal{B}$, respectively. Define $G^k \in \mathbb{R}^{r^2 \times n_1 n_3}$ with entries

$$G^k_{\ell m} = \sum_{q=1}^{r} a^{k+1}_{iqs} c^k_{pqt}, \quad \text{where } \ell = p + (s-1)r, \; m = i + (t-1)n_1. \qquad (4.18)$$

Then, the $\mathcal{B}$-subproblem is rewritten as

$$\operatorname*{arg\,min}_{B_{(2)} \in \mathbb{R}^{n_2 \times r^2}} \|B_{(2)}G^k - X_{(2)}\|_F^2 + \lambda\|B_{(2)} - B^k_{(2)}\|_F^2 = \left[X_{(2)}(G^k)^{\mathsf T} + \lambda B^k_{(2)}\right]\left[G^k(G^k)^{\mathsf T} + \lambda I_{r^2}\right]^{-1}. \qquad (4.19)$$

Hence, $\mathcal{B}^{k+1}$ may be derived from $B^{k+1}_{(2)}$ defined by (4.19).

Update $\mathcal{C}$. Using $\mathcal{A}^{k+1}$ and $\mathcal{B}^{k+1}$ at hand, we minimize

$$\operatorname*{arg\,min}_{\mathcal{C} \in \mathbb{R}^{r \times r \times n_3}} \|\mathcal{A}^{k+1}\mathcal{B}^{k+1}\mathcal{C} - \mathcal{X}\|_F^2 + \lambda\|\mathcal{C} - \mathcal{C}^k\|_F^2.$$

Let $H^k \in \mathbb{R}^{r^2 \times n_1 n_2}$ be a matrix with entries

$$H^k_{\ell m} = \sum_{s=1}^{r} a^{k+1}_{iqs} b^{k+1}_{pjs}, \quad \text{where } \ell = p + (q-1)r, \; m = i + (j-1)n_1.$$