1 Introduction
Low-Rank Tensor Recovery (LRTR) [1, 2, 3], as a natural higher-order generalization of Compressed Sensing (CS) [4, 5, 6, 7] and Low-Rank Matrix Recovery (LRMR) [8, 9, 10], is being extensively applied in various fields of artificial intelligence, including computer vision [11], image processing [12], and machine learning [13]. LRTR aims at recovering a low-rank tensor $\mathcal{X}\in\mathbb{R}^{n_1\times n_2\times n_3}$ (for simplicity, this letter considers only third-order tensors; all results can be extended with minor modifications to higher-order tensors) from the linear noisy measurements $\mathbf{y} = \mathfrak{M}(\mathcal{X}) + \mathbf{z}$, where $\mathfrak{M}$ is a random linear map from $\mathbb{R}^{n_1\times n_2\times n_3}$ to $\mathbb{R}^m$ with $m \ll n_1n_2n_3$ and $\mathbf{z}\in\mathbb{R}^m$ is a vector of measurement errors with noise level $\|\mathbf{z}\|_2 \le \epsilon$. It is not easy to achieve this goal. On the one hand, the naive approach of solving the nonconvex program
$$\min_{\mathcal{X}\in\mathbb{R}^{n_1\times n_2\times n_3}} \operatorname{rank}_t(\mathcal{X}) \quad \text{s.t.} \quad \|\mathfrak{M}(\mathcal{X}) - \mathbf{y}\|_2 \le \epsilon \qquad (1)$$
is NP-hard in general, where the operation $\operatorname{rank}_t(\cdot)$ acts as a sparsity regularization of the tensor singular values of $\mathcal{X}$. On the other hand, some existing tensor ranks do not work well, such as the CP rank [14] and the Tucker rank [15]: calculating the CP rank of a tensor is usually NP-hard [16], and the convex surrogate of the Tucker rank, the Sum of Nuclear Norms (SNN) [17], is not the tightest convex relaxation. To avoid these defects, Lu et al. [18] first paid attention to the novel tensor tubal rank of $\mathcal{X}$ (see Definition 1), denoted $\operatorname{rank}_t(\mathcal{X})$, induced by the tensor-tensor product (t-product) [19] and the tensor Singular Value Decomposition (t-SVD) [20], and considered the following convex Tensor Nuclear Norm Minimization (TNNM) model
$$\min_{\mathcal{X}\in\mathbb{R}^{n_1\times n_2\times n_3}} \|\mathcal{X}\|_* \quad \text{s.t.} \quad \|\mathfrak{M}(\mathcal{X}) - \mathbf{y}\|_2 \le \epsilon, \qquad (2)$$
where $\|\mathcal{X}\|_*$ is referred to as the Tensor Nuclear Norm (TNN) (see Definition 1), which has been proved to be the convex envelope of the tensor average rank (the reference [18] indicates that the low average rank assumption is weaker than the low tubal rank assumption, i.e., a tensor with low tubal rank always has low average rank; its definition can be found in [18]) within the unit ball of the tensor spectral norm [18]. In order to facilitate the design of algorithms and to meet the needs of practical applications, in previous work [21], Zhang et al. first presented a theoretical analysis of the Regularized Tensor Nuclear Norm Minimization (RTNNM) model, which takes the form
$$\min_{\mathcal{X}\in\mathbb{R}^{n_1\times n_2\times n_3}} \frac{1}{2}\|\mathfrak{M}(\mathcal{X}) - \mathbf{y}\|_2^2 + \lambda\|\mathcal{X}\|_* \qquad (3)$$
with a positive parameter $\lambda$. In particular, the RTNNM model (3) is more commonly used than the constrained TNNM model (2) when the noise level $\epsilon$ is not given or cannot be accurately estimated. The tensor Restricted Isometry Property (t-RIP) was first defined based on the t-SVD in [21] as an analysis framework for LRTR via (3). For an integer $r$, the tensor restricted isometry constant $\delta_r$ of a linear map $\mathfrak{M}$ is defined as the smallest constant satisfying

$$(1-\delta_r)\|\mathcal{X}\|_F^2 \le \|\mathfrak{M}(\mathcal{X})\|_2^2 \le (1+\delta_r)\|\mathcal{X}\|_F^2 \qquad (4)$$

for all tensors $\mathcal{X}$ whose tubal rank is at most $r$. Moreover, Theorem 4.1 in [21] shows that if $\mathfrak{M}$ satisfies the t-RIP with a sufficiently small constant $\delta_{tr}$ for certain $t$, then the solution to (3) robustly recovers the low-tubal-rank tensor $\mathcal{X}$.
Note that Zhang et al. [21] have derived a deterministic condition for robust recovery via the RTNNM model (3) based on the t-RIP. Unfortunately, it is unknown how to construct a linear map that satisfies the t-RIP. The purpose of this paper is precisely to show their existence, under suitable conditions on the number of measurements in terms of the tubal rank and the size of the tensor, using probabilistic arguments. We consider the sub-Gaussian measurement ensemble, all of whose elements (tensors of size $n_1\times n_2\times n_3$) are drawn independently according to a sub-Gaussian distribution. This includes Gaussian, Bernoulli, and all bounded distributions. For such linear maps, the t-RIP holds with high probability in the stated parameter regime.
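As a concrete illustration of such a measurement ensemble (a minimal NumPy sketch; the function names and the $1/\sqrt{m}$ normalization are our assumptions for this example, not notation from the paper), each of the $m$ measurement tensors can be drawn i.i.d. from a Gaussian, symmetric Bernoulli, or bounded distribution, and the linear map acts by taking inner products with the unknown tensor:

```python
import numpy as np

def subgaussian_ensemble(m, shape, kind="gaussian", rng=None):
    """Draw m i.i.d. measurement tensors of the given shape, stored as rows.

    All three choices below are sub-Gaussian with mean 0; the 1/sqrt(m)
    scaling (an assumption for illustration) makes E||M(X)||_2^2 = ||X||_F^2.
    """
    rng = np.random.default_rng(rng)
    n = int(np.prod(shape))
    if kind == "gaussian":
        A = rng.standard_normal((m, n))
    elif kind == "bernoulli":          # symmetric +-1 entries
        A = rng.choice([-1.0, 1.0], size=(m, n))
    elif kind == "uniform":            # a bounded distribution with variance 1
        A = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(m, n))
    else:
        raise ValueError(kind)
    return A / np.sqrt(m)

def apply_map(A, X):
    """Linear map M(X): inner product of each measurement tensor with X."""
    return A @ X.ravel()

# Example: noisy measurements y = M(X) + z for a 10 x 10 x 5 tensor
X = np.random.default_rng(0).standard_normal((10, 10, 5))
A = subgaussian_ensemble(m=200, shape=X.shape, kind="bernoulli", rng=1)
y = apply_map(A, X) + 1e-3 * np.random.default_rng(2).standard_normal(200)
```

Each row of `A` plays the role of one (vectorized) measurement tensor, so the three distributions can be swapped without changing any downstream code.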
In 2018, Lu et al. [1] provided an exact recovery result based on the Gaussian width for the TNNM model (2). Specifically, they pointed out that an unknown tensor of size $n_1\times n_2\times n_3$ with tubal rank $r$ can be exactly recovered with high probability by solving (2) when the given number of Gaussian measurements is of the order $r(n_1+n_2-r)n_3$. In 2019, Wang et al. [22] presented a generalized tensor Dantzig selector for the low-tubal-rank tensor recovery problem with noisy measurements $\mathbf{y} = \mathfrak{M}(\mathcal{X}) + \mathbf{z}$, where $\mathbf{z}$ is the noise term. They showed that whenever the sample size is sufficiently large, the solution of the generalized tensor Dantzig selector satisfies a corresponding error bound with high probability. In the noiseless setting (i.e., $\mathbf{z} = \mathbf{0}$), their results degenerate to Lu's case. All the recovery results mentioned above are probabilistic. Some deterministic results involving a tensor RIP have also emerged in LRTR. In 2013, the first deterministic condition for tensors, a tensor RIP based on the Tucker decomposition [15] which can guarantee that a given linear map can be utilized for LRTR, was proposed by Shi et al. [23]. They showed that a tensor with low Tucker rank can be exactly recovered in the noiseless case if the linear map satisfies the tensor RIP with a sufficiently small constant. Such a tensor RIP is hardly practical because it depends on a rank tuple that differs greatly from the familiar definition of matrix rank, which means that some existing analysis tools and techniques cannot be used in the tensor case. What is more, which linear maps satisfy such a tensor RIP remained an open problem for them.
In previous work [21], Zhang et al. used the t-RIP to answer under what conditions a robust solution to model (3) can be obtained. In this paper, we continue that work and answer an essential question: which linear maps satisfy the t-RIP? Our main contributions are summarized as follows:

Using the arguments of covering numbers and chaos processes as well as concentration inequalities, we determine how many random measurements are sufficient for a linear map to satisfy the t-RIP with high probability.

We consider a large class of sub-Gaussian distributions, including Gaussian, Bernoulli, and all bounded distributions, which makes the conclusions of this paper more general.

In order to verify our conclusions, we carry out numerical experiments studying the variation of the successful recovery ratio as the number of measurements increases.
The remainder of the paper is organized as follows. In Section 2, we introduce some notations and definitions. In Section 3, the probabilistic tools needed for our proofs are given. In Section 4, our main results and their proofs are presented and discussed. Section 5 conducts some numerical experiments to support our analysis. The conclusion is given in Section 6.
2 Notations and preliminaries
For the sake of brevity, the main notations used later are listed in Table 1. For a third-order tensor $\mathcal{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$, let $\bar{\mathcal{A}}$ be the Discrete Fourier Transform (DFT) along the third dimension of $\mathcal{A}$, i.e., $\bar{\mathcal{A}} = \operatorname{fft}(\mathcal{A}, [\,], 3)$. Utilizing the inverse DFT, $\mathcal{A}$ can be calculated from $\bar{\mathcal{A}}$ by $\mathcal{A} = \operatorname{ifft}(\bar{\mathcal{A}}, [\,], 3)$. Let $\bar{A} = \operatorname{bdiag}(\bar{\mathcal{A}})$ be the block diagonal matrix with each block on the diagonal being a frontal slice $\bar{A}^{(i)}$ of $\bar{\mathcal{A}}$, and let $\operatorname{bcirc}(\mathcal{A})$ be the block circulant matrix, i.e.,
$$\operatorname{bcirc}(\mathcal{A}) = \begin{bmatrix} A^{(1)} & A^{(n_3)} & \cdots & A^{(2)} \\ A^{(2)} & A^{(1)} & \cdots & A^{(3)} \\ \vdots & \vdots & \ddots & \vdots \\ A^{(n_3)} & A^{(n_3-1)} & \cdots & A^{(1)} \end{bmatrix}.$$
The operator $\operatorname{unfold}$ and its inverse operator $\operatorname{fold}$ are, respectively, defined as
$$\operatorname{unfold}(\mathcal{A}) = \begin{bmatrix} A^{(1)} \\ A^{(2)} \\ \vdots \\ A^{(n_3)} \end{bmatrix}, \qquad \operatorname{fold}(\operatorname{unfold}(\mathcal{A})) = \mathcal{A}.$$
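These operators translate directly into code. The following NumPy sketch (the helper names `dft3`, `bdiag`, and `bcirc` are ours) computes the DFT along the third dimension and assembles the block diagonal and block circulant matrices; a standard fact worth checking numerically is that the block DFT diagonalizes the block circulant matrix.

```python
import numpy as np

def dft3(A):
    """DFT of a third-order tensor along its third dimension: bar(A) = fft(A, [], 3)."""
    return np.fft.fft(A, axis=2)

def bdiag(Abar):
    """Block diagonal matrix whose i-th diagonal block is the i-th frontal slice."""
    n1, n2, n3 = Abar.shape
    M = np.zeros((n1 * n3, n2 * n3), dtype=complex)
    for i in range(n3):
        M[i*n1:(i+1)*n1, i*n2:(i+1)*n2] = Abar[:, :, i]
    return M

def bcirc(A):
    """Block circulant matrix built from the frontal slices A^(1), ..., A^(n3)."""
    n1, n2, n3 = A.shape
    M = np.zeros((n1 * n3, n2 * n3))
    for i in range(n3):           # block row
        for j in range(n3):       # block column holds slice (i - j) mod n3
            M[i*n1:(i+1)*n1, j*n2:(j+1)*n2] = A[:, :, (i - j) % n3]
    return M

A = np.random.default_rng(0).standard_normal((3, 4, 5))
assert np.allclose(np.fft.ifft(dft3(A), axis=2).real, A)   # inverse DFT recovers A
```

Conjugating `bcirc(A)` by the (Kronecker-lifted) DFT matrix reproduces `bdiag(dft3(A))`, which is exactly why t-product computations can be carried out slice-wise in the Fourier domain.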
The tensor transpose [20] of $\mathcal{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$, denoted $\mathcal{A}^{\top}\in\mathbb{R}^{n_2\times n_1\times n_3}$, is obtained by transposing each of the frontal slices and then reversing the order of the transposed frontal slices 2 through $n_3$. The identity tensor $\mathcal{I}\in\mathbb{R}^{n\times n\times n_3}$ [20] is the tensor whose first frontal slice is the $n\times n$ identity matrix and whose other frontal slices are all zeros. For tensors $\mathcal{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$ and $\mathcal{B}\in\mathbb{R}^{n_2\times l\times n_3}$, the tensor-tensor product (t-product) [20], $\mathcal{A} * \mathcal{B} = \operatorname{fold}(\operatorname{bcirc}(\mathcal{A})\cdot\operatorname{unfold}(\mathcal{B}))$, is defined to be a tensor of size $n_1\times l\times n_3$. An orthogonal tensor [20] is a tensor $\mathcal{Q}$ which satisfies $\mathcal{Q}^{\top} * \mathcal{Q} = \mathcal{Q} * \mathcal{Q}^{\top} = \mathcal{I}$. A tensor is called F-diagonal [20] if each of its frontal slices is a diagonal matrix.
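A minimal sketch of these operations (our function names), using the standard fact that the t-product reduces to slice-wise matrix products in the Fourier domain:

```python
import numpy as np

def tprod(A, B):
    """t-product A * B of A (n1 x n2 x n3) and B (n2 x l x n3).

    Computed in the Fourier domain: bar(C)^(i) = bar(A)^(i) @ bar(B)^(i).
    """
    n1, n2, n3 = A.shape
    assert B.shape[0] == n2 and B.shape[2] == n3
    Abar = np.fft.fft(A, axis=2)
    Bbar = np.fft.fft(B, axis=2)
    Cbar = np.einsum('ijk,jlk->ilk', Abar, Bbar)   # batched slice products
    return np.fft.ifft(Cbar, axis=2).real

def ttranspose(A):
    """Tensor transpose: transpose each frontal slice, reverse slices 2..n3."""
    At = np.transpose(A, (1, 0, 2))
    return np.concatenate([At[:, :, :1], At[:, :, :0:-1]], axis=2)

def identity_tensor(n, n3):
    """First frontal slice is I_n, the remaining slices are zero."""
    I = np.zeros((n, n, n3))
    I[:, :, 0] = np.eye(n)
    return I
```

A quick sanity check is that the identity tensor is neutral for the t-product and that $(\mathcal{A} * \mathcal{B})^{\top} = \mathcal{B}^{\top} * \mathcal{A}^{\top}$, mirroring the matrix identities.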
Format | Description
$\mathcal{A}$ | A tensor.
$A$ | A matrix.
$\mathbf{a}$ | A vector.
$a$ | A scalar.
$\mathbb{S}$ | A set.
$\mathbb{D}$ | A subset of $\mathbb{R}^{n_1\times n_2\times n_3}$.
$\mathcal{N}_{\epsilon}$ | An $\epsilon$-net of $\mathbb{D}$.
$\mathcal{I}$ | The identity tensor.
$\mathcal{A}_{ijk}$ or $a_{ijk}$ | The $(i,j,k)$th entry of $\mathcal{A}$.
$A^{(i)}$ or $\mathcal{A}(:,:,i)$ | The $i$th frontal slice of $\mathcal{A}$.
$\mathcal{A}(:,j,:)$ | The $j$th lateral slice of $\mathcal{A}$.
$\mathcal{A}(i,j,:)$ | The $(i,j)$th tube fiber of $\mathcal{A}$.
$\mathcal{A}^{\top}$ | The transpose of $\mathcal{A}$.
$\bar{\mathcal{A}}$ | The DFT of $\mathcal{A}$ along the third dimension.
$\bar{A}$ | $\operatorname{bdiag}(\bar{\mathcal{A}})$.
With the above notations, we first introduce basic concepts of tensor algebra which will be used later.

[t-SVD [20]] Let $\mathcal{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$; then the t-SVD factorization of the tensor $\mathcal{A}$ is
$$\mathcal{A} = \mathcal{U} * \mathcal{S} * \mathcal{V}^{\top},$$
where $\mathcal{U}\in\mathbb{R}^{n_1\times n_1\times n_3}$ and $\mathcal{V}\in\mathbb{R}^{n_2\times n_2\times n_3}$ are orthogonal and $\mathcal{S}\in\mathbb{R}^{n_1\times n_2\times n_3}$ is an F-diagonal tensor. Figure 1 illustrates the t-SVD factorization.

[Tensor tubal rank [19]] For $\mathcal{A}\in\mathbb{R}^{n_1\times n_2\times n_3}$, the tensor tubal rank, denoted $\operatorname{rank}_t(\mathcal{A})$, is defined as the number of nonzero singular tubes of $\mathcal{S}$, where $\mathcal{S}$ is from the t-SVD of $\mathcal{A} = \mathcal{U} * \mathcal{S} * \mathcal{V}^{\top}$. We can write
$$\operatorname{rank}_t(\mathcal{A}) = \#\bigl\{i : \mathcal{S}(i,i,:) \neq \mathbf{0}\bigr\}.$$
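The t-SVD, and hence the tubal rank, can be computed via ordinary SVDs of the frontal slices in the Fourier domain. The sketch below (our code, with a numerical tolerance we chose for illustration) extracts the singular tubes and counts the nonzero ones:

```python
import numpy as np

def tsvd_singular_tubes(A):
    """Return a k x n3 array whose j-th row is the j-th singular tube S(j, j, :),
    obtained from slice-wise SVDs in the Fourier domain (k = min(n1, n2))."""
    n1, n2, n3 = A.shape
    Abar = np.fft.fft(A, axis=2)
    k = min(n1, n2)
    Sbar = np.zeros((k, n3), dtype=complex)
    for i in range(n3):
        Sbar[:, i] = np.linalg.svd(Abar[:, :, i], compute_uv=False)
    return np.fft.ifft(Sbar, axis=1).real

def tubal_rank(A, tol=1e-10):
    """Number of singular tubes that are not (numerically) zero."""
    tubes = tsvd_singular_tubes(A)
    return int(np.sum(np.linalg.norm(tubes, axis=1) > tol))

# A tensor built as a t-product with inner dimension r has tubal rank <= r
# (equal to r for generic random factors).
rng = np.random.default_rng(0)
n1, n2, n3, r = 8, 9, 4, 3
Ubar = np.fft.fft(rng.standard_normal((n1, r, n3)), axis=2)
Vbar = np.fft.fft(rng.standard_normal((r, n2, n3)), axis=2)
X = np.fft.ifft(np.einsum('ijk,jlk->ilk', Ubar, Vbar), axis=2).real
```

This factor construction is also a convenient way to sample random low-tubal-rank tensors in experiments.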
3 Probabilistic tools
This paper aims to answer which linear maps $\mathfrak{M}$ satisfy the t-RIP. We will analyze this question from a more general perspective by considering the class of sub-Gaussian distributions. To this end, we first introduce some probabilistic tools that will be required for our results.

[Sub-Gaussian random variables [24]] A random variable $X$ is called sub-Gaussian if there exists a number $\theta > 0$ such that the inequality
$$\mathbb{E}\exp(\lambda X) \le \exp\!\left(\frac{\lambda^2\theta^2}{2}\right)$$
holds for all $\lambda \in \mathbb{R}$, and we denote that $X$ satisfies the above formula by $X \sim \operatorname{Sub}(\theta^2)$.

Sub-Gaussian distributions form a wide class of distributions, as the class contains the Gaussian, the Bernoulli, and all bounded distributions. For example, if $X$ is a Gaussian random variable with zero mean and variance $\sigma^2$, then $X$ is also a sub-Gaussian random variable, i.e., $X \sim \operatorname{Sub}(\sigma^2)$. Therefore, we require that the distribution of all elements (tensors of size $n_1\times n_2\times n_3$) of the measurement ensemble is sub-Gaussian. Next we provide some instrumental theoretical tools for the analysis of our main results, which include $\epsilon$-nets, covering numbers, the $\gamma_2$ functional, and concentration inequalities.

[$\epsilon$-net [25]] For a metric space $(T, d)$ and $\epsilon > 0$, if each element in $T$ is within distance $\epsilon$ of some element of a subset $E \subseteq T$, i.e.,
$$\forall\, t \in T,\ \exists\, e \in E:\ d(t, e) \le \epsilon,$$
then the subset $E$ is referred to as an $\epsilon$-net of $T$, denoted $\mathcal{N}_{\epsilon}$. Throughout the article, we consider $T \subseteq \mathbb{R}^{n_1\times n_2\times n_3}$ and take $d$ to be the Euclidean (Frobenius) distance, i.e., $d(\mathcal{X}, \mathcal{Y}) = \|\mathcal{X} - \mathcal{Y}\|_F$.
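As a toy illustration of the $\epsilon$-net definition (our construction, for the unit circle in $\mathbb{R}^2$): a uniform angular grid is an $\epsilon$-net because chord distance is bounded by arc distance.

```python
import numpy as np

def circle_eps_net(eps):
    """An eps-net of the unit circle S^1 under the Euclidean metric.

    K points with angular spacing 2*pi/K leave every circle point within
    arc distance pi/K of the net; chord distance <= arc distance, so
    K >= pi/eps suffices.
    """
    K = int(np.ceil(np.pi / eps))
    angles = 2 * np.pi * np.arange(K) / K
    return np.stack([np.cos(angles), np.sin(angles)], axis=1)

def covering_radius(net, samples):
    """Max over samples of the distance to the nearest net point."""
    d = np.linalg.norm(samples[:, None, :] - net[None, :, :], axis=2)
    return d.min(axis=1).max()

eps = 0.1
net = circle_eps_net(eps)                       # 32 points for eps = 0.1
theta = np.linspace(0, 2 * np.pi, 5000, endpoint=False)
pts = np.stack([np.cos(theta), np.sin(theta)], axis=1)
```

Note that the net size here, 32 points, is far below the generic bound $(3/\epsilon)^n = 900$ from inequality (5) with $n = 2$, as expected for a one-dimensional subset of the plane.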
[Covering numbers [25]] Let $T$ be a subset of a metric space $(X, d)$. For $\epsilon > 0$, the covering number $N(T, d, \epsilon)$ of $T$ is defined as the smallest possible cardinality of an $\epsilon$-net of $T$.

[Covering numbers and volume [25]] If $T$ is a subset of the metric space $(\mathbb{R}^n, \|\cdot\|_2)$, then for $\epsilon > 0$, we have
$$N(T, \|\cdot\|_2, \epsilon) \le \frac{\operatorname{vol}\bigl(T + \tfrac{\epsilon}{2}B\bigr)}{\operatorname{vol}\bigl(\tfrac{\epsilon}{2}B\bigr)},$$
where $\operatorname{vol}(\cdot)$ is the volume in $\mathbb{R}^n$ and $B$ is the Euclidean ball with radius $1$.

Note that when $T$ is a unit Euclidean ball in $n$ dimensions (or the surface of the unit Euclidean ball), $T + \tfrac{\epsilon}{2}B$ is contained in the ball $\bigl(1 + \tfrac{\epsilon}{2}\bigr)B$. If we assume that $\epsilon \le 1$, then we have the following crucial inequality,
$$N(T, \|\cdot\|_2, \epsilon) \le \left(1 + \frac{2}{\epsilon}\right)^n \le \left(\frac{3}{\epsilon}\right)^n, \qquad (5)$$
which will be employed repeatedly.
It is useful to observe that the tensor restricted isometry constant can be expressed as a random variable of the form
$$\delta_r = \sup_{A \in \mathcal{M}}\Bigl|\,\|A\boldsymbol{\xi}\|_2^2 - \mathbb{E}\|A\boldsymbol{\xi}\|_2^2\,\Bigr|,$$
where $\mathcal{M}$ is a set of matrices and $\boldsymbol{\xi}$ is a sub-Gaussian vector. In order to obtain deviation bounds for random variables of this form in terms of a complexity parameter of the set of matrices $\mathcal{M}$, we need to introduce this complexity parameter, i.e., Talagrand's $\gamma_2$ functional.

[$\gamma_2$ functional [28, 27, 26]] Given a metric space $(T, d)$, a collection of subsets of $T$, $\{T_s : s \ge 0\}$, is referred to as an admissible sequence if $|T_0| = 1$ and $|T_s| \le 2^{2^s}$ for every $s \ge 1$; then the $\gamma_2$ functional is defined by
$$\gamma_2(T, d) = \inf \sup_{t \in T} \sum_{s=0}^{\infty} 2^{s/2}\, d(t, T_s),$$
where the infimum is taken with regard to all admissible sequences of $T$ and $d(t, T_s) = \inf_{u\in T_s} d(t, u)$.
In this paper, we mainly focus on the $\gamma_2$ functional of a set of matrices endowed with the operator norm. The proof of our results requires the use of covering numbers to bound the $\gamma_2$ functional. In order to do this, we will utilize the Schatten spaces. The detailed definition is as follows:
$$\|A\|_{S_p} = \Bigl(\sum_i \sigma_i(A)^p\Bigr)^{1/p}, \quad 1 \le p \le \infty,$$
is defined as the Schatten-$p$ norm of a given matrix $A$, and
$$d_{S_p}(\mathcal{M}) = \sup_{A\in\mathcal{M}} \|A\|_{S_p}$$
is defined as the $S_p$ radius of any set of matrices $\mathcal{M}$. Especially, $\|A\|_{S_2} = \|A\|_F$ (so we write $d_{S_2} = d_F$), and $\|A\|_{S_\infty}$ is the operator norm. With these notions, for a set $\mathcal{M}$ with covering numbers $N(\mathcal{M}, \|\cdot\|_{S_\infty}, u)$, by exploiting the Dudley-type integral, we have the following inequality for the $\gamma_2$ functional:
$$\gamma_2(\mathcal{M}, \|\cdot\|_{S_\infty}) \le c \int_0^{d_{S_\infty}(\mathcal{M})} \sqrt{\log N(\mathcal{M}, \|\cdot\|_{S_\infty}, u)}\, du, \qquad (6)$$
where $c$ is a universal constant.
In CS, the following concentration inequality, which involves the $\gamma_2$ functional, is often adopted to estimate the deviation bound of the restricted isometry constant. We will also make use of this important result.

[[26]] Suppose that $\boldsymbol{\xi}$ is a random vector whose entries $\xi_i$ are independent sub-Gaussian random variables with mean $0$ and variance $1$. Let $\mathcal{M}$ be a set of matrices, and let
$$E = \gamma_2(\mathcal{M}, \|\cdot\|_{S_\infty})\bigl(\gamma_2(\mathcal{M}, \|\cdot\|_{S_\infty}) + d_F(\mathcal{M})\bigr) + d_F(\mathcal{M})\, d_{S_\infty}(\mathcal{M}),$$
$$V = d_{S_\infty}(\mathcal{M})\bigl(\gamma_2(\mathcal{M}, \|\cdot\|_{S_\infty}) + d_F(\mathcal{M})\bigr), \qquad U = d_{S_\infty}^2(\mathcal{M}).$$
Then, there exist constants $c_1, c_2 > 0$, depending only on the sub-Gaussian parameter, such that for all $t > 0$,
$$\mathbb{P}\left(\sup_{A\in\mathcal{M}} \Bigl|\,\|A\boldsymbol{\xi}\|_2^2 - \mathbb{E}\|A\boldsymbol{\xi}\|_2^2\,\Bigr| \ge c_1 E + t\right) \le 2\exp\left(-c_2 \min\left\{\frac{t^2}{V^2}, \frac{t}{U}\right\}\right).$$
4 Main results
In this section, we will show that the t-RIP (4) holds with high probability for certain linear maps drawn from a large class of random distributions, provided the required number of measurements is met. We first compute the covering number of the set of tensors whose tubal rank is at most $r$ and whose Frobenius norm equals $1$.

[Covering number for low-tubal-rank tensors] For the set
$$\mathbb{D}_r = \bigl\{\mathcal{X}\in\mathbb{R}^{n_1\times n_2\times n_3} : \operatorname{rank}_t(\mathcal{X}) \le r,\ \|\mathcal{X}\|_F = 1\bigr\},$$
there exists an $\epsilon$-net $\mathcal{N}_{\epsilon}$ with regard to the Frobenius norm obeying
$$|\mathcal{N}_{\epsilon}| \le \left(\frac{9}{\epsilon}\right)^{r(n_1+n_2+1)n_3}. \qquad (7)$$
Proof.
Here we take the proof strategy of Lemma 3.1 in [8] and modify it to accommodate the t-SVD. For any $\mathcal{X}\in\mathbb{D}_r$, we have the skinny t-SVD
$$\mathcal{X} = \mathcal{U} * \mathcal{S} * \mathcal{V}^{\top},$$
where $\mathcal{U}\in\mathbb{R}^{n_1\times r\times n_3}$ and $\mathcal{V}\in\mathbb{R}^{n_2\times r\times n_3}$ are two orthogonal tensors and $\mathcal{S}\in\mathbb{R}^{r\times r\times n_3}$ is an F-diagonal tensor. Since the Frobenius norm is invariant under orthogonal multiplication, we have $\|\mathcal{S}\|_F = \|\mathcal{X}\|_F = 1$.

We first construct nets for the sets of admissible $\mathcal{U}$, $\mathcal{S}$, and $\mathcal{V}$ respectively, and then combine them to cover $\mathbb{D}_r$.

Let $\mathbb{S}$ be the set of F-diagonal tensors $\mathcal{S}\in\mathbb{R}^{r\times r\times n_3}$ with $\|\mathcal{S}\|_F = 1$ whose first frontal slice has nonnegative and nonincreasing diagonal entries. According to Lemma 3 and (5), there exists an $\epsilon/3$-net $\mathbb{S}_{\epsilon/3}$ for $\mathbb{S}$ with $|\mathbb{S}_{\epsilon/3}| \le (9/\epsilon)^{rn_3}$. Next, let $\mathbb{U} = \{\mathcal{U}\in\mathbb{R}^{n_1\times r\times n_3} : \mathcal{U}^{\top} * \mathcal{U} = \mathcal{I}\}$ and use $\vec{\mathcal{U}}_j$ to denote the $j$th lateral slice of $\mathcal{U}$. Definition 3.6 in [19] shows that $\mathcal{U}$ is an orthogonal tensor if and only if its lateral slices form an orthonormal set, so that in particular $\|\vec{\mathcal{U}}_j\|_F = 1$. Therefore, it is not difficult to see that $\mathbb{U}$ is a subset of the unit ball under the norm
$$\|\mathcal{U}\|_{1,2} = \max_{j} \|\vec{\mathcal{U}}_j\|_F.$$
Hence, due to (5), there is an $\epsilon/3$-net $\mathbb{U}_{\epsilon/3}$ for $\mathbb{U}$ satisfying $|\mathbb{U}_{\epsilon/3}| \le (9/\epsilon)^{rn_1n_3}$, and analogously an $\epsilon/3$-net $\mathbb{V}_{\epsilon/3}$ for the corresponding set $\mathbb{V}$ with $|\mathbb{V}_{\epsilon/3}| \le (9/\epsilon)^{rn_2n_3}$. Then we can construct the net
$$\mathcal{N}_{\epsilon} = \bigl\{\bar{\mathcal{U}} * \bar{\mathcal{S}} * \bar{\mathcal{V}}^{\top} : \bar{\mathcal{U}}\in\mathbb{U}_{\epsilon/3},\ \bar{\mathcal{S}}\in\mathbb{S}_{\epsilon/3},\ \bar{\mathcal{V}}\in\mathbb{V}_{\epsilon/3}\bigr\}$$
such that the covering number of the corresponding set satisfies
$$|\mathcal{N}_{\epsilon}| \le |\mathbb{U}_{\epsilon/3}|\cdot|\mathbb{S}_{\epsilon/3}|\cdot|\mathbb{V}_{\epsilon/3}| \le \left(\frac{9}{\epsilon}\right)^{r(n_1+n_2+1)n_3}.$$
The rest of the work is to prove that $\mathcal{N}_{\epsilon}$ is an $\epsilon$-net for the set $\mathbb{D}_r$. In other words, we need to prove that for any $\mathcal{X}\in\mathbb{D}_r$, there exists $\bar{\mathcal{X}}\in\mathcal{N}_{\epsilon}$ with $\|\mathcal{X} - \bar{\mathcal{X}}\|_F \le \epsilon$.

Next, let $\bar{\mathcal{X}} = \bar{\mathcal{U}} * \bar{\mathcal{S}} * \bar{\mathcal{V}}^{\top}$ with $\|\mathcal{U} - \bar{\mathcal{U}}\|_{1,2} \le \epsilon/3$, $\|\mathcal{S} - \bar{\mathcal{S}}\|_F \le \epsilon/3$, and $\|\mathcal{V} - \bar{\mathcal{V}}\|_{1,2} \le \epsilon/3$; then we have
$$\|\mathcal{X} - \bar{\mathcal{X}}\|_F \le \|(\mathcal{U} - \bar{\mathcal{U}}) * \mathcal{S} * \mathcal{V}^{\top}\|_F + \|\bar{\mathcal{U}} * (\mathcal{S} - \bar{\mathcal{S}}) * \mathcal{V}^{\top}\|_F + \|\bar{\mathcal{U}} * \bar{\mathcal{S}} * (\mathcal{V} - \bar{\mathcal{V}})^{\top}\|_F,$$
where the first inequality uses the triangle inequality. Since the Frobenius norm has the property of being invariant under orthogonal multiplication and $\bar{\mathcal{U}}$, $\mathcal{V}$ are two orthogonal tensors, we thus obtain
$$\|\bar{\mathcal{U}} * (\mathcal{S} - \bar{\mathcal{S}}) * \mathcal{V}^{\top}\|_F = \|\mathcal{S} - \bar{\mathcal{S}}\|_F \le \frac{\epsilon}{3}$$
and
$$\|(\mathcal{U} - \bar{\mathcal{U}}) * \mathcal{S} * \mathcal{V}^{\top}\|_F = \|(\mathcal{U} - \bar{\mathcal{U}}) * \mathcal{S}\|_F \le \|\mathcal{U} - \bar{\mathcal{U}}\|_{1,2}\,\|\mathcal{S}\|_F \le \frac{\epsilon}{3}.$$
Similarly, we find that $\|\bar{\mathcal{U}} * \bar{\mathcal{S}} * (\mathcal{V} - \bar{\mathcal{V}})^{\top}\|_F \le \epsilon/3$. Thus, we conclude that $\|\mathcal{X} - \bar{\mathcal{X}}\|_F \le \epsilon$. This completes the proof. ∎
An important consequence of Lemma 4 is that the volumetric bound (7) controls the covering numbers of the collection of low-tubal-rank tensors of interest, and it plays a key role in the proof of Theorem 4. Besides, note that the proof of Lemma 4 is based on the t-product and t-SVD, whose definitions are consistent with the matrix case. Benefiting from the good properties of the t-product and t-SVD, the bound (7) reduces to the corresponding low-rank matrix result [8] when $n_3 = 1$.
We are now in a position to state our main result. Fix $0 < \delta < 1$ and let $\mathcal{X}\in\mathbb{R}^{n_1\times n_2\times n_3}$ be any given third-order tensor whose tubal rank is at most $r$. Then a random draw of a sub-Gaussian measurement ensemble $\mathfrak{M}: \mathbb{R}^{n_1\times n_2\times n_3}\to\mathbb{R}^m$ satisfies $\delta_r < \delta$ with probability at least $1 - \varepsilon$ provided that
$$m \ge C\,\delta^{-2}\max\bigl\{r(n_1+n_2+1)n_3,\ \log(\varepsilon^{-1})\bigr\},$$
where the constant $C$ only depends on the sub-Gaussian parameter.
Proof.
Given a tensor $\mathcal{X}\in\mathbb{D}_r$ and a measurement ensemble $\mathfrak{M}$, we can construct a matrix $V_{\mathcal{X}}$ of size $m \times mn_1n_2n_3$ as follows:
$$V_{\mathcal{X}} = \frac{1}{\sqrt{m}}\, I_m \otimes \operatorname{vec}(\mathcal{X})^{\top},$$
with $\operatorname{vec}(\mathcal{X})$ being the vectorized version of the tensor $\mathcal{X}$, and utilize an $mn_1n_2n_3$-dimensional sub-Gaussian random vector $\boldsymbol{\xi}$ whose entries have mean 0 and variance 1 to obtain the measurements, that is,
$$\mathfrak{M}(\mathcal{X}) = V_{\mathcal{X}}\,\boldsymbol{\xi}.$$
Recall that the tensor restricted isometry constant can then be expressed as
$$\delta_r = \sup_{V_{\mathcal{X}}\in\mathcal{M}} \Bigl|\,\|V_{\mathcal{X}}\boldsymbol{\xi}\|_2^2 - \mathbb{E}\|V_{\mathcal{X}}\boldsymbol{\xi}\|_2^2\,\Bigr|,$$
where $\mathcal{M} = \{V_{\mathcal{X}} : \mathcal{X}\in\mathbb{D}_r\}$. In order to apply Lemma 3 to estimate a probabilistic bound for this expression, we take $\mathcal{M}$ as the set of matrices in Lemma 3. It remains to compute the radii $d_F(\mathcal{M})$ and $d_{S_\infty}(\mathcal{M})$ of the set $\mathcal{M}$ and the complexity parameter, Talagrand's $\gamma_2$ functional. Clearly, $d_F(\mathcal{M}) = 1$ on account of $\|V_{\mathcal{X}}\|_F = \|\mathcal{X}\|_F = 1$ for all $\mathcal{X}\in\mathbb{D}_r$. In addition, based on the fact that the operator norm of a block-diagonal matrix is the maximum of the operator norms of the diagonal blocks and that the operator norm of a row vector is its $\ell_2$ norm, we see that
$$\|V_{\mathcal{X}}\|_{S_\infty} = \frac{1}{\sqrt{m}}\|\operatorname{vec}(\mathcal{X})\|_2 = \frac{1}{\sqrt{m}}\|\mathcal{X}\|_F.$$
Thus, we have $d_{S_\infty}(\mathcal{M}) = 1/\sqrt{m}$. And because of
$$\|V_{\mathcal{X}} - V_{\mathcal{X}'}\|_{S_\infty} = \frac{1}{\sqrt{m}}\|\mathcal{X} - \mathcal{X}'\|_F$$
for all $\mathcal{X}, \mathcal{X}'\in\mathbb{D}_r$, we obtain
$$N(\mathcal{M}, \|\cdot\|_{S_\infty}, u) = N\!\bigl(\mathbb{D}_r, \|\cdot\|_F, \sqrt{m}\,u\bigr)$$
for all $u > 0$. This implies that the covering numbers of $\mathcal{M}$ are controlled by those of $\mathbb{D}_r$. Furthermore, by exploiting the Dudley-type integral (6) and the bound (7), and abbreviating $K = r(n_1+n_2+1)n_3$, we obtain the bound on the $\gamma_2$ functional
$$\gamma_2(\mathcal{M}, \|\cdot\|_{S_\infty}) \le c\int_0^{1/\sqrt{m}} \sqrt{\log N(\mathcal{M}, \|\cdot\|_{S_\infty}, u)}\,du \le c'\sqrt{\frac{K}{m}},$$
where $c, c'$ are universal constants. Let us now compute the quantities $E$, $V$, and $U$ in Lemma 3. This gives
$$E \le c'\sqrt{\frac{K}{m}}\left(c'\sqrt{\frac{K}{m}} + 1\right) + \frac{1}{\sqrt{m}}, \qquad V \le \frac{1}{\sqrt{m}}\left(c'\sqrt{\frac{K}{m}} + 1\right), \qquad U = \frac{1}{m}.$$
By applying Lemma 3 and letting $t = \delta/2$, we conclude that if
$$m \ge C\,\delta^{-2}\max\bigl\{K,\ \log(\varepsilon^{-1})\bigr\},$$
then $c_1E \le \delta/2$ and $\mathbb{P}(\delta_r \ge \delta) \le \varepsilon$ hold. This completes the proof. ∎
Theorem 4 tells us that a random sub-Gaussian measurement ensemble obeys (4). We know that sub-Gaussian distributions form a large class of random distributions, including Gaussian, Bernoulli, and all bounded distributions. Thus, in some sense, Theorem 4 completely characterizes the behavior of numerous random measurement ensembles in terms of the t-RIP. Note that an $n_1\times n_2\times n_3$ tensor with tubal rank $r$ has at most $r(n_1+n_2-r)n_3$ degrees of freedom. So the required number of measurements is very reasonable and nearly optimal compared with the degrees of freedom. It is worth mentioning that there exists a similar conclusion (refer to Theorem 2 in [2]) motivated by some special tensor decompositions. However, our Theorem 4 improves on the result in [2] by a factor of $N$ (where $N$ denotes the order of the tensor) and implies that one only needs a constant number of measurements per degree of freedom of the underlying rank-$r$ tensor in order to obtain the t-RIP at rank $r$. In addition, if $n_3 = 1$, the third-order tensor reduces to a second-order tensor, i.e., a matrix. Accordingly, the tensor tubal rank reduces to the matrix rank, and the t-RIP reduces to Definition 2.1 in [8]. Thus the required number of measurements for random sub-Gaussian measurement ensembles in Theorem 4 covers the results of Theorem 2.3 in [8] for LRMR.
The following is a trivial corollary but an important special case of Theorem 4. Let $\mathfrak{M}$ be a Gaussian or Bernoulli measurement ensemble. Then there exists a universal constant $C > 0$ such that the tensor restricted isometry constant of $\mathfrak{M}$ satisfies $\delta_r < \delta$ with probability at least $1 - \varepsilon$ provided that
$$m \ge C\,\delta^{-2}\max\bigl\{r(n_1+n_2+1)n_3,\ \log(\varepsilon^{-1})\bigr\}.$$
In CS and LRMR, Gaussian or Bernoulli random matrices are often used as universal measurement matrices (ensembles) because they satisfy the vector RIP [5] with high probability. Accordingly, Corollary 4 guarantees that the Gaussian or Bernoulli measurement ensemble can also be used for LRTR. The proof of Corollary 4 is trivial and is omitted here.

5 Numerical experiments
In CS, it has been proved that directly verifying the vector RIP [5] for a specific random matrix is NP-hard [29]. Similarly, it seems very complex to check whether a given instance of a random measurement ensemble obeys the t-RIP. Therefore, in this section we conduct several numerical experiments to corroborate our main results indirectly.
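One quick indirect check (a minimal sketch with our own parameter choices, not the paper's experimental protocol) is to sample random low-tubal-rank tensors and inspect how tightly the ratio $\|\mathfrak{M}(\mathcal{X})\|_2^2 / \|\mathcal{X}\|_F^2$ concentrates around 1, which is exactly what a small t-RIP constant demands:

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, n3, r, m, trials = 10, 10, 4, 2, 500, 50

# Gaussian measurement ensemble, normalized so E||M(X)||^2 = ||X||_F^2
A = rng.standard_normal((m, n1 * n2 * n3)) / np.sqrt(m)

ratios = []
for _ in range(trials):
    # random tensor with tubal rank <= r, built as a t-product of factors
    Ubar = np.fft.fft(rng.standard_normal((n1, r, n3)), axis=2)
    Vbar = np.fft.fft(rng.standard_normal((r, n2, n3)), axis=2)
    X = np.fft.ifft(np.einsum('ijk,jlk->ilk', Ubar, Vbar), axis=2).real
    x = X.ravel()
    ratios.append(np.linalg.norm(A @ x) ** 2 / np.linalg.norm(x) ** 2)

ratios = np.array(ratios)
delta_hat = np.max(np.abs(ratios - 1.0))   # empirical isometry deviation
```

A small `delta_hat` over many trials is consistent with (though of course does not prove) the t-RIP holding for this draw of the ensemble.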
We present numerical results for the recovery of third-order tensors with different problem setups, i.e., different tensor sizes $n_1\times n_2\times n_3$, tubal ranks $r$, measurement ensembles, and sampling rates $m/(n_1n_2n_3)$. We perform $\mathbf{y} = A\operatorname{vec}(\mathcal{X}) + \mathbf{z}$ to get the linear noisy measurements, where $\operatorname{vec}(\mathcal{X})$ is a long vector obtained by stacking the columns of $\mathcal{X}$. In all experiments, $\mathbf{z}$ is Gaussian white noise with mean $0$ and variance $\sigma^2$. We consider two sizes of $\mathcal{X}$ and different tubal ranks, referred to as cases (a) and (b). $A$ is an $m\times n_1n_2n_3$ measurement matrix with i.i.d. zero-mean Gaussian entries having variance $1/m$, or i.i.d. symmetric Bernoulli entries $\pm 1/\sqrt{m}$. Then the RTNNM model (3) can be reformulated as
$$\min_{\mathcal{X}} \frac{1}{2}\bigl\|A\operatorname{vec}(\mathcal{X}) - \mathbf{y}\bigr\|_2^2 + \lambda\|\mathcal{X}\|_*. \qquad (8)$$
We adopt the effective Algorithm 1 in [21] to solve (8). In line with the experimental results in [21], the regularization parameter $\lambda$ is set as recommended there. We deem the recovered tensor $\hat{\mathcal{X}}$ a successful reconstruction of the original tensor $\mathcal{X}$ from the measurements if the relative error $\|\hat{\mathcal{X}} - \mathcal{X}\|_F / \|\mathcal{X}\|_F$ falls below a fixed tolerance.
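The success criterion can be coded directly. In this sketch the tolerance value is our assumption for illustration, and the recovered tensors are taken as given (produced by whatever solver is used for (8)):

```python
import numpy as np

TOL = 1e-2  # assumed relative-error threshold for a "successful" reconstruction

def relative_error(X_hat, X):
    """Frobenius-norm relative error between a reconstruction and the truth."""
    return np.linalg.norm(X_hat - X) / np.linalg.norm(X)

def success_rate(recovered, originals, tol=TOL):
    """Fraction of trials whose relative error falls below tol."""
    errs = [relative_error(Xh, X) for Xh, X in zip(recovered, originals)]
    return float(np.mean(np.array(errs) < tol))

# toy usage: one perfect and one poor reconstruction
X = np.ones((4, 4, 2))
rate = success_rate([X.copy(), 0.5 * X], [X, X])   # relative errors 0.0 and 0.5
```

Sweeping the sampling rate and plotting `success_rate` over repeated trials is the bookkeeping behind success-versus-measurements curves of the kind reported below.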
Figure 2 and Figure 3 show the success rate of recovery over repeated trials versus the sampling rate for the random Gaussian measurement ensemble and the random Bernoulli measurement ensemble, respectively. The minimum sampling rate required by theory (i.e., the minimum required number of measurements) for successful recovery is indicated by the vertical lines. All of the cases consistently show that the unknown tensor of size $n_1\times n_2\times n_3$ with tubal rank $r$ can be successfully recovered by solving (3) once the number of measurements exceeds the theoretical threshold. This conclusion, combined with Theorem 4.1 in [21], indirectly verifies our main result. However, from Figure 2 and Figure 3, it is not difficult to find that there is a small gap between the number of measurements required by theory and that required in the experiments. This gap is to be expected, since factors in the experiments such as the choice of algorithm and the parameter settings may contribute to it.