Internet network traffic data can often be arranged in the form of multidimensional arrays. For example, a traffic matrix is often applied to track the volume of traffic between origin-destination (OD) pairs in a network 
. Estimating the end-to-end traffic data in a network is an essential part of many network design and traffic engineering tasks, including capacity planning, load balancing , network provisioning 
, path setup and anomaly detection, and failure recovery . Unfortunately, direct and precise end-to-end flow traffic measurement is very difficult or even infeasible in the traditional IP network. Missing data is unavoidable. Since many traffic engineering tasks require the complete traffic volume information or are highly sensitive to the missing data, the accurate reconstruction of missing values from partial traffic measurements becomes a key problem.
To infer the missing data in the internet network, many research works have been developed, for instance, see [3, 31] and the references therein. Using optimization technology to recover incomplete internet traffic data is a very important way, which are somewhat different from the existing methods for dealing with general data recovery [9, 10, 13, 35, 51], including Compressive Sensing (CS) , Singular Value Thresholding (SVT) algorithm  and Low-rank Matrix Fitting (LMaFit) algorithm , etc. Since most of the known approaches for missing internet network data are designed based on purely spatial or purely temporal information [26, 43, 52], sometime their data recovery performance is low. To capture more spatial-temporal information in the traffic data, various modified matrix based models and corresponding optimization algorithms were presented to recover missing traffic data, for example, see [4, 20, 34, 41]. The first spatio-temporal optimization model of traffic matrices (TMs) was established in 
, which is designed based on low-rank approximation combined with the spatio-temporal operation and local interpolation. The proposed optimization model in can be formulated as
where is a linear operator, the matrix contains the measurements, and and are the spatial and temporal constraint matrices, respectively, which express the known knowledge about the spatio-temporal structure of the traffic matrix (e.g., temporally nearby its elements have similar values). Based on the model (1.1), the authors applied an alternating least squares procedure to solve it. Numerical experiments show that the proposed method in  has better performance. Since then, several other matrix recovery optimization models and algorithms [14, 17, 20, 27, 29, 48, 57] have been proposed to recover the missing data from partial traffic or network latency measurements. Although these approaches enjoy good performance when the data missing ratio is low, their performance suffers when the missing ratio is large, especially in the extreme case when the traffic data on several time intervals are all lost 
. In fact, such matrix based methods may destroy the intrinsic tensor structure of high-dimensional data and also increases the computational cost of data recovery.
In order to improve the recovery performance of the matrix-based methods mentioned above, several tensor optimization methods have been applied to recover missing traffic data, for example, [2, 15, 18]. The core of the tensor methods lies in the tensor decomposition, which commonly takes two forms: CANDECOMP/PARAFAC (CP) decomposition [11, 21] and Tucker decomposition . For a network with a location set , let cardinality . As a straightforward way of modeling , traffic tensor may be formed with a third order tensor , corresponding respectively to the origin, destination and the total number of time intervals to consider. Traffic data are typically measured over some time intervals, and the value reported is an average. Therefore, the element in is used to represent the traffic from origin to destination averaged over the time duration , where denotes the measurement interval, and and . After analyzing the spatial-temporal features in the traffic data, Zhou et al  applied the tensor completion method to recover the traffic data from partial measurements and loss. With the help of tensor CP decomposition, an optimization model with spatial-temporal constraints was proposed, whose form can be expressed as
where is a shorthand notation of CP decomposition of third tensors, and are the spatial constraint matrices and is the temporal constraint matrix. However, such a third order tensor model cannot fully exploit the traffic periodicity in the traffic data, so the recovery accuracy is not very high. To fully exploit the traffic features of periodicity pattern, Xie et al  modeled the considered traffic data as another type of third order tensor , where corresponds to OD pairs, and there are days to consider with each day having time intervals. It is obvious that and . In that paper, Xie et al further proposed two sequential tensor completion algorithms to recovery internet traffic data. A public traffic trace, Abilene trace data , was used as an example to illustrate the model established in , the related traffic data was modeled as a tensor in . Since its sizes of three dimensions are much more balanced, the authors found that this model is better than the one in  for missing data recovery. However, in the case where the balance of the above three dimensions is not satisfied (e.g., is much bigger than , and and remain unchanged), the recovery performance of the corresponding models may be greatly affected.
Existing methods for missing internet network data, whether they are matrix or tensor based methods, sometimes, even when the sampling rate is not too low, the relative error rate of data recovery is still relatively high. Therefore, although various studies have been made to recover missing internet traffic data, how to choose an appropriate tensor to represent traffic data, and how to establish a related optimization model and design an efficient algorithm to solve the established model, are still main challenges in the areas of network management and internet traffic data analysis. On the other hand, tensor t-product methods introduced by Kilmer et al [23, 24] is a powerful tool for decomposing third-order tensors, and has been well applied in image and video inpainting, for example, see [51, 56]. However, due to the appearance of spatio-temporal features that must be considered in internet data recovery, the models and algorithms used in [51, 56] cannot be directly applied to internet data recovery.
In this paper, we present a new third-order tensor optimization method to recover missing internet traffic data. We first use a third order tensor to represent the traffic data, where the means of , and are stated above, i.e., represents the numbers of the observing time intervals in each day, is the number of days considered, and where two correspond respectively to the origin and destination. Based upon the third-order tensor mentioned above, we use t-product of tensors as a useful tool to establish a tensor completion model for internet traffic data recovery. The temporal stability and periodicity characteristics of the considered original traffic data are used to improve the established model, and the resulting optimization model has a good separation structure, which is conducive to the design of parallel algorithms and improves efficiency. While tensor recovery methods based upon t-product decomposition are already very successful in the fields of image, video data recovery, etc. [51, 56], what we propose in this paper is the first spatio-temporal tensor t-product method for recovering internet traffic data.
The paper is organized as follows. We present the notation and preliminaries of tensors in Section 2, which will be used throughout the paper. In Section 3, we model the traffic data as a third order tensor and formulate the traffic data recovery problem as a low-rank tensor completion model. After an equivalence transformation, a separable optimization model is presented. Furthermore, a modified version of the block coordinate gradient method for missing internet network traffic data is proposed, and the convergence of this algorithm to a stationary point is analyzed in Section 4. In Section 5, we conduct extensive simulation experiments to evaluate the performance of the proposed algorithm. Our experiments are performed on two real-world traffic datasets, the first is the Abilene traffic dataset, and the second is the GÉANT traffic dataset. The simulation results demonstrate that our model and algorithm can achieve significantly better performance compared with tensor and matrix completion algorithms in the literature, even when the data missing ratio is high. Conclusions and future work are discussed in Section 6.
2 Notation and preliminaries
In this section, we present the notation and some basic preliminaries related to the tensor, tensor t-product and real value functions of complex variables, which will be used in this paper.
In this paper, the fields of real numbers and complex numbers are denoted as and , respectively. The identity matrix is denoted by . For an matrix , a subset of and a subset of , we use the notation for the sub-matrix obtained by deleting all rows and all columns , and use the notation for the sub-matrix obtained by deleting all rows (all columns ). And the spectral norm for a given matrix is denoted by . In what follows, the nomenclatures and the notations in 
on tensors are partially adopted. A tensor is a multidimensional array, and the order of a tensor is the number of dimensions, also called way or mode. For example, a vector is a first-order tensor, and a matrix is a second-order tensor. For the sake of brevity, in general, tensors of orderare denoted by Euler script letters , matrices by capital letters , vectors by bold-case lowercase letters , and scalars by lowercase letters . More specifically, a real (complex) tensor of order is represented by , and its -th element is represented by . For any two tensors , the inner product between and is denoted as where is the conjugate of for and , and the Frobenius norm associated with the above inner product is .
For a given -th order tensor , it has modes, namely, mode-, mode-, , mode-. For , denote the mode- matricization (or unfolding) of tensor to be , then the -th entry of tensor is mapped to the -th entry of matrix , where
The corresponding inverse operator is denoted as “fold”, i.e., .
In the third order tensor case, we use the terms horizontal, lateral, and frontal slices to specify which two indices are held constant. For a given , using Matlab notation, corresponds to the -th horizontal slice for , corresponds to the -th lateral slice for , and corresponds the -th frontal slice for . More compactly, is used to represent . It is easy to verify that and for .
2.2 t-product and t-SVD of tensors
For a given third order tensor with frontal slices, we define the block circulant matrix as
Notice that, in this paper, we will always assume the block circulant matrix is created from the frontal slices, and thus there should be no ambiguity with the following notation.
We anchor the “bvec” command to the frontal slices of the tensor, that is, takes an tensor and returns a block matrix and the corresponding inverse operation is defined as . Moreover, the block diagonalization operation and its inverse operation are defined as and , respectively.
(, Definition 2.5) (t-product) Let and be two real tensors. Then the t-product is an real tensor defined by
where “” means standard matrix product.
Write with . It is clear that
for any integers , where is the conjugate complex number of . Denote
which is called the normalized discrete Fourier transform (DFT) matrix, and denote the conjugate transpose of by . Just as circulant matrices can be diagonalized by the DFT , block-circulant matrices can be block diagonalized, i.e.,
where “” denotes the Kronecker product, and with being
It should be noticed that, for any , which can be block diagonalized as (2.4), most of the matrices may be complex, even when is symmetric, and they satisfy the relationships:
On the other hand, it is easy to see that any tensors , which satisfy the above relationships, can lead to real tensors by the inverse operation of (2.5). This fact can be seen from the following formula
The identity tensor of size is a tensor whose first frontal slice is a identity matrix, and all other frontal slices are zero matrices. The conjugate transpose of a tensor is a tensor in , denoted by , whose each frontal slices are conjugate transposed and then the order of frontal slices are reversed. A f-diagonal tensor is a tensor whose frontal slices are all diagonal matrices. A third order tensor is orthogonal, if .
 (t-SVD) A tensor can be factored as where and are orthogonal tensors , and is a f-diagonal tensor.
For any , its tubal rank is defined as the number of nonzero singular tubes of , i.e., , where is from the t-SVD of , and for .
 For any tensors , and , the following properties hold.
(1) If , then can be written into a tensor product form , where and are two tensors of smaller sizes and they meet ;
2.3 Real value functions of complex variable
For , denote
where and are the real and imaginary part of , respectively, for . We consider a real-valued function defined by , where . The following theorem can be found in .
Let be a function of a complex vector and its conjugate vector , and let be analytic with respect to each variable ( and ) independently. Let be the function of the real variables and such that . Then the partial derivative (treating as a constant in ) gives the same result (on substituting for ) as . Similarly, is equivalent to . Moreover, either of the conditions or is necessary and sufficient to determine a stationary point of .
3 Tensor model of internet traffic data completion and its reformulation
Let be a given incomplete tensor in , and be the index set of known entries of , i.e., its entries are given for while are missing for . Here, , and are corresponding to node , and of the collected internet network traffic data in Section 1, respectively. We model the traffic data in an internet network by a tensor . Based on the incomplete tensor , in order to recover internet network traffic data (i.e., ), we consider the following low rank tensor optimization problem
where denotes the tensor tubal rank of , and is the linear operator to extract known elements in the subset and fills the elements that are not in with zero values.
The target of this paper is to estimate those elements with zero values as accurately as possible. To solve this problem, a tensor factorization method was proposed to optimize this problem by factorizing tensor into two smaller tensors and , where is the beforehand estimated tubal rank of and is usually much smaller than . According to Proposition 2.4, we further approximate the model (3.8) by introducing an intermediate variable as follows
where is an regularization parameter, which allows a tunable tradeoff between fitting error and achieving tensor low-rank.
3.2 Improvement with temporal stability
In real-world network, most of traffic data often have considerably large difference in start sampling time and end sampling time, but every successive time intervals the sampling data have pretty small difference. That is, traffic usually change slowly over time, which exhibit temporal stability feature in time dimension. We use matrices and to express our knowledge about the traffic temporal properties, in which, is used to capture the stability of traffic values at two adjacent time slots, and is used to express the periodicity of traffic data, i.e., the similarity in internet visiting behaviors at the same time of different days, such as the similar traffic mode in working hours and sleeping hours.
We first consider for setting . The temporal constraint matrix captures temporal stability feature of the traffic tensor, i.e., the traffic data is similar at adjacent time slots. Based on time dimension (mode-) unfolding matrix or , a simple choice for the temporal constraint matrix is of the size , which denotes the Toeplitz matrix with central diagonal given by ones, and the first upper diagonal given by negative ones, i.e.,
By using the temporal constraint matrix mentioned above, we can check that
which means that by minimizing or , we seek an approximation that also has the property of having similar temporally adjacent values, i.e, the fact that traffic tensor at adjacent points in time of same day are often similar. Similarly, we use a simple Toeplitz matrix
to capture the periodicity of traffic data. Therefore, by minimizing , we may approximate the temporal stability feature of (i.e., ) and are expected to improve recovery accuracy.
Based on the argument above, the model (3.9) can be improved into the following optimization model
where and are two appropriately chosen penalty parameters.
3.3 Reformulation and simplification of model
In this subsection, we continue to consider how to transform the model (3.11) into an equivalent form that is conducive to effective algorithm design. For simplicity, let us write . It is obvious that
Since , which is equivalent to , i.e., for every , it holds that
Similarly, we have
Now we consider . Since , it holds that
On the other hand, by (2.5), we have
Due to the special structure of equivalent reformulation (3.20) of the objective function in (3.11), our algorithm, which will be described in the next section, is closely related to the following two type of optimization sub-problems.
where , and
where and .
Notice that the models (3.21) and (3.22) have both very good separable structure. It is clear that the optimal solution set of (3.11) is nonempty, which is denoted by , and the corresponding optimal value is denoted by , since the objective function value is bounded below with zero. Moreover, we see that is a optimal solution of (3.11), if and only if that is the optimal solution of (3.21) with