1 Introduction
The radiance of a real scene is distributed across a wide range of spectral bands. A hyperspectral image (HSI) consists of various intensities that represent the integrals of the radiance captured by sensors over hundreds of discrete bands. As compared with traditional image systems, HSI facilitates delivering more faithful representation for real scenes, and thus tends to be better performed on various computer vision tasks, such as classification
Zhao2016High Dong2016HSI , compressed sensing Zhang2015HSI ; Mart2015HYCA , and mineral exploration W2013S ; Goetz2009 .In real cases, however, an HSI is always significantly corrupted by noises that are generally conducted by sensor sensitivity, photon effects, light condition and/or calibration error Goetz2009 . The HSI denoising problem is thus a critical issue and the resolving of the problem could greatly ameliorate the performance of the subsequent HSI processing tasks. Besides, the data need to be transmitted to ground station to deal with, but an HSI practically collected from an airborne sensor or a satellite is always with hundreds of image bands, which makes the HSI requiring a large storage space. This will conduct severe issues of low efficiency and high cost on transmitting them to the ground stations. Thus it is necessary to design effective techniques on HSI compressed sensing to satisfy the ontime transmit.
One of the most widely utilized priors for HSI denoising/compressed sensing is the local smoothness prior. The spatial local smoothness prior refers to the fact that similar objects/scenes (with shape) are often adjacently distributed with similar spectrum wave, and the spectral smoothness prior delivers the fact that adjacent bands of images of an HSI are usually collected with similar sensor parameter settings, and thus with similar values. Such local smoothness prior structure possessed by an HSI can be equivalently understood as the sparsity of the gradient maps calculated along both the spatial and spectral modes of the HSI, as shown in Fig. 1
. The local smoothness prior of HSI space and spectrum can then be naturally encoded as the total variances along different modes of the HSI
( are the sizes of the spatial height, width and spectrum of the HSI, respectively), that is, the norm on the gradient maps (), where , and represent the gradient maps of calculated along its spatial height, width and spectrum modes, respectively^{1}^{1}1The definitions for will be provided in detail in Section 3.. This term is called 3Dtotal variation regularization, or 3DTV briefly, with the form:(1) 
It has been extensively shown that this term can be very helpful for various HSI processing issues He2016 ; Wang2017Hyperspectral ; Hes2016 . Some even attain stateoftheart performance for certain tasks.
Although being successfully utilized in various tasks, the 3DTV regularization still has not sufficiently considered the insightful sparsity structure knowledge underlying the gradient maps of an HSI. The most typical limitation reflects on its similar sparsity consideration (i.e., identical and independently distributed Laplacian prior distribution set on gradient maps) imposed on all bands of the gradient maps in spectrum, spatial width and height. That is, the sparsity extents on all bands of these gradient maps are implicitly assumed to be similar and unrelated. This, however, is always deviated from the real cases. On one hand, in different bands of the gradient maps, instead of identical, the sparsity exist evident variations, which can be easily observed from Fig. 1(c). On the other hand, instead of independent, there are evident correlations across different bands of the gradient maps of an HSI, as clearly shown in Fig. 1(e), which are naturally succeeded from the band correlation property of the original HSI ZHY2014HSI ; XY2016Hyperspectral . Such a prior knowledge deviation from the real HSIs makes the capability of such a useful regularization term still has a large room to be further strengthened.
To alleviate this issue, this paper proposes an enhanced 3DTV regularization term. While similar to the conventional 3DTV, the proposed term is also operated on gradient maps of height, width and spectrum mode of HSI, the new term applies the sparsity measure (
norm, just like the conventional 3DTV) on their subspace basis maps along bands instead of the gradient maps themselves. Each basis is obtained by a linear combination of the original gradient map vectors along bands:
, where is the transformation matrix, with size , facilitating to find the () bases of . Such new regularization term rationally reflects practical correlated and different sparsity extents across different bands of the gradient maps. Specifically, representing the gradients by such calculated less bases delivers the sparsity correlation insight of an HSI, and their different coefficients (which will be introduced in detail in Section 4) represents the difference among bands of these gradient maps. Besides, just like other subspace learning methods, the gradient bases always contain much less bands as compared with original gradient maps, making them less negatively influenced by embedded corruptions, and more robust for different HSI processing tasks in practical scenarios.In summary, this paper makes mainly threefold contributions as follows:

We propose an enhanced 3DTV (E3DTV, briefly) regularization term, to better reflect the sparsity insight of the gradient maps underlying an HSI. The noni.i.d. sparse distribution prior characteristic of gradient maps along all HSI bands is more faithfully represented by the new regularization, and thus it is expected to replace the 3DTV term of a general HSI processing model to improve its performance.

By easily using the proposed E3DTV as the unique regularization term, we formulate two models for two typical HSI processing tasks: HSI denoising and compressed sensing. The ADMM algorithms are readily designed to solve the corresponding models. The closedform updating equations are deduced for each step of both algorithms, and they can thus be implemented efficiently.

Comprehensive experimental results substantiate the superiority of the proposed method beyond stateoftheart methods previously designed on both tasks. It is verified that such easy substitution of E3DTV to 3DTV can always bring significant performance improvements on most of experiments.
The remainder of this paper is organized as follows. Section 2 reviews related work. Section 3 introduces some notations and preliminaries related to this work. Section 4 presents the new regularization term, as well as the rationality of its formulation. Section 5 gives the HSI denoising model by involving E3DTV regularization term, and Section 6 demonstrates the corresponding experimental results. Section 7 provides the model for HSI compressed sensing using the E3DTV as its regularization, and Section 8 reports the related experimental results. Conclusions are finally made in Section 9. Throughout the paper, we denote scalars, vectors, matrices and tensors as the nonbold, bold lower case, bold upper case and upper cursive letters, respectively.
2 Related Work
In the following, we will review the most relevant studies on two investigated HSI processing tasks, HSI denoising and HSI compressed sensing, respectively.
2.1 HSI Denoising
A number of methods such as KSVD Elad2006Image , nonlocal means Buades2005A , BM3D Dabov2007Image and wavelet shrinkage Othman2006Noise have been developed for 2D image denoising. These methods can be directly applied to HSI denoising by denoising each band of image independently. However, such an easy manner ignores the correlations among the spectral bands or spatial pixels, and thus usually could not provide satisfactory results.
More substantial progress on HSI denoising have been made by directly constructing models on HSIs, especially considering their spectral information. Most of them considered to construct a rational prior to deliver intrinsic structures underlying HSI. One representative prior is global spectral correlation. This prior indicates that the images located at different HSI bands are generally closely correlated. A generally adopted trick to formulate such a prior is first to unfold the HSI along its spectral mode to form a HSI matrix, and then use lowrank regularization terms to encode this prior Chen2011Denoising , Gu2014Weighted , He2015Hyperspectral , Peng2014Reweighted , XY2016Hyperspectral , ZHY2014HSI . Such a lowrank structure can be formulated implicitly into an HSI denoising model by a nuclear norm (e.g., the NNM method Candes2009Robust , Wright2009Robust ) or its nonconvex variations, like the weighted nuclear norm (e.g., the WNNM method Gu2014Weighted , Peng2014Reweighted ) and the Schatten norm (e.g., the WSNM method XY2016Hyperspectral ). Besides, it can also be directly encoded as an explicit low rank matrix factorization form, like that used in the LRMR method ZHY2014HSI ; Zhou2011GoDec . Very recently, to more faithfully deliver the multifactor affiliation underlying HSI, some HSI denoising methods Lu2017Tensor , Wang2017Hyperspectral treated an HSI directly as its original 3mode tensor form, and use certain lowrank tensor approximation techniques Kolda2009Tensor , e.g., the Tucker and CP decompositions, to characterize such lowrank characteristic under tensor expressions.
The spatial nonlocal similarity prior is another widely utilized prior for the task. A practical HSI always contains a collection of similar local patches all over the space, composing of homologous aggregation of microstructures. Such a nonlocal spatial similarity prior among HSI patches is also commonly used in the HSI denoising model. Specifically, by extracting the common patterns among these nonlocally similar patches, the spatial noise is expected to be prominently alleviated. There are mainly two manners to encode such prior knowledge. One is by dictionary learning, which aims to build each local patch similar group of an HSI by the linear combination of a small number of atoms from a dictionary Mairal2009 , Liu2013A , Qian2013Hyperspectral , Xing2011Dictionary . It has been empirically verified that such modeling can lead to a good performance of HSI denoising. Typical methods designed in this way include BM4D Maggioni2013Nonlocal and TDL Peng2014Decomposable . Another rational way to encode such nonlocal similarity prior is to specify a lowrank regularization term on these similar groups, and use lowrank matrix/tensor approximation techniques to realize HSI recovery Dong2015Denoising , Mairal2009 , xie2016multispectral .
Besides, spatial localsmoothness prior is also a powerful prior utilized for HSI denoising. The gray values of an HSI collected from natural scenes are generally continuous along its spatial and spectral modes. Such prior structure can be readily characterized by the well known total variation (TV) regularization term Rudin1992Nonlinear , Chambolle2004An . Multiple HSI denoising methods Bouali2011Toward , He2016 , Wu2017Structure , Wang2017Hyperspectral , Jiang2016Hyperspectral have been raised by utilizing such a regularization, and the utilization of such prior has been verified to be helpful for the fine denoising performance. Besides, due to the convexity and conciseness of such a regularization term, it is easy to integrate this term with other prior terms into a unified model to enhance the capability of the model on HSI recovery. A typical example is the LRTV method He2016 , which combines this smoothness prior with the global correlation prior along spectrum, and obtained superior performance compared with other TVbased HSI denoising approaches.
2.2 HSI Compressed Sensing
HSI compressed sensing aims to possibly precisely reconstruct an HSI from a small set of compressed measurements imposed on it Candes2006Robust ; Donoho2006Compressed , which is a typical inverse problem, and needs to make it calculable by setting regularization/prior terms on the toberecovered HSI variables.
Based on the prior knowledge that an HSI can be sparsely represented under an appropriate redundant dictionary, Duarte and Baraniuk Duarte2010Kronecker proposed a compressed sensing method based on Kronecker product. They considered to use Kronecker products from sparse basis and sensing matrix. For a multidimensional signal, its section is defined to be the part where all dimension indicators other than the th dimension are fixed, than a sparse base of a multidimensional signal may consist of Kronecker products of sparse bases for each section.
Similar to the HSI denoising method, the global spectral correlation and spatial smoothness priors have also been considered for the task. In 2011, Waters et al.Waters2011SpaRCS proposed a low rank and sparse matrix recovery method under compressed sensing, and applied the method to video and HSI compression recovery. Golbabaee et al. Golbabaee2012Hyperspectral further discussed the correlation of HSIs in the spectral dimension, and shows certain joint sparseness in the spatial dimension under a wavelet representation and proposed a method based on low rank and joint sparse matrix recovery. In 2013, Golbabaee et al. Golbabaee2013Joint proposed the joint norm and TV norm minimization model to encode the correlation of the spectral bands and local smoothness prior. Gogna et al. Gogna2015Split designed the splitBreggman algorithm with low rank and sparse or joint sparse signal recovery, combined with the Kronecker product method, and applied it to the compression recovery of HSI images.
Besides, spectral correlation priors have also been used for the task by directly setting the HSI as a tensor and constructing tensor processing models Yang2015Compressive , Wang2017Compressive for the problem in the recent years. For example, Yang et al. Yang2015Compressive used a tensor sparse representation of a HSI cube, under nonlinear compressed operator, which got some comparable results. To further ameliorate the performance especially under lower sampling rates, Wang et al. proposed the JTenDTV Wang2017Compressive model by jointing the Tucker decomposition and 3DTV norm to character the correlation of spectral bands as well as the spatial local smoothness prior, and got the comparable result.
3 Notations and Preliminaries
For a given HSI , where , and denote the sizes of spatial height, width and number of spectrum in the HSI, respectively. We denote the unfolding matrix of along the spectral mode as , which satisfies and .
Let denote the intensity at the voxel , and let
(2) 
denote three difference operations at the voxel along the spatial height, width and the spectrum, respectively. We can now introduce the difference operations calculated three different modes on an HSI , i.e., , as follows:
(3) 
where , are the gradientmap tensors corresponds to spatial height, weight and spectrum modes respectively^{2}^{2}2
Here, we use zero padding for
before applying difference operation on it, which can keep the size of the same as and make calculation convenient in the later sections.. Then, we denote the unfolding matrices of these gradient maps as:(4) 
One can also see Fig. 1 (b) (c) and (d) for easy understanding these operations.
It is easy to find that the difference operations and on is equivalent to applying subtractions between rows in , and on is equivalent to applying subtractions between columns in . This means all the three different operations are linear^{3}^{3}3Subtractions between rows of a matrix is equivalent to premultiply the matrix by a proper matrix, and subtractions between columns of a matrix is equivalent to postmultiply the matrix by a proper matrix.. We can then denote following three linear operations:
(5) 
Note that this linear operations encode the relationship between an HSI and its gradient maps s, facilitating us to construct HSI processing method with priors on the gradient maps.
The most commonly used sparsity measure on the gradients maps, s, is the so call norm sparse measure:
(6) 
By adopting this sparsity measure, the widely used smoothness regularization term for HSI, 3DTV Wang2017Hyperspectral , Wang2017Compressive , can be constructed by
(7) 
One can also see Fig. 1 for easy understanding this term. We can then present the enhanced 3DTV term beyond this conventional one in the next section.
4 Enhanced 3DTV Regularization
We first discuss how to construct an enhanced sparsity term for each , which will help us construct the E3DTV norm in the following sections.
4.1 Enhanced TV Regularization Term
For an given HSI whose unfolding matrix is , we seek to model the correlated sparse structure along its gradient map, i.e., our goal is to construct a rational sparsity measure for each gradient map , .
As studied in many previous works, the unfolded HSI can be rationally assumed to be of low rank Gu2014Weighted , He2015Hyperspectral , Peng2014Reweighted , XY2016Hyperspectral , ZHY2014HSI , He2016 . Then, it is easily to find that also possesses lowrankness property^{4}^{4}4 Since is a linear operation, we have . . This means can be expressed in following lowrank matrix factorization form:
(8) 
where and , () is the rank of the HSI. In addition, it easy to find the columns in is in fact a set of basis to the gradient maps . We can also rationally assume to make the subspace bases contain possibly compensate while nonrepetitive information underlying the original . In this case, we can rewrite equation (8) in following formulation:
(9) 
where is now represented as a linear combination of all columns of gradient maps. Furthermore, to guarantee the bases keep sufficient information of the original , we further assume , i.e., , is of approximately similar capacity with the original .
Fig. 2 shows the bases calculated on a clean HSI and it noisy observation. It is easy to observe from Fig. 2 (b) and (c) that by adding a little simulated Gaussion noise to the HSI, the quality of gradient maps, as well as its sparsity structures, can be badly damaged, which makes the corrupted images not easily to be restored by only using the original 3DTV regularization term (7) directly imposed on it. However, we can easily observe form Fig. 2 (d) and (e) that is evidently less corrupted than original gradient maps, and its sparse structure is evidently more evident and less hampered by noises. It is thus expected to be easier to recover the bases from the corrupted ones by E3DTV than to do this task on the original gradient maps by conventional 3DTV.
Thus, instead to design sparse regularization term for itself, we seek to design sparse regularization for , i.e. for . Based on the aforementioned analysis, the E3DTV measure for each , , is of the following implicit expression formulation:
(10) 
Here, can be seen as a coefficient matrix that relates different gradient maps together.
It should be noted that the proposed E3DTV measure is actually a linear transformed sparsity on
, with the transforming matrix to be determined by the input data automatically. Compared with the traditional sparsity measure (6), the proposed (40) takes the correlation prior among gradient maps into consideration when constructing sparse regularization term. It takes the advance that there exist some lowrank basis for s, which is sparse but can represent all information in with a coefficient matrix , and these bases are always more stable to be calculated than the gradient maps themselves when the HSI is corrupted.Since it is not easy to solve (40) directly, we reformulate (40) as the following equivalent formulation:
(11) 
Here, we call two problems equivalent if from a solution of one, a solution of the other is readily found, and vice versa. The proof of the equivalence here is presented in theorem 9.1 in the supplementary material.
4.2 Enhanced 3DTV Regularization Term
By adopting the proposed sparsity measure (41), we can now construct E3DTV regularization term imposed on gradient maps calculated along all its spectral, spatial width and height modes. In similar way as 3DTV regularization (7), we model our regularization term by summing the sparsity measure of the gradient maps along different modes:
(12) 
One can also see Fig. 1 for easy understanding this term. Moreover, this term is equivalent to:
(13) 
In the following sections, we will verify that the proposed E3DTV regularization term (13) can be easily embedded into HSI processing models and the corresponding resolving algorithms can be easily designed.
5 E3DTV methods for HSI denoising
5.1 The E3DTV based HSI denoising model
The denoising task is to separate the clean data and noise from the noisy data. In many of the denoising tasks, the noise is often assumed to be Gaussian, and a quadratic loss function between the clean data and noisy data is thus often exploited. However, in real hyperspectral scenes, the types of noise are various, such as sparse noise, stripes noise, dead lines, missing pixels, and so on
XY2016Hyperspectral ; He2016, which obviously do not obey a Gaussian distribution. Therefore, we choose the
norm, a more robust loss function to general heavy noises Wright2009Robust ; Candes2009Robust , to model the noise and construct our denoising model as:(14) 
where , , denote the input noisy HSI, tobereconstructed HSI and the noise embedded on it, respectively. is tradeoff parameter between the proposed E3DTV measure and the noise level. Based on (13), it is easy to find that this model is equivalent to:
(15) 
This optimization problem (15) can be readily solved by the ADMM strategy as follows.
5.2 The ADMM Algorithm
To slove the proposed HSI denoising model by ADMM algorithm, the following augmented Lagrangian function is firstly required to be minimized:
(16) 
where , and are the Lagrange multipliers and is a positive scalar in ADMM algorithm.
In the ADMM framework, we need to alternatively optimize each variable involved in (16) with all others fixed. We then discuss how to solve its subproblems with respect to each involved variable.
Update . Extracting all items containing from Eq. (16), we can obtain the subproblem as:
(17) 
Optimizing (17) with respect to can thus be equivalently treated as solving the following linear system:
(18) 
where indicates the adjoint operator of . Attributed to the blockcirculant of matrix corresponding to the operator , it can be diagonalized by using the FFT matrix Ng1999A . Then, can be efficiently computed by
(19) 
where fftn and ifftn indicate fast Fourier transform and its inverse transform, respectively.
is the elementswise square, and the division is also performed elementwisely.Update . Extracting all items containing from Eq. (16), we can obtain:
(20) 
This subproblem can be efficiently solved by calling the known softthresholding operator Donoho2002De :
(21) 
where and , is the threshold value. Then the solution of (20) can be calculated in closeform as follows:
(22) 
Update . Extracting all items containing orthogonal from Eq. (16), we have:
(23) 
The global solution of this subproblem can be achieved in closeform by following theorem xie2016multispectral :
Theorem 5.1
For any , the global solution of the problem:
(24) 
is =, where denotes the condensed SVD of .
We can then obtain the updating equation of as:
(25) 
Update . Extracting all items containing from Eq. (16), we can get:
(26) 
Based on the general ADMM principle, the multipliers are further updated by the following equations:
(27) 
Summarizing the aforementioned descriptions, we can get the entire ADMM algorithm, as summarized in , for solving the model (15).
6 Experimental Results on HSI denoising
In this section, extensive experiments are presented to evaluate the performance of the proposed method. We applied our proposed E3DTV prior to HSI denoising in comparison with 8 stateoftheart methods. In order to give an overall evaluation, three quantitative quality indices are employed: PSNR, SSIMWang2004Image , ERGASWald2010Data . PSNR and SSIM are two conventional spatialbased metrics, while ERGAS is spectralbased evaluation measure. The larger PSNR and SSIM values are, and the smaller ERGAS values is, the better quality the restored images tends to be of.
To demonstrate the effectiveness of the proposed algorithm, we compared our denoising results with 8 recently developed stateoftheart denoising methods, including nonlocal transform domain filter for volumetric data (BM4D) Maggioni2013Nonlocal , TDL ^{5}^{5}5http://gr.xjtu.edu.cn/c/document_library/get_file?folderId=1766524&name=DLFE38410.zip Peng2014Decomposable , NNM Wright2009Robust , WNNM Gu2014Weighted , LRMR ZHY2014HSI , WSNM XY2016Hyperspectral , LRTV He2016 and LRTDTV Wang2017Hyperspectral . All the involved parameters in these competing methods are finetuned by default settings or following the rules in their papers to guarantee their possibly optimal performance. Before the experiments, the gray value of data were rescaled to [0,1]. All the experiments were performed using MATLAB (R2014a) on Windows 7 OS with dualcore Inter 3.60GHz processor and 64 GB of RAM.


Noise case  Index  Noise  NNM  WNNM  LRMR  BM4D  TDL  WSNM  LRTV  LRTDTV  Ours 


Case a  PSNR  19.99  30.97  32.58  36.20  38.44  38.05  37.32  38.68  40.76  42.33 
SSIM  0.3672  0.8727  0.8420  0.9311  0.9763  0.9674  0.9453  0.9853  0.9804  0.9910  
ERGAS  233.99  69.02  57.65  36.85  29.04  30.72  32.37  28.47  23.02  24.32  


Case b  PSNR  19.34  30.81  32.46  35.67  35.10  33.22  36.12  38.04  40.54  41.97 
SSIM  0.3592  0.8715  0.8415  0.9291  0.9391  0.8743  0.9402  0.9818  0.9895  0.9910  
ERGAS  257.88  70.06  58.39  39.84  110.40  112.35  44.89  49.18  23.44  24.78  


Case c  PSNR  13.07  31.61  32.36  36.40  28.59  27.50  38.14  39.54  41.08  43.13 
SSIM  0.1778  0.8889  0.8786  0.9345  0.8401  0.9076  0.9551  0.9866  0.9910  0.9928  
ERGAS  520.53  64.36  59.71  36.04  90.29  102.08  29.52  35.22  21.98  22.55  


Case d  PSNR  12.92  31.45  32.29  35.76  27.02  26.54  36.63  38.75  40.72  42.68 
SSIM  0.1748  0.8878  0.8777  0.9316  0.8073  0.8365  0.9485  0.9826  0.9906  0.9926  
ERGAS  529.82  65.45  60.10  39.99  128.11  120.54  46.32  55.44  22.90  23.26  


Case e  PSNR  13.80  29.62  31.33  33.72  27.95  24.34  35.02  36.54  38.83  40.80 
SSIM  0.2038  0.8633  0.8445  0.8951  0.8201  0.5931  0.9285  0.9742  0.9859  0.9894  
ERGAS  500.68  81.16  66.39  50.16  121.95  158.75  51.33  72.52  28.66  28.37  


Case f  PSNR  13.73  29.54  29.97  33.42  27.53  23.34  33.88  36.35  38.63  40.54 
SSIM  0.2022  0.8612  0.8431  0.8918  0.8060  0.5583  0.9261  0.9736  0.9852  0.9888  
ERGAS  504.37  82.18  82.48  52.62  127.28  174.28  53.29  72.05  29.82  28.87  

6.1 Synthetic Data Experiments Setting
6.1.1 Experiments Setting
We adopted two synthetic HSIs data. One was considered by He2016 , generated from the Indian Pines dataset^{6}^{6}6https://engineering.purdue.edu/~biehl/MultiSpec/hyperspectral.html. The size of the Indian pines synthetic HSI is , with evident local smoothness property. The other one is ARIVIS DCMall HSI dataset, whose size is , with complex structure and texture information. To simulate noisy HSI data, we added six different types of noises to both original HSI data. The details are listed as follows:
Case a: Zeromean Gaussian noise is added to each band. The variances of the Gaussian noise is 0.1.
Case b: Gaussian noise is added to each band just as Case a. In addition, some deadlines are added from band 91 to band 130, with the number of stripes randomly selected from 3 to 10, and the width of the stripes randomly generated from 1 to 3.
Case c: The mixture of zeromean Gaussian noise and impulse noise are added to each band. The variance of the Gaussian noise and percentage of the impulse noise are set as 0.075 and 0.15, respectively.
Case d: The mixture of Gaussian and impulse noises are added like Case c, and the deadlines are added as Case b.
Case e: Zeromean Gaussian noises with different variances and impulse noise with different percentage are added to each band, with the variance value being randomly selected from 0 to 0.2, and percentage value being randomly selected from 0 to 0.2. In addition, the deadlines are added to some bands as Case d.
Case f: The mixture of Gaussian noise, impulse noise, and deadlines are added as Case e did. In addition, some stripes are added from band 161 to band 190 in Indian pines, and some stripes are added from band 141 to band 160 in DC mall, with the number of stripes being randomly selected from 20 to 40.
As for parameter settings, there are three parameters in the model that need to be set. They are the desired rank , the noise coefficient , and the sparsity basis coefficient . Since we used the norm to model the noise, inspired by the parameter settings of the RPCA, we set , and set the sparse basis coefficient as . Note that we should manual adjust the value, because the sparse basis coefficient and data is relevant, different data have different coefficient. In our synthetic data experiments, the value of Indian pines was set as 0.004, and that of DC mall was set as 0.0004. The rank is set as 13 in all Indian pines experiments, and 6 in DC mall experiments.


Noise case  Index  Noise  NNM  WNNM  LRMR  BM4D  TDL  WSNM  LRTV  LRTDTV  Ours 


Case a  PSNR  20.00  32.03  33.00  33.77  32.39  35.53  31.68  35.16  35.23  36.56 
SSIM  0.5147  0.9584  0.9569  0.9584  0.9308  0.9710  0.9429  0.9639  0.9636  0.9761  
ERGAS  375.83  90.87  83.40  74.33  86.67  60.37  95.76  62.98  61.83  55.74  


Case b  PSNR  19.87  31.87  32.93  33.55  31.49  34.51  31.63  34.18  34.33  35.57 
SSIM  0.5092  0.9581  0.9565  0.9572  0.9209  0.9650  0.9423  0.9531  0.9575  0.9698  
ERGAS  381.50  92.38  83.92  76.63  101.86  71.58  96.51  76.38  68.90  61.19  


Case c  PSNR  12.39  32.70  33.49  31.77  23.63  23.73  31.92  34.46  35.41  36.35 
SSIM  0.2207  0.9651  0.9632  0.9390  0.6869  0.8010  0.9465  0.9563  0.9672  0.9751  
ERGAS  925.41  84.53  79.64  94.28  251.78  250.93  93.28  91.02  61.54  56.02  


Case d  PSNR  12.37  32.52  33.40  31.61  23.55  23.80  31.87  34.33  35.23  36.08 
SSIM  0.2187  0.9646  0.9627  0.9369  0.6814  0.8003  0.9461  0.9548  0.9658  0.9732  
ERGAS  926.52  86.13  80.39  96.41  252.44  246.69  94.05  97.59  63.98  58.49  


Case e  PSNR  13.70  31.36  32.72  30.83  24.53  25.64  31.57  33.37  34.25  35.42 
SSIM  0.2698  0.9559  0.9551  0.9256  0.7094  0.8420  0.9413  0.9460  0.9580  0.9694  
ERGAS  793.92  98.30  85.48  106.43  228.07  203.72  96.44  99.25  70.40  62.27  


Case f  PSNR  13.67  31.45  32.76  30.80  24.52  25.56  31.47  33.43  34.34  35.23 
SSIM  0.2721  0.9566  0.9561  0.9252  0.7100  0.8383  0.9410  0.9446  0.9575  0.9672  
ERGAS  812.13  98.20  85.78  108.08  230.84  209.68  98.38  115.93  71.13  63.51  

6.1.2 The Synthetic Indian pines HSI experiments
In terms of visual quality, two representative bands of restored HSIs in two typical Case c and Case e obtained by different methods are shown. Fig. 12 shows the restoration results of band 220 under Case c, and Fig. 4 shows the restoration results of band 120 under Case e. Compared with other methods, it can be easily observed that the proposed method recovers more meaningful details including textures and edges, and meanwhile produces better reconstruction results in smooth regions with higher PSNR and SSIM values. And the mean overall quantitative assessment results by all competing denoising methods are listed in Table. 1, and the distribution of quality indices under Case e is depicted in Fig. 5 to show the each band’s restoration performance. In order to clearly depict the top of curve in the (a2) and (b2) pictures, we enlarge this area in the (a1) and (b1) figures. From the figure, we can easily see that the evaluation index curve of E3DTV is higher than other methods, which verifies the better recovery performance of the E3DTV in almost all bands.
To further compare the performance of all competing methods, it would be necessary to show the spectral signatures before and after restoration. As such, Fig. 6 shows one simple example namely the spectral signatures of pixel (50, 50) in Case e. It can be clearly seen that the LRTDTV as well as E3DTV methods show the better restoration results than other methods. Especially, the E3DTV method can preserve some detailed changes of spectral curve. Combined with the quality indices values shown in Table 1, our proposed E3DTV method attains the best restoration results among all compared methods.
6.1.3 The Synthetic DC Mall HSI Experiments
We then show the performance of all competing methods on the DC mall data, which have more complex structure than previous one. We also provide the two visual restoration images in Figs. 7 and 8, respectively. The average of all quantitative assessment results are provided in Table. 2, the distribution of quality indices under Case e is shown in Fig. 9, and the signature restoration of pixel (100, 100) in case e are provided in Fig. 10.
From the table and figures, it can be easily seen that our proposed E3DTV method is also of superior performance among other competing methods both in quantity and visualization. The effectiveness of the proposed E3DTV regularizer is further substantiated.
6.2 Real Data Experiments
Four typical realworld HSI datasets were further used for experiments. These datasets include ARIVIS Indian^{7}^{7}7https://engineering.purdue.edu/~biehl/MultiSpec/hyperspectral.html, ARIVIS Low altitude^{8}^{8}8https://engineering.purdue.edu/~biehl/MultiSpec/hyperspectral.html, HYDICE Urbanpart^{9}^{9}9http://www.tec.army.mil/hypercube and Terrain^{10}^{10}10http://www.tec.army.mil/hypercube. These real HSIs all contain serious pollutions including water pollution, deadlines, strips and sparse noise, and their shapes and structures are very different. It is thus a big challenge to recover the clean HSIs from them. In order to comprehensively display the restoration effects of all methods, we select two representative bands to show the visual restoration effects for all competing methods.
From Figs. 1118, it can be clearly observed that the methods based on utilizing the local smoothness prior and low rank prior such as LRTV and LRTDTV, together with the proposed E3DTV method, perform better than other competing methods. Specifically, the E3DTV method can better recover the overall structure information of HSI and suppress the noises as shown in Figs. 11, and get the best restoration result in preserving the local edge/texture information. The E3DTV method can also be seen to better remove the structure noises (such as deadlines and stripes) and preserve the local texture of the HSIs from the Figs. 13 to 16. The proposed method can better remove the artificial shadows and overlap to effectively avoid false error information caused by lowrank frames as depicted in Fig. 18.
In all, we can see that the proposed E3DTV method clearly outperforms all other competing methods in both its better noise suppressing effect and its finer meaningful structure preservation capabilities. This substitutes the usefulness of the proposed E3DTV regularization on such task in real complicated cases.
7 E3DTV Method for HSI Compressed Sensing
7.1 The E3DTV Model
The compressed sensing task is to reconstruct the HSI from a small number of compressed measurements with global information. Then the compressed measurements can be obtained by
(28) 
where denote the tobereconstructed HSI. is generally considered to contain the clean HSI term and noise term mixed in the signals during the acquisition. Denote the clean HSI and noise as , , respectively, and then we can get . The compressive operator can be instantiated as , where is a random permutation matrix, is the WalshHadamard transform, and is a random down sampling operator. Such a compressive operator has been shown to satisfy the restricted isometry property and successfully used in dealing with various compressed sensing problems Jiang2017Surveillance ; Gogna2015Split ; CaoW2016H .
Recovering based on (28) is an illposed inverse problem, and thus prior information should necessarily be exploited. Here we simply adopted the proposed E3DTV regularization as the unique prior term in the model. In addition, considering the actual transmission, some seriously polluted bands have little or no information. In order to save the transmission resources, these bands can be thrown away at transmission. So here, we assume that the noise obeys a Gaussian distribution as previous works did Wang2017Compressive , and a norm term is adopted on the noise term. The proposed HSI compressed sensing model is then formulated as follows:
(29) 
where is the tradeoff parameter for controlling the sparsity of the gradient map of the HSI and noise regularization parameter. By adopting (13), and introducing a auxiliary matrix , we can rewrite (29) as:
(30) 
7.2 The ADMM Algorithm
The proposed optimization problem (30) can be readily solved by the ADMM strategy. Firstly, the following augmented Lagrangian function is required to be minimized:
(31) 
where , , and are the Lagrange multipliers and s are a positive scalar in ADMM algorithm. We then alternatively optimize each variable involved in (31) with all others fixed, by following steps:
Update . Extracting all items containing from Eq. (31), we can obtain that:
(32) 
Optimizing (32) with respect to can thus be equivalently treated as solving the following linear system:
(33) 
where indicates the adjoint of . This linear system can be solved by offtheshelf techniques such as the preconditioned conjugate gradient method.
Update . Extracting all items containing from Eq. (31), similar to the process of solving , we can obtain that optimizing is equivalent to solve the following linear system:
(34) 
Similar to the process of solving in the GCS denosing model, we can get that can be efficiently computed by
(35) 
where fftn and ifftn indicate fast Fourier transform and its inverse transform, respectively.
Update . Extracting all items containing from Eq. (31), similar to the process of solving in the GCS denosing model, we get that can be efficiently computed by
(36) 
Update . Extracting all items containing orthogonal from Eq. (31), similar to the process of solving in the GCS denosing model, we get that can be efficiently computed by
(37) 
Update . Extracting all items containing from Eq.(31), similar to the process of solving in the GCS denosing model, we can get that can be efficiently computed by
(38) 
Based on the general ADMM principle, the multipliers are further updated by the following equations:
(39) 
The above steps can then be summarized as , for solving the model (30).


Data  Sampling  Quality  Methods  
set  ratio  indices  SpaRCS  KCS  JTTV  SLNTCS  JTenRe3DTV  ours 


Urban  0.3%  PSNR  18.04  15.01  18.90  17.62  23.48  24.17 
SSIM  0.2974  0.1121  0.3730  0.1552  0.3388  0.5881  
ERGAS  462.37  693.67  419.99  485.40  255.74  230.26  
1%  PSNR  18.33  18.51  19.96  20.42  27.93  28.86  
SSIM  0.3119  0.1982  0.4489  0.2348  0.6049  0.7976  
ERGAS  441.76  515.53  373.06  351.99  158.96  141.51  
5%  PSNR  19.87  20.54  26.64  24.98  35.18  38.10  
SSIM  0.3789  0.4139  0.7775  0.5326  0.8305  0.9506  
ERGAS  399.97  370.58  182.05  219.86  77.48  72.98  
10%  PSNR  23.84  23.45  34.22  27.62  36.99  41.60  
SSIM  0.5619  0.5118  0.9266  0.6002  0.8724  0.9688  
ERGAS  298.62  272.78  91.50  159.94  65.27  64.83  
20%  PSNR  30.45  29.94  42.35  30.60  38.04  43.13  
SSIM  0.7356  0.6648  0.9745  0.7322  0.8894  0.9727  
ERGAS  118.62  124.77  54.40  118.46  57.54  62.06  


DC Mall  0.3%  PSNR  18.46  16.24  19.75  17.83  22.10  24.46 
SSIM  0.2228  0.047  0.3066  0.1188  0.3205  0.6171  
ERGAS  467.33  635.11  408.50  518.46  324.00  244.14  
1%  PSNR  18.70  19.48  20.86  20.66  28.90  30.10  
SSIM  0.2409  0.1423  0.4129  0.2708  0.7897  0.8689  
ERGAS  456.27  450.13  360.06  373.86  142.82  128.59  
5%  PSNR  20.40  19.68  27.14  25.40  38.92  41.32  
SSIM  0.4177  0.4724  0.7908  0.5752  0.9673  0.9871  
ERGAS  430.98  522.79  185.48  220.35  47.26  37.12  
10%  PSNR  24.63  19.94  34.52  27.92  40.12  44.46  
SSIM  0.6095  0.4939  0.9479  0.7387  0.9746  0.9936  
ERGAS  347.45  467.29  84.84  165.81  41.56  26.38  
20%  PSNR  33.65  30.94  44.65  34.33  41.15  47.32  
SSIM  0.9065  0.8564  0.9940  0.9043  0.9792  0.9966  
ERGAS  82.31  118.52  25.79  85.68  36.94  19.08  


Moffett Field  0.3%  PSNR  26.58  23.71  27.66  24.85  29.78  30.17 
SSIM  0.5592  0.001  0.5975  0.0333  0.2869  0.6995  
ERGAS  264.89  368.24  237.38  331.08  180.28  182.70  
1%  PSNR  27.13  26.03  29.01  27.03  34.01  34.19  
SSIM  0.5691  0.0166  0.6598  0.2666  0.6198  0.8369  
ERGAS  256.62  295.53  206.65  257.57  107.48  119.19  
5%  PSNR  29.23  28.07  34.73  30.31  41.67  43.42  
SSIM  0.6528  0.2492  0.8559  0.4194  0.8975  0.9713  
ERGAS  234.04  226.74  114.02  181.53  49.37  47.02  
10%  PSNR  33.36  29.93  39.45  32.18  43.18  47.72  
SSIM  0.7374  0.4292  0.9447  0.4301  0.9223  0.9833  
ERGAS  210.29  217.25  65.12  178.19  42.65  36.39  
20%  PSNR  39.11  34.35  46.85  37.93  44.12  49.77  
SSIM  0.8577  0.6157  0.9861  0.8197  0.9333  0.9868  
ERGAS  61.08  116.60  32.10  68.19  38.50  33.16  


Lowal Altitude  0.3%  PSNR  16.91  13.75  17.69  16.67  24.47  24.50 
SSIM  0.2687  0.0237  0.3412  0.0678  0.4818  0.6357  
ERGAS  383.13  490.69  349.63  399.84  146.30  148.59  
1%  PSNR  17.23  16.85  18.96  19.14  28.47  30.76  
SSIM  0.2821  0.1328  0.4341  0.3220  0.7344  0.8623  
ERGAS  374.25  387.59  304.41  272.33  98.85  74.90  
5%  PSNR  17.96  20.28  25.32  24.28  35.39  38.32  
SSIM  0.3525  0.4824  0.7534  0.6136  0.8983  0.9526  
ERGAS  322.52  250.12  158.30  151.82  50.03  47.82  
10%  PSNR  21.24  26.99  32.97  31.56  36.91  41.24  
SSIM  05326  0.6522  0.9151  0.8385  0.9242  0.6490  
ERGAS  231.86  112.80  73.35  76.41  43.64  41.77  
20%  PSNR  31.60  30.18  42.47  33.88  38.03  43.05  
SSIM  0.8517  0.7891  0.9724  0.8927  0.9389  0.9693  
ERGAS  104.22  81.92  35.34  55.43  38.34  42.82  

8 Experimental Results on HSI Compressed Sensing
In this section, several real HSI data experiments were carried out to substantiate the effectiveness of the E3DTV regularizer on HSI compressed sensing.
Four popular realworld HSI data sets were used in our experiments, including the HYDICE Urban, the HYDICE Washington DC Mall, the AVIRIS Moffett Field, and the AVIRIS Lowal altitude data sets. After removing the seriously polluted bands and cropping images for each data set, the HSI cube used for the experiments are of sizes , , and , respectively.
To thoroughly evaluate the performance of the proposed method, we implemented five stateoftheart methods for comparison, including Kronecker compressed sensing (KCS) Li2012A , lowrank and jointsparse recovery (SparCS) Golbabaee2012Hyperspectral , joint nuclear/TV norm minimization (JTTV) Golbabaee2013Joint , selflearning tensor nonlinear compressed sensing (SLTNCS) Yang2015Compressive and Joint tensor/reweight 3DTV norm minimization (JTenRe3DTV) Wang2017Compressive . As previously stated, we adopted the random permuted Hadamard transform as the compressive operator. Five different sampling ratios, namely , , , , were considered. To sufficiently assess the performance of all compared approaches, we also employed the above three quantitative picture quality indices (PQIs), PSNR, SSIM and ERGAS, for performance evaluation.
As for parameter settings, there are two parameters in the model that need to be set: the desired rank and the sparsity basis coefficient . Note that with the increase of the sampling rate, the ideal and would also increase. Specifically, we set the value of rank as 4 or 5, and the sparsity basis coefficient as 0.0075 at the sampling rate are , . With the increase of the sampling rate, the small value of rank and sparsity basis coefficient will cause the algorithm to lose some information, so as the sampling rates are , , , we increase the rank as a value from to , and set the sparsity basis coefficient as 0.015. As for the competing methods’ parameters, we use the default settings or tuned all their values to possibly guarantee their optimal performance for fair comparison.
Table. 3 compares the reconstruction results by all the compared methods. It can be easily seen that, the proposed method achieves a significantly improved performance under all such five sampling ratios with all PQIs, as compared with other competing methods.
In terms of visual quality, two representative bands of four HSIs in sampling ratio were presented in Fig. 1926. Some demarcated areas in the subfigure are enlarged in the right bottom corner and left top corner for better visualization. From these figures and the quantitative indices numerical table, we can see that the method utilizing the local smoothness property of image, such as the JTTV, SLNCS, JTenRe3DTV, can relatively more effectively eliminate the nonsmoothness caused by compression reconstruction as the restoration image of SpaRCS show. Meanwhile, it is evident that the method based on the E3DTV regularizer can preserve the best detail edge, texture information and image fidelity than other competing method.
We also show the PSNR and SSIM values by reconstructing each band of three data sets under sampling ratio in Fig. 27. It is easy to see that the proposed method can get higher SSIM and PSNR values than other ones for almost all HSI bands. This comprehensively illustrates the robustness of our proposed method.
From the above presentation, we can see that the structures of these four real HSIs are different, while our proposed method can get satisfactory performance. Combined the performance of denoising, it should be expected to use the E3DTV regularizer as a general tool to ameliorate the performance for general HSI processing tasks.
9 Conclusion
In this paper, we have introduced an enhanced 3DTV regularization (E3DTV, briefly), to better reflect the sparsity characteristic of the gradient maps of a natural HSI. It is surprising to us that by simply embedding such regularization term as the unique regularizer into the models designed for HSI denoising and compressed sensing, respectively, it attains stateoftheart performance superior to the methods previously used for the tasks. This makes it hopeful further extend such a regularizer to more general HSI processing tasks, which is the main task of our future research.
There are still some limitations on the practical use of the proposed E3DTV regularizer. For example, it should be necessary to consider to design an automatic parameter tuning strategy based on certain HSI as well as its size and structure complexity. Besides, the E3DTV regularizer is still imposed on the unfolding matrices of the considered HSI, while not its multidimensional structure (i.e., tensor) itself. It should be another meaningful investigation on reformulating the proposed regularizer directly on the tensor to more sufficiently utilizing the structural knowledge of the HSI. We will further investigate these issues in our future research.
References

(1)
Buades, A., Coll, B., & Morel, J. M. (2005). A nonlocal algorithm for image denoising.
Computer Vision and Pattern Recognition,c2
, 6065.  (2) Bouali, M., & Ladjal, S. (2011). Toward Optimal Destriping of MODIS Data Using a Unidirectional Variational Model. IEEE Transactions on Geoscience and Remote Sensing, 49(8), 259268.
 (3) Chambolle, A. (2004). An Algorithm for Total Variation Minimization and Applications. Journal of Mathematical imaging and vision, 20(12), 8997.
 (4) Candes, E. J., Romberg, J., & Tao, T. (2006). Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Press, 489509.
 (5) Cao, W. F., Wang, Y., Sun, J., & et al. (2016). Total Variation Regularized Tensor RPCA for Background Subtraction from Compressive Measurements. IEEE Transactions on Image Processing, 25(9),40754090.

(6)
Chen, G. Y., & Qian, S. E. (2011). Denoising of Hyperspectral Imagery Using Principal Component Analysis and Wavelet Shrinkage.
IEEE Transactions on Geoscience and Remote Sensing, 49(3), 973980.  (7) Candes, E. J., Li, X., Ma, Y., & et al. (2009). Robust principal component analysis? IEEE Journal of the Acm, 58(3).
 (8) Donoho, D. L. (2002). Denoising by softthresholding. IEEE Transactions on Information Theory, 41(3), 613627.
 (9) Donoho, D. L. (2006). Compressed sensing. IEEE Transactions on Information Theory, 52(4), 12891306.
 (10) Dabov, K., Foi, A., Katkovnik, V., & et al. (2007). Image Denoising by Sparse 3D TransformDomain Collaborative Filtering. IEEE Transactions on Image Processing, 16(8), 20802095.
 (11) Dias, J. M. B., & Plaza, A. J. (2012). A new technique for hyperspectral compressive sensing using spectral unmixing. SPIE Optical Engineering + Applications 85140N.
 (12) Dong, W. S., Li, G. Y., Shi, G. M., & et al. (2015). LowRank Tensor Approximation with Laplacian Scale Mixture Modeling for Multiframe Image Denoising. IEEE Conference on Computer Vision and Pattern Recognition, 442449.
 (13) Duarte, M. F., & Baraniuk, R. G. (2010). Kronecker product matrices for compressive sensing. IEEE International Conference on Acoustics Speech and Signal Processing, 36503653.
 (14) Dabov K., Foi A. , Katkovnik V., & et al. (2007). Image Denoising by Sparse 3D TransformDomain Collaborative Filtering. IEEE Transactions on Image Processing, 16(8), 20802095.
 (15) Mairal, J., Bach, F., Ponce, J., & et al. (2010). Nonlocal sparse models for image restoration. IEEE International Conference on Computer Vision, 22722279.
 (16) Dong, W., Fu, F., Shi, G., & et al. (2016). Hyperspectral Image Superresolution via Nonnegative Structured Sparse Representation. IEEE Transactions on Image Processing A Publication of the IEEE Signal Processing Society, 25(5), 23372352.
 (17) Elad, M., & Aharon, M. (2006). Image Denoising Via Sparse and Redundant Representations Over Learned Dictionaries. IEEE Transactions on Image Processing, 15(12), 37363745.
 (18) Gogna, A., Shukla, A., Agarwal H. K., & et al. (2015). Split Bregman algorithms for sparse / jointsparse and lowrank signal recovery: Application in compressive hyperspectral imaging. IEEE International Conference on Image Processing, 13021306.
 (19) Golbabaee, M., & Vandergheynst, P. (2012). Hyperspectral image compressed sensing via lowrank and jointsparse matrix recovery. IEEE International Conference on Acoustics, Speech and Signal Processing, 27412744,.
 (20) Golbabaee, M., & Vandergheynst, P. (2013). Joint trace/TV norm minimization: A new efficient approach for spectral compressive imaging. IEEE International Conference on Image Processing, 933936.
 (21) Goetz, A. F. H., Ustin, S. L., & Schaepman, M. E. (2009). Three decades of hyperspectral remote sensing of the Earth: a personal view. Remote Sensing of Environment, 113(9), 516.
 (22) Gu, S. H., Zhang, L., Zuo, W. M., & et al. (2014). Weighted Nuclear Norm Minimization with Application to Image Denoising. IEEE Conference on Computer Vision and Pattern Recognition, 28622869.
 (23) He, W., Zhang, H. Y., Zhang, L. P., & et al. (2015). Hyperspectral Image Denoising via NoiseAdjusted Iterative LowRank Matrix Approximation. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 8(6), 30503061.
 (24) He, S., Zhou, H., Wang, Y., & et al. (2016). Superresolution reconstruction of hyperspectral images via low rank tensor modeling and total variation regularization. International Geoscience and Remote Sensing Symposium, 69626965.
 (25) He, W., Zhang, H. Y., Zhang, L. P., & et al. (2016). TotalVariationRegularized LowRank Matrix Factorization for Hyperspectral Image Restoration. IEEE Transactions on Geoscience and Remote Sensing, 54(1), 178188.
 (26) Jiang, H., Deng, W.,& Shen, Z. W. (2017). Surveillance Video Processing Using Compressive Sensing. Inverse Problems and Imaging, 6(2), 201214.
 (27) Jiang, C., Zhang, H. Y., Zhang, Li. P., & et al. (2016). Hyperspectral Image Denoising with a Combined Spatial and Spectral Weighted Hyperspectral Total Variation Model. Canadian Journal of Remote Sensing, 42(1), 5372.
 (28) Kolda, T. G., & Bader, B. W. Tensor Decompositions and Applications. SIAM Review, 51(3), 455500.
 (29) Lu, C., Feng, J., Chen, Y., & et al. (2016). Tensor Robust Principal Component Analysis: Exact Recovery of Corrupted LowRank Tensors via Convex Optimization. IEEE Computer Vision and Pattern Recognition,52495257.
 (30) Liu, J., Tai, X. C., Huang, H., & et al. (2013). A weighted dictionary learning model for denoising images corrupted by mixed noise. IEEE Transactions on Image Processing, 22(3), 11081120.
 (31) Li, C., Sun, T., Kelly, K. F.,& et al. (2012). A compressive sensing and unmixing scheme for hyperspectral data processing. IEEE Transactions on Image Processing A Publication of the IEEE Signal Processing Society, 21(3),12001210.
 (32) Martín, G., BioucasDias, J. M., & Plaza, A. (2015). HYCA: A New Technique for Hyperspectral Compressive Sensing. IEEE Transactions on Geoscience and Remote Sensing, 53(5), 28192831.

(33)
Mairal, J., Bach, F., Ponce, J., & et al. (2009). Online dictionary learning for sparse coding,
International Conference on Machine Learning
, 689696.  (34) Maggioni, M., Katkovnik, V., Egiazarian, K., & et al. (2013). Nonlocal transformdomain filter for volumetric data denoising and reconstruction, IEEE Transactions on Image Processing, 22(1), 119133.
 (35) Ng, M. K., Chan, R. H., & Tang, W. C. (1999). A Fast Algorithm for Deblurring Models with Neumann Boundary Conditions. Society for Industrial and Applied Mathematics, 21(3), 851866.
 (36) Othman, H., & Qian, S. E. (2006). Noise reduction of hyperspectral imagery using hybrid spatialspectral derivativedomain wavelet shrinkage. IEEE Transactions on Geoscience and Remote Sensing, 44(2), 397408.
 (37) Peng, Y., Meng, D. Y., Xu, Z. B., & et al. (2014). Decomposable Nonlocal Tensor Dictionary Learning for Multispectral Image Denoising. IEEE Conference on Computer Vision and Pattern Recognition, 29492956.
 (38) Peng, Y., Suo, J., Dai, Q., & et al. (2014). Reweighted LowRank Matrix Recovery and its Application in Image Restoration. IEEE Transactions on Cybernetics, 44(12), 24182430.

(39)
Qian, Y. T., & Ye, M. (2013). Hyperspectral Imagery Restoration Using Nonlocal SpectralSpatial Structured Sparse Representation With Noise Estimation.
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 6(2), 499515.  (40) Rudin, L. I., Osher, S. F., & Emad, H. F. (1992). Nonlinear total variation based noise removal algorithms. Eleventh International Conference of the Center for Nonlinear Studies on Experimental Mathematics : Computational Issues in Nonlinear Science: Computational Issues in Nonlinear Science, 259268.
 (41) Wright, J., Ganesh, A., Rao, S., & et al. (2009). Robust Principal Component Analysis: Exact Recovery of Corrupted LowRank Matrices. Advances in Neural Information Processing Systems, 87(4), 20802088.
 (42) Wu, Z. J., Wang, Q., Jin, J., & et al. (2017). Structure tensor total variationregularized weighted nuclear norm minimization for hyperspectral image mixed denoising. Signal Processing, 131(1), 202219.
 (43) Wang, Z., Bovik, A. C., Sheikh, H. R., & et al. (2004). Image quality assessment: from error visibility to structural similarity, IEEE Trans Image Process, 13(4), 600612.
 (44) Wald, L. (2002). Data Fusion. Definitions and Architectures  Fusion of Images of Different Spatial Resolutions. Presses des MINES
 (45) Waters, A. E., Sankaranarayanan, A. C., & Baraniuk, R. G. (2011). SpaRCS: recovering lowrank and sparse matrices from compressive measurements, International Conference on Neural Information Processing Systems, 10891097.
 (46) Wang, Y., Lin, L., Zhao, Q., & et al. (2017). Compressive Sensing of Hyperspectral Images via Joint Tensor Tucker Decomposition and Weighted Total Variation Regularization. IEEE Geoscience and Remote Sensing Letters, 14(12), 24572461.
 (47) Wang, Y., Peng, J. J., Zhao, Q., & et al. (2018). Hyperspectral Image Restoration Via Total Variation Regularized LowRank Tensor Decomposition, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 11(4), 12271243.
 (48) Willett R. M., Duarte, M. F., Davenport M., &, et al. (2013). Sparsity and Structure in Hyperspectral Imaging : Sensing, Reconstruction, and Target Detection. IEEE Signal Processing Magazine, 31(1), 116126.
 (49) Xing, Z., Zhou, M., Castrodad, A., & et al. (2011). Dictionary Learning for Noisy and Incomplete Hyperspectral Images. SIAM Journal on Imaging Sciences, 5(1), 3356.
 (50) Xie, Y., Qu, Y. Y., Tao, D. C., & et al. (2016). Hyperspectral Image Restoration via Iteratively Regularized Weighted Schatten Norm Minimization. IEEE Transactions on Geoscience and Remote Sensing, 54(8), 46424659.
 (51) Xie, Q., Zhao, Q., Meng, D. Y., & et al. (2016). Multispectral images denoising by intrinsic tensor sparsity regularization, IEEE Conference on Computer Vision and Pattern Recognition, 1692–1700.
 (52) Yang, S., Wang, M., Li, P., & et al. (2015). Compressive Hyperspectral Imaging via Sparse Tensor and Nonlinear Compressed Sensing. IEEE Transactions on Geoscience and Remote Sensing, 53(11), 59435957.
 (53) Zhang, L., Wei, W., Zhang, Y. N., & et al. (2015). Hyperspectral Compressive Sensing Using ManifoldStructured Sparsity Prior, IEEE International Conference on Computer Vision, 35503558.
 (54) Zhao, J., Zhong, Y., Shu, H., & et al. (2016). HighResolution Image Classification Integrating SpectralSpatialLocation Cues by Conditional Random Fields. IEEE Transactions on Image Processing A Publication of the IEEE Signal Processing Society, 25(9), 40334045.
 (55) Zhang, H. Y., He, W., Zhang, L. P., & et al. (2014). Hyperspectral Image Restoration Using LowRank Matrix Recovery. IEEE Transactions on Geoscience and Remote Sensing, 52(8), 47294743.
 (56) Zhou, T. Y., & Tao, D. C. (2011). GoDec: Randomized Lowrank and Sparse Matrix Decomposition in Noisy Case. International Conference on Machine Learning, 3340.
Appendix
In this supplementary material, we provide the proof of the equivalence between problem (10) and problem (11) in the maintext.
Proof of the equivalence
We call two problems equivalent if from a solution of one, a solution of the other is readily found, and vice versa. We can then give the following theorem:
Theorem 9.1
For any , the problem
(40) 
is equivalent to
(41) 