I Introduction
Image denoising is one of the most studied problems at the core of image processing. It has been questioned whether it is a solved problem, given the excellent performance of several state-of-the-art methods, which seem to be approaching some sort of theoretical limit [1]. The answer to this question is negative: denoising is still a very active area of research, not only in itself, but also as a building block for many other tasks. In fact, most algorithms for imaging inverse problems (e.g., deblurring, reconstruction, or super-resolution) include a step that is formally equivalent to a denoising operation.
A possible approach to leverage state-of-the-art denoising methods in other inverse problems is to generalize/extend their underlying rationale to these other tasks [2, 3, 4]. The main drawbacks of this approach are that, most often, it is strongly problem-dependent (i.e., different problems require different generalizations) and it may lead to computationally hard problems. The recently proposed plug-and-play (PnP) approach aims at sidestepping these drawbacks while still being able to capitalize on state-of-the-art denoisers. PnP methods work by plugging a black-box denoiser into an iterative algorithm that addresses the inverse problem at hand. These algorithms typically include the proximity operator [5] of a convex regularizer, which is formally equivalent to a denoising operation (under white Gaussian noise), with the convexity of the regularizer playing a central role in the convergence properties of the algorithm. The PnP approach replaces the proximity operator with some black-box denoiser [6, 7], the key advantage being that the same denoiser can be used for different problems, since its use is fully decoupled from the particularities of the inverse problem being addressed. However, replacing the proximity operator with some denoiser makes the convergence of the resulting algorithm difficult to analyze. While the original algorithm optimizes an objective function, the PnP version lacks an explicit objective function. In the original PnP scheme [6, 7], the iterative algorithm in which the denoiser is plugged in is the alternating direction method of multipliers (ADMM) [8, 9]. PnP-ADMM allows using an arbitrary denoiser to regularize an imaging inverse problem, such as deblurring or super-resolution, tackled via ADMM, departing from its standard use, where the denoiser is the proximity operator of a convex regularizer [8]. (Another recent approach, termed regularization by denoising, follows a different route by building an explicit regularizer based on a denoiser [10].) The fact that a denoiser, which may lack an explicit objective function or a closed form, is plugged into ADMM begs three obvious questions [11, 6, 7]:

is the resulting algorithm guaranteed to converge?

if so, does it converge to a (global or local) optimum of some objective function?

if so, can we identify this function?
In this paper, we adopt a patch-based denoiser using Gaussian mixture models (GMM) [12, 13, 14], which allows us to provide positive answers to these three questions.
As proposed in our earlier work [15], a GMM can be adapted/specialized to specific classes of images; the rationale is that a denoiser based on a class-adapted prior is able to capture the characteristics of that class better than a general-purpose one. Here, we take this adaptation one step further, by using scene-adapted priors. As the name suggests, the model is no longer learned from a set of images of the same class, but from one or more images of the same scene being restored/reconstructed. Hyperspectral sharpening [16, 17, 18, 19] and image deblurring from noisy/blurred image pairs [20, 21] are the two data fusion problems that may leverage such priors and that we consider in this paper.
In summary, our contributions are the following.

We propose plugging a GMM-based denoiser into an ADMM algorithm; this GMM-based denoiser is a modified version of a patch-based MMSE (minimum mean squared error) denoiser [12]; the modification consists in fixing the posterior weights of the MMSE denoiser.

We prove that the resulting PnP-ADMM algorithm is guaranteed to converge to a global minimum of a cost function, which we explicitly identify.

We apply the proposed framework to the two fusion problems mentioned above, showing that it yields results competitive with the state-of-the-art.
An earlier version of this work was published in [22]. With respect to that publication, this one includes more detailed and more general results and proofs, as well as the application to noisy/blurred image pairs, which was not previously considered.
The paper is organized as follows. Section II briefly reviews ADMM, which is the basis of the proposed algorithm, and its PnP version. Section III studies conditions for a denoiser to be a proximity operator, focusing on linear denoisers and explicitly identifying the underlying objective function. Section IV proposes a GMM-based denoiser and shows that it satisfies the conditions for being a proximity operator. Sections V–VII present the two instantiations of the proposed method and the corresponding experimental results. Finally, Section VIII concludes the paper.
II Alternating Direction Method of Multipliers
II-A From Basic ADMM to SALSA
Although dating back to the 1970s [23], ADMM has seen a surge of interest in the last decade, as a flexible and efficient optimization tool, widely used in imaging problems, machine learning, and other areas [24, 8]. One of the canonical problem forms addressed by ADMM is

$$\min_{\mathbf{x}\in\mathbb{R}^n}\; f_1(\mathbf{x}) + f_2(\mathbf{H}\mathbf{x}), \tag{1}$$

where $f_1$ and $f_2$ are closed (equivalently, lower semicontinuous – l.s.c. [5]), proper¹, convex functions, and $\mathbf{H}$ is a full column rank matrix of appropriate dimensions [24]. (¹A convex function $f$ is called proper if its domain is not empty: $\mathrm{dom}(f)\neq\emptyset$.) Each iteration of ADMM for (1) is as follows (with the superscript $k$ denoting the iteration counter):

$$\mathbf{x}^{k+1} = \arg\min_{\mathbf{x}}\; f_1(\mathbf{x}) + \frac{\mu}{2}\bigl\|\mathbf{H}\mathbf{x}-\mathbf{v}^{k}-\mathbf{d}^{k}\bigr\|_2^2, \tag{2}$$
$$\mathbf{v}^{k+1} = \arg\min_{\mathbf{v}}\; f_2(\mathbf{v}) + \frac{\mu}{2}\bigl\|\mathbf{H}\mathbf{x}^{k+1}-\mathbf{v}-\mathbf{d}^{k}\bigr\|_2^2, \tag{3}$$
$$\mathbf{d}^{k+1} = \mathbf{d}^{k} - \bigl(\mathbf{H}\mathbf{x}^{k+1}-\mathbf{v}^{k+1}\bigr), \tag{4}$$

where $\mathbf{d}^{k}$ are the (scaled) Lagrange multipliers at iteration $k$, and $\mu>0$ is the penalty parameter.
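To make the iterations (2)–(4) concrete, here is a minimal NumPy sketch (illustrative, not from the paper) for the toy problem $\min_x \frac{1}{2}\|x-a\|_2^2 + \lambda\|x\|_1$ with $\mathbf{H}=\mathbf{I}$, whose exact solution is the soft-thresholding of $a$; all variable names are assumptions of this sketch.

```python
import numpy as np

def soft(u, t):
    """Soft-thresholding: the proximity operator of t*||.||_1."""
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

def admm_l1(a, lam, mu=1.0, iters=300):
    """ADMM iterations (2)-(4) for min_x 0.5*||x-a||^2 + lam*||x||_1, with H = I:
    f1(x) = 0.5*||x-a||^2 (quadratic x-update), f2(v) = lam*||v||_1."""
    x = np.zeros_like(a)
    v = np.zeros_like(a)
    d = np.zeros_like(a)
    for _ in range(iters):
        # (2): x-update, closed form of a quadratic subproblem
        x = (a + mu * (v + d)) / (1.0 + mu)
        # (3): v-update = proximity operator of f2/mu at (x - d)
        v = soft(x - d, lam / mu)
        # (4): update of the scaled Lagrange multipliers
        d = d - (x - v)
    return x

a = np.array([3.0, -0.5, 1.2, -2.0])
x_admm = admm_l1(a, lam=1.0)
x_exact = soft(a, 1.0)  # known closed-form solution of this toy problem
```

Since $\mathbf{H}=\mathbf{I}$ has full column rank and both terms are closed, proper, and convex, the convergence theorem reviewed below applies, and the iterates approach the soft-thresholded solution.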
ADMM can be directly applied to problems involving the sum of $J$ closed, proper, convex functions, composed with linear operators,

$$\min_{\mathbf{z}\in\mathbb{R}^n}\; \sum_{j=1}^{J} g_j\bigl(\mathbf{H}^{(j)}\mathbf{z}\bigr), \tag{5}$$

by casting it into the form (1) as follows: $f_1 = 0$, $f_2(\mathbf{v}) = \sum_{j=1}^{J} g_j(\mathbf{v}_j)$, with $\mathbf{v}=[\mathbf{v}_1^\top,\ldots,\mathbf{v}_J^\top]^\top$ and $\mathbf{H}=[(\mathbf{H}^{(1)})^\top,\ldots,(\mathbf{H}^{(J)})^\top]^\top$. The resulting instance of ADMM (called SALSA – split augmented Lagrangian shrinkage algorithm [9]) is

$$\mathbf{z}^{k+1} = \arg\min_{\mathbf{z}} \sum_{j=1}^{J} \frac{\mu}{2}\bigl\|\mathbf{H}^{(j)}\mathbf{z}-\mathbf{v}_j^{k}-\mathbf{d}_j^{k}\bigr\|_2^2, \tag{6}$$
$$\mathbf{v}_1^{k+1} = \arg\min_{\mathbf{v}}\; g_1(\mathbf{v}) + \frac{\mu}{2}\bigl\|\mathbf{H}^{(1)}\mathbf{z}^{k+1}-\mathbf{v}-\mathbf{d}_1^{k}\bigr\|_2^2, \tag{7}$$
$$\vdots$$
$$\mathbf{v}_J^{k+1} = \arg\min_{\mathbf{v}}\; g_J(\mathbf{v}) + \frac{\mu}{2}\bigl\|\mathbf{H}^{(J)}\mathbf{z}^{k+1}-\mathbf{v}-\mathbf{d}_J^{k}\bigr\|_2^2, \tag{8}$$
$$\mathbf{d}_j^{k+1} = \mathbf{d}_j^{k} - \bigl(\mathbf{H}^{(j)}\mathbf{z}^{k+1}-\mathbf{v}_j^{k+1}\bigr), \quad j=1,\ldots,J, \tag{9}$$

where, as in (2)–(4), $\mathbf{d}_j^{k}$ are scaled Lagrange multipliers and $\mu>0$ is the penalty parameter. Computing $\mathbf{z}^{k+1}$ requires solving a linear system (equivalently, inverting a matrix):

$$\mathbf{z}^{k+1} = \Bigl[\sum_{j=1}^{J}\bigl(\mathbf{H}^{(j)}\bigr)^{\!\top}\mathbf{H}^{(j)}\Bigr]^{-1} \sum_{j=1}^{J}\bigl(\mathbf{H}^{(j)}\bigr)^{\!\top}\bigl(\mathbf{v}_j^{k}+\mathbf{d}_j^{k}\bigr), \tag{10}$$

whereas computing $\mathbf{v}_j^{k+1}$ (for $j=1,\ldots,J$) requires applying the proximity operator of the corresponding $g_j/\mu$:

$$\mathbf{v}_j^{k+1} = \mathrm{prox}_{g_j/\mu}\bigl(\mathbf{H}^{(j)}\mathbf{z}^{k+1}-\mathbf{d}_j^{k}\bigr). \tag{11}$$
Recall that the proximity operator of a convex l.s.c. function $g$, computed at some point $\mathbf{u}$, is given by

$$\mathrm{prox}_{g}(\mathbf{u}) = \arg\min_{\mathbf{x}}\; g(\mathbf{x}) + \frac{1}{2}\,\|\mathbf{x}-\mathbf{u}\|_2^2, \tag{12}$$

and is guaranteed to be well defined due to the coercivity and strict convexity of the term $\frac{1}{2}\|\mathbf{x}-\mathbf{u}\|_2^2$ [5].
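As a sanity check on definition (12), the following sketch (illustrative, not from the paper) evaluates the objective $g(\mathbf{x})+\frac{1}{2}\|\mathbf{x}-\mathbf{u}\|_2^2$ for $g=\lambda\|\cdot\|_1$ and verifies that the soft-thresholding of $\mathbf{u}$ attains a value no larger than that of random perturbations:

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 0.7
u = rng.normal(size=5)

def g(x):
    # the convex l.s.c. function whose prox we test: lam * ||x||_1
    return lam * np.abs(x).sum()

def prox_objective(x):
    # the objective minimized in (12)
    return g(x) + 0.5 * np.sum((x - u) ** 2)

# Soft-thresholding is the proximity operator of lam*||.||_1
p = np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

# The prox must beat any other candidate point
candidates = [p + rng.normal(scale=s, size=5)
              for s in (0.01, 0.1, 1.0) for _ in range(100)]
best_other = min(prox_objective(c) for c in candidates)
```

Because the objective in (12) is strictly convex and coercive, the minimizer is unique, so the prox value is strictly below that of every perturbed candidate.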
II-B Convergence of the ADMM Algorithm
We present a simplified version of a classical theorem on the convergence of ADMM, proved in the seminal paper of Eckstein and Bertsekas [24]. The full version of the theorem allows for inexact solution of the optimization problems in the iterations of ADMM, whereas this version, which we will use below, assumes exact solutions. Since SALSA is a particular instance of ADMM, the application of the theorem to it is trivial.
Theorem 1 (Eckstein and Bertsekas [24]).
Consider a problem of the form (1), where $\mathbf{H}$ has full column rank, and $f_1$ and $f_2$ are closed, proper, convex functions; let $\mu>0$, $\mathbf{v}^{0}$, and $\mathbf{d}^{0}$ be given. If the sequences $\{\mathbf{x}^{k}\}$, $\{\mathbf{v}^{k}\}$, and $\{\mathbf{d}^{k}\}$ are generated according to (2), (3), (4), then $\{\mathbf{x}^{k}\}$ converges to a solution of (1), $\mathbf{x}^{\ast}$, if one exists. Furthermore, if a solution does not exist, then at least one of the sequences $\{\mathbf{v}^{k}\}$ or $\{\mathbf{d}^{k}\}$ diverges.
II-C Plug-and-Play ADMM
The PnP framework [6] emerged from noticing that subproblem (3) in ADMM (equivalently, (7)–(8) in SALSA), which corresponds to the application of a proximity operator, can be interpreted as a denoising step. In fact, the definition of the proximity operator in (12) is formally equivalent to the maximum a posteriori (MAP) estimate of some unknown $\mathbf{x}$ from noisy observations $\mathbf{u}$ (under additive white Gaussian noise with unit variance), under a prior proportional to $\exp(-g(\mathbf{x}))$. Armed with this observation, the goal of the PnP approach is to capitalize on recent developments in image denoising, by replacing a proximity operator with a state-of-the-art denoiser. However, the current state-of-the-art denoising methods do not correspond (at least, explicitly) to proximity operators of convex functions. Most of them exploit image self-similarity by decomposing the image into small patches, i.e., they rely on a divide-and-conquer principle: instead of dealing with the noisy image as a whole, small (overlapping) patches are extracted, denoised independently (using, e.g., collaborative filtering, Gaussian mixtures, dictionaries, …), and then put back in their locations [25, 26, 27, 12, 13, 28]. These denoisers often lack a closed-form expression, making the convergence analysis of the resulting algorithm much more difficult. In the following section, we focus on the key properties that a denoising function needs in order to satisfy Theorem 1, temporarily leaving aside the observation operator, as it is problem-dependent.

III When is a Denoiser a Proximity Operator?
III-A Necessary and Sufficient Conditions
If the denoiser adopted in the PnP-ADMM scheme can be shown to correspond to the proximity operator of some l.s.c. convex function (although maybe not explicitly derived as such), then nothing really changes. The following classical result by Moreau [29] (which was used by Sreehari et al. [7] in this context) gives necessary and sufficient conditions for some mapping to be a proximity operator.
Lemma 1 (Moreau [29]).
A function $f:\mathbb{R}^n\to\mathbb{R}^n$ is the proximity operator of some function if and only if it is nonexpansive and it is a subgradient of some convex function $\varphi$, i.e., if and only if, for any $\mathbf{x},\mathbf{y}\in\mathbb{R}^n$,

$$\|f(\mathbf{x})-f(\mathbf{y})\|_2 \le \|\mathbf{x}-\mathbf{y}\|_2 \quad\text{and}\quad f(\mathbf{x})\in\partial\varphi(\mathbf{x}), \tag{13}$$

where $\partial\varphi(\mathbf{x})$ denotes the subdifferential of $\varphi$ at $\mathbf{x}$ [5].
In practice, given some black-box denoiser that performs a complex sequence of operations (e.g., the state-of-the-art BM3D [26]), it may be difficult, if not impossible, to verify that it satisfies the conditions of Lemma 1. In the next section, we study a class of denoisers that verify the conditions of Lemma 1, and go one step further by identifying the function of which the denoiser is the proximity operator. This second step is important to study the existence and uniqueness of solutions of the underlying optimization problem.
III-B Denoisers as Linear Filters
Arguably, the simplest denoising methods estimate each pixel as a linear combination of the noisy pixels, with weights that depend themselves on these noisy pixels. Mathematically, given a noisy image $\mathbf{y}=\mathbf{x}+\mathbf{n}$, where $\mathbf{x}$ is the underlying clean image and $\mathbf{n}$ is noise, these methods produce an estimate $\hat{\mathbf{x}}$ according to

$$\hat{\mathbf{x}} = \mathbf{W}(\mathbf{y})\,\mathbf{y}, \tag{14}$$

where $\mathbf{W}(\mathbf{y})$ is a square matrix, which is itself a function of $\mathbf{y}$. Many methods, including the classical nonlocal means (NLM [27]), can be written in this form. Sreehari et al. [7] have studied this class of denoisers, with $\mathbf{W}(\mathbf{y})$ being a symmetric and doubly-stochastic matrix. This is the case of symmetrized NLM [6, 30], patch-based sparse representations on a dictionary with a fixed support [18], and patch-based approximate MAP estimation under a GMM prior [31]. In this paper, we study a different class of denoisers, where the matrix is fixed, $\mathbf{W}(\mathbf{y})=\mathbf{W}$, but it only needs to be symmetric, positive semidefinite (p.s.d.), and have spectral radius less than or equal to 1 (conditions that are satisfied by any symmetric doubly-stochastic matrix). As shown below, the patch-based GMM denoiser proposed in this paper satisfies these conditions.
Before proceeding, we introduce some notation. Let $\mathcal{R}(\mathbf{W})$ denote the column span of $\mathbf{W}$. Let $\iota_{\mathcal{C}}$ denote the indicator function of the convex set $\mathcal{C}$, i.e., $\iota_{\mathcal{C}}(\mathbf{x})=0$, if $\mathbf{x}\in\mathcal{C}$, and $\iota_{\mathcal{C}}(\mathbf{x})=+\infty$, if $\mathbf{x}\notin\mathcal{C}$. The following theorem is proved in Appendix A.
Theorem 2.
Let $f:\mathbb{R}^n\to\mathbb{R}^n$ be defined as $f(\mathbf{y})=\mathbf{W}\mathbf{y}$, where $\mathbf{W}$ is a symmetric, p.s.d. matrix, with maximum eigenvalue satisfying $\lambda_{\max}(\mathbf{W})\le 1$. Let $\mathbf{W}=\mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^\top$ be the eigendecomposition of $\mathbf{W}$, where $\boldsymbol{\Lambda}$ contains the eigenvalues, sorted in decreasing order; the first $r$ are nonzero and the last $n-r$ are zeros. Let $\boldsymbol{\Lambda}_r$ be the $r\times r$ diagonal matrix with the nonzero eigenvalues and $\mathbf{U}_r$ the matrix with the first $r$ columns of $\mathbf{U}$ (an orthonormal basis for $\mathcal{R}(\mathbf{W})$). Then,

$f$ is the proximity operator of a convex function $\varphi$;

$\varphi$ is the following closed, proper, convex function:

$$\varphi(\mathbf{x}) = \frac{1}{2}\,\mathbf{x}^\top\mathbf{U}_r\bigl(\boldsymbol{\Lambda}_r^{-1}-\mathbf{I}\bigr)\mathbf{U}_r^\top\mathbf{x} + \iota_{\mathcal{R}(\mathbf{W})}(\mathbf{x}). \tag{15}$$
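The statement of Theorem 2 can be checked numerically. The sketch below (illustrative, with arbitrary dimensions) builds a random symmetric p.s.d. $\mathbf{W}$ with eigenvalues in $(0,1]$, forms $\varphi$ as in (15) on the range of $\mathbf{W}$, and verifies that $\mathbf{x}=\mathbf{W}\mathbf{y}$ attains the minimum of $\varphi(\mathbf{x})+\frac{1}{2}\|\mathbf{x}-\mathbf{y}\|_2^2$ over points in that range:

```python
import numpy as np

rng = np.random.default_rng(1)
n, r = 6, 4

# Random symmetric p.s.d. W of rank r, eigenvalues in (0, 1]
Q, _ = np.linalg.qr(rng.normal(size=(n, n)))
eigs = np.concatenate([rng.uniform(0.1, 1.0, size=r), np.zeros(n - r)])
W = Q @ np.diag(eigs) @ Q.T
Ur, Lr = Q[:, :r], np.diag(eigs[:r])

def phi(x):
    # (15), for x already in the range of W (indicator term is zero)
    return 0.5 * x @ (Ur @ (np.linalg.inv(Lr) - np.eye(r)) @ Ur.T) @ x

y = rng.normal(size=n)
x_star = W @ y
f_star = phi(x_star) + 0.5 * np.sum((x_star - y) ** 2)

# Compare against many other points in the range of W
f_others = []
for _ in range(500):
    z = Ur @ rng.normal(size=r)
    f_others.append(phi(z) + 0.5 * np.sum((z - y) ** 2))
```

The same construction also lets one check the two Moreau conditions directly: $\mathbf{W}$ is symmetric, and its spectral radius at most 1 makes the map nonexpansive.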
IV GMM-Based Denoising
In this paper, we adopt a GMM patch-based denoiser to plug into ADMM. The reasons for this choice are threefold: (i) it has been shown that GMMs are good priors for clean image patches [12, 13, 14]; (ii) a GMM can be learned either from an external dataset of clean images [13, 14], or directly from noisy patches [12]; (iii) the ease of learning a GMM from an external dataset opens the door to class-specific priors, which naturally lead to superior results, since they capture the characteristics of the image class at hand better than a general-purpose image model [15, 32].
Leaving aside, for now, the question of how to learn the parameters of the model, we consider a GMM prior²

$$p(\mathbf{x}_i) = \sum_{j=1}^{K} \alpha_j\,\mathcal{N}\bigl(\mathbf{x}_i;\boldsymbol{\mu}_j,\mathbf{C}_j\bigr) \tag{16}$$

(²$\mathcal{N}(\mathbf{x};\boldsymbol{\mu},\mathbf{C})$ denotes a Gaussian probability density function of mean $\boldsymbol{\mu}$ and covariance $\mathbf{C}$, computed at $\mathbf{x}$.)
for any patch $\mathbf{x}_i$ of the image to be denoised. A common practice in patch-based image denoising is to treat the average of each patch separately: the average of the noisy patch is subtracted from it, the resulting zero-mean patch is denoised, and the average is added back to the estimated patch. The implication of this procedure is that, without loss of generality, we can assume that the components of the GMM have zero mean ($\boldsymbol{\mu}_j=\mathbf{0}$, for $j=1,\ldots,K$).
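The mean-removal convention just described can be sketched as follows (illustrative; `denoise_zero_mean` is a placeholder for any zero-mean patch denoiser, such as the GMM-based one below):

```python
import numpy as np

def denoise_patch(y_patch, denoise_zero_mean):
    """Treat the patch average separately: subtract it, denoise the
    zero-mean residual, and add the average back."""
    dc = y_patch.mean()
    return denoise_zero_mean(y_patch - dc) + dc

# With the identity as placeholder denoiser, the patch is unchanged
y = np.array([1.0, 2.0, 3.0, 4.0])
out = denoise_patch(y, lambda p: p)
```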
Given this GMM prior and a noisy version $\mathbf{y}_i=\mathbf{x}_i+\mathbf{n}_i$ of each patch, where $\mathbf{n}_i\sim\mathcal{N}(\mathbf{0},\sigma^2\mathbf{I})$, the MMSE estimate of $\mathbf{x}_i$ is given by

$$\hat{\mathbf{x}}_i = \sum_{j=1}^{K}\beta_j(\mathbf{y}_i)\,\mathbf{v}_j(\mathbf{y}_i), \tag{17}$$

where

$$\beta_j(\mathbf{y}_i) = \frac{\alpha_j\,\mathcal{N}\bigl(\mathbf{y}_i;\mathbf{0},\mathbf{C}_j+\sigma^2\mathbf{I}\bigr)}{\sum_{k=1}^{K}\alpha_k\,\mathcal{N}\bigl(\mathbf{y}_i;\mathbf{0},\mathbf{C}_k+\sigma^2\mathbf{I}\bigr)} \tag{18}$$

and

$$\mathbf{v}_j(\mathbf{y}_i) = \mathbf{C}_j\bigl(\mathbf{C}_j+\sigma^2\mathbf{I}\bigr)^{-1}\mathbf{y}_i. \tag{19}$$

Notice that $\beta_j(\mathbf{y}_i)$ is the posterior probability that the $i$th patch was generated by the $j$th component of the GMM, and $\mathbf{v}_j(\mathbf{y}_i)$ is the MMSE estimate of the $i$th patch if it was known to have been generated by the $j$th component. As is standard in patch-based denoising, after computing the MMSE estimates of the patches, they are returned to their locations and combined by straight averaging. This corresponds to solving the following optimization problem
$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \sum_{i=1}^{N}\bigl\|\mathbf{P}_i\mathbf{x}-\hat{\mathbf{x}}_i\bigr\|_2^2, \tag{20}$$

where $\mathbf{P}_i$ is a binary matrix that extracts the $i$th patch from the image (thus $\mathbf{P}_i^\top$ puts the patch back into its place), $N$ is the number of patches, $n_p$ is the number of pixels in each patch, and $n$ the total number of image pixels. The solution to (20) is

$$\hat{\mathbf{x}} = \Bigl(\sum_{i=1}^{N}\mathbf{P}_i^\top\mathbf{P}_i\Bigr)^{-1}\sum_{i=1}^{N}\mathbf{P}_i^\top\hat{\mathbf{x}}_i = \frac{1}{n_p}\sum_{i=1}^{N}\mathbf{P}_i^\top\hat{\mathbf{x}}_i, \tag{21}$$

assuming the patches are extracted with unit stride and periodic boundary conditions, thus every pixel belongs to $n_p$ patches and $\sum_{i}\mathbf{P}_i^\top\mathbf{P}_i = n_p\,\mathbf{I}$.

IV-A From Class-Adapted to Scene-Adapted Priors
Class-adapted priors have been used successfully in image deblurring, when the image to be deblurred is known to belong to a certain class, e.g., a face or a fingerprint, and a prior adapted to that class is used [15]. Here, we take this adaptation to an extreme, which is possible if we have two types of data/images of the same scene, as is the case in some fusion problems. Instead of using a dataset of other images from the same class (e.g., faces), we leverage one type of data to learn a model that is adapted to the specific scene being imaged, and use it as a prior. The underlying assumption is that the two types of data have similar spatial statistical properties, which is reasonable, as the scene being imaged is the same. In Sections V and VI, we apply this idea to tackle two data fusion problems, namely hyperspectral (HS) sharpening [16] and image deblurring from a noisy/blurred image pair [21].
In HS sharpening, the goal is to fuse HS data of low spatial resolution, but high spectral resolution, with multispectral (MS) or panchromatic (PAN) data of high spatial resolution, but low spectral resolution, to obtain an HS image with high spatial and spectral resolutions. This can be seen as a super-resolution problem, where the HS image is super-resolved with the help of a spatial prior obtained from the MS image of the same scene. More specifically, we resort to a GMM prior, as described in the previous section, learned from (patches of the bands of) the observed MS or PAN image using the classical expectation-maximization (EM) algorithm, and then leverage this prior to sharpen the low-resolution HS bands.
In the second problem, the goal is to restore a blurred, but relatively noiseless, image, with the help of another image of the same scene, which is sharp but noisy. The noisy sharp image is used to learn a spatial prior, adapted to the particular scene, which is then used to estimate the underlying sharp and clean image. Again, this is carried out by learning a GMM from the patches of the noisy, but sharp, image.
In light of the scene-adaptation scheme just described, we further claim that it not only makes sense to use a GMM prior learned from the observed sharp images, but also to keep the posterior weights of each patch in the target image, required to compute (17), equal to the values obtained for the corresponding observed patch during training. In other words, we use the weights as if we were simply denoising the sharp image, with a GMM trained from its noisy patches. In fact, another instance of this idea has been previously used, based on sparse representations on learned dictionaries [18]. We stress that this approach is only reasonable in the case of image denoising or image reconstruction using scene-adapted priors, where the training and target images are of the same scene, and thus have similar spatial structures. As will be shown in the next section, this modification plays an important role in proving convergence of the proposed PnP-ADMM algorithm.
IV-B Analysis of the Scene-Adapted GMM-Based Denoiser
Consider the role of the denoiser: for a noisy input argument $\mathbf{y}$, it produces an estimate $\hat{\mathbf{x}}$, given the parameters of the GMM and the noise variance $\sigma^2$. As the GMM denoiser is patch-based, the algorithm starts by extracting the (overlapping, with unit stride) patches, followed by computing their estimates using (17). Recall that, with the scene-adaptation scheme, the posterior weights no longer depend on the noisy patches; denoting by $\beta_{ij}$ the fixed weight of the $j$th component for the $i$th patch, we may write compactly

$$\hat{\mathbf{x}}_i = \mathbf{F}_i\,\mathbf{P}_i\,\mathbf{y}, \qquad \mathbf{F}_i = \sum_{j=1}^{K}\beta_{ij}\,\mathbf{C}_j\bigl(\mathbf{C}_j+\sigma^2\mathbf{I}\bigr)^{-1}, \tag{22}$$

where $\mathbf{P}_i$ is the same matrix that appears in (20), thus $\mathbf{y}_i=\mathbf{P}_i\mathbf{y}$. Furthermore, using (21),

$$\hat{\mathbf{x}} = \frac{1}{n_p}\sum_{i=1}^{N}\mathbf{P}_i^\top\mathbf{F}_i\,\mathbf{P}_i\,\mathbf{y} = \mathbf{W}\,\mathbf{y}, \tag{23}$$

which shows that the GMM-based MMSE denoiser, with fixed weights, is a linear function of the noisy image. Matrix $\mathbf{W}$, of course, depends on the parameters of the method: $\sigma^2$, the weights $\beta_{ij}$, and the covariances $\mathbf{C}_j$ of the GMM, but it is fixed throughout the iterations of ADMM. Recall that the weights and covariances are obtained during training, and $\sigma^2$ is assumed known, thus $\mathbf{W}$ is fixed. The following lemma summarizes the key properties of $\mathbf{W}$, when using the scene-adapted GMM denoiser. The proof can be found in Appendix B.
Lemma 2.
Matrix $\mathbf{W}$ in (23) is symmetric and positive semidefinite, with maximum eigenvalue no larger than 1.
Corollary 1.
The scene-adapted GMM denoiser $f(\mathbf{y})=\mathbf{W}\mathbf{y}$, with $\mathbf{W}$ as in (23), satisfies the conditions of Theorem 2; it is thus the proximity operator of the closed, proper, convex function $\varphi$ given by (15).

Notice that Lemma 2 and Corollary 1 hinge on the fact that the denoiser uses fixed weights $\beta_{ij}$, rather than being a pure MMSE denoiser. In fact, a simple univariate example (illustrated in the accompanying figure) shows that an MMSE denoiser may not even be nonexpansive: plotting the MMSE estimate in (17), as a function of the noisy observation, under a GMM prior with two zero-mean components (one with small and one with large variance), reveals regions where the derivative of the function is larger than 1, which implies that it is not nonexpansive.
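The claim that the exact MMSE denoiser may be expansive is easy to reproduce in one dimension. The sketch below (with illustrative parameter values, not taken from the paper) implements (17)–(19) for a two-component zero-mean GMM and measures the maximum slope of the estimator by finite differences:

```python
import numpy as np

def gauss(y, var):
    # univariate zero-mean Gaussian density
    return np.exp(-0.5 * y**2 / var) / np.sqrt(2 * np.pi * var)

def mmse(y, alphas, variances, sigma2):
    """Scalar MMSE estimate under a zero-mean GMM prior, eqs. (17)-(19):
    a weighted sum of Wiener shrinkages with input-dependent weights."""
    dens = np.array([a * gauss(y, v + sigma2)
                     for a, v in zip(alphas, variances)])
    beta = dens / dens.sum(axis=0)                       # (18)
    shrink = np.array([v / (v + sigma2) for v in variances])
    return (beta * shrink[:, None] * y).sum(axis=0)      # (17) with (19)

# Two zero-mean components: one narrow, one wide; unit noise variance
alphas, variances, sigma2 = [0.5, 0.5], [0.1, 10.0], 1.0
y = np.linspace(-8, 8, 4001)
xhat = mmse(y, alphas, variances, sigma2)
slope = np.diff(xhat) / np.diff(y)
max_slope = slope.max()  # exceeds 1 in the transition between components
```

The slope exceeds 1 in the region where the posterior weights switch from the narrow to the wide component, precisely the expansive behavior that fixing the weights removes.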
IV-C Convergence Proof
Corollary 1 allows proving convergence of the proposed algorithm. This result is formalized in Corollary 2.
Corollary 2.
Consider the SALSA algorithm (in (6)–(9)) applied to a problem of the form (5), with $\mathbf{H}^{(j)}=\mathbf{I}$ for at least one $j\in\{1,\ldots,J\}$. Consider also that $g_1,\ldots,g_{J-1}$ are proper, closed, convex functions, and $g_J=\varphi$, where $\varphi$ is as defined in (15). Furthermore, assume that the intersection of the domains of $g_1,\ldots,g_J$ is not empty. Then, the algorithm converges.
Proof.
The proof amounts to verifying the conditions of Theorem 1. The first condition for convergence is that $\mathbf{H}=[(\mathbf{H}^{(1)})^\top,\ldots,(\mathbf{H}^{(J)})^\top]^\top$ has full column rank. Since at least one of the blocks is the identity, $\mathbf{H}$ has full column rank, regardless of the other blocks. The second condition regards the functions $g_1$ to $g_J$; while $g_1$ to $g_{J-1}$ are assumed to be closed, proper, and convex, Corollary 1 guarantees that $g_J=\varphi$ is also closed, proper, and convex. Finally, the function being minimized is the sum of closed, proper, convex functions, such that the intersection of the corresponding domains is not empty; thus, the sum is itself closed, proper, and convex, implying that a minimizer is guaranteed to exist. ∎
V Application to Hyperspectral Sharpening
V-A Formulation
Hyperspectral (HS) sharpening is a data fusion problem where the goal is to combine HS data of very high spectral resolution, but low spatial resolution, with multispectral (MS) or panchromatic (PAN) data of high spatial resolution, but low spectral resolution [16]. This data fusion problem can be formulated as an inverse problem, where the goal is to infer an image with both high spatial and spectral resolutions from the observed data [16, 33]. Using compact matrix notation, the forward model can be written as

$$\mathbf{Y}_h = \mathbf{Z}\mathbf{B}\mathbf{M} + \mathbf{N}_h, \tag{24}$$
$$\mathbf{Y}_m = \mathbf{R}\mathbf{Z} + \mathbf{N}_m, \tag{25}$$

where:

$\mathbf{Z}$ is the image to be inferred, with $L_h$ bands and $n_m$ pixels (columns index pixels, rows index the bands);

$\mathbf{Y}_h$ denotes the observed HS data, with $L_h$ bands and $n_h$ pixels (naturally, $n_h<n_m$; often, $n_h\ll n_m$);

$\mathbf{Y}_m$ is the observed MS/PAN data, with $L_m$ bands and $n_m$ pixels (naturally, $L_m<L_h$; often, $L_m\ll L_h$);

$\mathbf{B}$ represents a spatial low-pass filter;

$\mathbf{M}$ is the spatial subsampling operator;

$\mathbf{R}$ models the spectral response of the MS or PAN sensor;

$\mathbf{N}_h$ and $\mathbf{N}_m$ represent noise, assumed white, Gaussian, with known variances $\sigma_h^2$ and $\sigma_m^2$, respectively.
Eqs. (24)–(25) model the HS data as a spatially blurred and downsampled version of $\mathbf{Z}$, and the MS data as a spectrally degraded version of $\mathbf{Z}$, via the spectral response matrix $\mathbf{R}$. In this paper, we assume that $\mathbf{B}$ and $\mathbf{R}$ are known, and that the spatial filter $\mathbf{B}$ is cyclic, so that we may handle it efficiently using the fast Fourier transform (FFT); this assumption could be easily relaxed using the method in [34]. Moreover, other types of noise could also be considered within this formulation [18]. Since the number of unknowns in $\mathbf{Z}$ is typically much larger than the number of measurements, the problem of estimating $\mathbf{Z}$ is ill-posed. Current state-of-the-art methods tackle it using a variational formulation (equivalently, maximum a posteriori – MAP) [16], i.e., seeking an estimate according to

$$\hat{\mathbf{Z}} = \arg\min_{\mathbf{Z}}\; \frac{1}{2}\bigl\|\mathbf{Y}_h-\mathbf{Z}\mathbf{B}\mathbf{M}\bigr\|_F^2 + \frac{\lambda_m}{2}\bigl\|\mathbf{Y}_m-\mathbf{R}\mathbf{Z}\bigr\|_F^2 + \lambda_\varphi\,\varphi(\mathbf{Z}), \tag{26}$$

where $\varphi$ is the regularizer (or negative log-prior), while the parameters $\lambda_m$ and $\lambda_\varphi$ control the relative weight of each term.
One way to alleviate the difficulty associated with the very high dimensionality of $\mathbf{Z}$ is to exploit the fact that the spectra in an HS image typically live in a subspace of dimension $p$ (often with $p\ll L_h$), due to the high correlation among the different bands [35, 16]. This subspace can be interpreted as a set of materials present in the scene, which is often much smaller than the number of bands. There are several methods to identify a suitable subspace, from the classical singular value decomposition (SVD) or principal component analysis (PCA), to methods specifically designed for HS data [36, 37]. Let $\mathbf{E}$ contain a basis for the identified subspace; then we may write $\mathbf{Z}=\mathbf{E}\mathbf{X}$, where each column of $\mathbf{X}$, denoted the latent image, is a coefficient representation of the corresponding column of $\mathbf{Z}$. Instead of estimating $\mathbf{Z}$ directly, this approach reformulates the problem as that of obtaining an estimate of $\mathbf{X}$, from which $\mathbf{Z}$ can be recovered. Replacing $\mathbf{Z}$ by its representation $\mathbf{E}\mathbf{X}$ in (26) leads to a new estimation criterion

$$\hat{\mathbf{X}} = \arg\min_{\mathbf{X}}\; \frac{1}{2}\bigl\|\mathbf{Y}_h-\mathbf{E}\mathbf{X}\mathbf{B}\mathbf{M}\bigr\|_F^2 + \frac{\lambda_m}{2}\bigl\|\mathbf{Y}_m-\mathbf{R}\mathbf{E}\mathbf{X}\bigr\|_F^2 + \lambda_\varphi\,\varphi(\mathbf{X}), \tag{27}$$

where $\|\cdot\|_F^2$ denotes the squared Frobenius norm and $\varphi$ is the regularizer acting on the coefficients.
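The subspace identification step can be sketched with a plain truncated SVD (the paper mentions SVD/PCA and HS-specific alternatives; the function name and synthetic check below are illustrative):

```python
import numpy as np

def hs_subspace(Yh, p):
    """Return an orthonormal basis E (Lh x p) for the spectral subspace,
    taken from the leading left singular vectors of the HS data matrix."""
    U, s, _ = np.linalg.svd(Yh, full_matrices=False)
    return U[:, :p]

# Synthetic check: spectra generated exactly in a p-dimensional subspace
rng = np.random.default_rng(2)
Lh, nh, p = 50, 200, 5
E_true = np.linalg.qr(rng.normal(size=(Lh, p)))[0]
Yh = E_true @ rng.normal(size=(p, nh))

E = hs_subspace(Yh, p)
# Projecting onto the estimated subspace reproduces the data: Z = E X
residual = np.linalg.norm(Yh - E @ (E.T @ Yh)) / np.linalg.norm(Yh)
```

On real HS data the residual is not exactly zero, but the rapid decay of the singular values makes a small $p$ (around 10) sufficient in practice.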
V-B SALSA for Hyperspectral Sharpening
We follow Simões et al. [17] and use SALSA to tackle problem (27). Notice that, while in (27) the optimization variable is a matrix $\mathbf{X}$, in (5) the optimization variable is a vector $\mathbf{z}$. This is merely a difference in notation, as we can represent matrix $\mathbf{X}$ by its vectorized version $\mathbf{x}=\mathrm{vec}(\mathbf{X})$, i.e., by stacking the columns of $\mathbf{X}$ into vector $\mathbf{x}$. In fact, using well-known equalities relating vectorization and Kronecker products, we can write

$$\mathrm{vec}(\mathbf{X}\mathbf{B}\mathbf{M}) = \bigl((\mathbf{B}\mathbf{M})^\top\otimes\mathbf{I}\bigr)\,\mathrm{vec}(\mathbf{X}), \tag{28}$$
$$\mathrm{vec}(\mathbf{R}\mathbf{E}\mathbf{X}) = \bigl(\mathbf{I}\otimes(\mathbf{R}\mathbf{E})\bigr)\,\mathrm{vec}(\mathbf{X}), \tag{29}$$

and map (27) into (5) by letting $J=3$, $\mathbf{z}=\mathrm{vec}(\mathbf{X})$, and

$$g_1(\mathbf{V}) = \frac{1}{2}\bigl\|\mathbf{Y}_h-\mathbf{E}\mathbf{V}\bigr\|_F^2, \tag{30}$$
$$g_2(\mathbf{V}) = \frac{\lambda_m}{2}\bigl\|\mathbf{Y}_m-\mathbf{R}\mathbf{E}\mathbf{V}\bigr\|_F^2, \tag{31}$$
$$g_3 = \lambda_\varphi\,\varphi, \tag{32}$$

where each $g_j$ acts on the matrix form of $\mathbf{H}^{(j)}\mathbf{z}$, for $j=1,2,3$, and

$$\mathbf{H}^{(1)} = (\mathbf{B}\mathbf{M})^\top\otimes\mathbf{I}, \qquad \mathbf{H}^{(2)} = \mathbf{I}, \qquad \mathbf{H}^{(3)} = \mathbf{I}. \tag{33}$$
Although the previous paragraph showed that we can arbitrarily switch between matrix and vector representations, in what follows it is more convenient to keep the matrix versions. Each iteration of the resulting instance of ADMM has the form

$$\mathbf{X}^{k+1} = \arg\min_{\mathbf{X}}\; \bigl\|\mathbf{X}\mathbf{B}\mathbf{M}-\mathbf{V}_1^{k}-\mathbf{D}_1^{k}\bigr\|_F^2 + \bigl\|\mathbf{X}-\mathbf{V}_2^{k}-\mathbf{D}_2^{k}\bigr\|_F^2 + \bigl\|\mathbf{X}-\mathbf{V}_3^{k}-\mathbf{D}_3^{k}\bigr\|_F^2, \tag{34}$$
$$\mathbf{V}_1^{k+1} = \arg\min_{\mathbf{V}_1}\; \frac{1}{2}\bigl\|\mathbf{Y}_h-\mathbf{E}\mathbf{V}_1\bigr\|_F^2 + \frac{\mu}{2}\bigl\|\mathbf{X}^{k+1}\mathbf{B}\mathbf{M}-\mathbf{V}_1-\mathbf{D}_1^{k}\bigr\|_F^2, \tag{35}$$
$$\mathbf{V}_2^{k+1} = \arg\min_{\mathbf{V}_2}\; \frac{\lambda_m}{2}\bigl\|\mathbf{Y}_m-\mathbf{R}\mathbf{E}\mathbf{V}_2\bigr\|_F^2 + \frac{\mu}{2}\bigl\|\mathbf{X}^{k+1}-\mathbf{V}_2-\mathbf{D}_2^{k}\bigr\|_F^2, \tag{36}$$
$$\mathbf{V}_3^{k+1} = \arg\min_{\mathbf{V}_3}\; \lambda_\varphi\,\varphi(\mathbf{V}_3) + \frac{\mu}{2}\bigl\|\mathbf{X}^{k+1}-\mathbf{V}_3-\mathbf{D}_3^{k}\bigr\|_F^2, \tag{37}$$

followed by the updates of the scaled Lagrange multipliers $\mathbf{D}_1$, $\mathbf{D}_2$, $\mathbf{D}_3$, where $\mu$ is the so-called penalty parameter of ADMM [8]. The first three problems are quadratic, thus with closed-form solutions involving matrix inversions [17]:

$$\mathbf{X}^{k+1} = \Bigl[\bigl(\mathbf{V}_1^{k}+\mathbf{D}_1^{k}\bigr)(\mathbf{B}\mathbf{M})^\top + \mathbf{V}_2^{k}+\mathbf{D}_2^{k} + \mathbf{V}_3^{k}+\mathbf{D}_3^{k}\Bigr]\Bigl[\mathbf{B}\mathbf{M}(\mathbf{B}\mathbf{M})^\top + 2\,\mathbf{I}\Bigr]^{-1}, \tag{38}$$
$$\mathbf{V}_1^{k+1} = \bigl(\mathbf{E}^\top\mathbf{E}+\mu\mathbf{I}\bigr)^{-1}\bigl(\mathbf{E}^\top\mathbf{Y}_h + \mu(\mathbf{X}^{k+1}\mathbf{B}\mathbf{M}-\mathbf{D}_1^{k})\bigr), \tag{39}$$
$$\mathbf{V}_2^{k+1} = \bigl(\lambda_m\mathbf{E}^\top\mathbf{R}^\top\mathbf{R}\mathbf{E}+\mu\mathbf{I}\bigr)^{-1}\bigl(\lambda_m\mathbf{E}^\top\mathbf{R}^\top\mathbf{Y}_m + \mu(\mathbf{X}^{k+1}-\mathbf{D}_2^{k})\bigr), \tag{40}$$

where the inversion in (38), with $\mathbf{B}$ cyclic, reduces to entrywise (Hadamard) operations in the frequency domain, and can thus be computed with $O(n_m\log n_m)$ cost via the FFT, assuming periodic boundary conditions or unknown boundaries [34]. The matrix inversions in (39) and (40) involve matrices of size $p\times p$; since $p$ is typically a small number (around 10), the cost of these inversions is marginal. Moreover, with $\mu$ fixed, these two inverses can be precomputed [17].
V-C GMM-Based PnP-SALSA Algorithm
Finally, the PnP algorithm is obtained by letting the regularizer $\varphi$ be as defined in (15), i.e., (37) takes the form

$$\mathbf{V}_3^{k+1} = \mathbf{W}\bigl(\mathbf{X}^{k+1}-\mathbf{D}_3^{k}\bigr), \tag{41}$$

where $\mathbf{W}$ is as given in (22)–(23), with the noise variance $\sigma^2$ determined by $\lambda_\varphi$ and $\mu$. Of course, (41) is just the formal representation of the denoiser acting on $\mathbf{X}^{k+1}-\mathbf{D}_3^{k}$; matrix $\mathbf{W}$ is never explicitly formed, and the multiplication by $\mathbf{W}$ corresponds to the patch-based denoiser described in Section IV. The resulting PnP-SALSA algorithm is as shown in Algorithm 1, and its convergence is stated by the following corollary.
Corollary 3.
Algorithm 1 converges.
Proof.
Convergence of Algorithm 1 is a direct consequence of Corollary 2, with $J=3$, $g_1$ and $g_2$ as in (30) and (31), and $g_3=\lambda_\varphi\varphi$, with $\varphi$ as given by (15). The condition that at least one of the matrices $\mathbf{H}^{(j)}$ is an identity is satisfied in (33). Functions $g_1$ and $g_2$ (see (30) and (31)) are quadratic, thus closed, proper, and convex, and their domain is the whole Euclidean space where they are defined; consequently, the intersection of the domains of $g_1$, $g_2$, and $g_3$ is not empty, implying that (27) has at least one solution. ∎
V-D Complete Hyperspectral Sharpening Algorithm
Finally, to obtain a complete HS sharpening algorithm, we still need to specify how matrices $\mathbf{E}$ and $\mathbf{W}$ are obtained. Matrix $\mathbf{E}$ is obtained by applying PCA to all the spectra in the HS data $\mathbf{Y}_h$.
As mentioned above, matrix $\mathbf{W}$ is not explicitly formed; multiplication by $\mathbf{W}$ corresponds to the patch-based denoiser described in Subsection IV-B, based on the covariance matrices $\mathbf{C}_j$ and weights $\beta_{ij}$. These covariance matrices are learned from all the patches of all the bands of $\mathbf{Y}_m$, using the EM algorithm. With $\mathbf{y}_i^{(l)}$ denoting the (vectorized) $i$th patch of the $l$th band of $\mathbf{Y}_m$, each E-step of EM computes the posterior probability that this patch was produced by the $j$th GMM component:

$$\beta_{ij}^{(l)} = \frac{\alpha_j\,\mathcal{N}\bigl(\mathbf{y}_i^{(l)};\mathbf{0},\mathbf{C}_j+\sigma_m^2\mathbf{I}\bigr)}{\sum_{k=1}^{K}\alpha_k\,\mathcal{N}\bigl(\mathbf{y}_i^{(l)};\mathbf{0},\mathbf{C}_k+\sigma_m^2\mathbf{I}\bigr)}, \tag{42}$$

where the $\alpha_j$ are the current weight estimates. The M-step updates the weight and covariance estimates. The weights are updated using the standard expression in EM for a GMM, i.e., by averaging the posterior probabilities over all patches and bands. The covariance update has to be modified to take into account that the patches are noisy (see [12]): $\sigma_m^2\mathbf{I}$ is subtracted from the weighted empirical covariance, and the eigenvalue thresholding operation $\psi_{+}$ is applied, which guarantees that the result is a p.s.d. matrix after the subtraction; for a real symmetric matrix argument $\mathbf{M}=\mathbf{U}\boldsymbol{\Lambda}\mathbf{U}^\top$, $\psi_{+}(\mathbf{M})=\mathbf{U}\max(\boldsymbol{\Lambda},0)\mathbf{U}^\top$, where the columns of $\mathbf{U}$ are the eigenvectors of $\mathbf{M}$, $\boldsymbol{\Lambda}$ is the diagonal matrix holding its eigenvalues, and the $\max(\cdot,0)$ operation is applied elementwise. After convergence of EM is declared, the posterior weights of each patch are computed by averaging across bands,

$$\beta_{ij} = \frac{1}{L_m}\sum_{l=1}^{L_m}\beta_{ij}^{(l)}, \tag{43}$$

and stored (recall that $N$ is the number of patches). Naturally, we assume that the patches are aligned across bands, i.e., all the patches $\mathbf{y}_i^{(l)}$, for $l=1,\ldots,L_m$, are exactly in the same spatial location. The complete method is described in Algorithm 2.
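The eigenvalue thresholding operation $\psi_{+}$ used in the covariance M-step can be sketched as follows (illustrative; the matrix sizes are arbitrary):

```python
import numpy as np

def psi_plus(M):
    """Eigenvalue thresholding: for symmetric M = U diag(w) U^T,
    return U diag(max(w, 0)) U^T, which is positive semidefinite."""
    w, U = np.linalg.eigh(M)
    return (U * np.maximum(w, 0.0)) @ U.T

# After subtracting sigma^2 * I from an empirical covariance, the
# result may be indefinite; psi_plus restores semidefiniteness.
rng = np.random.default_rng(3)
A = rng.normal(size=(5, 5))
S = A @ A.T / 5.0            # empirical covariance (p.s.d.)
sigma2 = 1.0
C = psi_plus(S - sigma2 * np.eye(5))
min_eig = np.linalg.eigvalsh(C).min()
```

Note that `psi_plus` leaves an already p.s.d. matrix essentially unchanged, so it only acts when the noise subtraction produces negative eigenvalues.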
VI Application to Deblurring Image Pairs
VI-A Formulation
Another application that can leverage scene-adapted priors is image deblurring from blurred/noisy image pairs. It was first proposed in [20], and it is particularly useful for improving image quality under dim lighting conditions. On the one hand, a long exposure with a hand-held camera is likely to yield blurred pictures; on the other hand, a short exposure with high ISO produces sharp, yet noisy, images. The idea is to fuse these two images of the same scene to obtain a sharp and clean image.
The observation model in this problem is a simple instance of (24)–(25), with $L_h=L_m=1$, $\mathbf{M}=\mathbf{I}$, and $\mathbf{R}=\mathbf{I}$, i.e., there is neither subsampling nor spectral degradation by the sensor. Furthermore, we only consider grayscale images, and we vectorize the images differently, with all the pixels stacked into a column vector. In conclusion, the observation model is

$$\mathbf{y}_b = \mathbf{H}\mathbf{x} + \mathbf{n}_b, \tag{44}$$
$$\mathbf{y}_s = \mathbf{x} + \mathbf{n}_s, \tag{45}$$

where:

$\mathbf{x}$ is the (vectorized) target image to be inferred;

$\mathbf{y}_b$ is the blurred image;

$\mathbf{y}_s$ is the noisy sharp image;

$\mathbf{H}$ represents a spatial blur;

$\mathbf{n}_b$ and $\mathbf{n}_s$ represent noise, assumed to be zero-mean, white, Gaussian, with variances $\sigma_b^2$ and $\sigma_s^2$, respectively, with $\sigma_b\ll\sigma_s$.
As before, we seek a MAP/variational estimate given by

$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x}}\; \frac{1}{2}\bigl\|\mathbf{y}_b-\mathbf{H}\mathbf{x}\bigr\|_2^2 + \lambda_\varphi\,\varphi(\mathbf{x}), \tag{46}$$

where $\varphi$ is the regularizer, and $\varphi$ and $\lambda_\varphi$ have similar meanings as in (27).
VI-B ADMM for Deblurring with Image Pairs
For this problem formulation, we may use the canonical form of ADMM in (1). Each iteration of the resulting algorithm becomes

$$\mathbf{x}^{k+1} = \arg\min_{\mathbf{x}}\; \frac{1}{2}\bigl\|\mathbf{y}_b-\mathbf{H}\mathbf{x}\bigr\|_2^2 + \frac{\mu}{2}\bigl\|\mathbf{x}-\mathbf{v}^{k}-\mathbf{d}^{k}\bigr\|_2^2, \tag{47}$$
$$\mathbf{v}^{k+1} = \mathrm{prox}_{\lambda_\varphi\varphi/\mu}\bigl(\mathbf{x}^{k+1}-\mathbf{d}^{k}\bigr), \tag{48}$$
$$\mathbf{d}^{k+1} = \mathbf{d}^{k} - \bigl(\mathbf{x}^{k+1}-\mathbf{v}^{k+1}\bigr), \tag{49}$$

where the objective in (47) is quadratic, with a closed-form solution,

$$\mathbf{x}^{k+1} = \bigl(\mathbf{H}^\top\mathbf{H}+\mu\mathbf{I}\bigr)^{-1}\bigl(\mathbf{H}^\top\mathbf{y}_b + \mu(\mathbf{v}^{k}+\mathbf{d}^{k})\bigr). \tag{50}$$

As above, assuming periodic or unknown boundaries, (50) can be implemented with $O(n\log n)$ cost via the FFT [34]. The proximity operator in (48) corresponds to the denoising step, which thus takes the form

$$\mathbf{v}^{k+1} = \mathbf{W}\bigl(\mathbf{x}^{k+1}-\mathbf{d}^{k}\bigr). \tag{51}$$
The proof of convergence of the algorithm in (47)–(49) is essentially the same as that of the previous section, so we abstain from presenting it here; we simply have to notice that (51) is indeed a proximity operator and that (46) is guaranteed to have at least one solution, since its objective is a proper, closed, convex function.
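Under circular (periodic) boundary conditions, $\mathbf{H}$ is diagonalized by the DFT, so the inversion in (50) reduces to pointwise divisions in the frequency domain. A minimal 2-D sketch (illustrative; `psf` is assumed given in image shape, centered at the origin):

```python
import numpy as np

def x_update_fft(yb, z, psf, mu):
    """Closed form of (50), x = (H^T H + mu I)^{-1} (H^T yb + mu z),
    for a circular convolution H, computed via the 2-D FFT."""
    K = np.fft.fft2(psf)                    # eigenvalues of H
    num = np.conj(K) * np.fft.fft2(yb) + mu * np.fft.fft2(z)
    den = np.abs(K) ** 2 + mu
    return np.real(np.fft.ifft2(num / den))

# Tiny consistency check against the normal equations
rng = np.random.default_rng(4)
n, mu = 8, 0.5
psf = np.zeros((n, n)); psf[0, 0] = 0.6; psf[0, 1] = 0.3; psf[1, 0] = 0.1
yb = rng.normal(size=(n, n))
z = rng.normal(size=(n, n))                 # stands for v^k + d^k
x = x_update_fft(yb, z, psf, mu)

conv = lambda img, k: np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(k)))
corr = lambda img, k: np.real(np.fft.ifft2(np.fft.fft2(img) * np.conj(np.fft.fft2(k))))
lhs = corr(conv(x, psf), psf) + mu * x      # (H^T H + mu I) x
rhs = corr(yb, psf) + mu * z                # H^T yb + mu z
```

The same frequency-domain structure underlies the X-update (38) of the HS sharpening algorithm, with the additional subsampling handled as in [34].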
As in the HS sharpening case, $\mathbf{W}$ is not explicitly formed; multiplication by $\mathbf{W}$ corresponds to the patch-based denoiser described in Subsection IV-B, based on the covariance matrices $\mathbf{C}_j$ and weights $\beta_{ij}$. These covariances are learned via EM from the patches of the noisy sharp image, and the weights of the GMM components at each patch are kept and used to denoise the corresponding patch in the blurred image.
VII Experimental Results
VII-A Hyperspectral Sharpening
We begin by reporting an experiment designed to assess the adequacy of the GMM prior. We generate synthetic HS and MS data according to models (24) and (25), respectively, with 50dB SNR (nearly noiseless). Afterwards, we add Gaussian noise of a given variance and recover the clean data using the GMM-based denoiser and BM3D [26]. The results are then compared with respect to visual quality and peak SNR (PSNR). Whereas BM3D estimates the clean image from the noisy one, the GMM is trained only once, on the nearly noiseless PAN image. This prior is then used in all the denoising experiments: noisy PAN, noisy MS band (red channel), noisy HS band, and noisy image of coefficients; see Figs. 8 to 23. The weights are also kept fixed in the four experiments; that is, we use the weights computed in the training stage, relative to the PAN image. In all the experiments, we consider a GMM with 20 components, trained from overlapping patches.
The results support the claim that the GMM trained on the PAN image is a good prior for both the MS and HS bands, and also for the image of coefficients. In fact, when denoising the PAN and MS data, the GMM-based denoiser, with the precomputed model and weights, is able to outperform BM3D (Figs. 8 and 13). Furthermore, when denoising the HS image, the same GMM prior is still able to perform competitively with BM3D, which uses only the noisy input (Figs. 18 and 23). While BM3D performs very well in this pure denoising setting, being able to learn the model from the PAN image is an important feature of the GMM denoiser because, in HS sharpening, the spatial resolution of the HS bands may not be enough to learn a suitable prior. On the other hand, since the PAN image has the highest spatial resolution, it is the best band on which to train the model. Notice that the other MS bands, and also the HS bands and coefficients, retain most of the spatial features of the panchromatic band, which supports our claim that the weights should be kept fixed. Table I shows the results obtained with fixed weights versus weights recomputed from the noisy input patch (i.e., the exact MMSE estimate).
Table I: Denoising PSNR (dB), fixed vs. varying (input-dependent) posterior weights.

Image | Fixed | Varying
PAN | 32.40 | 31.97
Red | 32.52 | 32.31
HS | 35.90 | 35.86
Coefficient | 33.62 | 33.45
Table II: HS sharpening results (ERGAS / SAM / PSNR) for experiments 1–4.

Dataset | Method | Exp. 1 (PAN) | Exp. 2 (PAN) | Exp. 3 (R,G,B,NIR) | Exp. 4 (R,G,B,NIR)
Rosis | Dictionary [18] | 1.97 / 3.25 / 32.80 | 2.04 / 3.14 / 32.22 | 0.47 / 0.84 / 45.66 | 0.87 / 1.57 / 39.46
Rosis | MMSE-GMM [19] | 1.62 / 2.66 / 34.09 | 1.72 / 2.69 / 33.46 | 0.49 / 0.87 / 45.64 | 0.84 / 1.38 / 39.94
Rosis | MAP-GMM [31] | 1.56 / 2.56 / 34.49 | 1.64 / 2.59 / 33.94 | 0.49 / 0.87 / 45.67 | 0.80 / 1.35 / 40.34
Rosis | SA-GMM | 1.55 / 2.53 / 34.57 | 1.64 / 2.57 / 33.96 | 0.49 / 0.87 / 45.68 | 0.80 / 1.34 / 40.31
Moffett | Dictionary [18] | 2.67 / 4.18 / 32.86 | 2.73 / 4.19 / 32.51 | 1.83 / 2.70 / 39.13 | 2.13 / 3.19 / 36.15
Moffett | MMSE-GMM [19] | 2.56 / 4.03 / 32.98 | 2.64 / 4.04 / 32.68 | 1.85 / 2.69 / 39.14 | 1.98 / 2.97 / 36.50
Moffett | MAP-GMM [31] | 2.47 / 3.91 / 33.28 | 2.55 / 3.91 / 32.96 | 1.69 / 2.52 / 39.57 | 2.01 / 2.88 / 36.61
Moffett | SA-GMM | 2.47 / 3.89 / 33.30 | 2.55 / 3.90 / 32.97 | 1.67 / 2.47 / 39.66 | 1.98 / 2.84 / 36.76
Next, we compare the HS sharpening results using the MMSE and MAP GMM-based denoisers from [12] and [31], respectively, the scene-adapted GMM denoiser herein proposed, and the dictionary-based method from [18] (which, to the best of our knowledge, is the state-of-the-art). We use three different metrics: ERGAS (erreur relative globale adimensionnelle de synthèse), SAM (spectral angle mapper), and PSNR [16], in 4 different settings. In the first one, we consider sharpening the HS images using the PAN image, both at 50dB SNR. The second experiment also refers to PAN-sharpening, at 30dB SNR on the PAN image and the last 50 HS bands, and 35dB on all the other bands [18]. Experiments 3 and 4 use the same SNRs as 1 and 2, respectively, but the HS bands are sharpened based on 4 MS bands: R, G, B, and near-infrared. Once again, a GMM with 20 components is learned and used as explained in Subsection V-D. The PAN and MS images were generated from the original HS data using the IKONOS and LANDSAT spectral responses. For more details about the experimental procedure, we refer the reader to [18].
Table II shows the results on a cropped area of the ROSIS Pavia University and Moffett Field datasets. The four methods have comparable performances, with the proposed method being slightly better. The results also support the claim that the target image has the same spatial structure as the PAN image used to train the GMM; otherwise, keeping the posterior weights fixed would not yield good results. The corresponding figures show the results of experiment 4 on the Moffett Field data in terms of visual quality, yet the differences are not visually noticeable. To better grasp the accuracy, we also plot the (sorted) errors for each pixel, as well as pixel values across bands. In all experiments, the parameters of PnP-SALSA, as described in Algorithm 1, were selected using a grid search.