1 Introduction
Following the significant progress in image denoising, researchers have experimented with the idea of using modern denoisers for image restoration problems such as deconvolution, deblurring, tomography, and compressed sensing [1, 2, 3, 4]. Generally speaking, these are iterative methods, where each iteration involves an inversion of the forward model followed by the application of a powerful denoiser. Despite their empirical success, it has generally been challenging to furnish guarantees on convergence and optimality. The technical hurdle in this regard stems from the fact that state-of-the-art denoisers such as NLM [5] and BM3D [6] are derived from a filtering rather than an optimization perspective. To be precise, it is not known if they can be expressed as the proximal map [7] of some regularizer $\Phi$, i.e., whether the minimizer of
$$\min_{x}\ \frac{1}{2}\,\|x-y\|^{2}+\lambda\,\Phi(x) \tag{1}$$
corresponds to the output obtained by denoising $y$.
It was recently shown in [8] that the existence of such a regularizer can be guaranteed for a doubly stochastic variant of NLM. In particular, it was shown that if $\Phi$ is the regularizer in question and $f$ is the data fidelity term associated with the forward model, then the ADMM-based solution [7] of
$$\min_{x}\ f(x)+\lambda\,\Phi(x) \tag{2}$$
provides a framework where the inversion and regularization steps can be decoupled, and the latter amounts to denoising using doubly stochastic NLM (see Section 2 for a detailed description). In fact, the idea of replacing the regularization step in ADMM with an arbitrary denoiser was proposed earlier in [9] under the name “plug-and-play” ADMM (PnP-ADMM), albeit without any convergence guarantees. Later, it was shown in [10] that the iterates of PnP-ADMM are guaranteed to converge to a fixed point if the denoiser satisfies a certain boundedness property. However, the question of optimality of the fixed point was not resolved in [10]. Moreover, it is not known if any of the existing denoisers satisfy this property.
The present work has twofold contributions. First, we propose to “linearize” the data term $f$ in the proximal update
$$x^{k+1}=\operatorname*{argmin}_{x}\ f(x)+\frac{\rho}{2}\,\big\|x-\tilde{x}^{k}\big\|^{2} \tag{3}$$
used in PnP-ADMM [8]. Linearization is used in several algorithms such as ISTA [11], FISTA [12], and ADMM [13]. However, to the best of our knowledge, it has not been exploited for PnP-ADMM. Linearization allows us to perform the inversion updates at a lower cost (without sacrificing convergence guarantees) for applications where (3) has to be computed in an iterative fashion. We note that techniques to cut down the inversion cost have been proposed in [10, 14]. It was shown in [14] that the inversion can be performed efficiently using primal-dual splitting, and fast inversion techniques for specific restoration problems were proposed in [10]. Our proposal requires $f$ to be differentiable. Moreover, we can guarantee convergence if $\nabla f$ is Lipschitz continuous. This is indeed the case for a wide range of problems, including linear inverse problems [11]. Importantly, we can incorporate hard constraints in the optimization framework, which is difficult to do in PnP-ADMM [8]. We note that even though $\nabla f$ fails to be Lipschitz for single-photon imaging [15], linearized ADMM is found to be stable and yields good reconstructions that are comparable with [10] (see Section 4.2).
In a different direction, we also develop a fast low-complexity algorithm for doubly stochastic NLM (DSG-NLM). In [8], the authors conceived DSG-NLM as a matrix multiplication $Wx$, where $x$ is the vectorized input image and $W$ is a weight matrix computed from patches. However, $W$ is derived from the weight matrix of original NLM [5] in three steps (see Section 3). As a result, it was not apparent in [8] that $Wx$ can be computed using filtering (aggregation), without storing $W$. We show that this can indeed be done; however, the aggregation needs to be performed not once but thrice. We also use an existing algorithm [16] for efficiently computing the patch distances. Overall, we are able to significantly accelerate the implementation of DSG-NLM, for which PnP-ADMM (and the proposed linearization) comes with convergence and optimality guarantees.

In Section 2, we propose linearized PnP-ADMM and discuss its convergence properties. A filtering perspective of DSG-NLM is presented in Section 3, which is used to develop a fast algorithm. In Section 4, we present results for super-resolution and single-photon imaging. We conclude with a summary of the results in Section 5.
2 Linearized Plug-and-Play ADMM
The plug-and-play framework was originally proposed in [9]. Following this original work, its algorithmic properties and applications have been studied in a series of papers [8, 17, 18, 10, 19, 14]. The core idea is based on the alternating direction method of multipliers (ADMM), which is tailored for solving composite optimization problems of the form in (2), e.g., see [7, 20, 13, 21]. In this work, we will consider the optimization model
$$\min_{x\in\mathcal{C}}\ f(x)+\lambda\,\Phi(x), \tag{4}$$
where the data term $f$ forces consistency w.r.t. the measurements, the regularizer $\Phi$ enforces some prior, and the constraint set $\mathcal{C}\subset\mathbb{R}^{n}$ is closed and convex. Constraints of interest include $\mathcal{C}=\mathbb{R}^{n}_{+}$ (non-negativity constraint) and $\mathcal{C}=[0,1]^{n}$ (box constraint) [20]. We next split the variable in (4) to obtain the following equivalent problem:
$$\min_{x\in\mathcal{C},\,z}\ f(x)+\lambda\,\Phi(z)\qquad\text{subject to}\quad x=z. \tag{5}$$
The important point here is that though $x$ is constrained, $z$ (used in the denoising step) is a free variable, which is indirectly constrained through the relation $x=z$. The augmented Lagrangian for (5) is given by
$$\mathcal{L}_{\rho}(x,z,u)=f(x)+\lambda\,\Phi(z)+\frac{\rho}{2}\,\|x-z+u\|^{2}-\frac{\rho}{2}\,\|u\|^{2},$$
where $u$ is the (scaled) Lagrange multiplier and $\rho>0$ is the penalty parameter [7]. The ADMM solution of (5) consists of the minimization of $\mathcal{L}_{\rho}$ over $x$:
$$x^{k+1}=\operatorname*{argmin}_{x\in\mathcal{C}}\ f(x)+\frac{\rho}{2}\,\big\|x-(z^{k}-u^{k})\big\|^{2}; \tag{6}$$
the minimization of $\mathcal{L}_{\rho}$ over $z$:
$$z^{k+1}=\operatorname*{argmin}_{z}\ \lambda\,\Phi(z)+\frac{\rho}{2}\,\big\|z-\tilde{z}^{k}\big\|^{2}, \tag{7}$$
where $\tilde{z}^{k}=x^{k+1}+u^{k}$ and $\sigma^{2}=\lambda/\rho$, and the dual update
$$u^{k+1}=u^{k}+x^{k+1}-z^{k+1}.$$
Notice that (7) is the maximum-a-posteriori estimator corresponding to the observation $\tilde{z}^{k}=z+n$ and prior $\Phi$, where $n$ is iid Gaussian with variance $\sigma^{2}$ [22]. Different choices of $\Phi$ lead to different denoising methods, such as wavelet and total-variation denoising [20]. The key idea in [9] was to substitute step (7) with
$$z^{k+1}=D\big(x^{k+1}+u^{k}\big), \tag{8}$$
where $D$ is some powerful denoiser such as NLM or BM3D (optimized to operate at the noise level $\sigma$) that does not arise from any known regularizer. It was empirically demonstrated in [9, 8, 17, 18, 10, 19] that this ad-hoc modification of the ADMM algorithm produces promising results for image restoration. Using the theory of proximal maps [23], it was later demonstrated in [8]
that one can associate a convex regularizer $\Phi$ with NLM provided its weight matrix is modified to be symmetric and doubly stochastic, with eigenvalues in $[0,1]$ (discussed in detail in Section 3). Moreover, it was shown that convergence is guaranteed in this case.

In general, the optimization in (6) has to be performed iteratively, even when $f$ is quadratic. That is, along with the outer iterations, we need to perform inner iterations for (6). This is particularly applicable for the applications in Section 4. Ideally, we would like to replace (6) by a simple low-complexity operation. This can be achieved using a variant of ADMM called linearized ADMM [13]. The idea is to replace $f$ by its linear approximation around $x^{k}$:
$$f(x)\;\approx\;f(x^{k})+\nabla f(x^{k})^{\top}(x-x^{k})+\frac{\mu}{2}\,\|x-x^{k}\|^{2}, \tag{9}$$
where we also append a quadratic term (with parameter $\mu>0$) for technical reasons to be discussed shortly. Substituting (9) in (6), we obtain
$$x^{k+1}=\Pi_{\mathcal{C}}\big(v^{k}\big), \qquad v^{k}=\frac{\mu x^{k}-\nabla f(x^{k})+\rho\,(z^{k}-u^{k})}{\mu+\rho};$$
that is, $x^{k+1}$ is the orthogonal projection of $v^{k}$ onto $\mathcal{C}$.
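When the constraint set is a box, the projection amounts to a simple clipping, so the linearized update costs just one gradient evaluation per iteration. Below is a minimal NumPy sketch of this update; the function and variable names are ours, not from the paper:

```python
import numpy as np

def linearized_x_update(x, z, u, grad_f, mu, rho, lo=0.0, hi=1.0):
    """One linearized x-update: minimize the linearized data term plus the
    ADMM quadratic, then project onto the box [lo, hi]^n."""
    # unconstrained minimizer of
    #   grad_f(x)^T (x' - x) + (mu/2)||x' - x||^2 + (rho/2)||x' - (z - u)||^2
    v = (mu * x - grad_f(x) + rho * (z - u)) / (mu + rho)
    # orthogonal projection onto the box constraint set
    return np.clip(v, lo, hi)
```

Note that the update is a projected convex combination of the previous iterate, a gradient step, and the ADMM point, which is what makes it cheap compared to solving (6) exactly.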
We note that $\Pi_{\mathcal{C}}$ can be easily computed when $\mathcal{C}$ is $[0,1]^{n}$ or $\mathbb{R}^{n}_{+}$. The assumption behind linearization is that $f$ is continuously differentiable and $\nabla f$ is Lipschitz (and easily computable). This is true for a wide range of image restoration problems, including linear inverse problems. The plug-and-play algorithm using linearized ADMM is summarized in Algorithm 1 below.
If $D$ is DSG-NLM, then Algorithm 1 is guaranteed to converge under some mild assumptions on $f$. Indeed, as shown in [8], we can associate a closed, proper, and convex regularizer $\Phi$ with $D$ in this case, i.e., (8) corresponds to the ADMM update (7). Algorithm 1 is then simply an application of linearized ADMM to problem (4), and its convergence follows from existing results [13]. Specifically, if $f$ is closed, proper, and convex (and satisfies some mild assumptions [13]), and $\mu$ is larger than the Lipschitz constant of $\nabla f$, then $(x^{k})$ converges to the optimum of (4).
3 Fast DSG-NLM
We now propose a fast algorithm for DSG-NLM [8]. This denoiser is derived from NLM [5], which uses patches and nonlocal aggregation for denoising. We will follow the notation in [8] for easy comparison. In particular, we will use $x$ and $\hat{x}$ to denote the input and output of the denoiser, both defined on a finite domain $\Omega$. Moreover, we will use $P_{i}$ (a vector of length $k^{2}$) to denote a patch of size $k\times k$ centered at pixel $i$. In NLM, $\hat{x}(i)$ is given by
$$\hat{x}(i)=\frac{\sum_{j\in S(i)}w_{ij}\,x(j)}{\sum_{j\in S(i)}w_{ij}}, \qquad w_{ij}=\exp\!\big(-\|P_{i}-P_{j}\|^{2}/h^{2}\big), \tag{10}$$
where $S(i)$ is a search window of size $S\times S$ centered at pixel $i$, and $h$ is a smoothing parameter [8].
Notice that we can express (10) as a linear transform $\hat{x}=Wx$, where $W$ (derived from the weights $w_{ij}$) is row stochastic, i.e., the sum of the entries in each row is one. However, due to the division in (10), $W$ is not symmetric in general. It was shown in [8, Section IV] that, in three simple steps, we can transform $W$ into a symmetric, doubly stochastic matrix whose eigenvalues are in $[0,1]$. DSG-NLM is simply the transform $\hat{x}=Wx$ with this modified matrix. For completeness, we recall these steps, but slightly differently from [8]. In the first step, we set
$$W_{ij}=\phi(i-j)\,w_{ij}, \tag{11}$$
where $\phi$ is the separable hat function. We refer the reader to [8] for the technical reason behind this and the subsequent steps. The next step involves row and column normalization:
$$W_{ij}\;\leftarrow\;\frac{W_{ij}}{\sqrt{\big(\sum_{l}W_{il}\big)\big(\sum_{l}W_{jl}\big)}}. \tag{12}$$
We next compute the row sums $r_{i}$ of the weight matrix (at this point) and set $\alpha$ to the maximum row sum. The final step is given by
$$W_{ij}\;\leftarrow\;\frac{W_{ij}}{\alpha}+\Big(1-\frac{r_{i}}{\alpha}\Big)\,\delta_{ij}, \tag{13}$$
where $\delta_{ij}$ is the Kronecker delta.
It is proved in [8, Theorem IV.1] that the final matrix $W$ is symmetric and doubly stochastic, and that its eigenvalues are in $[0,1]$. As for the implementation, it is impractical to store the large matrix $W$ and apply it to $x$ as a matrix-vector multiplication. The present observation is that we can efficiently compute $Wx$ without having to store $W$. However, unlike NLM, we need to loop over the pixels three times instead of just once. The complete procedure is provided in Algorithm 2, where the auxiliary images are of the same size as the input image. The first aggregation corresponds to (11). In the second aggregation, we compute (12) and the maximum of the row sums. In the final aggregation, we combine step (13) and the application of $W$ on $x$.
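On a small image, the three steps can be applied directly to a stored weight matrix, which is a convenient way to check the claimed structure. The NumPy sketch below does this under our own naming; Algorithm 2 performs the same computation by aggregation, without ever forming the matrix:

```python
import numpy as np

def dsg_from_kernel(K):
    """Apply steps (12)-(13) to a symmetric nonnegative kernel matrix K
    (the hat-weighted NLM weights of step (11)) to obtain a symmetric,
    doubly stochastic matrix W."""
    # step (12): symmetric row/column normalization
    d = K.sum(axis=1)
    W = K / np.sqrt(np.outer(d, d))
    # step (13): scale by the largest row sum and correct the diagonal
    r = W.sum(axis=1)
    alpha = r.max()
    W = W / alpha
    W[np.diag_indices_from(W)] += 1.0 - r / alpha
    return W
```

Starting from any symmetric kernel with a positive diagonal, the output has unit row and column sums, so applying $W$ to a constant image leaves it unchanged.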
Similar to NLM, it is evident that the complexity of Algorithm 2 is $O(S^{2}k^{2})$ per pixel. However, also as with NLM, we can use the fast algorithm in [16] for computing the patch distances, whose complexity does not depend on the patch size $k$. This brings the complexity down to $O(S^{2})$ per pixel. In summary, we can compute DSG-NLM via filtering, and at a reduced complexity.
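The trick of [16] can be illustrated in one dimension: for a fixed patch offset, the patch distances are moving sums of a pointwise squared-difference signal, and a single cumulative sum yields all of them at constant cost per sample. The sketch below uses our own naming and assumes a periodic boundary for simplicity:

```python
import numpy as np

def patch_dists_1d(x, t, f):
    """Squared patch distances between the patch at i and the patch at i + t
    (patch half-width f) for a 1-D signal, using the running-sum idea of
    [16]: O(1) per sample, independent of the patch size."""
    n = len(x)
    e = (x - np.roll(x, -t)) ** 2          # pointwise squared differences
    c = np.concatenate(([0.0], np.cumsum(e)))
    d = np.empty(n)
    for i in range(n):
        a, b = max(i - f, 0), min(i + f, n - 1)
        d[i] = c[b + 1] - c[a]             # sum of e over [i - f, i + f]
    return d
```

Looping this over all offsets in the search window gives every patch distance needed by (10) without ever touching individual patch entries.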
4 Experiments
We validate the performance of our restoration algorithm using super-resolution [24] and single photon imaging [15]. The reason behind this choice is that the data fidelity term is quadratic in super-resolution and non-quadratic (though convex and differentiable) in single photon imaging. Assuming (without loss of generality) that the intensity of the ground truth is in $[0,1]$, we set the constraint set in (4) to be $\mathcal{C}=[0,1]^{n}$. For both applications, we have used BM3D, NLM, and DSG-NLM for denoising. Moreover, similar to [8], we have tried both the adaptive (A-DSG-NLM) and fixed (F-DSG-NLM) variants of DSG-NLM. In the former, the weight matrix $W$ is adapted in each iteration, as discussed previously. In F-DSG-NLM, we stop adapting $W$ after a fixed number of initial iterations, and the resulting matrix is used for the rest of the iterations. The point to note is that convergence is guaranteed for F-DSG-NLM but not for A-DSG-NLM, since the output is not a linear function of the input in the latter [8].
In Table 1, we compare the timings of brute-force DSG-NLM [8] and Algorithm 2. The simulation was done using Matlab on a quad-core machine. We used standard patch and window sizes [5]. The brute-force implementation of nonlocal denoising is known to be prohibitively slow [16]. Notice in Table 1 that the brute-force implementation of DSG-NLM takes minutes, while Algorithm 2 takes just a few seconds. The speedup is most pronounced for large patch sizes. This is because the complexity of our fast algorithm does not scale with the patch size.
4.1 Single image super-resolution
The forward model for single image super-resolution is given by
$$y=Ax+\eta,$$
where $x$ is the high-resolution image (ground truth), $A$ is a linear operator, and $\eta$ is iid Gaussian noise [24]. Specifically, $Ax$ is the lowpass filtering of $x$ (using a blur $b$) followed by downsampling. The adjoint $A^{\top}$ corresponds to upsampling followed by lowpass filtering with $b$ (if $b$ is symmetric) [20]. The problem is to recover $x$ from the low-resolution image $y$. As is well-known [22], the data fidelity term corresponding to the negative log-likelihood of the observation is
$$f(x)=\frac{1}{2}\,\|y-Ax\|^{2}. \tag{14}$$
The gradient and Hessian of (14) are $\nabla f(x)=A^{\top}(Ax-y)$ and $\nabla^{2}f=A^{\top}A$. The Lipschitz constant of $\nabla f$ is simply the largest eigenvalue of $A^{\top}A$ [20]. If we set $\mu$ larger than this eigenvalue and use F-DSG-NLM for denoising, then convergence is guaranteed for Algorithm 1.
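The largest eigenvalue of $A^{\top}A$ can be estimated without forming the matrix, e.g., by power iteration using only applications of $A$ and its adjoint. A NumPy sketch with our own naming (shown with an explicit matrix for simplicity):

```python
import numpy as np

def lipschitz_of_grad(A, iters=100, seed=0):
    """Estimate the Lipschitz constant of grad f for f(x) = 0.5 ||Ax - y||^2,
    i.e. the largest eigenvalue of A^T A, by power iteration. Only the
    actions of A and A^T are needed, so A can be an implicit operator."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = A.T @ (A @ v)
        v = w / np.linalg.norm(w)
    return v @ (A.T @ (A @ v))    # Rayleigh quotient at convergence
```

For blur-plus-downsampling operators the same loop works with the matrix products replaced by filtering and (up/down)sampling routines.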
Figure 1: Single image super-resolution with Gaussian blur and noise (periodic boundary): (a) cubic interpolation; (b) Chan [10]; (c) proposed.
Table 1: Timings of the proposed (Algorithm 2) and brute-force implementations of DSG-NLM for different patch sizes.
Table 2: Primal and dual residuals and PSNRs for BM3D, NLM, A-DSG-NLM, and F-DSG-NLM (super-resolution).
We now present some results for image super-resolution where the blur is Gaussian. A typical result is reported in Figure 1, where we have also compared with the plug-and-play algorithm in [10]. For both methods, we have used F-DSG-NLM as the denoiser. In [10], the inversion step (6) is computed using fast polyphase decompositions, and $\rho$ is adapted at each iteration. Notice that the reconstructions are comparable, both visually and in terms of PSNR. We note that, unlike Algorithm 1, the fast inversion in [10] works only with periodic convolutions. Following this observation, we repeat the super-resolution experiment for the setup in Figure 1, but using a symmetric boundary condition (for the blur). In this case, the inversion step (6) requires the solution of a linear system with system matrix $A^{\top}A+\rho I$. This can be done iteratively using conjugate gradients, which is however computationally intensive. The evolution of PSNR with time for standard and linearized plug-and-play ADMM is compared in Figure 2. Notice that the PSNR peaks very quickly with linearized ADMM. A possible explanation is that, though we use an inexact update for (6), this is partly compensated by the regularization (smoothing) step in each iteration.
4.2 Single photon imaging
We next consider single photon imaging using quanta image sensors (QIS) [15]. For an input image $x$ with $n$ pixels, the QIS consists of $Kn$ photon detectors, which are uniformly distributed to read the incoming light ($K$ is the oversampling factor). A simplified model for the light exposure at the QIS detectors corresponding to image pixel $i$ is $\alpha x_{i}/K$, where $\alpha$ is the sensor gain. More precisely, the photon count at such a detector follows a Poisson distribution with parameter $\alpha x_{i}/K$. The final output is a binary image obtained by thresholding the photon counts (see [10] for a detailed description). In brief, the negative log-likelihood of the binary measurements given $x$ is
$$f(x)=\sum_{i=1}^{n}\Big[K^{0}_{i}\,\frac{\alpha x_{i}}{K}-K^{1}_{i}\,\log\!\big(1-e^{-\alpha x_{i}/K}\big)\Big], \tag{15}$$
where $K^{0}_{i}$ and $K^{1}_{i}$ denote the number of zero and one measurements at pixel $i$, and $\alpha$ and $K$ are known design parameters. Since (15) is separable, its gradient is $\nabla f(x)=\big(f_{1}'(x_{1}),\ldots,f_{n}'(x_{n})\big)$, where
$$f_{i}'(t)=\frac{\alpha}{K}\left(K^{0}_{i}-\frac{K^{1}_{i}}{e^{\alpha t/K}-1}\right).$$
It can be verified that $f_{i}'$ is not bounded near the origin. Therefore, $\nabla f$ cannot be Lipschitz. However, this is only a sufficient condition for convergence, and Algorithm 1 was in fact found to work stably in our experiments. As shown in Figure 3, the reconstruction obtained using Algorithm 1 is comparable to that obtained using [10].
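Under this reading of the model in [10], the unboundedness near the origin is easy to check numerically. In the sketch below, the arrays `K0`, `K1` and the chosen constants are our own illustrative assumptions, not values from the paper:

```python
import numpy as np

def qis_grad(x, K0, K1, alpha, K):
    """Gradient of the (simplified) QIS negative log-likelihood, assuming
    K0 and K1 count the zero/one binary measurements at each pixel.
    The 1/(exp(.) - 1) term blows up as x -> 0, so the gradient is not
    Lipschitz near the origin."""
    theta = alpha / K
    # expm1 is the numerically stable evaluation of exp(theta * x) - 1
    return theta * (K0 - K1 / np.expm1(theta * x))
```

A finite-difference check against the separable objective confirms the formula, and evaluating the gradient near zero exhibits the blow-up.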
Table 3: Primal and dual residuals and PSNRs for BM3D, NLM, A-DSG-NLM, and F-DSG-NLM (single photon imaging).
For both experiments, we computed the primal and dual residuals [7] and the PSNR for different denoisers. The results are reported in Tables 2 and 3. We observe that the residuals for BM3D, NLM, and to some extent A-DSG-NLM, do not seem to converge fully. On the other hand, the residuals for F-DSG-NLM are several orders of magnitude smaller than those of NLM and BM3D, for both experiments. Interestingly, the PSNR for F-DSG-NLM is better than that of BM3D or NLM (a similar observation was reported in [8]).
5 Conclusion
We proposed a plug-and-play algorithm for image restoration in which both the inversion and denoising steps can be computed efficiently. The proposed algorithm comes with convergence guarantees for linear inverse problems. For super-resolution and single photon imaging, our algorithm was shown to yield reconstructions that are comparable with those of existing plug-and-play algorithms. Though not investigated here due to space constraints, accelerated variants of linearized ADMM could be used to further speed up the reconstruction [13]. Another interesting possibility is to use improved variants of nonlocal means [25, 26] for the denoising step.
References
 [1] R. Neelamani, H. Choi, and R. Baraniuk, “ForWaRD: Fourier-wavelet regularized deconvolution for ill-conditioned systems,” IEEE Transactions on Signal Processing, vol. 52, no. 2, pp. 418–433, 2004.
 [2] J. Tan, Y. Ma, and D. Baron, “Compressive imaging via approximate message passing with image denoising,” IEEE Transactions on Signal Processing, vol. 63, no. 8, pp. 2085–2092, 2015.
 [3] A. Danielyan, V. Katkovnik, and K. Egiazarian, “BM3D frames and variational image deblurring,” IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 1715–1728, 2012.
 [4] C. A. Metzler, A. Maleki, and R. G. Baraniuk, “BM3D-AMP: A new image recovery algorithm based on BM3D denoising,” Proc. IEEE International Conference on Image Processing, pp. 3116–3120, 2015.

 [5] A. Buades, B. Coll, and J. M. Morel, “A non-local algorithm for image denoising,” Proc. IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 60–65, 2005.
 [6] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising by sparse 3D transform-domain collaborative filtering,” IEEE Transactions on Image Processing, vol. 16, no. 8, pp. 2080–2095, 2007.

 [7] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2011.
 [8] S. Sreehari, S. V. Venkatakrishnan, B. Wohlberg, G. T. Buzzard, L. F. Drummy, J. P. Simmons, and C. A. Bouman, “Plug-and-play priors for bright field electron tomography and sparse interpolation,” IEEE Transactions on Computational Imaging, vol. 2, no. 4, pp. 408–423, 2016.
 [9] S. V. Venkatakrishnan, C. A. Bouman, and B. Wohlberg, “Plug-and-play priors for model based reconstruction,” Proc. IEEE Global Conference on Signal and Information Processing, pp. 945–948, 2013.
 [10] S. H. Chan, X. Wang, and O. A. Elgendy, “Plug-and-play ADMM for image restoration: Fixed-point convergence and applications,” IEEE Transactions on Computational Imaging, vol. 3, no. 1, pp. 84–98, 2017.
 [11] I. Daubechies, M. Defrise, and C. De Mol, “An iterative thresholding algorithm for linear inverse problems with a sparsity constraint,” Communications on Pure and Applied Mathematics, vol. 57, no. 11, pp. 1413–1457, 2004.
 [12] A. Beck and M. Teboulle, “Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems,” IEEE Transactions on Image Processing, vol. 18, no. 11, pp. 2419–2434, 2009.
 [13] Y. Ouyang, Y. Chen, G. Lan, and E. Pasiliao, “An accelerated linearized alternating direction method of multipliers,” SIAM Journal on Imaging Sciences, vol. 8, no. 1, pp. 644–681, 2015.
 [14] S. Ono, “Primal-dual plug-and-play image restoration,” IEEE Signal Processing Letters, vol. 24, no. 8, pp. 1108–1112, 2017.
 [15] E. Fossum, “The quanta image sensor (QIS): Concepts and challenges,” OSA, Computational Optical Sensing and Imaging, 2011.
 [16] J. Darbon, A. Cunha, T. F. Chan, S. Osher, and G. J. Jensen, “Fast nonlocal filtering applied to electron cryomicroscopy,” Proc. IEEE International Symposium on Biomedical Imaging, pp. 1331–1334, 2008.
 [17] A. Brifman, Y. Romano, and M. Elad, “Turning a denoiser into a super-resolver using plug and play priors,” Proc. IEEE International Conference on Image Processing, pp. 1404–1408, 2016.
 [18] A. Rond, R. Giryes, and M. Elad, “Poisson inverse problems by the plug-and-play scheme,” Journal of Visual Communication and Image Representation, vol. 41, pp. 96–108, 2016.
 [19] X. Wang and S. H. Chan, “Parameter-free plug-and-play ADMM for image restoration,” Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 1323–1327, 2017.
 [20] M. V. Afonso, J. M. BioucasDias, and M. A. Figueiredo, “Fast image recovery using variable splitting and constrained optimization,” IEEE Transactions on Image Processing, vol. 19, no. 9, pp. 2345–2356, 2010.
 [21] A. M. Teodoro, J. M. BioucasDias, and M. A. Figueiredo, “Image restoration and reconstruction using variable splitting and classadapted image priors,” Proc. IEEE International Conference on Image Processing, pp. 3518–3522, 2016.
 [22] B. R. Hunt, “Bayesian methods in nonlinear digital image restoration,” IEEE Transactions on Computers, no. 3, pp. 219–229, 1977.
 [23] J.J. Moreau, “Proximité et dualité dans un espace hilbertien,” Bull. Soc. Math. France, vol. 93, no. 2, pp. 273–299, 1965.
 [24] S. C. Park, M. K. Park, and M. G. Kang, “Super-resolution image reconstruction: a technical overview,” IEEE Signal Processing Magazine, vol. 20, no. 3, pp. 21–36, 2003.
 [25] Y. Wu, B. Tracey, P. Natarajan, and J. P. Noonan, “James–Stein type center pixel weights for nonlocal means image denoising,” IEEE Signal Processing Letters, vol. 20, no. 4, pp. 411–414, 2013.
 [26] S. Ghosh, A. K. Mandal, and K. N. Chaudhury, “Pruned nonlocal means,” IET Image Processing, vol. 11, no. 5, pp. 317–323, 2017.