Linearized ADMM and Fast Nonlocal Denoising for Efficient Plug-and-Play Restoration

01/18/2019 ∙ by Unni V. S., et al.

In plug-and-play image restoration, the regularization is performed using powerful denoisers such as nonlocal means (NLM) or BM3D. This is done within the framework of alternating direction method of multipliers (ADMM), where the regularization step is formally replaced by an off-the-shelf denoiser. Each plug-and-play iteration involves the inversion of the forward model followed by a denoising step. In this paper, we present a couple of ideas for improving the efficiency of the inversion and denoising steps. First, we propose to use linearized ADMM, which generally allows us to perform the inversion at a lower cost than standard ADMM. Moreover, we can easily incorporate hard constraints into the optimization framework as a result. Second, we develop a fast algorithm for doubly stochastic NLM, originally proposed by Sreehari et al. (IEEE TCI, 2016), which is about 80x faster than brute-force computation. This particular denoiser can be expressed as the proximal map of a convex regularizer and, as a consequence, we can guarantee convergence for linearized plug-and-play ADMM. We demonstrate the effectiveness of our proposals for super-resolution and single-photon imaging.




1 Introduction

Following the significant progress in image denoising, researchers have experimented with the idea of using modern denoisers for image restoration problems such as deconvolution, deblurring, tomography, and compressed sensing [1, 2, 3, 4]. Generally speaking, these are based on iterative methods, where each iteration involves the inversion of the forward model followed by the application of a powerful denoiser. Despite their empirical success, it has generally been challenging to furnish guarantees on convergence and optimality. The technical hurdle in this regard stems from the fact that state-of-the-art denoisers such as NLM [5] and BM3D [6] are derived from a filtering and not an optimization perspective. To be precise, it is not known if they can be expressed as the proximal map [7] of some regularizer g, i.e., whether the minimizer of

    min_x (1/2)‖x − y‖² + g(x)    (1)

corresponds to the output obtained by denoising y.
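A classical case where the answer is affirmative is soft-thresholding, which is exactly the proximal map of the ℓ1 regularizer. The following self-contained Python check (with arbitrarily chosen y and λ) verifies the prox property numerically on a grid:

```python
import numpy as np

def soft_threshold(y, lam):
    # Proximal map of g(x) = lam*|x|: the closed-form minimizer of
    # (1/2)(x - y)^2 + lam*|x|.
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

# Verify the prox property numerically on a fine 1-D grid.
y, lam = 1.3, 0.5
x_grid = np.linspace(-3, 3, 20001)
objective = 0.5 * (x_grid - y) ** 2 + lam * np.abs(x_grid)
x_star = x_grid[np.argmin(objective)]
assert abs(x_star - soft_threshold(y, lam)) < 1e-3
```

For filtering-based denoisers such as NLM or BM3D, no such closed-form regularizer is known, which is precisely the question raised above.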

It was recently shown in [8] that the existence of such a regularizer can be guaranteed for a doubly stochastic variant of NLM. In particular, it was shown that if g is the regularizer in question and f is the data fidelity term associated with the forward model, then the ADMM-based solution [7] of

    min_x f(x) + g(x)    (2)
provides a framework where the inversion and regularization steps can be decoupled, and the latter amounts to denoising using doubly stochastic NLM (see Section 2 for a detailed description). In fact, the idea of replacing the regularization step in ADMM with an arbitrary denoiser was proposed earlier in [9] under the name “plug-and-play” ADMM (PnP-ADMM), albeit without any convergence guarantees. Later, it was also shown in [10] that the iterates of PnP-ADMM are guaranteed to converge to a fixed point if the denoiser satisfies a certain boundedness property. However, the question of optimality of the fixed point was not resolved in [10]. Moreover, it is not known if any of the existing denoisers satisfy this property.

The present work has two-fold contributions. First, we propose to “linearize” the data term f in the proximal update

    x^{k+1} = argmin_x f(x) + (ρ/2)‖x − (z^k − u^k)‖²    (3)

used in PnP-ADMM [8]. Linearization is used in several algorithms such as ISTA [11], FISTA [12], and ADMM [13]. However, to the best of our knowledge, this has not been exploited for PnP-ADMM. Linearization allows us to perform the inversion updates at a lower cost (without sacrificing convergence guarantees) for applications where (3) has to be computed in an iterative fashion. We note that techniques to cut down the inversion cost have been proposed in [10, 14]. It was shown in [14] that the inversion can be performed efficiently using primal-dual splitting, and fast inversion techniques for specific restoration problems were proposed in [10]. Our proposal requires f to be differentiable. Moreover, we can guarantee convergence if ∇f is Lipschitz continuous. This is indeed the case for a wide range of problems including linear inverse problems [11]. Importantly, we can incorporate hard constraints into the optimization framework, which is difficult to do in PnP-ADMM [8]. We note that even though ∇f fails to be Lipschitz for single-photon imaging [15], linearized ADMM is found to be stable and yields good reconstructions that are comparable with [10] (see Section 4.2).

In a different direction, we also develop a fast low-complexity algorithm for doubly stochastic NLM (DSG-NLM). In [8], the authors conceived DSG-NLM as a matrix multiplication Wu, where u is the vectorized input image and W is a weight matrix computed from patches. However, W is derived from the weight matrix of the original NLM [5] in three steps (see Section 3). As a result, it was not apparent in [8] that Wu can be computed using filtering (aggregation), without storing W. We show that this can indeed be done using aggregation; however, the aggregation needs to be performed not once but thrice. We also use an existing algorithm [16] for efficiently computing the patch distances. Overall, we are able to accelerate the implementation of DSG-NLM by about 80×, and it is for this particular denoiser that PnP-ADMM (and the proposed linearization) comes with convergence and optimality guarantees.

In Section 2, we propose linearized PnP-ADMM and discuss its convergence properties. A filtering perspective of DSG-NLM is presented in Section 3, which is used to develop a fast algorithm. In Section 4, we present results for super-resolution and single-photon imaging. We conclude with a summary of the results in Section 5.

2 Linearized Plug-and-Play ADMM

The plug-and-play framework was originally proposed in [9]. Following this original work, its algorithmic properties and applications have been studied in a series of papers [8, 17, 18, 10, 19, 14]. The core idea is based on the alternating direction method of multipliers (ADMM), which is tailored for solving composite optimization problems of the form in (2), e.g., see [7, 20, 13, 21]. In this work, we will consider the optimization model

    min_{x ∈ C} f(x) + g(x),    (4)

where the data term f forces consistency w.r.t. the measurements, the regularizer g enforces some prior, and C is closed and convex. Constraints of interest include C = [0, ∞)^N (non-negativity constraint) and C = [0, 1]^N (box constraint) [20]. We next split the variable in (4) to obtain the following equivalent problem:

    min_{x, z} f(x) + g(z)  subject to  x = z, x ∈ C.    (5)

The important point here is that though x is constrained, z (used in the denoising step) is a free variable which is indirectly constrained by the relation x = z. The augmented Lagrangian for (5) is given by

    L(x, z, u) = f(x) + g(z) + (ρ/2)‖x − z + u‖² − (ρ/2)‖u‖²,

where u is the (scaled) Lagrange multiplier and ρ > 0 is the penalty parameter [7]. The ADMM solution of (5) consists of the minimization of L over x ∈ C:

    x^{k+1} = argmin_{x ∈ C} f(x) + (ρ/2)‖x − (z^k − u^k)‖²,    (6)

the minimization of L over z:

    z^{k+1} = argmin_z g(z) + (ρ/2)‖z − x̃^k‖²,    (7)

where x̃^k = x^{k+1} + u^k and σ² = 1/ρ, and the dual update

    u^{k+1} = u^k + x^{k+1} − z^{k+1}.

Notice that (7) is the maximum-a-posteriori estimator corresponding to the observation x̃^k = x + n and prior ∝ exp(−g(x)), where n is iid Gaussian with variance σ² = 1/ρ [22]. Different choices of g lead to different denoising methods, such as wavelet and total-variation denoising [20]. The key idea in [9] was to substitute step (7) with

    z^{k+1} = D(x^{k+1} + u^k),    (8)

where D is some powerful denoiser such as NLM or BM3D (optimized to operate at noise level σ) that does not arise from any known regularizer. It was empirically demonstrated in [9, 8, 17, 18, 10, 19] that this ad-hoc modification of the ADMM algorithm produces promising results for image restoration. Using the theory of proximal maps [23], it was later demonstrated in [8] that one can associate a convex regularizer with NLM provided its weight matrix is modified to be symmetric and doubly stochastic, with eigenvalues in [0, 1] (discussed in detail in Section 3). Moreover, it was shown that convergence is guaranteed in this case.

In general, the optimization in (6) has to be performed iteratively, even when f is quadratic. That is, along with the outer iterations, we need to perform inner iterations for (6). This is particularly applicable for the applications in Section 4. Ideally, we would like to replace (6) by a simple low-complexity operation. This can be achieved using a variant of ADMM called linearized ADMM [13]. The idea is to replace f in (6) by its linear approximation around x^k:

    f(x) ≈ f(x^k) + ∇f(x^k)^T (x − x^k) + (μ/2)‖x − x^k‖²,    (9)

where we also append a quadratic term (μ/2)‖x − x^k‖² for technical reasons to be discussed shortly. Substituting (9) in (6), we obtain

    x^{k+1} = Π_C(v^k),

that is, x^{k+1} is the orthogonal projection of v^k onto C, where

    v^k = (μ x^k + ρ(z^k − u^k) − ∇f(x^k)) / (μ + ρ).

We note that Π_C can be easily computed when C is [0, ∞)^N or [0, 1]^N. The assumption with linearization is that f is continuously differentiable and that ∇f is Lipschitz (and easily computable). This is true for a wide range of image restoration problems, including linear inverse problems. The plug-and-play algorithm using linearized ADMM is summarized below.

Input: y, ρ > 0, and μ > 0.
1 Set k = 0 and initialize x^0, z^0, u^0. while not converged do
2       v^k = (μ x^k + ρ(z^k − u^k) − ∇f(x^k))/(μ + ρ), x^{k+1} = Π_C(v^k);
3       z^{k+1} = D(x^{k+1} + u^k), u^{k+1} = u^k + x^{k+1} − z^{k+1}, k ← k + 1;
4 end while
Algorithm 1 Linearized Plug-and-Play ADMM.

If D is DSG-NLM, then Algorithm 1 is guaranteed to converge under some mild assumptions on f. Indeed, as shown in [8], we can associate a closed, proper, and convex regularizer g with D in this case, i.e., (8) corresponds to the ADMM update (7). Algorithm 1 is then simply an application of linearized ADMM to problem (4), and its convergence follows from existing results [13]. Specifically, if f is closed, proper, and convex (and satisfies some mild assumptions [13]), and μ is larger than the Lipschitz constant of ∇f, then (x^k) converges to the optimum of (4).
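As an illustration, the updates of Algorithm 1 can be sketched in Python on a toy 1-D deblurring problem. The moving-average denoiser below is only a placeholder for DSG-NLM, and the choices of ρ, the blur width, and the noise level are illustrative assumptions:

```python
import numpy as np

# Sketch of linearized PnP-ADMM on a toy 1-D deblurring problem.
rng = np.random.default_rng(0)
n = 64
x_true = np.clip(np.cumsum(rng.normal(0, 0.05, n)) + 0.5, 0, 1)
A = np.array([[np.exp(-0.5 * ((i - j) / 2.0) ** 2) for j in range(n)] for i in range(n)])
A /= A.sum(axis=1, keepdims=True)           # row-normalized Gaussian blur
y = A @ x_true + 0.01 * rng.normal(size=n)  # blurred, noisy measurement

grad = lambda x: A.T @ (A @ x - y)          # gradient of f(x) = (1/2)||Ax - y||^2
D = lambda z: np.convolve(z, np.ones(5) / 5, mode="same")  # placeholder denoiser

mu = np.linalg.norm(A.T @ A, 2) * 1.1       # mu above the Lipschitz constant of grad f
rho = 1.0                                   # penalty parameter (tuning assumption)
x = z = u = np.zeros(n)
for _ in range(100):
    v = (mu * x + rho * (z - u) - grad(x)) / (mu + rho)
    x = np.clip(v, 0, 1)                    # projection onto the box constraint C
    z = D(x + u)                            # plug-and-play denoising step
    u = u + x - z                           # dual update
```

Note that the x-update is a single projection; no inner iterations are needed, which is the point of the linearization.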

3 Fast DSG-NLM

We now propose a fast algorithm for DSG-NLM [8]. This denoiser is derived from NLM [5], which uses patches and nonlocal aggregation for denoising. We will follow the notation in [8] for easy comparison. In particular, we will use u and û to denote the input to the denoiser and the output, both defined on a finite domain Ω. Moreover, we will use P_i (a vector of length p²) to denote the patch of size p × p centered at pixel i. In NLM, û(i) is given by

    û(i) = Σ_{j ∈ S(i)} w(i, j) u(j),  where  w(i, j) = k(i, j) / Σ_{l ∈ S(i)} k(i, l)  and  k(i, j) = exp(−‖P_i − P_j‖²/h²),    (10)

where S(i) is a search window of size S × S centered at i [8].

Notice that we can express (10) as a linear transform û = Wu, where W (derived from the weights in (10)) is row stochastic, i.e., the sum of the entries in each row is one. However, due to the division in (10), W is not symmetric in general. It was shown in [8, Section IV] that, in three simple steps, we can transform W into a symmetric, doubly stochastic matrix whose eigenvalues are in [0, 1]. DSG-NLM is simply the transform u ↦ Wu with this modified W. For completeness, we recall these steps, but slightly differently from [8]. In the first step, we set W(i, j) = k(i, j) φ(i − j), where

    k(i, j) = exp(−‖P_i − P_j‖²/h²)    (11)

and φ is the separable hat function. We refer the reader to [8] for the technical reason behind this and the subsequent steps. The next step involves row and column normalization:

    W(i, j) ← W(i, j) / √(α(i) α(j)),  where  α(i) = Σ_j W(i, j).    (12)

We next compute the row sums of the weight matrix (at this point) and set the maximum to δ. The final step is given by

    W(i, j) ← W(i, j)/δ  (i ≠ j),  and  W(i, i) ← 1 − Σ_{j ≠ i} W(i, j)/δ.    (13)
It is proved in [8, Theorem IV.1] that the final matrix W is symmetric and doubly stochastic, and that its eigenvalues are in [0, 1]. As for the implementation, it is impractical to store the large matrix W and apply it to u as a matrix-vector multiplication. The present observation is that we can efficiently compute Wu without having to store W. However, unlike NLM, we need to loop over the pixels three times instead of just once. The complete procedure is provided in Algorithm 2, where the images α and β are of the size of u. The first aggregation corresponds to (11). In the second aggregation, we compute (12) and the maximum δ of the row sums. In the final aggregation, we combine step (13) and the application of W to u.

Input: u.
Output: û = Wu.
1 Initialize α, β, and û with zeros; also set δ = 0;
2 for i ∈ Ω do
3       for j ∈ S(i) do  α(i) ← α(i) + k(i, j) φ(i − j);
4 end for
5 for i ∈ Ω do
6       for j ∈ S(i) do  β(i) ← β(i) + k(i, j) φ(i − j)/√(α(i) α(j));
7       δ ← max(δ, β(i));
8 end for
9 for i ∈ Ω do
10      s ← 0;
11      for j ∈ S(i), j ≠ i do
12            w ← k(i, j) φ(i − j)/(δ √(α(i) α(j)));
13            û(i) ← û(i) + w u(j);
14            s ← s + w;
15      end for
16      û(i) ← û(i) + (1 − s) u(i).
17 end for
Algorithm 2 DSG-NLM Filtering.
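The normalization steps (12) and (13) can also be checked directly in matrix form on a small example. In the sketch below, a generic symmetric nonnegative kernel stands in for the NLM weights of (11); the check confirms symmetry and double stochasticity (the eigenvalue claim of [8, Theorem IV.1] additionally relies on properties of the NLM kernel, which this toy kernel need not satisfy):

```python
import numpy as np

# Steps (12)-(13) applied to a small dense symmetric kernel.
rng = np.random.default_rng(1)
n = 8
K = rng.random((n, n))
W = K + K.T                                # any symmetric nonnegative kernel
alpha = W.sum(axis=1)
W = W / np.sqrt(np.outer(alpha, alpha))    # (12): symmetric row/column normalization
delta = W.sum(axis=1).max()                # maximum row sum
W = W / delta                              # scale so that all row sums are <= 1
W = W + np.diag(1.0 - W.sum(axis=1))       # (13): put the deficit on the diagonal

assert np.allclose(W, W.T)                 # symmetric
assert np.allclose(W.sum(axis=1), 1.0)     # rows sum to one
assert np.allclose(W.sum(axis=0), 1.0)     # columns sum to one
```

Since only the diagonal is corrected in the last step, symmetry is preserved, and unit row sums plus symmetry give unit column sums.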

As with NLM, the complexity of Algorithm 2 is O(S²p²) per pixel. However, as with NLM, we can use the fast algorithm in [16] for computing the patch distances, whose complexity does not depend on the patch size p. This brings the complexity down to O(S²) per pixel. In summary, we can compute DSG-NLM via filtering, and at a reduced complexity.
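The idea behind [16] is that, for a fixed patch offset t, all patch distances ‖P_i − P_{i+t}‖² can be read off from a single integral image of pointwise squared differences, at a cost independent of the patch size. A minimal 1-D Python sketch (the helper name and the 1-D setting are ours, and boundary handling is omitted):

```python
import numpy as np

def patch_distances(u, t, p):
    """Squared distances between length-p patches at i and i + t, for all i."""
    d = (u - np.roll(u, -t)) ** 2              # pointwise squared differences
    S = np.concatenate([[0.0], np.cumsum(d)])  # integral image (prefix sums)
    return S[p:] - S[:-p]                      # sum of d over each length-p window

u = np.arange(10, dtype=float)
dist = patch_distances(u, 2, 3)
# Brute-force check for one interior patch.
i = 1
assert abs(dist[i] - sum((u[i + q] - u[i + 2 + q]) ** 2 for q in range(3))) < 1e-12
```

Each offset t requires one pass over the image, so the per-pixel cost of all distances within the search window is O(S²) rather than O(S²p²).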

4 Experiments

We validate the performance of our restoration algorithm using super-resolution [24] and single-photon imaging [15]. The reason behind this choice is that the data fidelity term is quadratic in super-resolution and non-quadratic (though convex and differentiable) in single-photon imaging. Assuming (without loss of generality) that the intensity of the ground truth is in [0, 1], we set the constraint set C in (4) to be [0, 1]^N. For both applications, we have used BM3D, NLM, and DSG-NLM for denoising. Moreover, similar to [8], we have tried both the adaptive (A-DSG-NLM) and fixed (F-DSG-NLM) variants of DSG-NLM. In the former, the weight matrix W is adapted in each iteration as discussed previously. In F-DSG-NLM, we stop adapting W after a certain number of iterations, and the resulting matrix is used for the rest of the iterations. The point to note is that convergence is guaranteed for F-DSG-NLM but not for A-DSG-NLM, since the output is not a linear function of the input in the latter [8].

In Table 1, we compare the timings of DSG-NLM [8] and Algorithm 2 for a test image. The simulation was done using Matlab on a quad-core machine. We used standard patch and window sizes [5]. The brute-force implementation of nonlocal denoising is known to be prohibitively slow [16]. Notice in Table 1 that the brute-force implementation of DSG-NLM takes minutes, while Algorithm 2 takes just a few seconds. The speedup is almost 80× for large patch sizes. This is because the complexity of our fast algorithm does not scale with the patch size.

4.1 Single image super-resolution

The forward model for single image super-resolution is given by

    y = Ax + η,

where x is the high-resolution image (ground truth), A is a linear operator, and η is iid Gaussian noise [24]. Specifically, Ax is the low-pass filtering of x (using a blur h) followed by downsampling. The adjoint A^T corresponds to upsampling followed by low-pass filtering with h (if h is symmetric) [20]. The problem is to recover x from the low-resolution image y. As is well known [22], the data fidelity term corresponding to the negative log-likelihood of y given x is

    f(x) = (1/2)‖Ax − y‖².    (14)

The gradient and Hessian of (14) are ∇f(x) = A^T(Ax − y) and A^T A. The Lipschitz constant of ∇f is simply the largest eigenvalue of A^T A [20]. If we set μ to be larger than this eigenvalue and use F-DSG-NLM for denoising, then convergence is guaranteed for Algorithm 1.
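In practice, the largest eigenvalue of A^T A can be estimated by power iteration without an explicit eigendecomposition. The sketch below uses a toy blur-then-downsample matrix as a stand-in for the super-resolution forward model; the blur width, image size, and downsampling factor are illustrative assumptions:

```python
import numpy as np

# Power iteration on A^T A to estimate the Lipschitz constant of grad f.
rng = np.random.default_rng(2)
n, factor = 32, 2
blur = np.array([[np.exp(-0.5 * ((i - j) / 1.5) ** 2) for j in range(n)] for i in range(n)])
blur /= blur.sum(axis=1, keepdims=True)
A = blur[::factor]                  # low-pass filter, then downsample rows

v = rng.normal(size=n)
for _ in range(500):                # power iteration on A^T A
    v = A.T @ (A @ v)
    v /= np.linalg.norm(v)
L = v @ (A.T @ (A @ v))             # Rayleigh quotient ~ largest eigenvalue
```

Since only matrix-vector products with A and A^T are used, the same estimate works when A is implemented as blurring and downsampling routines rather than an explicit matrix.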

Figure 1: Super-resolution results for House (periodic boundary) with Gaussian blur and noise: (a) cubic interpolation; (b) Chan [10]; and (c) proposed method. PSNRs (dB) are reported for each.
Table 1: Timings (sec) of the brute-force and fast implementations of DSG-NLM for different patch sizes p and a fixed window size S.
Table 2: Primal and dual residuals and corresponding PSNRs (after 250 iterations) for image super-resolution (periodic boundary).

We now present some results for image super-resolution where the blur h is Gaussian. A typical result is reported in Figure 1, where we also compare with the plug-and-play algorithm in [10]. For both methods, we have used F-DSG-NLM as the denoiser. In [10], the inversion step (6) is computed using fast polyphase decompositions, and ρ is adapted at each iteration. Notice that the reconstructions are comparable, both visually and in terms of PSNR. We note that, unlike Algorithm 1, the fast inversion in [10] works only with periodic convolutions. Following this observation, we perform a super-resolution experiment for the setup in Figure 1, but using a symmetric boundary condition (for the blur). In this case, the inversion step (6) requires the solution of a linear system with coefficient matrix A^T A + ρI. This can be done iteratively using conjugate gradients, which is however computation intensive. The evolution of PSNR with time for standard and linearized plug-and-play ADMM is compared in Figure 2. Notice that the PSNR peaks very fast for linearized ADMM. A possible explanation is that though we use an inexact update for (6), this is partly compensated by the regularization (smoothing) step in each iteration.

Figure 2: PSNR evolution for super-resolution (symmetric boundary) using standard and linearized ADMM.

4.2 Single photon imaging

We next consider single-photon imaging using quanta image sensors (QIS) [15]. For an input image x with N pixels, the QIS consists of KN photon detectors that are uniformly distributed to read the incoming light, where K is the oversampling factor. In a simplified model, the light exposure at each detector is a scaled version of the corresponding pixel of x, determined by the sensor gain and the oversampling factor; the photon count at the detector then follows a Poisson distribution with this exposure as its parameter. The final output is a binary image b obtained by thresholding the photon counts (see [10] for a detailed description). In brief, the negative log-likelihood of observing b given x can be written as

    f(x) = Σ_j f_j(x_j),    (15)

where each f_j is convex and differentiable and depends on b and on known design parameters (the sensor gain and the oversampling factor). Since (15) is separable, its gradient is ∇f(x) = (f_1′(x_1), …, f_N′(x_N)). It can be verified that f_j′ is not bounded near the origin. Therefore, ∇f cannot be Lipschitz. However, Lipschitz continuity is only a sufficient condition for convergence, and Algorithm 1 was in fact found to work stably in our experiments. As shown in Figure 3, the reconstruction obtained using Algorithm 1 is comparable to that obtained using [10].
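To see why ∇f cannot be Lipschitz, consider a simplified one-bit model in which a binary pixel b satisfies P(b = 1) = 1 − exp(−θ), where θ is the (nonnegative) exposure. The exact QIS likelihood in [15, 10] is more involved, but it exhibits the same blow-up of the derivative near the origin:

```python
import numpy as np

# Negative log-likelihood for b = 1 under the simplified one-bit model:
#   f(theta) = -log(1 - exp(-theta)),
# with derivative
#   f'(theta) = -exp(-theta) / (1 - exp(-theta)) ~ -1/theta  as theta -> 0.
fprime = lambda theta: -np.exp(-theta) / (1.0 - np.exp(-theta))

thetas = np.array([1.0, 0.1, 0.01, 0.001])   # exposures approaching zero
vals = np.abs(fprime(thetas))
assert np.all(np.diff(vals) > 0)             # |f'| grows as theta shrinks
assert vals[-1] > 100                        # already very large at theta = 1e-3
```

No single Lipschitz constant can bound a derivative that grows like 1/θ, which is why the convergence guarantee does not apply here, even though the iterations behave stably in practice.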

Figure 3: Single-photon imaging: (a) binary measurements; (b) Chan [10] with F-DSG-NLM; and (c) proposed method with F-DSG-NLM. PSNRs (dB) are reported for each.
Table 3: Same as in Table 2 but for single photon imaging.

For both experiments, we computed the primal and dual residuals [7] and the PSNR for different denoisers. The results are reported in Tables 2 and 3. We observe that the residuals for BM3D, NLM, and to some extent A-DSG-NLM, do not seem to converge fully. On the other hand, the residual for F-DSG-NLM is several orders of magnitude smaller than those of NLM and BM3D; the same holds for single-photon imaging. Interestingly, the PSNR for F-DSG-NLM is better than that of BM3D or NLM (a similar observation was reported in [8]).

5 Conclusion

We proposed a plug-and-play algorithm for image restoration, where both the inversion and denoising steps can be computed efficiently. The proposed algorithm comes with convergence guarantees for linear inverse problems. For super-resolution and single photon imaging, our algorithm was shown to yield reconstructions that are comparable with existing plug-and-play algorithms. Though this was not investigated due to space constraints, we can use accelerated variants of linearized ADMM to further speed up the reconstruction [13]. Another interesting possibility is to use improved variants of nonlocal means [25, 26] for the denoising step.


  • [1] R. Neelamani, H. Choi, and R. Baraniuk, “ForWaRD: Fourier-Wavelet regularized deconvolution for ill-conditioned systems,” IEEE Transactions on Signal Processing, vol. 52, no. 2, pp. 418–433, 2004.
  • [2] J. Tan, Y. Ma, and D. Baron, “Compressive imaging via approximate message passing with image denoising,” IEEE Transactions on Signal Processing, vol. 63, no. 8, pp. 2085–2092, 2015.
  • [3] A. Danielyan, V. Katkovnik, and K. Egiazarian, “BM3D frames and variational image deblurring,” IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 1715–1728, 2012.
  • [4] C. A. Metzler, A. Maleki, and R. G. Baraniuk, “BM3D-AMP: A new image recovery algorithm based on BM3D denoising,” Proc. IEEE International Conference on Image Processing, pp. 3116–3120, 2015.
  • [5] A. Buades, B. Coll, and J. M. Morel, “A non-local algorithm for image denoising,” Proc. IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 60–65, 2005.
  • [6] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising by sparse 3-D transform-domain collaborative filtering,” IEEE Transactions on Image Processing, vol. 16, no. 8, pp. 2080–2095, 2007.
  • [7] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2011.
  • [8] S. Sreehari, S. V. Venkatakrishnan, B. Wohlberg, G. T. Buzzard, L. F. Drummy, J. P. Simmons, and C. A. Bouman, “Plug-and-play priors for bright field electron tomography and sparse interpolation,” IEEE Transactions on Computational Imaging, vol. 2, no. 4, pp. 408–423, 2016.
  • [9] S. V. Venkatakrishnan, C. A. Bouman, and B. Wohlberg, “Plug-and-play priors for model based reconstruction,” Proc. IEEE Global Conference on Signal and Information Processing, pp. 945–948, 2013.
  • [10] S. H. Chan, X. Wang, and O. A. Elgendy, “Plug-and-play ADMM for image restoration: Fixed-point convergence and applications,” IEEE Transactions on Computational Imaging, vol. 3, no. 1, pp. 84–98, 2017.
  • [11] I. Daubechies, M. Defrise, and C. D. Mol, “An iterative thresholding algorithm for linear inverse problems with a sparsity constraint,” Communications on Pure and Applied Mathematics, vol. 57, no. 11, pp. 1413–1457, 2004.
  • [12] A. Beck and M. Teboulle, “Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems,” IEEE Transactions on Image Processing, vol. 18, no. 11, pp. 2419–2434, 2009.
  • [13] Y. Ouyang, Y. Chen, G. Lan, and E. Pasiliao, “An accelerated linearized alternating direction method of multipliers,” SIAM Journal on Imaging Sciences, vol. 8, no. 1, pp. 644–681, 2015.
  • [14] S. Ono, “Primal-dual plug-and-play image restoration,” IEEE Signal Processing Letters, vol. 24, no. 8, pp. 1108–1112, 2017.
  • [15] E. Fossum, “The quanta image sensor (QIS): Concepts and challenges,” OSA, Computational Optical Sensing and Imaging, 2011.
  • [16] J. Darbon, A. Cunha, T. F. Chan, S. Osher, and G. J. Jensen, “Fast nonlocal filtering applied to electron cryomicroscopy,” Proc. IEEE International Symposium on Biomedical Imaging, pp. 1331–1334, 2008.
  • [17] A. Brifman, Y. Romano, and M. Elad, “Turning a denoiser into a super-resolver using plug and play priors,” Proc. IEEE International Conference on Image Processing, pp. 1404–1408, 2016.
  • [18] A. Rond, R. Giryes, and M. Elad, “Poisson inverse problems by the plug-and-play scheme,” Journal of Visual Communication and Image Representation, vol. 41, pp. 96–108, 2016.
  • [19] X. Wang and S. H. Chan, “Parameter-free plug-and-play ADMM for image restoration,” Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 1323–1327, 2017.
  • [20] M. V. Afonso, J. M. Bioucas-Dias, and M. A. Figueiredo, “Fast image recovery using variable splitting and constrained optimization,” IEEE Transactions on Image Processing, vol. 19, no. 9, pp. 2345–2356, 2010.
  • [21] A. M. Teodoro, J. M. Bioucas-Dias, and M. A. Figueiredo, “Image restoration and reconstruction using variable splitting and class-adapted image priors,” Proc. IEEE International Conference on Image Processing, pp. 3518–3522, 2016.
  • [22] B. R. Hunt, “Bayesian methods in nonlinear digital image restoration,” IEEE Transactions on Computers, no. 3, pp. 219–229, 1977.
  • [23] J.-J. Moreau, “Proximité et dualité dans un espace hilbertien,” Bull. Soc. Math. France, vol. 93, no. 2, pp. 273–299, 1965.
  • [24] S. C. Park, M. K. Park, and M. G. Kang, “Super-resolution image reconstruction: a technical overview,” IEEE Signal Processing Magazine, vol. 20, no. 3, pp. 21–36, 2003.
  • [25] Y. Wu, B. Tracey, P. Natarajan, and J. P. Noonan, “James–Stein type center pixel weights for non-local means image denoising,” IEEE Signal Processing Letters, vol. 20, no. 4, pp. 411–414, 2013.
  • [26] S. Ghosh, A. K. Mandal, and K. N. Chaudhury, “Pruned non-local means,” IET Image Processing, vol. 11, no. 5, pp. 317–323, 2017.