A variety of image restoration problems, such as superresolution, deblurring, compressed sensing, tomography, etc. [1], are modeled as optimization problems of the form

    min_x f(x) + λ g(x),    (1)

where the data-fidelity term f is derived from the degradation and noise models, while the regularizer g is derived from some prior on the ground-truth image. Traditionally, the regularizer is a sparsity-promoting function in some transform domain [2]. In recent years, researchers have explored the possibility of using powerful Gaussian denoisers such as NLM [3] and BM3D [4] for regularization purposes. In [5, 6], the regularizer is explicitly constructed from a denoiser. On the other hand, in plug-and-play (PnP) methods [7, 8, 9, 10], the denoiser is formally substituted in place of the proximal operator in iterative algorithms such as FISTA [11], primal-dual splitting [12], and ADMM [13].
The ADMM updates for (1) are given by

    x_{k+1} = argmin_x f(x) + (ρ/2) ‖x − (v_k − u_k)‖²,    (2)
    v_{k+1} = argmin_v λ g(v) + (ρ/2) ‖v − (x_{k+1} + u_k)‖²,    (3)
    u_{k+1} = u_k + x_{k+1} − v_{k+1},    (4)

where ρ > 0 is a penalty parameter and ‖·‖ is the Euclidean norm (this is the rescaled form of ADMM). If f and g are convex, then under some technical conditions, the iterates (x_k, v_k, u_k) are guaranteed to converge to a fixed point, which is the global minimizer of (1). Now, (3) corresponds to regularized Gaussian denoising of x_{k+1} + u_k, where g assumes the role of the regularizer. Based on this observation, the original proposal in [7] was to replace the proximal operation (3) with an off-the-shelf denoiser, i.e., the v-update is replaced by v_{k+1} = D(x_{k+1} + u_k), where D is the denoiser in question. The idea is simply to exploit the excellent denoising capability of state-of-the-art denoisers for restoration, even though we might not be able to conceive them as proximal operators (of some regularizer). We refer the readers to [7, 8] for a detailed account. The technical challenge, however, is that the resulting sequence of operations, referred to as PnP-ADMM, need not correspond to an optimization problem. As a result, the convergence of the iterates is at stake. In particular, we can no longer relate the iterations to the optimization in (1) and use existing results [13] to ensure convergence. Nevertheless, PnP-ADMM is often found to converge empirically and yields high-quality reconstructions in several applications [7, 8, 14, 16]. Among other things, questions relating to the convergence and optimality of PnP-type methods have been studied in recent works. In [8], convergence guarantees were derived for a kernel-based denoiser within PnP-ADMM. Later, it was shown in [14] that convergence can be ensured for a broad class of denoisers. Apart from ADMM, PnP algorithms based on various iterative methods have been explored in [9, 10, 17, 18, 19, 20]. We note that denoisers have also been used for regularization purposes in [21, 22, 23, 24, 25, 26, 27]. The relation of PnP-ADMM with graph Laplacian-based regularization was investigated in [28], whereas in [29] a framework motivated by PnP, called Consensus Equilibrium, was proposed.
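To make the plug-and-play substitution concrete, here is a minimal sketch (not the implementation of [7] or [14]): the x-update is the exact proximal step for a least-squares data term, and a soft-thresholding function stands in for the off-the-shelf denoiser; the problem sizes, parameters, and the `soft` denoiser are illustrative assumptions.

```python
import numpy as np

def pnp_admm(A, b, denoise, rho=1.0, iters=50):
    """Sketch of PnP-ADMM: the proximal step (3) for the regularizer
    is replaced by a call to an off-the-shelf denoiser."""
    n = A.shape[1]
    x, v, u = np.zeros(n), np.zeros(n), np.zeros(n)
    AtA, Atb = A.T @ A, A.T @ b
    for _ in range(iters):
        # x-update: exact proximal step for f(x) = 0.5*||Ax - b||^2
        x = np.linalg.solve(AtA + rho * np.eye(n), Atb + rho * (v - u))
        # v-update: denoiser substituted for the proximal operator of g
        v = denoise(x + u)
        # u-update: (scaled) dual variable
        u = u + x - v
    return x

# Stand-in denoiser: soft-thresholding. Any Gaussian denoiser
# (NLM, BM3D, ...) could be plugged in here instead.
def soft(z, t=0.05):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 20))
x_true = np.zeros(20)
x_true[:3] = 1.0
b = A @ x_true + 0.01 * rng.standard_normal(40)
x_hat = pnp_admm(A, b, soft)
```

With this particular denoiser, the scheme happens to coincide with classical ADMM for an L1-regularized problem; with a generic denoiser, no such objective need exist, which is exactly the convergence issue discussed above.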
In this letter, we revisit the proof of convergence of the PnP-ADMM algorithm in [14] and address an inadequacy therein. It was proved that, under suitable assumptions, the sequence of iterates generated by this algorithm converges to a fixed point, for any arbitrary initialization. Instead of a fixed ρ, an adaptive ρ_k is used in [14], which plays an important role in the proof. However, this necessitates a case-by-case analysis conditioned on the adaptation rule (see Section II for details). Of the three cases considered in the paper, convergence was proved for the first two. It was claimed that convergence for the third case automatically follows from that of the first two. However, we argue that this is generally not true, and hence a separate proof is needed for the third case. We give such a proof, which differs from the proof for the first two cases in [14]. In particular, we show that the difference between successive iterates is bounded by a piecewise geometric sequence, as opposed to a geometric sequence for the first two cases. We prove that this sequence is summable, which is then used to show that the iterates form a Cauchy sequence (and hence converge).
As mentioned earlier, the updates in PnP-ADMM [14] are modeled on the ADMM updates (2)–(4), with the following changes: a denoiser is used in the v-update, and ρ is updated in each iteration. In particular, the updates are given by

    x_{k+1} = argmin_x f(x) + (ρ_k/2) ‖x − (v_k − u_k)‖²,    (5)
    v_{k+1} = D_{σ_k}(x_{k+1} + u_k),    (6)
    u_{k+1} = u_k + x_{k+1} − v_{k+1},    (7)
where σ_k = √(λ/ρ_k). Here, D_σ is a denoising operator, where the parameter σ controls its denoising action. It was proposed to update ρ_k based on the residual

    Δ_{k+1} = d(θ_{k+1}, θ_k),    (8)

where θ_k = (x_k, v_k, u_k) and the metric d on R^n × R^n × R^n is defined as

    d(θ, θ') = (1/√n) (‖x − x'‖ + ‖v − v'‖ + ‖u − u'‖),

where the three components of θ = (x, v, u) and θ' = (x', v', u') are vectors in R^n. Thus, (8) is simply the distance between the k-th and (k+1)-th iterates, which measures the progress made by the algorithm. The exact rule proposed in [14] is as follows:
    ρ_{k+1} = γ ρ_k  if Δ_{k+1} > η Δ_k  (condition C1),
    ρ_{k+1} = ρ_k    if Δ_{k+1} ≤ η Δ_k  (condition C2),    (9)

where η ∈ (0, 1) and γ > 1 are predefined parameters. The above rule, in effect, decreases the denoising strength (ρ is increased) if the ratio of the current and previous residuals is greater than η; else, ρ is kept unchanged (see [14] for details).
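A minimal sketch of such a residual-based adaptation, assuming the condition labels C1/C2 above; the default values of `eta` and `gamma` are placeholders, not the values used in [14]:

```python
def update_rho(rho, delta_curr, delta_prev, eta=0.99, gamma=1.05):
    """Sketch of rule (9). If the residual ratio exceeds eta
    (condition C1), rho is increased by the factor gamma, which
    weakens the denoiser since sigma_k = sqrt(lambda/rho_k);
    otherwise (condition C2) rho is left unchanged."""
    if delta_curr > eta * delta_prev:  # C1: insufficient decrease
        return gamma * rho
    return rho                         # C2: keep rho unchanged
```

In a PnP-ADMM loop, `delta_curr` and `delta_prev` would be the residuals Δ_{k+1} and Δ_k computed from successive iterates.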
Assumption 1: The function f is differentiable and there exists L > 0 such that ‖∇f(x)‖ ≤ L for all x.
The second assumption concerns the denoiser.
Assumption 2: There exists K > 0 such that, for all x ∈ R^n and σ > 0,

    (1/n) ‖D_σ(x) − x‖² ≤ K σ².    (10)
While discussions on the above assumptions can be found in [14], here we reiterate a couple of remarks about Assumption 2. It is difficult to mathematically verify (10) even for simple denoisers, let alone sophisticated ones such as BM3D. However, an implication of (10) is that the denoiser acts like an identity map (idle filter) when σ is close to zero. It is reasonable to expect any practical denoiser to obey this weaker condition. Moreover, while a denoiser might not behave exactly as an identity operator when σ is close to zero, it is possible to artificially force this behavior.
We are now ready to state the convergence result in [14].

Theorem 3: Suppose Assumptions 1 and 2 hold. Then, for any initialization, the iterates (θ_k) of PnP-ADMM converge, i.e., there exists θ* such that d(θ_k, θ*) → 0 as k → ∞.
In particular, the iterates do not diverge or oscillate. We note that convergence of (θ_k) implies that Δ_k → 0 as k → ∞. However, the converse is generally not true, i.e., it is possible that Δ_k converges to 0 but the iterates (θ_k) do not converge. The technical point is that Δ_k must vanish sufficiently fast to guarantee the convergence of (θ_k). This fact is used in [14] as well as in the present analysis.
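A scalar toy example of this point: if the successive differences decay like 1/k, they vanish, yet the iterates (harmonic partial sums) still diverge; geometrically decaying differences, by contrast, force convergence. The numbers below are purely illustrative.

```python
import numpy as np

# Successive differences delta_k = 1/k vanish ...
k = np.arange(1, 100001)
deltas = 1.0 / k
x = np.cumsum(deltas)   # iterates x_k = sum_{j<=k} 1/j (harmonic sums)
# deltas[-1] is tiny, yet x[-1] ~ log(10^5) + 0.577 keeps growing:
# the iterates do not converge even though the differences vanish.

# In contrast, geometrically decaying differences give a Cauchy
# sequence: the partial sums converge (here, toward 1).
y = np.cumsum(0.5 ** np.arange(1, 60))
```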
To set up the technical context, we briefly recall the arguments provided in [14] in support of Theorem 3. First, Assumptions 1 and 2 were used to obtain the following result; see [14, Appendix B, Lemma 1].
Lemma 4: If condition C1 in (9) holds at iteration k, then Δ_{k+1} ≤ C/√ρ_k for some constant C > 0 that does not depend on k.
Now, note that exactly one of the following cases must hold:
Case 1: Condition C2 holds for finitely many k.
Case 2: Condition C1 holds for finitely many k.
Case 3: Both C1 and C2 hold for infinitely many k.
In [14], convergence was established for Case 1 and Case 2 as follows. Suppose Case 1 is true, and let K_1 be the largest k at which C2 holds, i.e., C1 is true for all k > K_1. Then it follows from (9) that (ρ_k) increases monotonically: ρ_{k+1} = γ ρ_k for k > K_1. Using Lemma 4, we can thus conclude that

    Δ_{k+1} ≤ C/√ρ_k = (C/√ρ_{K_1+1}) γ^{-(k-K_1-1)/2}  for k > K_1.    (11)
Similarly, for Case 2, let K_2 be the largest k at which C1 holds, so that C2 holds for all k > K_2. By recursively applying the condition in (9), we then obtain

    Δ_{k+1} ≤ η^{k-K_2} Δ_{K_2+1} ≤ η^{k-K_2} C/√ρ_{K_2}  for k > K_2,    (12)

where the second inequality follows from Lemma 4. In summary, for both Case 1 and Case 2, we can find a sufficiently large K and a constant c > 0 such that

    Δ_k ≤ c r^k  for all k > K,    (14)

where r = max(η, 1/√γ) ∈ (0, 1). Namely, the error between successive iterates is eventually upper-bounded by a decaying geometric sequence. Using the triangle inequality, the fast decay of (Δ_k) can be used to show that the original sequence (θ_k) is Cauchy, and hence convergent (since the ambient space is complete). This establishes the convergence of (θ_k) for the first two cases.
It was stated in [14] that Case 3 is a “union of Case 1 and Case 2”, and that convergence under Case 1 and Case 2 implies convergence for Case 3. However, this is not true, simply because the proof sketched above is valid only if one of C1 or C2 occurs finitely many times; this naturally excludes the case where both C1 and C2 occur infinitely often. For example, consider the hypothetical situation in which C1 occurs for every even k and C2 occurs for every odd k. Clearly, the proof does not work in this case.
For further clarity, let us carefully examine the technique used in [14] to establish convergence for Case 1 and Case 2. For Case 1, the eventual bound on Δ_k was established using the fact that ρ_k is monotonically increasing for k > K_1. A similar bound for Case 2 was derived using the second inequality in (9), which holds for k > K_2. Thus, in both (11) and (12), the existence of a finite K_1 (or K_2) is vital because it allows us to ignore the first few terms of the sequence (Δ_k) and understand its behavior over the tail. In turn, this is possible because condition C2 (or C1) occurs only a finite number of times. If both C1 and C2 occur infinitely often, we cannot find a finite index beyond which a single inequality holds for Δ_k. This is precisely why the technique in [14] is not applicable to Case 3. Before proceeding further, we note why it is important to prove convergence in Case 3. Theorem 3 assures us that the algorithm converges regardless of which of the three cases holds. Therefore, its proof remains incomplete unless convergence is proved for all three cases (and in particular Case 3). Moreover, experiments suggest that Case 3 is indeed likely to arise in certain practical scenarios. We have reported some empirical observations for deblurring and superresolution experiments in the supplementary material to back this claim. In these experiments, we found that when the ratio of successive residuals stays close to η, the algorithm keeps switching between conditions C1 and C2, i.e., Case 3 holds.
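The switching behavior can be illustrated numerically. Below, a hypothetical residual sequence alternates between a large drop (C2) and a mild increase (C1), so neither condition eventually dominates; all constants are toy values chosen for illustration.

```python
eta = 0.5
# Hypothetical residuals: a drop by factor 0.25 (ratio <= eta, i.e. C2)
# on odd iterations, a rise by factor 1.2 (ratio > eta, i.e. C1) on even ones.
deltas = [1.0]
for k in range(1, 40):
    deltas.append(deltas[-1] * (0.25 if k % 2 else 1.2))

ratios = [deltas[k + 1] / deltas[k] for k in range(len(deltas) - 1)]
c1 = [ratio > eta for ratio in ratios]    # C1 occurs at these steps
c2 = [ratio <= eta for ratio in ratios]   # C2 occurs at these steps

# Both conditions keep recurring, so there is no tail index beyond
# which a single inequality holds; the per-case geometric argument fails.
# Still, the residuals decay overall (0.25 * 1.2 < 1 per two steps),
# which is exactly the sawtooth behavior captured by a PGS below.
```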
III. Main Result
We will now establish the convergence of (θ_k) in Case 3. In particular, we will show that Δ_k can be bounded by a sequence which vanishes sufficiently fast to ensure that (θ_k) is Cauchy. Such a sequence is defined next.
Definition 5: A positive sequence (e_k)_{k≥1} is said to be a piecewise geometric sequence (PGS) if there exist r, s ∈ (0, 1) and indices k_1 < k_2 < k_3 < ⋯ such that:
(i) for each j ≥ 1, the terms e_{k_j}, e_{k_j+1}, …, e_{k_{j+1}-1} are in geometric progression with rate r, i.e., e_{k+1} = r e_k for k_j ≤ k ≤ k_{j+1} − 2;
(ii) the subsequence (e_{k_j})_{j≥1} is in geometric progression with rate s, i.e., e_{k_{j+1}} = s e_{k_j} for j ≥ 1.
In other words, a PGS can be divided into chunks that are in geometric progression (with identical rates). Moreover, the subsequence consisting of the peaks (i.e., the first term in each chunk) is itself in geometric progression. A PGS has a sawtooth-like appearance (see Figure 1) and is slower to decay to zero than a geometric sequence with the same rate. It turns out that the sequence of residuals can be bounded by a PGS in Case 3.
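A small numerical sketch of Definition 5, under the simplifying assumption of equal-length chunks; the helper `pgs`, the rates, and the chunk length are illustrative choices.

```python
import numpy as np

def pgs(peak0, r, s, chunk_len, n_chunks):
    """Construct a PGS in the sense of Definition 5: terms within a
    chunk decay geometrically with rate r, and the chunk peaks decay
    geometrically with rate s. Equal-length chunks are assumed here
    purely for simplicity of illustration."""
    seq, peak = [], peak0
    for _ in range(n_chunks):
        seq.extend(peak * r ** np.arange(chunk_len))
        peak *= s
    return np.array(seq)

e = pgs(1.0, r=0.5, s=0.7, chunk_len=5, n_chunks=8)
# Sawtooth shape: the sequence jumps up at the start of each chunk,
# so it is not monotone and decays slower than the geometric sequence
# r**k; its sum is still finite, bounded by peak0 / ((1-r) * (1-s)).
```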
Lemma 6: Let (Δ_k) be the sequence of residuals in Case 3. Then there exists a PGS (e_k) such that Δ_k ≤ e_k for all k.
This may be considered an analogue of (14) for Case 3. To deduce that (θ_k) is Cauchy, it suffices to show that a PGS is summable.
Lemma 7: If (e_k) is a PGS, then Σ_{k≥1} e_k converges.
Theorem 8: In Case 3, the iterates (θ_k) converge to a fixed point.
Proof: As noted earlier, all we need to show is that (θ_k) is a Cauchy sequence. That is, for any given ε > 0, we can find an integer N such that d(θ_m, θ_k) < ε whenever m, k ≥ N. Now, from the triangle inequality for the metric d and (8), we have, for m > k,

    d(θ_m, θ_k) ≤ Δ_{k+1} + Δ_{k+2} + ⋯ + Δ_m.
From Lemmas 6 and 7, we can conclude that the series Σ_k Δ_k converges. This is because (Δ_k) is bounded by the PGS (e_k), whose series converges. In particular, the partial sums of (Δ_k) form a Cauchy sequence. As a result, for any ε > 0, we can find a sufficiently large N such that Δ_{k+1} + ⋯ + Δ_m < ε whenever m > k ≥ N. ∎
IV-A. Proof of Lemma 6
Let r = η and s = 1/√γ. Note that r, s ∈ (0, 1). We will show that (Δ_k) is bounded by a PGS with rates r and s.
Let k_1 be the iteration at which condition C1 holds for the first time. Further, let k_2 be the iteration at which C1 occurs for the first time after k_1 (i.e., C2 holds at iterations k_1 + 1, …, k_2 − 1). Let k_3 be the iteration at which C1 occurs for the first time after k_2, and so on. Since Case 3 holds, both C1 and C2 are true infinitely often. This gives us an infinite sequence of indices k_1 < k_2 < k_3 < ⋯. Now, by construction, for each j ≥ 1, C2 holds at iterations k_j + 1, …, k_{j+1} − 1. Hence, from (9), ρ_k = ρ_{k_j+1} for k_j + 1 ≤ k ≤ k_{j+1}; this trivially also holds when k_{j+1} = k_j + 1. On the other hand, C1 holds at iteration k_j, so that by Lemma 4, for each j ≥ 1,

    Δ_{k_j+1} ≤ C/√ρ_{k_j}.    (15)
We now derive a relation between the ρ_{k_j}'s for different j. From (9) we get ρ_{k_{j+1}} = ρ_{k_j+1}, since C2 occurs at iterations k_j + 1, …, k_{j+1} − 1. This gives

    ρ_{k_{j+1}} = γ ρ_{k_j},

since C1 holds at iteration k_j and hence ρ_{k_j+1} = γ ρ_{k_j}. Applying the above relation recursively and using the fact that ρ_{k_1} ≥ ρ_1 (the sequence (ρ_k) is nondecreasing), we get ρ_{k_j} ≥ γ^{j-1} ρ_1.
Let C' = C/√ρ_1. Hence, from (15), for j ≥ 1,

    Δ_{k_j+1} ≤ C/√ρ_{k_j} ≤ C' γ^{-(j-1)/2} = C' s^{j-1}.    (16)
Now, condition C2 holds at iterations k_j + 1, …, k_{j+1} − 1. Hence, by recursively applying (9), we obtain Δ_k ≤ η^{k-k_j-1} Δ_{k_j+1} for k_j + 2 ≤ k ≤ k_{j+1}. Note that this trivially also holds for k = k_j + 1. Hence, we have, for k_j + 1 ≤ k ≤ k_{j+1},

    Δ_k ≤ η^{k-k_j-1} Δ_{k_j+1} ≤ C' r^{k-k_j-1} s^{j-1} =: e_k,

where we have used (16) and the fact that r = η. By construction, (e_k) is a PGS with rates r and s, whose chunks are delimited by the indices k_j + 1, and Δ_k ≤ e_k for all k > k_1. This completes the proof.
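The key recursion ρ_{k_{j+1}} = γ ρ_{k_j} can be sanity-checked numerically; the iteration numbers below are hypothetical C1 events, with C2 (ρ unchanged) at all other iterations.

```python
gamma = 2.0
rho = 1.0
c1_iterations = [3, 7, 8, 15, 21]   # plays the role of k_1 < k_2 < ...
rho_at_c1 = []
for k in range(25):
    if k in c1_iterations:
        rho_at_c1.append(rho)       # value of rho at iteration k_j
        rho *= gamma                # C1: rho is multiplied by gamma
    # else C2: rho is unchanged
# rho at the j-th C1 iteration equals gamma**(j-1) times rho at the
# first one, matching the recursion used in the proof.
```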
IV-B. Proof of Lemma 7
Let the parameters r, s and the indices (k_j) be as in Definition 5. We will prove the convergence of Σ_k e_k using the Cauchy criterion. Let ε > 0 be given. We need to find an index N such that

    e_k + e_{k+1} + ⋯ + e_m < ε  whenever  m ≥ k ≥ N.    (18)

Let S = e_{k_1}/(1 − r), and fix an integer J such that

    s^{J-1} S/(1 − s) < ε.
First, for fixed j, we derive a bound on the sum of the terms from k_j to k_{j+1} − 1. From Definition 5, we have

    e_{k_j} + ⋯ + e_{k_{j+1}-1} = e_{k_j} (1 + r + ⋯ + r^{k_{j+1}-k_j-1}) = e_{k_j} (1 − r^{k_{j+1}-k_j})/(1 − r) ≤ e_{k_j}/(1 − r) = s^{j-1} e_{k_1}/(1 − r).

The inequality in the third step holds since r < 1, while the last equality follows from Definition 5.
We are now ready to establish (18). Let N = k_J, and let m ≥ k ≥ N. Suppose k lies in the chunk {k_j, …, k_{j+1} − 1} for some j. Then j ≥ J. As a result,

    e_k + ⋯ + e_m ≤ Σ_{i≥j} (e_{k_i} + ⋯ + e_{k_{i+1}-1}) ≤ Σ_{i≥j} s^{i-1} e_{k_1}/(1 − r) = s^{j-1} S/(1 − s) ≤ s^{J-1} S/(1 − s) < ε.
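The chunk-by-chunk estimate above can be checked numerically on a synthetic PGS; equal chunk lengths and the specific rates are illustrative assumptions.

```python
import numpy as np

r, s = 0.5, 0.7
L, n_chunks = 4, 20
# Synthetic PGS with equal chunks of length L; the peak of chunk j
# (zero-indexed) is s**j, so e_{k_1} = 1 here.
e = np.concatenate([(s ** j) * r ** np.arange(L) for j in range(n_chunks)])

# The tail starting at the peak of chunk j is at most
# s**j / ((1 - r) * (1 - s)), the chunk-by-chunk estimate in the proof;
# it can be made smaller than any eps by taking j large enough.
tails = [e[L * j:].sum() for j in range(n_chunks)]
bounds = [s ** j / ((1 - r) * (1 - s)) for j in range(n_chunks)]
```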
We pointed out that the proof of convergence of the PnP-ADMM algorithm in [14] does not address a certain case. We reasoned that this case needs to be handled differently from the cases addressed in [14]. This is because the approach in [14] fundamentally assumes that a certain condition holds only a finite number of times, which is not true for the case in question. In particular, we showed that, unlike the geometric sequences used for the other cases, we need to work with a piecewise geometric sequence. Our proof of convergence follows from the observation that the residual between successive iterates is upper-bounded by this summable sequence. Our analysis rigorously establishes the convergence theorem in [14].
We note that, in practice, optimization algorithms, including PnP-ADMM, are terminated after a finite number of iterations. In particular, since the cases in the convergence analysis involve an infinite number of iterations, which of them holds in practice cannot be ascertained empirically. Therefore, a theoretical convergence guarantee has practical importance: it provides a mathematical justification for terminating the algorithm after sufficiently many iterations. This is precisely what was accomplished in this letter.
-  B. K. Gunturk and X. Li, Image Restoration: Fundamentals and Advances. CRC Press, 2012.
-  M. Elad, P. Milanfar, and R. Rubinstein, “Analysis versus synthesis in signal priors,” Inverse Problems, vol. 23, no. 3, p. 947, 2007.
-  A. Buades, B. Coll, and J. M. Morel, “A non-local algorithm for image denoising,” Proc. IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 60–65, 2005.
-  K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising by sparse 3-D transform-domain collaborative filtering,” IEEE Transactions on Image Processing, vol. 16, no. 8, pp. 2080–2095, 2007.
-  Y. Romano, M. Elad, and P. Milanfar, “The little engine that could: Regularization by denoising (RED),” SIAM Journal on Imaging Sciences, vol. 10, no. 4, pp. 1804–1844, 2017.
-  E. T. Reehorst and P. Schniter, “Regularization by denoising: Clarifications and new interpretations,” IEEE Transactions on Computational Imaging, vol. 5, no. 1, pp. 52–67, 2019.
-  S. V. Venkatakrishnan, C. A. Bouman, and B. Wohlberg, “Plug-and-play priors for model based reconstruction,” Proc. IEEE Global Conference on Signal and Information Processing, pp. 945–948, 2013.
-  S. Sreehari, S. V. Venkatakrishnan, B. Wohlberg, G. T. Buzzard, L. F. Drummy, J. P. Simmons, and C. A. Bouman, “Plug-and-play priors for bright field electron tomography and sparse interpolation,” IEEE Transactions on Computational Imaging, vol. 2, no. 4, pp. 408–423, 2016.
-  S. Ono, “Primal-dual plug-and-play image restoration,” IEEE Signal Processing Letters, vol. 24, no. 8, pp. 1108–1112, 2017.
-  U. S. Kamilov, H. Mansour, and B. Wohlberg, “A plug-and-play priors approach for solving nonlinear imaging inverse problems,” IEEE Signal Processing Letters, vol. 24, no. 12, pp. 1872–1876, 2017.
-  A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM Journal on Imaging Sciences, vol. 2, no. 1, pp. 183–202, 2009.
-  A. Chambolle and T. Pock, “A first-order primal-dual algorithm for convex problems with applications to imaging,” Journal of Mathematical Imaging and Vision, vol. 40, no. 1, pp. 120–145, 2011.
-  S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2011.
-  S. H. Chan, X. Wang, and O. A. Elgendy, “Plug-and-play ADMM for image restoration: Fixed-point convergence and applications,” IEEE Transactions on Computational Imaging, vol. 3, no. 1, pp. 84–98, 2017.
-  B. R. Hunt, “Bayesian methods in nonlinear digital image restoration,” IEEE Transactions on Computers, vol. C-26, no. 3, pp. 219–229, 1977.
-  A. M. Teodoro, J. M. Bioucas-Dias, and M. A. T. Figueiredo, “A convergent image fusion algorithm using scene-adapted Gaussian-mixture-based denoising,” IEEE Transactions on Image Processing, vol. 28, no. 1, pp. 451–463, 2019.
-  Y. Sun, B. Wohlberg, and U. S. Kamilov, “An online plug-and-play algorithm for regularized image reconstruction,” IEEE Transactions on Computational Imaging, vol. 5, no. 3, pp. 395–408, 2019.
-  T. Meinhardt, M. Moller, C. Hazirbas, and D. Cremers, “Learning proximal operators: Using denoising networks for regularizing inverse imaging problems,” Proc. IEEE International Conference on Computer Vision, pp. 1781–1790, 2017.
-  E. Ryu, J. Liu, S. Wang, X. Chen, Z. Wang, and W. Yin, “Plug-and-play methods provably converge with properly trained denoisers,” Proceedings of the 36th International Conference on Machine Learning, vol. 97, pp. 5546–5557, 2019.
-  W. Dong, P. Wang, W. Yin, G. Shi, F. Wu, and X. Lu, “Denoising prior driven deep neural network for image restoration,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 10, pp. 2305–2318, 2018.
-  A. Brifman, Y. Romano, and M. Elad, “Turning a denoiser into a super-resolver using plug and play priors,” Proc. IEEE International Conference on Image Processing, pp. 1404–1408, 2016.
-  A. Teodoro, J. M. Bioucas-Dias, and M. A. T. Figueiredo, “Image restoration and reconstruction using targeted plug-and-play priors,” IEEE Transactions on Computational Imaging, 2019.
-  T. Tirer and R. Giryes, “Image restoration by iterative denoising and backward projections,” IEEE Transactions on Image Processing, vol. 28, no. 3, pp. 1220–1234, 2019.
-  T. Tirer and R. Giryes, “Super-resolution via image-adapted denoising CNNs: Incorporating external and internal learning,” IEEE Signal Processing Letters, vol. 26, no. 7, pp. 1080–1084, 2019.
-  A. K. Fletcher, P. Pandit, S. Rangan, S. Sarkar, and P. Schniter, “Plug-in estimation in high-dimensional linear inverse problems: A rigorous analysis,” Advances in Neural Information Processing Systems, pp. 7440–7449, 2018.
-  Y. Yazaki, Y. Tanaka, and S. H. Chan, “Interpolation and denoising of graph signals using plug-and-play ADMM,” Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 5431–5435, 2019.
-  R. Chen, D. Zhai, X. Liu, and D. Zhao, “Noise-aware super-resolution of depth maps via graph-based plug-and-play framework,” Proc. IEEE International Conference on Image Processing, pp. 2536–2540, 2018.
-  S. H. Chan, “Performance analysis of plug-and-play ADMM: A graph signal processing perspective,” IEEE Transactions on Computational Imaging, vol. 5, no. 2, pp. 274–286, 2019.
-  G. T. Buzzard, S. H. Chan, S. Sreehari, and C. A. Bouman, “Plug-and-play unplugged: Optimization-free reconstruction using consensus equilibrium,” SIAM Journal on Imaging Sciences, vol. 11, no. 3, pp. 2001–2020, 2018.
-  http://www.mathworks.com/matlabcentral/fileexchange/60641.