I. Introduction
A variety of image restoration problems, such as super-resolution, deblurring, compressed sensing, tomography, etc., are modeled as optimization problems of the form

(1) min_{x ∈ ℝⁿ} f(x) + λ g(x),

where the data-fidelity term f is derived from the degradation and noise models, the regularizer g is derived from some prior on the ground-truth image [1], and λ > 0 controls the strength of the regularization. Traditionally, the regularizer is a sparsity-promoting function in some transform domain [2]. In recent years, researchers have explored the possibility of using powerful Gaussian denoisers such as NLM [3] and BM3D [4] for regularization purposes. In [5, 6], the regularizer is explicitly constructed from a denoiser. On the other hand, for plug-and-play (PnP) methods [7, 8, 9, 10], the denoiser is formally substituted in place of the proximal operator in iterative algorithms such as FISTA [11], primal-dual splitting [12], and ADMM [13].
The focus of this work is on an ADMM-based PnP method [14]. We recall that the ADMM-based solution of (1) involves the following steps [13]:

(2) x_{k+1} = argmin_x f(x) + (ρ/2) ‖x − (y_k − u_k)‖²,

(3) y_{k+1} = argmin_y λ g(y) + (ρ/2) ‖y − (x_{k+1} + u_k)‖²,

(4) u_{k+1} = u_k + (x_{k+1} − y_{k+1}),

where ρ > 0 is a penalty parameter and ‖·‖ is the Euclidean norm (this is the rescaled form of ADMM). If f and g are convex, then under some technical conditions, the iterates are guaranteed to converge to a fixed point, which is the global minimizer of (1). Now, (3) corresponds to regularized Gaussian denoising of x_{k+1} + u_k, where g assumes the role of the regularizer [15]. Based on this observation, the original proposal in [7] was to replace the proximal operation (3) with an off-the-shelf denoiser, i.e., the y-update is replaced by y_{k+1} = D(x_{k+1} + u_k), where D is the denoiser in question. The idea is simply to exploit the excellent denoising capability of state-of-the-art denoisers for restoration, even though we might not be able to conceive them as proximal operators (of some regularizer). We refer the reader to [7, 8] for a detailed account. The technical challenge, however, is that the resulting sequence of operations, referred to as PnP-ADMM, need not correspond to an optimization problem. As a result, the convergence of the iterates is at stake. In particular, we can no longer relate the iterates to the optimization in (1) and use existing results [13] to ensure convergence. Nevertheless, PnP-ADMM is often found to converge empirically and yields high-quality reconstructions in several applications [7, 8, 14, 16]. Among other things, questions relating to the convergence and optimality of PnP-type methods have been studied in recent works. In [8], convergence guarantees were derived for a kernel-based denoiser within PnP-ADMM. Later, it was shown in [14] that convergence can be ensured for a broad class of denoisers. Apart from ADMM, PnP algorithms based on various iterative methods have been explored in [9, 10, 17, 18, 19, 20]. We note that denoisers have also been used for regularization purposes in [21, 22, 23, 24, 25, 26, 27]. The relation of PnP-ADMM with graph Laplacian-based regularization was investigated in [28], whereas in [29] a framework motivated by PnP, called Consensus Equilibrium, was proposed.
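The loop described above is compact enough to sketch in a few lines. The following is a minimal sketch, not the implementation from [14]: the quadratic data-fidelity term, the fixed penalty rho, and the moving-average "denoiser" are all illustrative placeholders.

```python
import numpy as np

def pnp_admm(A, b, denoise, rho=1.0, iters=50):
    """Minimal PnP-ADMM sketch for f(x) = 0.5 * ||Ax - b||^2.

    The proximal step of the regularizer is replaced by a plug-in
    denoiser, which is the defining feature of PnP-ADMM."""
    n = A.shape[1]
    x, y, u = np.zeros(n), np.zeros(n), np.zeros(n)
    AtA, Atb = A.T @ A, A.T @ b
    for _ in range(iters):
        # x-update: proximal step on the data-fidelity term
        x = np.linalg.solve(AtA + rho * np.eye(n), Atb + rho * (y - u))
        # y-update: an off-the-shelf denoiser in place of the proximal operator
        y = denoise(x + u)
        # u-update: scaled dual ascent on the constraint x = y
        u = u + x - y
    return x

def smooth(z):
    # Stand-in "denoiser": a simple moving-average smoother.
    return np.convolve(z, [0.25, 0.5, 0.25], mode="same")
```

Note that nothing in the loop refers to a regularizer g explicitly; any map z ↦ D(z) can be plugged into the y-update, which is precisely why convergence cannot be argued via (1).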
In this letter, we revisit the proof of convergence of the PnP-ADMM algorithm in [14] and address an inadequacy therein. It was proved that, under suitable assumptions, the sequence of iterates generated by this algorithm converges to a fixed point, for any arbitrary initialization. Instead of a fixed penalty ρ, an adaptive penalty ρ_k is used in [14], which plays an important role in the proof. However, this necessitates a case-by-case approach conditioned on the adaptation rule (see Section II for details). Of the three cases considered in the paper, convergence was proved for the first two. It was claimed that convergence for the third case automatically follows from that of the first two. However, we argue that this is generally not true, and hence a separate proof is needed for the third case. We give such a proof, which differs from the proof for the first two cases in [14]. In particular, we show that the difference between successive iterates is bounded by a piecewise geometric sequence, as opposed to a geometric sequence for the first two cases. We prove that this sequence is summable, which is used to show that the iterates form a Cauchy sequence (and are hence convergent).
II. Background
As mentioned earlier, the updates in PnP-ADMM [14] are modeled on the ADMM updates (2)–(4), with the following changes: a denoiser is used in the y-update, and the penalty ρ_k is updated in each iteration. In particular, the updates are given by

(5) x_{k+1} = argmin_x f(x) + (ρ_k/2) ‖x − (y_k − u_k)‖²,

(6) y_{k+1} = D_{σ_k}(x_{k+1} + u_k),

(7) u_{k+1} = u_k + (x_{k+1} − y_{k+1}),

where σ_k = √(λ/ρ_k). Here D_σ : ℝⁿ → ℝⁿ is a denoising operator, where the parameter σ controls its denoising action. It was proposed to update ρ_k based on the residual

(8) Δ_{k+1} = d(v_{k+1}, v_k),

where the metric d on ℝⁿ × ℝⁿ × ℝⁿ is defined as

d(v, v′) = (1/√n) (‖x − x′‖ + ‖y − y′‖ + ‖u − u′‖),

and the three components of v = (x, y, u) and v′ = (x′, y′, u′) are vectors in ℝⁿ. Thus, with v_k = (x_k, y_k, u_k), (8) is simply the distance between the k-th and (k+1)-th iterates, which measures the progress made by the algorithm. The exact rule proposed in [14] is as follows:

(9) ρ_{k+1} = γ ρ_k, if Δ_{k+1} ≥ η Δ_k (condition C1);
    ρ_{k+1} = ρ_k, if Δ_{k+1} < η Δ_k (condition C2);

where η ∈ [0, 1) and γ > 1 are predefined parameters. The above rule, in effect, decreases the denoising strength (ρ_k is increased, and hence σ_k is decreased) if the ratio of the current and previous residuals is at least η; else, ρ_k is kept unchanged (see [14] for details).
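The adaptation rule (9) amounts to a few lines of code; the particular values of eta and gamma below are placeholders, not values prescribed in [14]:

```python
def update_rho(rho, delta_prev, delta_curr, eta=0.99, gamma=1.05):
    """Adaptive penalty rule (9).

    If the residual has not shrunk by at least a factor eta
    (condition C1), the penalty is increased by a factor gamma,
    which weakens the denoiser; otherwise (condition C2) the
    penalty is left unchanged."""
    if delta_curr >= eta * delta_prev:   # condition C1
        return gamma * rho
    return rho                           # condition C2
```

Which of the two branches fires at each iteration is what generates the three cases analyzed in the sequel.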
It was claimed in [14] that the iterates generated by (5)–(7) converge to a fixed point if a couple of assumptions are met. The first concerns the data-fidelity term.
Assumption 1.
The function f is differentiable, and there exists L > 0 such that ‖∇f(x)‖ ≤ L for all x ∈ ℝⁿ.
The second assumption concerns the denoiser.
Assumption 2.
There exists K > 0 such that, for all x ∈ ℝⁿ and σ > 0,

(10) (1/n) ‖D_σ(x) − x‖² ≤ K σ².
While discussions on the above assumptions can be found in [14], we reiterate a couple of remarks about Assumption 2. It is difficult to verify (10) mathematically even for simple denoisers, let alone sophisticated ones such as BM3D. However, an implication of (10) is that the denoiser acts like an identity map (idle filter) when σ is close to zero. It is reasonable to expect any practical denoiser to obey this weaker condition. Moreover, while a given denoiser might not behave exactly as an identity operator when σ is close to zero, it is possible to enforce this behavior artificially.
We are now ready to state the convergence result in [14].

Theorem 3.

Suppose Assumptions 1 and 2 hold. Then the iterates v_k = (x_k, y_k, u_k) generated by (5)–(7) converge to a fixed point, i.e., there exists v* such that d(v_k, v*) → 0 as k → ∞.

In particular, the iterates do not diverge or oscillate. We note that convergence of (v_k) implies that Δ_k → 0 as k → ∞. However, the converse is generally not true, i.e., it is possible that (Δ_k) converges to 0 but (v_k) does not converge. The technical point is that (Δ_k) must vanish sufficiently fast to guarantee the convergence of (v_k). This fact is used in [14] as well as in the present analysis.
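The gap between vanishing residuals and convergent iterates is easy to see with scalar iterates (a hypothetical toy example, unrelated to the algorithm itself): harmonic steps vanish but their partial sums diverge, whereas geometric steps vanish fast enough for the iterates to converge.

```python
import numpy as np

K = 100_000

# Steps 1/k vanish, yet the iterates x_k = 1 + 1/2 + ... + 1/k diverge.
steps = 1.0 / np.arange(1, K + 1)
x = np.cumsum(steps)
assert steps[-1] < 1e-4      # residual tends to zero ...
assert x[-1] > 11.0          # ... but the iterates keep growing (~ log k)

# Geometric steps vanish fast enough: the iterates converge (to 2 here).
y = np.cumsum(0.5 ** np.arange(K))
assert abs(y[-1] - 2.0) < 1e-9
```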
To set up the technical context, we briefly recall the arguments provided in [14] in support of Theorem 3. First, Assumptions 1 and 2 were used to obtain the following result; see [14, Appendix B, Lemma 1].
Lemma 4.

If condition C1 in (9) holds at iteration k, then Δ_{k+1} ≤ D/√ρ_k for some constant D > 0 that does not depend on k.
Now, note that exactly one of the following cases must hold:

Case 1: Condition C1 holds for finitely many k.

Case 2: Condition C2 holds for finitely many k.

Case 3: Both C1 and C2 hold for infinitely many k.
In [14], convergence was established for Case 1 and Case 2 as follows. Suppose Case 2 is true, and let K be the largest k for which C2 holds, i.e., C1 is true for k > K. Then it follows from (9) that ρ_k increases monotonically: ρ_{k+1} = γ ρ_k for k > K. Using Lemma 4, we can thus conclude that

(11) Δ_{k+1} ≤ D/√ρ_k = (D/√ρ_{K+1}) (1/√γ)^{k−K−1}, for k > K.

Similarly, for Case 1, let K be the largest k for which C1 holds, so that C2 holds for k > K. By recursively applying the condition Δ_{k+1} < η Δ_k in (9), we then obtain, for k > K,

(12) Δ_{k+1} < η^{k−K} Δ_{K+1}

(13) ≤ η^{k−K} (D/√ρ_K),

where the second inequality follows from Lemma 4. In summary, for both Case 1 and Case 2, we can find a sufficiently large K and constants C > 0 and r ∈ (0, 1) such that

(14) Δ_k ≤ C r^k, for k > K,

where r = η for Case 1 and r = 1/√γ for Case 2. Namely, the error between successive iterates is eventually upper-bounded by a decaying geometric sequence. Using the triangle inequality, the fast convergence of (Δ_k) can be used to show that the original sequence (v_k) is Cauchy, and hence convergent (since the ambient space is complete). This establishes the convergence of (v_k) for the first two cases.
It was stated in [14] that Case 3 is a “union of Case 1 and Case 2”, and that convergence under Case 1 and Case 2 implies convergence for Case 3. However, this is not true, simply because the proof sketched above is valid only if one of C1 or C2 occurs finitely many times; this naturally excludes the case where both C1 and C2 occur infinitely often. For example, consider the hypothetical situation in which C1 occurs for every even k and C2 occurs for every odd k. Clearly, the proof does not work in this case.

For further clarity, let us carefully examine the technique in [14] used to establish convergence for Case 1 and Case 2. For Case 2, the eventual bound (11) on Δ_{k+1} was established using the fact that ρ_k is monotonically increasing for k > K. A similar bound for Case 1 was derived using the second inequality in (9), which holds for k > K. Thus, in both (11) and (12), the existence of a finite K is vital, because it allows us to ignore the first few terms of the sequence (Δ_k) and understand its behavior over the tail. In turn, this is possible because condition C2 (or C1) occurs only a finite number of times. If both C1 and C2 occur infinitely often, we cannot find a finite K beyond which a single inequality holds for all Δ_k. This is precisely why the technique in [14] is not applicable to Case 3.

Before proceeding further, we note why it is important to prove convergence in Case 3. Theorem 3 assures us that the algorithm converges regardless of which of the three cases holds. Therefore, its proof remains incomplete unless convergence is proved for all three cases (and in particular Case 3). Moreover, experiments suggest that Case 3 is indeed likely to arise in certain practical scenarios. We have reported some empirical observations for deblurring and super-resolution experiments in the supplementary material to back this. In these experiments, we found that when η is close to 1, it is likely that Case 3 holds, i.e., the algorithm keeps switching between conditions C1 and C2.
III. Main Result
We will now establish the convergence of (v_k) for Case 3. In particular, we will show that (Δ_k) can be bounded by a sequence which vanishes sufficiently fast to ensure that (v_k) is Cauchy. Such a sequence is defined next.
Definition 5.

A positive sequence (e_k) is said to be a piecewise geometric sequence (PGS) if there exist θ, ϑ ∈ (0, 1) and indices k_1 < k_2 < k_3 < ⋯ such that

(i) within each chunk, the terms are in geometric progression with rate θ, i.e., for j ≥ 1 and k_j ≤ k < k_{j+1} − 1, e_{k+1} = θ e_k;

(ii) the subsequence of peaks (e_{k_j}) is in geometric progression with rate ϑ, i.e., for j ≥ 1, e_{k_{j+1}} = ϑ e_{k_j}.

In other words, a PGS can be divided into chunks that are in geometric progression (with identical rates). Moreover, the subsequence consisting of the peaks (i.e., the first term in each chunk) is itself in geometric progression. A PGS has a sawtooth-like appearance (see Figure 1), and is slower to decay to zero than a geometric sequence having the same rate. It turns out that the sequence of residuals can be bounded by a PGS in Case 3.
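For intuition, the following snippet builds such a sequence numerically. The fixed chunk length is used purely for simplicity (Definition 5 allows arbitrary chunk boundaries), and all parameter values are illustrative.

```python
import numpy as np

def pgs(e1, theta, vartheta, chunk_len, n_chunks):
    """Build a piecewise geometric sequence: within each chunk the
    terms decay geometrically at rate theta, and the chunk peaks
    decay geometrically at rate vartheta."""
    chunks, peak = [], e1
    for _ in range(n_chunks):
        chunks.append(peak * theta ** np.arange(chunk_len))
        peak *= vartheta
    return np.concatenate(chunks)

e = pgs(e1=1.0, theta=0.5, vartheta=0.8, chunk_len=5, n_chunks=40)
# Sawtooth shape: each chunk starts above where the previous one
# ended, yet below the previous peak.
assert e[5] > e[4] and e[5] < e[0]
```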
Lemma 6.

Let (Δ_k) be the residuals in Case 3. Then there exists a PGS (e_k) such that Δ_k ≤ e_k for all k.

This may be considered an analogue of (14) for Case 3. To deduce that (v_k) is Cauchy, it suffices to show that a PGS is summable.
Lemma 7.

If (e_k) is a PGS, then the series Σ_{k≥1} e_k converges.
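This is plausible numerically: each chunk sums to at most peak/(1 − θ), and the peaks are themselves geometric with rate ϑ, so the whole series is bounded by e_{k_1}/((1 − θ)(1 − ϑ)). A quick self-contained check, with arbitrary illustrative parameters:

```python
theta, vartheta, e1, chunk_len = 0.6, 0.9, 1.0, 7

total, peak = 0.0, e1
for _ in range(200):                  # 200 chunks is enough to see the limit
    total += sum(peak * theta ** i for i in range(chunk_len))
    peak *= vartheta

bound = e1 / ((1 - theta) * (1 - vartheta))
assert total < bound                  # partial sums stay below the bound
```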
The proofs of Lemmas 6 and 7 are somewhat technical and are deferred to Section IV. Importantly, using the above lemmas, we can establish the convergence of (v_k) in Case 3.
Proposition 8.

The iterates (v_k) in Case 3 converge to a fixed point.

Proof.

As noted earlier, all we need to show is that (v_k) is a Cauchy sequence. That is, for any given ε > 0, we can find an integer N such that d(v_m, v_n) < ε whenever m > n ≥ N. Now, from the triangle inequality for the metric d and (8), we have

d(v_m, v_n) ≤ Σ_{k=n}^{m−1} d(v_{k+1}, v_k) = Σ_{k=n}^{m−1} Δ_{k+1}.

From Lemmas 6 and 7, we can conclude that Σ_k Δ_k converges. This is because (Δ_k) is bounded by the PGS (e_k), whose series converges by Lemma 7. In particular, the partial sums of Σ_k Δ_k form a Cauchy sequence. As a result, for any ε > 0, we can find a sufficiently large N such that Σ_{k=n}^{m−1} Δ_{k+1} < ε when m > n ≥ N. ∎
IV. Proofs

IV-A. Proof of Lemma 6
Let θ = 1/√γ and ϑ = max(η, θ). Note that θ, ϑ ∈ (0, 1). We will show that (Δ_k) is bounded by a PGS with both rates equal to ϑ.

Let n_1 be the iteration at which condition C1 holds for the first time. Further, let n_2 be the iteration at which condition C2 occurs for the first time after n_1 (i.e., C1 holds at iterations n_1, …, n_2 − 1). Let n_3 be the iteration at which C1 holds for the first time after n_2, and so on. Since Case 3 holds, both C1 and C2 are true infinitely often. This gives us an infinite sequence of indices n_1 < n_2 < n_3 < ⋯. Now, by construction, for each j ≥ 1, C1 holds at iterations n_{2j−1} ≤ k ≤ n_{2j} − 1. Hence, from (9), ρ_{k+1} = γ ρ_k for n_{2j−1} ≤ k ≤ n_{2j} − 1; in particular, ρ_k = γ^{k−n_{2j−1}} ρ_{n_{2j−1}} over this range. Since ρ_{k+1} = ρ_k trivially holds under C2, the sequence (ρ_k) is nondecreasing.

By Lemma 4, for n_{2j−1} ≤ k ≤ n_{2j} − 1, we have

Δ_{k+1} ≤ D/√ρ_k = D/√(γ^{k−n_{2j−1}} ρ_{n_{2j−1}}).

Letting θ = 1/√γ, this becomes

(15) Δ_{k+1} ≤ (D/√ρ_{n_{2j−1}}) θ^{k−n_{2j−1}}.

We now derive a relation between the ρ_{n_{2j−1}}'s for different j. We know that ρ_{n_{2j+1}} = ρ_{n_{2j}}, since condition C2 occurs at iterations n_{2j} ≤ k ≤ n_{2j+1} − 1. However, from (9) we get

ρ_{n_{2j}} = γ^{n_{2j} − n_{2j−1}} ρ_{n_{2j−1}},

since condition C1 occurs at iterations n_{2j−1} ≤ k ≤ n_{2j} − 1. This gives

ρ_{n_{2j+1}} ≥ γ ρ_{n_{2j−1}},

since γ > 1 and n_{2j} − n_{2j−1} ≥ 1. Applying the above inequality recursively, we get

ρ_{n_{2j−1}} ≥ γ^{j−1} ρ_{n_1}.

Let C = D/√ρ_{n_1}. Hence from (15), for n_{2j−1} ≤ k ≤ n_{2j} − 1,

(16) Δ_{k+1} ≤ C θ^{j−1} θ^{k−n_{2j−1}},

where we have used 1/√ρ_{n_{2j−1}} ≤ θ^{j−1}/√ρ_{n_1}. Now, condition C2 holds for n_{2j} ≤ k ≤ n_{2j+1} − 1. Hence, by recursively applying (9), we obtain Δ_{k+1} < η^{k−n_{2j}+1} Δ_{n_{2j}} for n_{2j} ≤ k ≤ n_{2j+1} − 1. Combining this with (16) applied at k = n_{2j} − 1, we have, for n_{2j} ≤ k ≤ n_{2j+1} − 1,

(17) Δ_{k+1} ≤ C θ^{j−1} θ^{n_{2j}−1−n_{2j−1}} η^{k−n_{2j}+1} ≤ C ϑ^{j−1} ϑ^{k−n_{2j−1}},

where we have used the fact that η ≤ ϑ and θ ≤ ϑ. Since the right-hand side of (17) also dominates the right-hand side of (16) over the corresponding range, we conclude that, for every j ≥ 1 and n_{2j−1} ≤ k ≤ n_{2j+1} − 1,

Δ_{k+1} ≤ C ϑ^{j−1} ϑ^{k−n_{2j−1}}.

The bound on the right is precisely a PGS with chunks starting at the indices n_{2j−1} + 1, within-chunk rate ϑ, and peak rate ϑ. Finally, since only the finitely many terms Δ_1, …, Δ_{n_1} lie outside the chunks, we can enlarge C so that the resulting PGS dominates (Δ_k) for all k. This completes the proof.
IV-B. Proof of Lemma 7
Let the parameters θ, ϑ and the indices (k_j) be as in Definition 5. We will prove the convergence of Σ_k e_k using the Cauchy criterion. Let ε > 0 be given. We need to find an index N such that

(18) Σ_{k=n}^{m} e_k < ε, for all m ≥ n ≥ N.

Let S = e_{k_1}/((1 − θ)(1 − ϑ)), and fix an integer J ≥ 1 such that

(19) ϑ^{J−1} < ε/S.

This is possible since ϑ < 1 and the right side of (19) is positive. We will prove that (18) is satisfied by N = k_J.

First, for fixed j, we derive a bound on the sum of the terms from k_j to k_{j+1} − 1. From Definition 5, we have

(20) Σ_{k=k_j}^{k_{j+1}−1} e_k = e_{k_j} (1 + θ + ⋯ + θ^{k_{j+1}−1−k_j}) ≤ e_{k_j}/(1 − θ) = ϑ^{j−1} e_{k_1}/(1 − θ).

The inequality holds since θ < 1, while the last equality follows from Definition 5.

We are now ready to establish (18). Let m, n be such that m ≥ n ≥ k_J. Suppose n lies in the chunk starting at k_j for some j. Then j ≥ J. As a result,

Σ_{k=n}^{m} e_k ≤ Σ_{i=j}^{∞} Σ_{k=k_i}^{k_{i+1}−1} e_k ≤ (e_{k_1}/(1 − θ)) Σ_{i=j}^{∞} ϑ^{i−1} = ϑ^{j−1} S ≤ ϑ^{J−1} S < ε,

where the inequality in the second step follows from (20) and the last inequality follows from (19). Therefore, Σ_k e_k satisfies the Cauchy criterion with N = k_J, where J is defined by (19). This completes the proof.
V. Conclusion
We pointed out that the proof of convergence of the PnP-ADMM algorithm in [14] does not address a certain case. We reasoned that this case needs to be handled differently from the cases addressed in [14]. This is because the approach in [14] fundamentally assumes that a certain condition holds only a finite number of times, which is not true for the case in question. In particular, we showed that, unlike the geometric sequences used for the other cases, we need to work with a piecewise geometric sequence. Our proof of convergence follows from the observation that the residual between successive iterates is upper-bounded by this summable sequence. Our analysis rigorously establishes the convergence theorem in [14].
We note that, in practice, optimization algorithms, including PnP-ADMM, are terminated after a finite number of iterations. In particular, since the cases in the convergence analysis involve an infinite number of iterations, which of them holds in practice cannot be ascertained empirically. Therefore, a theoretical convergence guarantee has practical importance: it provides a mathematical justification for terminating the algorithm after sufficiently many iterations. This is precisely what was accomplished in this letter.
References
 [1] B. K. Gunturk and X. Li, Image Restoration: Fundamentals and Advances. CRC Press, 2012.
 [2] M. Elad, P. Milanfar, and R. Rubinstein, “Analysis versus synthesis in signal priors,” Inverse Problems, vol. 23, no. 3, p. 947, 2007.
 [3] A. Buades, B. Coll, and J. M. Morel, “A non-local algorithm for image denoising,” Proc. IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 60–65, 2005.
 [4] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising by sparse 3D transform-domain collaborative filtering,” IEEE Transactions on Image Processing, vol. 16, no. 8, pp. 2080–2095, 2007.
 [5] Y. Romano, M. Elad, and P. Milanfar, “The little engine that could: Regularization by denoising (RED),” SIAM Journal on Imaging Sciences, vol. 10, no. 4, pp. 1804–1844, 2017.
 [6] E. T. Reehorst and P. Schniter, “Regularization by denoising: Clarifications and new interpretations,” IEEE Transactions on Computational Imaging, vol. 5, no. 1, pp. 52–67, 2019.
 [7] S. V. Venkatakrishnan, C. A. Bouman, and B. Wohlberg, “Plug-and-play priors for model based reconstruction,” Proc. IEEE Global Conference on Signal and Information Processing, pp. 945–948, 2013.
 [8] S. Sreehari, S. V. Venkatakrishnan, B. Wohlberg, G. T. Buzzard, L. F. Drummy, J. P. Simmons, and C. A. Bouman, “Plug-and-play priors for bright field electron tomography and sparse interpolation,” IEEE Transactions on Computational Imaging, vol. 2, no. 4, pp. 408–423, 2016.
 [9] S. Ono, “Primal-dual plug-and-play image restoration,” IEEE Signal Processing Letters, vol. 24, no. 8, pp. 1108–1112, 2017.
 [10] U. S. Kamilov, H. Mansour, and B. Wohlberg, “A plug-and-play priors approach for solving nonlinear imaging inverse problems,” IEEE Signal Processing Letters, vol. 24, no. 12, pp. 1872–1876, 2017.
 [11] A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM Journal on Imaging Sciences, vol. 2, no. 1, pp. 183–202, 2009.
 [12] A. Chambolle and T. Pock, “A first-order primal-dual algorithm for convex problems with applications to imaging,” Journal of Mathematical Imaging and Vision, vol. 40, no. 1, pp. 120–145, 2011.
 [13] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2011.
 [14] S. H. Chan, X. Wang, and O. A. Elgendy, “Plug-and-play ADMM for image restoration: Fixed-point convergence and applications,” IEEE Transactions on Computational Imaging, vol. 3, no. 1, pp. 84–98, 2017.
 [15] B. R. Hunt, “Bayesian methods in nonlinear digital image restoration,” IEEE Transactions on Computers, vol. C-26, no. 3, pp. 219–229, 1977.
 [16] A. M. Teodoro, J. M. Bioucas-Dias, and M. A. T. Figueiredo, “A convergent image fusion algorithm using scene-adapted Gaussian-mixture-based denoising,” IEEE Transactions on Image Processing, vol. 28, no. 1, pp. 451–463, 2019.
 [17] Y. Sun, B. Wohlberg, and U. S. Kamilov, “An online plug-and-play algorithm for regularized image reconstruction,” IEEE Transactions on Computational Imaging, vol. 5, no. 3, pp. 395–408, 2019.
 [18] T. Meinhardt, M. Moller, C. Hazirbas, and D. Cremers, “Learning proximal operators: Using denoising networks for regularizing inverse imaging problems,” Proc. IEEE International Conference on Computer Vision, pp. 1781–1790, 2017.
 [19] E. Ryu, J. Liu, S. Wang, X. Chen, Z. Wang, and W. Yin, “Plug-and-play methods provably converge with properly trained denoisers,” Proceedings of the 36th International Conference on Machine Learning, vol. 97, pp. 5546–5557, 2019.
 [20] W. Dong, P. Wang, W. Yin, G. Shi, F. Wu, and X. Lu, “Denoising prior driven deep neural network for image restoration,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 10, pp. 2305–2318, 2018.
 [21] A. Brifman, Y. Romano, and M. Elad, “Turning a denoiser into a super-resolver using plug and play priors,” Proc. IEEE International Conference on Image Processing, pp. 1404–1408, 2016.
 [22] A. Teodoro, J. M. Bioucas-Dias, and M. A. T. Figueiredo, “Image restoration and reconstruction using targeted plug-and-play priors,” IEEE Transactions on Computational Imaging, 2019.
 [23] T. Tirer and R. Giryes, “Image restoration by iterative denoising and backward projections,” IEEE Transactions on Image Processing, vol. 28, no. 3, pp. 1220–1234, 2019.
 [24] T. Tirer and R. Giryes, “Super-resolution via image-adapted denoising CNNs: Incorporating external and internal learning,” IEEE Signal Processing Letters, vol. 26, no. 7, pp. 1080–1084, 2019.
 [25] A. K. Fletcher, P. Pandit, S. Rangan, S. Sarkar, and P. Schniter, “Plug-in estimation in high-dimensional linear inverse problems: A rigorous analysis,” Advances in Neural Information Processing Systems, pp. 7440–7449, 2018.
 [26] Y. Yazaki, Y. Tanaka, and S. H. Chan, “Interpolation and denoising of graph signals using plug-and-play ADMM,” Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 5431–5435, 2019.
 [27] R. Chen, D. Zhai, X. Liu, and D. Zhao, “Noise-aware super-resolution of depth maps via graph-based plug-and-play framework,” Proc. IEEE International Conference on Image Processing, pp. 2536–2540, 2018.
 [28] S. H. Chan, “Performance analysis of plug-and-play ADMM: A graph signal processing perspective,” IEEE Transactions on Computational Imaging, vol. 5, no. 2, pp. 274–286, 2019.
 [29] G. T. Buzzard, S. H. Chan, S. Sreehari, and C. A. Bouman, “Plug-and-play unplugged: Optimization-free reconstruction using consensus equilibrium,” SIAM Journal on Imaging Sciences, vol. 11, no. 3, pp. 2001–2020, 2018.