In computational microscopy, the task of reconstructing an image of an unknown object from a collection of noisy light-intensity measurements is often formulated as an optimization problem

    x̂ = argmin_x { g(x) + h(x) },    (1)

where the data-fidelity term g ensures consistency with the measurements and the regularizer h promotes a solution with desirable prior properties such as non-negativity, self-similarity, and transform-domain sparsity [1, 2, 3, 4]. Due to the nondifferentiability of most regularizers, proximal methods—such as variants of the iterative shrinkage/thresholding algorithm (ISTA) [5, 6, 7, 8] and the alternating direction method of multipliers (ADMM) [9, 10, 11]—are commonly used for solving (1). These algorithms avoid differentiating the regularizer by using a mathematical concept known as the proximal operator, which is itself an optimization problem equivalent to regularized image denoising.
Inspired by this mathematical equivalence, Venkatakrishnan et al. [12] introduced the plug-and-play priors (PnP) framework for image reconstruction. The key idea in PnP is to replace the proximal operator in an iterative algorithm with a state-of-the-art image denoiser, such as BM3D [13] or TNRD [14], which does not necessarily have a corresponding regularization objective. Although this implies that PnP methods generally lose interpretability as optimization problems, the framework has gained popularity because of its success in a range of applications in the context of imaging inverse problems [15, 16, 17, 18, 19, 20, 21, 22]. Note that regularization by denoising (RED) [23, 24] is an alternative approach for integrating a denoiser into an imaging problem. The key difference between RED and PnP is that the former builds an explicit regularizer, while the latter relies on an implicit regularization.
Nevertheless, all current PnP algorithms are based on batch procedures, which means that they always calculate the updates using the full set of measured data. This makes their application to very large datasets impractical [25]. One natural example is Fourier ptychographic microscopy (FPM), where the image formation task relies on hundreds of variably illuminated intensity measurements [26, 27, 28].
A novel online extension of PnP-FISTA, called PnP-SGD, was recently proposed and theoretically analyzed for a convex data-fidelity term [29]. Here, we extend these results by adapting the algorithm to FPM imaging, where the data-fidelity term is nonconvex. We show that the proposed PnP-SGD-FPM enables high-quality FPM imaging at lower computational complexity by using only a small subset of measurements per iteration. We show on both simulated and experimentally measured FPM datasets that the algorithm substantially outperforms its batch counterparts when the memory budget is limited.
2.1 FPM as an inverse problem
Consider an unknown object with a complex transmission function x(r), where r denotes the spatial coordinates at the object plane. A total of L LED sources are used to illuminate the object. Each illumination ℓ ∈ {1, …, L} is treated as a local plane wave with a unique spatial frequency k_ℓ. The exit wave from the object is described by the product x(r)·exp(j⟨k_ℓ, r⟩), which indicates that the center of the sample's spectrum is shifted to k_ℓ [26, 27, 28]. At the pupil plane, the shifted Fourier transform of the exit wave is further filtered by the pupil function. For a single illumination ℓ, the discrete FPM model can be mathematically described by the following inverse problem

    y_ℓ = |A_ℓ x| + e_ℓ,    (2)

where the vector x ∈ C^N denotes the discretized transmittance, being the vectorized representation of the desired object properties, y_ℓ ∈ R^M represents the corresponding low-resolution light-intensity measurements, and e_ℓ is the noise vector. The operator |·| computes the element-wise absolute value. The complex matrix A_ℓ = F^{-1} P S_ℓ F is implemented by taking the Fourier transform (F) of the object, shifting and truncating the low-frequency region (S_ℓ), multiplying it by a pupil function in the frequency domain (P), and taking the inverse Fourier transform (F^{-1}) with respect to the truncated spectrum.
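To make the forward model concrete, the operator A_ℓ can be sketched in a few lines of NumPy. This is an illustrative toy implementation rather than the code used in the paper: the ideal all-pass pupil, the patch size, and the integer spectrum shifts are simplifying assumptions.

```python
import numpy as np

def fpm_forward(x_img, shift, m=32):
    """Toy FPM forward operator A_l: F -> shift/truncate (S_l) -> pupil (P) -> F^{-1}.

    x_img : 2-D complex object transmittance.
    shift : (row, col) integer spectrum shift induced by the l-th LED.
    m     : side length of the truncated low-frequency patch
            (sets the low-resolution image size).
    """
    n = x_img.shape[0]
    spectrum = np.fft.fftshift(np.fft.fft2(x_img))      # F: object spectrum, DC centered
    spectrum = np.roll(spectrum, shift, axis=(0, 1))    # shift spectrum center to k_l
    c = n // 2
    patch = spectrum[c - m // 2:c + m // 2,
                     c - m // 2:c + m // 2]             # S_l: keep low-frequency patch
    pupil = np.ones((m, m))                             # P: ideal all-pass pupil (assumption)
    return np.fft.ifft2(np.fft.ifftshift(pupil * patch))  # F^{-1}: low-res complex field

# The recorded measurement is the noisy element-wise absolute value
# of the returned low-resolution field: y_l = |A_l x| + e_l.
```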
In practice, the inverse problem is often reformulated as (1) using a quadratic data-fidelity term

    g(x) = (1/2) Σ_{ℓ=1}^{L} ‖y_ℓ − |A_ℓ x|‖².    (3)

Examples of popular regularizers in imaging include the spatial sparsity-promoting penalty h(x) = τ‖x‖₁ and the total variation (TV) penalty h(x) = τ‖Dx‖₁, where τ > 0 is the regularization parameter and D is the discrete gradient operator [1]. Note that the final objective in FPM is nonconvex due to the absolute value operator in the data-fidelity term (3).
FISTA and ADMM are two widely used proximal algorithms for dealing with non-differentiable regularizers. The associated key operation is the proximal operator

    prox_{γh}(z) = argmin_x { (1/2)‖x − z‖² + γh(x) },

which is mathematically equivalent to an image denoiser formulated as a regularized optimization. The algorithms differ from each other in their treatment of the data-fidelity term: whereas FISTA computes the gradient ∇g, ADMM relies on the corresponding proximal operator prox_{γg}. In the context of FPM, we have

    ∇g(x) = Σ_{ℓ=1}^{L} A_ℓ^H ((|A_ℓ x| − y_ℓ) ⊙ sign(A_ℓ x)),

where (·)^H denotes the conjugate transpose, sign(z) = z/|z| is applied element-wise, and ⊙ denotes the element-wise product.
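As an illustration of this gradient, the following NumPy sketch evaluates ∇g for the amplitude-based data-fidelity g(x) = (1/2) Σ_ℓ ‖y_ℓ − |A_ℓ x|‖². The dense-matrix representation of A_ℓ, the helper name grad_g, and the zero-division guard are our assumptions, not the authors' implementation.

```python
import numpy as np

def grad_g(x, A_list, y_list):
    """Gradient (in the Wirtinger sense) of
    g(x) = (1/2) * sum_l || y_l - |A_l x| ||^2.

    A_list : list of complex matrices A_l (M x N).
    y_list : list of real amplitude measurements y_l (length M).
    """
    g = np.zeros_like(x, dtype=complex)
    for A, y in zip(A_list, y_list):
        z = A @ x
        s = z / np.maximum(np.abs(z), 1e-12)   # sign(z) = z/|z|, guarded at zero
        g += A.conj().T @ ((np.abs(z) - y) * s)
    return g
```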
2.2 Denoiser as a prior
Since the proximal operator is mathematically equivalent to regularized image denoising, the powerful idea of Venkatakrishnan et al. [12] was to replace it with a more generic denoising operator D_σ controlled by the strength parameter σ > 0. In order to be backward compatible with the traditional optimization formulation, this strength parameter is often scaled with the step-size γ as σ² = γτ, for some parameter τ > 0.
Recently, the effectiveness of PnP was extended beyond the original ADMM formulation [12] to other proximal algorithms such as primal-dual splitting and ISTA [20, 21, 22]. The formulation of PnP based on ISTA is summarized in Algorithm 1, where we introduce a sequence {q_k} that controls the shift between FISTA and ISTA. When the update

    q_{k+1} = (1/2)(1 + √(1 + 4 q_k²))

is adopted, FISTA is obtained. On the other hand, if q_k is set to one for all k, the standard ISTA is recovered. As FISTA is known to converge faster, we design PnP-SGD-FPM based on the FISTA formulation of PnP.
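Algorithm 1 is not reproduced here, but the PnP-FISTA iteration it describes can be sketched as follows. The function names grad_g and denoise are placeholders (the latter standing in for a denoiser such as BM3D), and the fixed step-size is a simplifying assumption.

```python
import numpy as np

def pnp_fista(grad_g, denoise, x0, gamma, num_iter=100):
    """Sketch of PnP-FISTA: a gradient step on g followed by a denoiser
    in place of the proximal operator, with the FISTA momentum sequence
    q_{k+1} = (1 + sqrt(1 + 4 q_k^2)) / 2.  Setting q_k = 1 for all k
    would recover PnP-ISTA."""
    x_prev = x0
    s = x0                                   # extrapolated point
    q = 1.0
    for _ in range(num_iter):
        x = denoise(s - gamma * grad_g(s))   # prox replaced by the denoiser
        q_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * q * q))
        s = x + ((q - 1.0) / q_next) * (x - x_prev)   # momentum step
        x_prev, q = x, q_next
    return x_prev
```

With the identity map as the "denoiser", this reduces to plain FISTA on g, which is a convenient sanity check.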
The theoretical convergence of PnP-ISTA was analyzed in a recent paper [29] for a convex and differentiable g. It was shown that PnP-ISTA converges to a fixed point at the rate of O(1/t) if the denoiser D_σ is an averaged operator.
3 Proposed Method
We now introduce the online algorithm PnP-SGD-FPM and describe its advantages over PnP-FISTA. In FPM, the data-fidelity term consists of a large number of component functions

    g(x) = Σ_{ℓ=1}^{L} g_ℓ(x),  with  g_ℓ(x) = (1/2)‖y_ℓ − |A_ℓ x|‖²,    (9)

where each component g_ℓ typically depends only on the subset y_ℓ of the measurements in the notation of (9). The computation of the gradient of g,

    ∇g(x) = Σ_{ℓ=1}^{L} ∇g_ℓ(x),

scales with the total number of components L, which means that when the latter is large, the classical batch PnP algorithms may become impractical in terms of speed or memory requirements. The central idea of PnP-SGD-FPM, summarized in Algorithm 2, is to approximate the gradient at every iteration with a scaled average of B component gradients

    ∇g(x) ≈ (L/B) Σ_{b=1}^{B} ∇g_{i_b}(x),

where i₁, …, i_B are independent random variables that are distributed uniformly over {1, …, L}. The minibatch size parameter B controls the number of gradient components used at every iteration. Note that, when g is convex and Lipschitz continuous, PnP-SGD is known to converge to a fixed point at a well-characterized worst-case rate [29].
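A minimal sketch of this minibatch gradient estimate, assuming the component gradients are available as a list of callables (the helper name minibatch_grad is ours):

```python
import numpy as np

def minibatch_grad(component_grads, x, B, rng=None):
    """Unbiased minibatch estimate of the full gradient sum_l grad g_l(x):
    draw B component indices uniformly at random (with replacement) and
    rescale their average by the number of components L."""
    rng = np.random.default_rng() if rng is None else rng
    L = len(component_grads)
    idx = rng.integers(0, L, size=B)   # i_1, ..., i_B ~ Uniform{0, ..., L-1}
    return (L / B) * sum(component_grads[i](x) for i in idx)
```

Because each index is drawn uniformly, the expectation of this estimate equals the full gradient, while its cost per iteration scales with B rather than L.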
4 Numerical Validation
In this section, we validate PnP-SGD-FPM on simulated and experimental data by considering two representative denoisers: TV [1] and BM3D [13]. Note that our focus is to demonstrate the effectiveness of the proposed PnP method for FPM rather than to test different denoisers, although the algorithm is readily compatible with other state-of-the-art denoising methods.
4.1 Experimental Setup
We set up the simulation to match the FPM system used for the experimental data. The sample is placed 70 mm below a 32 × 32 surface-mounted LED array with a 4 mm pitch. All LED sources generate light with a wavelength of 513 nm and a bandwidth of 20 nm. We individually illuminate the sample with the 293 LEDs centered in the array and record the corresponding intensity measurements with a camera placed under the sample. The numerical aperture (NA) of the objective is 0.2. We achieve a synthetic NA of around 0.7 by summing the NAs of the objective and the illumination.
4.2 Benefits of PnP-SGD-FPM
We first quantitatively analyze the performance of PnP-SGD-FPM by reconstructing six common gray-scale test images: Cameraman, House, Jet, Lenna, Pepper, and Woman. The simulated measurements were obtained by evaluating the forward model defined in (2). Additionally, the measurements were corrupted by additive white Gaussian noise (AWGN) corresponding to 40 dB of input signal-to-noise ratio (SNR). The SNR is also used as a metric for numerically evaluating the quality of reconstruction. In simulations, all algorithmic hyperparameters were optimized for the best SNR performance with respect to the original image. We use average SNR to indicate the SNR averaged over all the test images.
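For reference, the SNR metric can be sketched in NumPy, assuming the standard definition SNR = 20·log10(‖x‖ / ‖x − x̂‖) with respect to the original image x:

```python
import numpy as np

def snr_db(x_ref, x_hat):
    """Reconstruction SNR in dB: 20 * log10(||x_ref|| / ||x_ref - x_hat||)."""
    return 20.0 * np.log10(np.linalg.norm(x_ref) / np.linalg.norm(x_ref - x_hat))
```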
Figure 2 illustrates the evolution of the average SNR across iterations for different priors and PnP variants. With a minibatch of 60 measurements, the online algorithms randomly select a different subset of measurements at each iteration, whereas the batch algorithms use the same measurements throughout the reconstruction. Hence, as expected, by eventually cycling through all the measurements, NoReg-Online achieves a higher average SNR than NoReg-Batch, which only uses the fixed illuminations. Additionally, the performance is further enhanced by using priors. For example, TV-Online and BM3D-Online increase the average SNR from 18.48 dB to 19.42 dB and 19.74 dB, respectively. A visual illustration on Cameraman is presented in Figure 1, where BM3D-Full is shown as a reference. We observe that, even with only 60 measurements per iteration, the solution of BM3D-Online is only 0.25 dB lower than that of the batch PnP algorithm using all 293 measurements.
Figure 3 compares the average SNR performance of online and batch PnP algorithms within a fixed run-time. In the test, BM3D-Online uses only 60 measurements per iteration, while BM3D-Full uses all 293 measurements. The lower per-iteration cost makes the reconstruction of BM3D-Online substantially faster than that of its batch rival. In particular, the average single-iteration run-times of BM3D-Online and BM3D-Full were 9.07 seconds and 19.66 seconds, respectively. We also note that BM3D-Online and BM3D-Full eventually converge to the same average SNR, which agrees with the plots in Figure 2. Additionally, PnP-SGD-FPM achieves a substantial speedup due to its reduced per-iteration computational complexity, which makes the algorithm applicable to very large datasets.
4.3 Validation on real data
We now validate the performance of PnP-SGD-FPM on experimental FPM data. The sample used in the experiment consists of human cervical adenocarcinoma epithelial (HeLa) cells. The system corresponds to the FPM setup described in Section 4.1, with a total of 293 measurements available for reconstruction.
Figure 4 compares the images of HeLa cells reconstructed by PnP-SGD-FPM and PnP-FISTA. The green arrows in the white rectangles highlight the artifacts. We consider a scenario with a limited memory budget sufficient only for 60 measurements. Since the batch algorithms use a fixed subset of measurements, they produce strong unnatural features, such as the horizontal streaking artifacts in NoReg-Batch. Both TV-Batch and BM3D-Batch mitigate this artifact by using priors, but they generate blockiness and vertical grids in the cell, respectively. Online algorithms generally improve the visual quality by making use of all the data. Finally, BM3D-Online alleviates the artifacts and reconstructs a high-quality image, while NoReg-Online and TV-Online still exhibit streaking artifacts and blockiness, respectively.
In this paper, we proposed a novel online plug-and-play algorithm for regularized Fourier ptychographic microscopy. Numerical simulations demonstrate that PnP-SGD-FPM converges to a nearly optimal average SNR in a shorter amount of time. The experiments on real FPM data confirm the effectiveness and efficiency of PnP-SGD-FPM in practice and show its potential for other imaging applications. More generally, this work demonstrates the potential of PnP-SGD-FPM for solving inverse problems beyond traditional convex data-fidelity terms.
-  L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Physica D, vol. 60, no. 1–4, pp. 259–268, November 1992.
-  M. A. T. Figueiredo and R. D. Nowak, “Wavelet-based image estimation: An empirical Bayes approach using Jeffreys’ noninformative prior,” IEEE Trans. Image Process., vol. 10, no. 9, pp. 1322–1331, September 2001.
-  M. Elad and M. Aharon, “Image denoising via sparse and redundant representations over learned dictionaries,” IEEE Trans. Image Process., vol. 15, no. 12, pp. 3736–3745, December 2006.
-  A. Danielyan, V. Katkovnik, and K. Egiazarian, “BM3D frames and variational image deblurring,” IEEE Trans. Image Process., vol. 21, no. 4, pp. 1715–1728, April 2012.
-  M. A. T. Figueiredo and R. D. Nowak, “An EM algorithm for wavelet-based image restoration,” IEEE Trans. Image Process., vol. 12, no. 8, pp. 906–916, August 2003.
-  I. Daubechies, M. Defrise, and C. De Mol, “An iterative thresholding algorithm for linear inverse problems with a sparsity constraint,” Commun. Pure Appl. Math., vol. 57, no. 11, pp. 1413–1457, November 2004.
-  J. Bect, L. Blanc-Feraud, G. Aubert, and A. Chambolle, “A ℓ1-unified variational framework for image restoration,” in Proc. ECCV, New York: Springer, 2004, vol. 3024, pp. 1–13.
-  A. Beck and M. Teboulle, “Fast gradient-based algorithm for constrained total variation image denoising and deblurring problems,” IEEE Trans. Image Process., vol. 18, no. 11, pp. 2419–2434, November 2009.
-  J. Eckstein and D. P. Bertsekas, “On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators,” Mathematical Programming, vol. 55, pp. 293–318, 1992.
-  M. V. Afonso, J. M. Bioucas-Dias, and M. A. T. Figueiredo, “Fast image recovery using variable splitting and constrained optimization,” IEEE Trans. Image Process., vol. 19, no. 9, pp. 2345–2356, September 2010.
-  S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1–122, 2011.
-  S. V. Venkatakrishnan, C. A. Bouman, and B. Wohlberg, “Plug-and-play priors for model based reconstruction,” in Proc. IEEE Global Conf. Signal Process. and Inf. Process. (GlobalSIP), Austin, TX, USA, December 3-5, 2013, pp. 945–948.
-  K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising by sparse 3-D transform-domain collaborative filtering,” IEEE Trans. Image Process., vol. 16, no. 8, pp. 2080–2095, August 2007.
-  Y. Chen and T. Pock, “Trainable nonlinear reaction diffusion: A flexible framework for fast and effective image restoration,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 6, pp. 1256–1272, June 2017.
-  S. Sreehari, S. V. Venkatakrishnan, B. Wohlberg, G. T. Buzzard, L. F. Drummy, J. P. Simmons, and C. A. Bouman, “Plug-and-play priors for bright field electron tomography and sparse interpolation,” IEEE Trans. Comp. Imag., vol. 2, no. 4, pp. 408–423, December 2016.
-  S. H. Chan, X. Wang, and O. A. Elgendy, “Plug-and-play ADMM for image restoration: Fixed-point convergence and applications,” IEEE Trans. Comp. Imag., vol. 3, no. 1, pp. 84–98, March 2017.
-  A. M. Teodoro, J. M. Bioucas-Dias, and M. A. T. Figueiredo, “Image restoration and reconstruction using variable splitting and class-adapted image priors,” in Proc. IEEE Int. Conf. Image Proc. (ICIP), Phoenix, AZ, USA, September 25-28, 2016, pp. 3518–3522.
-  K. Zhang, W. Zuo, S. Gu, and L. Zhang, “Learning deep CNN denoiser prior for image restoration,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2017.
-  A. Teodoro, J. M. Bioucas-Dias, and M. A. T. Figueiredo, “Scene-adapted plug-and-play algorithm with convergence guarantees,” in Proc. IEEE Int. Workshop on Machine Learning for Signal Processing, Tokyo, Japan, September 25-28, 2017.
-  S. Ono, “Primal-dual plug-and-play image restoration,” IEEE Signal. Proc. Let., vol. 24, no. 8, pp. 1108–1112, 2017.
-  T. Meinhardt, M. Moeller, C. Hazirbas, and D. Cremers, “Learning proximal operators: Using denoising networks for regularizing inverse imaging problems,” in Proc. IEEE Int. Conf. Comp. Vis. (ICCV), Venice, Italy, October 22-29, 2017, pp. 1799–1808.
-  U. S. Kamilov, H. Mansour, and B. Wohlberg, “A plug-and-play priors approach for solving nonlinear imaging inverse problems,” IEEE Signal. Proc. Let., vol. 24, no. 12, pp. 1872–1876, December 2017.
-  Y. Romano, M. Elad, and P. Milanfar, “The little engine that could: Regularization by denoising (RED),” SIAM J. Imaging Sci., vol. 10, no. 4, pp. 1804–1844, 2017.
-  C. Metzler, P. Schniter, A. Veeraraghavan, and R. Baraniuk, “prDeep: Robust phase retrieval with a flexible deep network,” in Proceedings of the 35th International Conference on Machine Learning (ICML), 2018.
-  L. Bottou and O. Bousquet, “The tradeoffs of large scale learning,” in Proc. Advances in Neural Information Processing Systems 20, Vancouver, BC, Canada, December 3-6, 2007, pp. 161–168.
-  G. Zheng, R. Horstmeyer, and C. Yang, “Wide-field, high-resolution Fourier ptychographic microscopy,” Nature Photonics, vol. 7, pp. 739–745, 2013.
-  L. Tian, Z. Liu, L. H. Yeh, M. Chen, J. Zhong, and L. Waller, “Computational illumination for high-speed in vitro Fourier ptychographic microscopy,” Optica, vol. 2, no. 10, pp. 904–911, October 2015.
-  Y. Zhang, P. Song, J. Zhang, and Q. Dai, “Fourier ptychographic microscopy with sparse representation,” Scientific Reports, vol. 7, no. 1, pp. 8664, 2017.
-  Y. Sun, B. Wohlberg, and U. S. Kamilov, “An online plug-and-play algorithm for regularized image reconstruction,” arXiv:1809.04693, 2018.