1 Introduction
In computational microscopy, the task of reconstructing an image of an unknown object from a collection of noisy lightintensity measurements is often formulated as an optimization problem
(1) 
where the datafidelity term ensures the consistency with the measurements and the regularizer promotes a solution with desirable prior properties such as nonnegativity, selfsimilarity and transformdomain sparsity [1, 2, 3, 4]. Due to the nondifferentiability of most regularizers, proximal methods—such as variants of iterative shrinkage/thresholding algorithm (ISTA) [5, 6, 7, 8] and alternating direction method of multipliers (ADMM) [9, 10, 11]—are commonly used for solving (1). These algorithms avoid differentiating the regularizer by using a mathematical concept known as the proximal operator, which is itself an optimization problem equivalent to regularized image denoising.
Inspired by this mathematical equivalence, Venkatakrishnan et al. [12] introduced the plugandplay priors (PnP) framework for image reconstruction. The key idea in PnP is to replace the proximal operator in an iterative algorithm with a stateoftheart image denoiser, such as BM3D [13] or TNRD [14], which does not necessarily have a corresponding regularization objective. Although this implies that PnP methods generally lose interpretability as optimization problems, the framework has gained popularity because of its success in a range of applications in the context of imaging inverse problems [15, 16, 17, 18, 19, 20, 21, 22]. Note that regularization by denoising (RED) [23, 24] is an alternative approach for integrating a denoiser into an imaging problem. The key difference between RED and PnP is that the former builds an explicit regularizer, while the latter relies on an implicit regularization.
Nevertheless, all current iterative PnP algorithms are based on iterative batch procedures, which means that they always calculate the updates using the full set of measured data. This makes their application impractical to very large datasets [25]. One natural example is Fourier ptychographic microscopy (FPM), where the image formation task relies on hundreds of variably illuminated intensity measurements [26, 27, 28].
A novel online extension of PnPFISTA, called PnPSGD, was recently proposed and theoretically analyzed for a convex [29]. Here, we extend these results by adapting the algorithm to FPM imaging, where is nonconvex. We show that the proposed PnPSGDFPM enables highquality FPM imaging at lower computational complexity by using only a small subset of measurements per iteration. We show on both simulated and experimentally measured FPM datasets that the algorithm substantially outperforms its batch counterparts when the memory budget is limited.
2 Background
2.1 FPM as an inverse problem
Consider an unknown object with a complex transmission function , where denotes the spatial coordinates at the object plane. A total of LED sources are used to illuminate the object. Each illumination is treated as a local plane wave with a unique spatial frequency , . The exit wave from the object is described by the product: , which indicates that the center of the sample’s spectrum is shifted to [26, 27, 28]
. At the pupil plane, the shifted Fourier transformation of the exit wave is further filtered by the pupil function
. For a single illumination, the discrete FPM model can be mathematically described by the following inverse problem(2) 
where denotes the discretized transmittance, with
being the vectorized representation of the desired object properties,
represents the corresponding lowresolution lightintensity measurements, and is the noise vector. The operator computes the elementwise absolute value. The complex matrix is implemented by taking the Fourier transform () of the object, shifting and truncating the low frequency region (), multiplying it by a pupil function in the frequency domain (), and taking the inverse Fourier transform () with respect to the truncated spectrum.In practice, the inverse problem is often reformulated as (1) using a quadratic datafidelity term
(3) 
Examples of popular regularizers in imaging include the spatial sparsitypromoting penalty and total variation (TV) penalty , where is the regularization parameter and is the discrete gradient operator [1]. Note that the final objective in FPM is nonconvex due to the absolute value operator in the datafidelity term (3).
FISTA and ADMM are two widelyused proximal algorithms for dealing with nondifferentiable regularizers. The associated key operation is the proximal operator
(4) 
which is mathematically equivalent to an image denoiser formulated as a regularized optimization [12]. The algorithms differ from each other in their treatment of the datafidelity. Whereas FISTA computes the gradient , ADMM relies on the corresponding proximal operator . In the context of FPM, we have
(5)  
with  (6) 
where denotes the conjugate transpose, and
(7) 
Because the datafit in FPM is not convex, the optimization in (7) is difficult to solve. Additionally, the calculation of (7) is computationally expensive when the number of measurements is large.
2.2 Denoiser as a prior
Since the proximal operator is mathematically equivalent to regularized image denoising, the powerful idea of Venkatakrishnan et al. [12] was to consider replacing it with a more generic denoising operator controlled by . In order to be backward compatible with the traditional optimization formulation, this strength parameter is often scaled with the stepsize as , for some parameter .
Recently, the effectiveness of PnP was extended beyond the original ADMM formulation [12] to other proximal algorithms such as primaldual splitting and ISTA [20, 21, 22]. The formulation of PnP based on ISTA is summarized in Algorithm 1, where we introduce a sequence that controls the shift between FISTA and ISTA. When the following updates is adopted
(8) 
the FISTA is used. On the other hand, if is set to one for any , the standard ISTA is considered. As the FISTA is known to converge faster, we design the PnPSGDFPM based on the FISTA formulation of PnP.
The theoretical convergence of PnPISTA was analyzed in a recent paper [29] for convex and differentiable . It was shown that the PnPISTA converges to a fixed point at the rate of if is a averaged operator.
3 Proposed Method
We now introduce the online algorithm PnPSGDFPM and describe its advantages over PnPFISTA. In FPM, the datafidelity term consists of a large number of component functions
(9) 
where each typically depends only on the subset of the measurements . Note that in the notation (9
), the expectation is taken over a uniformly distributed random variable
. The computation of the gradient of ,(10) 
scales with the total number of components , which means that when the latter is large, the classical batch PnP algorithms may become impractical in terms of speed or memory requirements. The central idea of PnPSGDFPM, summarized in Algorithm 2, is to approximate the gradient at every iteration with an average of component gradients
(11) 
where are independent random variables that are distributed uniformly over . The minibatch size parameter controls the number of gradient components used at every iteration. Note that, when the is convex and Lipschitz continuous, the PnPSGDFPM is shown to converge to a fixed point at the worstcase rate of [29].
4 Numerical Validation
In this section, we validate our PnPSGDFPM on simulated and experimental data by considering two representative denoisers: TV [1] and BM3D [4]. Note that our focus is to demonstrate the effectiveness of the proposed PnP method for FPM rather than to test different denoisers, although the algorithm is readily compatible with other stateoftheart denoising methods.
4.1 Experimental Setup
We set up the simulation to match the FPM system used for the experimental data [27]. The sample is placed 70 mm below a 32 32 surfacemounted LED array with a 4 mm pitch. All LED sources generate light with a wavelength 513 mm and the bandwidth 20 nm. We individually illuminate the sample with 293 LEDs centered in the array and record the corresponding intensity measurements with a camera placed under the sample. The numerical aperture (NA) of the objective is 0.2 [27]. We achieve the synthetic NA of around 0.7 by summing the NA of the objective and illumination.
4.2 Benefits of PnPSGDFPM
We first quantitatively analyze the performance of PnPSGDFPM by reconstructing six common grayscale images discretized to pixels: Cameraman, House, Jet, Lenna, Pepper and Woman. The simulated measurements were obtained by solving the forward model defined in (2
). Additionally, the measurements were corrupted by an additive Gaussian noise (AWGN) corresponding to 40 dB of input signaltonoise ratio (SNR). The SNR is also used as a metric for numerically evaluating the quality of reconstruction. In simulations, all algorithmic hyperparameters were optimized for the best SNR performance with respect to the original image. We use
average SNR to indicate the SNR averaged over all the test images.Figure 2 illustrates the evolution of average SNR across iterations for different priors and PnP variants. With the minibatch , the online algorithms randomly select different subset of measurements at each iteration, whereas the batch algorithms use the same measurements in the reconstruction. Hence, as expected, by eventually cycling through all the measurements, NoRegOnline achieves a higher average SNR than NoRegBatch, which only uses the fixed illuminations. Additionally, the performance is further enhanced by using priors. For example, TVOnline and BM3DOnline increase the average SNR from 18.48 dB to 19.42 dB and 19.74 dB, respectively. A visual illustration on Cameraman is presented in Figure 1, where BM3DFull is shown as reference. We observe that, even with , the solution of BM3DOnline is only 0.25 dB lower than the batch PnP algorithm using all 293 measurements.
Figure 3 compares the average SNR performance of online and batch PnP algorithms within a fixed runtime. In the test, BM3DOnline uses only 60 measurements per iteration, while BM3DFull uses all 293 measurements. The lower per iteration cost makes the reconstruction of BM3DOnline substantially faster than that of its batch rival. In particular, the averaged singleiteration runtime of BM3DOnline and BM3DFull was 9.07 seconds and 19.66 seconds, respectively. We also note that BM3DOnline and BM3DFull eventually converge to the same average SNR, which agrees with the plots in Figure 2. Additionally, PnPSGDFPM achieves a substantial speedup due to its reduction in periteration computational complexity, which makes the algorithm applicable to very large datasets.
4.3 Validation on real data
We now validate the performance of PnPSGDFPM on experimental FPM data. The sample used in the experiment consists of the human cervical adenocarcinoma epithelial (HeLa) cells [27]. The system corresponds to the FPM described in Section 4.1 with total 293 measurements for reconstruction.
Figure 4 compares the images of HeLa cells reconstructed by PnPSGDFPM and PnPFISTA. Each image has the resolution of pixels. The green arrows in the white rectangles highlight the artifacts. We consider the scenario with a limited memory budget sufficient only for 60 measurements. Since batch algorithms use a fixed subset of measurements, they produce strong unnatural features, such as the horizontal streaking artifacts in NoRegBatch. Both TVBatch and BM3DBatch mitigate this artifact by using priors, but they generate blockiness and vertical grids in the cell, respectively. Online algorithms generally improve the visual quality by making use of all the data. Finally, BM3DOnline alleviates the artifacts and reconstructs a highquality image, while NoRegOnline and TVOnline still have streaking artifacts and blockiness.
5 Conclusion
In this paper, we propose a novel online plugandplay algorithm for the regularized Fourier ptychographic microscopy. Numerical simulations demonstrate that PnPSGDFPM converges to a nearlyoptimal average SNR in a shorter amount of time. The experiments for FPM confirm the effectiveness and efficiency of PnPSGDFPM in practice, and shows its potential for other imaging applications. More generally, this work shows the potential of PnPSGDFPM to solve inverse problems beyond traditional convex datafidelity terms.
References
 [1] L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Physica D, vol. 60, no. 1–4, pp. 259–268, November 1992.

[2]
M. A. T. Figueiredo and R. D. Nowak,
“Waveletbased image estimation: An empirical Bayes approach using Jeffreys’ noninformative prior,”
IEEE Trans. Image Process., vol. 10, no. 9, pp. 1322–1331, September 2001.  [3] M. Elad and M. Aharon, “Image denoising via sparse and redundant representations over learned dictionaries,” IEEE Trans. Image Process., vol. 15, no. 12, pp. 3736–3745, December 2006.
 [4] A. Danielyan, V. Katkovnik, and K. Egiazarian, “BM3D frames and variational image deblurring,” IEEE Trans. Image Process., vol. 21, no. 4, pp. 1715–1728, April 2012.
 [5] M. A. T. Figueiredo and R. D. Nowak, “An EM algorithm for waveletbased image restoration,” IEEE Trans. Image Process., vol. 12, no. 8, pp. 906–916, August 2003.
 [6] I. Daubechies, M. Defrise, and C. De Mol, “An iterative thresholding algorithm for linear inverse problems with a sparsity constraint,” Commun. Pure Appl. Math., vol. 57, no. 11, pp. 1413–1457, November 2004.
 [7] J. Bect, L. BlancFeraud, G. Aubert, and A. Chambolle, “A unified variational framework for image restoration,” in Proc. ECCV, Springer, Ed., New York, 2004, vol. 3024, pp. 1–13.
 [8] A. Beck and M. Teboulle, “Fast gradientbased algorithm for constrained total variation image denoising and deblurring problems,” IEEE Trans. Image Process., vol. 18, no. 11, pp. 2419–2434, November 2009.
 [9] J. Eckstein and D. P. Bertsekas, “On the DouglasRachford splitting method and the proximal point algorithm for maximal monotone operators,” Mathematical Programming, vol. 55, pp. 293–318, 1992.
 [10] M. V. Afonso, J. M.BioucasDias, and M. A. T. Figueiredo, “Fast image recovery using variable splitting and constrained optimization,” IEEE Trans. Image Process., vol. 19, no. 9, pp. 2345–2356, September 2010.

[11]
S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein,
“Distributed optimization and statistical learning via the
alternating direction method of multipliers,”
Foundations and Trends in Machine Learning
, vol. 3, no. 1, pp. 1–122, 2011.  [12] S. V. Venkatakrishnan, C. A. Bouman, and B. Wohlberg, “Plugandplay priors for model based reconstruction,” in Proc. IEEE Global Conf. Signal Process. and INf. Process. (GlobalSIP), Austin, TX, USA, December 35, 2013, pp. 945–948.
 [13] K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image denoising by sparse 3D transformdomain collaborative filtering,” IEEE Trans. Image Process., vol. 16, no. 16, pp. 2080–2095, August 2007.
 [14] Y. Chen and T. Pock, “Trainable nonlinear reaction diffusion: A flexible framework for fast and effective image restoration,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 6, pp. 1256–1272, June 2017.

[15]
S. Sreehari, S. V. Venkatakrishnan, B. Wohlberg, G. T. Buzzard, L. F. Drummy,
J. P. Simmons, and C. A. Bouman,
“Plugandplay priors for bright field electron tomography and sparse interpolation,”
IEEE Trans. Comp. Imag., vol. 2, no. 4, pp. 408–423, December 2016.  [16] S. H. Chan, X. Wang, and O. A. Elgendy, “Plugandplay ADMM for image restoration: Fixedpoint convergence and applications,” IEEE Trans. Comp. Imag., vol. 3, no. 1, pp. 84–98, March 2017.
 [17] A. M. Teodoro, J. M. BiocasDias, and M. A. T. Figueiredo, “Image restoration and reconstruction using variable splitting and classadapted image priors,” in Proc. IEEE Int. Conf. Image Proc. (ICIP), Phoenix, AZ, USA, September 2528, 2016, pp. 3518–3522.

[18]
K. Zhang, W. Zuo, S. Gu, and L. Zhang,
“Learning deep CNN denoiser prior for image restoration,”
in
Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR)
, 2017.  [19] A. Teodoro, J. M. BioucasDias, and M. A. T. Figueiredo, “Sceneadapted plugandplay algorithm with convergence guarantees,” in Proc. IEEE Int. Workshop on Machine Learning for Signal Processing, Tokyo, Japan, September 2528, 2017.
 [20] S. Ono, “Primaldual plugandplay image restoration,” IEEE Signal. Proc. Let., vol. 24, no. 8, pp. 1108–1112, 2017.
 [21] T. Meinhardt, M. Moeller, C. Hazirbas, and D. Cremers, “Learning proximal operators: Using denoising networks for regularizing inverse imaging problems,” in Proc. IEEE Int. Conf. Comp. Vis. (ICCV), Venice, Italy, October 2229, 2017, pp. 1799–1808.
 [22] U. S. Kamilov, H. Mansour, and B. Wohlberg, “A plugandplay priors approach for solving nonlinear imaging inverse problems,” IEEE Signal. Proc. Let., vol. 24, no. 12, pp. 1872–1876, December 2017.
 [23] Y. Romano, M. Elad, and P. Milanfar, “The little engine that could: Regularization by denoising (RED),” SIAM J. Imaging Sci., vol. 10, no. 4, pp. 1804–1844, 2017.
 [24] C. Metzler, P. Schniter, A. Veeraraghavan, and R. Baraniuk, “prDeep: Robust phase retrieval with a flexible deep network,” in Proceedings of the 35th International Conference on Machine Learning (ICML), 2018.
 [25] L. Bottou and O. Bousquet, “The tradeoffs of large scale learning,” in Proc. Advances in Neural Information Processing Systems 20, Vancouver, BC, Canada, December 36, 2007, pp. 161–168.
 [26] G. Zheng, R. Horstmeyer, and C. Yang, “Widefield, highresolution fourier ptychographic microscopy,” Nature Photonics, vol. 7, pp. 739 EP –, 07 2013.
 [27] L. Tian, Z. Liu, L. H. Yeh, M. Chen, J. Zhong, and L. Waller, “Computational illumination for highspeed in vitro fourier ptychographic microscopy,” Optica, vol. 2, no. 10, pp. 904–911, Oct 2015.
 [28] Y. Zhang, P. Song, J. Zhang, and Q. Dai, “Fourier ptychographic microscopy with sparse representation,” Scientific Reports, vol. 7, no. 1, pp. 8664, 2017.
 [29] Y. Sun, B. Wohlberg, and U. S. Kamilov, “An online plugandplay algorithm for regularized image reconstruction,” arXiv:1809.04693, 2018.