1 Introduction
The problem of reconstructing the spatial distribution of the dielectric permittivity of an unknown object from the measurements of the light it scatters is common in many applications such as tomographic microscopy [1] and digital holography [2]. The problem is often formulated as a linear inverse problem by adopting scattering models based on the first Born [3] or Rytov [4] approximations. However, these linear approximations are inaccurate when scattering is strong, which leads to reconstruction artifacts for objects that are large or have high permittivity contrasts [5]. For strongly scattering objects, it is preferable to use nonlinear measurement models that can account for multiple light scattering inside the object [6, 7, 8, 9, 10, 11, 12, 13, 14].
When adopting a nonlinear measurement model, it is common to formulate image reconstruction as an optimization problem. The objective function in the optimization typically includes two terms: a data-fidelity term that ensures that the final image is consistent with measured data, and a regularizer that mitigates the ill-posedness of the problem by promoting solutions with desirable properties [15]. For example, one of the most widely adopted regularizers is total variation (TV), which preserves image edges while promoting smoothness [16]. TV is often interpreted as a sparsity-enforcing penalty on the image gradient and has proven to be successful in the context of diffraction tomography with and without multiple scattering [17, 18, 19, 21, 12, 13, 20].
Despite the recent progress in regularized image reconstruction under multiple scattering, the corresponding optimization problem is difficult to solve. The challenging aspects are the nonconvex nature of the objective and the large amount of data that needs to be processed in typical imaging applications. In particular, when the scattering is strong, the problem becomes highly nonconvex, which negatively impacts both the speed of reconstruction and the quality of the final image [14].
In this paper, we consider a fundamentally different approach to the problem of image reconstruction under multiple scattering. Recently, several results have interpreted multiple scattering as a forward pass of a convolutional neural network (CNN) [10, 21, 13]. This view inspires us to reconstruct the object by designing another CNN that is specifically trained to invert multiple scattering in a purely data-driven fashion. While our approach is consistent with the recent trend of using deep learning architectures for image reconstruction [22, 23, 24, 25, 26, 27, 28, 29, 30], it is fundamentally different in the sense that, due to multiple scattering, our measurement operator is both nonlinear and object dependent (and hence unknown). Our approach is also related to the recent work on reverse photon migration for diffuse optical tomography [31]. However, our focus is on diffractive imaging, where the light propagation is assumed to be deterministic, rather than stochastic as in [31]. Finally, we extensively validated the proposed method on several simulated and real datasets by comparing the method against recent optimization-based approaches based on the Lippmann-Schwinger (LS) model and the TV regularizer [12, 14]. Our results show that it is possible to invert multiple scattering by training a CNN, even when imaging strongly scattering objects for which optimization-based approaches underperform. To the best of our knowledge, the results here are the first to show the potential of deep learning to reconstruct high-quality images from multiple scattered light measurements.
2 Nonlinear diffractive imaging
In this section, we describe the traditional optimization-based approach for nonlinear diffractive imaging. We first review the image reconstruction and then discuss the details of the physical model for multiple scattering.
2.1 Nonlinear inverse problem
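We consider an imaging inverse problem

$$\mathbf{y} = \mathbf{H}(\mathbf{x}) + \mathbf{e}, \tag{1}$$

where the goal is to recover the unknown image $\mathbf{x} \in \mathbb{R}^N$ from the noisy measurements $\mathbf{y} \in \mathbb{C}^M$. The measurement operator $\mathbf{H}: \mathbb{R}^N \to \mathbb{C}^M$ models the response of the imaging system and the vector $\mathbf{e} \in \mathbb{C}^M$ represents the measurement noise, which is often assumed to be independent and identically distributed (i.i.d.) Gaussian. When the inverse problem is linear, the measurement operator is represented as a measurement matrix $\mathbf{H} \in \mathbb{C}^{M \times N}$.

In practice, problems such as (1) are often ill-posed; the standard approach for solving them is by formulating an optimization problem

$$\widehat{\mathbf{x}} = \arg\min_{\mathbf{x}} \left\{ \mathcal{D}(\mathbf{x}) + \mathcal{R}(\mathbf{x}) \right\}, \tag{2}$$

where the data-fidelity term $\mathcal{D}$ ensures that the final image is consistent with measured data and the regularizer $\mathcal{R}$ promotes solutions with desirable properties. Two common regularizers for images include the spatial sparsity-promoting penalty $\mathcal{R}(\mathbf{x}) = \tau \|\mathbf{x}\|_{\ell_1}$ and the total variation (TV) penalty $\mathcal{R}(\mathbf{x}) = \tau \|\mathbf{D}\mathbf{x}\|_{\ell_1}$, where $\tau > 0$ is the regularization parameter and $\mathbf{D}$ is the discrete gradient operator [16, 32, 33]. Two common methods for solving optimization problems of form (2) are FISTA [34] and ADMM [35], both of which were successfully applied to the problem of image reconstruction from scattered light data [11, 12, 20, 14].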
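As a minimal sketch of how a problem of form (2) can be solved, the snippet below runs plain ISTA (the non-accelerated relative of FISTA) on an $\ell_1$-regularized problem. It assumes a real-valued linear measurement matrix and a quadratic data-fidelity term; the names `ista`, `soft_threshold`, and `objective`, as well as the toy problem sizes, are illustrative choices and not part of the method described in this paper.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(H, y, x, tau):
    """0.5 * ||y - H x||^2 + tau * ||x||_1 (assumed quadratic data fidelity)."""
    return 0.5 * np.sum((y - H @ x) ** 2) + tau * np.sum(np.abs(x))

def ista(H, y, tau, num_iter=200):
    """Plain ISTA for a real matrix H; FISTA would add a momentum step."""
    step = 1.0 / np.linalg.norm(H, 2) ** 2   # 1 / Lipschitz constant of the gradient
    x = np.zeros(H.shape[1])
    for _ in range(num_iter):
        grad = H.T @ (H @ x - y)             # gradient of the data-fidelity term
        x = soft_threshold(x - step * grad, step * tau)
    return x

# Toy problem: sparse signal, random measurement matrix (hypothetical sizes).
rng = np.random.default_rng(0)
H = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[[5, 37, 80]] = [1.0, -2.0, 1.5]
y = H @ x_true + 0.01 * rng.standard_normal(40)
x_hat = ista(H, y, tau=0.1)
```

With a nonlinear operator, the gradient step would use the Jacobian of the forward model instead of `H.T`, and the objective is then no longer convex.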
2.2 Multiple-scattering model
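Consider the scattering problem illustrated in Fig. 1, where an object of the permittivity distribution $\epsilon(\mathbf{r})$ in the bounded domain $\Omega$ is immersed into a background medium of permittivity $\epsilon_b$ and illuminated with the incident electric field $u_{\text{in}}(\mathbf{r})$. We assume that the incident field is monochromatic and coherent, and it is known inside $\Omega$ and at the locations of the sensors. The result of object-light interaction is measured at the location of the sensors as a scattered field $u_{\text{sc}}(\mathbf{r})$. The multiple scattering of light can be accurately described by the Lippmann-Schwinger equation [36] inside the image domain

$$u(\mathbf{r}) = u_{\text{in}}(\mathbf{r}) + \int_{\Omega} g(\mathbf{r} - \mathbf{r}')\, f(\mathbf{r}')\, u(\mathbf{r}')\, d\mathbf{r}', \quad \mathbf{r} \in \Omega, \tag{3}$$

where $u(\mathbf{r}) = u_{\text{in}}(\mathbf{r}) + u_{\text{sc}}(\mathbf{r})$ is the total electric field, $f(\mathbf{r}) = k^2 (\epsilon(\mathbf{r}) - \epsilon_b)$ is the scattering potential, which is assumed to be real, and $k = 2\pi/\lambda$ is the wavenumber in vacuum. The function $g(\mathbf{r})$ is the Green's function, defined as

$$g(\mathbf{r}) = \frac{j}{4} H_0^{(1)}\left(k_b \|\mathbf{r}\|_{\ell_2}\right), \tag{4}$$

where $k_b = k\sqrt{\epsilon_b}$ is the wavenumber of the background medium and $H_0^{(1)}$ is the zero-order Hankel function of the first kind. Note that the knowledge of the total field $u$ inside the image domain $\Omega$ enables the prediction of the scattered field at the sensor domain $\Gamma$

$$u_{\text{sc}}(\mathbf{r}) = \int_{\Omega} g(\mathbf{r} - \mathbf{r}')\, f(\mathbf{r}')\, u(\mathbf{r}')\, d\mathbf{r}', \quad \mathbf{r} \in \Gamma. \tag{5}$$

The discretization and combination of (3) and (5) leads to the following matrix-vector description of the scattering problem

$$\mathbf{y} = \mathbf{S}(\mathbf{u} \odot \mathbf{x}) + \mathbf{e}, \tag{6a}$$

$$\mathbf{u} = \mathbf{u}_{\text{in}} + \mathbf{G}(\mathbf{u} \odot \mathbf{x}), \tag{6b}$$

where $\mathbf{x} \in \mathbb{R}^N$ is the discretized scattering potential $f$, $\mathbf{y} \in \mathbb{C}^M$ is the measured scattered field $u_{\text{sc}}$ at $\Gamma$, $\mathbf{u}_{\text{in}} \in \mathbb{C}^N$ is the input field $u_{\text{in}}$ inside $\Omega$, $\mathbf{S} \in \mathbb{C}^{M \times N}$ is the discretization of the Green's function evaluated at $\Gamma$, $\mathbf{G} \in \mathbb{C}^{N \times N}$ is the discretization of the Green's function evaluated inside $\Omega$, $\odot$ denotes a componentwise multiplication between two vectors, and $\mathbf{e} \in \mathbb{C}^M$ models the additive noise at the measurements. Using the shorthand notation $\mathbf{A} = \mathbf{I} - \mathbf{G}\,\text{diag}(\mathbf{x})$, where $\mathbf{I} \in \mathbb{R}^{N \times N}$ is the identity matrix and $\text{diag}(\cdot)$ is an operator that forms a diagonal matrix from its argument, we can formally specify the nonlinear forward model as follows

$$\mathbf{H}(\mathbf{x}) = \mathbf{S}\left(\mathbf{x} \odot (\mathbf{A}^{-1}\mathbf{u}_{\text{in}})\right). \tag{7}$$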
Recent work has shown that the computation of the operator in (7) can be interpreted as a CNN and that the gradient of the corresponding data-fidelity term can be efficiently evaluated, enabling efficient optimization [21, 12, 13].
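To make the structure of the discretized forward model concrete, the following sketch evaluates it for a small problem. It assumes the model takes the form of a total-field equation $\mathbf{u} = \mathbf{u}_{\text{in}} + \mathbf{G}(\mathbf{u} \odot \mathbf{x})$ followed by the sensor mapping $\mathbf{y} = \mathbf{S}(\mathbf{u} \odot \mathbf{x})$, and it solves the linear system with a dense direct solver rather than an iterative one. The matrices below are random stand-ins, not actual Green's functions, and the function name `forward_scattering` is hypothetical.

```python
import numpy as np

def forward_scattering(x, u_in, G, S):
    """Noise-free forward model: solve (I - G diag(x)) u = u_in, then map to sensors."""
    N = x.size
    A = np.eye(N) - G * x[np.newaxis, :]   # G @ diag(x) scales column n of G by x[n]
    u = np.linalg.solve(A, u_in)           # total field inside the image domain
    y = S @ (u * x)                        # scattered field at the sensors
    return y, u

rng = np.random.default_rng(0)
N, M = 16, 8                               # pixels and sensors (toy sizes)
# Random stand-ins for the discretized Green's functions (hypothetical values).
G = 0.05 * (rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N)))
S = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
u_in = np.exp(1j * rng.uniform(0, 2 * np.pi, N))
x = rng.uniform(0.0, 1.0, N)

y, u = forward_scattering(x, u_in, G, S)
y_born = S @ (u_in * x)                    # first Born approximation: u replaced by u_in
```

The dependence of `u` on `x` is what makes the model nonlinear; the first Born approximation in the last line removes that dependence and is therefore linear in `x`.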
3 Scattering decoder
We now describe our proposed deep learning approach called Scattering Decoder (ScaDec).
3.1 Backpropagation
The general framework of our approach is illustrated in Fig. 2. The first step in the method is backpropagation, which simply transforms the collected data from the measurement domain to the image domain. We define the backpropagation of the measurements generated by the $k$th transmitter as

$$\mathbf{w}_k = \mathbf{P}_k \mathbf{y}_k = \text{diag}(\mathbf{u}_{\text{in},k}^{*})\, \mathbf{S}^{\mathsf{H}} \mathbf{y}_k, \tag{8}$$

where vector $\mathbf{y}_k \in \mathbb{C}^M$ are the measurements consistent with the $k$th transmitter and collected by $M$ receivers, and matrix $\mathbf{P}_k \in \mathbb{C}^{N \times M}$ is the backpropagation operator. Inside the operator $\mathbf{P}_k$, matrix $\mathbf{S}^{\mathsf{H}}$ is the Hermitian transpose of the discretized Green's function $\mathbf{S}$, and $\mathbf{u}_{\text{in},k}^{*}$ is the elementwise conjugate of the incident light field emitted by the $k$th transmitter. The output $\mathbf{w}_k$ is a complex vector with $N$ elements, which matches the number of pixels in the original image. When the data is collected with multiple transmissions, we define the backpropagation of $K$ transmitters as

$$\mathbf{w} = \sum_{k=1}^{K} \mathbf{w}_k, \tag{9}$$

where vector $\mathbf{w}$ is the linear combination of $\mathbf{w}_k$ and $K$ denotes the number of transmitters. Note that the backpropagation (9) does not rely on the actual forward model in (7), which is both nonlinear and object dependent. Remarkably, as we shall see, our simple backpropagation followed by a specific CNN architecture will be sufficient to recover a high-quality image given multiple scattered measurements.
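The backpropagation step can be sketched in a few lines of NumPy. The snippet assumes that each per-transmitter backpropagation multiplies the measurements by the Hermitian transpose of the sensor Green's function and then by the conjugate incident field, and that the linear combination over transmitters is a plain sum; both the array layout and the function name `backpropagate` are illustrative assumptions. The real and imaginary parts of the result are stacked as two feature maps for the subsequent CNN.

```python
import numpy as np

def backpropagate(Y, U_in, S):
    """Sum over transmitters of conj(u_in_k) * (S^H y_k).

    Y:    (K, M) complex measurements, one row per transmitter
    U_in: (K, N) complex incident fields inside the image domain
    S:    (M, N) complex discretized Green's function (image -> sensors)
    """
    back = Y @ S.conj()                        # row k equals S^H y_k, shape (K, N)
    return np.sum(np.conj(U_in) * back, axis=0)

rng = np.random.default_rng(0)
K, M, side = 4, 6, 3                           # transmitters, receivers, image side (toy sizes)
N = side * side
S = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
U_in = rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))
Y = rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))

w = backpropagate(Y, U_in, S)
# Two real-valued input channels for the CNN: real and imaginary parts of w.
features = np.stack([w.real, w.imag]).reshape(2, side, side)
```

Because the operation is linear and fixed, it needs no training and can be precomputed once per measurement set.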
Note that since the backpropagated vector $\mathbf{w}$ is complex, we consider its real and imaginary parts as two distinct feature maps of the object. Thus, the backpropagation can be viewed as a fixed layer in a CNN with $K \times M$ inputs (the $M$ measurements from each of the $K$ transmitters) and two outputs to the subsequent layers (see Fig. 2) [37]. The weights inside the layer are characterized by the backpropagation operators $\mathbf{P}_k$, and the activation functions for the output nodes are $\text{Re}(\cdot)$ and $\text{Im}(\cdot)$, respectively.
3.2 U-Net decoder
We design the ScaDec model based on the popular U-Net architecture [37], which was recently applied to various image reconstruction tasks such as X-ray CT [27, 39]. Fig. 3 shows a detailed diagram of the proposed CNN architecture. There are two key properties that recommend U-Net for our purpose.

Multi-resolution decomposition: The decoder employs a contraction-expansion structure based on max-pooling and up-convolution. This means that given a fixed-size convolution kernel ( in our case), the effective receptive field of the network increases as the input goes deeper into the network.
Local-global composition: At each resolution level, the outputs of the convolutional block in the contraction are directly connected and concatenated with the input of the convolutional block in the expansion. The skip connection enables the later layers to reconstruct the feature maps with both the local details and the global texture.
The suitability of the U-Net architecture is further corroborated by the results in Section 4, demonstrating the ability of the network to form high-quality images from multiple scattered measurements.
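The growth of the receptive field with depth can be checked with a short calculation. The sketch below uses the standard recursion for stacked convolution and pooling layers; the 3×3 convolutions, the 2×2 max-pooling, and the two convolutions per resolution level are assumptions made for illustration and are not taken from the architecture diagram.

```python
def receptive_field(layers):
    """layers: list of (kernel_size, stride); returns the receptive field per axis."""
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump   # each layer widens the field by (k - 1) input strides
        jump *= stride              # pooling increases the spacing between outputs
    return rf

conv = (3, 1)   # assumed 3x3 convolution, stride 1
pool = (2, 2)   # assumed 2x2 max-pooling, stride 2

one_level = [conv, conv]
two_levels = one_level + [pool, conv, conv]
three_levels = two_levels + [pool, conv, conv]

rf_by_depth = [receptive_field(cfg) for cfg in (one_level, two_levels, three_levels)]
```

Under these assumptions the receptive field grows from 5 to 14 to 32 pixels per axis over three levels, even though the kernel size never changes.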
4 Experimental validation
We now present the results of validating our method on simulated and experimental datasets. We evaluate the data-adaptive recovery capability of ScaDec by selecting datasets that consist of images with nontrivial features that can be well represented by a CNN, but are not well captured by fixed regularizers such as TV. The first simulated dataset consists of synthetically generated piecewise-smooth images with sharp edges and smooth Gaussian regions. The second simulated dataset consists of human face images [40]. The experimental dataset is the public dataset provided by the Fresnel Institute [41], which consists of experimental microwave measurements of the scattered electric field from 2D targets consisting of foam and plastic cylinders.
4.1 Results on simulated datasets
The two simulated datasets were obtained by using a high-fidelity simulation of multiple scattering with the conjugate-gradient solver. Each of the datasets contains 1548 images, separated into 1500 images for training, 24 images for validation, and 24 images for testing. The physical size of images was set to 18 cm × 18 cm, discretized to a grid. The background medium was assumed to be air with permittivity $\epsilon_b = 1$ and the wavelength of the illumination was set to
cm. The measurements were collected from 40 transmissions uniformly distributed along a circle of radius
m and, for each transmission, 360 measurements around the object were recorded. The simulated scattered data was additionally corrupted by additive white Gaussian noise corresponding to 20 dB of input signal-to-noise ratio (SNR).

Table 1. Average SNR (dB) over the dataset for weakly and strongly scattering objects.

| Method | Piecewise-smooth (weak) | Piecewise-smooth (strong) | Human faces (weak) | Human faces (strong) |
|---|---|---|---|---|
| FB-NN | 16.49 | 12.79 | 10.39 | 6.61 |
| LS-NN | 16.49 | 16.74 | 10.39 | 10.85 |
| FB-TV | 23.04 | 15.53 | 19.79 | 7.08 |
| LS-TV | 23.04 | 22.57 | 19.79 | 20.12 |
| ScaDec | 26.14 | 26.19 | 20.26 | 20.21 |
We evaluated the proposed model in two distinct scenarios associated with weak and strong scattering. Weak scattering corresponds to the regime where the first Born approximation is valid. In particular, we defined the permittivity contrast as $f_{\max} = (\epsilon_{\max} - \epsilon_b)/\epsilon_b$, where $\epsilon_{\max} = \max_{\mathbf{r} \in \Omega} \epsilon(\mathbf{r})$ is the largest permittivity inside the object and $\epsilon_b$ is the background permittivity. The permittivity contrast quantifies the degree of nonlinearity in the inverse problem, with higher $f_{\max}$ indicating stronger levels of multiple scattering. We regarded the weakly scattering scenario as , whereas the strong scattering scenario was considered as . For each scenario, we trained a separate ScaDec architecture using the corresponding training dataset with the reconstruction mean squared error (MSE) as the loss function. For quantitatively measuring the quality of the reconstructed image with respect to the true image, we used the signal-to-noise ratio (SNR) defined as

where higher values of SNR correspond to a better match between the true and reconstructed images. As illustrated in Fig. 4, we observed no issues with the convergence of the training for our architecture and datasets. Note that all the SNR and visual results were obtained on a distinct dataset that does not contain images used in training.
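As a rough illustration of the two figures of merit mentioned above, the sketch below computes the MSE and an SNR in decibels. The SNR expression used here, $20 \log_{10}(\|\mathbf{x}\| / \|\mathbf{x} - \widehat{\mathbf{x}}\|)$, is a common convention and is an assumption; it may differ from the exact definition used in the experiments (for example, by an additional scale or offset fit). The function names are hypothetical.

```python
import numpy as np

def mse(x_true, x_hat):
    """Mean squared error between the true and reconstructed images."""
    return np.mean(np.abs(x_true - x_hat) ** 2)

def snr_db(x_true, x_hat):
    """Assumed SNR convention: 20 * log10(||x|| / ||x - x_hat||), in dB."""
    return 20.0 * np.log10(np.linalg.norm(x_true) / np.linalg.norm(x_true - x_hat))

# Tiny example: ||x|| = 10 and ||x - x_hat|| = 1.
x_true = np.array([6.0, 8.0])
x_hat = np.array([6.0, 9.0])
example_mse = mse(x_true, x_hat)
example_snr = snr_db(x_true, x_hat)
```

Under this convention, a tenfold reduction of the error norm raises the SNR by 20 dB.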
Table 1 summarizes the results of comparing ScaDec against the baseline optimization-based methods corresponding to two different priors: non-negativity constraints on the image and TV. For each prior, we consider the effects of the linearity versus nonlinearity of the measurement model. The linear measurement model is obtained by using the first Born approximation, while the nonlinear model takes into account multiple scattering by using the full Lippmann-Schwinger equations [12, 14]. Fig. 5 additionally shows some visual examples of the reconstructed images for each scenario under consideration. Note that the regularization parameters for TV were optimized for the best SNR performance for all the experiments.
The results confirm that as the level of scattering increases, the performance under the linear inverse problem formulation based on the first Born approximation degrades with or without regularization. While TV substantially improves the SNR, it also imposes a piecewise-constant structure, leading to blocky artifacts visible in Fig. 5. On the other hand, ScaDec substantially outperforms the baseline methods, leading to higher SNR values and to more natural-looking images free of blocky artifacts. ScaDec is also stable in terms of performance: the reconstruction SNR is nearly identical in the weakly and strongly scattering regimes.
Finally, the computational cost of ScaDec is extremely low during the reconstruction stage, where each reconstruction corresponds to a simple forward pass through the CNN. In our case, all the optimization-based methods were run on two CPUs (Intel Xeon processor E5-2620 v4) during testing, while ScaDec was evaluated on a single GPU (NVIDIA GeForce GTX 1080 Ti). We observed that the reconstruction time of ScaDec for a single image was less than 2 seconds in all scenarios, while LS-TV took over 8 and 35 minutes to reconstruct one image in the weakly and strongly scattering cases, respectively.
4.2 Results on experimental datasets
For the experimental validation, two 2D settings were considered: FoamDielExtTM and FoamDielIntTM, which consist of a fixed foam cylinder and a plastic cylinder located outside or inside of the foam (Fig. 6). In both settings, the objects were placed within an 18 cm × 18 cm square region, discretized to a 128 × 128 grid; hence, the pixel size of the reconstructed images was 1.4 mm × 1.4 mm. A total of 8 transmitters were uniformly distributed along a circle of radius m, emitting electromagnetic waves towards the objects, and the measurements of the scattered wave were recorded by 360 receivers. Though the dataset contains measurements for a range of wave frequencies, we only consider the case of 5 GHz; hence, the wavelength of the transmission is $\lambda \approx 60$ mm. The background medium was air with permittivity $\epsilon_b = 1$. The permittivity contrasts of foam and plastic were measured as and , respectively [41].
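The geometric quantities in this setup follow from simple arithmetic, which the short check below reproduces. It assumes propagation at the vacuum speed of light (air with relative permittivity close to 1); the variable names are illustrative.

```python
C0 = 299_792_458.0                           # speed of light in vacuum, m/s
frequency_hz = 5e9                           # 5 GHz illumination

wavelength_mm = C0 / frequency_hz * 1e3      # about 59.96 mm
pixel_mm = 180.0 / 128                       # 18 cm over 128 pixels = 1.40625 mm
pixels_per_wavelength = wavelength_mm / pixel_mm
```

With these numbers one wavelength spans roughly 43 pixels, and the 18 cm field of view is about three wavelengths wide.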
For both settings, we trained the same ScaDec architecture with 6500 pairs of synthetic images and their simulated scattered measurements. The measurements were generated by computing the multiple scattering governed by the Lippmann-Schwinger equation. Each image was synthesized with one centered circle with a lower contrast and another randomly placed circle with a higher contrast. Furthermore, all measurements were corrupted with additive white Gaussian noise corresponding to 20 dB of input SNR. ScaDec was trained for 1000 epochs to minimize the MSE between the true image and the restored image.
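The noise corruption of the simulated measurements can be sketched as follows. The snippet assumes that the input SNR is defined as the ratio of the signal norm to the noise norm in decibels and rescales one noise realization to hit the target exactly (rather than only in expectation); the function name `add_awgn` and the toy signal are hypothetical.

```python
import numpy as np

def add_awgn(y, snr_db, rng):
    """Add complex white Gaussian noise so that 20*log10(||y|| / ||e||) equals snr_db."""
    noise = rng.standard_normal(y.shape) + 1j * rng.standard_normal(y.shape)
    target_norm = np.linalg.norm(y) / 10.0 ** (snr_db / 20.0)
    noise *= target_norm / np.linalg.norm(noise)   # rescale to the desired noise energy
    return y + noise

rng = np.random.default_rng(0)
y_clean = np.exp(1j * np.linspace(0, 2 * np.pi, 360))   # toy stand-in for scattered data
y_noisy = add_awgn(y_clean, 20.0, rng)

achieved_db = 20.0 * np.log10(
    np.linalg.norm(y_clean) / np.linalg.norm(y_noisy - y_clean)
)
```

Training on measurements corrupted at the same noise level as expected at test time is one way to make the learned inverse less sensitive to measurement noise.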
Visual results of the reconstructed images of different methods are shown in Fig. 6, where we compare ScaDec against LS-TV and FB-TV. The first column shows the ground truth of the foam cylinder (light blue) and the plastic cylinder (bright yellow) in each setting. The linear model FB-TV dramatically underestimates the permittivity distribution and fails to reconstruct the shape of objects. On the other hand, the nonlinear model of LS-TV produces better reconstructed images by taking into account both the multiple scattering and the piecewise-constant nature of the image. Finally, the proposed method obtains the highest quality reconstruction in terms of both the contrast value and the shape of objects. The edges of the foam and plastic were clear and sharp, and no obvious degradation of the contrast value was observed. Visually, the results of ScaDec look very close to the ground truth, which is due to the capability of the framework to adapt to the features in the training dataset. Remarkably, the experimental results also illustrate the potential of using simulated data for training, and then deploying the trained CNN for image formation from experimental data.
5 Conclusion
We designed and experimentally demonstrated a deep convolutional neural network for solving a multiple scattering problem in diffraction tomography. The proposed method, called ScaDec, successfully reconstructed high-quality images and outperformed state-of-the-art optimization-based methods in all scenarios. Remarkably, the method, trained on simulated data, also succeeded in reconstructing images from real experimental data consisting of highly scattering objects. One of the key advantages of the proposed approach is that the actual process of image formation is substantially accelerated compared to optimization-based reconstruction methods. These features make ScaDec a promising alternative to optimization-based methods and open rich perspectives for efficient correction of scattering in biological samples.
Acknowledgments
We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU for research.
References
 [1] W. Choi, C. Fang-Yen, K. Badizadegan, S. Oh, N. Lue, R. R. Dasari, and M. S. Feld, “Tomographic phase microscopy,” Nat. Methods, vol. 4, no. 9, pp. 717–719, September 2007.
 [2] D. J. Brady, K. Choi, D. L. Marks, R. Horisaki, and S. Lim, “Compressive holography,” Opt. Express, vol. 17, no. 15, pp. 13 040–13 049, 2009.
 [3] E. Wolf, “Three-dimensional structure determination of semi-transparent objects from holographic data,” Opt. Commun., vol. 1, no. 4, pp. 153–156, September/October 1969.
 [4] A. J. Devaney, “Inverse-scattering theory within the Rytov approximation,” Opt. Lett., vol. 6, no. 8, pp. 374–376, August 1981.
 [5] B. Chen and J. J. Stamnes, “Validity of diffraction tomography based on the first Born and the first Rytov approximations,” Appl. Opt., vol. 37, no. 14, pp. 2996–3006, May 1998.
 [6] K. Belkebir and A. Sentenac, “High-resolution optical diffraction microscopy,” J. Opt. Soc. Am. A, vol. 20, no. 7, pp. 1223–1229, July 2003.
 [7] K. Belkebir, P. C. Chaumet, and A. Sentenac, “Superresolution in total internal reflection tomography,” J. Opt. Soc. Am. A, vol. 22, no. 9, pp. 1889–1897, September 2005.
 [8] E. Mudry, P. C. Chaumet, K. Belkebir, and A. Sentenac, “Electromagnetic wave imaging of three-dimensional targets using a hybrid iterative inversion method,” Inv. Probl., vol. 28, no. 6, p. 065007, April 2012.
 [9] L. Tian and L. Waller, “3D intensity and phase imaging from light field measurements in an LED array microscope,” Optica, vol. 2, pp. 104–111, 2015.
 [10] U. S. Kamilov, I. N. Papadopoulos, M. H. Shoreh, A. Goy, C. Vonesch, M. Unser, and D. Psaltis, “Learning approach to optical tomography,” Optica, vol. 2, no. 6, pp. 517–522, June 2015.
 [11] T. Zhang, C. Godavarthi, P. C. Chaumet, G. Maire, H. Giovannini, A. Talneau, M. Allain, K. Belkebir, and A. Sentenac, “Far-field diffraction microscopy at $\lambda/10$ resolution,” Optica, vol. 3, no. 6, pp. 609–612, June 2016.
 [12] E. Soubies, T.-A. Pham, and M. Unser, “Efficient inversion of multiple-scattering model for optical diffraction tomography,” Opt. Express, vol. 25, no. 18, pp. 21 786–21 800, September 2017.
 [13] H.-Y. Liu, D. Liu, H. Mansour, P. T. Boufounos, L. Waller, and U. S. Kamilov, “SEAGLE: Sparsity-driven image reconstruction under multiple scattering,” IEEE Trans. Comput. Imaging, vol. 4, no. 1, pp. 73–86, March 2018.
 [14] Y. Ma, H. Mansour, D. Liu, P. T. Boufounos, and U. S. Kamilov, “Accelerated image reconstruction for nonlinear diffractive imaging,” in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Process. (ICASSP 2018), Calgary, Canada, April 15–20, 2018, arXiv:1708.01663 [cs.CV].
 [15] A. Ribés and F. Schmitt, “Linear inverse problems in imaging,” IEEE Signal Process. Mag., vol. 25, no. 4, pp. 84–99, July 2008.
 [16] L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Physica D, vol. 60, no. 1–4, pp. 259–268, November 1992.
 [17] Y. Sung and R. R. Dasari, “Deterministic regularization of three-dimensional optical diffraction tomography,” J. Opt. Soc. Am. A, vol. 28, no. 8, pp. 1554–1561, August 2011.
 [18] J. W. Lim, K. R. Lee, K. H. Jin, S. Shin, S. E. Lee, Y. K. Park, and J. C. Ye, “Comparative study of iterative reconstruction algorithms for missing cone problems in optical diffraction tomography,” Opt. Express, vol. 23, no. 13, pp. 16 933–16 948, June 2015.
 [19] U. S. Kamilov, I. N. Papadopoulos, M. H. Shoreh, A. Goy, C. Vonesch, M. Unser, and D. Psaltis, “Optical tomographic image reconstruction based on beam propagation and sparse regularization,” IEEE Trans. Comput. Imaging, vol. 2, no. 1, pp. 59–70, March 2016.
 [20] T.-A. Pham, E. Soubies, A. Goy, J. Lim, F. Soulez, D. Psaltis, and M. Unser, “Versatile reconstruction framework for diffraction tomography with intensity measurements and multiple scattering,” Opt. Express, vol. 26, no. 3, pp. 2749–2763, February 2018.
 [21] U. S. Kamilov, D. Liu, H. Mansour, and P. T. Boufounos, “A recursive Born approach to nonlinear inverse scattering,” IEEE Signal Process. Lett., vol. 23, no. 8, pp. 1052–1056, August 2016.

 [22] C. Dong, C. C. Loy, K. He, and X. Tang, “Learning a deep convolutional network for image super-resolution,” in Proc. ECCV, Zurich, Switzerland, September 6–12, 2014, pp. 184–199.
 [23] U. Schmidt and S. Roth, “Shrinkage fields for effective image restoration,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA, June 23–28, 2014, pp. 2774–2781.
 [24] A. Mousavi, A. B. Patel, and R. G. Baraniuk, “A deep learning approach to structured signal recovery,” in Proc. Allerton Conf. Communication, Control, and Computing, Allerton Park, IL, USA, September 30–October 2, 2015, pp. 1336–1343.
 [25] Y. Chen, W. Yu, and T. Pock, “On learning optimized reaction diffusion processes for effective image restoration,” in Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, June 8–10, 2015, pp. 5261–5269.
 [26] U. S. Kamilov and H. Mansour, “Learning optimal nonlinearities for iterative thresholding algorithms,” IEEE Signal Process. Lett., vol. 23, no. 5, pp. 747–751, May 2016.
 [27] K. H. Jin, M. T. McCann, E. Froustey, and M. Unser, “Deep convolutional neural network for inverse problems in imaging,” 2016, arXiv:1611.03679 [cs.CV].
 [28] M. Borgerding and P. Schniter, “Onsager-corrected deep networks for sparse linear inverse problems,” 2016, arXiv:1612.01183 [cs.IT].
 [29] Y. S. Han, J. Yoo, and J. C. Ye, “Deep residual learning for compressed sensing CT reconstruction via persistent homology analysis,” 2016, arXiv:1611.06391 [cs.CV].
 [30] A. Sinha, J. Lee, S. Li, and G. Barbastathis, “Lensless computational imaging through deep learning,” Optica, vol. 4, no. 9, pp. 1117–1125, September 2017.
 [31] J. Yoo, S. Sabir, D. Heo, K. H. Kim, A. Wahab, Y. Choi, S.-I. Lee, E. Y. Chae, H. H. Kim, Y. M. Bae, Y.-W. Choi, S. Cho, and J. C. Ye, “Deep learning can reverse photon migration for diffuse optical tomography,” 2017, arXiv:1712.00912 [cs.CV].
 [32] E. J. Candès, J. Romberg, and T. Tao, “Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information,” IEEE Trans. Inf. Theory, vol. 52, no. 2, pp. 489–509, February 2006.
 [33] D. L. Donoho, “Compressed sensing,” IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1289–1306, April 2006.
 [34] A. Beck and M. Teboulle, “Fast gradient-based algorithm for constrained total variation image denoising and deblurring problems,” IEEE Trans. Image Process., vol. 18, no. 11, pp. 2419–2434, November 2009.
 [35] M. V. Afonso, J. M. Bioucas-Dias, and M. A. T. Figueiredo, “Fast image recovery using variable splitting and constrained optimization,” IEEE Trans. Image Process., vol. 19, no. 9, pp. 2345–2356, September 2010.
 [36] M. Born and E. Wolf, Principles of Optics, 7th ed. Cambridge Univ. Press, 2003, ch. Scattering from inhomogeneous media, pp. 695–734.
 [37] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in Medical Image Computing and Computer-Assisted Intervention (MICCAI), ser. LNCS, vol. 9351. Springer, 2015, pp. 234–241, (available on arXiv:1505.04597 [cs.CV]). [Online]. Available: http://lmb.informatik.uni-freiburg.de/Publications/2015/RFB15a
 [38] Y. Sung, W. Choi, C. Fang-Yen, K. Badizadegan, R. R. Dasari, and M. S. Feld, “Optical diffraction tomography for high resolution live cell imaging,” Opt. Express, vol. 17, no. 1, pp. 266–277, December 2009.
 [39] J. C. Ye, Y. Han, and E. Cha, “Deep convolutional framelets: A general deep learning framework for inverse problems,” arXiv e-prints, July 2017.
 [40] Z. Liu, P. Luo, X. Wang, and X. Tang, “Deep learning face attributes in the wild,” in Proceedings of International Conference on Computer Vision (ICCV), 2015.
 [41] J.-M. Geffrin, P. Sabouroux, and C. Eyraud, “Free space experimental scattering database continuation: experimental setup and measurement precision,” Inv. Probl., vol. 21, no. 6, pp. S117–S130, 2005.
 [42] D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in International Conference on Learning Representations (ICLR), 2015. arXiv:1412.6980 [cs.LG].