Gravitational lensing is the deflection of light rays as they traverse space curved by the presence of mass. In the present era of precision cosmology, gravitational lensing has become a powerful probe in many areas of astrophysics and cosmology, from stellar to cosmological scales. Galaxy-galaxy strong lensing (GGSL) is a particular case of gravitational lensing in which the background source and foreground lens are both galaxies and the lensing is strong enough to distort images of the source into arcs or even rings, depending on the relative angular position of the two objects. Since the discovery of the first GGSL system in 1988 Hewitt et al. (1988), many valuable scientific applications have been realized, such as studying galaxy mass density profiles Sonnenfeld et al. (2015); Shu et al. (2016); Küng et al. (2018), detecting and inferring galaxy substructure Vegetti et al. (2014); Hezaveh et al. (2016); Bayer et al. (2018); Brehmer et al. (2019), measuring cosmological parameters Collett and Auger (2014); Rana et al. (2017); Suyu et al. (2017), investigating the nature of high-redshift galaxies Bayliss et al. (2017); Dye et al. (2018); Sharda et al. (2018), and constraining the properties of self-interacting dark matter candidates Shu et al. (2016); Gilman et al. (2017); Kummer et al. (2018).
With the capabilities of next-generation telescopes such as the Large Synoptic Survey Telescope (LSST; https://www.lsst.org/) and Euclid (https://www.euclid-ec.org/), the number of known GGSLs is predicted to increase by several orders of magnitude Collett (2015). The strong gravitational lens finding challenge project Metcalf et al. (2019) proved the success of applying machine learning approaches to detect GGSL systems in an automated manner. Lanusse et al. (2018), Morningstar et al. (2018), Hezaveh et al. (2017), Levasseur et al. (2017), and Pearson et al. (2019) have shown the feasibility and reliability of utilizing deep learning to model strong lenses as a vastly more efficient alternative to traditional parametric methods. Fast forward modeling for strong lensing image reconstructions Morningstar et al. (2019) may also be combined with inference pipelines such as Markov chain Monte Carlo for lensing parameter estimation. However, the preprocessing of the original images (for example, deblending and denoising with machine learning) is still in its infancy.
In this paper, we address this growing need for automated analysis of GGSLs in two ways. First, we create a dataset of 1 million simulated images (500K GGSLs and 500K non-GGSLs) by feeding a catalog of GGSLs and a state-of-the-art semi-analytic galaxy catalog named cosmoDC2 into a strong lensing simulation program named PICS. To demonstrate the feasibility of the pipeline for analyzing GGSLs, we use only 120K simulated images (60K GGSLs and 60K non-GGSLs) out of the 1 million images; we will use the full 1 million images to quantify the performance of our pipeline in further studies. Second, we develop an end-to-end machine learning pipeline for automated lens finding and characterization for GGSLs, which consists of four modules: denoising, deblending, lens identification, and lens characterization. We adopt Deep Residual Network (ResNet)-based fully convolutional neural network architectures for denoising the original pixelized images and for removing the lens light in the deblending module. The lens identification and characterization modules perform classification and regression, respectively, and both are built on the ResNet-50 architecture. We demonstrate considerable improvement over lens finding and characterization without the pipeline, and we discuss potential avenues for future improvement.
2 Data Preparation – Simulations
We created a realistic simulated dataset, including 500K GGSLs and 500K non-GGSLs, by adopting a catalog of strong lenses Collett (2015) (hereafter, Collett15) and a state-of-the-art extragalactic catalog, cosmoDC2 Korytov et al. (2019). Collett15 provides the mass models and simple light models of both lens and source galaxies, whereas cosmoDC2 provides more realistic light profiles of galaxies containing bulges and disks. To create the inputs for our strong lensing simulation program, PICS Li et al. (2016), we connect the mass profiles from Collett15 with the light profiles from cosmoDC2 by crossmatching the apparent magnitudes, axis ratios, position angles, and redshifts of the galaxies in the two catalogs.
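The text does not specify how the crossmatch is performed. As a purely illustrative sketch, one simple option is a nearest-neighbor match in a scaled parameter space. The function name `crossmatch`, the dictionary keys, and the per-parameter scales below are all hypothetical, and position-angle wraparound is ignored for brevity.

```python
import math

def crossmatch(lenses, galaxies, scales):
    """For each lens, return the index of the nearest galaxy in scaled parameter space."""
    keys = list(scales)
    matches = []
    for lens in lenses:
        best_index, best_dist = None, math.inf
        for i, gal in enumerate(galaxies):
            # Scaled Euclidean distance over the matched quantities.
            dist = math.sqrt(sum(((lens[k] - gal[k]) / scales[k]) ** 2 for k in keys))
            if dist < best_dist:
                best_index, best_dist = i, dist
        matches.append(best_index)
    return matches

# Hypothetical matching scales: magnitude, axis ratio, position angle (deg), redshift.
scales = {"mag": 0.5, "q": 0.1, "pa": 10.0, "z": 0.05}
lenses = [{"mag": 21.0, "q": 0.70, "pa": 30.0, "z": 0.45}]
galaxies = [
    {"mag": 23.5, "q": 0.90, "pa": 120.0, "z": 1.20},
    {"mag": 21.2, "q": 0.72, "pa": 28.0, "z": 0.47},
]
matched = crossmatch(lenses, galaxies, scales)
```

A brute-force search like this scales quadratically with catalog size; for catalogs of realistic size a spatial index (for example a k-d tree) would likely be preferable.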
The mass model of an individual lens galaxy is a singular isothermal ellipsoid (SIE), as adopted in Collett15, which not only is analytically tractable but also has been found to be consistent with models of individual lenses and lens statistics on the length scales relevant for strong lensing Koopmans et al. (2006); Gavazzi et al. (2007); Dye et al. (2008). Accordingly, the deflection maps are determined by the lens position, velocity dispersion, axis ratio, position angle, and redshift, together with the redshift of the source galaxy, i.e., $\{x_1, x_2, \sigma_v, q, \phi, z_l, z_s\}$. Since $(x_1, x_2)$ can be fixed to $(0, 0)$ by centering the cutouts at the lens galaxies and since the lensing strength (i.e., Einstein radius) is given by $\theta_E = 4\pi (\sigma_v / c)^2 D_{ls} / D_s$, the parameter array can be simplified to $\{\theta_E, q, \phi\}$, where $c$ is the speed of light, and $D_{ls}$ and $D_s$ are the angular diameter distances from the deflector to the source and from the observer to the source, respectively.
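To make the reduction to the Einstein radius concrete, the following minimal sketch evaluates the standard SIE relation, Einstein radius = 4π (velocity dispersion / c)² × (deflector-to-source distance / observer-to-source distance). It assumes a flat ΛCDM cosmology with a matter density of 0.3, which is an illustrative choice and not a value stated in the text. In a flat universe the distance ratio depends only on comoving distances, so the Hubble constant cancels. All function names are hypothetical.

```python
import math

C_KM_S = 299792.458                       # speed of light in km/s
RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0  # radians to arcseconds

def comoving_distance(z, omega_m=0.3, steps=2000):
    """Comoving distance in units of c/H0 for flat LCDM (trapezoidal rule)."""
    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + (1.0 - omega_m))
    h = z / steps
    total = 0.5 * (inv_e(0.0) + inv_e(z))
    total += sum(inv_e(i * h) for i in range(1, steps))
    return total * h

def einstein_radius_arcsec(sigma_v, z_l, z_s, omega_m=0.3):
    """Einstein radius of an SIE lens in arcsec; sigma_v is in km/s."""
    if z_s <= z_l:
        return 0.0  # a source in front of the lens is not lensed
    # Flat universe: D_ls / D_s = 1 - D_C(z_l) / D_C(z_s), independent of H0.
    ratio = 1.0 - comoving_distance(z_l, omega_m) / comoving_distance(z_s, omega_m)
    return 4.0 * math.pi * (sigma_v / C_KM_S) ** 2 * ratio * RAD_TO_ARCSEC

theta_e = einstein_radius_arcsec(sigma_v=250.0, z_l=0.5, z_s=2.0)
```

Under these assumptions, a lens with a velocity dispersion of 250 km/s at redshift 0.5 and a source at redshift 2 gives an Einstein radius of roughly one arcsecond, and the radius scales with the square of the velocity dispersion.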
We added noise and a point spread function (PSF) to make the images realistic, using models of a ground-based-like telescope from Collett (2015); Connolly et al. (2010). The noise model is a mix of read noise, which is Gaussian-like, and shot noise, which is Poisson-like and is calculated from the flux in the pixelized images. The PSF model is a Gaussian function whose full width at half maximum (FWHM) differs from band to band. Examples are shown in Appendix A, Fig. 2. The nonlensing systems are generated in the same way but with the strong lensing effects removed by setting the deflection angles to zero.
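The observational model described above can be sketched in a few lines: blur with a Gaussian PSF, then add Poisson-like shot noise and Gaussian read noise. The sketch below is illustrative only. It assumes pixel values are in photon counts, and the FWHM, read-noise level, and function names are hypothetical rather than the values used in the simulation. For large counts the Poisson draw is replaced by its Gaussian approximation.

```python
import math
import random

def fwhm_to_sigma(fwhm):
    """Convert a Gaussian FWHM to its standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

def gaussian_psf_kernel(fwhm_pix, half_width):
    """Normalized 2-D Gaussian kernel of size (2 * half_width + 1) squared."""
    sigma = fwhm_to_sigma(fwhm_pix)
    offsets = range(-half_width, half_width + 1)
    kernel = [[math.exp(-(x * x + y * y) / (2.0 * sigma ** 2)) for x in offsets]
              for y in offsets]
    norm = sum(sum(row) for row in kernel)
    return [[v / norm for v in row] for row in kernel]

def convolve(image, kernel):
    """Same-size 2-D convolution with zero padding (kernel is symmetric)."""
    h, w, r = len(image), len(image[0]), len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    if 0 <= y + dy < h and 0 <= x + dx < w:
                        out[y][x] += image[y + dy][x + dx] * kernel[dy + r][dx + r]
    return out

def poisson_sample(lam, rng):
    """Poisson draw: Knuth's method for small means, Gaussian approximation otherwise."""
    if lam <= 0.0:
        return 0
    if lam > 30.0:
        return max(0, round(rng.gauss(lam, math.sqrt(lam))))
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def add_noise(image, read_sigma, rng):
    """Shot noise from the pixel flux plus Gaussian read noise."""
    return [[poisson_sample(v, rng) + rng.gauss(0.0, read_sigma) for v in row]
            for row in image]

rng = random.Random(0)
kernel = gaussian_psf_kernel(fwhm_pix=3.0, half_width=2)
image = [[0.0] * 9 for _ in range(9)]
image[4][4] = 100.0  # a single point source, in counts
blurred = convolve(image, kernel)
noisy = add_noise(blurred, read_sigma=2.0, rng=rng)
```

In a multi-band setting, the same routine would be called once per band with that band's FWHM.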
3 Methodology – Pipeline Training and Inference
Our proposed machine learning pipeline consists of four modules: denoising, source separation (deblending), lens searching (classification), and lens modeling (regression), as shown in Fig. 1.
Denoising is an image restoration approach in which the goal is to recover a clean image from a noisy observation. Traditionally, image denoising has been posed as an inverse problem, where optimization approaches and special-purpose regularizers (known as image priors) have been used to achieve this Anwar et al. (2019). Recently, deep-learning-based approaches have been increasingly adopted and are currently the state-of-the-art algorithms for image denoising Lim et al. (2017); Zhang et al. (2018). We adopt the enhanced deep super-resolution network (EDSR) architecture Lim et al. (2017) that was proposed for a specific type of image restoration known as super resolution. The residual network (ResNet) He et al. (2016) incorporates skip connections between residual blocks (which consist of convolution, batch normalization, and nonlinear activation layers) in a deep network and has been shown to work well for a variety of tasks Szegedy et al. (2017). Residual networks overcome the vanishing gradient problem by learning the function mapping for the residual (with respect to the inputs). EDSR removes the batch normalization layers, which are deemed unnecessary for image-to-image tasks. Since the inputs and the outputs for denoising have the same resolution, we removed the up-sampling layer from the EDSR architecture, which is composed of kernels and feature channels.
The source separation (deblending) module decouples the lens light and the source galaxy in the observations. The module utilizes the same modified EDSR architecture that was used for the denoising module, because source separation is also an image-to-image task: it takes images with coupled source and foreground galaxies as input and outputs the corresponding lensed or nonlensed source galaxy, separated from the foreground lens.
The classification module is used to detect the lensing systems from the source-separated images. In other words, each observed image needs to be classified as either a lensed or a nonlensed system. We utilize the ResNet-50 architecture to perform this classification. In this architecture, each residual block is three layers deep and consists of convolution layers with channel size increasing from 64 to 2048 and the filter size being either 1×1 or 3×3.
The parameter estimation (regression) module takes the source-separated galaxy and predicts its characteristics: Einstein radius, axis ratio, and position angle. The parameter estimation also uses the ResNet-50 architecture that has been adopted for the classification, but the last layer is replaced with a fully connected layer that predicts the three continuous quantities. We also considered the case where a single model was used for denoising and deblending together, but we found the discussed pipeline (Fig. 1) to be the best and hence, in the interest of space, do not discuss the former in detail.
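The core idea of the residual blocks used in the image-to-image modules can be illustrated with a toy example: the block learns a correction that is added back to its input through a skip connection, and, following EDSR, no batch normalization is applied. The sketch below is a one-dimensional, single-channel illustration in pure Python and is not the network used in this work. The kernels, the `scale` parameter (EDSR optionally scales the residual), and the function names are hypothetical.

```python
def conv1d(signal, kernel):
    """Same-size 1-D cross-correlation with zero padding (odd kernel length)."""
    r, n = len(kernel) // 2, len(signal)
    return [
        sum(kernel[j + r] * signal[i + j] for j in range(-r, r + 1) if 0 <= i + j < n)
        for i in range(n)
    ]

def relu(signal):
    """Elementwise rectified linear activation."""
    return [max(0.0, v) for v in signal]

def residual_block(signal, kernel_a, kernel_b, scale=1.0):
    """EDSR-style block: conv -> ReLU -> conv, then add the input back (no batch norm)."""
    residual = conv1d(relu(conv1d(signal, kernel_a)), kernel_b)
    return [x + scale * r for x, r in zip(signal, residual)]

signal = [1.0, -2.0, 3.0]
zero = [0.0, 0.0, 0.0]
identity = [0.0, 1.0, 0.0]
unchanged = residual_block(signal, zero, zero)        # zero weights: pure skip path
modified = residual_block(signal, identity, identity) # adds relu(signal) to the input
```

With all weights at zero the block reduces to the identity map, which is why stacking many such blocks does not degrade the signal at initialization and helps gradients propagate through deep networks.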
4 Results and Discussion
Each of the four modules (denoising, deblending, classification, and regression) was trained individually using the corresponding parts of the simulation data. Once the models were trained, their weights were fixed and the models were deployed as part of the inference pipeline, where the predictions from each module were fed into the subsequent module with the end goal of characterizing the lensed galaxies. The results for training the modules will be discussed first, followed by inference.
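The chaining of the frozen modules at inference time can be summarized by the following sketch. The four callables stand in for the trained networks, and the stubs used here are placeholders only. The choice to run the regression only for images classified as lenses is an assumption made for illustration, as are the function names and the numerical values returned by the stubs.

```python
def run_pipeline(noisy_image, denoise, deblend, classify, regress):
    """Feed each module's prediction into the next one."""
    denoised = denoise(noisy_image)
    source_only = deblend(denoised)
    is_lens = classify(source_only)
    # Assumption: lens parameters are only estimated for detected lenses.
    params = regress(source_only) if is_lens else None
    return {"is_lens": is_lens, "params": params}

calls = []

def make_stub(name, result):
    """Build a placeholder module that records when it is called."""
    def module(_inputs):
        calls.append(name)
        return result
    return module

result = run_pipeline(
    noisy_image=[[0.0]],
    denoise=make_stub("denoise", [[0.0]]),
    deblend=make_stub("deblend", [[0.0]]),
    classify=make_stub("classify", True),
    regress=make_stub("regress", {"theta_e": 1.2, "q": 0.8, "pa": 45.0}),
)
```

Because each stage consumes the previous stage's prediction rather than the simulation ground truth, errors made early in the chain propagate to the later modules.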
The denoising EDSR model was trained for epochs using images (and tested on ), where the inputs were noisy and blended galaxy images (Noisy-Sim) and the ground truth was the corresponding noiseless blended images (Noiseless-Sim) from the simulation. We used the peak signal-to-noise ratio (PSNR) to evaluate the denoising accuracy.
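For reference, PSNR is defined as 10 log10(MAX² / MSE), where MAX is the maximum possible pixel value and MSE is the mean squared error between the two images. A minimal sketch follows; it assumes images normalized to the range [0, 1], and the function name and toy images are illustrative.

```python
import math

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D images."""
    flat_ref = [v for row in reference for v in row]
    flat_est = [v for row in estimate for v in row]
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_est)) / len(flat_ref)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)

clean = [[0.0, 0.5], [1.0, 0.25]]
noisy = [[0.1, 0.5], [0.9, 0.25]]
value = psnr(clean, noisy)  # MSE = 0.005, so about 23 dB
```

Higher values indicate closer agreement with the reference image.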
The accuracy metrics on the test data using the trained denoising model are shown in Table 1(a) in Appendix A. First, the difference between Noisy-Sim and Noiseless-Sim is shown to demonstrate the effect of the noise. The mean value for PSNR over all the test data is , indicating that the noise indeed has a significant effect on the image. Then, the ability of the denoising machine learning model to predict the denoised images from the noisy images is measured by comparing the model prediction (Noiseless-ML) with the corresponding ground truth (Noiseless-Sim). The metric value of for PSNR indicates very good noise removal by the trained model. To measure the accuracy of the deblending module, we first compared the Noiseless-Sim with noiseless and deblended simulation data (Deblended-Sim) to characterize the difference between the input and the corresponding ground truth that the prediction seeks to match (Table 1(a)). The mean value over all the test data for PSNR is , which indicates a significant difference between these image pairs. Then, the output from the deblending machine learning model (Deblended-ML) was compared with the ground truth (Deblended-Sim) on the test data to characterize the predictive accuracy. The PSNR value of indicates a good recovery of the source galaxy by deblending.
The classification module was trained over 108,000 images with a batch size of 256, epochs, and a learning rate that decays by half every two epochs starting with a value of with the Adam optimizer. The mean classification accuracy (over two classes) was used to measure the accuracy of the classification model (Table 1(b)). As a baseline we trained a classification model to predict the label directly from the noisy blended simulation images (Noisy-Sim) and evaluated the metrics on the corresponding test images. The mean accuracy was found to be , while the classification model trained with Deblended-Sim gave a mean accuracy of , which is a significant improvement over the baseline.
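The learning rate schedule and the accuracy metric described above can be written compactly. In the sketch below, the initial learning rate is a placeholder, since the actual value is not recoverable from the text here. The metric is implemented as the average of the per-class accuracies, which is one reading of "mean classification accuracy (over two classes)" and should be treated as an assumption.

```python
def learning_rate(epoch, initial_lr, halve_every=2):
    """Step decay: the rate is halved every `halve_every` epochs."""
    return initial_lr * 0.5 ** (epoch // halve_every)

def mean_class_accuracy(labels, predictions):
    """Average of the per-class accuracies (equal weight for each class)."""
    per_class = []
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        correct = sum(1 for i in indices if predictions[i] == cls)
        per_class.append(correct / len(indices))
    return sum(per_class) / len(per_class)

# Placeholder initial rate of 1e-3 (hypothetical, not the paper's value).
schedule = [learning_rate(epoch, initial_lr=1e-3) for epoch in range(6)]

labels = [1, 1, 1, 0]        # 1 = lensed, 0 = nonlensed
predictions = [1, 1, 0, 0]
accuracy = mean_class_accuracy(labels, predictions)
```

For a balanced dataset such as the 60K/60K split used here, the per-class average coincides with the overall fraction of correctly classified images.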
For the parameter estimation (regression) module, we used the same ResNet-50 architecture but with the last layer replaced by a fully connected layer that predicts the three continuous parameters. Only the lensed images in the deblended simulation data Deblended-Sim-Len were used to train the regression model. Hence, a total of were used for training and for testing. The same batch size and learning rate schedule used for classification were employed for regression as well, while the number of epochs was increased to . The regression accuracy was measured by using the mean absolute error (MAE) in the normalized coordinates (scaled to [0,1] with respect to the maximum and minimum of the training data), as shown in Table 1(b); the plots comparing the observed and predicted values are shown in Fig. 3. The regression accuracy for training is for MAE, which indicates a very good agreement with the ground truth. For the test data, the corresponding MAE is , while the baseline MAE is .
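The normalized MAE described above can be sketched as follows: each target is rescaled to [0, 1] using the minimum and maximum of the training targets, and the absolute errors are then averaged over all samples and parameters. The parameter values and function names below are hypothetical and serve only to illustrate the arithmetic.

```python
def fit_min_max(train_targets):
    """Per-parameter (min, max) bounds computed from the training targets."""
    return [(min(col), max(col)) for col in zip(*train_targets)]

def normalize(targets, bounds):
    """Rescale each parameter to [0, 1] using the training bounds."""
    return [[(v - lo) / (hi - lo) for v, (lo, hi) in zip(row, bounds)]
            for row in targets]

def mean_absolute_error(truth, predicted):
    """MAE averaged over all samples and all parameters."""
    errors = [abs(t - p) for tr, pr in zip(truth, predicted) for t, p in zip(tr, pr)]
    return sum(errors) / len(errors)

# Columns: Einstein radius (arcsec), axis ratio, position angle (deg); toy values.
train = [[0.5, 0.4, 0.0], [2.5, 1.0, 180.0]]
bounds = fit_min_max(train)
truth = normalize([[1.5, 0.70, 90.0]], bounds)
predicted = normalize([[1.7, 0.64, 108.0]], bounds)
mae = mean_absolute_error(truth, predicted)  # each parameter is off by 0.1 in normalized units
```

Normalizing before averaging keeps parameters with large numerical ranges, such as the position angle in degrees, from dominating the metric.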
With the inference pipeline, we looked at an application scenario where all four modules were used in unison to predict the galaxy parameters from the noisy observations. The input to the denoising module was the full Noisy-Sim data, and we evaluated the denoising performance on the inference data using a procedure similar to that used for the test data, where the similarity between the Noisy-Sim data and the Noiseless-Sim data was calculated with the three metrics. We found the results to be similar to those obtained on the test data, giving us confidence that there was no significant change in the noise distribution. Next, we compared the predictions of the denoising model (Noiseless-ML) with the ground truth (Noiseless-Sim) and again found the metrics to be close to those obtained for the test data, thus validating the predictive capability and generalizability of this denoising model beyond the data it was trained on. For the deblending process, the denoised model predictions (Noiseless-ML) were taken as input, and the corresponding deblended outputs (Deblended-ML-ML) were obtained. The accuracy was evaluated with respect to the ground truth deblended images (Deblended-Sim) for Noiseless-ML and Deblended-ML-ML over all the images (Table 1(c)). We found the PSNR for the latter to be , which is lower than that for the test data (in the training phase) but is significantly better than the baseline of .
For the classification inference, we calculated the mean accuracy for the deblending scenario and found that the accuracy is lower than for the test data (in the training phase), with a mean accuracy of compared with . However, this accuracy is much higher than the baseline case accuracy of .
For the regression inference, we calculated the MAE for the deblending scenario and found that the regression accuracy (Table 1(d)) is slightly lower (MAE of ) than the accuracies obtained on the test data but is an improvement over the baseline of .
Limitations of the Denoising/Deblending Modules: Although we obtained good training and test accuracy for all the modules and a significant improvement in lens finding (classification) over the baseline for the inference pipeline, the improvement in lens characterization accuracy in inference is only marginal. We attribute this to two factors: (1) the sensitivity of the deblending module to the denoised input, where we found that even though the PSNR is close to ideal, the minor differences from the ground truth cause additional features in the deblended image (Fig. 2(e) in Appendix B) for some cases; and (2) the processes of denoising and deblending, which work well for extracting bright lensed arcs but can erase the faint counterimages of the primary lensed images because of the high contrast between the images of the lenses and the counterimages. These factors may bias the ellipticity and inner density slopes of lens galaxies. We will try to address these issues by using larger training sets and more complex models in follow-up work.
Combining high-fidelity simulation data and a systematic machine learning pipeline is crucial for developing fast and accurate GGSL analysis techniques for future cosmological surveys. To this end, we created a dataset of 1 million synthetic images (500K GGSLs and 500K non-GGSLs), which, to our knowledge, is the largest GGSL simulation to date, and we developed an end-to-end machine learning pipeline with separate modules for denoising, deblending, lens searching, and lens modeling that is trained on this data. We demonstrate good denoising and deblending performance in both training and inference (compared with the ground truth) and, consequently, a significant improvement in classification and regression over the baseline (working directly with the noisy blended data). We also identify a few limitations of the simulation data, which lead to underestimated contamination from substructures in both the mass and light profiles of galaxies, and of the denoising/deblending model, which either misses counterimages or introduces additional artifacts. We plan to address these issues by scaling up to the full million-image dataset, training the denoising and deblending models together, and employing a hyperparameter search to improve the classification and regression accuracies. In addition, we will explore uncertainty quantification for the lens modeling output through probabilistic regression. Eventually, the pipeline is intended to be used for real-time lens finding and characterization with data from next-generation large-scale sky surveys such as Euclid, LSST, and WFIRST.
This material is based upon work supported by the U.S. Department of Energy (DOE), Office of Science, Office of Advanced Scientific Computing Research, under Contract DE-AC02-06CH11357. We gratefully acknowledge the computing resources provided and operated by the Joint Laboratory for System Evaluation (JLSE) at Argonne National Laboratory. The work is also supported by the UK Science and Technology Facilities Council (STFC).
- Anwar et al. (2019) A deep journey into super-resolution: a survey. arXiv preprint arXiv:1904.07523.
- Bayer et al. (2018) Observational constraints on the sub-galactic matter-power spectrum from galaxy-galaxy strong gravitational lensing. ArXiv e-prints.
- Bayliss et al. (2017) Spatially Resolved Patchy Lyα Emission within the Central Kiloparsec of a Strongly Lensed Quasar Host Galaxy at z = 2.8. The Astrophysical Journal Letters 845, pp. L14.
- Brehmer et al. (2019) Mining for dark matter substructure: inferring subhalo population properties from strong lenses with machine learning. arXiv e-prints, arXiv:1909.02005.
- Collett and Auger (2014) Cosmological constraints from the double source plane lens SDSSJ0946+1006. Monthly Notices of the Royal Astronomical Society 443, pp. 969–976.
- Collett (2015) The Population of Galaxy-Galaxy Strong Lenses in Forthcoming Optical Imaging Surveys. The Astrophysical Journal 811, pp. 20.
- Connolly et al. (2010) Simulating the LSST system. Modeling, Systems Engineering, and Project Management for Astronomy IV 7738.
- Dye et al. (2008) Models of the Cosmic Horseshoe gravitational lens J1004+4112. Monthly Notices of the Royal Astronomical Society 388 (1), pp. 384–392.
- Dye et al. (2018) Modelling high-resolution ALMA observations of strongly lensed highly star-forming galaxies detected by Herschel. Monthly Notices of the Royal Astronomical Society 476, pp. 4383–4394.
- Gavazzi et al. (2007) The Sloan Lens ACS Survey, IV: the mass density profile of early-type galaxies out to 100 effective radii. The Astrophysical Journal 667 (1), pp. 176–190.
- Gilman et al. (2017) Probing the nature of dark matter by forward modeling flux ratios in strong gravitational lenses. ArXiv e-prints.
- He et al. (2016) Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778.
- Hewitt et al. (1988) Unusual radio source MG1131+0456 - A possible Einstein ring. Nature 333, pp. 537–540.
- Hezaveh et al. (2016) Detection of lensing substructure using ALMA observations of the Dusty Galaxy SDP.81. The Astrophysical Journal 823, pp. 37.
- Hezaveh et al. (2017) Fast automated analysis of strong gravitational lenses with convolutional neural networks. Nature 548, pp. 555–557.
- Koopmans et al. (2006) The Sloan Lens ACS Survey, III: The structure and formation of early-type galaxies and their evolution since z ~1. The Astrophysical Journal 649 (2), pp. 599–615.
- Korytov et al. (2019) CosmoDC2: A Synthetic Sky Catalog for Dark Energy Science with LSST. arXiv e-prints, arXiv:1907.06530.
- Kummer et al. (2018) Effective description of dark matter self-interactions in small dark matter haloes. Monthly Notices of the Royal Astronomical Society 474, pp. 388–399.
- Küng et al. (2018) Models of gravitational lens candidates from Space Warps CFHTLS. Monthly Notices of the Royal Astronomical Society 474, pp. 3700–3713.
- Lanusse et al. (2018) CMU DeepLens: deep learning for automatic image-based galaxy-galaxy strong lens finding. Monthly Notices of the Royal Astronomical Society 473, pp. 3895–3906.
- Levasseur et al. (2017) Uncertainties in parameters estimated with neural networks: application to strong gravitational lensing. arXiv preprint arXiv:1708.08843.
- Li et al. (2016) PICS: simulations of strong gravitational lensing in galaxy clusters. The Astrophysical Journal 828 (1), pp. 54.
- Lim et al. (2017) Enhanced deep residual networks for single image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 136–144.
- Metcalf et al. (2019) The strong gravitational lens finding challenge. Astronomy & Astrophysics 625, pp. A119.
- Morningstar et al. (2018) Analyzing interferometric observations of strong gravitational lenses with recurrent and convolutional neural networks. arXiv preprint arXiv:1808.00011.
- Morningstar et al. (2019) Data-driven reconstruction of gravitationally lensed galaxies using recurrent inference machines. arXiv preprint arXiv:1901.01359.
- Pearson et al. (2019) The use of convolutional neural networks for modelling large optically-selected strong galaxy-lens samples. Monthly Notices of the Royal Astronomical Society 488, pp. 991–1004.
- Rana et al. (2017) Probing the cosmic distance duality relation using time delay lenses. Journal of Cosmology and Astroparticle Physics 7, pp. 010.
- Sharda et al. (2018) Testing star formation laws in a starburst galaxy at redshift 3 resolved with ALMA. Monthly Notices of the Royal Astronomical Society 477, pp. 4380–4390.
- Shu et al. (2016) The BOSS Emission-line Lens Survey. IV. Smooth Lens Models for the BELLS GALLERY Sample. The Astrophysical Journal 833, pp. 264.
- Shu et al. (2016) Kiloparsec Mass/Light Offsets in the Galaxy Pair-Lyα Emitter Lens System SDSS J1011+0143. The Astrophysical Journal 820, pp. 43.
- Sonnenfeld et al. (2015) The SL2S Galaxy-scale Lens Sample. V. Dark Matter Halos and Stellar IMF of Massive Early-type Galaxies Out to Redshift 0.8. The Astrophysical Journal 800, pp. 94.
- Suyu et al. (2017) H0LiCOW - I. H0 Lenses in COSMOGRAIL’s Wellspring: program overview. Monthly Notices of the Royal Astronomical Society 468, pp. 2590–2604.
- Szegedy et al. (2017) Inception-v4, Inception-ResNet and the impact of residual connections on learning. In AAAI, Vol. 4, pp. 12.
- Vegetti et al. (2014) Inference of the cold dark matter substructure mass function at z = 0.2 using strong gravitational lenses. Monthly Notices of the Royal Astronomical Society 442, pp. 2017–2035.
- Zhang et al. (2018) Residual dense network for image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2472–2481.
Appendix A Supplemental Tables and Figures
Appendix B Limitations of the Simulation Model
We adopt the SIE as the mass model of the lenses, which is insufficient for studying how the subtle structures of the lens galaxies influence the performance of our pipeline in analyzing strong lenses. To make the simulation more realistic, we plan to adopt the particle data from cosmological N-body simulations and stellar mass distributions from semi-analytic models to represent the mass distribution of lens galaxies. Furthermore, for image simulation, we include the images of sources and lenses only; in real observations, however, galaxies along the line of sight also contribute considerably, and the cosmoDC2 light cone can be used to include these effects. The study is focused on ground-based-like telescopes such as LSST, so the light profiles with bulges and disks are sufficient because of the coarse pixelization and large PSF. However, for space-based-like telescopes such as Euclid and WFIRST, the light profiles lack detailed galaxy structures such as spirals and clumps. We are attempting to attach substructures onto the galaxies by using GANs in a parallel project. Another issue is the overkill problem in the processes of denoising and deblending; that is, faint counterimages of the primary lensed images can be erased to some degree because of the high contrast between the images of the lenses and the counterimages. We will try to avoid potential biases caused by this problem by using more advanced algorithms in our follow-up work.
The submitted manuscript has been created by UChicago Argonne, LLC, Operator of Argonne National Laboratory (“Argonne”). Argonne, a U.S. Department of Energy Office of Science laboratory, is operated under Contract No. DE-AC02-06CH11357. The U.S. Government retains for itself, and others acting on its behalf, a paid-up nonexclusive, irrevocable worldwide license in said article to reproduce, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, by or on behalf of the Government. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan. http://energy.gov/downloads/doe-public-access-plan