Mining for Dark Matter Substructure: Inferring subhalo population properties from strong lenses with machine learning

09/04/2019 ∙ by Johann Brehmer, et al. ∙ NYU college 24

The subtle and unique imprint of dark matter substructure on extended arcs in strong lensing systems contains a wealth of information about the properties and distribution of dark matter on small scales and, consequently, about the underlying particle physics. However, teasing out this effect poses a significant challenge since the likelihood function for realistic simulations of population-level parameters is intractable. We apply recently-developed simulation-based inference techniques to the problem of substructure inference in galaxy-galaxy strong lenses. By leveraging additional information extracted from the simulator, neural networks are efficiently trained to estimate likelihood ratios associated with population-level parameters characterizing substructure. Through proof-of-principle application to simulated data, we show that these methods can provide an efficient and principled way to simultaneously analyze an ensemble of strong lenses, and can be used to mine the large sample of lensing images deliverable by near-future surveys for signatures of dark matter substructure.



There are no comments yet.


page 8

page 12

page 13

page 14

page 15

This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Dark matter (DM) accounts for nearly a quarter of the energy budget of the Universe, and pinning down its fundamental nature and interactions is one of the most pressing problems in cosmology and particle physics today. Despite an organized effort to do so through terrestrial (e. g., Akerib et al., 2017; Cui et al., 2017; Aprile et al., 2018), astrophysical (e. g., Albert et al., 2017; Chang et al., 2018; Lisanti et al., 2018), and collider (e. g., Sirunyan et al., 2017; Aaboud et al., 2019) searches, no conclusive evidence of interactions between the Standard Model (SM) and dark matter exists to-date.

Meanwhile, dark matter can also be studied directly through its irreducible gravitational interactions. The concordance Cold Dark Matter (CDM) framework of non-relativistic, collisionless dark matter particles provides an excellent description of the observed distribution of matter on large scales. However, many well-motivated models predict deviations from CDM on smaller scales. Fundamental dark matter microphysical properties such as its particle mass and self-interaction cross-section can imprint themselves onto its macroscopic distribution in ways that can be probed by current and future experiments (Buckley and Peter, 2018; Drlica-Wagner et al., 2019; Simon et al., 2019). As a motivating example, if dark matter has a significant free-streaming length, this would lead to its early decoupling from the cosmic plasma and an underabundance of lower-mass subhalos today (Bond and Szalay, 1983; Bode et al., 2001; Dalcanton and Hogan, 2001; Boyanovsky et al., 2008; Boyanovsky and Wu, 2011). Dark matter self-interactions (Spergel and Steinhardt, 2000; Yoshida et al., 2000; Davé et al., 2001; Colín et al., 2002; Vogelsberger et al., 2012; Peter et al., 2013; Zavala et al., 2013; Kaplinghat et al., 2014, 2016; Kamada et al., 2017; Elbert et al., 2018; Vogelsberger et al., 2019; Kahlhoefer et al., 2019; Nishikawa et al., 2019; Robles et al., 2019) and dissipative dynamics in the dark sector (Fan et al., 2013; Agrawal et al., 2017; Agrawal and Randall, 2017; Buckley and DiFranzo, 2018) are examples of scenarios that would modify the structure of the subhalo density profiles in addition to possibly depressing the abundance of lower-mass halos as compared to CDM predictions in the latter case (Buckley et al., 2014; Schewtschenko et al., 2015; Vogelsberger et al., 2016).

There exist several avenues for probing the distribution of dark matter on small scales. While the detection of ultrafaint dwarf galaxies through the study of stellar overdensities and kinematics (Koposov et al., 2008, 2015; Bechtol et al., 2015; Drlica-Wagner et al., 2015) can be used to make statements about the underlying dark matter properties, theoretical uncertainties in the connection between stellar and halo properties (Nadler et al., 2019; Wechsler and Tinker, 2018) and the effect of baryons on the satellite galaxy population (Errani et al., 2017; Garrison-Kimmel et al., 2017; Brooks, 2018; Fitts et al., 2018) pose a challenge. Furthermore, suppressed star formation in smaller halos means that there exists a threshold () below which subhalos are expected to be mostly dark and devoid of baryonic activity (Efstathiou, 1992; Fitts et al., 2017; Read et al., 2017). This makes studying the imprint of gravitational interactions the only viable avenue for probing substructure at smaller scales. In this spirit, the study of subhalo-induced perturbations to the kinematic phase-space distribution in cold stellar streams (Johnston et al., 1999; Carlberg, 2012; Carlberg and Grillmair, 2013; Bonaca and Hogg, 2018; Bonaca et al., 2019), and in Galactic stellar fields (Buschmann et al., 2018) have been proposed as methods to look for low-mass subhalos through their gravitational interactions in the Milky Way.

Complementary to the study of locally-induced gravitational effects, gravitational lensing has emerged as an important technique for studying the distribution of matter over a large range of scales. Locally, the use of time-domain astrometry has been proposed as a promising method to measure the distribution of local substructure through correlated, lens-induced motions of background celestial objects due to foreground subhalos (Van Tilburg et al., 2018). In the extragalactic regime, galaxy-scale strong lenses are laboratories for studying dark matter substructure. The typical substructure abundance within galaxy-scale lenses has been constrained through the measurement of positions and flux ratios of multiple images in quasar lenses (Dalal and Kochanek, 2002; Hsueh et al., 2019) and lensed images of extended (Vegetti et al., 2010, 2010, 2012; Hezaveh et al., 2016) as well as quasar sources (Fadely and Keeton, 2012; Nierenberg et al., 2014, 2017; Gilman et al., 2019a) have been used to set limits on the abundance of or find evidence for individual subhalo clumps with masses . Although these individual high-significance detections can be used to derive constraints on substructure abundance and the subhalo mass function, searches for one (or a general fixed number of) subhalos do not take into account covariances between models with different numbers of subhalos and can leave unexpressed the degeneracies between, e.g., the imprint of several low-mass subhalos and that of a massive subhalo perturber. Additionally, these detections by definition probe the most massive subhalos in the lensing galaxies which, given the particle physics-motivated goal of constraining small-scale structure, is the less interesting regime compared to probing the fainter end of the subhalo mass function.

Another approach relies on probing the collective effect of sub-threshold (i. e., not individually resolvable) subhalos on extended arcs in strongly lensed systems. A particular challenge here is that the properties of the individual subhalos correspond to a high-dimensional space of latent variables, which must be marginalized to compute the likelihood. This complicated marginalization integral makes the likelihood for population-level parameters effectively intractable. Methods based on summary statistics (Birrer et al., 2017) and studying the amplitude of spatial fluctuations on different scales through a power spectrum decomposition (Hezaveh et al., 2016; Cyr-Racine et al., 2016; Diaz Rivero et al., 2018; Chatterjee and Koopmans, 2018; Díaz Rivero et al., 2018; Cyr-Racine et al., 2019; Brennan et al., 2019) have been proposed as ways to reduce the dimensionality of the problem and enable substructure inference in a tractable way. This class of methods is well-suited to studying dark matter substructure since they can be sensitive to the population properties of low-mass subhalos in strongly lensed galaxies which are directly correlated with the underlying dark matter particle physics.

Particularly promising in this regard are trans-dimensional techniques like probabilistic cataloging (Brewer et al., 2016; Daylan et al., 2018)

that have been proposed to take into account covariances between models with different numbers of subhalos in a principled manner and can efficiently map out the parameter space associated with multiple sub-threshold objects in lensing systems. The output of such analyses is a ensemble of posterior-weighted subhalo catalogs which can be marginalized over to infer higher-level parameters (hyperparameters) characterizing the population properties of subhalos, potentially over multiple lensing images. These results can be highly sensitive to the assumed metamodel complexity however 

(Daylan et al., 2018) and potentially computationally limited for a large number of lenses as they require running an independent analysis to produce a probabilistic catalog for each image.

Current and near-future observatories like DES (Dark Energy Survey Collaboration et al., 2016), LSST (LSST Science Collaboration et al., 2009; Drlica-Wagner et al., 2019; Verma et al., 2019), and Euclid (Refregier et al., 2010) are expected to find hundreds to thousands of galaxy-galaxy strong lenses (Oguri and Marshall, 2010; Treu, 2010; Collett, 2015), making substructure inference in these systems (and high-resolution followups on a subset) one of the key avenues for investigating dark matter substructure and stress-testing the Cold Dark Matter paradigm in the near future. This calls for methods that can efficiently analyze large samples of lensed images to infer the underlying substructure properties with minimal loss of information stemming from dimensional reduction.

In recent years, a large number of methods have been developed that train neural networks to estimate the likelihood function, likelihood ratio function, or posterior (Fan et al., 2012; Dinh et al., 2014; Germain et al., 2015; Jimenez Rezende and Mohamed, 2015; Cranmer et al., 2015; Dinh et al., 2016; Paige and Wood, 2016; Papamakarios and Murray, 2016; Thomas et al., 2016; Uria et al., 2016; van den Oord et al., 2016, 2016, 2016; Tran et al., 2017; Papamakarios et al., 2017; Louppe and Cranmer, 2017; Lueckmann et al., 2017; Gutmann et al., 2017; Chen et al., 2018; Dinev and Gutmann, 2018; Grathwohl et al., 2018; Huang et al., 2018; Kingma and Dhariwal, 2018; Lueckmann et al., 2018; Papamakarios et al., 2018; Alsing et al., 2019; Hermans et al., 2019). These techniques can be directly applied to population-level parameters, avoiding an additional marginalization step. In contrast to traditional simulation-based (or “likelihood-free”) approaches, namely Approximate Bayesian Computation, they do not rely on summary statistics and instead learn to extract information directly from the full input data, which in our case corresponds to the observed lensed images. Finally, some of these methods let us to amortize the computational cost of the inference—after an upfront simulation and training phase, inference for any observed lensed image is efficient, enabling a simultaneous analysis of a large number of observations.

In this paper, we follow this approach and apply a particularly powerful technique for simulation-based inference introduced in Brehmer et al. (2018b, a, 2018); Stoye et al. (2018) to the problem of extracting high-level substructure properties from an ensemble of galaxy-galaxy strong lensing images. This method extracts additional information from the simulator, which is then used to train a neural network as a surrogate for the likelihood ratio function. The additional information increases the sample efficiency during training and thus reduces the computational cost. A calibration procedure ensures correct inference results even in the case of imperfectly trained networks. We demonstrate the feasibility of this method on a catalog of simulated lenses. After discussing the information content in individual lensed images, we switch to a simultaneous analysis of multiple observed images and calculate the expected combined constraints on population-level substructure parameters in both a frequentist and a Bayesian setup.

This paper is organized as follows. In Section 2 we briefly review the formalism of gravitational strong lensing and describe our simulation setup, including the assumptions we make about the population of background sources and host galaxies, the substructure population, and observational parameters. In Section 3 we describe the simulation-based analysis technique used and its particular application to the problem of mining dark matter substructure properties from an ensemble of extended lensed arcs. We show a proof-of-principle application to simulated data in Section 4 and comment on how this method can be extended to more realistic scenarios in Section 5. We conclude in Section 6. In the spirit of reproducibility, code associated with this paper is available on GitHub and we provide links below each figure () pointing to the Jupyter notebooks used to generate them.

2 Strong lensing formalism and simulation setup

In strong lensing systems a background light source is gravitationally lensed by an intervening mass distribution, resulting in multiple localized images on the lens plane (in the case of a point-like quasar source) or an arc-like image (in the case of an extended galaxy source). The latter provides the ability to probe substructure over a relatively larger region on the lens plane. Additionally, young, blue galaxies are ubiquitous in the redshift regime and dominate the faint end of the galaxy luminosity function, resulting in a larger deliverable sample of galaxy-galaxy strong lenses compared to that of multiply-imaged quasars. For these reasons, we focus our method towards galaxy-galaxy lenses—systems with extended background sources producing images with lensed arcs—although the techniques presented here can also be applied to samples of lensed quasars.

We now briefly review the basic mathematical formalism behind strong gravitational lensing before describing in turn the models for the lensing galaxy, background source, and dark matter substructure assumed in this study. We also describe the mock observational parameters assumed for the image sample as well as the population properties of the host lenses. Taken together, these define our lensing forward model.

2.1 Strong lensing formalism

Given a mass distribution with dimensionless projected surface mass density , where is the critical lensing surface density and , , and are the observer-lens, observer-source, and lens-source angular diameter distances respectively, the two-dimensional projected lensing potential is given by (e. g., Schneider et al., 1992; Bartelmann and Schneider, 2001)


The reduced deflection angle is given by the gradient of the projected lensing potential,


and can be used to determine the position of the lensed source through the lens equation,


where is the position of the source. For an extended source profile , the final lensed image can be obtained as the source light profile evaluated on the image plane (e. g., Daylan et al., 2018),


Given a lens density profile, the deflection vector can be computed using Equation (

2), and analytic expressions for many commonly considered profiles are available in the literature (e. g., Keeton, 2001). The projected lensing potential and mass density are related through the Poisson equation , and its linearity implies that the combined projected potential due to multiple perturbers can be written as the sum of individual potentials, and the individual deflections can be superimposed as a consequence. The total deflection can then be used to calculate the lensed image for a given source profile using Equation (4). For more details on the gravitational lensing formalism see, e. g., Schneider et al. (1992); Bartelmann and Schneider (2001); Treu (2010).

2.2 Lensing host galaxy

Cosmological -body simulations suggest that the dark matter distribution in structures at galactic scales can be well-described by a universal, spherically symmetric Navarro-Frenk-White (NFW) profile. However, strong lensing probes a region of the host galaxy much smaller than the typical virial radii of galaxy-scale dark matter halo, and the mass budget here is dominated by the baryonic bulge component of the galaxy. Taking this into account, the total mass budget of the lensing host galaxy, being early-type, can be well-described by a singular isothermal ellipsoid (SIE) profile. Since neither the dark matter nor the baryonic components are individually isothermal, this is sometimes known as the bulge-halo conspiracy (Treu, 2010). We consider the spherical simplification of the SIE profile, the singular isothermal sphere (SIS), with the density distribution given by (Kormann et al., 1994; Treu, 2010)


where is the central 1-D velocity dispersion of the lens galaxy and is the ellipsoid axis ratio, with corresponding to the SIS profile. The Einstein radius for this profile, defining the characteristic lensing scale, is given by (Treu, 2010)


where and are respectively the lens and source redshifts. We use the cosmology from Planck Collaboration et al. (2016) to compute cosmological distances throughout this paper.

The deflection field for the SIE profile is given by (Keeton, 2001)


with and we explicitly denote our angular coordinates as .

Although the total galaxy mass (baryons + dark matter) describe the macro lensing field, for the purposes of describing substructure we require being able to map the measured properties of an SIE lens onto the properties of the host dark matter halo. To do this, we relate the central stellar velocity dispersion to the mass of the host dark matter halo. Zahid et al. (2018) derived a tight correlation between and , modeled as


with and . We model the host dark matter halo with an NFW profile (Navarro et al., 1996, 1997)


where and are the scale density and scale radius, respectively. The halo virial mass describes the total mass contained with the virial radius , defined as the radius within which the mean density is 200 times the critical density of the universe and related to the scale radius through the concentration parameter . Thus, an NFW halo is completely described by the parameters . We use the concentration model from Sánchez-Conde and Prada (2014) to derive the halo concentration for a given NFW virial mass.

The spherically-symmetric deflection for an NFW perturber is given by (Keeton, 2001)


where , and


We described the population parameters used to model the host velocity dispersion (and thus its Einstein radius and dark matter halo mass) in Section 2.6 below.

2.3 Background source

We model the emission from background source galaxies using a Sérsic profile, with the surface brightness given by (Sérsic, 1963)


where is the effective circular half-light radius, is the Sérsic index, and is a factor depending on that ensures that contains half the total intensity from the source galaxy, given by (Ciotti and Bertin, 1999)

We assume for the source galaxies, corresponding to a flattened exponential profile and consistent with expectation for blue-type galaxies at the relevant redshifts. encodes the flux at half-light radius, which can be inferred from the total flux (or magnitude) associated with a given galaxy as follows. For a detector with zero-point magnitude , which specifies the magnitude of a source giving 1 count s in expectation, by definition the total counts are given by . Requiring the half-light radius to contain half the expected counts, for we have the relation in counts arcsec, where is the exposure time.

The treatment of the other Sérsic parameters, in particular the total emission and half-light radius, in the context of population studies is described in Section 2.6 below.

2.4 Lensing substructure

The ultimate goal of our method is to characterize the substructure population in strong lenses. Here we describe our procedure to model the substructure contribution to the lensing signal. Understanding the expected abundance of substructure in galaxies over a large range of epochs is a complex problem and an active ongoing area of research. Properties of individual subhalos (such as their density profiles) as well as those that describe their population (such as the mass and spatial distribution) are strongly affected by their host environment, and accurately modeling all aspects of subhalo evolution and environment is beyond the scope of this paper. Instead, we use a simplified description to model the substructure contribution in order to highlight the broad methodological points associated with the application of our method.

Cold Dark Matter (CDM), often called the standard model of cosmology, predicts a scale-invariant power spectrum of primordial fluctuations and the existence of substructure over a broad range of masses with approximately equal contribution per logarithmic mass interval. We parameterize the distribution of subhalo masses in a given host halo of mass —the subhalo mass function—as a power law distribution with a linear dependence on the host halo mass,


where encodes the overall substructure abundance, with larger corresponding to more substructure, and the slope encodes the relative contribution of subhalos at different masses, with more negative corresponding to a steeper slope with more low-mass subhalos. and are arbitrary normalization factors.

Theory and simulations within the framework of CDM predict a slope  (Madau et al., 2008; Springel et al., 2008), resulting in a nearly scale-invariant spectrum of subhalos, which we assume in our fiducial setup. We parameterize the overall subhalo abundance through the mass fraction within the lensing galaxies contained in subhalos, , defined as the fraction of the total dark matter halo mass contained in bound substructure in a given mass range:


For a given and host halo mass , this can be used to determine in Equation (14). The linear scaling of the subhalo mass function with the host halo mass in Equation (14) is additionally described in Han et al. (2016); Despali and Vegetti (2017). In our fiducial setups, we take the minimum and maximum subhalo masses to be and  (Despali and Vegetti, 2017; Hiroshima et al., 2018) respectively, and corresponding fiducial substructure mass fraction in this range of 5%, roughly consistent with observations in Dalal and Kochanek (2002); Hiroshima et al. (2018); Hsueh et al. (2019).

With all parameters of the subhalo mass function specified, the total number of subhalos expected within the virial radius of the host halo can be inferred as . Strong lensing probes a region much smaller than this scale—the typical Einstein radii for the host deflector are much smaller than the virial radius of the host dark matter halos. In order to obtain the expected number of subhalos within the lensing observation’s region of interest (ROI), we scale the total number of subhalos obtained from the above procedure by the ratio of projected mass within our region of interest and the host halo mass as follows. We assume the subhalos to be distributed in number density following the host NFW dark matter profile. In this case, the enclosed mass function is  (e. g., Keeton, 2001), where is the angular radius in units of the scale radius, and is given by Equation (12) above. The expected number of subhalos within our ROI is thus obtained as . We conservatively take the lensing ROI to enclose a region of angular size twice the Einstein radius of the host halo, .

Since strong lensing probes the line-of-sight distribution of subhalos within the host, their projected spatial distribution is approximately uniform within the lensing ROI (Despali and Vegetti, 2017). We thus distribute subhalos uniformly within our ROI. The density profile of subhalos is assumed to be NFW and given by Equation (10), with associated lensing properties as described and the concentration inferred using the model in Sánchez-Conde and Prada (2014).

We finally emphasize that we do not intent to capture all of the intricacies of the subhalo distribution, such as the effects of baryonic physics, tidal disruption of subhalos in proximity to the center of the host and redshift evolution of host as well as substructure properties. Although our description can be extended to take these effects into account (see Section 5), their precise characterization is still subject to large uncertainties, and our simple model above captures the essential physics for demonstration purposes.

2.5 Observational considerations

A noted above, our method is best-suited to analyzing a statistical sample of strong lenses, such as those that are expected to be obtained in the near future with optical telescopes like Euclid and LSST, to quantify the effect of substructure. Given the challenges associated with the precise characterization of such a sample at the present time, we describe here the observational characteristics we assume in order to build up training and testing samples to validate our inference techniques.

We largely follow the description of Collett (2015) and use the associated LensPop package to characterize our mock observations. In particular, we use the nominal detector configuration for Euclid, assuming a zero-point magnitude in the single optical VIS passband, a pixel grid with pixel size 0.1 arcsec, a Gaussian point spread function (PSF) with FWHM 0.18 arcsec, individual exposures with exposure time 1610 s, and an isotropic sky background with magnitude 22.8 arcsec in the detector passband.

These properties, in particular the exposure, sky background, and PSF shape, are expected to vary somewhat across the lens sample. Additionally, a given region may be imaged by multiple exposures over a range of color bands. Although such variations can be incorporated into our analysis, modeling these features is beyond the scope of this study. We comment on these extensions in Section 5.

2.6 Population properties of the lens and source samples

The fact that the strong lens population is expected to be dominated by higher-redshift () blue source galaxies lensed by intermediate-redshift () elliptical galaxies presents significant challenges for quantifying the lens population obtainable with future observations. Specifically, planned ground-based surveys like LSST and space telescopes like Euclid present complementary challenges for delivering images of strong lensing systems suitable for substructure studies. LSST is expected to image in six bands, allowing for efficient separation between source and lens emission, but at the cost of lower resolution by virtue of being a ground-based instrument. Euclid imaging is expected be higher in resolution but with a single optical passband (VIS). Near-IR imaging from WFIRST may deliver a high-resolution, multi-wavelength dataset that is more suitable for substructure studies, although potentially with different lens and source samples from those deliverable by optical telescopes.

In light of these uncertainties, we confine ourselves to a setting where the main methodological points can be made without detailed modeling of the detector capabilities and the deliverable lensing dataset, which is outside of the scope of the current paper. For concreteness, we simulate a sample of lenses with a simplified subset of host galaxy properties consistent with those deliverable by Euclid as modeled by Collett (2015). In particular, we assume spherical lenses, with ellipticity parameter in Equation (5). We draw the central 1-D velocity dispersions

of host galaxies from a normal distribution with mean 225 km s

and standard deviation 50 km s

. Following Zahid et al. (2018), Equation (9) is used to map the drawn to a dark matter halo mass , and the host Einstein radius is analytically inferred with Equation (6).

Figure 1: A sample of simulated lenses. The cross markers show the offset between the host and source centers, the color corresponding to the virial mass of the host dark matter halo. The simulated subhalos are shown as dots, the color again indicates their masses. The greyscale images show the corresponding observed images. We show seven images randomly generated for and .

We draw the lens redshifts

from a log-normal distribution with mean 0.56 and scatter 0.25 dex, discarding lenses with

as these tend to have a small angular size over which substructure perturbations are relevant. The source redshift is fixed at , its offsets and are drawn from a normal distribution with zero mean and standard deviation 0.2. These choices are consistent with the lens sample generated from the LensPop code packaged with Collett (2015). We show a sample of simulated lensed images with these settings in Figure 1.

3 Statistical formalism and simulation-based inference

Our goal is to infer the subhalo mass function parameters from a catalog of images of observed lenses. In this section we will describe the challenges of this inference problem and our approach of simulation-based inference. For simplicity, we will use a more abstract notation, distinguishing between three sets of quantities in the lensing system:

Parameters of interest

The vector parameterizes the subhalo mass function given, and our goal is to infer their values.

Latent variables

A vector of all other unobservable random variables in the simulator. These include the mass

, source-host offset , and redshift of the lens, the number of subhalos in the region of interest , the position and mass of each subhalo, and the random variables related to the point spread function and Poisson fluctuations.


The observed lens images.

Unfortunately, the same symbols are used with different meanings in astrophysics and statistics: note the difference between the parameters and the angular positions , and the Einstein radius ; between the latent variables and the redshifts , ; and between the observed image and the argument of the NFW profile and used in the last section.

As described above, we have implemented a simulator for the lensing process in the “forward” direction: for given parameters , the simulator samples latent variables and finally observed images . Here

is the probability density or likelihood function of observing a lens image

given parameters . It can be schematically written as


where we integrate over the latent variables and is the joint likelihood of observables and latent variables:


Here is the distribution of the host halo parameters; is the mean number of subhalos in the region of interest as a function of the parameters , while is the actually realized number in the simulation; and are the subhalo masses and positions; is the normalized subhalo mass function given in Equation (14); and in the last line is the probability of observing an image based on the true lensed image taking into account Poisson fluctuations and detector response through the point spread function.

Standard frequentist and Bayesian inference methods rely on evaluating the likelihood function

. Unfortunately, even in our somewhat simplified simulator each run of the simulation easily involves hundreds to thousands of latent variables, the integral in Equation (16) over this enormous space clearly cannot be computed explicitly. The likelihood function

is thus intractable, providing a major challenge for both frequentist and Bayesian inference. Similarly, inference with Markov Chain Monte Carlo (MCMC) methods based directly on the joint likelihood function

requires unfeasibly many samples before converging because the latent space is so large. Systems defined through a forward simulator that does not admit a tractable likelihood are known as “implicit models”, inference techniques for this case as “simulation-based inference” or “likelihood-free inference”.

One way to tackle this issue is to estimate the density for observables from samples from the simulator, where the latent variables are marginalized by the sampling procedure. But traditional density estimation techniques require reducing the dimensionality of with summary statistics , for instance based on power spectra (Hezaveh et al., 2016; Cyr-Racine et al., 2016; Diaz Rivero et al., 2018; Chatterjee and Koopmans, 2018; Díaz Rivero et al., 2018; Cyr-Racine et al., 2019; Brennan et al., 2019). The likelihood

in the space of summary statistics can either be explicitly estimated through density estimation techniques such as histograms, kernel density estimation, or Gaussian processes, or replaced by a rejection probability in an Approximate Bayesian Computation (ABC) technique 

(Rubin, 1984). Substructure inference in quasar and extended-arc lenses using ABC techniques was explored in Gilman et al. (2018) and Birrer et al. (2017), respectively. While the compression to summary statistics makes the analysis tractable, it typically loses information and hence reduces the statistical power of the analysis.

Instead, the likelihood function or density can be approximated without any compression to summary statistics with a neural network, which has to be trained only once and can be evaluated efficiently for any parameter point and observed image. Similarly, one can train a neural network to estimate the likelihood ratio for a fixed observation between two different hypotheses or parameter points. We will show how this turns the intractable integral in Equation (16) into a tractable minimization problem and amortizes the marginalization over . This approach scales well to the expected large number of lenses expected in upcoming surveys (Oguri and Marshall, 2010; Treu, 2010; Collett, 2015). Since the full image is used as input, no information is lost due to dimensionality reduction.

We use a simulation-based inference technique introduced in Brehmer et al. (2018b, a, 2018) that extracts additional information from the simulation and uses it to improve the sample efficiency of the training of the neural network. Our inference strategy consists of four steps:

  1. During each run of the simulator, additional information that characterizes the subhalo population and lensing process is stored together with the simulated observed image.

  2. This information is used to train a neural network to approximate the likelihood ratio function.

  3. The neural network output is calibrated, ensuring that errors during training do not lead to incorrect inference results.

  4. The calibrated network output is then used in either a frequentist or Bayesian setting to perform inference.

In the remainder of this section, we will explain these four steps in detail.

3.1 Extracting additional information from the simulator

In a first step, we generate training data by simulating a large number of observed lenses. For each lens, we first draw two parameter points from a proposal distribution, . This proposal distribution should cover the region of interest in the parameter space, but does not have to be identical to the prior in a Bayesian inference setting, which allows us to be agnostic about the inference setup at this stage. Note that we use the term “proposal distribution” to avoid confusion with the prior, even though it is not a proposal distribution in the MCMC sense.

Next, the simulator is run for the parameter point , generating an observed image . In addition, we calculate and save two quantities: the joint likelihood ratio


and the joint score


The joint likelihood ratio quantifies how much more or less likely a particular simulation chain including the latent variables is for the parameter point compared to a reference distribution


where we choose the marginal distribution of latent variables and observables corresponding to the proposal distribution . Unlike the distribution for a single reference parameter point, this marginal model has support for every potential outcome of the simulation (Hermans et al., 2019). The joint score is the gradient of the joint log likelihood in model parameter space and quantifies if a particular simulation chain becomes more or less likely under infinitesimal changes of the parameters of interest. Both quantities depend on the latent variables of the simulation chain.

We compute the joint likelihood ratio and joint score with Equation (17). Conveniently, the first and third line of that equation do not explicitly depend on the parameters of interest and cancel in the joint likelihood ratio and joint score; the remaining terms can be evaluated with little overhead to the simulation code. We also calculate the joint likelihood ratio and the joint score for the second parameter point and store the parameter points and , the simulated image , as well as the joint likelihood ratios and joint scores.

Our training samples consist of images, with parameter points chosen from a uniform range in and .

3.2 Machine learning

How are the joint likelihood ratio and joint score, which are dependent on the latent variables , useful for inference based on the likelihood function , which only depends on the observed lens images and the parameters of interest? Consider the functional


where we abbreviate , , , , , and for readability. Note that the test function is a function of and only. The first two lines of Equation (21

) are an improved version of the cross-entropy loss, in which the joint likelihood ratio is used to decrease the variance compared to the canonical cross-entropy 

(Stoye et al., 2018). The last line adds gradient information, weighted by a hyperparameter .

As shown in Stoye et al. (2018)

, this “ALICES” loss functional is minimized by the function


one-to-one with the likelihood ratio function


We demonstrate the minimization of this functional explicitly in Appendix A. This means that if we can construct the functional in Equation (21) with the joint likelihood ratio and joint score extracted from the simulator and numerically minimize it, the resulting function lets us reconstruct the (otherwise intractable) likelihood ratio function ! Essentially, this step lets us integrate out the dependence on latent variables from the joint likelihood ratio and score, but in a general, functional form that does not depend on a set of observed images.

This is why extraction of the joint likelihood ratio and joint score has been described with the analogy of “mining gold” from the simulator (Brehmer et al., 2018)—while calculating these quantities may require some effort and changes to the simulator code, through the minimization of a suitable functional they allow us to calculate the otherwise intractable likelihood ratio function.

In practice, we implement this minimization with machine learning. A neural network plays the role of the test function , the integrals in Equation (21) are approximated with a sum over training data sampled according to

, and we minimize the loss numerically through a stochastic gradient descent algorithm. The neural network trained in this way provides an estimator

of the likelihood ratio function that is exact in the limit of infinite training samples, sufficient network capacity, and efficient minimization. Note the “parameterized” structure of the network, in which a single neural network is trained to estimate the likelihood ratio over all of the parameter space, with the tested parameter point being an input to the network (Cranmer et al., 2015; Baldi et al., 2016). This approach is more efficient than a point-by-point analysis of a grid of parameter points: it allows the network to “borrow” information from neighboring parameter points, benefit ting from the typically smooth structure of the parameter space.

Given the image nature of the lensing data, we choose a convolutional network architecture based on the ResNet-18 (He et al., 2016) implementation in PyTorch (Paszke et al., 2017). The parameters enter as additional inputs in the fully connected layers of the network. Compared to the original ResNet-18 architecture, we add another fully connected layer at the end to ensure that the relation between parameters of interest and image data can be modeled. All inputs are normalized to zero mean and unit variance. We train the networks by minimizing the loss in Equation (21) with over 100 epochs with a batch size of 128 using stochastic gradient descent with momentum (Qian, 1999), exponentially decaying the learning rate from 0.01 to 0.0001 with early stopping. We pretrain the model on data generated from a simplified version of the simulator, namely the “fix” scenario described in Appendix B. This architecture and hyperparameter configuration performed best during a rough hyperparameter scan, though for this proof-of-concept study we have not performed an exhaustive optimization.

3.3 Calibration

In reality, the neural network might not learn the likelihood ratio function exactly, for instance due to limited training data or inefficient training. To make sure that our inference results are correct even in this case, we calibrate the network output with histograms (Cranmer et al., 2015; Brehmer et al., 2018a). For every parameter point that we want to test, we simulate a set of images from this parameter point and calculate the network prediction for each image. We also simulate a set of images from the reference model, again calculating the network prediction for each lens. The calibrated likelihood ratio is then calculated from histograms of the network predictions as


where the denote probability densities estimated with univariate histograms.

This additional calibration stage comes with a certain computational cost that increases linearly with the number of evaluated parameter points. However, it guarantees that as long as the simulator accurately models the process, the inference results will be perfect or conservative, but not too optimistic, even if the neural network output is substantially different from the true likelihood ratio.

We will show results both without and with calibration. Where calibration is used, it is based on histograms with 50 bins, with bin boundaries determined automatically to match the distribution of likelihood ratios.

3.4 Inference

After a neural network has been trained (and optionally calibrated) to estimate the likelihood ratio function, it provides the basic ingredient to both frequentist and Bayesian inference. Multiple observations can be combined in a straightforward way: since all lens images are assumed as identically distributed and independent (except for the common dependence on the population-level parameters), the combined likelihood of a set of images is given by the product of likelihood ratios for each individual lens,


For frequentist hypothesis tests, the most powerful test statistic to distinguish two parameter points

and is the likelihood ratio (Neyman and Pearson, 1933)


where in the last step we have replaced the exact likelihood ratio with the estimation from the (calibrated) neural network. In addition, the asymptotic properties of the likelihood ratio allow us in many cases to directly translate a value of the likelihood ratio into a -value and thus into exclusion limits at a given confidence level (Wilks, 1938; Wald, 1943; Cowan et al., 2011).

For Bayesian inference, note that we can write Bayes’ theorem as


where is the set of observed lens images and is the prior on the parameters of interest, which may be different from the proposal distribution used during the generation of training data. The posterior can thus be directly calculated given an estimator , provided that the space of the parameters of interest is low-dimensional enough to calculate the integral, or with MCMC (Hermans et al., 2019) or variational inference techniques otherwise.

While our approach to inference is strongly based on the ideas in Brehmer et al. (2018b, a, 2018); Stoye et al. (2018), there are some novel features in our analysis that we would like to highlight briefly. Unlike in those earlier papers, we use a marginal model based on the proposal distribution as reference model in the denominator of the likelihood ratio, which substantially improves the numerical stability of the algorithm. This choice also allows us to include the “flipped” terms with and in the loss function in Equation (21); we found that this new, improved version of the ALICES loss improves the sample efficiency of our algorithms. Both of these improvements are inspired by Hermans et al. (2019). Finally, this is the first application of the “gold mining” idea to image data, the first combination with a convolutional network architecture, and the first use for Bayesian inference. Although machine learning-based methods have previously been proposed for inferring strong lensing host parameters (Hezaveh et al., 2017; Perreault Levasseur et al., 2017; Morningstar et al., 2018) and for lensed source reconstruction (Morningstar et al., 2019), this paper represents the first proposed application of machine learning for dark matter substructure inference in strong lenses and, as far as we are aware, for substructure inference in general.

4 Results

Figure 2: Four simulated lens images (upper panels) and the corresponding likelihood ratio maps estimated by the network (lower panels, without calibration). The star marks the true point used to generate the images, the black line shows  CL contours in parameter space based on each image.

After training the neural network using the simulations described in Section 2 and the formalism described in Section 3, we can run the inference step on a given set of images to extract the likelihood ratio estimates associated with the substructure parameters of interest . We start by illustrating in Figure 2 inference on individual simulated lensed images realizing substructure corresponding to benchmark parameters and . The top row shows example simulated images, with the corresponding inferred 2-D likelihood surfaces shown in the bottom row. The true parameter point is marked with a star and the 95% confidence level (CL) contours are shown.

Several interesting features can already be seen in these results. The 95% CL contours contain the true parameter point, with the overall likelihood surface being strongly correlated with the corresponding image. A smaller projected surface area of the lensed arc, resulting from a smaller host halo or a larger offset between the host and source centers, generally results in a flatter likelihood surface. This is expected, since a smaller host galaxy will contain relatively less substructure, and a smaller host or larger relative offset will result in a smaller effective arc area over which the substructure can imprint itself. The first column of Figure 2 shows an example of such a system. In contrast, the last columns show a system with a relatively massive host and a small offset, producing a symmetric image with a larger effective arc surface area over which the effects of substructure can be discerned. This results in a “peakier” inferred likelihood surface, corresponding to a higher sensitivity to and . The second and third columns of Figure 2 correspond to systems with a small, centered and a large, offset halo respectively, and show intermediate sensitivity to substructure properties.

Figure 3: The expected per-lens likelihood ratio map assuming and in the two-dimensional parameter space (left) and along a one-dimensional slice at (right). The lines show expected  CL exclusion limits for 5 (dotted), 20 (dashed), and 100 (solid) observed lenses. While the colormap shows the network output without calibration, the lines include the calibration procedure described in Section 3.3.

In the spirit of stacking multiple observations, we next consider a simultaneous analysis of multiple lensed images. As discussed in Section 3.4, the product of the likelihood maps of the individual images defines the appropriate test statistic. For the purpose of population-level inference, these two-dimensional likelihood maps are hence a good alternative way to define a probabilistic catalog over individual observations, avoiding the complications of prior dependence and of communicating a complicated trans-dimensional posterior. In the left panel of Figure 3, we show the expected log likelihood ratio surface per-image in the asymptotic limit, with the 1-D slice corresponding to shown in the right panel. The 95% CL expected exclusion limits for 5, 20, and 100 lenses are shown using the dotted, dashed, and solid lines respectively. The procedure can easily be extended to an arbitrarily large collection of lenses.

We find that, at least within the simplifying assumptions of our simulator, an analysis of a few tens of lenses is already sensitive to the overall substructure abundance parameterized by . A larger observed lens sample provides a tighter constraint on substructure properties. Approximately 100 lens images are required to begin resolving

. The expected exclusion contours are centered around the true values, confirming that our inference methods yield an unbiased estimate of the underlying substructure properties. Note the “banana” shape of the expected exclusion limits, which approximately traces the total deflection contributed by substructure. We demonstrate this in Figure 

4, where we show a proxy for the total subhalo-induced deflection, , equal to the space-independent part of Equation (11), and compare it to the expected exclusion limits. In our particular substructure scenario, this proxy can be shown to approximately scale like . We note that this comparison is schematic, as the subtle effects of substructure over a wide range of masses cannot be quantified through a single number (here, the total deflection).

With the likelihood ratio in hand, Equation (27) easily admits a Bayesian interpretation. In the left panel of Figure 5 we show the posterior for 100 lenses derived from the expected likelihood ratio results, assuming a Gaussian prior with mean and standard deviation on the slope . This choice is intended to capture a prior expectation on the subhalo mass function slope consistent with the Cold Dark Matter scenario (e. g., Madau et al., 2008; Springel et al., 2008). As expected from the likelihood maps, we find a posterior density peaked around the true point.

The corresponding inferred subhalo mass function (SHMF) per host halo mass, marginalized over the host halo properties, is shown in the right panel of Figure 5

. We show the point-wise mean (solid line) and point-wise 68 / 95% credible intervals (cyan and blue bands). A comparison with the true simulated subhalo mass function (dotted line, also marginalized over the host halo properties) shows excellent agreement.

5 Extensions

For the proof-of-concept analysis presented here our lensing simulation makes a number of simplifying assumptions in order to highlight the broad methodological points in a computationally tractable setting. An application of our method to real lensing data will invariably require modifications to our simulation and inference pipelines to account for the vast physical diversity in host and source galaxy morphologies, as well as ways to deal with more realistic detector response. Modeling substructure in a more involved setting than presented here (e. g., to account for tidal evolution and/or suppression of small-scale structure), and accounting for substructure along the line of sight is also desired. We will now discuss these features and comment on how they might affect our pipeline and the results presented here, leaving implementation and application to real lensing data to future work.

Figure 4: Expected proxy for the total subhalo-induced deflection (see Equation (11)) as a function of and . The solid white lines show contours of constant deflection, while the dotted black lines show the expected exclusion limits from the left panel of Figure 3.
Figure 5: Left: Expected posterior and credible region on subhalo mass function parameters for 100 observed lenses. The mock observations are generated for and . We assume a uniform prior on and a Gaussian prior with mean and standard deviation for . Right: Corresponding inferred subhalo mass function, marginalized over the host halo properties. We show the point-wise mean (solid line) as well as point-wise 68 / 95% credible intervals (cyan / blue bands) of the subhalo mass function per host halo mass. The true simulated subhalo mass function for and is shown as the dotted line. An animation showing the evolution of the substructure parameter and mass function posteriors as the number of lenses analyzed is increased can be found at .

First, we currently fix all properties of the background source as described in Section 2.3. It is straightforward to instead draw and marginalize over the parameters associated with a chosen parameterization for the source light distribution, with Gaussian and Sérsic (Sérsic, 1963) profile models being common choices. For high-fidelity images (e. g., those obtainable by targeted followups or interferometric imaging) more complicated features in the background galaxies such as outflows may not be adequately captured by such a parameterization and could introduce degeneracies with the effects of substructure. Alternative parameterizations using shapelet basis sets (Birrer et al., 2015; Tagore and Jackson, 2016; Birrer and Amara, 2018), and methods based on regularized linear inversion on grids (Warren and Dye, 2003; Suyu et al., 2006; Tagore and Keeton, 2014; Nightingale et al., 2018) have been introduced as ways to model more complicated source features. For our purposes, generative / data-driven modeling of background galaxies could easily be interfaced with our pipeline to account for the variation in structure of the background sources (Morningstar et al., 2019).

Similarly, the host lens (and associated host dark matter halo) model can be made more realistic by relaxing the restriction to spherical host halos and including more complicated profiles than the Singular Isothermal Sphere considered here, drawing and marginalizing over additional host parameters as required. External shear, which models the fact that the local large-scale structure environment of the host galaxy can induce an additional overall deflection field in a preferred direction, can similarly be parameterized (e. g., Keeton et al., 1997; Schneider, 1997) and marginalized over.

A realistic simulator should also model the dynamical evolution of subhalos (Despali and Vegetti, 2017). Effects associated with tidal disruption due to the large gradient of the galactic potential towards the center of the host galaxy are expected to deplete the fraction of mass bound in substructures there, leading to a depressed overall subhalo abundance (Han et al., 2016) with profile properties (e. g., concentration (Moliné et al., 2017) and a truncation radius (Baltz et al., 2009)) that depend on the distance from the host center. This could easily be implemented within our framework by drawing 3-D positions for the subhalos from the host center and assigning properties consistent with more involved modeling. Our subhalo mass function in Equation (14) is independent of the lens redshift, but can easily be extended to include this dependency (Despali and Vegetti, 2017; Hiroshima et al., 2018). A more complicated dependence on the host halo than the linear one assumed in Equation (14) is also easily admitted.

All of these effects are straightforward to implement in our setup and only require modifications to the simulation code. The inference algorithm is unaffected; since these extensions do not explicitly depend on the parameters of interest, the likelihood terms associated with them cancel in the calculation of the joint likelihood ratio and the joint score. Nevertheless, these changes affect the final observed image and therefore also the true likelihood function; the variance of the joint likelihood ratio and score could therefore increase, requiring larger training samples before the network converges to the correct likelihood ratio function.

With these extensions, the redshift of the background source and the lens will play a more important role. Since these redshifts can potentially be measured through spectroscopic follow-up observations, it is likely that we can improve the performance of the inference algorithm by using this information. We can treat both the source and lens redshift, potentially with added uncertainty to model measurement noise, as additional observables. The input to the neural networks then consists of the observed lens image, the measured (potentially noisy) redshifts, and the tested parameter point. Except for a simple modification of the network architecture, the inference algorithm remains unchanged.

Including line-of-sight substructure can be somewhat more involved, since it necessitates the introduction of a separate line-of-sight halo mass function (Birrer et al., 2017; Despali et al., 2018; Gilman et al., 2019b; Hsueh et al., 2019). Depending on the specific model (and whether foreground substructure is treated as a nuisance effect or additional signal to be leveraged) its parameters could depend on the parameters of interest, which would require a modification of the calculation of the joint likelihood ratio and joint score. Structurally this is identical to our current modeling of subhalos within the lens. Since the abundance of foreground substructure is expected to be at most comparable to the substructure within the lensing galaxy (depending on the source redshift), we expect that these additional factors in the joint likelihood ratio and joint score will not slow down the overall simulation significantly, and will not increase the variance of the inference techniques too much while having the potential to improve the overall sensitivity of the analysis to substructure abundance in the Universe.

Modification to the subhalo mass function parameterization that we have considered may be desirable for constraining specific particle physics scenarios. For example, warm dark matter introduces a lower cutoff scale in the subhalo mass function (Bode et al., 2001) which can be parameterized and mapped onto the dark matter mass (Schneider et al., 2012; Lovell et al., 2014; Li et al., 2016; Birrer et al., 2017). This would also require a straightforward modification of the joint likelihood ratio and joint score calculation depending on the specific parameterization.

It is expected that a sample of strong lenses will include image-to-image variations on the exposure, sky background, and detector effects like the point spread function depending on the specific scanning strategy of the observatory. The sky background can be marginalized over as usual. Rather than treating the exposure and PSF model as nuisance parameters, passing them as additional a priori known inputs to the network in addition to normalizing the network input to unit exposure is likely to improve performance. Multiple color bands can easily be modeled and included as inputs to the neural network as different color channels, something that is commonly done when using the ResNet architecture we consider. This can substantially improve discrimination between light from the source, host, and sky background which tend to have a degree of separation in color space.

While including these extensions in our simulation and inference code is feasible, the detailed modeling is beyond the scope of the current paper. We thus leave the implementation of these features and application to real lensing data to future work.

6 Conclusions

Strong lensing offers a unique way to probe the properties and distribution of dark matter on sub-galactic scales through the subtle imprint of substructure on lensed arcs. The high dimensionality of the underlying latent space characterizing substructure poses a significant challenge, however. In this paper, we have introduced a novel simulation-based inference technique based on the ideas introduced in Cranmer et al. (2015); Brehmer et al. (2018b, a, 2018); Stoye et al. (2018) for inferring high-level population properties characterizing the distribution of substructure in an ensemble of galaxy-galaxy strong lenses and demonstrated its feasibility through proof-of-principle examples.

Our results on simulated data demonstrate that this method, based on calibrated likelihood ratio estimators with a machine learning back end, offers a promising way to analyze extended-arc strong lensing images with the goal of inferring properties of dark matter substructure. Our proposed method offers several combined advantages over established techniques. In probing the collective effect of a large number of low-mass, sub-threshold subhalos it can offer sensitivity to the faint end of the subhalo mass function where deviations from the concordance CDM paradigm and the effects of new physics are most likely to be expressed. It can naturally be applied to perform fast, principled, and concurrent analyses of a large sample of strong lenses that share a common set of hyperparameters describing the underlying substructure population properties. By efficiently marginalizing out the individual subhalo properties and directly inferring the population-level parameters of interest we are able to sidestep the more expensive two-step procedure of characterizing individual subhalos before using them to infer higher-level population parameters. Population-level likelihood scans for individual images are thus a suitable alternative to probabilistic catalogs over subhalos, avoiding both prior dependence as well as the logistical complexity of communicating a complicated trans-dimensional posterior. Furthermore, rigorous selection of lensing images out of a large sample is not necessary within our framework since images with a smaller effective arc area or low overall fidelity simply do not contribute significantly to the simultaneous substructure analysis, and non-detections are just as valuable as detections. Finally, our analysis is performed at the level of image data without incurring loss of information associated with dimensionality reduction.

Although we have focused on a simple proof-of-principle example in this paper, extensions to more realistic scenarios—including more complicated descriptions of the host, source, and substructure populations—are easily admitted within our framework. The flexibility of the proposed method allows for applications beyond substructure population inference as well. For example, a large lens sample can be used to perform cosmological parameter estimation while accounting for substructure effects and in particular to independently constrain the Hubble constant (Chen et al., 2019; Wong et al., 2019) through its dependence on the angular diameter distance scales in lensing systems. In the spirit of Alsing and Wandelt (2018), these techniques can also be used to learn powerful summary statistics (Brehmer et al., 2018).

We are currently at the dawn of a new era in observational cosmology, when ongoing and upcoming surveys—e. g., DES, LSST, Euclid, and WFIRST—are expected to discover and deliver images of thousands of strong lensing systems. These will harbor the subtle imprint of dark matter substructure, whose characterization could hold the key to unveiling the particle nature of dark matter. In this paper, we have introduced a powerful machine learning-based method that can be used to uncover the properties of small-scale structure within these lenses and in the Universe at large. The techniques presented have the potential to maximize the information that can be extracted from a complex lens sample and zero in on signatures of new physics.

The code used to obtain the results in this paper is available at .

We thank Simon Birrer, Christopher Fassnacht, Daniel Gilman, Siavash Golkar, and Neal Weiner for useful conversations. SM thanks Laura Chang for collaboration at the early stages of this work. JB and KC are partially supported by NSF awards ACI-1450310, OAC-1836650, and OAC-1841471, and the Moore-Sloan Data Science Environment at NYU. KC is also supported through the NSF grant PHY-1505463. JH thanks the F.R.S.-FNRS for his FRIA scholarship. SM is supported by the NSF CAREER grant PHY-1554858, NSF grants PHY-1620727 and PHY-1915409, and the Simons Foundation. This work was also supported through the NYU IT High Performance Computing resources, services, and staff expertise. This research has made use of NASA’s Astrophysics Data System. Astropy (Astropy Collaboration et al., 2013, 2018), Jupyter (Kluyver et al., 2016), IPython (Perez and Granger, 2007), LensPop (Collett, 2015), MadMiner (Brehmer et al., 2019), Matplotlib (Hunter, 2007), NumPy (van der Walt et al., 2011), Palettable (Davis, 2012), PyTorch (Paszke et al., 2017), SciPy (Jones et al., 2001).

Appendix A Minimum of the loss functional

A central step in our inference technique is numerically minimizing the functional given in Equation (21) to obtain an estimator for the likelihood ratio function. Here we will use calculus of variation to explicitly show that the solution given in Equation (22) in fact minimizes this loss, closely following Brehmer et al. (2018a); Stoye et al. (2018).

First consider the case of , i. e. the functional


where we use the shorthand notation , , , . The function that minimizes this functional has to satisfy


As long as , this is equivalent to


and finally


in agreement with Equation (22). Note that this result is independent of the choice of , as long as this proposal distribution has support at all relevant parameter points.

Similarly it can be shown that the gradient term in the loss functional weighted by is minimized when the gradient of the log likelihood ratio estimated by the neural network is equal to the true score,


We refer the reader to Brehmer et al. (2018a) for the derivation. While not strictly necessary for the inference technique, including this term in the loss function substantially improves the sample efficiency of the algorithm, similar to how gradient information makes any fit converge faster.

Appendix B Simplified scenarios

Figure 6: Left: The expected  CL exclusion limits for 5 observed lenses for four different levels of complexity of the simulator. Right: The expected likelihood ratio along a one-dimensional slice through the parameter space at for the same four simulator scenarios. In both panels we compare the “full” simulator discussed in Section 2, a scenario in which the host mass is varied but the offset relative to the source is fixed at zero (“mass”), a case in which the source offset is varied but the host halo mass is fixed (“align”), and a toy scenario in which both the offset and the mass of the host halo are fixed (“fix”). The data was generated for and .

In order to validate our setup and to disentangle the impact of different latent variables on the inference results we consider three additional versions of our simulation. In the simplest one, which we call “fix”, all source and host properties are fixed to particular value, including the host halo mass and the offset between source and lens, which is set to zero. In the “align” scenario we relax the restriction on the source offset variables and and draw them from a Gaussian as described in Section 2. In the “mass” version, on the other hand, the offset is fixed at zero, but the host halo mass is drawn from a distribution as described above. We train separate neural networks on lens images generated in these three scenarios and calculated likelihood maps as described in Section 3, although to save computation time we do not perform a calibration procedure.

The expected confidence limits for 5 observed lens images in the three simplified scenarios and our “full” setup are compared in Figure 6. As expected, the more latent variables we keep fixed, the more the inference technique becomes more sensitive to the parameters of interest. In particular fixing the source-host alignment substantially increases the strength of the expected limits.


  • M. Aaboud, G. Aad, B. Abbott, D. C. Abbott, O. Abdinov, D. K. Abhayasinghe, S. H. Abidi, O. S. AbouZeid, N. L. Abraham, H. Abramowicz, and et al. (2019) Constraints on mediator-based dark matter and scalar dark energy models using {s} = 13 TeV pp collision data collected by the ATLAS detector. JHEP 2019 (5), pp. 142. External Links: Document, 1903.01400 Cited by: §1.
  • P. Agrawal, F. Cyr-Racine, L. Randall, and J. Scholtz (2017) Dark catalysis. J. Cosmology Astropart. Phys 2017 (8), pp. 021. External Links: Document, 1702.05482 Cited by: §1.
  • P. Agrawal and L. Randall (2017) Point sources from dissipative dark matter. J. Cosmology Astropart. Phys 2017 (12), pp. 019. External Links: Document, 1706.04195 Cited by: §1.
  • D. S. Akerib, S. Alsum, H. M. Araújo, X. Bai, A. J. Bailey, J. Balajthy, P. Beltrame, E. P. Bernard, A. Bernstein, T. P. Biesiadzinski, E. M. Boulton, R. Bramante, P. Brás, D. Byram, S. B. Cahn, M. C. Carmona-Benitez, C. Chan, A. A. Chiller, C. Chiller, A. Currie, J. E. Cutter, T. J. R. Davison, A. Dobi, J. E. Y. Dobson, E. Druszkiewicz, B. N. Edwards, C. H. Faham, S. Fiorucci, R. J. Gaitskell, V. M. Gehman, C. Ghag, K. R. Gibson, M. G. D. Gilchriese, C. R. Hall, M. Hanhardt, S. J. Haselschwardt, S. A. Hertel, D. P. Hogan, M. Horn, D. Q. Huang, C. M. Ignarra, M. Ihm, R. G. Jacobsen, W. Ji, K. Kamdin, K. Kazkaz, D. Khaitan, R. Knoche, N. A. Larsen, C. Lee, B. G. Lenardo, K. T. Lesko, A. Lindote, M. I. Lopes, A. Manalaysay, R. L. Mannino, M. F. Marzioni, D. N. McKinsey, D. -M. Mei, J. Mock, M. Moongweluwan, J. A. Morad, A. St. J. Murphy, C. Nehrkorn, H. N. Nelson, F. Neves, K. O’Sullivan, K. C. Oliver-Mallory, K. J. Palladino, E. K. Pease, P. Phelps, L. Reichhart, C. Rhyne, S. Shaw, T. A. Shutt, C. Silva, M. Solmaz, V. N. Solovov, P. Sorensen, S. Stephenson, T. J. Sumner, M. Szydagis, D. J. Taylor, W. C. Taylor, B. P. Tennyson, P. A. Terman, D. R. Tiedt, W. H. To, M. Tripathi, L. Tvrznikova, S. Uvarov, J. R. Verbus, R. C. Webb, J. T. White, T. J. Whitis, M. S. Witherell, F. L. H. Wolfs, J. Xu, K. Yazdani, S. K. Young, C. Zhang, and LUX Collaboration (2017) Results from a Search for Dark Matter in the Complete LUX Exposure. Phys. Rev. Lett. 118 (2), pp. 021303. External Links: Document, 1608.07648 Cited by: §1.
  • A. Albert, B. Anderson, K. Bechtol, A. Drlica-Wagner, M. Meyer, M. Sánchez-Conde, L. Strigari, M. Wood, T. M. C. Abbott, F. B. Abdalla, A. Benoit-Lévy, G. M. Bernstein, R. A. Bernstein, E. Bertin, D. Brooks, D. L. Burke, A. Carnero Rosell, M. Carrasco Kind, J. Carretero, M. Crocce, C. E. Cunha, C. B. D’Andrea, L. N. da Costa, S. Desai, H. T. Diehl, J. P. Dietrich, P. Doel, T. F. Eifler, A. E. Evrard, A. Fausti Neto, D. A. Finley, B. Flaugher, P. Fosalba, J. Frieman, D. W. Gerdes, D. A. Goldstein, D. Gruen, R. A. Gruendl, K. Honscheid, D. J. James, S. Kent, K. Kuehn, N. Kuropatkin, O. Lahav, T. S. Li, M. A. G. Maia, M. March, J. L. Marshall, P. Martini, C. J. Miller, R. Miquel, E. Neilsen, B. Nord, R. Ogando, A. A. Plazas, K. Reil, A. K. Romer, E. S. Rykoff, E. Sanchez, B. Santiago, M. Schubnell, I. Sevilla-Noarbe, R. C. Smith, M. Soares-Santos, F. Sobreira, E. Suchyta, M. E. C. Swanson, G. Tarle, V. Vikram, A. R. Walker, R. H. Wechsler, Fermi-LAT Collaboration, and DES Collaboration (2017) Searching for Dark Matter Annihilation in Recently Discovered Milky Way Satellites with Fermi-Lat. ApJ 834 (2), pp. 110. External Links: Document, 1611.03184 Cited by: §1.
  • J. Alsing, T. Charnock, S. Feeney, and B. Wand elt (2019)

    Fast likelihood-free cosmology with neural density estimators and active learning

    MNRAS 488 (3), pp. 4440–4458. External Links: Document, 1903.00007 Cited by: §1.
  • J. Alsing and B. Wandelt (2018) Generalized massive optimal data compression. MNRAS 476 (1), pp. L60–L64. External Links: Document, 1712.00012 Cited by: §6.
  • E. Aprile, J. Aalbers, F. Agostini, M. Alfonsi, L. Althueser, F. D. Amaro, M. Anthony, F. Arneodo, L. Baudis, B. Bauermeister, M. L. Benabderrahmane, T. Berger, P. A. Breur, A. Brown, A. Brown, E. Brown, S. Bruenner, G. Bruno, R. Budnik, C. Capelli, J. M. R. Cardoso, D. Cichon, D. Coderre, A. P. Colijn, J. Conrad, J. P. Cussonneau, M. P. Decowski, P. de Perio, P. di Gangi, A. di Giovanni, S. Diglio, A. Elykov, G. Eurin, J. Fei, A. D. Ferella, A. Fieguth, W. Fulgione, A. Gallo Rosso, M. Galloway, F. Gao, M. Garbini, C. Geis, L. Grandi, Z. Greene, H. Qiu, C. Hasterok, E. Hogenbirk, J. Howlett, R. Itay, F. Joerg, B. Kaminsky, S. Kazama, A. Kish, G. Koltman, H. Landsman, R. F. Lang, L. Levinson, Q. Lin, S. Lindemann, M. Lindner, F. Lombardi, J. A. M. Lopes, J. Mahlstedt, A. Manfredini, T. Marrodán Undagoitia, J. Masbou, D. Masson, M. Messina, K. Micheneau, K. Miller, A. Molinario, K. Morâ, M. Murra, J. Naganoma, K. Ni, U. Oberlack, B. Pelssers, F. Piastra, J. Pienaar, V. Pizzella, G. Plante, R. Podviianiuk, N. Priel, D. Ramírez García, L. Rauch, S. Reichard, C. Reuter, B. Riedel, A. Rizzo, A. Rocchetti, N. Rupp, J. M. F. Dos Santos, G. Sartorelli, M. Scheibelhut, S. Schindler, J. Schreiner, D. Schulte, M. Schumann, L. Scotto Lavina, M. Selvi, P. Shagin, E. Shockley, M. Silva, H. Simgen, D. Thers, F. Toschi, G. Trinchero, C. Tunnell, N. Upole, M. Vargas, O. Wack, H. Wang, Z. Wang, Y. Wei, C. Weinheimer, C. Wittweg, J. Wulf, J. Ye, Y. Zhang, T. Zhu, and Xenon Collaboration (2018) Dark Matter Search Results from a One Ton-Year Exposure of XENON1T. Phys. Rev. Lett. 121 (11), pp. 111302. External Links: Document, 1805.12562 Cited by: §1.
  • Astropy Collaboration, A. M. Price-Whelan, B. M. Sipőcz, H. M. Günther, P. L. Lim, S. M. Crawford, S. Conseil, D. L. Shupe, M. W. Craig, N. Dencheva, A. Ginsburg, J. T. VanderPlas, L. D. Bradley, D. Pérez-Suárez, M. de Val-Borro, T. L. Aldcroft, K. L. Cruz, T. P. Robitaille, E. J. Tollerud, C. Ardelean, T. Babej, Y. P. Bach, M. Bachetti, A. V. Bakanov, S. P. Bamford, G. Barentsen, P. Barmby, A. Baumbach, K. L. Berry, F. Biscani, M. Boquien, K. A. Bostroem, L. G. Bouma, G. B. Brammer, E. M. Bray, H. Breytenbach, H. Buddelmeijer, D. J. Burke, G. Calderone, J. L. Cano Rodríguez, M. Cara, J. V. M. Cardoso, S. Cheedella, Y. Copin, L. Corrales, D. Crichton, D. D’Avella, C. Deil, É. Depagne, J. P. Dietrich, A. Donath, M. Droettboom, N. Earl, T. Erben, S. Fabbro, L. A. Ferreira, T. Finethy, R. T. Fox, L. H. Garrison, S. L. J. Gibbons, D. A. Goldstein, R. Gommers, J. P. Greco, P. Greenfield, A. M. Groener, F. Grollier, A. Hagen, P. Hirst, D. Homeier, A. J. Horton, G. Hosseinzadeh, L. Hu, J. S. Hunkeler, Ž. Ivezić, A. Jain, T. Jenness, G. Kanarek, S. Kendrew, N. S. Kern, W. E. Kerzendorf, A. Khvalko, J. King, D. Kirkby, A. M. Kulkarni, A. Kumar, A. Lee, D. Lenz, S. P. Littlefair, Z. Ma, D. M. Macleod, M. Mastropietro, C. McCully, S. Montagnac, B. M. Morris, M. Mueller, S. J. Mumford, D. Muna, N. A. Murphy, S. Nelson, G. H. Nguyen, J. P. Ninan, M. Nöthe, S. Ogaz, S. Oh, J. K. Parejko, N. Parley, S. Pascual, R. Patil, A. A. Patil, A. L. Plunkett, J. X. Prochaska, T. Rastogi, V. Reddy Janga, J. Sabater, P. Sakurikar, M. Seifert, L. E. Sherbert, H. Sherwood-Taylor, A. Y. Shih, J. Sick, M. T. Silbiger, S. Singanamalla, L. P. Singer, P. H. Sladen, K. A. Sooley, S. Sornarajah, O. Streicher, P. Teuben, S. W. Thomas, G. R. Tremblay, J. E. H. Turner, V. Terrón, M. H. van Kerkwijk, A. de la Vega, L. L. Watkins, B. A. Weaver, J. B. Whitmore, J. Woillez, V. Zabalza, and Astropy Contributors (2018) The Astropy Project: Building an Open-science Project and Status of the v2.0 Core Package. AJ 156, pp. 123. External Links: Document, 1801.02634 Cited by: §6.
  • Astropy Collaboration, T. P. Robitaille, E. J. Tollerud, P. Greenfield, M. Droettboom, E. Bray, T. Aldcroft, M. Davis, A. Ginsburg, A. M. Price-Whelan, W. E. Kerzendorf, A. Conley, N. Crighton, K. Barbary, D. Muna, H. Ferguson, F. Grollier, M. M. Parikh, P. H. Nair, H. M. Unther, C. Deil, J. Woillez, S. Conseil, R. Kramer, J. E. H. Turner, L. Singer, R. Fox, B. A. Weaver, V. Zabalza, Z. I. Edwards, K. Azalee Bostroem, D. J. Burke, A. R. Casey, S. M. Crawford, N. Dencheva, J. Ely, T. Jenness, K. Labrie, P. L. Lim, F. Pierfederici, A. Pontzen, A. Ptak, B. Refsdal, M. Servillat, and O. Streicher (2013) Astropy: A community Python package for astronomy. A&A 558, pp. A33. External Links: Document, 1307.6212 Cited by: §6.
  • P. Baldi, K. Cranmer, T. Faucett, P. Sadowski, and D. Whiteson (2016) Parameterized neural networks for high-energy physics. European Physical Journal C 76 (5), pp. 235. External Links: Document, 1601.07913 Cited by: §3.2.
  • E. A. Baltz, P. Marshall, and M. Oguri (2009) Analytic models of plausible gravitational lens potentials. J. Cosmology Astropart. Phys 2009 (1), pp. 015. External Links: Document, 0705.0682 Cited by: §5.
  • M. Bartelmann and P. Schneider (2001) Weak gravitational lensing. Phys. Rep. 340 (4-5), pp. 291–472. External Links: Document, astro-ph/9912508 Cited by: §2.1, §2.1.
  • K. Bechtol, A. Drlica-Wagner, E. Balbinot, A. Pieres, J. D. Simon, B. Yanny, B. Santiago, R. H. Wechsler, J. Frieman, A. R. Walker, P. Williams, E. Rozo, E. S. Rykoff, A. Queiroz, E. Luque, A. Benoit-Lévy, D. Tucker, I. Sevilla, R. A. Gruendl, L. N. da Costa, A. Fausti Neto, M. A. G. Maia, T. Abbott, S. Allam, R. Armstrong, A. H. Bauer, G. M. Bernstein, R. A. Bernstein, E. Bertin, D. Brooks, E. Buckley-Geer, D. L. Burke, A. Carnero Rosell, F. J. Castander, R. Covarrubias, C. B. D’Andrea, D. L. DePoy, S. Desai, H. T. Diehl, T. F. Eifler, J. Estrada, A. E. Evrard, E. Fernandez, D. A. Finley, B. Flaugher, E. Gaztanaga, D. Gerdes, L. Girardi, M. Gladders, D. Gruen, G. Gutierrez, J. Hao, K. Honscheid, B. Jain, D. James, S. Kent, R. Kron, K. Kuehn, N. Kuropatkin, O. Lahav, T. S. Li, H. Lin, M. Makler, M. March, J. Marshall, P. Martini, K. W. Merritt, C. Miller, R. Miquel, J. Mohr, E. Neilsen, R. Nichol, B. Nord, R. Ogando, J. Peoples, D. Petravick, A. A. Plazas, A. K. Romer, A. Roodman, M. Sako, E. Sanchez, V. Scarpine, M. Schubnell, R. C. Smith, M. Soares-Santos, F. Sobreira, E. Suchyta, M. E. C. Swanson, G. Tarle, J. Thaler, D. Thomas, W. Wester, J. Zuntz, and DES Collaboration (2015) Eight New Milky Way Companions Discovered in First-year Dark Energy Survey Data. ApJ 807 (1), pp. 50. External Links: Document, 1503.02584 Cited by: §1.
  • S. Birrer, A. Amara, and A. Refregier (2015) Gravitational Lens Modeling with Basis Sets. ApJ 813 (2), pp. 102. External Links: Document, 1504.07629 Cited by: §5.
  • S. Birrer, A. Amara, and A. Refregier (2017) Lensing substructure quantification in RXJ1131-1231: a 2 keV lower bound on dark matter thermal relic mass. J. Cosmology Astropart. Phys 2017 (5), pp. 037. External Links: Document, 1702.00009 Cited by: §1, §3, §5.
  • S. Birrer and A. Amara (2018) lenstronomy: Multi-purpose gravitational lens modelling software package. Physics of the Dark Universe 22, pp. 189–201. External Links: Document, 1803.09746 Cited by: §5.
  • S. Birrer, C. Welschen, A. Amara, and A. Refregier (2017) Line-of-sight effects in strong lensing: putting theory into practice. J. Cosmology Astropart. Phys 2017 (4), pp. 049. External Links: Document, 1610.01599 Cited by: §5.
  • P. Bode, J. P. Ostriker, and N. Turok (2001) Halo Formation in Warm Dark Matter Models. ApJ 556 (1), pp. 93–107. External Links: Document, astro-ph/0010389 Cited by: §1, §5.
  • A. Bonaca, D. W. Hogg, A. M. Price-Whelan, and C. Conroy (2019) The Spur and the Gap in GD-1: Dynamical Evidence for a Dark Substructure in the Milky Way Halo. ApJ 880 (1), pp. 38. External Links: Document, 1811.03631 Cited by: §1.
  • A. Bonaca and D. W. Hogg (2018) The Information Content in Cold Stellar Streams. ApJ 867 (2), pp. 101. External Links: Document, 1804.06854 Cited by: §1.
  • J. R. Bond and A. S. Szalay (1983) The collisionless damping of density fluctuations in an expanding universe. ApJ 274, pp. 443–468. External Links: Document Cited by: §1.
  • D. Boyanovsky, H. J. de Vega, and N. G. Sanchez (2008) Dark matter transfer function: Free streaming, particle statistics, and memory of gravitational clustering. Phys. Rev. D 78 (6), pp. 063546. External Links: Document, 0807.0622 Cited by: §1.
  • D. Boyanovsky and J. Wu (2011) Small scale aspects of warm dark matter: Power spectra and acoustic oscillations. Phys. Rev. D 83 (4), pp. 043524. External Links: Document, 1008.0992 Cited by: §1.
  • J. Brehmer, K. Cranmer, G. Louppe, and J. Pavez (2018a) A guide to constraining effective field theories with machine learning. Phys. Rev. D 98 (5), pp. 052004. External Links: Document, 1805.00020 Cited by: Appendix A, Appendix A, §1, §3.3, §3.4, §3, §6.
  • J. Brehmer, K. Cranmer, G. Louppe, and J. Pavez (2018b) Constraining Effective Field Theories with Machine Learning. Phys. Rev. Lett. 121 (11), pp. 111801. External Links: Document, 1805.00013 Cited by: §1, §3.4, §3, §6.
  • J. Brehmer, F. Kling, I. Espejo, and K. Cranmer (2019) MadMiner: Machine learning-based inference for particle physics. , pp. . External Links: 1907.10621 Cited by: §6.
  • J. Brehmer, G. Louppe, J. Pavez, and K. Cranmer (2018) Mining gold from implicit models to improve likelihood-free inference. , pp. . External Links: 1805.12244 Cited by: §1, §3.2, §3.4, §3, §6, §6.
  • S. Brennan, A. J. Benson, F. Cyr-Racine, C. R. Keeton, L. A. Moustakas, and A. R. Pullen (2019) Quantifying the power spectrum of small-scale structure in semi-analytic galaxies. MNRAS 488 (4), pp. 5085–5092. External Links: Document Cited by: §1, §3.
  • B. J. Brewer, D. Huijser, and G. F. Lewis (2016) Trans-dimensional Bayesian inference for gravitational lens substructures. MNRAS 455 (2), pp. 1819–1829. External Links: Document, 1508.00662 Cited by: §1.
  • A. M. Brooks (2018) Understanding Dwarf Galaxies in order to Understand Dark Matter. , pp. . External Links: 1812.00044 Cited by: §1.
  • M. R. Buckley and A. DiFranzo (2018) Collapsed Dark Matter Structures. Phys. Rev. Lett. 120 (5), pp. 051102. External Links: Document, 1707.03829 Cited by: §1.
  • M. R. Buckley and A. H. G. Peter (2018) Gravitational probes of dark matter physics. Physics Reports 761, pp. 1–60. External Links: Document, 1712.06615 Cited by: §1.
  • M. R. Buckley, J. Zavala, F. Cyr-Racine, K. Sigurdson, and M. Vogelsberger (2014) Scattering, damping, and acoustic oscillations: Simulating the structure of dark matter halos with relativistic force carriers. Phys. Rev. D 90 (4), pp. 043524. External Links: Document, 1405.2075 Cited by: §1.
  • M. Buschmann, J. Kopp, B. R. Safdi, and C. Wu (2018) Stellar Wakes from Dark Matter Subhalos. Phys. Rev. Lett. 120 (21), pp. 211101. External Links: Document, 1711.03554 Cited by: §1.
  • R. G. Carlberg and C. J. Grillmair (2013) Gaps in the GD-1 Star Stream. ApJ 768 (2), pp. 171. External Links: Document, 1303.4342 Cited by: §1.
  • R. G. Carlberg (2012) Dark Matter Sub-halo Counts via Star Stream Crossings. ApJ 748 (1), pp. 20. External Links: Document, 1109.6022 Cited by: §1.
  • L. J. Chang, M. Lisanti, and S. Mishra-Sharma (2018) Search for dark matter annihilation in the Milky Way halo. Phys. Rev. D 98 (12), pp. 123004. External Links: Document, 1804.04132 Cited by: §1.
  • S. Chatterjee and L. V. E. Koopmans (2018) The inner mass power spectrum of galaxies using strong gravitational lensing: beyond linear approximation. MNRAS 474 (2), pp. 1762–1772. External Links: Document, 1710.03075 Cited by: §1, §3.
  • G. C. -F. Chen, C. D. Fassnacht, Sherry. H. Suyu, C. E. Rusu, J. H. H. Chan, K. C. Wong, M. W. Auger, S. Hilbert, V. Bonvin, S. Birrer, M. Millon, L. V. E. Koopmans, D. J. Lagattuta, J. P. McKean, S. Vegetti, F. Courbin, X. Ding, A. Halkola, I. Jee, A. J. Shajib, D. Sluse, A. Sonnenfeld, and T. Treu (2019) A SHARP view of H0LiCOW: from three time-delay gravitational lens systems with adaptive optics imaging. , pp. . External Links: 1907.02533 Cited by: §6.
  • R. T. Q. Chen, Y. Rubanova, J. Bettencourt, and D. Duvenaud (2018)

    Neural Ordinary Differential Equations

    , pp. . External Links: 1806.07366 Cited by: §1.
  • L. Ciotti and G. Bertin (1999) Analytical properties of the R law. A&A 352, pp. 447–451. External Links: astro-ph/9911078 Cited by: §2.3.
  • P. Colín, V. Avila-Reese, O. Valenzuela, and C. Firmani (2002) Structure and Subhalo Population of Halos in a Self-interacting Dark Matter Cosmology. ApJ 581 (2), pp. 777–793. External Links: Document, astro-ph/0205322 Cited by: §1.
  • T. E. Collett (2015) The Population of Galaxy-Galaxy Strong Lenses in Forthcoming Optical Imaging Surveys. ApJ 811, pp. 20. External Links: Document, 1507.02657 Cited by: §1, §2.5, §2.6, §2.6, §3, §6.
  • G. Cowan, K. Cranmer, E. Gross, and O. Vitells (2011) Asymptotic formulae for likelihood-based tests of new physics. European Physical Journal C 71, pp. 1554. External Links: Document, 1007.1727 Cited by: §3.4.
  • K. Cranmer, J. Pavez, and G. Louppe (2015)

    Approximating Likelihood Ratios with Calibrated Discriminative Classifiers

    , pp. . External Links: 1506.02169 Cited by: §1, §3.2, §3.3, §6.
  • X. Cui, A. Abdukerim, W. Chen, X. Chen, Y. Chen, B. Dong, D. Fang, C. Fu, K. Giboni, F. Giuliani, L. Gu, Y. Gu, X. Guo, Z. Guo, K. Han, C. He, D. Huang, S. He, X. Huang, Z. Huang, X. Ji, Y. Ju, S. Li, Y. Li, H. Lin, H. Liu, J. Liu, Y. Ma, Y. Mao, K. Ni, J. Ning, X. Ren, F. Shi, A. Tan, C. Wang, H. Wang, M. Wang, Q. Wang, S. Wang, X. Wang, X. Wang, Q. Wu, S. Wu, M. Xiao, P. Xie, B. Yan, Y. Yang, J. Yue, D. Zhang, H. Zhang, T. Zhang, T. Zhang, L. Zhao, J. Zhou, N. Zhou, and X. Zhou (2017) Dark Matter Results from 54-Ton-Day Exposure of PandaX-II Experiment. Phys. Rev. Lett. 119 (18), pp. 181302. External Links: Document Cited by: §1.
  • F. Cyr-Racine, C. R. Keeton, and L. A. Moustakas (2019) Beyond subhalos: Probing the collective effect of the Universe’s small-scale structure with gravitational lensing. Phys. Rev. D 100 (2), pp. 023013. External Links: Document, 1806.07897 Cited by: §1, §3.
  • F. Cyr-Racine, L. A. Moustakas, C. R. Keeton, K. Sigurdson, and D. A. Gilman (2016) Dark census: Statistically detecting the satellite populations of distant galaxies. Phys. Rev. D 94 (4), pp. 043505. External Links: Document, 1506.01724 Cited by: §1, §3.
  • N. Dalal and C. S. Kochanek (2002) Direct Detection of Cold Dark Matter Substructure. ApJ 572 (1), pp. 25–33. External Links: Document, astro-ph/0111456 Cited by: §1, §2.4.
  • J. J. Dalcanton and C. J. Hogan (2001) Halo Cores and Phase-Space Densities: Observational Constraints on Dark Matter Physics and Structure Formation. ApJ 561 (1), pp. 35–45. External Links: Document, astro-ph/0004381 Cited by: §1.
  • Dark Energy Survey Collaboration, T. Abbott, F. B. Abdalla, J. Aleksić, S. Allam, A. Amara, D. Bacon, E. Balbinot, M. Banerji, K. Bechtol, A. Benoit-Lévy, G. M. Bernstein, E. Bertin, J. Blazek, C. Bonnett, S. Bridle, D. Brooks, R. J. Brunner, E. Buckley-Geer, D. L. Burke, G. B. Caminha, D. Capozzi, J. Carlsen, A. Carnero-Rosell, M. Carollo, M. Carrasco-Kind, J. Carretero, F. J. Castander, L. Clerkin, T. Collett, C. Conselice, M. Crocce, C. E. Cunha, C. B. D’Andrea, L. N. da Costa, T. M. Davis, S. Desai, H. T. Diehl, J. P. Dietrich, S. Dodelson, P. Doel, A. Drlica-Wagner, J. Estrada, J. Etherington, A. E. Evrard, J. Fabbri, D. A. Finley, B. Flaugher, R. J. Foley, P. Fosalba, J. Frieman, J. García-Bellido, E. Gaztanaga, D. W. Gerdes, T. Giannantonio, D. A. Goldstein, D. Gruen, R. A. Gruendl, P. Guarnieri, G. Gutierrez, W. Hartley, K. Honscheid, B. Jain, D. J. James, T. Jeltema, S. Jouvel, R. Kessler, A. King, D. Kirk, R. Kron, K. Kuehn, N. Kuropatkin, O. Lahav, T. S. Li, M. Lima, H. Lin, M. A. G. Maia, M. Makler, M. Manera, C. Maraston, J. L. Marshall, P. Martini, R. G. McMahon, P. Melchior, A. Merson, C. J. Miller, R. Miquel, J. J. Mohr, X. Morice-Atkinson, K. Naidoo, E. Neilsen, R. C. Nichol, B. Nord, R. Ogando, F. Ostrovski, A. Palmese, A. Papadopoulos, H. V. Peiris, J. Peoples, W. J. Percival, A. A. Plazas, S. L. Reed, A. Refregier, A. K. Romer, A. Roodman, A. Ross, E. Rozo, E. S. Rykoff, I. Sadeh, M. Sako, C. Sánchez, E. Sanchez, B. Santiago, V. Scarpine, M. Schubnell, I. Sevilla-Noarbe, E. Sheldon, M. Smith, R. C. Smith, M. Soares-Santos, F. Sobreira, M. Soumagnac, E. Suchyta, M. Sullivan, M. Swanson, G. Tarle, J. Thaler, D. Thomas, R. C. Thomas, D. Tucker, J. D. Vieira, V. Vikram, A. R. Walker, R. H. Wechsler, J. Weller, W. Wester, L. Whiteway, H. Wilcox, B. Yanny, Y. Zhang, and J. Zuntz (2016) The Dark Energy Survey: more than dark energy - an overview. MNRAS 460 (2), pp. 1270–1299. External Links: Document, 1601.00329 Cited by: §1.
  • R. Davé, D. N. Spergel, P. J. Steinhardt, and B. D. Wandelt (2001) Halo Properties in Cosmological Simulations of Self-interacting Cold Dark Matter. ApJ 547 (2), pp. 574–589. External Links: Document, astro-ph/0006218 Cited by: §1.
  • M. Davis (2012) palettable: color palettes for Python. Note: [Online; accessed August 28, 2019] External Links: Link Cited by: §6.
  • T. Daylan, F. Cyr-Racine, A. Diaz Rivero, C. Dvorkin, and D. P. Finkbeiner (2018) Probing the Small-scale Structure in Strongly Lensed Systems via Transdimensional Inference. ApJ 854 (2), pp. 141. External Links: Document, 1706.06111 Cited by: §1, §2.1.
  • G. Despali, S. Vegetti, S. D. M. White, C. Giocoli, and F. C. van den Bosch (2018) Modelling the line-of-sight contribution in substructure lensing. MNRAS 475 (4), pp. 5424–5442. External Links: Document, 1710.05029 Cited by: §5.
  • G. Despali and S. Vegetti (2017) The impact of baryonic physics on the subhalo mass function and implications for gravitational lensing. MNRAS 469, pp. 1997–2010. External Links: Document, 1608.06938 Cited by: §2.4, §2.4, §5.
  • A. Diaz Rivero, F. Cyr-Racine, and C. Dvorkin (2018) Power spectrum of dark matter substructure in strong gravitational lenses. Phys. Rev. D 97 (2), pp. 023001. External Links: Document, 1707.04590 Cited by: §1, §3.
  • A. Díaz Rivero, C. Dvorkin, F. Cyr-Racine, J. Zavala, and M. Vogelsberger (2018) Gravitational lensing and the power spectrum of dark matter substructure: Insights from the ETHOS N -body simulations. Phys. Rev. D 98 (10), pp. 103517. External Links: Document, 1809.00004 Cited by: §1, §3.
  • T. Dinev and M. U. Gutmann (2018) Dynamic Likelihood-free Inference via Ratio Estimation (DIRE). , pp. . External Links: 1810.09899 Cited by: §1.
  • L. Dinh, D. Krueger, and Y. Bengio (2014) NICE: Non-linear Independent Components Estimation. , pp. . External Links: 1410.8516 Cited by: §1.
  • L. Dinh, J. Sohl-Dickstein, and S. Bengio (2016) Density estimation using Real NVP. , pp. . External Links: 1605.08803 Cited by: §1.
  • A. Drlica-Wagner, K. Bechtol, E. S. Rykoff, E. Luque, A. Queiroz, Y. -Y. Mao, R. H. Wechsler, J. D. Simon, B. Santiago, B. Yanny, E. Balbinot, S. Dodelson, A. Fausti Neto, D. J. James, T. S. Li, M. A. G. Maia, J. L. Marshall, A. Pieres, K. Stringer, A. R. Walker, T. M. C. Abbott, F. B. Abdalla, S. Allam, A. Benoit-Lévy, G. M. Bernstein, E. Bertin, D. Brooks, E. Buckley-Geer, D. L. Burke, A. Carnero Rosell, M. Carrasco Kind, J. Carretero, M. Crocce, L. N. da Costa, S. Desai, H. T. Diehl, J. P. Dietrich, P. Doel, T. F. Eifler, A. E. Evrard, D. A. Finley, B. Flaugher, P. Fosalba, J. Frieman, E. Gaztanaga, D. W. Gerdes, D. Gruen, R. A. Gruendl, G. Gutierrez, K. Honscheid, K. Kuehn, N. Kuropatkin, O. Lahav, P. Martini, R. Miquel, B. Nord, R. Ogando, A. A. Plazas, K. Reil, A. Roodman, M. Sako, E. Sanchez, V. Scarpine, M. Schubnell, I. Sevilla-Noarbe, R. C. Smith, M. Soares-Santos, F. Sobreira, E. Suchyta, M. E. C. Swanson, G. Tarle, D. Tucker, V. Vikram, W. Wester, Y. Zhang, J. Zuntz, and DES Collaboration (2015) Eight Ultra-faint Galaxy Candidates Discovered in Year Two of the Dark Energy Survey. ApJ 813 (2), pp. 109. External Links: Document, 1508.03622 Cited by: §1.
  • A. Drlica-Wagner, Y. Mao, S. Adhikari, R. Armstrong, A. Banerjee, N. Banik, K. Bechtol, S. Bird, K. K. Boddy, A. Bonaca, J. Bovy, M. R. Buckley, E. Bulbul, C. Chang, G. Chapline, J. Cohen-Tanugi, A. Cuoco, F. Cyr-Racine, W. A. Dawson, A. Díaz Rivero, C. Dvorkin, D. Erkal, C. D. Fassnacht, J. García-Bellido, M. Giannotti, V. Gluscevic, N. Golovich, D. Hendel, Y. D. Hezaveh, S. Horiuchi, M. J. Jee, M. Kaplinghat, C. R. Keeton, S. E. Koposov, C. Y. Lam, T. S. Li, J. R. Lu, R. Mandelbaum, S. D. McDermott, M. McNanna, M. Medford, M. Meyer, M. Marc, S. Murgia, E. O. Nadler, L. Necib, E. Nuss, A. B. Pace, A. H. G. Peter, D. A. Polin, C. Prescod-Weinstein, J. I. Read, R. Rosenfeld, N. Shipp, J. D. Simon, T. R. Slatyer, O. Straniero, L. E. Strigari, E. Tollerud, J. A. Tyson, M. Wang, R. H. Wechsler, D. Wittman, H. Yu, G. Zaharijas, Y. Ali-Haïmoud, J. Annis, S. Birrer, R. Biswas, J. Blazek, A. M. Brooks, E. Buckley-Geer, R. Caputo, E. Charles, S. Digel, S. Dodelson, B. Flaugher, J. Frieman, E. Gawiser, A. P. Hearin, R. Hložek, B. Jain, T. E. Jeltema, S. M. Koushiappas, M. Lisanti, M. LoVerde, S. Mishra-Sharma, J. A. Newman, B. Nord, E. Nourbakhsh, S. Ritz, B. E. Robertson, M. A. Sánchez-Conde, A. Slosar, T. M. P. Tait, A. Verma, R. Vilalta, C. W. Walter, B. Yanny, and A. R. Zentner (2019) Probing the Fundamental Nature of Dark Matter with the Large Synoptic Survey Telescope. , pp. . External Links: 1902.01055 Cited by: §1, §1.
  • G. Efstathiou (1992) Suppressing the formation of dwarf galaxies via photoionization. MNRAS 256 (2), pp. 43P–47P. External Links: Document Cited by: §1.
  • O. D. Elbert, J. S. Bullock, M. Kaplinghat, S. Garrison-Kimmel, A. S. Graus, and M. Rocha (2018) A Testable Conspiracy: Simulating Baryonic Effects on Self-interacting Dark Matter Halos. ApJ 853 (2), pp. 109. External Links: Document, 1609.08626 Cited by: §1.
  • R. Errani, J. Peñarrubia, C. F. P. Laporte, and F. A. Gómez (2017) The effect of a disc on the population of cuspy and cored dark matter substructures in Milky Way-like galaxies. MNRAS 465 (1), pp. L59–L63. External Links: Document, 1608.01849 Cited by: §1.
  • R. Fadely and C. R. Keeton (2012) Substructure in the lens HE 0435-1223. MNRAS 419 (2), pp. 936–951. External Links: Document, 1109.0548 Cited by: §1.
  • J. Fan, A. Katz, L. Randall, and M. Reece (2013) Double-Disk Dark Matter. Physics of the Dark Universe 2 (3), pp. 139–156. External Links: Document, 1303.1521 Cited by: §1.
  • Y. Fan, D. J. Nott, and S. A. Sisson (2012) Approximate Bayesian Computation via Regression Density Estimation. , pp. . External Links: 1212.1479 Cited by: §1.
  • A. Fitts, M. Boylan-Kolchin, B. Bozek, J. S. Bullock, A. Graus, V. Robles, P. F. Hopkins, K. El-Badry, S. Garrison-Kimmel, C. Faucher-Giguère, A. Wetzel, and D. Kereš (2018) Dwarf Galaxies in CDM, WDM, and SIDM: Disentangling Baryons and Dark Matter Physics. , pp. . External Links: 1811.11791 Cited by: §1.
  • A. Fitts, M. Boylan-Kolchin, O. D. Elbert, J. S. Bullock, P. F. Hopkins, J. Oñorbe, A. Wetzel, C. Wheeler, C. Faucher-Giguère, D. Kereš, E. D. Skillman, and D. R. Weisz (2017) fire in the field: simulating the threshold of galaxy formation. MNRAS 471 (3), pp. 3547–3562. External Links: Document, 1611.02281 Cited by: §1.
  • S. Garrison-Kimmel, A. Wetzel, J. S. Bullock, P. F. Hopkins, M. Boylan-Kolchin, C. Faucher-Giguère, D. Kereš, E. Quataert, R. E. Sanderson, A. S. Graus, and T. Kelley (2017) Not so lumpy after all: modelling the depletion of dark matter subhaloes by Milky Way-like galaxies. MNRAS 471 (2), pp. 1709–1727. External Links: Document, 1701.03792 Cited by: §1.
  • M. Germain, K. Gregor, I. Murray, and H. Larochelle (2015)

    MADE: Masked Autoencoder for Distribution Estimation

    , pp. . External Links: 1502.03509 Cited by: §1.
  • D. Gilman, S. Birrer, A. Nierenberg, T. Treu, X. Du, and A. Benson (2019a) Warm dark matter chills out: constraints on the halo mass function and the free-streaming length of dark matter with 8 quadruple-image strong gravitational lenses. , pp. . External Links: 1908.06983 Cited by: §1.
  • D. Gilman, S. Birrer, T. Treu, C. R. Keeton, and A. Nierenberg (2018) Probing the nature of dark matter by forward modelling flux ratios in strong gravitational lenses. MNRAS 481 (1), pp. 819–834. External Links: Document, 1712.04945 Cited by: §3.
  • D. Gilman, S. Birrer, T. Treu, A. Nierenberg, and A. Benson (2019b) Probing dark matter structure down to 10 solar masses: flux ratio statistics in gravitational lenses with line-of-sight haloes. MNRAS 487 (4), pp. 5721–5738. External Links: Document, 1901.11031 Cited by: §5.
  • W. Grathwohl, R. T. Q. Chen, J. Bettencourt, I. Sutskever, and D. Duvenaud (2018) FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models. , pp. . External Links: 1810.01367 Cited by: §1.
  • M. U. Gutmann, R. Dutta, S. Kaski, and J. Corander (2017) Likelihood-free inference via classification. Statistics and Computing, pp. 1–15. Cited by: §1.
  • J. Han, S. Cole, C. S. Frenk, and Y. Jing (2016) A unified model for the spatial and mass distribution of subhaloes. MNRAS 457 (2), pp. 1208–1223. External Links: Document, 1509.02175 Cited by: §2.4, §5.
  • K. He, X. Zhang, S. Ren, and J. Sun (2016) Deep residual learning for image recognition. In

    Proceedings of the IEEE conference on computer vision and pattern recognition

    pp. 770–778. Cited by: §3.2.
  • J. Hermans, V. Begy, and G. Louppe (2019) Likelihood-free MCMC with Approximate Likelihood Ratios. , pp. . External Links: 1903.04057 Cited by: §1, §3.1, §3.4, §3.4.
  • Y. D. Hezaveh, N. Dalal, D. P. Marrone, Y. Mao, W. Morningstar, D. Wen, R. D. Bland ford, J. E. Carlstrom, C. D. Fassnacht, G. P. Holder, A. Kemball, P. J. Marshall, N. Murray, L. Perreault Levasseur, J. D. Vieira, and R. H. Wechsler (2016) Detection of Lensing Substructure Using ALMA Observations of the Dusty Galaxy SDP.81. ApJ 823 (1), pp. 37. External Links: Document, 1601.01388 Cited by: §1.
  • Y. D. Hezaveh, L. Perreault Levasseur, and P. J. Marshall (2017)

    Fast automated analysis of strong gravitational lenses with convolutional neural networks

    Nature 548 (7669), pp. 555–557. External Links: Document, 1708.08842 Cited by: §3.4.
  • Y. Hezaveh, N. Dalal, G. Holder, T. Kisner, M. Kuhlen, and L. Perreault Levasseur (2016) Measuring the power spectrum of dark matter substructure using strong gravitational lensing. J. Cosmology Astropart. Phys 2016 (11), pp. 048. External Links: Document, 1403.2720 Cited by: §1, §3.
  • N. Hiroshima, S. Ando, and T. Ishiyama (2018) Modeling evolution of dark matter substructure and annihilation boost. Phys. Rev. D 97 (12), pp. 123002. External Links: Document, 1803.07691 Cited by: §2.4, §5.
  • J. Hsueh, W. Enzi, S. Vegetti, M. Auger, C. D. Fassnacht, G. Despali, L. V. E. Koopmans, and J. P. McKean (2019) SHARP – VII. New constraints on warm dark matter free-streaming properties and substructure abundance from flux-ratio anomalous lensed quasars. , pp. . External Links: 1905.04182 Cited by: §1, §2.4, §5.
  • C.-W. Huang, D. Krueger, A. Lacoste, and A. Courville (2018) Neural Autoregressive Flows. , pp. . External Links: 1804.00779 Cited by: §1.
  • J. D. Hunter (2007) Matplotlib: a 2d graphics environment. Computing In Science & Engineering 9 (3), pp. 90–95. Cited by: §6.
  • D. Jimenez Rezende and S. Mohamed (2015) Variational Inference with Normalizing Flows. , pp. . External Links: 1505.05770 Cited by: §1.
  • K. V. Johnston, H. Zhao, D. N. Spergel, and L. Hernquist (1999) Tidal Streams as Probes of the Galactic Potential. ApJ 512 (2), pp. L109–L112. External Links: Document, astro-ph/9807243 Cited by: §1.
  • E. Jones, T. Oliphant, P. Peterson, et al. (2001) SciPy: open source scientific tools for Python. Note: [Online; accessed August 26, 2019] External Links: Link Cited by: §6.
  • F. Kahlhoefer, M. Kaplinghat, T. R. Slatyer, and C. Wu (2019) Diversity in density profiles of self-interacting dark matter satellite halos. , pp. . External Links: 1904.10539 Cited by: §1.
  • A. Kamada, M. Kaplinghat, A. B. Pace, and H. Yu (2017) Self-Interacting Dark Matter Can Explain Diverse Galactic Rotation Curves. Phys. Rev. Lett. 119 (11), pp. 111102. External Links: Document, 1611.02716 Cited by: §1.
  • M. Kaplinghat, R. E. Keeley, T. Linden, and H. Yu (2014) Tying Dark Matter to Baryons with Self-Interactions. Phys. Rev. Lett. 113 (2), pp. 021302. External Links: Document, 1311.6524 Cited by: §1.
  • M. Kaplinghat, S. Tulin, and H. Yu (2016) Dark Matter Halos as Particle Colliders: Unified Solution to Small-Scale Structure Puzzles from Dwarfs to Clusters. Phys. Rev. Lett. 116 (4), pp. 041302. External Links: Document, 1508.03339 Cited by: §1.
  • C. R. Keeton, C. S. Kochanek, and U. Seljak (1997) Shear and Ellipticity in Gravitational Lenses. ApJ 482 (2), pp. 604–620. External Links: Document, astro-ph/9610163 Cited by: §5.
  • C. R. Keeton (2001) A Catalog of Mass Models for Gravitational Lensing. , pp. . External Links: astro-ph/0102341 Cited by: §2.1, §2.2, §2.2, §2.4.
  • D. P. Kingma and P. Dhariwal (2018) Glow: Generative Flow with Invertible 1x1 Convolutions. , pp. . External Links: 1807.03039 Cited by: §1.
  • T. Kluyver, B. Ragan-Kelley, F. Pérez, B. E. Granger, M. Bussonnier, J. Frederic, K. Kelley, J. B. Hamrick, J. Grout, S. Corlay, P. Ivanov, D. Avila, S. Abdalla, C. Willing, and et al. (2016) Jupyter notebooks - a publishing format for reproducible computational workflows. In ELPUB, Cited by: §6.
  • S. Koposov, V. Belokurov, N. W. Evans, P. C. Hewett, M. J. Irwin, G. Gilmore, D. B. Zucker, H. -W. Rix, M. Fellhauer, E. F. Bell, and E. V. Glushkova (2008) The Luminosity Function of the Milky Way Satellites. ApJ 686 (1), pp. 279–291. External Links: Document, 0706.2687 Cited by: §1.
  • S. E. Koposov, V. Belokurov, G. Torrealba, and N. W. Evans (2015) Beasts of the Southern Wild: Discovery of Nine Ultra Faint Satellites in the Vicinity of the Magellanic Clouds.. ApJ 805 (2), pp. 130. External Links: Document, 1503.02079 Cited by: §1.
  • R. Kormann, P. Schneider, and M. Bartelmann (1994) Isothermal elliptical gravitational lens models.. A&A 284, pp. 285–299. Cited by: §2.2.
  • R. Li, C. S. Frenk, S. Cole, L. Gao, S. Bose, and W. A. Hellwing (2016) Constraints on the identity of the dark matter from strong gravitational lenses. MNRAS 460 (1), pp. 363–372. External Links: Document, 1512.06507 Cited by: §5.
  • M. Lisanti, S. Mishra-Sharma, N. L. Rodd, and B. R. Safdi (2018) Search for Dark Matter Annihilation in Galaxy Groups. Phys. Rev. Lett. 120 (10), pp. 101101. External Links: Document, 1708.09385 Cited by: §1.
  • G. Louppe and K. Cranmer (2017) Adversarial Variational Optimization of Non-Differentiable Simulators. , pp. . External Links: 1707.07113 Cited by: §1.
  • M. R. Lovell, C. S. Frenk, V. R. Eke, A. Jenkins, L. Gao, and T. Theuns (2014) The properties of warm dark matter haloes. MNRAS 439 (1), pp. 300–317. External Links: Document, 1308.1399 Cited by: §5.
  • LSST Science Collaboration, P. A. Abell, J. Allison, S. F. Anderson, J. R. Andrew, J. R. P. Angel, L. Armus, D. Arnett, S. J. Asztalos, T. S. Axelrod, and et al. (2009) LSST Science Book, Version 2.0. , pp. . External Links: 0912.0201 Cited by: §1.
  • J. Lueckmann, G. Bassetto, T. Karaletsos, and J. H. Macke (2018) Likelihood-free inference with emulator networks. , pp. . External Links: 1805.09294 Cited by: §1.
  • J. Lueckmann, P. J. Goncalves, G. Bassetto, K. Öcal, M. Nonnenmacher, and J. H. Macke (2017) Flexible statistical inference for mechanistic models of neural dynamics. , pp. . External Links: 1711.01861 Cited by: §1.
  • P. Madau, J. Diemand, and M. Kuhlen (2008) Dark Matter Subhalos and the Dwarf Satellites of the Milky Way. ApJ 679 (2), pp. 1260–1271. External Links: Document, 0802.2265 Cited by: §2.4, §4.
  • Á. Moliné, M. A. Sánchez-Conde, S. Palomares-Ruiz, and F. Prada (2017) Characterization of subhalo structural properties and implications for dark matter annihilation signals. MNRAS 466 (4), pp. 4974–4990. External Links: Document, 1603.04057 Cited by: §5.
  • W. R. Morningstar, Y. D. Hezaveh, L. Perreault Levasseur, R. D. Blandford, P. J. Marshall, P. Putzky, and R. H. Wechsler (2018) Analyzing interferometric observations of strong gravitational lenses with recurrent and convolutional neural networks. , pp. . External Links: 1808.00011 Cited by: §3.4.
  • W. R. Morningstar, L. Perreault Levasseur, Y. D. Hezaveh, R. Blandford, P. Marshall, P. Putzky, T. D. Rueter, R. Wechsler, and M. Welling (2019) Data-Driven Reconstruction of Gravitationally Lensed Galaxies using Recurrent Inference Machines. , pp. . External Links: 1901.01359 Cited by: §3.4, §5.
  • E. O. Nadler, Y. Mao, G. M. Green, and R. H. Wechsler (2019) Modeling the Connection between Subhalos and Satellites in Milky Way-like Systems. ApJ 873 (1), pp. 34. External Links: Document, 1809.05542 Cited by: §1.
  • J. F. Navarro, C. S. Frenk, and S. D. M. White (1996) The Structure of Cold Dark Matter Halos. ApJ 462, pp. 563. External Links: Document, astro-ph/9508025 Cited by: §2.2.
  • J. F. Navarro, C. S. Frenk, and S. D. M. White (1997)

    A Universal Density Profile from Hierarchical Clustering

    ApJ 490, pp. 493–508. External Links: Document, astro-ph/9611107 Cited by: §2.2.
  • J. Neyman and E. S. Pearson (1933) On the Problem of the Most Efficient Tests of Statistical Hypotheses. Philosophical Transactions of the Royal Society of London Series A 231, pp. 289–337. External Links: Document Cited by: §3.4.
  • A. M. Nierenberg, T. Treu, G. Brammer, A. H. G. Peter, C. D. Fassnacht, C. R. Keeton, C. S. Kochanek, K. B. Schmidt, D. Sluse, and S. A. Wright (2017) Probing dark matter substructure in the gravitational lens HE 0435-1223 with the WFC3 grism. MNRAS 471 (2), pp. 2224–2236. External Links: Document, 1701.05188 Cited by: §1.
  • A. M. Nierenberg, T. Treu, S. A. Wright, C. D. Fassnacht, and M. W. Auger (2014) Detection of substructure with adaptive optics integral field spectroscopy of the gravitational lens B1422+231. MNRAS 442 (3), pp. 2434–2445. External Links: Document, 1402.1496 Cited by: §1.
  • J. W. Nightingale, S. Dye, and R. J. Massey (2018) AutoLens: automated modeling of a strong lens’s light, mass, and source. MNRAS 478 (4), pp. 4738–4784. External Links: Document, 1708.07377 Cited by: §5.
  • H. Nishikawa, K. K. Boddy, and M. Kaplinghat (2019) Accelerated core collapse in tidally stripped self-interacting dark matter halos. , pp. . External Links: 1901.00499 Cited by: §1.
  • M. Oguri and P. J. Marshall (2010) Gravitationally lensed quasars and supernovae in future wide-field optical imaging surveys. MNRAS 405 (4), pp. 2579–2593. External Links: Document, 1001.2037 Cited by: §1, §3.
  • B. Paige and F. Wood (2016) Inference Networks for Sequential Monte Carlo in Graphical Models. , pp. . External Links: 1602.06701 Cited by: §1.
  • G. Papamakarios, T. Pavlakou, and I. Murray (2017) Masked Autoregressive Flow for Density Estimation. , pp. . External Links: 1705.07057 Cited by: §1.
  • G. Papamakarios, D. C. Sterratt, and I. Murray (2018) Sequential Neural Likelihood: Fast Likelihood-free Inference with Autoregressive Flows. , pp. . External Links: 1805.07226 Cited by: §1.
  • G. Papamakarios and I. Murray (2016) Fast -free inference of simulation models with bayesian conditional density estimation. In Advances in Neural Information Processing Systems, pp. 1028–1036. Cited by: §1.
  • A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer (2017) Automatic differentiation in pytorch. In NIPS-W, Cited by: §3.2, §6.
  • F. Perez and B. E. Granger (2007) IPython: A System for Interactive Scientific Computing. Computing in Science and Engineering 9 (3), pp. 21–29. External Links: Document Cited by: §6.
  • L. Perreault Levasseur, Y. D. Hezaveh, and R. H. Wechsler (2017) Uncertainties in Parameters Estimated with Neural Networks: Application to Strong Gravitational Lensing. ApJ 850 (1), pp. L7. External Links: Document, 1708.08843 Cited by: §3.4.
  • A. H. G. Peter, M. Rocha, J. S. Bullock, and M. Kaplinghat (2013) Cosmological simulations with self-interacting dark matter - II. Halo shapes versus observations. MNRAS 430 (1), pp. 105–120. External Links: Document, 1208.3026 Cited by: §1.
  • Planck Collaboration, P. A. R. Ade, N. Aghanim, M. Arnaud, M. Ashdown, J. Aumont, C. Baccigalupi, A. J. Banday, R. B. Barreiro, J. G. Bartlett, and et al. (2016) Planck 2015 results. XIII. Cosmological parameters. A&A 594, pp. A13. External Links: Document, 1502.01589 Cited by: §2.2.
  • N. Qian (1999) On the momentum term in gradient descent learning algorithms. Neural Netw. 12 (1), pp. 145–151. External Links: Document, ISSN 0893-6080 Cited by: §3.2.
  • J. I. Read, G. Iorio, O. Agertz, and F. Fraternali (2017) The stellar mass-halo mass relation of isolated field dwarfs: a critical test of CDM at the edge of galaxy formation. MNRAS 467 (2), pp. 2019–2038. External Links: Document, 1607.03127 Cited by: §1.
  • A. Refregier, A. Amara, T. D. Kitching, A. Rassat, R. Scaramella, J. Weller, and f. t. Euclid Imaging Consortium (2010) Euclid Imaging Consortium Science Book. , pp. . External Links: 1001.0061 Cited by: §1.
  • V. H. Robles, T. Kelley, J. S. Bullock, and M. Kaplinghat (2019) The Milky Way’s Halo and Subhalos in Self-Interacting Dark Matter. , pp. . External Links: 1903.01469 Cited by: §1.
  • D. B. Rubin (1984) Bayesianly justifiable and relevant frequency calculations for the applied statistician. Ann. Statist. 12 (4), pp. 1151–1172. External Links: Document, Link Cited by: §3.
  • M. A. Sánchez-Conde and F. Prada (2014) The flattening of the concentration-mass relation towards low halo masses and its implications for the annihilation signal boost. MNRAS 442, pp. 2271–2277. External Links: Document, 1312.1729 Cited by: §2.2, §2.4.
  • J. A. Schewtschenko, R. J. Wilkinson, C. M. Baugh, C. Bœhm, and S. Pascoli (2015) Dark matter-radiation interactions: the impact on dark matter haloes. MNRAS 449 (4), pp. 3587–3596. External Links: Document, 1412.4905 Cited by: §1.
  • A. Schneider, R. E. Smith, A. V. Macciò, and B. Moore (2012) Non-linear evolution of cosmological structures in warm dark matter models. MNRAS 424 (1), pp. 684–698. External Links: Document, 1112.0330 Cited by: §5.
  • P. Schneider, J. Ehlers, and E. E. Falco (1992) Gravitational Lenses. External Links: Document Cited by: §2.1, §2.1.
  • P. Schneider (1997) The cosmological lens equation and the equivalent single-plane gravitational lens. MNRAS 292 (3), pp. 673–678. External Links: Document, astro-ph/9706185 Cited by: §5.
  • J. L. Sérsic (1963) Influence of the atmospheric and instrumental dispersion on the brightness distribution in a galaxy. Boletin de la Asociacion Argentina de Astronomia La Plata Argentina 6, pp. 41. Cited by: §2.3, §5.
  • J. Simon, S. Birrer, K. Bechtol, S. Chakrabarti, F. Cyr-Racine, I. Dell’Antonio, A. Drlica-Wagner, C. Fassnacht, M. Geha, D. Gilman, Y. D. Hezaveh, D. Kim, T. S. Li, L. Strigari, and T. Treu (2019) Testing the Nature of Dark Matter with Extremely Large Telescopes. BAAS 51 (3), pp. 153. External Links: 1903.04742 Cited by: §1.
  • A. M. Sirunyan, A. Tumasyan, W. Adam, E. Asilar, T. Bergauer, J. Brandstetter, E. Brondolin, M. Dragicevic, J. Erö, M. Flechl, and et al. (2017) Search for dijet resonances in proton-proton collisions at { s} = 13TeV and constraints on dark matter and other models. Physics Letters B 769, pp. 520–542. External Links: Document, 1611.03568 Cited by: §1.
  • D. N. Spergel and P. J. Steinhardt (2000) Observational Evidence for Self-Interacting Cold Dark Matter. Phys. Rev. Lett. 84 (17), pp. 3760–3763. External Links: Document, astro-ph/9909386 Cited by: §1.
  • V. Springel, J. Wang, M. Vogelsberger, A. Ludlow, A. Jenkins, A. Helmi, J. F. Navarro, C. S. Frenk, and S. D. M. White (2008) The Aquarius Project: the subhaloes of galactic haloes. MNRAS 391 (4), pp. 1685–1711. External Links: Document, 0809.0898 Cited by: §2.4, §4.
  • M. Stoye, J. Brehmer, G. Louppe, J. Pavez, and K. Cranmer (2018) Likelihood-free inference with an improved cross-entropy estimator. , pp. . External Links: 1808.00973 Cited by: Appendix A, §1, §3.2, §3.2, §3.4, §6.
  • S. H. Suyu, P. J. Marshall, M. P. Hobson, and R. D. Bland ford (2006) A Bayesian analysis of regularized source inversions in gravitational lensing. MNRAS 371 (2), pp. 983–998. External Links: Document, astro-ph/0601493 Cited by: §5.
  • A. S. Tagore and N. Jackson (2016) On the use of shapelets in modelling resolved, gravitationally lensed images. MNRAS 457 (3), pp. 3066–3075. External Links: Document, 1505.00198 Cited by: §5.
  • A. S. Tagore and C. R. Keeton (2014) Statistical and systematic uncertainties in pixel-based source reconstruction algorithms for gravitational lensing. MNRAS 445 (1), pp. 694–710. External Links: Document, 1408.6297 Cited by: §5.
  • O. Thomas, R. Dutta, J. Corander, S. Kaski, and M. U. Gutmann (2016) Likelihood-free inference by ratio estimation. , pp. . External Links: 1611.10242 Cited by: §1.
  • D. Tran, R. Ranganath, and D. Blei (2017) Hierarchical implicit models and likelihood-free variational inference. In Advances in Neural Information Processing Systems 30, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.), pp. 5523–5533. Cited by: §1.
  • T. Treu (2010) Strong Lensing by Galaxies. Annual Review of Astronomy and Astrophysics 48, pp. 87–125. External Links: Document, 1003.5567 Cited by: §1, §2.1, §2.2, §3.
  • B. Uria, M.-A. Côté, K. Gregor, I. Murray, and H. Larochelle (2016) Neural Autoregressive Distribution Estimation. , pp. . External Links: 1605.02226 Cited by: §1.
  • A. van den Oord, S. Dieleman, H. Zen, K. Simonyan, O. Vinyals, A. Graves, N. Kalchbrenner, A. Senior, and K. Kavukcuoglu (2016) WaveNet: A Generative Model for Raw Audio. , pp. . External Links: 1609.03499 Cited by: §1.
  • A. van den Oord, N. Kalchbrenner, and K. Kavukcuoglu (2016)

    Pixel Recurrent Neural Networks

    , pp. . External Links: 1601.06759 Cited by: §1.
  • A. van den Oord, N. Kalchbrenner, O. Vinyals, L. Espeholt, A. Graves, and K. Kavukcuoglu (2016) Conditional Image Generation with PixelCNN Decoders. , pp. . External Links: 1606.05328 Cited by: §1.
  • S. van der Walt, S. C. Colbert, and G. Varoquaux (2011) The NumPy Array: A Structure for Efficient Numerical Computation. Computing in Science and Engineering 13 (2), pp. 22–30. External Links: Document, 1102.1523 Cited by: §6.
  • K. Van Tilburg, A. Taki, and N. Weiner (2018) Halometry from astrometry. J. Cosmology Astropart. Phys 2018 (7), pp. 041. External Links: Document, 1804.01991 Cited by: §1.
  • S. Vegetti, L. V. E. Koopmans, A. Bolton, T. Treu, and R. Gavazzi (2010) Detection of a dark substructure through gravitational imaging. MNRAS 408 (4), pp. 1969–1981. External Links: Document, 0910.0760 Cited by: §1.
  • S. Vegetti, D. J. Lagattuta, J. P. McKean, M. W. Auger, C. D. Fassnacht, and L. V. E. Koopmans (2012) Gravitational detection of a low-mass dark satellite galaxy at cosmological distance. Nature 481 (7381), pp. 341–343. External Links: Document, 1201.3643 Cited by: §1.
  • S. Vegetti, O. Czoske, and L. V. E. Koopmans (2010) Quantifying dwarf satellites through gravitational imaging: the case of SDSSJ120602.09+514229.5. MNRAS 407 (1), pp. 225–231. External Links: Document, 1002.4708 Cited by: §1.
  • A. Verma, T. Collett, G. P. Smith, Strong Lensing Science Collaboration, and the DESC Strong Lensing Science Working Group (2019) Strong Lensing considerations for the LSST observing strategy. , pp. . External Links: 1902.05141 Cited by: §1.
  • M. Vogelsberger, J. Zavala, F. Cyr-Racine, C. Pfrommer, T. Bringmann, and K. Sigurdson (2016) ETHOS - an effective theory of structure formation: dark matter physics as a possible explanation of the small-scale CDM problems. MNRAS 460 (2), pp. 1399–1416. External Links: Document, 1512.05349 Cited by: §1.
  • M. Vogelsberger, J. Zavala, and A. Loeb (2012) Subhaloes in self-interacting galactic dark matter haloes. MNRAS 423 (4), pp. 3740–3752. External Links: Document, 1201.5892 Cited by: §1.
  • M. Vogelsberger, J. Zavala, K. Schutz, and T. R. Slatyer (2019) Evaporating the Milky Way halo and its satellites with inelastic self-interacting dark matter. MNRAS 484 (4), pp. 5437–5452. External Links: Document, 1805.03203 Cited by: §1.
  • A. Wald (1943) Tests of statistical hypotheses concerning several parameters when the number of observations is large. Transactions of the American Mathematical Society 54 (3), pp. 426–482. External Links: ISSN 00029947 Cited by: §3.4.
  • S. J. Warren and S. Dye (2003) Semilinear Gravitational Lens Inversion. ApJ 590 (2), pp. 673–682. External Links: Document, astro-ph/0302587 Cited by: §5.
  • R. H. Wechsler and J. L. Tinker (2018) The Connection Between Galaxies and Their Dark Matter Halos. ARA&A 56, pp. 435–487. External Links: Document, 1804.03097 Cited by: §1.
  • S. S. Wilks (1938) The Large-Sample Distribution of the Likelihood Ratio for Testing Composite Hypotheses. Annals Math. Statist. 9 (1), pp. 60–62. External Links: Document Cited by: §3.4.
  • K. C. Wong, S. H. Suyu, G. C. -F. Chen, C. E. Rusu, M. Millon, D. Sluse, V. Bonvin, C. D. Fassnacht, S. Taubenberger, M. W. Auger, S. Birrer, J. H. H. Chan, F. Courbin, S. Hilbert, O. Tihhonova, T. Treu, A. Agnello, X. Ding, I. Jee, E. Komatsu, A. J. Shajib, A. Sonnenfeld, R. D. Bland ford, L. V. E. Koopmans, P. J. Marshall, and G. Meylan (2019) H0LiCOW XIII. A 2.4% measurement of from lensed quasars: tension between early and late-Universe probes. , pp. . External Links: 1907.04869 Cited by: §6.
  • N. Yoshida, V. Springel, S. D. M. White, and G. Tormen (2000) Weakly Self-interacting Dark Matter and the Structure of Dark Halos. ApJ 544 (2), pp. L87–L90. External Links: Document, astro-ph/0006134 Cited by: