Machine Learning Accelerated Likelihood-Free Event Reconstruction in Dark Matter Direct Detection

10/23/2018 · U. Simola et al. · Helsingin yliopisto, Stockholms universitet

Reconstructing the position of an interaction in a dual-phase time projection chamber (TPC) with the best precision is key to directly detecting Dark Matter. Using the likelihood-free framework, a new algorithm to reconstruct the 2-D (x, y) position and the size of the charge signal (e) of an interaction is presented. The algorithm uses the charge signal (S2) light distribution obtained by simulating events with a waveform generator. To deal with the computational effort required by the likelihood-free approach, we employ the Bayesian Optimization for Likelihood-Free Inference (BOLFI) algorithm. Together with BOLFI, prior distributions for the parameters of interest (x, y, e) and highly informative discrepancy measures to perform the analyses are introduced. We evaluate the quality of the proposed algorithm through a comparison against the currently existing alternative methods using a large-scale simulation study. BOLFI provides a natural probabilistic uncertainty measure for the reconstruction, and it improved the accuracy of the reconstruction over the next best algorithm by up to 15 cm. In addition, BOLFI provides the smallest uncertainties among all the tested methods.


1 Introduction

The existence of Dark Matter has been gravitationally inferred on a macroscopic scale thanks to astronomical and cosmological surveys that rely on the cold Dark Matter model [kolb1990early, corbelli2000extended, dodelson2003modern, markevitch2004direct, roszkowski2018wimp]. On the other hand, Dark Matter has not been directly detected yet. Under the assumption that Dark Matter is a particle, several theoretical models have been proposed; an overview can be found in [bertone2005g]. The most popular and promising model is based on the Weakly Interacting Massive Particle (WIMP) hypothesis [jungman1996supersymmetric, roszkowski2018wimp, strigari2013strigari, undagoitia2015dark]. If the assumption that Dark Matter is composed of WIMPs holds, then there are three known processes through which ordinary matter can interact with Dark Matter: creation, scattering and annihilation. While searches via creation and annihilation are referred to as indirect detection, searches via scattering are referred to as direct detection. Projects such as CERN's Large Hadron Collider [aad2008atlas] and Fermi–LAT [biswas2015nonthermal] are trying to detect Dark Matter via the creation and the annihilation process, respectively. The goal of direct detection experiments is to detect a Dark Matter particle scattering off ordinary matter. A direct measurement of a Dark Matter particle can in principle be obtained by employing underground detectors and measuring the recoiling atom [goodman1985detectability, undagoitia2015dark].

As in any rare event search, achieving a high sensitivity to signal events (nuclear recoil events from WIMPs) requires the reduction of background events. Background events in Dark Matter direct detection experiments are electronic or nuclear recoils from beta particles, gamma photons or neutrons from a variety of sources. Dual–phase Time Projection Chambers (TPCs), using high density detector media such as liquid xenon (LXe), have been particularly successful in exploring increasing regions of the WIMP parameter space. The TPC design, in combination with the xenon target, provides two main ways to reduce background events. The first is the ability to infer the type of recoil (electronic or nuclear) by accurately reconstructing the size of the charge signal; this allows for the reduction of electronic recoil background events in the analysis. The second is the ability to reconstruct the 3–D spatial position of events. Most of the background events will interact in the outer region of the TPC due to the self-shielding properties of the high density liquid xenon. By selecting only events from an inner region of the detector, the signal-to-background ratio is increased. This motivates the need for accurate reconstruction of both the position and the size of the charge signal. Position reconstruction is an important part of experiments such as LUX [Akerib:2017riv] and XENON [aprile2017xenon1t, pelssers2015position].

In the present work we implement a novel position and size of the charge signal reconstruction algorithm, based on the likelihood–free framework. The presented approach is applicable to any dual–phase TPC. To provide concrete examples and simulations we use the specifications of the XENON1T TPC and employ the open source Processor for Analyzing XENON (PAX) [xenon_collaboration_2018_1195785] data processor and waveform simulator developed by the XENON collaboration.

The article is structured as follows: in Section 2 we present the detector physics of direct detection as well as some of the already available algorithms used for reconstructing the position of an event in dual–phase TPCs. In Section 3 we introduce the likelihood–free framework, justifying the reasons for its recent success and motivating its use in position reconstruction of events in a dual–phase TPC. Several advantages can be obtained by using the likelihood–free framework, such as the availability of a posterior distribution for the parameters of interest, as well as the possibility of retrieving credible intervals straightforwardly from those posteriors. In order to deal with the computationally expensive calculations, the Bayesian Optimization for Likelihood–Free Inference (BOLFI) algorithm [gutmann2016bayesian] is introduced and properly designed for executing the proposed analyses. In Section 4 we present three examples to explain how the likelihood–free approach works and to show its advantages over the currently existing alternative methods. The first example uses a simplified model to generate simulated data. For the other two examples, we use the full event simulator of the PAX processor. In particular, for the third and final example, beyond reconstructing the 2–D (x, y) position, we additionally reconstruct the size of the charge signal (e) of the event. Our conclusions are presented in Section 5.

2 Direct Detection Experiments

Dark Matter direct detection experiments aim to detect Dark Matter particles scattering on the detector medium. Both nuclear and electronic recoils lead to three physical processes, known respectively as phonon emission, scintillation and ionization, where the proportion of each process is different depending on the type of recoil. By using a detector which is sensitive to two of these processes, nuclear and electronic recoils can be discriminated. Since most WIMP signal models only predict nuclear scattering, the removal of electronic recoil events greatly reduces the number of background events [bertone_silk_2010]. Two types of detectors are typically used in direct detection experiments: crystal detectors and noble–liquid TPCs [undagoitia2015dark]. The method presented in this work can be applied to any liquid–gas TPC.

A dual–phase (liquid–gas) TPC uses both the scintillation and the ionization signals to detect particles scattering on atoms in the detector. Besides providing the ability to distinguish between nuclear and electronic recoil events, a dual–phase TPC can also reconstruct the spatial position of the events. Properly reconstructing the 3–D (x, y, z) position of events is crucial in order to discard background events at the edges of the TPC and to perform spatially dependent corrections that account for nonuniform detector responses. The ratio in energy between the scintillation and the ionization signal is used to discriminate between nuclear and electronic recoils, further reducing the background.

The XENON1T TPC is suspended in a cryostat filled with 3.2 t of ultra-pure liquid xenon, 2 t of which is in the sensitive region of the TPC, where particles scattering on the xenon can be detected and their position and size of the charge signal reconstructed. The TPC is of cylindrical shape with a height and diameter of about 1 m. The walls of the TPC are made of highly reflective PTFE panels to maximize light collection. The TPC also includes a field cage and various electrodes such that an electric field can be created inside the TPC to drift charges to the top (gaseous region) of the TPC. Here a second, stronger electric field extracts the electrons from the liquid into the gaseous xenon, where the electrons generate another scintillation light signal from their interactions with the xenon gas. To detect both the prompt scintillation light (S1) from the initial scatter as well as the secondary scintillation light (S2) from the electrons extracted at the top of the TPC, the TPC is instrumented with 248 photomultiplier tubes (PMTs). These circular 3-inch diameter Hamamatsu PMTs are placed in two arrays at the top and at the bottom of the TPC. The top array contains 127 PMTs, positioned in concentric rings in order to maximize the radial resolution of the position reconstruction. The bottom array contains 121 PMTs arranged in a hexagonal pattern; this closer packing maximizes the light collection and therefore lowers the energy threshold for S1 signals. Further details can be found in [aprile2017xenon1t].

An interaction in the TPC, which can be either an electronic or a nuclear scatter, produces two light signals. The prompt scintillation light is labeled as the S1 signal and the secondary scintillation light is labeled as the S2 signal. The S1 signal is seen almost immediately (within several nanoseconds) after the interaction occurs. The S2 signal is delayed by the drift time of the electrons through the TPC, and this drift time is proportional to the depth at which the interaction occurred. This proportionality makes the drift time a very accurate estimator of the depth (or z coordinate) of the interaction. In XENON the resolution of the vertical position of the interaction is typically much better than the resolution of the (x, y) coordinates [pelssers2015position]. Since the S2 signal is always generated at the top of the TPC, very close to the top PMT array, the distribution of light over the top PMTs (also known as the S2 hit pattern) contains information about the location of the S2 signal. Figure 1 shows the resulting S2 hit pattern for an event with given input coordinates.

Figure 1: The S2 hit pattern resulting from the secondary scintillation light S2 for the given input coordinates. The size of the charge signal (e) of the event is held fixed.

The goal of position reconstruction is to infer the 3–D (x, y, z) position of the interaction by combining the 2–D (x, y) position of the S2 signal with the depth of the interaction, the latter being estimated by measuring the drift time. As already noted, the estimation of z is in general easier and more precise than the 2–D (x, y) position reconstruction. In this regard, one important assumption is that the drift field is uniform throughout the TPC. A uniform drift field ensures that electrons drift straight up, so that the 2–D (x, y) position at which they are generated is the same as where they are extracted into the gas phase and produce the S2 signal. When this assumption does not hold, the field distortion needs to be modelled and the reconstructed position needs to be corrected for this distortion. As will be shown in Section 4, the method introduced in the present work directly infers the 2–D (x, y) position without the need for a field distortion correction. Moreover, position reconstruction typically involves reconstructing the 2–D (x, y) position from the S2 hit pattern and then combining it with the drift time in order to find the 3–D (x, y, z) position of the interaction. However, there are also cases where estimating the z coordinate from the S2 signal alone is desirable. In the case of an S2–only analysis (where only S2 signals are used, allowing for a lower energy threshold) no drift time is in general available [Aprile:2016wwo]. The shape of the S2 signals might be used to provide information on the depth of the interaction, but this is typically an order of magnitude less accurate than using the drift time and is not used for detector fiducialization. With the proposed method the drift time can also in principle be estimated for cases in which only the S2 signal is available. Finally, in order to reconstruct the recoil energy of an event, the number of prompt scintillation photons and the number of ionization electrons need to be determined. In the case of ionization electrons, this involves finding the total charge signal of the S2 event (number of photo–electrons observed), and then applying various corrections taking into account the position of the interaction. In the last example of Section 4 the charge signal of the event is directly inferred from the waveform and from the S2 hit pattern, also using the bottom PMT array in order to retrieve additional information.

Using an S2 hit pattern such as the one displayed in Figure 1, various position reconstruction algorithms have been designed to infer the 2–D (x, y) position of an interaction event. A quick overview of the already available reconstruction algorithms is presented in Section 2.1.

2.1 Position Reconstruction Algorithms in XENON1T

Several methods have been developed to reconstruct the 2–D (x, y) position using an S2 signal. Next, we briefly describe some of the main algorithms provided in PAX: the "maximum PMT" method, a likelihood based method called Top Pattern Fit (TPF) and a method that uses artificial neural networks (NN). All these algorithms are part of PAX, the open source event reconstruction and data processing software developed by the XENON collaboration [xenon_collaboration_2018_1195785]. The version used in the following analyses is PAX v6.8.0, the same version that was used for the most recent XENON1T results [aprile2018dark].

The fastest and most naive method to reconstruct the 2–D (x, y) position, the "maximum PMT" method, consists of identifying the PMT that captures the largest count of photoelectrons in the top PMT array, and taking that PMT's 2–D coordinates as the position from which the S2 signal originated. This method has an uncertainty on the order of the distance between the top PMTs, so it can only give a rough estimate of the position of the S2 signal. We will however employ it in our first introductory example in Section 4.1, when using a simplified model.

The TPF method consists of an algorithm that, given an S2 hit pattern, returns the most likely 2–D (x, y) position, estimated by optimizing a likelihood function. The employed likelihood function is the Poisson likelihood chi–square [likelihood_chisquare]. This likelihood function assumes that the number of photoelectrons counted in a certain PMT i is distributed according to a Poisson distribution with mean:

μ_i(x, y) = N η_i(x, y) / Σ_k η_k(x, y),    (1)

where N is the total number of observed photoelectrons in the S2 hit pattern, η_i(x, y) is the light collection efficiency (LCE) of PMT i for photons produced at position (x, y), and the sum is taken over all working PMTs in the top PMT array. The LCE functions are not analytically known but are numerically estimated using optical photon Monte Carlo simulations. These simulations take into account both the geometry of the detector and the optical and reflective properties of the employed materials. In practice, the LCE maps evaluate the probability for a photon to reach a certain PMT given an assumed known 2–D position (x, y). For each live PMT in the top PMT array, a corresponding LCE map is implemented. By sampling from the LCE maps, an expected S2 hit pattern originating from a known 2–D position (x, y) can be simulated, which allows the LCE maps to be used to reconstruct the unknown 2–D positions (x, y) of events. Besides the most likely position, the TPF algorithm also returns the likelihood function values, defining the goodness–of–fit measure of the reconstructed S2 hit pattern. The method can also be used for retrieving confidence regions, since the likelihood surface is (partially) evaluated during the reconstruction. The TPF method was the main position reconstruction algorithm used in [aprile2017first].
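To make Equation (1) concrete, a minimal sketch of the TPF likelihood evaluation is given below; the callable lce_interp standing in for the interpolated LCE maps is hypothetical, and the sketch is illustrative rather than PAX's actual implementation.

```python
import numpy as np
from scipy.stats import poisson
from scipy.optimize import minimize

def tpf_neg_log_likelihood(xy, hit_pattern, lce_interp):
    """Poisson negative log-likelihood of an S2 hit pattern at (x, y).

    hit_pattern : integer photoelectron counts for the top-array PMTs
    lce_interp  : callable returning the per-PMT LCE values at (x, y)
                  (a hypothetical stand-in for the interpolated LCE maps)
    """
    eta = lce_interp(xy)                # light collection efficiencies
    n_total = hit_pattern.sum()         # total observed photoelectrons N
    mu = n_total * eta / eta.sum()      # expected counts, Equation (1)
    return -poisson.logpmf(hit_pattern, mu).sum()

# Hypothetical usage: optimize over (x, y), starting from the
# "maximum PMT" estimate max_pmt_xy.
# best = minimize(tpf_neg_log_likelihood, x0=max_pmt_xy,
#                 args=(observed_pattern, lce_interp), method='Nelder-Mead')
```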

The last position reconstruction method is NN, trained with simulated S2 hit patterns from the same optical photon Monte Carlo simulation used to retrieve the LCE functions [aprile2018dark]. NN provided the best resolution in XENON100 (at the mm level) among all the methods used to reconstruct the 2–D (x, y) position at that time. The performance of NN strongly depends on the training set, and while the TPF method is physically motivated, the trained NN method uses an ad–hoc likelihood. The NN method was the main position reconstruction algorithm used in [aprile2018dark].

In the following Sections of the present work we focus on a novel method, based on the likelihood–free framework, to reconstruct the 2–D (x, y) position. Beyond providing pointwise estimates, our method automatically provides credible regions for the parameters of interest. Retrieving a measure of the uncertainty on each parameter of interest is crucial in order to discard events belonging to the background of the TPC. A key difference with respect to the TPF method is that our proposed algorithm needs no LCE maps as input to the optimization. In fact, the LCE maps are not defined on the continuum but are rather simulated on a grid, after which an interpolation is used. Finally, one of the advantages of the proposed method lies in the fact that its procedure can easily be extended to more than two parameters. We show this feature in the last example where, together with the 2–D (x, y) position, the size of the charge signal (e), a proxy for the energy of the interaction, will also be reconstructed.

3 Statistical Methodology

In this Section we introduce the likelihood–free framework, motivating its use and explaining the reasons for its success. In Section 3.1 the Approximate Bayesian Computation (ABC) and the ABC–Population Monte Carlo (ABC–PMC) algorithms are presented, while in Section 3.2 we introduce the Bayesian Optimization for Likelihood–Free Inference (BOLFI), a tool for accelerating likelihood–free inference that will be used extensively in the examples presented in Section 4.

3.1 Approximate Bayesian Computation

Bayesian inference has become an increasingly popular alternative to the frequentist approach over the last 20 years, thanks to several algorithmic advances that allow complex models to be fitted to data. In the frequentist framework the relation between the observed data D and the vector of parameters θ ∈ Θ ⊆ ℝ^p (where p is the dimension of the parameter space) can be fully described by the likelihood function L(θ; D). In the Bayesian framework a prior distribution π(θ) is assigned to the vector of parameters. A Bayesian analysis is based on the so–called posterior distribution for θ, defined according to Bayes' theorem:

π(θ | D) = L(θ; D) π(θ) / ∫_Θ L(θ; D) π(θ) dθ,    (2)

where the denominator of Equation (2) is a normalizing constant, often referred to as the evidence in the computer science literature. The evaluation of Equation (2) relies on the ability to calculate the normalizing constant, but since in the vast majority of cases this quantity cannot be analytically calculated, the posterior is usually approximated using various sampling techniques such as Markov Chain Monte–Carlo (MCMC) algorithms [hastings1970monte, metropolis1953equation]. As long as the likelihood is known, MCMC techniques are feasible. However, this might not be the case, for example when the relationship between the data and the parameters is highly complex or unknown, or if there are observational limitations. ABC is a framework for statistical inference designed for situations in which the likelihood function is intractable, but simulation through the forward model is possible. Recently, ABC has been applied in many different fields of science, such as astronomy [akeret2015approximate, bruderer2016calibrated, CameronPettitt2012, hahn2017approximate, jennings2017astroabc, jennings2016new, WeyantEtAl2013, ShaferFreeman2012], biology [ThorntonEtAl2006], ecology [Beaumont2010], epidemiology [McKinleyEtAl2009], population genetics [beaumont2002approximate, cornuet2008inferring, PitchardEtAl1999, TavareEtAl1997] and population modeling [TonyEtAl]. The basic ABC algorithm [PitchardEtAl1999, TavareEtAl1997] consists of the following four–step iterative procedure:

(1) Sample θ* from the prior distribution, θ* ~ π(θ).
(2) Simulate a sample of the data by using θ* in the forward model: D* ~ p(D | θ*).
(3) Compare the observed data, D_obs, with the simulated sample, D*, using a distance function ρ, letting ρ = ρ(S(D_obs), S(D*)), where S(·) is the summary statistic of the data.
(4) If the distance ρ is smaller than a fixed tolerance ε, then θ* is retained, otherwise it is discarded. Repeat until the desired particle sample size N is achieved.
Algorithm 1 Basic rejection ABC algorithm for θ
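A minimal Python sketch of Algorithm 1 is given below; the forward model, summary statistic and distance function are placeholders to be supplied by the application.

```python
import numpy as np

def rejection_abc(d_obs, prior_sample, forward_model, summary, distance,
                  eps, n_particles):
    """Basic rejection ABC (Algorithm 1): keep prior draws whose simulated
    summaries land within tolerance eps of the observed summary."""
    s_obs = summary(d_obs)
    accepted = []
    while len(accepted) < n_particles:
        theta = prior_sample()                  # (1) draw from the prior
        d_sim = forward_model(theta)            # (2) simulate data
        rho = distance(s_obs, summary(d_sim))   # (3) compare summaries
        if rho < eps:                           # (4) accept or discard
            accepted.append(theta)
    return np.array(accepted)
```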

The resulting ABC posterior distribution, following the notation from [MarinEtAl2012], can be written as:

π_ε(θ | S(D_obs)) ∝ π(θ) ∫ 1_{A_{ε,D_obs}}(D*) p(D* | θ) dD*,    (3)

where 1_{A_{ε,D_obs}} is the indicator function for the set A_{ε,D_obs} = {D* : ρ(S(D_obs), S(D*)) ≤ ε}.

According to the definitions provided through Equation (2) and Equation (3) for the true posterior distribution and the ABC posterior distribution respectively, it follows that π_ε(θ | S(D_obs)) converges to the true posterior for ε → 0 and S(·) sufficient. In practice, however, for computational reasons, some tolerance has to be allowed, which causes an approximation error in the ABC procedure. Secondly, looking at Algorithm 1, rather than comparing the entire observed data D_obs with the simulated sample D*, the similarity between the observed and the simulated data is based on suitably selected summary statistics. While a desirable situation involves selecting summary statistics that are sufficient [cox1979theoretical], this rarely happens when facing real problems necessitating ABC. As pointed out in [MarinEtAl2012], for most situations the summary statistics are usually determined by the problem at hand and selected by the experimenters in the field, making the implementation of a general procedure for retrieving highly informative summary statistics challenging. Provided that the summary statistics and the tolerance are properly selected, the ABC posterior distribution suitably matches the true posterior distribution [beaumont2002approximate, BlumEtAl2013, blum2010choosing, FearnheadPrangle2012, IshidaEtAl2015, prangle2015summary].

The basic ABC rejection algorithm presented in Algorithm 1 can require a huge computational time in order to produce the desired number of posterior samples, because it only uses prior distributions throughout the procedure [MarinEtAl2012]. Recently, many algorithms that extend the basic ABC algorithm have been proposed [Blum2010, BlumEtAl2013, CsilleryEtAl2010, DrovandiEtAl2011, FearnheadPrangle2012, gutmann2016bayesian, JoyceMarjoram2008, MarinEtAl2012, DelMoralEtAl2012, RatmannEtAl2013], but at least for the first example presented in this work we focus on the ABC–PMC algorithm [BeaumontEtAl2009].

The ABC–PMC algorithm, in order to improve the efficiency of the statistical inference, constructs a series of intermediate distributions rather than only using the prior distributions. The first iteration of the ABC–PMC algorithm uses a tolerance ε₁ and draws proposals from the specified prior distribution. Starting from the second iteration, the algorithm draws proposals from the previous iteration's ABC posterior distribution. After a particle is selected from the previous iteration's particle system, it is moved according to some kernel. Since the obtained proposals are not directly drawn from the prior distributions, importance weights are used. The importance weight for particle i at iteration t is defined as:

w_i^(t) ∝ π(θ_i^(t)) / Σ_j w_j^(t−1) φ(θ_i^(t) | θ_j^(t−1), τ_t²),

where φ(· | θ, τ²) is a Gaussian kernel with variance τ_t² = 2 Var(θ^(t−1)) (i.e. twice the weighted sample variance of the particles from iteration t−1, as originally suggested by [BeaumontEtAl2009]). The steps required to run the ABC–PMC algorithm are displayed in Algorithm 2.

if t = 1 then
     for i = 1, …, N do
          Set ρ = ε₁ + 1
          while ρ > ε₁ do
               Propose θ* by drawing θ* ~ π(θ)
               Generate D* ~ p(D | θ*)
               Calculate distance ρ = ρ(S(D_obs), S(D*))
          end while
          Set θ_i^(1) = θ*; set weight w_i^(1) = 1/N
     end for
else if 2 ≤ t ≤ T then
     Set τ_t² = 2 Var(θ^(t−1))
     for i = 1, …, N do
          Set ε_t = q-th quantile of the distances ρ^(t−1)
          Set ρ = ε_t + 1
          while ρ > ε_t do
               Select θ from {θ_j^(t−1)} with probabilities {w_j^(t−1)}
               Propose θ* ~ φ(· | θ, τ_t²)
               Generate D* ~ p(D | θ*)
               Calculate distance ρ = ρ(S(D_obs), S(D*))
          end while
          Set θ_i^(t) = θ*; set weight w_i^(t) as defined above
     end for
end if
Algorithm 2 ABC–PMC algorithm for θ

The series of tolerances decreases such that ε₁ > ε₂ > … > ε_T, where T is the final iteration of the ABC–PMC algorithm. Both the rule to reduce the tolerances and the total number of iterations are selected in advance and must be properly tuned for the considered application [BeaumontEtAl2009, IshidaEtAl2015, Lenormand2013, McKinleyEtAl2009, SissonEtAl2007, TonyEtAl, WeyantEtAl2013]. We will discuss our choices for running the ABC–PMC algorithm in Section 4. The result of the ABC–PMC analysis consists of the approximate posterior distribution, used to provide both pointwise estimates and credible regions for the parameters of interest.
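The sketch below illustrates the ABC–PMC loop under the same placeholder interfaces as the rejection sketch above, with the adaptive tolerance taken as a fixed quantile of the previous iteration's accepted distances; it is a simplified illustration, not the exact implementation used in our analyses.

```python
import numpy as np
from scipy.stats import multivariate_normal

def abc_pmc(d_obs, prior_sample, prior_pdf, forward_model, summary, distance,
            eps1, n_particles, n_iter, quantile=0.9):
    """ABC-PMC (Algorithm 2) for a d-dimensional parameter vector."""
    s_obs = summary(d_obs)

    def accept(eps, propose):
        # Draw proposals until one lands within the current tolerance.
        while True:
            theta = propose()
            rho = distance(s_obs, summary(forward_model(theta)))
            if rho < eps:
                return theta, rho

    # Iteration t = 1: plain rejection sampling from the prior.
    pairs = [accept(eps1, prior_sample) for _ in range(n_particles)]
    particles = np.array([p for p, _ in pairs])
    dists = np.array([r for _, r in pairs])
    weights = np.full(n_particles, 1.0 / n_particles)

    for t in range(2, n_iter + 1):
        eps = np.quantile(dists, quantile)  # adaptive tolerance
        # Twice the weighted sample covariance (Beaumont et al. 2009 kernel).
        tau2 = 2.0 * np.atleast_2d(np.cov(particles, rowvar=False,
                                          aweights=weights))

        def propose():
            j = np.random.choice(n_particles, p=weights)   # resample
            return np.random.multivariate_normal(particles[j], tau2)  # move

        pairs = [accept(eps, propose) for _ in range(n_particles)]
        new_particles = np.array([p for p, _ in pairs])
        dists = np.array([r for _, r in pairs])
        # Importance weights: prior density over the kernel mixture density.
        new_weights = np.array([
            prior_pdf(th)
            / np.sum(weights * multivariate_normal.pdf(particles, th, tau2))
            for th in new_particles])
        particles = new_particles
        weights = new_weights / new_weights.sum()
    return particles, weights
```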

3.2 Bayesian Optimization for Likelihood–Free Inference

One of the major obstacles to likelihood–free inference is the computational cost of the method. ABC is based on the idea of identifying relevant regions of the parameter space of the model by finding those estimates θ* such that D*, or more often its summary S(D*), is comparable with D_obs, or more often its summary S(D_obs). It is clear then that the role played by the summary statistics and by the discrepancy measure defined to compare S(D_obs) and S(D*) is key in order to retrieve useful statistical properties, and in particular to make the ABC posterior distribution a suitable approximation of the true posterior. Most of the parameters proposed during the resampling step result in large distances between S(D_obs) and S(D*), and the estimates of the parameters for which the distances would be small are unknown in advance, as pointed out in [lintusaari2017fundamentals]. This means that, when using the ABC–PMC algorithm presented in Section 3.1, the number of datasets to simulate through the forward model is usually very large, since only weak information is generally available in advance about the relevant regions of the parametric space.

In order to address the issue of deciding about the similarity between S(D_obs) and S(D*), an alternative able to reduce the computational burden by several orders of magnitude is provided by BOLFI [gutmann2016bayesian]. BOLFI combines probabilistic modelling of the distance function that compares S(D_obs) with S(D*) with optimization, to facilitate likelihood–free inference. The basic idea behind BOLFI is to find, avoiding unnecessary computations, relevant regions of the parametric space where the discrepancy between S(D_obs) and S(D*) is small. This is done by using a probabilistic model to learn the stochastic relation between the parameter values and the discrepancy. Once the probabilistic model is ready, it can be used to retrieve a suitable approximation to the true posterior distribution [gutmann2016bayesian]. The problem therefore becomes to infer the regression function of the discrepancies, which is unknown, focusing on regions of the parametric space where those discrepancies are small. In their work, [gutmann2016bayesian] proposed to use Bayesian optimization [brochu2010tutorial], modelling the function of the discrepancies with a Gaussian process. The choice of Gaussian processes to model the function of the discrepancies is not mandatory; recently, Gaussian processes have been used as surrogate models for evaluating generative models that are expensive to compute [meeds2014gps, wilkinson2014accelerating, rasmussen2004gaussian]. In the present work, we followed the specifications suggested by [gutmann2016bayesian]. Further details on BOLFI can be found in [gutmann2016bayesian]; the Python package ELFI [ELFI] is freely available for likelihood–free inference, offering both the ABC–PMC algorithm and BOLFI as inference options.
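As an illustration of the workflow, the sketch below sets up BOLFI with ELFI on a toy two–parameter problem. It follows ELFI's documented interface, but the simulator, the priors, the bounds and all numeric settings are placeholders rather than the configuration used in our analyses.

```python
import numpy as np
import elfi

# Toy stand-in for the PAX forward model: a noisy 16x16 "hit pattern"
# peaked at the true (x, y). Purely illustrative; batch_size=1 assumed.
def simulator(x, y, batch_size=1, random_state=None):
    rng = random_state or np.random.RandomState()
    x, y = np.ravel(x)[0], np.ravel(y)[0]
    grid = np.linspace(-1.0, 1.0, 16)
    gx, gy = np.meshgrid(grid, grid)
    pattern = np.exp(-((gx - x) ** 2 + (gy - y) ** 2) / 0.1)
    return (pattern + 0.05 * rng.randn(*pattern.shape)).reshape(1, -1)

x = elfi.Prior('norm', 0.0, 0.5, name='x')
y = elfi.Prior('norm', 0.0, 0.5, name='y')
d_obs = simulator(0.3, -0.2, random_state=np.random.RandomState(0))
sim = elfi.Simulator(simulator, x, y, observed=d_obs)

# Bray-Curtis discrepancy (cf. Equation (7)), modelled on the log scale
# as suggested by [gutmann2016bayesian] for peaked discrepancy surfaces.
d = elfi.Distance('braycurtis', sim)
log_d = elfi.Operation(np.log, d)

bolfi = elfi.BOLFI(log_d, batch_size=1, initial_evidence=20,
                   update_interval=10, acq_noise_var=0.1,
                   bounds={'x': (-1, 1), 'y': (-1, 1)})
bolfi.fit(n_evidence=200)       # build the GP surrogate of log d
result = bolfi.sample(1000)     # MCMC sample from the approximate posterior
result.summary()
```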

4 Examples and discussions

In this Section we show the advantages of using the likelihood–free framework to reconstruct the 2–D (x, y) position over the currently existing alternative methods quickly reviewed in Section 2.1. We do so by presenting three numerical examples. In the first example we use a simple forward model in order to reconstruct the 2–D (x, y) position starting from an S2 hit pattern. The second and the third example use a more complex and realistic forward model; therefore, for computational reasons, BOLFI is required to perform the likelihood–free analyses. In the second example we are interested just in reconstructing the 2–D (x, y) position of an event, while in the third and final example we add a third variable, the size of the charge signal (e), making it a three–parameter inference problem (x, y, e). The software used to perform the analyses and to produce the Figures is Python (http://www.python.org) and R (https://cran.r-project.org).

4.1 A first simple simulation example

In this first example we investigate the feasibility of reconstructing the 2–D (x, y) position given the S2 hit pattern by using the likelihood–free framework, and in particular by implementing the ABC–PMC algorithm introduced in Section 3.1. In order to do so, we use a simple forward model that samples from the LCE maps. This forward model, which returns for a given set of 2–D coordinates (x, y) the corresponding S2 hit pattern, requires the LCE map binning (zoom factor) and the number of photons to be specified. In the present work we fix the zoom factor equal to twice the LCE map zoom and keep the number of photons fixed. Given the discrete nature of this simple forward model, we add stochastic noise from a standard Normal distribution to the resulting S2 hit pattern.

Beyond the definition of the forward model, prior distributions have to be assigned to the parameters of interest, which in this case are the 2–D coordinates x and y. For (x, y), according to Equation (4), a Bivariate Normal prior distribution is used, where the means are the coordinates (x̂, ŷ) estimated by the "maximum PMT" method and the covariance matrix is diagonal with a fixed standard deviation σ (in cm). The parameters previously defined are also known as the hyper–parameters of the prior distribution. We recall that the "maximum PMT" method takes as reconstructed position the coordinates of the PMT with the largest count of photoelectrons.

(x*, y*) ~ N₂((x̂, ŷ), σ² I₂)    (4)

A proposed coordinate (x*, y*) is valid if the following constraint, related to the dimension of the TPC, is satisfied:

√(x*² + y*²) ≤ r_TPC,    (5)

where r_TPC is the radius of the TPC. Once the constraint is satisfied, (x*, y*) can be used to obtain the simulated S2 hit pattern, D*, and the steps outlined in Algorithm 2 can take place.
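A minimal sketch of drawing prior proposals under the radius constraint of Equation (5) is shown below; the prior mean, the width sigma and the TPC radius value are placeholders.

```python
import numpy as np

def sample_constrained_prior(mean_xy, sigma, r_tpc, rng=np.random):
    """Draw (x*, y*) from the bivariate Normal prior of Equation (4),
    rejecting proposals that violate the radius constraint of Equation (5)."""
    while True:
        xy = rng.normal(loc=mean_xy, scale=sigma, size=2)
        if np.hypot(xy[0], xy[1]) <= r_tpc:
            return xy

# Example with placeholder values: prior centered on the "maximum PMT"
# estimate, sigma and r_tpc in cm.
xy_star = sample_constrained_prior(mean_xy=[2.6, -18.0], sigma=5.0, r_tpc=48.0)
```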

As outlined in Section 3.1, one of the most important choices when working in the likelihood–free framework is the selection of summary statistics that are highly informative about the parameters of interest. Since the comparison between the entire simulated S2 hit pattern, D*, and the entire observed S2 hit pattern, D_obs, is computationally unfeasible, reducing the dimension of the data without losing information on the 2–D (x, y) position is crucial. For this specific simple forward model, we perform two separate ABC–PMC analyses. The first analysis uses as summary statistic the well known Euclidean distance, while for the second analysis the Bray–Curtis dissimilarity is used. The Bray–Curtis dissimilarity is not a true metric, since it does not satisfy the triangle inequality. Despite this, for our goal of properly comparing D_obs with D*, we found the Bray–Curtis dissimilarity extremely useful. One of its advantages lies in its interpretability: a Bray–Curtis dissimilarity equal to 0 means that D_obs and D* are exactly the same, while a value equal to 1 defines the maximum difference that can be observed between two S2 hit patterns. Moreover the Bray–Curtis dissimilarity, originally used in ecology [clarke2006resemblance], works under the assumption that the samples are taken from the same physical size, as is the case for the PMTs that count the number of photoelectrons. Given the observed S2 hit pattern, D_obs, and a simulated one, D*, the summary statistics based on the Euclidean distance and the Bray–Curtis dissimilarity are respectively defined as:

ρ_E(D_obs, D*) = √( Σ_{i=1}^{n} (D_obs,i − D*_i)² )    (6)

and

ρ_BC(D_obs, D*) = Σ_{i=1}^{n} |D_obs,i − D*_i| / Σ_{i=1}^{n} (D_obs,i + D*_i),    (7)

where n is the total number of PMTs composing the S2 hit pattern, assuming all the PMTs are operational. By using Equation (6) or Equation (7) we reduce the dimension of the data used in the comparison from 128 to 1, while preserving the information carried by the entire sample, as will be shown in the coming examples.
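Written out for flattened hit–pattern arrays, the two summary discrepancies of Equations (6) and (7) read:

```python
import numpy as np

def euclidean(d_obs, d_sim):
    """Equation (6): Euclidean distance between two S2 hit patterns."""
    return np.sqrt(np.sum((d_obs - d_sim) ** 2))

def bray_curtis(d_obs, d_sim):
    """Equation (7): Bray-Curtis dissimilarity; 0 means identical patterns,
    1 is the maximum difference for non-negative counts."""
    return np.sum(np.abs(d_obs - d_sim)) / np.sum(d_obs + d_sim)
```

Equivalent implementations are available in scipy.spatial.distance as euclidean and braycurtis.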

The last required step before running the ABC–PMC algorithm consists of tuning some of the parameters originally introduced by [BeaumontEtAl2009]. In particular, four parameters need to be properly selected: the desired particle sample size N, the total number of iterations T, the initial tolerance ε₁ and finally the rule to reduce the tolerance through the iterations. These parameters are not tuned once, but are rather the result of successive attempts and adjustments that combine computational savings with a suitable approximation of the true posterior by the ABC posterior distribution. We defined the desired particle sample size N and the total number of iterations T, and we adaptively selected the next tolerance of the algorithm by calculating a fixed percentile of ρ^(t−1), the distances of the previous step's accepted particle system. There are of course other possible choices: when choosing N and T, the most important aspect to consider is to avoid stopping the algorithm too soon (i.e. the ABC posterior distribution is a poor approximation of the true posterior) as well as to avoid running the algorithm for too long (i.e. the overall efficiency of the algorithm is low). We found that, with the chosen quantile, after 40 iterations further reductions of the tolerance did not lead to a better approximation of the ABC posterior distribution, suggesting that a suitable approximation to the true posterior had been reached. As outlined in Algorithm 2, starting from the second iteration, rather than using the prior distribution, a perturbation kernel is used. We used the Gaussian perturbation kernel proposed in [BeaumontEtAl2009], taking into account the previously defined constraint on the parameters and calculating the importance weights accordingly.

Once all the necessary quantities are defined, the ABC–PMC algorithm can be executed in order to reconstruct the 2–D (x, y) position. As an example, we consider an event whose true position is (x, y) = (2.63, -17.96) cm (see Table 1), with the z coordinate held fixed. By entering these coordinates in the forward model we obtain the observed S2 hit pattern, D_obs. The goal is, using only D_obs, to reconstruct the position of the S2 event. The results of our ABC–PMC analysis are displayed in Figure 2 and summarized in Table 1. The summary statistic used to produce the present plots is the Euclidean distance defined in Equation (6).

Figure 2: (left) ABC posterior distributions for x (top) and y (bottom) estimated at the end of each iteration. At the beginning of the procedure (cyan solid lines) the ABC posterior distributions are broader because of the large tolerance ε. Once ε is sufficiently small, the ABC posterior distribution stabilizes and further reductions of the tolerance do not improve the posterior (bluer solid lines). For the final ABC posterior distributions, obtained after 40 iterations, the corresponding HPD 95% intervals are displayed for both x and y (brown dash–dotted lines). (right) Bivariate contour plot of the joint ABC posterior distribution (x, y) and the corresponding pointwise estimates, obtained with ABC–PMC (black square), TPF (orange cross) and NN (green x). The reconstructed positions are compared with the original input (red triangle). The reconstructed positions obtained by TPF and NN are outside the 2–D surface of the posterior and do not belong to the HPD 95% interval.

Looking at the left plot of Figure 2 we can evaluate the final ABC marginal posterior distributions for x and y, obtained after 40 iterations and non–parametrically estimated. The ABC marginal posteriors exhibit two local modes apart from the global maximum. Note that both the TPF and the NN methods seem to find reconstructed position coordinates near the local modes. Through the iterations, the tolerance decreases, allowing the procedure to move from poorly informative ABC posterior distributions (cyan solid lines, corresponding to the first iterations) to stable and informative ones (bluer solid lines), once the tolerance has sufficiently decreased. Together with the posterior distributions, we used the posterior mean as the pointwise statistic that defines the reconstructed coordinate. In this case, since ABC–PMC uses importance weights, the posterior means for x and y are the result of weighted means. There are of course other possible choices when picking a pointwise indicator from a posterior distribution, such as the maximum a posteriori (MAP) estimate; for the present analyses we found the posterior mean to provide the closest results to the input coordinates. Once samples from the posterior distribution are available, it is possible to retrieve a credible interval for the parameters of interest. In this case we calculated a Highest Posterior Density (HPD) 95% credible interval for both x and y. The HPD interval provides useful information: first, these intervals contain the input coordinates. Second, the HPD interval can be used to evaluate the accuracy of the positions reconstructed with other methods, in this case TPF and NN. The estimated ABC posterior means for (x, y) are more accurate than the corresponding reconstructed positions obtained by using TPF, NN or "maximum PMT", as also summarized in Table 1. The right plot of Figure 2 shows the bivariate contour plot, where it is possible to observe that the positions reconstructed with ABC are much closer to the input coordinates than any reconstructed position obtained with the currently existing alternative methods.

       | Input Values | ABC posterior mean (HPD 95% interval) | TPF method | NN method | maximum PMT method
x [cm] | 2.63 | 2.69 (2.60; 2.85) | 2.52 | 2.54 | 4.12
y [cm] | -17.96 | -17.99 (-18.16; -17.92) | -17.88 | -18.20 | -15.36
Table 1: Comparison between the ABC posterior means, used as pointwise statistics, and the estimates obtained with the currently existing alternative methods. Together with the pointwise estimates for the ABC–PMC analysis, a HPD 95% interval is displayed. The last column gives the (x, y) position reconstructed with the "maximum PMT" method, which is also employed to define the hyper–parameters of the prior distribution used in the ABC–PMC analyses.

We performed the ABC–PMC position reconstruction analysis over 6297 independent events. Then, we calculated the Euclidean distance from the input position for each of the four used methods. Defining (x_in, y_in) and (x_rec, y_rec) as respectively the input coordinates and the coordinates reconstructed with one of the different methods, the Euclidean distance in ℝ² is defined as:

d₂ = √( (x_in − x_rec)² + (y_in − y_rec)² )    (8)

The results, displayed in Figure 3, show that the proposed ABC–PMC algorithm improves the accuracy of the reconstruction with respect to the currently existing alternative methods, both when using the Euclidean distance and when using the Bray–Curtis dissimilarity as summary statistic. The average Euclidean distance from the input position for the positions reconstructed with ABC–PMC is essentially the same whether Equation (6) or Equation (7) is used as summary statistic, and it is smaller than the average Euclidean distances for the positions reconstructed with TPF and NN. In this simple initial example, the smallest distances from the input coordinates are obtained when using the ABC–PMC algorithm with either Equation (6) or Equation (7) as summary statistic. The main explanation for why the ABC–PMC analysis outperformed the TPF method lies in the fact that, at least for this simple forward model, TPF gives discrete reconstructed positions, while ABC gives continuous ones.

Figure 3: Euclidean distances obtained using Equation (8) for all four methods, resulting from the reconstruction of 6297 independent events. The proposed ABC–PMC algorithm improves the reconstruction with respect to the currently existing alternative methods for both of the selected summary statistics, with the average Euclidean distance from the input being smallest for the two ABC–PMC variants (using Equation (6) and Equation (7) as summary statistics) and larger for TPF and NN.

By using a simple forward model that samples from the LCE maps, we introduced the ABC–PMC algorithm as a novel framework to reconstruct the 2–D (x, y) position using the S2 hit pattern as information, showing that the reconstruction can be improved with respect to the currently existing alternative methods. The posterior distribution also allows for the calculation of credible intervals to quantify the uncertainty of the reconstructed 2–D (x, y) position. This is similar to the TPF method, whose likelihood surface can be used to provide confidence intervals; the NN method only provides pointwise estimates. With this first example we tested the feasibility of a likelihood–free analysis, defining prior distributions for the parameters of interest and retrieving highly informative summary statistics. In the next two examples we will investigate further the performance of the proposed likelihood–free approach, by using a forward model that employs the full waveform simulation and BOLFI as the statistical tool for the likelihood–free inference.

4.2 Position Reconstruction using full waveform simulation

The first example introduced in Section 4.1 is useful to show from a practical standpoint how to implement the ABC–PMC algorithm and to show the potential of the likelihood–free framework in reconstructing the 2–D (x, y) position. With the same goal, the second example presented below, rather than using a simple forward model that samples from the LCE maps, employs the complete waveform simulation and reconstruction. Given the input coordinates (x, y), beyond the S2 top hit pattern this forward model also returns the S2 bottom hit pattern and the time required by the photo–electrons to reach the top PMTs, known as the timing. The forward model returns the S2 bottom hit pattern together with the S2 top hit pattern by considering the light distribution as well as the light collection efficiency of the S2 signal over all PMTs in the TPC. We note that in the previous example we called the S2 top hit pattern simply the S2 hit pattern; from now on we will clearly distinguish the type of S2 hit pattern we are referring to, specifying the terms "bottom" or "top". Since TPF and NN only consider the S2 top hit pattern to reconstruct the 2–D (x, y) position, for this second example we also consider only the S2 top hit pattern, in order to present a comparison that uses the same amount of information among the employed methods.

The new forward model is much more complex and realistic, and therefore slower, than the one used in Section 4.1, making the use of the ABC–PMC algorithm computationally unfeasible. For this reason, the likelihood–free analysis is carried out using BOLFI, introduced in Section 3.2. In the following we consider all the PMTs to be operational. We consider two different sizes of the charge signal, one releasing 25 electrons and one releasing 10 electrons. As prior distribution for (x, y) we use Equation (4), but the mean hyper–parameters are taken from the TPF method rather than from the "maximum PMT" method. Also in this case, when using BOLFI to reconstruct the 2–D (x, y) position, we perform the analyses for two different summary statistics: in the first analysis we use the Euclidean distance defined through Equation (6), while in the second analysis the Bray–Curtis dissimilarity defined through Equation (7) is used.

As pointed out in Section 3.2, in order to reduce the computational effort required to perform the likelihood–free analysis, the function of the discrepancies is modeled through a Gaussian process and its minimum is inferred by using Bayesian optimization. Rather than using the discrepancies directly, [gutmann2016bayesian] suggested using the logarithm of the discrepancies in cases where the underlying function is expected to be very peaked. For our study, a complication is introduced by the constraint on the parameters (x, y) given in Equation (5). We modified the acquisition function in BOLFI in order to discard those proposed coordinates for which the constraint of Equation (5) does not hold. A typical output provided by BOLFI is displayed in Figure 4. By looking at the top plot of Figure 4, it is possible to appreciate the behavior of the logarithm of the discrepancies. If the summary statistic is properly defined (i.e. it is highly informative on the parameters of interest), then the minimum of the logarithm of the discrepancies will tend towards the input coordinates. As a result of the Gaussian process, the target surface is inferred as shown in the bottom left plot of Figure 4. BOLFI, using any proper sampling algorithm such as MCMC or SMC, will reconstruct the posterior distribution for (x, y) by sampling from those regions of the logarithm of the discrepancies having the smallest values, as shown in the bottom plots of Figure 4.

Figure 4: (top) Logarithm of the discrepancies returned by BOLFI for a generic 2–D (x, y) position, using as summary statistic the Bray–Curtis dissimilarity defined in Equation (7). For both x and y the minimum was obtained when the proposed coordinates were close to the input coordinates, meaning that the used summary statistic preserved the relevant features of the parameters of interest. (bottom left) The acquisition surface resulting from the BOLFI analysis. After acquiring the set of evidence points, an MCMC algorithm samples from the relevant region of the parameter space. (bottom right) Contour plot and marginal posterior distributions for x and y obtained after the MCMC sampling. For both x and y an HPD interval is retrieved.

When calling BOLFI, five of the parameters originally introduced by [gutmann2016bayesian] need to be properly tuned. The first parameter, called initial evidence, gives the number of initialization points sampled directly from the prior distributions before starting to optimize the acquisition of points. A second parameter, named update interval, defines how often the GP hyper–parameters are optimized, while the parameter acquisition noise variance defines the diagonal covariance of the noise added to the acquired points. The fourth parameter, n evidence, gives the total number of evidence points used to build the Gaussian process for the logarithm of the discrepancies. Finally, a fifth parameter, n samples, defines the length of the MCMC run used to sample from the posterior distribution (by default, the burn–in equals half of n samples). Our choices for those parameters are summarized in Table 2.

initial evidence | update interval | acq noise var | n evidence | n samples
Table 2: Parameters used to initialize BOLFI. We found these choices to perform best from both a computational and an inferential standpoint.

Once the five parameters of Table 2 are defined, BOLFI can be used to reconstruct the 2–D (x, y) position. Considering the two different (low) sizes of the charge signal and the two different summary statistics, we perform four analyses. The location parameter of the posterior distribution used to provide the reconstructed position is the posterior mean. We also tried the posterior mode and the posterior median, but these two statistics did not perform as well as the posterior mean.

When the input size of the charge signal is fixed at 25 electrons, BOLFI reconstructs the 2–D (x, y) position better than TPF and NN, as shown in Figure 5. In particular, using the Bray–Curtis dissimilarity to compare the observed and the simulated S2 top hit patterns leads to an overall improvement over the reconstructed positions with respect to TPF. When focusing on events at the edges of the TPC (i.e. at large radius r), the 2–D (x, y) position reconstructed with BOLFI is even more accurate with respect to TPF. The Euclidean distance performs slightly worse than the Bray–Curtis dissimilarity, but the results are still more accurate than those of the currently available methods, both over all events and over events at large radii. The right plot of Figure 5 shows the mean distance from the input positions as a function of the radius r. When using BOLFI and the Bray–Curtis dissimilarity, the results are more accurate than TPF and NN at every radius r.

Figure 5: Size of the charge signal e = 25. (left) Distribution of the Euclidean distances obtained using Equation (8) for all the methods once the events are reconstructed. For each method the corresponding average Euclidean distance and standard deviation are displayed. When using BOLFI and the Bray–Curtis dissimilarity, the accuracy improves with respect to TPF. (middle) Distribution of the Euclidean distances obtained using Equation (8) for all the methods on the subset of the reconstructed events at large radii. When using BOLFI and the Bray–Curtis dissimilarity, the accuracy improves with respect to TPF, and the standard deviations retrieved with BOLFI are smaller than the ones obtained with the commonly employed methods. (right) Mean distances from the input positions as a function of the radius r. When using BOLFI and the Bray–Curtis dissimilarity, the results are more accurate than TPF and NN at every radius r.

When the size of the charge signal is fixed at 10 electrons, BOLFI also reconstructs the 2–D (x, y) position better than TPF and NN, as shown in Figure 6. In particular, using the Bray–Curtis dissimilarity to compare the observed and the simulated S2 hit patterns leads to an overall improvement over the reconstructed positions with respect to TPF, and when focusing on edge events of the TPC the accuracy with respect to TPF improves further. Also in this case the Euclidean distance performs slightly worse than the Bray–Curtis dissimilarity, but the results are still more accurate than those of the currently available methods, both over all events and over events at large radii. The right plot of Figure 6 shows the mean distance from the input positions as a function of the radius r. When using BOLFI and the Bray–Curtis dissimilarity, the results are more accurate than TPF and NN at every radius r.

Figure 6: Size of the charge signal e = 10. (left) Distribution of the Euclidean distances obtained using Equation (8) for all the methods once the events are reconstructed. For each method the corresponding average Euclidean distance and standard deviation are displayed. When using BOLFI and the Bray–Curtis dissimilarity, the accuracy improves with respect to TPF. (middle) Distribution of the Euclidean distances obtained using Equation (8) for all the methods on the subset of the reconstructed events at large radii. When using BOLFI and the Bray–Curtis dissimilarity, the accuracy improves with respect to TPF, and the standard deviations retrieved with BOLFI are smaller than the ones obtained with the commonly employed methods. (right) Mean distances from the input positions as a function of the radius r. When using BOLFI and the Bray–Curtis dissimilarity, the results are more accurate than TPF and NN at every radius r.

4.3 Position and Energy Reconstruction using full waveform simulation

In this last example we use specifications similar to the previous example but, beyond the 2–D (x, y) position, we also consider the size of the charge signal (e) as a parameter of interest whose posterior distribution has to be retrieved. The reconstruction of the event is then defined by the 3–D coordinates (x, y, e). While for (x, y) the same prior distribution defined in the previous example is used, a prior distribution has to be assigned to e. Since we are reconstructing the number of ionization electrons, the natural choice for the prior distribution would be a Poisson random variable with rate parameter λ. This discrete density cannot be used in the ELFI Python package, either when inferring the function of the discrepancies or when using the MCMC algorithm to retrieve samples from the posterior distributions. In order to solve this problem, we used as prior distribution for e a lognormal distribution centered on the size of the charge signal reconstructed with PAX (e_PAX) and with a fixed standard deviation σ_e:

e ~ LogNormal(log(e_PAX), σ_e²)    (9)

Because of the presence of a third parameter, we found both the Euclidean distance and the Bray–Curtis dissimilarity, defined respectively through Equations (6) and (7), to be insufficiently informative on their own to properly reconstruct the 3–D event (x, y, e). For this reason we decided to add the energy distance as a further distance function, to be combined with the previously defined metrics. The energy distance between two distributions F and G is defined as:

D_E²(F, G) = 2 E‖X − Y‖ − E‖X − X′‖ − E‖Y − Y′‖,    (10)

where X, X′ ~ F and Y, Y′ ~ G are independent random draws, and F and G are the distributions estimated respectively using D_obs and D*.

Although the S2 bottom hit pattern does not contain valuable information about the 2–D (x, y) position, it can be useful to better reconstruct e. For this reason we use as information not only the S2 top hit pattern but also the S2 bottom hit pattern. From a practical standpoint, including the S2 bottom hit pattern increases the dimension of the vector returned by PAX from 127 to 248. However, we note that using the S2 bottom hit pattern to reconstruct the 2–D (x, y) position leads to worse results than using the S2 top hit pattern only, because the gain in information is smaller than the noise introduced by those further 121 PMTs. For this reason, we use both the S2 top hit pattern and the S2 bottom hit pattern to calculate the energy distance defined in Equation (10), and only the S2 top hit pattern to calculate the Bray–Curtis dissimilarity defined through Equation (7). The overall distance function selected to compare D_obs with D* is therefore the sum of the Bray–Curtis dissimilarity, which uses only the S2 top hit pattern, and the energy distance, which uses all 248 PMTs. The logarithm of the discrepancies returned by BOLFI for a generic 3–D (x, y, e) event is displayed in Figure 7. We finally note that, since in the previous example the Bray–Curtis dissimilarity performed better than the Euclidean distance in reconstructing the 2–D (x, y) position, for this example the BOLFI analyses only consider the Bray–Curtis dissimilarity, combined with the energy distance as just described.
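A sketch of the combined discrepancy is given below, using scipy's one–dimensional energy distance between the empirical distributions of the per–PMT counts; treating the counts of the two hit patterns as samples is our reading of Equation (10), and the function names are illustrative.

```python
import numpy as np
from scipy.stats import energy_distance

def combined_discrepancy(obs_top, sim_top, obs_all, sim_all):
    """Bray-Curtis on the 127 top-array PMTs (informative on x, y) plus
    the energy distance on all 248 PMTs (informative on e)."""
    bc = np.sum(np.abs(obs_top - sim_top)) / np.sum(obs_top + sim_top)
    ed = energy_distance(obs_all, sim_all)
    return bc + ed
```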

Figure 7: Logarithm of the discrepancies returned by BOLFI for a generic 3–D (x, y, e) event. For all three parameters the minimum was obtained when the proposed coordinates were close to the input coordinates, meaning that the used summary statistics preserved the relevant features of the parameters of interest.

Figures 8 and 9 show the results once 1000 events have been reconstructed using the new specifications and the settings from Table 2 required to initialize BOLFI. To quantify the goodness of the 3–D reconstruction, the Euclidean distance in ℝ³ between the reconstructed event and the input coordinates is computed, according to Equation (11):

d₃ = √( (x_in − x_rec)² + (y_in − y_rec)² + (e_in − e_rec)² )    (11)

When the size of the charge signal is fixed at 25 electrons, BOLFI is more accurate than TPF and NN in reconstructing the 3–D (x, y, e) coordinates, as shown in Figure 8. The improvement with respect to TPF holds over all the reconstructed positions, and it grows when focusing on events at the edges of the TPC (i.e. at large radius r). The right plot of Figure 8 shows that the results obtained with BOLFI are more accurate than TPF and NN at every radius r.

Figure 8: Size of the charge signal e = 25. (left) Distribution of the Euclidean distances defined in Equation (11) for all the methods once the events are reconstructed. For each method the corresponding average Euclidean distance and standard deviation are displayed. When using BOLFI, the accuracy improves with respect to TPF. (middle) Distribution of the Euclidean distances defined in Equation (11) for all the methods on the subset of the reconstructed events at large radii. When using BOLFI, the accuracy improves with respect to TPF, and the standard deviations retrieved with BOLFI are smaller than the ones obtained with the commonly employed methods. (right) Mean distance from the input positions as a function of the radius r. When using BOLFI, the results are more accurate than TPF and NN at every radius r.

When the size of the charge signal is fixed at 10 electrons, BOLFI remains more accurate than TPF and NN in reconstructing the 3–D (x, y, e) coordinates, as shown in Figure 9, although the performance is not as good as in the previous case in which the size of the charge signal was equal to 25 electrons. The improvement over the reconstructed positions with respect to TPF persists, and it is larger when focusing on events at high radii of the TPC (i.e. at large r). The right plot of Figure 9 shows that the results obtained with BOLFI are more accurate than TPF and NN at every radius r.

Figure 9: Size of the charge signal e = 10. (left) Distribution of the Euclidean distances defined in Equation (11) for all the methods once the events are reconstructed. For each method the corresponding average Euclidean distance and standard deviation are displayed. When using BOLFI, the accuracy improves with respect to TPF. (middle) Distribution of the Euclidean distances defined in Equation (11) for all the methods on the subset of the reconstructed events at large radii. When using BOLFI, the accuracy improves with respect to TPF, and the standard deviations retrieved with BOLFI are smaller than the ones obtained with the commonly employed methods. (right) Mean distance from the input positions as a function of the radius r. When using BOLFI, the results are more accurate than TPF and NN at every radius r.

5 Summary

We presented a novel method, based on the likelihood–free framework, to reconstruct the 2–D (x, y) position and the 3–D (x, y, e) coordinates of an interaction, using the S2 signal and its corresponding S2 top and bottom hit patterns. Although for the presented examples we used the XENON1T TPC, the proposed algorithm can be used for any dual–phase TPC. In order to run both the ABC–PMC and BOLFI algorithms, we defined suitable prior distributions for the parameters of interest (x, y, e) and selected highly informative discrepancy measures, such as the Euclidean distance, the Bray–Curtis dissimilarity and, for the last example, the energy distance.

We evaluated the quality of the proposed algorithm by first reconstructing the 2–D (x, y) position and then adding the size of the charge signal e as a further parameter of interest. We performed a comparison with the currently existing alternative methods on samples of independent events, obtained by using the open source PAX data processor and waveform simulator developed by the XENON collaboration. When focusing only on the 2–D (x, y) position reconstruction, BOLFI improved the accuracy of the reconstruction over TPF both when the size of the charge signal was fixed to 25 electrons and when it was fixed to 10 electrons, with even larger improvements when focusing on events at large radius r. When focusing on the 3–D (x, y, e) reconstruction, BOLFI kept improving the accuracy of the reconstruction over TPF for both input sizes of the charge signal, again with slightly larger improvements for events at large radius r. Finally, the uncertainties associated with the parameters of interest retrieved by BOLFI were always the smallest among all the tested methods.

The proposed likelihood–free method presents several advantages with respect to the currently existing alternative methods. While the TPF method relies on the ability to retrieve the LCE functions, which must be numerically estimated, BOLFI just requires a simulator that, given a set of input coordinates, provides a simulated S2 hit pattern (and, as a further feature, the timing). BOLFI is also preferable with respect to the NN method, since the latter rests on no physical assumptions. We note also that BOLFI does not need any correction for reconstructing the size of the charge signal, as it directly infers the number of ionization electrons. Moreover, the drift time, and hence the depth coordinate z, can in principle be reconstructed. In the present work we used a fixed z because simulating deep events requires a model of the drift field which is very detector specific. However, if we were to reconstruct the position of deep events, no field distortion correction would be needed, since the distortion is included in the forward model. One possible limitation of likelihood–free inference is the computational burden of its analysis. BOLFI reconstructs events several orders of magnitude faster than the ABC–PMC algorithm, but still requires on the order of one minute of reconstruction time per event, most of which is spent in the waveform generator. The TPF method is faster, although the likelihood region is evaluated only on a fixed grid. BOLFI automatically allows for properly defined credible intervals on the parameters of interest, similar to TPF's confidence intervals in the frequentist framework. Ultimately the speed and accuracy of the simulator determine the speed and accuracy of the reconstruction. We note however that the gain in accuracy presented in this work justifies the applicability of BOLFI over TPF even if our algorithm is computationally more expensive.

We gratefully acknowledge and thank the XENON collaboration for developing and maintaining their open source data processor and waveform generator PAX. U. Simola was supported by the Academy of Finland grant no. 1313197. J. Conrad acknowledges support from the Knut and Alice Wallenberg Foundation and Swedish Research Council. J. Corander was supported by the ERC grant no. 742158.

References