Introduction
Neutrinos are tiny subatomic particles that carry no electrical charge and interact with matter only through the weak nuclear force, which makes them extremely hard to detect. There are three distinct types of neutrinos, called “flavors”: (, ,
), each of which can “oscillate” into the other with a detectable probability. Many experiments
[Abe and others2015] [Adamson and others2016] have been setup to measure the parameters governing the oscillation probabilities accurately, with implications for the fundamental structure of the universe. Very often, this involves inferences from tiny samples of data which have complicated dependencies on multiple oscillation parameters simultaneously. This is typically carried out using the unified approach of Feldman and Cousins [Feldman and Cousins1998] which is very computationally expensive, on the order of tens of millions of CPU hours. In this work, we propose an iterative method using Gaussian Process to efficiently find a confidence contour for the oscillation parameters and show that it produces the same results at a fraction of the computation cost. To our knowledge, the most similar existing work is using a Gaussian Process surrogate in the approximate Bayesian computation framework [Meeds and Welling2014] but it may not achieve desired frequentist coverage.Oscillation Parameter Inference
The probability of oscillations, is determined by . To infer , a typical neutrino oscillation experiment sends a beam of neutrinos into a detector and observes a handful of oscillated neutrinos from their interactions with the detector. The observed neutrinos are binned by their energy as the oscillation probability is a function of energy. For each energy bin , there is an observed neutrino count and an expected count , which naturally gives rise to a Poisson distribution. However, for a given , the expectation is also influenced by the underlying model of the experiment, such as the beam configuration and the physics of neutrino interactions. Each of these have their own associated uncertainties, . Since the relationship between and is not available analytically, is deduced from Monte Carlo. Denote the implicit mapping between and by . The loglikelihood of is given by:
(1) 
where is a penalty term for systematic error. The best fit parameters can be obtained by maximizing the loglikelihood but a confidence contour is also needed to quantify the statistical uncertainty in . This can be done by an inverted likelihood ratio test (LRT) as a particular value should only be included in the confidence contour if we fail to reject the null at the level. However, the asymptotic distribution of the likelihood ratio statistic is unreliable when the sample size of observed neutrinos is small. Moreover, the distribution of can vary drastically as a function of due to the complexity of . Therefore, for a given , Monte Carlo experiments are used to simulate the LRT critical value . Since the parameter space is bounded, this is done in a grid for a large number of values. Known as the FeldmanCousins method in high energy physics, this LRT based approach has high statistical power by the NeymanPearson lemma but exhausting the grid is computationally inefficient.
Gaussian Process Iterative Method
With applications to hyperparameter tuning [Snoek, Larochelle, and
Adams2012], Bayesian optimization can be used to find the optimum of any blackbox function that is expensive to evaluate. Bayesian optimization is an iterative procedure. In each iteration, a number of points are evaluated to update an approximation of . Then based on the approximation, the points in the next iteration are proposed by an acquisition function . The approximation model is usually a zeromean Gaussian process . A assumes any finite collection of points to be jointly Normal with covariance matrix , which is parametrized by a kernel function that defines pairwise covariance. Given observed data, a model can be fitted by optimizing parameters of the kernel. The acquisition function aims to balance between “exploration,” reducing approximation uncertainty, and “exploitation,” reaching the optimum.
In our context, the expensive blackbox function is the mapping of LRT critical values . What we want to find is the confidence contour, the set of points where the observed likelihood ratio statistic is equal to the critical value. The acquisition function is given by
(2) 
where is the approximated critical value at and is the standard deviation at . Iteratively, the model will seek those points for which it is unsure about being within the confidence contour. The other points would be either included or rejected with some certainty. Figure 1 illustrates one iteration of the proposed method. With more iterations, the certainty will increase so that the approximated confidence contour converges to the one produced by a full grid search.
Results
We set up a toy experiment with an oscillated signal binned uniformly between GeV and no backgrounds. Toy uncertainties for the beam configuration and the interaction probability are included in the nuisance parameters, . More details are given in the supplement. With treated as a nuisance parameter as well, the confidence contour of interest is of in the range of .
points on an evenlyspaced grid are used to find the “true” confidence contour using standard FeldmanCousins method, with Monte Carlo experiments performed at each point. For the proposed method, points are proposed in each iteration. For evaluation, we compare the proposed method and standard FeldmanCousins method by calculating the percentage of overlap between the respective confidence contours (set of points)
(3) 
where and are indicator functions for whether is included in the confidence contour. The same comparison is performed with typical posthoc smoothing (of confidence contour) as well. As shown in Figure 2, on simulated data sets, the proposed method produces the same results as standard FeldmanCousins method using 30% of the computation budget. In high energy physics and other fields, there are a wide range of similar inference problems and the proposed method can be applied to them as well.
References
 [Abe and others2015] Abe, K., et al. 2015. Measurements of neutrino oscillation in appearance and disappearance channels by the T2K experiment with E protons on target. Physical Review D 91(7):072010.
 [Adamson and others2016] Adamson, P., et al. 2016. First measurement of electron neutrino appearance in NOvA. Physical Review Letters 116(15):151806.
 [Feldman and Cousins1998] Feldman, G. J., and Cousins, R. D. 1998. Unified approach to the classical statistical analysis of small signals. Physical Review D 57(7):3873.
 [Meeds and Welling2014] Meeds, E., and Welling, M. 2014. Gpsabc: Gaussian process surrogate approximate bayesian computation. arXiv preprint arXiv:1401.2838.

[Snoek, Larochelle, and
Adams2012]
Snoek, J.; Larochelle, H.; and Adams, R. P.
2012.
Practical bayesian optimization of machine learning algorithms.
In Advances in neural information processing systems, 2951–2959.
Comments
There are no comments yet.