Neural Networks for Full Phase-space Reweighting and Parameter Tuning

07/18/2019 ∙ by Anders Andreassen, et al. ∙ UC Berkeley ∙ Berkeley Lab

Precise scientific analysis in collider-based particle physics is possible because of complex simulations that connect fundamental theories to observable quantities. The significant computational cost of these programs limits the scope, precision, and accuracy of Standard Model measurements and searches for new phenomena. We therefore introduce Deep neural networks using Classification for Tuning and Reweighting (DCTR), a neural network-based approach to reweight and fit simulations using the full phase space. DCTR can perform tasks that are currently not possible with existing methods, such as estimating non-perturbative fragmentation uncertainties. The core idea behind the new approach is to exploit powerful high-dimensional classifiers to reweight phase space as well as to identify the best parameters for describing data. Numerical examples from e^+e^-→jets demonstrate the fidelity of these methods for simulation parameters that have a big and broad impact on phase space as well as those that have a minimal and/or localized impact. The high fidelity of the full phase-space reweighting enables a new paradigm for simulations, parameter tuning, and model systematic uncertainties across particle physics and possibly beyond.



Appendix A: Optimal Functions

The results presented here can be found (as exercises) in textbooks, but are repeated here for easy access. Let $X$ be some discriminating features and let $Y \in \{0, 1\}$ be another random variable representing class membership. Consider the general problem of minimizing some average loss for the function $f(X)$:

$f = \mathrm{argmin}_{f'} \, \mathbb{E}\big[\mathrm{loss}(f'(X), Y)\big]$,  (5)

where $\mathbb{E}$ means ‘expected value’, i.e. average value or mean (sometimes represented as $\langle \cdot \rangle$). The expectation values are performed over the joint probability density of $(X, Y)$. One can rewrite Eq. 5 as

$f = \mathrm{argmin}_{f'} \, \mathbb{E}\big[\, \mathbb{E}\big[\mathrm{loss}(f'(X), Y) \mid X \big] \big]$.  (6)

The advantage of writing the loss as in Eq. 6 is that one can see that it is sufficient to minimize the function (and not functional) $\mathbb{E}[\mathrm{loss}(f'(x), Y) \mid X = x]$ for all $x$. (The derivation below for the mean-squared error was partially inspired by Appendix A in Ref. Cranmer et al. (2015).) To see this, let $f(x) = \mathrm{argmin}_{f'(x)} \mathbb{E}[\mathrm{loss}(f'(x), Y) \mid X = x]$ and suppose that $g$ is a function with a strictly smaller loss in Eq. 5 than $f$. Since the average loss for $g$ is below that of $f$, by the intermediate value theorem, there must be an $x$ for which the conditional average loss for $g$ is below that of $f$, contradicting the construction of $f$.
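The same conclusion can be stated compactly with the tower property of conditional expectation (a restatement of the argument above, where $c$ ranges over the possible values of $f'(x)$ at fixed $x$): for any candidate function $f'$,

$\mathbb{E}\big[\mathrm{loss}(f'(X), Y)\big] = \mathbb{E}_X\Big[\, \mathbb{E}\big[\mathrm{loss}(f'(X), Y) \mid X \big] \Big] \;\geq\; \mathbb{E}_X\Big[\, \min_{c} \, \mathbb{E}\big[\mathrm{loss}(c, Y) \mid X \big] \Big]$,

with equality exactly when $f'(x)$ attains the inner minimum for (almost) every $x$, so the pointwise minimizer also minimizes the average loss in Eq. 5.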

Now, consider the case where the loss is the cross-entropy, so that minimizing the loss is equivalent to maximizing the conditional log-likelihood:

$f(x) = \mathrm{argmax}_{f'} \, \mathbb{E}\big[\, Y \log f'(X) + (1 - Y) \log(1 - f'(X)) \mid X = x \big]$  (7)
$\quad\;\;\; = \mathrm{argmax}_{f'} \, \big[\, p(Y{=}1 \mid x) \log f'(x) + p(Y{=}0 \mid x) \log(1 - f'(x)) \big]$,  (8)

where $x$ is fixed. Equation 7 is maximized for $f(x) = p(Y{=}1 \mid x)$. Coincidentally, the exact same result holds if using the mean squared error loss. When using either loss function with two outputs and the softmax activation for the last neural network layer, the first output will asymptotically approach $p(Y{=}1 \mid x)$ and the other, by construction, will be $p(Y{=}0 \mid x) = 1 - p(Y{=}1 \mid x)$. The ratio of these two outputs is then:

$\dfrac{f(x)}{1 - f(x)} = \dfrac{p(Y{=}1 \mid x)}{p(Y{=}0 \mid x)}$  (9)
$\qquad\qquad\;\;\, = \dfrac{p(x \mid Y{=}1)\, p(Y{=}1)}{p(x \mid Y{=}0)\, p(Y{=}0)}$  (10)
$\qquad\qquad\;\;\, = \dfrac{p(Y{=}1)}{p(Y{=}0)} \cdot \dfrac{p(x \mid Y{=}1)}{p(x \mid Y{=}0)}$  (11)
$\qquad\qquad\;\;\, \propto \dfrac{p(x \mid Y{=}1)}{p(x \mid Y{=}0)}$,  (12)

Therefore, the ratio of the two network outputs, $f(x)/(1 - f(x))$, is proportional to the likelihood ratio. The proportionality constant is the ratio of the class fractions used during training. In this paper, the two classes always have the same number of examples, so this factor is unity.
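As a concrete numerical check of Eqs. 9-12 (a minimal sketch, not the implementation used in the paper: it assumes a toy setup with two 1D Gaussian samples and uses scikit-learn's MLPClassifier, trained with the cross-entropy loss, as a stand-in for the deep networks discussed above), one can verify that $f(x)/(1 - f(x))$ recovers the known likelihood ratio and can serve as per-event weights:

# Minimal sketch (not from the paper): check numerically that a classifier
# trained with cross-entropy gives f(x) / (1 - f(x)) ~ p(x|Y=1) / p(x|Y=0).
# Toy assumption: both samples are 1D Gaussians, so the exact ratio is known;
# scikit-learn's MLPClassifier stands in for a deep network.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n = 50_000
x0 = rng.normal(0.0, 1.0, n)   # class Y = 0 ("nominal" sample)
x1 = rng.normal(0.5, 1.0, n)   # class Y = 1 ("target" sample)

X = np.concatenate([x0, x1]).reshape(-1, 1)
y = np.concatenate([np.zeros(n), np.ones(n)])   # equal class sizes: the Eq. 12 constant is unity

# MLPClassifier minimizes the cross-entropy (log) loss, as in Eq. 7.
clf = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=50)
clf.fit(X, y)

f = clf.predict_proba(x0.reshape(-1, 1))[:, 1]   # estimate of p(Y=1 | x)
f = np.clip(f, 1e-6, 1 - 1e-6)                   # guard against division by zero
weights = f / (1.0 - f)                          # Eq. 9: estimated likelihood ratio

# Exact ratio p(x|Y=1) / p(x|Y=0) for these two Gaussians, for comparison.
exact = np.exp(0.5 * (x0**2 - (x0 - 0.5)**2))
print("mean absolute deviation from exact ratio:", np.abs(weights - exact).mean())

# Reweighting check: the weighted mean of the Y = 0 sample should move
# from 0.0 toward the Y = 1 mean of 0.5.
print("weighted mean of reweighted sample:", np.average(x0, weights=weights))

In the full phase-space setting, the same construction applies with the toy Gaussians replaced by complete simulated events and the small network replaced by a classifier acting on all of the particles.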