Consistent Feature Construction with Constrained Genetic Programming for Experimental Physics

08/17/2019
by Noëlie Cherrier, et al.

A good feature representation is a determining factor in the classification performance of many machine learning algorithms. This is especially true for techniques that do not build complex internal representations of data (e.g. decision trees, in contrast to deep neural networks). To transform the feature space, feature construction techniques build new high-level features from the original ones. Among these techniques, Genetic Programming is a good candidate to provide the interpretable features required for data analysis in high-energy physics. Classically, original features or higher-level features based on physics first principles are used as inputs for training. However, physicists would benefit from automatic and interpretable feature construction for the classification of particle collision events. Our main contribution consists in combining different aspects of Genetic Programming and applying them to feature construction for experimental physics. In particular, to be applicable to physics, dimensional consistency is enforced using grammars. Results of experiments on three physics datasets show that the constructed features can bring a significant gain in classification accuracy. To the best of our knowledge, this is the first time that a method has been proposed for interpretable feature construction with units of measurement, and that experts in high-energy physics have validated the overall approach as well as the interpretability of the built features.
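
To make the idea of grammar-enforced dimensional consistency concrete, here is a minimal Python sketch. It is not the authors' grammar, which this abstract does not specify. It is a toy illustration in which every production rule is indexed by a unit, so only dimensionally valid expressions can be generated. The feature names (`E`, `pT`, `m`, `eta`, `phi`), the use of a single energy exponent as the unit, the bound `DIMS`, and the helpers `leaf_like`, `generate`, and `dimension` are all assumptions made for this example.

```python
import random

# Hypothetical terminals: feature name -> exponent of energy (GeV) in its unit.
TERMINALS = {"E": 1, "pT": 1, "m": 1, "eta": 0, "phi": 0}
DIMS = range(-2, 3)  # allowed energy exponents (arbitrary bound for this sketch)


def leaf_like(dim, rng):
    """Smallest expression of unit GeV**dim built only from terminals."""
    leaves = [name for name, d in TERMINALS.items() if d == dim]
    if leaves:
        return rng.choice(leaves)
    energy = rng.choice([n for n, d in TERMINALS.items() if d == 1])
    if dim > 1:
        return ("*", energy, leaf_like(dim - 1, rng))
    return ("/", leaf_like(dim + 1, rng), energy)  # dim < 0


def generate(dim, depth, rng):
    """Random expression tree of unit GeV**dim.

    Each rule is indexed by the target dimension:
      <expr d> ::= <expr d> +|- <expr d>
                 | <expr a> * <expr d-a>
                 | <expr a> / <expr a-d>
                 | terminal of dimension d
    """
    if depth <= 0 or rng.random() < 0.2:
        return leaf_like(dim, rng)
    op = rng.choice("+-*/")
    if op in "+-":  # both operands must share the target unit
        return (op, generate(dim, depth - 1, rng), generate(dim, depth - 1, rng))
    if op == "*":
        a = rng.choice([a for a in DIMS if dim - a in DIMS])
        return (op, generate(a, depth - 1, rng), generate(dim - a, depth - 1, rng))
    a = rng.choice([a for a in DIMS if a - dim in DIMS])
    return (op, generate(a, depth - 1, rng), generate(a - dim, depth - 1, rng))


def dimension(tree):
    """Energy exponent of a tree; raises ValueError on inconsistent + or -."""
    if isinstance(tree, str):
        return TERMINALS[tree]
    op, left, right = tree
    dl, dr = dimension(left), dimension(right)
    if op in "+-":
        if dl != dr:
            raise ValueError("cannot add or subtract different units")
        return dl
    return dl + dr if op == "*" else dl - dr


if __name__ == "__main__":
    rng = random.Random(0)
    tree = generate(1, 3, rng)  # a candidate feature measured in GeV
    print(tree, "-> GeV **", dimension(tree))
```

Because the unit is carried by the rule itself, an expression such as `E + eta` can never be produced by `generate`, while `dimension` rejects it if it is built by hand. A real system would also need a fitness function and genetic operators that preserve these constraints.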

research
12/17/2019

Embedded Constrained Feature Construction for High-Energy Physics Data Classification

Before any publication, data analysis of high-energy physics experiments...
research
02/08/2019

Can Genetic Programming Do Manifold Learning Too?

Exploratory data analysis is a fundamental aspect of knowledge discovery...
research
10/11/2018

Energy Flow Networks: Deep Sets for Particle Jets

A key question for machine learning approaches in particle physics is ho...
research
12/17/2015

Unsupervised Feature Construction for Improving Data Representation and Semantics

Feature-based format is the main data representation format used by mach...
research
05/09/2018

The Power of Genetic Algorithms: what remains of the pMSSM?

Genetic Algorithms (GAs) are explored as a tool for probing new physics ...
research
01/16/2020

Extracting more from boosted decision trees: A high energy physics case study

Particle identification is one of the core tasks in the data analysis pi...
