Extracting more from boosted decision trees: A high energy physics case study

01/16/2020
by Vidhi Lalchand, et al.

Particle identification is one of the core tasks in the data analysis pipeline at the Large Hadron Collider (LHC). Statistically, it entails identifying rare signal events buried in immense backgrounds that mimic the signal's properties. In machine learning parlance, particle identification is a classification problem characterized by overlapping and imbalanced classes. Boosted decision trees (BDTs) have had tremendous success in particle identification but have recently been overshadowed by deep neural network (DNN) approaches. This work proposes an algorithm that extracts more from standard boosted decision trees by targeting their main weakness, susceptibility to overfitting. The construction harnesses the meta-learning techniques of boosting and bagging simultaneously and performs remarkably well on the ATLAS Higgs (H) to tau-tau data set (ATLAS et al., 2014), which was the subject of the 2014 Higgs ML Challenge (Adam-Bourdarios et al., 2015). While the decay of the Higgs to a pair of tau leptons was established in 2018 (CMS collaboration et al., 2017) at 4.9σ significance based on the 2016 data-taking period, the 2014 public data set continues to serve as a benchmark for testing supervised classification schemes. We show that the score achieved by the proposed algorithm is very close to the published winning score, which leverages an ensemble of DNNs. Although this paper focuses on a single application, we expect this simple and robust technique to find wider application in high energy physics.
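The abstract does not spell out how boosting and bagging are combined, so the following is only a minimal sketch of one plausible reading: several boosted ensembles are each trained on a bootstrap resample of the data and their scores are averaged. Everything here is an assumption made for illustration, not the paper's algorithm: AdaBoost with decision stumps stands in for the boosted decision trees, the two-feature Gaussian data set is synthetic (roughly 20% signal, overlapping with background), and all function names and parameter values are hypothetical.

```python
import math
import random


def fit_stump(X, y, w):
    """Find the one-feature threshold rule with the lowest weighted error."""
    best = (float("inf"), 0, 0.0, 1)  # (weighted error, feature, threshold, polarity)
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for pol in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (pol if row[j] > t else -pol) != yi)
                if err < best[0]:
                    best = (err, j, t, pol)
    return best


def stump_predict(stump, row):
    _, j, t, pol = stump
    return pol if row[j] > t else -pol


def fit_adaboost(X, y, n_rounds):
    """Boosting: each round reweights the events the previous stumps got wrong."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)  # clip to keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def boosted_score(model, row):
    return sum(alpha * stump_predict(stump, row) for alpha, stump in model)


def fit_bagged_boosting(X, y, n_bags, n_rounds, rng):
    """Bagging: train one boosted ensemble per bootstrap resample."""
    n = len(X)
    bags = []
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        bags.append(fit_adaboost([X[i] for i in idx], [y[i] for i in idx], n_rounds))
    return bags


def bagged_predict(bags, row):
    """Average the boosted scores over all bags, then take the sign."""
    avg = sum(boosted_score(model, row) for model in bags) / len(bags)
    return 1 if avg > 0 else -1


# Synthetic, imbalanced, overlapping classes (assumption: not the ATLAS data).
rng = random.Random(0)
X, y = [], []
for _ in range(120):
    is_signal = rng.random() < 0.2
    centre = 2.0 if is_signal else 0.0
    X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
    y.append(1 if is_signal else -1)

bags = fit_bagged_boosting(X, y, n_bags=5, n_rounds=8, rng=rng)
preds = [bagged_predict(bags, row) for row in X]
accuracy = sum(p == t for p, t in zip(preds, y)) / len(y)
print(f"training accuracy of the bagged boosted ensemble: {accuracy:.3f}")
```

Averaging over bootstrap resamples is a standard variance-reduction device, which is why it is a natural candidate for countering the overfitting the abstract identifies as the main weakness of BDTs. The accuracy printed here is measured on the training data and says nothing about the scores reported for the Higgs ML Challenge.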

