Feature Selection with Distance Correlation

11/30/2022
by Ranit Das, et al.

Choosing which properties of the data to use as input to multivariate decision algorithms – a.k.a. feature selection – is an important step in solving any problem with machine learning. While there is a clear trend towards training sophisticated deep networks on large numbers of relatively unprocessed inputs (so-called automated feature engineering), for many tasks in physics, sets of theoretically well-motivated and well-understood features already exist. Working with such features can bring many benefits, including greater interpretability, reduced training and run time, and enhanced stability and robustness. We develop a new feature selection method based on Distance Correlation (DisCo) and demonstrate its effectiveness on the tasks of boosted top- and W-tagging. Using our method to select features from a set of over 7,000 energy flow polynomials, we show that we can match the performance of much deeper architectures using only ten features and two orders of magnitude fewer model parameters.
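For readers unfamiliar with the quantity the method is built on, the following is a minimal sketch of the standard sample estimator of distance correlation, written with NumPy. It is not the authors' feature selection procedure, which the abstract does not describe. The function names `_centered_distances` and `distance_correlation` are hypothetical, chosen here for illustration.

```python
import numpy as np


def _centered_distances(x):
    """Return the double-centered pairwise Euclidean distance matrix of x."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D input as n samples of one feature
    diff = x[:, None, :] - x[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    # Subtract row and column means, then add back the grand mean.
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())


def distance_correlation(x, y):
    """Sample distance correlation between x and y (same number of samples)."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = (a * b).mean()    # squared sample distance covariance
    dvar2_x = (a * a).mean()  # squared sample distance variance of x
    dvar2_y = (b * b).mean()  # squared sample distance variance of y
    denom = np.sqrt(dvar2_x * dvar2_y)
    if denom == 0.0:
        return 0.0            # a constant input carries no dependence
    return float(np.sqrt(max(dcov2, 0.0) / denom))


# Illustration: a purely nonlinear dependence that Pearson correlation misses.
x = np.linspace(-1.0, 1.0, 201)
y = x ** 2
pearson = float(np.corrcoef(x, y)[0, 1])
dcor = distance_correlation(x, y)
```

In this illustration the Pearson coefficient of `x` and `x ** 2` vanishes by symmetry, while the distance correlation stays clearly positive. Sensitivity to nonlinear dependence of this kind is what makes the measure a candidate for ranking features.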
