Forward-Backward Selection with Early Dropping

05/30/2017
by Giorgos Borboudakis et al.

Forward-backward selection is one of the most basic and commonly used feature selection algorithms. It is also general and conceptually applicable to many different types of data. In this paper, we propose a heuristic that significantly improves its running time while preserving predictive accuracy. The idea is to temporarily discard the variables that are conditionally independent of the outcome given the currently selected variable set. Depending on how those variables are reconsidered and reintroduced, this heuristic gives rise to a family of algorithms with increasingly strong theoretical guarantees. In distributions that can be faithfully represented by Bayesian networks or maximal ancestral graphs, members of this algorithmic family correctly identify the Markov blanket in the sample limit. In experiments, we show that the proposed heuristic increases computational efficiency by about two orders of magnitude in high-dimensional problems, while selecting fewer variables and retaining predictive performance. Furthermore, we show that the proposed algorithm and feature selection with LASSO perform similarly when restricted to select the same number of variables, making the proposed algorithm an attractive alternative for problems where no (efficient) algorithm for LASSO exists.
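To make the early-dropping idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes continuous data and swaps in a Fisher z-test on partial correlations as the conditional independence test. The names `fbed`, `ci_pvalue`, `alpha`, and `runs` are hypothetical, and the rule for reconsidering dropped variables (a fixed number of extra forward passes) is one possible choice among the family the abstract describes.

```python
import math
import random


def _residual(v, basis):
    """Subtract from v its projection onto each orthonormal basis vector."""
    r = list(v)
    for q in basis:
        c = sum(a * b for a, b in zip(q, r))
        r = [a - c * b for a, b in zip(r, q)]
    return r


def _basis(columns, n):
    """Orthonormal basis (modified Gram-Schmidt) of an intercept plus columns."""
    basis = []
    for col in [[1.0] * n] + columns:
        r = _residual(col, basis)
        norm = math.sqrt(sum(a * a for a in r))
        if norm > 1e-10:  # skip columns that are numerically redundant
            basis.append([a / norm for a in r])
    return basis


def ci_pvalue(x, y, cond):
    """Two-sided Fisher z-test p-value for the partial correlation of x and y
    given the columns in cond (assumed test, suitable for roughly Gaussian data)."""
    n = len(y)
    basis = _basis(cond, n)
    rx, ry = _residual(x, basis), _residual(y, basis)
    sxx = sum(a * a for a in rx)
    syy = sum(b * b for b in ry)
    if sxx < 1e-12 or syy < 1e-12:
        return 1.0  # nothing left to correlate
    r = sum(a * b for a, b in zip(rx, ry)) / math.sqrt(sxx * syy)
    r = max(-0.999999, min(0.999999, r))
    z = abs(math.atanh(r)) * math.sqrt(max(n - len(cond) - 3, 1))
    return math.erfc(z / math.sqrt(2))


def fbed(X, y, alpha=0.05, runs=0):
    """Forward-backward selection with early dropping (illustrative sketch).

    X maps variable names to columns; runs is the number of extra forward
    passes in which previously dropped variables are reconsidered.
    """
    selected = []
    for _ in range(runs + 1):
        remaining = [v for v in X if v not in selected]
        added = False
        while remaining:
            cond = [X[s] for s in selected]
            pvals = {v: ci_pvalue(X[v], y, cond) for v in remaining}
            # Early dropping: discard variables that currently look
            # conditionally independent of y given the selected set.
            remaining = [v for v in remaining if pvals[v] <= alpha]
            if not remaining:
                break
            best = min(remaining, key=pvals.get)
            selected.append(best)
            remaining.remove(best)
            added = True
        if not added:
            break
    # Backward phase: remove variables that became redundant.
    while selected:
        pvals = {v: ci_pvalue(X[v], y, [X[s] for s in selected if s != v])
                 for v in selected}
        worst = max(selected, key=pvals.get)
        if pvals[worst] <= alpha:
            break
        selected.remove(worst)
    return selected


# Usage on synthetic data: y depends on x0 and x1 only.
random.seed(1)
data = {f"x{i}": [random.gauss(0, 1) for _ in range(200)] for i in range(5)}
target = [2 * a - 1.5 * b + random.gauss(0, 1)
          for a, b in zip(data["x0"], data["x1"])]
print(fbed(data, target, alpha=0.01, runs=1))
```

The saving comes from the filtering step inside the forward loop: once a variable is dropped, it is not tested again in that pass, so each later iteration runs fewer conditional independence tests than plain forward selection would.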


