Efficient Feature Selection With Large and High-dimensional Data

09/23/2016
by Néhémy Lim, et al.

Driven by advances in technology, large and high-dimensional data have become the rule rather than the exception. Approaches that allow for feature selection with such data are thus highly sought after, particularly because standard methods, such as cross-validated Lasso, can be computationally intractable and, in any case, lack theoretical guarantees. In this paper, we propose a novel approach to feature selection in regression. Consisting of simple optimization steps and tests, it is computationally more efficient than existing methods and is therefore suited even to very large data sets. Moreover, in contrast to standard methods, it is equipped with sharp statistical guarantees. We thus expect that our algorithm can help to leverage the increasing volume of data in Biology, Public Health, Astronomy, Economics, and other fields.

research
10/23/2017

Autoencoder Feature Selector

High-dimensional data in many areas such as computer vision and machine ...
research
10/09/2018

Data-dependent compression of random features for large-scale kernel approximation

Kernel methods offer the flexibility to learn complex relationships in m...
research
06/12/2023

DRCFS: Doubly Robust Causal Feature Selection

Knowing the features of a complex system that are highly relevant to a p...
research
11/10/2014

N^3LARS: Minimum Redundancy Maximum Relevance Feature Selection for Large and High-dimensional Data

We propose a feature selection method that finds non-redundant features ...
research
10/08/2013

Feature Selection Strategies for Classifying High Dimensional Astronomical Data Sets

The amount of collected data in many scientific fields is increasing, al...
research
10/26/2020

BEAR: Sketching BFGS Algorithm for Ultra-High Dimensional Feature Selection in Sublinear Memory

We consider feature selection for applications in machine learning where...
research
10/10/2013

Feature Selection with Annealing for Computer Vision and Big Data Learning

Many computer vision and medical imaging problems are faced with learnin...
