N^3LARS: Minimum Redundancy Maximum Relevance Feature Selection for Large and High-dimensional Data

11/10/2014
by Makoto Yamada, et al.

We propose a feature selection method that finds non-redundant features from large and high-dimensional data in a nonlinear way. Specifically, we propose a nonlinear extension of non-negative least-angle regression (LARS) called N^3LARS, in which the similarity between input and output is measured by a normalized version of the Hilbert-Schmidt Independence Criterion (HSIC). An advantage of N^3LARS is that it can easily be incorporated into MapReduce frameworks such as Hadoop and Spark; with the help of distributed computing, a set of features can thus be selected efficiently from large and high-dimensional data. Moreover, N^3LARS is a convex method and finds a globally optimal solution. The effectiveness of the proposed method is first demonstrated through feature selection experiments for classification and regression on small, high-dimensional datasets. Finally, we evaluate the proposed method on a large and high-dimensional biology dataset.
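At the core of N^3LARS is a normalized HSIC score between each candidate feature and the output, which captures nonlinear dependence through kernels. The sketch below is an illustrative, minimal version of such a score, assuming Gaussian kernels, centered Gram matrices, and normalization by Frobenius norms; the function names and the exact normalization are ours for illustration and may differ from the estimator used in the paper.

```python
import numpy as np

def gaussian_gram(x, sigma=1.0):
    """Gaussian (RBF) Gram matrix for a 1-D feature or output vector."""
    d = x[:, None] - x[None, :]
    return np.exp(-d ** 2 / (2 * sigma ** 2))

def normalized_hsic(x, y, sigma=1.0):
    """Illustrative normalized HSIC-style dependence score between a feature x and output y.

    Gram matrices are centered and scaled to unit Frobenius norm, so the
    score lies in [0, 1]; this is a sketch, not the paper's exact estimator.
    """
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    K = H @ gaussian_gram(x, sigma) @ H
    L = H @ gaussian_gram(y, sigma) @ H
    K /= np.linalg.norm(K, "fro") + 1e-12
    L /= np.linalg.norm(L, "fro") + 1e-12
    return float(np.sum(K * L))                  # tr(KL) for symmetric K, L

# Toy usage: a nonlinearly relevant feature scores higher than pure noise.
rng = np.random.default_rng(0)
y = rng.normal(size=200)
relevant = y ** 2 + 0.1 * rng.normal(size=200)   # nonlinear function of y
noise = rng.normal(size=200)
print(normalized_hsic(relevant, y), normalized_hsic(noise, y))
```

On this toy data the nonlinearly relevant feature receives a noticeably higher score than the noise feature; such per-feature scores are the kind of dependence signal that N^3LARS combines with a LARS-style, non-negative selection path to pick non-redundant features.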


