A General Framework of Nonparametric Feature Selection in High-Dimensional Data

03/30/2021
by Hang Yu, et al.

Nonparametric feature selection in high-dimensional data is an important and challenging problem in statistics and machine learning. Most existing feature selection methods focus on parametric or additive models, which may suffer from model misspecification. In this paper, we propose a new framework for nonparametric feature selection in both regression and classification problems. In this framework, we learn prediction functions through empirical risk minimization over a reproducing kernel Hilbert space. The space is generated by a novel tensor product kernel that depends on a set of parameters determining the importance of the features. Computationally, we minimize the penalized empirical risk to estimate the prediction function and the kernel parameters simultaneously. The solution can be obtained by iteratively solving convex optimization problems. We study the theoretical properties of the kernel feature space and prove both the oracle selection property and the Fisher consistency of our proposed method. Finally, we demonstrate the superior performance of our approach compared to existing methods via extensive simulation studies and an application to a microarray study of eye disease in animals.
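The idea of a kernel whose parameters encode feature importance can be illustrated with a minimal sketch. The abstract does not give the kernel's form, so everything below is an assumption made for illustration: a Gaussian product kernel with one nonnegative weight per feature, squared-error loss, and a ridge penalty. The names `weighted_product_kernel`, `fit_kernel_ridge`, `predict`, and `ridge` are hypothetical. The sketch covers only the step where the weights are held fixed, which is a convex problem; the penalized update of the weights themselves is not shown.

```python
import numpy as np

def weighted_product_kernel(X, Z, weights):
    """Assumed kernel: prod_j exp(-weights[j] * (X[i, j] - Z[k, j])**2).

    A weight of zero makes the j-th factor identically 1, so feature j
    has no influence on the kernel (it is effectively deselected).
    """
    diff = X[:, None, :] - Z[None, :, :]  # shape (n, m, p)
    # The product of per-feature Gaussians is the exponential of the summed exponents.
    return np.exp(-np.einsum("nmp,p->nm", diff ** 2, weights))

def fit_kernel_ridge(X, y, weights, ridge=1e-2):
    """Minimize squared-error empirical risk plus an RKHS-norm penalty, weights fixed."""
    K = weighted_product_kernel(X, X, weights)
    n = len(y)
    # Closed-form solution of the convex problem: (K + n * ridge * I) alpha = y.
    return np.linalg.solve(K + n * ridge * np.eye(n), y)

def predict(X_train, alpha, weights, X_new):
    """Evaluate the fitted function f(x) = sum_i alpha_i * k(x, x_i)."""
    return weighted_product_kernel(X_new, X_train, weights) @ alpha

# Toy data (synthetic, for illustration only): the response depends on feature 0 alone.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
y = np.sin(2.0 * X[:, 0])

# Hypothetical weights that keep feature 0 and switch off the rest.
weights = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
alpha = fit_kernel_ridge(X, y, weights)
fitted = predict(X, alpha, weights, X)
```

With these weights, changing the values of features 1 through 4 leaves the predictions unchanged, which is the sense in which a zero weight removes a feature from the model.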


research
02/12/2019

Sparse Feature Selection in Kernel Discriminant Analysis via Optimal Scoring

We consider the two-group classification problem and propose a kernel cl...
research
07/24/2023

Nonparametric Linear Feature Learning in Regression Through Regularisation

Representation learning plays a crucial role in automated feature select...
research
10/04/2022

Robust self-healing prediction model for high dimensional data

Owing to the advantages of increased accuracy and the potential to detec...
research
07/31/2017

Consistent Nonparametric Different-Feature Selection via the Sparsest k-Subgraph Problem

Two-sample feature selection is the problem of finding features that des...
research
08/28/2021

Feature Selection in High-dimensional Space Using Graph-Based Methods

High-dimensional feature selection is a central problem in a variety of ...
research
07/12/2020

Simultaneous Feature Selection and Outlier Detection with Optimality Guarantees

Sparse estimation methods capable of tolerating outliers have been broad...
research
03/31/2014

Sparse K-Means with ℓ_∞/ℓ_0 Penalty for High-Dimensional Data Clustering

Sparse clustering, which aims to find a proper partition of an extremely...
