Data-Centric AI: Deep Generative Differentiable Feature Selection via Discrete Subsetting as Continuous Embedding Space Optimization

02/26/2023
by Xiao Meng, et al.

Feature selection (FS), which includes filter, wrapper, and embedded methods, aims to find the optimal feature subset for a given downstream task. In practice, however, 1) the criteria of FS vary across domains, and 2) FS is brittle when data are high-dimensional with a small sample size. Can selected feature subsets be more generalized, accurate, and agnostic to input dimensionality? We generalize this problem into a deep differentiable feature selection task and propose a new perspective: discrete feature subsetting as continuous embedding space optimization. We develop a generic and principled framework that includes a deep feature subset encoder, an accuracy evaluator, a decoder, and a gradient ascent optimizer. The framework implements four steps: 1) preparation of feature-accuracy training data; 2) deep feature subset embedding; 3) gradient-optimized search; and 4) feature subset reconstruction. We develop new technical insights: reinforcement as a training data generator; ensembles of diverse peer and exploratory feature selector knowledge for generalization; and an effective embedding from feature subsets to continuous space, trained by jointly optimizing reconstruction and accuracy losses to select accurate features. Experimental results demonstrate the effectiveness of the proposed method.
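To make steps 3 and 4 of the framework concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes the encoder, evaluator, and decoder are already available and stands in fixed random linear maps for the trained deep networks; all names (`W_enc`, `w_eval`, `W_dec`, `gradient_ascent_search`) and the sigmoid evaluator are hypothetical choices made only for illustration.

```python
import numpy as np

# Hypothetical toy setup: random linear maps stand in for trained networks.
rng = np.random.default_rng(0)
n_features, emb_dim = 8, 4
W_enc = rng.normal(size=(emb_dim, n_features))   # stand-in encoder weights
w_eval = rng.normal(size=emb_dim)                # stand-in evaluator weights
W_dec = rng.normal(size=(n_features, emb_dim))   # stand-in decoder weights


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def encode(mask):
    """Map a binary feature mask to a continuous embedding."""
    return W_enc @ mask


def evaluate(z):
    """Predict a downstream accuracy in (0, 1) from an embedding."""
    return sigmoid(w_eval @ z)


def evaluator_grad(z):
    """Gradient of the predicted accuracy with respect to the embedding."""
    s = evaluate(z)
    return s * (1.0 - s) * w_eval


def gradient_ascent_search(z0, step=0.1, n_steps=20):
    """Step 3: move the embedding uphill on the predicted accuracy."""
    z = z0.copy()
    for _ in range(n_steps):
        z = z + step * evaluator_grad(z)
    return z


def decode(z, threshold=0.5):
    """Step 4: reconstruct a binary feature mask from an embedding."""
    return (sigmoid(W_dec @ z) > threshold).astype(int)


# Start from an arbitrary feature subset, search, then reconstruct.
start_mask = np.array([1, 0, 1, 0, 0, 1, 0, 0], dtype=float)
z0 = encode(start_mask)
z_best = gradient_ascent_search(z0)
new_mask = decode(z_best)
```

In this sketch the search never touches the discrete mask directly: the subset is embedded once, improved by gradient steps on the evaluator's prediction, and only then mapped back to a discrete selection. With untrained random maps the decoded subset carries no meaning; the point is only the control flow.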


research
08/08/2017

An Effective Feature Selection Method Based on Pair-Wise Feature Proximity for High Dimensional Low Sample Size Data

Feature selection has been studied widely in the literature. However, th...
research
04/04/2019

Cost-Sensitive Feature Selection by Optimizing F-Measures

Feature selection is beneficial for improving the performance of general...
research
04/01/2022

i-Razor: A Neural Input Razor for Feature Selection and Dimension Search in Large-Scale Recommender Systems

Input features play a crucial role in the predictive performance of DNN-...
research
07/21/2021

Differentiable Feature Selection, a Reparameterization Approach

We consider the task of feature selection for reconstruction which consi...
research
08/27/2019

Feature Gradients: Scalable Feature Selection via Discrete Relaxation

In this paper we introduce Feature Gradients, a gradient-based search al...
research
10/17/2020

DIFER: Differentiable Automated Feature Engineering

Feature engineering, a crucial step of machine learning, aims to extract...
research
07/09/2020

Let the Data Choose its Features: Differentiable Unsupervised Feature Selection

Scientific observations often consist of a large number of variables (fe...
