Let the Data Choose its Features: Differentiable Unsupervised Feature Selection

07/09/2020
by Ofir Lindenbaum, et al.

Scientific observations often consist of a large number of variables (features). Identifying a subset of meaningful features is often ignored in unsupervised learning, despite its potential for unraveling clear patterns hidden in the ambient space. In this paper, we present a method for unsupervised feature selection, tailored for the task of clustering. We propose a differentiable loss function which combines the graph Laplacian with a gating mechanism based on continuous approximation of Bernoulli random variables. The Laplacian is used to define a scoring term that favors low-frequency features, while the parameters of the Bernoulli variables are trained to enable selection of the most informative features. We mathematically motivate the proposed approach and demonstrate that in the high noise regime, it is crucial to compute the Laplacian on the gated inputs, rather than on the full feature set. Experimental demonstration of the efficacy of the proposed approach and its advantage over current baselines is provided using several real-world examples.
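The kind of loss described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the gate relaxation, the kernel, or the regularizer. The sketch assumes a clipped-Gaussian relaxation of the Bernoulli gates, a Gaussian kernel with a hand-picked `bandwidth`, a row-normalized (random-walk) affinity matrix as the Laplacian-based smoothness operator, and a penalty on the expected number of open gates weighted by `lam`. All function and parameter names are hypothetical. The one point taken directly from the abstract is that the graph is built from the gated inputs rather than from the full feature set.

```python
import math
import numpy as np

def gaussian_gates(mu, sigma=0.5, rng=None):
    """Assumed continuous relaxation of Bernoulli gates: clip(mu + sigma*eps, 0, 1).
    With rng=None the noise is dropped and the gates are deterministic."""
    eps = 0.0 if rng is None else rng.standard_normal(mu.shape)
    return np.clip(mu + sigma * eps, 0.0, 1.0)

def random_walk_matrix(X, bandwidth=1.0):
    """Row-normalized Gaussian affinity matrix P = D^{-1} K built from the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-d2 / (2.0 * bandwidth ** 2))
    return K / K.sum(axis=1, keepdims=True)

def gated_laplacian_loss(X, mu, lam=1.0, sigma=0.5, bandwidth=1.0, rng=None):
    """Negative smoothness of the gated features on a graph computed from the
    gated inputs, plus lam times the expected number of open gates."""
    z = gaussian_gates(mu, sigma, rng)
    Xg = X * z                                # gate each feature (column)
    P = random_walk_matrix(Xg, bandwidth)     # graph uses gated data only
    smoothness = np.trace(Xg.T @ P @ Xg) / X.shape[0]
    # P(mu + sigma*eps > 0) = Phi(mu / sigma), the standard normal CDF
    open_prob = np.array(
        [0.5 * (1.0 + math.erf(m / (sigma * math.sqrt(2.0)))) for m in mu]
    )
    return -smoothness + lam * open_prob.sum()

# Toy data: feature 0 separates two clusters, features 1-4 are pure noise.
rng = np.random.default_rng(0)
n = 40
informative = np.where(np.arange(n) < n // 2, 2.0, -2.0) + 0.1 * rng.standard_normal(n)
X = np.column_stack([informative, rng.standard_normal((n, 4))])

mu_informative = np.array([1.0, 0.0, 0.0, 0.0, 0.0])  # only feature 0 open
mu_noise = np.array([0.0, 1.0, 0.0, 0.0, 0.0])        # only a noise feature open

loss_informative = gated_laplacian_loss(X, mu_informative)
loss_noise = gated_laplacian_loss(X, mu_noise)
print(loss_informative < loss_noise)
```

On this toy data the gate configuration that keeps the cluster-separating feature gets the lower loss, because that feature varies slowly over the graph built from it. The features are not standardized here, so part of the gap reflects scale; a practical pipeline would likely normalize features first and optimize `mu` by gradient descent in an autodiff framework rather than compare two fixed configurations.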

research
10/11/2021

Deep Unsupervised Feature Selection by Discarding Nuisance and Correlated Features

Modern datasets often contain large subsets of correlated features and n...
research
03/16/2023

Multi-modal Differentiable Unsupervised Feature Selection

Multi-modal high throughput biological data presents a great scientific ...
research
07/05/2020

Block Model Guided Unsupervised Feature Selection

Feature selection is a core area of data mining with a recent innovation...
research
11/08/2018

Spectral Simplicial Theory for Feature Selection and Applications to Genomics

The scale and complexity of modern data sets and the limitations associa...
research
07/02/2021

Few-shot Learning for Unsupervised Feature Selection

We propose a few-shot learning method for unsupervised feature selection...
research
06/10/2017

Stepwise regression for unsupervised learning

I consider unsupervised extensions of the fast stepwise linear regressio...
research
02/26/2023

Data-Centric AI: Deep Generative Differentiable Feature Selection via Discrete Subsetting as Continuous Embedding Space Optimization

Feature Selection (FS), such as filter, wrapper, and embedded methods, a...
