Stochastic Subset Selection

06/25/2020
by Tuan A. Nguyen, et al.

Current machine learning algorithms are designed to work with huge volumes of high-dimensional data such as images. However, these algorithms are increasingly deployed on resource-constrained systems such as mobile devices and embedded systems. Even where large computing infrastructure is available, the size of each data instance, as well as of whole datasets, can create a major bottleneck in data transfer across communication channels. There is also a strong incentive, in both energy and monetary terms, to reduce the computational and memory requirements of these algorithms. For non-parametric models, which must access the stored training data at inference time, the increased memory and computation cost can be even more problematic. In this work, we aim to reduce the volume of data these algorithms must process through an end-to-end, two-stage neural subset selection model: the first stage selects a set of candidate points using a conditionally independent Bernoulli mask, and the second stage performs iterative coreset selection via a conditional Categorical distribution. The subset selection model is trained by meta-learning over a distribution of sets. We validate our method on set reconstruction and classification tasks with feature selection, as well as on the selection of representative samples from a given dataset; on these tasks our method outperforms relevant baselines. Our experiments also show that our method enhances the scalability of non-parametric models such as Neural Processes.
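To make the two-stage idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the Bernoulli and Categorical parameters come from two hypothetical scoring functions, `candidate_logit` and `relevance_logit`, which stand in for the learned neural networks described in the abstract. All function and parameter names here are illustrative assumptions.

```python
import math
import random


def softmax(logits):
    """Turn a list of logits into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def candidate_mask(logits, rng):
    """Stage 1: one independent Bernoulli draw per element."""
    probs = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    return [rng.random() < p for p in probs]


def select_subset(points, k, candidate_logit, relevance_logit, seed=0):
    """Return indices of at most k selected points.

    candidate_logit(p) and relevance_logit(p, chosen_points) are
    hypothetical stand-ins for learned networks (an assumption).
    """
    rng = random.Random(seed)

    # Stage 1: keep each point independently with its own probability.
    mask = candidate_mask([candidate_logit(p) for p in points], rng)
    candidates = [i for i, keep in enumerate(mask) if keep]

    # Stage 2: repeatedly sample one candidate from a Categorical
    # distribution conditioned on the points chosen so far.
    chosen = []
    while candidates and len(chosen) < k:
        context = [points[j] for j in chosen]
        logits = [relevance_logit(points[i], context) for i in candidates]
        pick = rng.choices(range(len(candidates)), weights=softmax(logits), k=1)[0]
        chosen.append(candidates.pop(pick))
    return chosen


if __name__ == "__main__":
    data = [0.0, 0.1, 0.2, 5.0, 5.1, 9.9]

    # Illustrative scorer: prefer points far from those already chosen.
    def far_from_chosen(p, context):
        return min((abs(p - q) for q in context), default=0.0)

    print(select_subset(data, 3, lambda p: 2.0, far_from_chosen))
```

In this sketch the first stage cheaply narrows the set, so the more expensive conditional sampling in the second stage only runs over the surviving candidates. The distance-based scorer in the usage example is only a placeholder for a learned model.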


