Leveraging Importance Weights in Subset Selection

01/28/2023
by Gui Citovsky, et al.

We present a subset selection algorithm designed to work with arbitrary model families in a practical batch setting. In such a setting, an algorithm can sample examples one at a time but, in order to limit overhead costs, can only update its state (i.e., further train model weights) once a large enough batch of examples has been selected. Our algorithm, IWeS, selects examples by importance sampling, where the sampling probability assigned to each example is based on the entropy of models trained on previously selected batches. IWeS achieves significant performance improvements over other subset selection algorithms on seven publicly available datasets. It is also competitive in an active learning setting, where label information is not available at selection time. Finally, we provide an initial theoretical analysis supporting our importance-weighting approach, proving generalization and sampling-rate bounds.
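To make the selection step concrete, here is a minimal sketch of entropy-based importance sampling with inverse-probability weights, assuming a model that outputs class probabilities for each example in the pool. The paper specifies the exact sampling distribution and weighting scheme; the names below (`entropy`, `iwes_select_batch`) are hypothetical illustrations, not taken from the paper.

```python
import numpy as np

def entropy(probs, eps=1e-12):
    # Shannon entropy of each row of a (n_examples, n_classes)
    # matrix of predicted class probabilities.
    p = np.clip(probs, eps, 1.0)
    return -np.sum(p * np.log(p), axis=1)

def iwes_select_batch(model_probs, batch_size, rng=None):
    # Sample a batch with probability proportional to predictive
    # entropy, and return inverse-probability importance weights
    # that correct for the sampling bias (hypothetical simplification
    # of the scheme described in the paper).
    rng = np.random.default_rng() if rng is None else rng
    h = entropy(model_probs)
    q = h / h.sum()                    # sampling distribution over the pool
    idx = rng.choice(len(q), size=batch_size, replace=False, p=q)
    weights = 1.0 / (len(q) * q[idx])  # uniform target / sampling probability
    return idx, weights

# Example: select 256 examples from a pool scored by the current model.
pool_probs = np.random.dirichlet(np.ones(10), size=5000)  # stand-in predictions
batch_idx, batch_weights = iwes_select_batch(pool_probs, batch_size=256)
```

Under this sketch, the returned weights would multiply the per-example loss when the model is further trained on the selected batch, keeping the weighted objective an unbiased estimate of the full-pool objective.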


Related research

- Batch Active Learning at Scale (07/29/2021): The ability to train complex and highly effective models often requires ...
- Diverse mini-batch Active Learning (01/17/2019): We study the problem of reducing the amount of labeled training data req...
- Privacy Amplification via Importance Sampling (07/05/2023): We examine the privacy-enhancing properties of subsampling a data set vi...
- Active Learning for Convolutional Neural Networks: A Core-Set Approach (08/01/2017): Convolutional neural networks (CNNs) have been successfully applied to m...
- ScatterSample: Diversified Label Sampling for Data Efficient Graph Neural Network Learning (06/09/2022): What target labels are most effective for graph neural network (GNN) tra...
- Optimal Dynamic Subset Sampling: Theory and Applications (05/30/2023): We study the fundamental problem of sampling independent events, called ...
- Diversity-Aware Batch Active Learning for Dependency Parsing (04/28/2021): While the predictive performance of modern statistical dependency parser...
