Training Subset Selection for Weak Supervision

06/06/2022
by Hunter Lang, et al.

Existing weak supervision approaches use all the data covered by weak signals to train a classifier. We show both theoretically and empirically that this is not always optimal. Intuitively, there is a tradeoff between the amount of weakly-labeled data and the precision of the weak labels. We explore this tradeoff by combining pretrained data representations with the cut statistic (Muhlenbach et al., 2004) to select (hopefully) high-quality subsets of the weakly-labeled training data. Subset selection applies to any label model and classifier and is very simple to plug into existing weak supervision pipelines, requiring just a few lines of code. We show that our subset selection method improves the performance of weak supervision for a wide range of label models, classifiers, and datasets. Using less weakly-labeled data improves the accuracy of weak supervision pipelines by up to 19% (absolute) on benchmark tasks.
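
To make the idea of cut-statistic subset selection concrete, here is a minimal sketch, not the authors' implementation. It assumes `features` holds pretrained representations as plain lists of floats and `weak_labels` holds the label model's hard labels. The function name `cut_statistic_select`, the parameters `k` and `keep_fraction`, the Euclidean distance, the edge weight 1 / (1 + distance), and the brute-force neighbor search are all illustrative choices made for this example. Each point is scored by how much more often its nearest neighbors disagree with its weak label than would be expected if labels were assigned at random from the class marginals; points with the lowest scores are kept.

```python
import math
from collections import Counter


def cut_statistic_select(features, weak_labels, k=3, keep_fraction=0.5):
    """Return sorted indices of the points with the lowest cut-statistic z-scores.

    Hypothetical helper for illustration; all names and defaults are assumptions.
    """
    n = len(features)
    if n < 2:
        return list(range(n))
    k = min(k, n - 1)

    # Empirical class marginals of the weak labels (the null label distribution).
    counts = Counter(weak_labels)
    prior = {label: c / n for label, c in counts.items()}

    scores = []
    for i in range(n):
        # Brute-force k nearest neighbors by Euclidean distance (O(n^2) overall).
        nearest = sorted(
            (math.dist(features[i], features[j]), j) for j in range(n) if j != i
        )[:k]
        # Assumed edge weight: closer neighbors count more.
        weights = [(1.0 / (1.0 + d), j) for d, j in nearest]

        # Observed total weight of edges to neighbors with a different weak label.
        cut = sum(w for w, j in weights if weak_labels[j] != weak_labels[i])

        # Mean and variance of that sum if neighbor labels were drawn i.i.d.
        # from the class marginals.
        p = prior[weak_labels[i]]
        mean = (1.0 - p) * sum(w for w, _ in weights)
        var = p * (1.0 - p) * sum(w * w for w, _ in weights)

        z = (cut - mean) / math.sqrt(var) if var > 0 else 0.0
        scores.append((z, i))

    # Keep the requested fraction of points with the smallest z-scores.
    n_keep = max(1, int(round(keep_fraction * n)))
    scores.sort()
    return sorted(i for _, i in scores[:n_keep])


if __name__ == "__main__":
    # Two well-separated clusters; index 3 carries a weak label that
    # disagrees with all of its neighbors.
    feats = [(0, 0), (0, 1), (1, 0), (1, 1),
             (10, 10), (10, 11), (11, 10), (11, 11)]
    labels = [0, 0, 0, 1, 1, 1, 1, 1]
    print(cut_statistic_select(feats, labels, k=3, keep_fraction=0.875))
```

In this toy run the point at index 3 gets the highest score and is the one dropped. The selected indices would then be used to train the end classifier on the corresponding weak labels; `keep_fraction` plays the role of the data-versus-precision tradeoff described above and would normally be tuned on a validation set.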


research
02/05/2020

Limitations of weak labels for embedding and tagging

While many datasets and approaches in ambient sound analysis use weakly ...
research
10/26/2020

Meta-Learning for Neural Relation Classification with Distant Supervision

Distant supervision provides a means to create a large number of weakly ...
research
03/30/2023

Mitigating Source Bias for Fairer Weak Supervision

Weak supervision overcomes the label bottleneck, enabling efficient deve...
research
11/16/2018

Coupling weak and strong supervision for classification of prostate cancer histopathology images

Automated grading of prostate cancer histopathology images is a challeng...
research
01/20/2022

Predictive Inference with Weak Supervision

The expense of acquiring labels in large-scale statistical machine learn...
research
11/01/2019

Novelty Detection and Learning from Extremely Weak Supervision

In this paper we offer a method and algorithm, which make possible fully...
research
09/05/2018

Learning Concept Abstractness Using Weak Supervision

We introduce a weakly supervised approach for inferring the property of ...
