The Weak Supervision Landscape

03/30/2022
by Rafael Poyiadzi, et al.

There are many ways of annotating a dataset for machine learning classification tasks that go beyond the usual class labels. These are of interest because they can simplify or facilitate the collection of annotations without greatly affecting the resulting machine learning model. Many of them fall under the umbrella term of weak labels or weak annotations. However, it is not always clear how the different alternatives relate to one another. In this paper we propose a framework for categorising weak supervision settings with two aims: (1) to help the dataset owner or annotator navigate the available options within weak supervision when prescribing an annotation process, and (2) to describe existing annotations for a dataset to machine learning practitioners so that they can understand the implications for the learning process. To this end, we identify the key elements that characterise weak supervision and devise a series of dimensions that categorise most of the existing approaches. We show how common settings in the literature fit within the framework and discuss its possible uses in practice.

