Improve Learning from Crowds via Generative Augmentation

07/22/2021
by Zhendong Chu, et al.

Crowdsourcing provides an efficient label collection scheme for supervised machine learning. However, to control annotation cost, each instance in crowdsourced data is typically annotated by only a small number of annotators. This creates a sparsity issue and limits the quality of machine learning models trained on such data. In this paper, we study how to handle sparsity in crowdsourced data using data augmentation. Specifically, we propose to directly learn a classifier by augmenting the raw sparse annotations. We implement two principles of high-quality augmentation using Generative Adversarial Networks: 1) the generated annotations should follow the distribution of authentic ones, as measured by a discriminator; 2) the generated annotations should have high mutual information with the ground-truth labels, as measured by an auxiliary network. Extensive experiments and comparisons against an array of state-of-the-art learning-from-crowds methods on three real-world datasets demonstrate the effectiveness of our data augmentation framework and show the potential of our algorithm for low-budget crowdsourcing in general.
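The two augmentation principles in the abstract can be illustrated with a toy generator objective. The sketch below is not the paper's implementation: the function name `generator_loss`, the parameter `mi_weight`, and the use of an InfoGAN-style cross-entropy term as a stand-in for the mutual-information principle are all assumptions made for illustration. It takes the discriminator's and auxiliary network's outputs as plain Python lists rather than training any network.

```python
import math

def generator_loss(disc_probs, aux_label_probs, cond_labels, mi_weight=1.0):
    """Toy generator objective for annotation augmentation (illustrative only).

    disc_probs      -- discriminator's probability that each generated
                       annotation is authentic (principle 1)
    aux_label_probs -- auxiliary network's class distribution for each
                       generated annotation (principle 2)
    cond_labels     -- class index each generated annotation was conditioned on
    mi_weight       -- hypothetical trade-off weight between the two terms
    """
    eps = 1e-12  # avoids log(0)
    n = len(disc_probs)
    # Principle 1: the generator is rewarded when the discriminator
    # believes its annotations are authentic.
    adv = -sum(math.log(p + eps) for p in disc_probs) / n
    # Principle 2 (assumed surrogate): cross-entropy between the auxiliary
    # network's prediction and the conditioning label. Minimizing it raises
    # a variational lower bound on the mutual information.
    mi = -sum(math.log(q[y] + eps)
              for q, y in zip(aux_label_probs, cond_labels)) / n
    return adv + mi_weight * mi, adv, mi
```

Under these assumptions, a generated annotation that fools the discriminator and lets the auxiliary network recover its label yields a lower loss than one that does neither, which is the behavior the two principles ask for.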

research
07/11/2021

Learning from Crowds with Sparse and Imbalanced Annotations

Traditional supervised learning requires ground truth labels for the tra...
research
11/02/2017

Data Augmentation in Emotion Classification Using Generative Adversarial Networks

It is a difficult task to classify images with multiple class labels usi...
research
12/24/2020

Learning from Crowds by Modeling Common Confusions

Crowdsourcing provides a practical way to obtain large amounts of labele...
research
04/07/2020

Learning from Imperfect Annotations

Many machine learning systems today are trained on large amounts of huma...
research
09/06/2019

An Auxiliary Classifier Generative Adversarial Framework for Relation Extraction

Relation extraction models suffer from limited qualified training data. ...
research
01/13/2021

Sequential IoT Data Augmentation using Generative Adversarial Networks

Sequential data in industrial applications can be used to train and eval...
research
10/10/2019

Unconstrained Road Marking Recognition with Generative Adversarial Networks

Recent road marking recognition has achieved great success in the past f...
