Multi-Label Annotation Aggregation in Crowdsourcing

06/19/2017
by Xuan Wei, et al.

As a means of human-based computation, crowdsourcing has been widely used to annotate large-scale unlabeled datasets. One obvious challenge is how to aggregate the possibly noisy labels provided by a set of heterogeneous annotators. Another is evaluating annotator reliability without knowing the ground truth; such reliability estimates can be used to build incentive mechanisms in crowdsourcing platforms. When each instance can be associated with many labels simultaneously, the problem becomes even harder because of its combinatorial nature. In this paper, we present new, flexible Bayesian models and efficient inference algorithms for multi-label annotation aggregation that take both annotator reliability and label dependency into account. Extensive experiments on real-world datasets confirm that the proposed methods outperform competitive alternatives and that the model can recover the types of the annotators with high accuracy. In addition, we empirically find that a mixture of multiple independent Bernoulli distributions is able to accurately capture label dependency in this unsupervised multi-label annotation aggregation scenario.
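
To illustrate why a mixture of independent Bernoulli distributions can capture label dependency at all, the following minimal sketch computes the exact covariance between two labels under such a mixture. It is not the paper's model or inference algorithm; the function name `mixture_label_covariance` and the toy weights and per-component probabilities are assumptions chosen purely for illustration.

```python
def mixture_label_covariance(weights, means, i, j):
    """Exact covariance between two distinct labels i and j (i != j).

    Assumed setup (illustrative, not taken from the paper):
      weights[k]  - mixing proportion of component k (sums to 1)
      means[k][l] - probability that label l is on within component k
    Within a component the labels are independent, so
    E[x_i * x_j | k] = means[k][i] * means[k][j].
    """
    mean_i = sum(w * m[i] for w, m in zip(weights, means))
    mean_j = sum(w * m[j] for w, m in zip(weights, means))
    joint = sum(w * m[i] * m[j] for w, m in zip(weights, means))
    return joint - mean_i * mean_j


# Two hypothetical components: one where both labels are likely,
# one where both are unlikely.
two_components = mixture_label_covariance(
    weights=[0.5, 0.5],
    means=[[0.9, 0.9], [0.1, 0.1]],
    i=0,
    j=1,
)

# A single component with the same marginals (0.5 each) has no dependency.
one_component = mixture_label_covariance(
    weights=[1.0],
    means=[[0.5, 0.5]],
    i=0,
    j=1,
)

print(two_components)  # approximately 0.16: the labels co-occur
print(one_component)   # 0.0: independent labels
```

Although the labels are independent within each component, mixing over components makes them correlated marginally, which is the property the abstract's empirical finding relies on.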
