Regularized Minimax Conditional Entropy for Crowdsourcing

03/25/2015
by Dengyong Zhou, et al.

There is rapidly increasing interest in crowdsourcing for data labeling. By crowdsourcing, a large number of labels can often be gathered quickly and at low cost. However, the labels provided by crowdsourcing workers are usually not of high quality. In this paper, we propose a minimax conditional entropy principle to infer ground truth from noisy crowdsourced labels. Under this principle, we derive a unique probabilistic labeling model jointly parameterized by worker ability and item difficulty. We also propose an objective measurement principle and show that our method is the only one that satisfies it. We validate our method on a variety of real crowdsourcing datasets with binary, multiclass, or ordinal labels.
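
The abstract only outlines the model, so below is a minimal illustrative sketch of how a labeling model jointly parameterized by worker ability and item difficulty might be estimated: an alternating (EM-style) loop in which each worker carries a confusion-style parameter and each item a difficulty parameter, combined through a softmax over possible answers. All names and shapes (`aggregate`, `sigma`, `tau`, `q`, the learning rate, the iteration count) are assumptions made for illustration; this is not the paper's exact regularized minimax conditional entropy formulation.

```python
# Illustrative sketch only: hypothetical names and shapes throughout,
# not the paper's exact regularized minimax conditional entropy method.
import numpy as np
from scipy.special import logsumexp

def aggregate(labels, n_classes, n_iters=50, lr=0.1):
    """Estimate true labels from (item, worker, observed_label) triples.

    Each worker j gets a confusion-style parameter sigma[j, c, k] (ability)
    and each item i a parameter tau[i, c, k] (difficulty); the model assumes
    P(answer = k | truth = c) is a softmax over sigma[j, c, :] + tau[i, c, :].
    """
    labels = np.asarray(labels, dtype=int)
    items, workers, obs = labels[:, 0], labels[:, 1], labels[:, 2]
    n_items, n_workers = items.max() + 1, workers.max() + 1

    sigma = np.zeros((n_workers, n_classes, n_classes))   # worker ability
    tau = np.zeros((n_items, n_classes, n_classes))       # item difficulty

    # q[i, c]: belief that item i's true label is c, initialized by soft voting
    q = np.full((n_items, n_classes), 1e-6)
    np.add.at(q, (items, obs), 1.0)
    q /= q.sum(axis=1, keepdims=True)

    onehot = np.eye(n_classes)[obs]                        # (n_obs, C)
    for _ in range(n_iters):
        # Per-observation log P(answer = k | truth = c) under current params
        logits = sigma[workers] + tau[items]               # (n_obs, C, C)
        p = np.exp(logits - logsumexp(logits, axis=2, keepdims=True))

        # "M-step": one gradient-ascent step on the expected log-likelihood
        grad = q[items][:, :, None] * (onehot[:, None, :] - p)
        g_sigma = np.zeros_like(sigma)
        g_tau = np.zeros_like(tau)
        np.add.at(g_sigma, workers, grad)
        np.add.at(g_tau, items, grad)
        sigma += lr * g_sigma
        tau += lr * g_tau

        # "E-step": refresh label beliefs with the updated parameters
        logits = sigma[workers] + tau[items]
        logp = logits - logsumexp(logits, axis=2, keepdims=True)
        scores = np.zeros((n_items, n_classes))
        np.add.at(scores, items, logp[np.arange(len(obs)), :, obs])
        q = np.exp(scores - scores.max(axis=1, keepdims=True))
        q /= q.sum(axis=1, keepdims=True)

    return q.argmax(axis=1), q
```

Calling `aggregate(triples, n_classes=2)` on a small synthetic dataset with a planted ground truth and comparing the returned label estimates against it is enough to sanity-check the sketch. The split into `sigma` and `tau` mirrors the abstract's joint parameterization by worker ability and item difficulty; the paper's actual objective additionally regularizes these parameters, which is omitted here.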

Related research

02/13/2018 · Analysis of Minimax Error Rate for Crowdsourcing and Its Application to Worker Clustering Model
While crowdsourcing has become an important means to label data, crowdwo...

03/01/2020 · GPM: A Generic Probabilistic Model to Recover Annotator's Behavior and Ground Truth Labeling
In the big data era, data labeling can be obtained through crowdsourcing...

02/14/2023 · A Provably Improved Algorithm for Crowdsourcing with Hard and Easy Tasks
Crowdsourcing is a popular method used to estimate ground-truth labels b...

04/30/2013 · Inferring ground truth from multi-annotator ordinal data: a probabilistic approach
A popular approach for large scale data annotation tasks is crowdsourcin...

12/29/2022 · Recovering Top-Two Answers and Confusion Probability in Multi-Choice Crowdsourcing
Crowdsourcing has emerged as an effective platform to label a large volu...

03/07/2023 · Crowdsourcing in Precision Healthcare: Short Review
The age of deep learning has brought high-performing diagnostic models f...

08/01/2018 · How Does Tweet Difficulty Affect Labeling Performance of Annotators?
Crowdsourcing is a popular means to obtain labeled data at moderate cost...
