A Provably Improved Algorithm for Crowdsourcing with Hard and Easy Tasks

02/14/2023
by Seo Taek Kong, et al.

Crowdsourcing is a popular method for estimating ground-truth labels from noisy labels collected from workers. In this work, we are motivated by crowdsourcing applications where each worker can exhibit two levels of accuracy depending on a task's type. Applying algorithms designed for the traditional Dawid-Skene model to such a scenario yields performance that is limited by the hard tasks. Therefore, we first extend the model to allow worker accuracy to vary depending on a task's unknown type. We then propose a spectral method to partition tasks by type. Once tasks are separated by type, any Dawid-Skene algorithm (i.e., any algorithm designed for the Dawid-Skene model) can be applied independently to each type to infer the true labels. We prove that when crowdsourced data contain tasks with varying levels of difficulty, our algorithm infers the true labels with higher accuracy than any Dawid-Skene algorithm. Experiments show that our method is effective in practical applications.
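The two-stage pipeline described above (partition tasks by type, then run a Dawid-Skene algorithm on each group) can be illustrated with a minimal sketch. The abstract does not specify the spectral method, so everything below is an assumption made for illustration: binary labels in {-1, +1}, a fully observed worker-by-task matrix `A`, a partition rule that thresholds the magnitude of the top right singular vector, and a simple one-coin spectral estimator standing in for "any Dawid-Skene algorithm". The function names `partition_tasks_by_type`, `spectral_one_coin`, `infer_labels`, and `simulate` are hypothetical and are not taken from the paper.

```python
import numpy as np


def partition_tasks_by_type(A):
    """Assumed heuristic: split tasks in two using the top right singular vector.

    A is a (workers x tasks) matrix of +/-1 labels. Tasks on which workers
    agree more strongly get a larger-magnitude entry, suggesting "easy".
    """
    _, _, vt = np.linalg.svd(A, full_matrices=False)
    score = np.abs(vt[0])
    lo, hi = score.min(), score.max()
    for _ in range(20):  # 1-D two-means on the scores
        is_easy = np.abs(score - hi) < np.abs(score - lo)
        if is_easy.all() or not is_easy.any():
            break
        lo, hi = score[~is_easy].mean(), score[is_easy].mean()
    return is_easy


def spectral_one_coin(A):
    """Stand-in Dawid-Skene algorithm: weight workers by the top left singular vector."""
    u, _, _ = np.linalg.svd(A, full_matrices=False)
    w = u[:, 0]
    if w.sum() < 0:  # assumption: most workers are better than random
        w = -w
    return np.where(w @ A >= 0, 1, -1)


def infer_labels(A, ds_algorithm=spectral_one_coin):
    """Partition tasks by estimated type, then label each group independently."""
    is_easy = partition_tasks_by_type(A)
    labels = np.empty(A.shape[1], dtype=int)
    for mask in (is_easy, ~is_easy):
        if mask.any():
            labels[mask] = ds_algorithm(A[:, mask])
    return labels, is_easy


def simulate(n_workers=100, n_tasks=200, seed=0):
    """Synthetic data: each worker has one accuracy on easy tasks, another on hard."""
    rng = np.random.default_rng(seed)
    truth = rng.choice([-1, 1], size=n_tasks)
    is_easy = np.arange(n_tasks) < n_tasks // 2  # first half easy, second half hard
    p_easy = rng.uniform(0.80, 0.95, size=n_workers)
    p_hard = rng.uniform(0.55, 0.65, size=n_workers)
    p = np.where(is_easy[None, :], p_easy[:, None], p_hard[:, None])
    correct = rng.random((n_workers, n_tasks)) < p
    A = np.where(correct, truth[None, :], -truth[None, :])
    return A, truth, is_easy


if __name__ == "__main__":
    A, truth, is_easy_true = simulate()
    labels, is_easy_hat = infer_labels(A)
    print("type partition accuracy:", (is_easy_hat == is_easy_true).mean())
    print("label accuracy:", (labels == truth).mean())
```

Passing a different callable as `ds_algorithm` swaps in another per-group estimator, which mirrors the abstract's point that the partition step is independent of the Dawid-Skene algorithm applied afterwards. The synthetic accuracies and matrix sizes are arbitrary choices for the demonstration and say nothing about the paper's experimental setup.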
