Unifying Framework for Crowd-sourcing via Graphon Estimation

by Christina E. Lee, et al.

We consider the question of inferring the true answers associated with tasks based on potentially noisy answers obtained through a micro-task crowd-sourcing platform such as Amazon Mechanical Turk. We propose a generic, non-parametric model for this setting: for a given task i, 1 ≤ i ≤ T, the response of worker j, 1 ≤ j ≤ W, for this task is correct with probability F_ij, where the matrix F = [F_ij]_{i≤T, j≤W} may satisfy one of a collection of regularity conditions: low rank, which can express the popular Dawid-Skene model; piecewise constant, which occurs when there are finitely many worker and task types; monotonic under permutation, when there is some ordering of worker skills and task difficulties; or Lipschitz with respect to an associated latent non-parametric function. To the best of our knowledge, this model contains most, if not all, of the previously proposed models. We show that the question of estimating the true answers to tasks can be reduced to solving the graphon estimation problem, for which there has been much recent progress. By leveraging these techniques, we provide a crowdsourcing inference algorithm along with theoretical bounds on the fraction of incorrectly estimated tasks. Consequently, we obtain a solution for inferring the true answers for tasks using noisy answers collected from a crowd-sourcing platform under a significantly larger class of models. Concretely, we establish that if the (i,j)th element of F, F_ij, is equal to a Lipschitz continuous function over latent features associated with task i and worker j for all i, j, then all task answers can be inferred correctly with high probability by soliciting Õ((T)^3/2) responses per task, even without any knowledge of the Lipschitz function, the task and worker features, or the matrix F.
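To make the response model concrete, the following is a minimal simulation sketch, not the paper's inference algorithm. It assumes binary tasks with answers in {-1, +1}, scalar latent features drawn uniformly from [0, 1], and a hypothetical Lipschitz function F_ij = 0.6 + 0.35 (1 - |x_i - y_j|); none of these choices come from the abstract. Plain majority voting is used here only as a simple baseline for measuring the fraction of incorrectly estimated tasks.

```python
import random


def simulate_responses(num_tasks, num_workers, seed=0):
    """Draw noisy worker responses under an assumed Lipschitz model for F."""
    rng = random.Random(seed)
    # Assumption: binary tasks whose true answers are -1 or +1.
    truth = [rng.choice([-1, 1]) for _ in range(num_tasks)]
    # Assumption: one scalar latent feature per task and per worker.
    task_feat = [rng.random() for _ in range(num_tasks)]
    worker_feat = [rng.random() for _ in range(num_workers)]
    responses = []
    for i in range(num_tasks):
        row = []
        for j in range(num_workers):
            # Hypothetical Lipschitz function of the latent features;
            # F_ij lies in [0.6, 0.95].
            f_ij = 0.6 + 0.35 * (1.0 - abs(task_feat[i] - worker_feat[j]))
            # Worker j answers task i correctly with probability F_ij.
            correct = rng.random() < f_ij
            row.append(truth[i] if correct else -truth[i])
        responses.append(row)
    return truth, responses


def majority_vote(responses):
    """Baseline estimator: sign of the summed responses for each task."""
    return [1 if sum(row) >= 0 else -1 for row in responses]


def error_fraction(truth, estimate):
    """Fraction of tasks whose estimated answer differs from the truth."""
    wrong = sum(1 for t, e in zip(truth, estimate) if t != e)
    return wrong / len(truth)


truth, responses = simulate_responses(num_tasks=200, num_workers=51)
estimate = majority_vote(responses)
err = error_fraction(truth, estimate)
print(f"fraction of incorrectly estimated tasks: {err:.3f}")
```

Majority voting ignores differences between workers and tasks; the approach described above instead exploits the structure of F, which this sketch does not attempt to reproduce.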




