BUOCA: Budget-Optimized Crowd Worker Allocation

01/11/2019
by Mehrnoosh Sameki, et al.

Due to concerns about human error in crowdsourcing, it is standard practice to collect labels for the same data point from multiple internet workers. We show here that the resulting budget can be used more effectively with a flexible worker assignment strategy that asks fewer workers to analyze easy-to-label data and more workers to analyze data that requires extra scrutiny. Our main contribution is to show how the number of workers allocated to a task can be computed optimally from task features alone, without using worker profiles. Our target tasks are delineating cells in microscopy images and analyzing the sentiment toward the 2016 U.S. presidential candidates in tweets. We first propose an algorithm that computes a budget-optimized crowd worker allocation (BUOCA). We next train a machine learning system (BUOCA-ML) that predicts an optimal number of crowd workers needed to maximize labeling accuracy. We show that the computed allocation can yield large savings in the crowdsourcing budget (up to 49 percentage points) while maintaining labeling accuracy. Finally, we envisage a human-machine system for performing budget-optimized data analysis at a scale beyond the feasibility of crowdsourcing.
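The core idea of assigning fewer workers to easy tasks and more to hard ones can be illustrated with a small sketch. This is not the BUOCA algorithm itself, whose details are not given in this abstract; it is a simple greedy heuristic written under assumptions. The function name `allocate_workers`, the per-task `difficulty` scores (imagined here as derived from task features), and the `min_workers` and `max_workers` bounds are all hypothetical.

```python
def allocate_workers(difficulty, budget, min_workers=1, max_workers=5):
    """Illustrative greedy allocation (an assumption, not the paper's method).

    difficulty: list of non-negative scores, one per task (hypothetical,
                e.g. predicted from task features).
    budget:     total number of worker labels that can be paid for.
    Returns a list with the number of workers assigned to each task.
    """
    n = len(difficulty)
    if budget < n * min_workers:
        raise ValueError("budget too small to give every task min_workers")

    # Every task starts with the minimum number of workers.
    allocation = [min_workers] * n
    remaining = budget - n * min_workers

    # Spend the remaining labels one at a time on the task whose
    # difficulty is least covered by the workers it already has.
    while remaining > 0:
        candidates = [i for i in range(n) if allocation[i] < max_workers]
        if not candidates:
            break  # every task is at the cap; leftover budget is saved
        best = max(candidates, key=lambda i: difficulty[i] / allocation[i])
        allocation[best] += 1
        remaining -= 1

    return allocation


# Three tasks: easy, hard, medium. The budget equals what a uniform
# policy of three workers per task would spend.
allocation = allocate_workers([0.1, 0.9, 0.5], budget=9)
print(allocation)
```

With the same total budget as a uniform three-workers-per-task policy, the sketch shifts labels from the easy task to the hard one. A budget-saving variant would instead stop adding workers once a task's difficulty is judged to be sufficiently covered.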

research
08/31/2016

Dynamic Allocation of Crowd Contributions for Sentiment Analysis during the 2016 U.S. Presidential Election

Opinions about the 2016 U.S. Presidential Candidates have been expressed...
research
11/07/2021

Crowdsourcing with Meta-Workers: A New Way to Save the Budget

Due to the unreliability of Internet workers, it's difficult to complete...
research
09/01/2016

Crowdsourcing with Unsure Option

One of the fundamental problems in crowdsourcing is the trade-off betwee...
research
02/03/2015

Cheaper and Better: Selecting Good Workers for Crowdsourcing

Crowdsourcing provides a popular paradigm for data collection at scale. ...
research
01/30/2017

Dynamic Task Allocation for Crowdsourcing Settings

We consider the problem of optimal budget allocation for crowdsourcing p...
research
08/01/2018

How Does Tweet Difficulty Affect Labeling Performance of Annotators?

Crowdsourcing is a popular means to obtain labeled data at moderate cost...
research
02/14/2016

Embracing Error to Enable Rapid Crowdsourcing

Microtask crowdsourcing has enabled dataset advances in social science a...
