
Exploring Effectiveness of Inter-Microtask Qualification Tests in Crowdsourcing

by   Masaya Morinaga, et al.

Qualification tests in crowdsourcing are often used to pre-filter workers by measuring their ability to execute microtasks. While creating a qualification test for each task type is considered a common and reasonable approach, this study investigates worker-filtering performance when the same qualification test is used across multiple types of tasks. On Amazon Mechanical Turk, we tested annotation accuracy in six cases, with tasks of two difficulty levels drawn from the same real-world domain: four combinatory cases in which the qualification test and the actual task were the same or different from each other, and two further cases in which workers with the Masters Qualification were asked to perform the actual task only. The experimental results yielded two findings: i) workers assigned to the difficult qualification test achieved better annotation accuracy regardless of the difficulty of the actual task; ii) workers with the Masters Qualification achieved better annotation accuracy on the low-difficulty task, but were not as accurate as those who passed a qualification test on the high-difficulty task.
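
The six cases in the abstract can be read as a 2x2 grid (qualification-test difficulty by actual-task difficulty) plus two Masters-only cases. The sketch below enumerates that design as a reading aid only. The labels `low` and `high`, the field names, and the function `build_conditions` are illustrative assumptions and are not taken from the paper.

```python
from itertools import product

# Assumed labels for the two difficulty levels mentioned in the abstract.
DIFFICULTIES = ("low", "high")


def build_conditions():
    """Enumerate the six experimental cases (hypothetical field names)."""
    conditions = []
    # Four combinatory cases: qualification-test difficulty x task difficulty.
    for qual, task in product(DIFFICULTIES, repeat=2):
        conditions.append(
            {"filter": "qualification_test",
             "qual_difficulty": qual,
             "task_difficulty": task}
        )
    # Two further cases: Masters Qualification workers do the actual task only.
    for task in DIFFICULTIES:
        conditions.append(
            {"filter": "masters",
             "qual_difficulty": None,
             "task_difficulty": task}
        )
    return conditions


conditions = build_conditions()
for condition in conditions:
    print(condition)
```

Laying the design out this way makes the two comparisons in the findings explicit: finding i) compares rows of the grid that differ in `qual_difficulty`, and finding ii) compares the `masters` cases with the `qualification_test` cases at the same `task_difficulty`.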




Treating Crowdsourcing as Examination: How to Score Tasks and Online Workers?

Crowdsourcing is an online outsourcing mode which can solve the current ...

Needle in a Haystack: An Analysis of Finding Qualified Workers on MTurk for Summarization

The acquisition of high-quality human annotations through crowdsourcing ...

Distinguishing Question Subjectivity from Difficulty for Improved Crowdsourcing

The questions in a crowdsourcing task typically exhibit varying degrees ...

Role of Intrinsic Motivation in User Interface Design to Enhance Worker Performance in Amazon MTurk

Biologists and scientists have been tackling the problem of marine life ...

Investigating Crowdsourcing to Generate Distractors for Multiple-Choice Assessments

We present and analyze results from a pilot study that explores how crow...

A Provably Improved Algorithm for Crowdsourcing with Hard and Easy Tasks

Crowdsourcing is a popular method used to estimate ground-truth labels b...

Beyond monetary incentives: experiments in paid microtask contests modelled as continuous-time Markov chains

In this paper, we aim to gain a better understanding into how paid micro...