A General-Purpose Crowdsourcing Computational Quality Control Toolkit for Python

09/17/2021
by Dmitry Ustalov, et al.

Quality control is a crux of crowdsourcing. Most quality control measures are organizational, relying on worker selection, golden tasks, and post-acceptance. Computational quality control techniques instead parameterize the whole crowdsourcing process of workers, tasks, and labels, inferring and revealing the relationships between them. In this paper, we demonstrate Crowd-Kit, a general-purpose toolkit for computational quality control in crowdsourcing. It provides efficient Python implementations of computational quality control algorithms, including uncertainty measures and crowd consensus methods. We focus on aggregation methods for all the major annotation tasks, from categorical annotation, in which the latent label assumption is met, to more complex tasks such as image and sequence aggregation. We perform an extensive evaluation of our toolkit on several datasets of different natures, which enables benchmarking computational quality control methods in a uniform, systematic, and reproducible way using the same codebase. We release our code and data under an open-source license at https://github.com/Toloka/crowd-kit.
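
To make the notions of crowd consensus and uncertainty concrete, here is a minimal sketch of categorical aggregation in plain Python. It is not Crowd-Kit's API: the function names `majority_vote` and `label_entropy`, the (task, worker, label) tuple layout, and the toy data are assumptions made for illustration only. Majority voting stands in for the simplest consensus method, and per-task Shannon entropy for a simple uncertainty measure.

```python
from collections import Counter, defaultdict
from math import log


def count_votes(annotations):
    """Group (task, worker, label) triples into per-task label counts."""
    votes = defaultdict(Counter)
    for task, _worker, label in annotations:
        votes[task][label] += 1
    return votes


def majority_vote(annotations):
    """Hypothetical helper: pick the most frequent label for each task."""
    votes = count_votes(annotations)
    return {task: counts.most_common(1)[0][0] for task, counts in votes.items()}


def label_entropy(annotations):
    """Hypothetical helper: Shannon entropy (in nats) of each task's labels."""
    entropy = {}
    for task, counts in count_votes(annotations).items():
        total = sum(counts.values())
        entropy[task] = -sum(
            (c / total) * log(c / total) for c in counts.values()
        )
    return entropy


# Toy data (assumed for illustration): three workers label two tasks.
annotations = [
    ("t1", "w1", "cat"),
    ("t1", "w2", "cat"),
    ("t1", "w3", "dog"),
    ("t2", "w1", "dog"),
    ("t2", "w2", "dog"),
    ("t2", "w3", "dog"),
]

consensus = majority_vote(annotations)
uncertainty = label_entropy(annotations)
```

In this sketch, the task on which the workers disagree receives a positive entropy, while the unanimous task receives zero. Methods that model worker skill go further by weighting each vote instead of counting all votes equally.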
