Easy, Reproducible and Quality-Controlled Data Collection with Crowdaq

10/06/2020
by Qiang Ning, et al.

High-quality, large-scale data are key to the success of AI systems. However, large-scale data annotation efforts are often confronted with a set of common challenges: (1) designing a user-friendly annotation interface; (2) training enough annotators efficiently; and (3) ensuring reproducibility. To address these problems, we introduce Crowdaq, an open-source platform that standardizes the data collection pipeline with customizable user-interface components, automated annotator qualification, and pipelines saved in a reusable format. We show that Crowdaq significantly simplifies data annotation across a diverse set of data collection use cases, and we hope it will be a convenient tool for the community.

12/22/2019

Lessons from Archives: Strategies for Collecting Sociocultural Data in Machine Learning

A growing body of work shows that many problems in fairness, accountabil...
01/12/2023

Mephisto: A Framework for Portable, Reproducible, and Iterative Crowdsourcing

We introduce Mephisto, a framework to make crowdsourcing for research mo...
08/24/2023

Whombat: An open-source annotation tool for machine learning development in bioacoustics

1. Automated analysis of bioacoustic recordings using machine learning (...
08/31/2020

Simulation Framework for Realistic Large-scale Individual-level Health Data Generation

We propose a general framework for realistic data generation and simulat...
09/29/2020

Aligning Intraobserver Agreement by Transitivity

Annotation reproducibility and accuracy rely on good consistency within ...
08/03/2020

ContentWise Impressions: An Industrial Dataset with Impressions Included

In this article, we introduce the ContentWise Impressions dataset, a col...
05/25/2021

Task allocation interface design and personalization in gamified participatory sensing for tourism

The collection of spatiotemporal tourism information is important in sma...