Exploring Stereotypes and Biased Data with the Crowd

01/10/2018
by Zeyuan Hu, et al.

Our research examines how useful the crowd is at anticipating stereotypes that may bias a dataset without the researcher's knowledge. The crowd's predictions could then be used during data collection to help keep the suspected stereotypes from introducing bias into the dataset. We ask crowd workers on Amazon's Mechanical Turk (AMT) to complete two similar Human Intelligence Tasks (HITs) in which they suggest stereotypes relating to their personal experience. Our analysis of the responses focuses on the level of diversity in the workers' suggestions and in their demographics. Through this process we begin a discussion of how useful the crowd can be in tackling this difficult problem in machine learning data collection.


