Stereotyping and Bias in the Flickr30K Dataset

05/19/2016
by Emiel van Miltenburg, et al.

An untested assumption behind the crowdsourced descriptions of the images in the Flickr30K dataset (Young et al., 2014) is that they "focus only on the information that can be obtained from the image alone" (Hodosh et al., 2013, p. 859). This paper presents some evidence against this assumption, and provides a list of biases and unwarranted inferences that can be found in the Flickr30K dataset. Finally, it considers methods to find examples of these, and discusses how we should deal with stereotype-driven descriptions in future applications.
