VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution

06/21/2023
by Siobhan Mackenzie Hall, et al.

We introduce VisoGender, a novel dataset for benchmarking gender bias in vision-language models. We focus on occupation-related gender biases, inspired by Winograd and Winogender schemas, where each image is associated with a caption containing a pronoun relationship of subjects and objects in the scene. VisoGender is balanced by gender representation in professional roles, supporting bias evaluation in two ways: i) resolution bias, where we evaluate the difference between gender resolution accuracies for men and women; and ii) retrieval bias, where we compare the ratios of male and female professionals retrieved for a gender-neutral search query. We benchmark several state-of-the-art vision-language models and find that they lack the reasoning abilities to correctly resolve gender in complex scenes. While the direction and magnitude of gender bias depend on the task and the model being evaluated, captioning models are generally more accurate and less biased than CLIP-like models. Dataset and code are available at https://github.com/oxai/visogender
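
As a rough illustration of the two evaluation modes described in the abstract, the following Python sketch computes an accuracy gap and a top-k retrieval share. It is not taken from the VisoGender code: the function names (`resolution_gap`, `top_k_share`), the labels `"man"` and `"woman"`, the sign convention (men minus women), and the use of a simple top-k proportion are all assumptions made for this example, and the paper's actual metrics may be defined differently.

```python
from typing import Dict, List, Sequence, Tuple


def resolution_gap(records: Sequence[Tuple[str, bool]]) -> float:
    """Accuracy for "man" subjects minus accuracy for "woman" subjects.

    Each record is (subject_gender, resolved_correctly). The labels and
    the sign convention are assumptions of this sketch.
    """
    correct: Dict[str, int] = {"man": 0, "woman": 0}
    total: Dict[str, int] = {"man": 0, "woman": 0}
    for gender, is_correct in records:
        if gender not in total:
            raise ValueError(f"unexpected gender label: {gender!r}")
        total[gender] += 1
        correct[gender] += int(is_correct)
    if total["man"] == 0 or total["woman"] == 0:
        raise ValueError("need at least one record for each group")
    return correct["man"] / total["man"] - correct["woman"] / total["woman"]


def top_k_share(retrieved: List[str], k: int) -> Dict[str, float]:
    """Share of each gender label among the top-k retrieved items.

    `retrieved` is assumed to be ordered by decreasing similarity to a
    gender-neutral query.
    """
    top = retrieved[:k]
    if not top:
        raise ValueError("no retrieved items to evaluate")
    return {
        label: sum(1 for g in top if g == label) / len(top)
        for label in ("man", "woman")
    }


# Toy data, invented for illustration only.
toy_records = [
    ("man", True), ("man", True), ("man", False), ("man", True),
    ("woman", True), ("woman", False), ("woman", False), ("woman", True),
]
gap = resolution_gap(toy_records)
shares = top_k_share(["man", "man", "woman", "man", "woman"], k=4)
```

Under this convention, a gap of zero would mean that pronouns are resolved equally well for both groups, and equal top-k shares would mean that a neutral query retrieves men and women at the same rate.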

research
01/24/2021

Stereotype and Skew: Quantifying Gender Bias in Pre-trained and Fine-tuned Language Models

This paper proposes two intuitive metrics, skew and stereotype, that qua...
research
08/24/2023

CALM : A Multi-task Benchmark for Comprehensive Assessment of Language Model Bias

As language models (LMs) become increasingly powerful, it is important t...
research
05/24/2023

Balancing the Picture: Debiasing Vision-Language Datasets with Synthetic Contrast Sets

Vision-language models are growing in popularity and public visibility t...
research
06/06/2023

LEACE: Perfect linear concept erasure in closed form

Concept erasure aims to remove specified features from a representation....
research
09/08/2022

Data Feedback Loops: Model-driven Amplification of Dataset Biases

Datasets scraped from the internet have been critical to the successes o...
research
06/06/2023

MISGENDERED: Limits of Large Language Models in Understanding Pronouns

Content Warning: This paper contains examples of misgendering and erasur...
research
09/28/2021

Second Order WinoBias (SoWinoBias) Test Set for Latent Gender Bias Detection in Coreference Resolution

We observe an instance of gender-induced bias in a downstream applicatio...