On Evaluating and Mitigating Gender Biases in Multilingual Settings

07/04/2023
by Aniket Vashishtha, et al.

While understanding and removing gender biases in language models has been a long-standing problem in Natural Language Processing, prior research has primarily been limited to English. In this work, we investigate some of the challenges of evaluating and mitigating biases in multilingual settings, which stem from a lack of existing benchmarks and resources for bias evaluation beyond English, especially in non-Western contexts. We first create a benchmark for evaluating gender biases in pre-trained masked language models by extending DisCo to different Indian languages using human annotations. We then extend various debiasing methods to work beyond English and evaluate their effectiveness for state-of-the-art massively multilingual models on our proposed metric. Overall, our work highlights the challenges that arise while studying social biases in multilingual settings and provides resources as well as mitigation techniques to take a step toward scaling to more languages.

