Fairness in Language Models Beyond English: Gaps and Challenges

02/24/2023
by Krithika Ramesh, et al.

With language models becoming increasingly ubiquitous, it has become essential to address their inequitable treatment of diverse demographic groups and factors. Most research on evaluating and mitigating fairness harms has been concentrated on English, while multilingual models and non-English languages have received comparatively little attention. This paper presents a survey of fairness in multilingual and non-English contexts, highlighting the shortcomings of current research and the difficulties faced by methods designed for English. We contend that the multitude of diverse cultures and languages across the world makes it infeasible to achieve comprehensive coverage in terms of constructing fairness datasets. Thus, the measurement and mitigation of biases must evolve beyond the current dataset-driven practices that are narrowly focused on specific dimensions and types of biases and, therefore, impossible to scale across languages and cultures.

