Cross-lingual Transfer Can Worsen Bias in Sentiment Analysis

Sentiment analysis (SA) systems are widely deployed in many of the world's languages, and there is well-documented evidence of demographic bias in these systems. In languages beyond English, scarcer training data is often supplemented with transfer learning using pre-trained models, including multilingual models trained on other languages. In some cases, even supervision data comes from other languages. Does cross-lingual transfer also import new biases? To answer this question, we use counterfactual evaluation to test whether gender or racial biases are imported when using cross-lingual transfer, compared to a monolingual transfer setting. Across five languages, we find that systems using cross-lingual transfer usually become more biased than their monolingual counterparts. We also find racial biases to be much more prevalent than gender biases. To spur further research on this topic, we release the sentiment models we used for this study, and the intermediate checkpoints throughout training, yielding 1,525 distinct models; we also release our evaluation code.
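As a rough illustration of what counterfactual evaluation means here, the sketch below scores pairs of sentences that differ only in a demographic term and reports the mean score difference. It is a minimal sketch under stated assumptions, not the authors' released evaluation code: the templates, the term pair, the helper names (`make_pairs`, `counterfactual_gap`), and the deliberately biased `toy_score` function are all hypothetical stand-ins for a real sentiment model and test corpus.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple


def make_pairs(templates: Sequence[str], term_a: str, term_b: str) -> List[Tuple[str, str]]:
    """Fill each template with two demographic terms to form counterfactual pairs."""
    return [(t.format(person=term_a), t.format(person=term_b)) for t in templates]


def counterfactual_gap(score: Callable[[str], float], pairs: Sequence[Tuple[str, str]]) -> float:
    """Mean difference in sentiment score (second sentence minus first) over all pairs.

    A value near zero suggests the scorer treats both terms alike on these
    templates; a consistent non-zero value suggests a bias.
    """
    return mean(score(b) - score(a) for a, b in pairs)


def toy_score(text: str) -> float:
    """Hypothetical, deliberately biased scorer standing in for a sentiment model."""
    tokens = text.lower().split()
    value = 0.0
    if "great" in tokens:
        value += 1.0
    if "terrible" in tokens:
        value -= 1.0
    if "she" in tokens:  # the injected bias: penalize one term regardless of content
        value -= 0.5
    return value


# Hypothetical templates; a real test set would be far larger and per-language.
templates = ["{person} had a great day", "{person} had a terrible day"]
pairs = make_pairs(templates, "he", "she")
gap = counterfactual_gap(toy_score, pairs)
print(f"mean counterfactual gap: {gap}")
```

In the setting the abstract describes, the same kind of gap would be computed once for a monolingual-transfer model and once for a cross-lingual-transfer model, and the two values compared.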
