Automatically Identifying Gender Issues in Machine Translation using Perturbations

04/29/2020
by Hila Gonen, et al.

The successful application of neural methods to machine translation has brought large quality advances for the community. Alongside these improvements, many have noted outstanding challenges, including the modeling and treatment of gendered language. Where previous studies have identified concerns using manually curated synthetic examples, we develop a novel technique that leverages real-world data to explore challenges for deployed systems. We use our new method to compile an evaluation benchmark spanning examples in four languages from three language families, which we will publicly release to facilitate research. The examples in our benchmark expose the ways in which gender is represented in a model and the unintended consequences these gendered representations can have in downstream applications.

research
02/25/2022

Screening Gender Transfer in Neural Machine Translation

This paper aims at identifying the information flow in state-of-the-art ...
research
11/02/2022

MT-GenEval: A Counterfactual and Contextual Dataset for Evaluating Gender Accuracy in Machine Translation

As generic machine translation (MT) quality has improved, the need for t...
research
09/08/2021

Collecting a Large-Scale Gender Bias Dataset for Coreference Resolution and Machine Translation

Recent works have found evidence of gender bias in models of machine tra...
research
08/05/2016

Winograd Schemas and Machine Translation

A Winograd schema is a pair of sentences that differ in a single word an...
research
06/16/2020

Scalable Cross Lingual Pivots to Model Pronoun Gender for Translation

Machine translation systems with inadequate document understanding can m...
research
03/20/2022

Mitigating Gender Bias in Machine Translation through Adversarial Learning

Machine translation and other NLP systems often contain significant bias...
research
04/06/2018

Understanding Actors and Evaluating Personae with Gaussian Embeddings

Understanding narrative content has become an increasingly popular topic...
