Investigating Failures of Automatic Translation in the Case of Unambiguous Gender

04/16/2021
by Adithya Renduchintala, et al.
Transformer-based models are the modern workhorses of neural machine translation (NMT), reaching state-of-the-art results across several benchmarks. Despite their impressive accuracy, we observe a systemic and rudimentary class of errors that these models make when translating from a language that does not mark gender on nouns into languages that do. We find that even when the surrounding context provides unambiguous evidence of the appropriate grammatical gender marking, none of the Transformer-based models we tested could consistently assign the correct gender to occupation nouns. We release an evaluation scheme and dataset for measuring the ability of Transformer-based NMT models to translate gender morphology correctly in unambiguous contexts across syntactically diverse sentences. Our dataset translates from an English source into 20 languages from several different language families. With this dataset available, we hope that the NMT community can iterate on solutions for this class of especially egregious errors.
