Getting Gender Right in Neural Machine Translation

09/11/2019
by Eva Vanmassenhove, et al.

Speakers of different languages must attend to and encode strikingly different aspects of the world in order to use their language correctly (Sapir, 1921; Slobin, 1996). One such difference is the way gender is expressed in a language. Saying "I am happy" in English does not encode any additional information about the speaker who uttered the sentence. However, many other languages have grammatical gender systems, so such information is encoded. To translate such a sentence correctly into, say, French, the inherent gender information needs to be retained or recovered. The same sentence would become either "Je suis heureux" for a male speaker or "Je suis heureuse" for a female one. Apart from morphological agreement, demographic factors (gender, age, etc.) also influence our use of language in terms of word choices and even at the level of syntactic constructions (Tannen, 1991; Pennebaker et al., 2003). We integrate gender information into NMT systems. Our contribution is two-fold: (1) the compilation of large datasets with speaker information for 20 language pairs, and (2) a simple set of experiments that incorporate gender information into NMT for multiple language pairs. Our experiments show that adding a gender feature to an NMT system significantly improves the translation quality for some language pairs.
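The abstract does not say how the gender feature is fed to the NMT system. One common way to expose such side information is to prepend a tag token to each source sentence before training and inference. The sketch below illustrates that idea only; the tag strings, the `add_gender_tag` helper, and the handling of unknown gender are assumptions made for illustration, not details taken from the abstract.

```python
from typing import Optional

# Hypothetical tag tokens; the abstract does not specify the encoding.
GENDER_TAGS = {"female": "<FEMALE>", "male": "<MALE>"}


def add_gender_tag(source: str, gender: Optional[str]) -> str:
    """Prepend a speaker-gender tag to a source sentence when the gender is known.

    Sentences with missing or unrecognised gender are returned unchanged
    (an assumption; other fallbacks are possible).
    """
    if gender is None:
        return source
    tag = GENDER_TAGS.get(gender.lower())
    if tag is None:
        return source
    return f"{tag} {source}"


# The English source is identical; only the tag tells the model which
# French agreement ("heureux" vs. "heureuse") is expected.
examples = [("I am happy", "female"), ("I am happy", "male"), ("I am happy", None)]
tagged = [add_gender_tag(sentence, gender) for sentence, gender in examples]
```

With this kind of preprocessing, the tag becomes part of the source vocabulary, so the model architecture itself does not need to change.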


