Improving Gender Translation Accuracy with Filtered Self-Training

04/15/2021
by Prafulla Kumar Choubey, et al.

Targeted evaluations have found that machine translation systems often output incorrect gender, even when the gender is clear from context. Furthermore, these incorrectly gendered translations have the potential to reflect or amplify social biases. We propose a gender-filtered self-training technique to improve gender translation accuracy on unambiguously gendered inputs. This approach uses a source monolingual corpus and an initial model to generate gender-specific pseudo-parallel corpora, which are then added to the training data. We filter the gender-specific corpora on the source and target sides to ensure that sentence pairs contain and correctly translate the specified gender. We evaluate our approach on translation from English into five languages, finding that our models improve gender translation accuracy without any cost to generic translation quality. In addition, we show the viability of our approach in several settings, including re-training from scratch, fine-tuning, controlling the balance of the training data, forward translation, and back-translation.
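
The filtering step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the word lists, the `translate` callable (standing in for the initial model), and the `target_has_gender` callable (standing in for the target-side check) are all hypothetical names introduced here, and the abstract does not specify how either filter is actually built.

```python
import re
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical, deliberately tiny English word lists (assumption for illustration).
SRC_GENDER_WORDS = {
    "feminine": {"she", "her", "hers", "herself"},
    "masculine": {"he", "him", "his", "himself"},
}


def tokens(sentence: str) -> List[str]:
    """Lowercase word tokens; a crude stand-in for real tokenization."""
    return re.findall(r"\w+", sentence.lower())


def source_gender(sentence: str) -> Optional[str]:
    """Return the gender if exactly one gender's words appear, else None."""
    toks = set(tokens(sentence))
    found = [g for g, words in SRC_GENDER_WORDS.items() if toks & words]
    return found[0] if len(found) == 1 else None


def build_gendered_corpus(
    monolingual: Iterable[str],
    translate: Callable[[str], str],                 # assumed: the initial model
    target_has_gender: Callable[[str, str], bool],   # assumed: target-side check
    gender: str,
) -> List[Tuple[str, str]]:
    """Collect (source, translation) pairs that pass both filters."""
    pairs = []
    for src in monolingual:
        # Source-side filter: keep sentences carrying only the requested gender.
        if source_gender(src) != gender:
            continue
        tgt = translate(src)
        # Target-side filter: drop translations that lose or flip the gender.
        if target_has_gender(tgt, gender):
            pairs.append((src, tgt))
    return pairs
```

Under these assumptions, the returned pairs would be added to the original training data before re-training or fine-tuning. The sketch covers only the forward-translation setting; back-translation would swap the roles of the two sides.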
