Self-Induced Curriculum Learning in Neural Machine Translation

04/07/2020
by Dana Ruiter et al.

Self-supervised neural machine translation (SS-NMT) learns both how to select suitable training data from comparable (rather than parallel) corpora and how to translate, so that the two tasks reinforce each other in a virtuous circle. SS-NMT has been shown to be competitive with state-of-the-art unsupervised NMT. In this study we provide an in-depth analysis of the sampling choices the SS-NMT model makes during training. We show that, without being told to do so, the model selects samples of increasing (i) complexity and (ii) task-relevance, in combination with (iii) a denoising curriculum. We observe that the dynamics of the mutual supervision between the system's two internal representation types are vital for extraction performance and hence for translation performance. We show that, in terms of the Gunning-Fog readability index (GF), a measure of human reading difficulty, SS-NMT starts by extracting and learning from Wikipedia data suitable for high school students (GF=10–11) and quickly moves towards content suitable for first-year undergraduate students (GF=13).
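
The Gunning-Fog index mentioned above combines average sentence length with the share of "complex" words, conventionally those with three or more syllables: GF = 0.4 × (words / sentences + 100 × complex words / words). The following is a minimal Python sketch of that standard formula, not the authors' measurement pipeline. The function names are illustrative, and the syllable counter is a crude vowel-group heuristic assumed here for brevity, so scores will only approximate those of a dictionary-based tool.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (assumption): count groups of consecutive vowels."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Heuristic: a trailing 'e' is usually silent, except in '-le' endings.
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def gunning_fog(text: str) -> float:
    """Gunning-Fog index: 0.4 * (words/sentences + 100 * complex_words/words)."""
    # Naive sentence split on terminal punctuation (assumption).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    # "Complex" words: three or more syllables under the heuristic above.
    complex_words = [w for w in words if count_syllables(w) >= 3]
    avg_sentence_length = len(words) / len(sentences)
    percent_complex = 100.0 * len(complex_words) / len(words)
    return 0.4 * (avg_sentence_length + percent_complex)


if __name__ == "__main__":
    simple = "The cat sat on the mat."
    dense = "Neural translation requires considerable computational resources."
    print(gunning_fog(simple), gunning_fog(dense))
```

Under this sketch, longer sentences and a higher proportion of polysyllabic words both raise the score, which is the sense in which the extracted data is described as moving from GF=10–11 towards GF=13 during training.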


