Pseudo-Label Training and Model Inertia in Neural Machine Translation

05/19/2023
by Benjamin Hsu, et al.

Like many other machine learning applications, neural machine translation (NMT) benefits from over-parameterized deep neural models. However, these models have been observed to be brittle: NMT model predictions are sensitive to small input changes and can show significant variation across re-training or incremental model updates. This work studies a frequently used method in NMT, pseudo-label training (PLT), which is common to the related techniques of forward-translation (or self-training) and sequence-level knowledge distillation. While the effect of PLT on quality is well-documented, we highlight a lesser-known effect: PLT can enhance a model's stability to model updates and input perturbations, a set of properties we call model inertia. We study inertia effects under different training settings and we identify distribution simplification as a mechanism behind the observed results.
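For readers unfamiliar with the setup, the following is a minimal sketch of the pseudo-labeling step shared by forward-translation (self-training) and sequence-level knowledge distillation, assuming the Hugging Face transformers library and a public Marian en-de checkpoint as the teacher. The checkpoint name, example sentence, and variable names are illustrative assumptions, not details taken from the paper.

    # Minimal sketch of the pseudo-labeling step in PLT: a trained teacher
    # model forward-translates monolingual source text, and the resulting
    # synthetic (source, pseudo-target) pairs are used to train the student.
    # The checkpoint name and example sentence are illustrative assumptions.
    from transformers import MarianMTModel, MarianTokenizer

    teacher_name = "Helsinki-NLP/opus-mt-en-de"  # assumed teacher checkpoint
    tokenizer = MarianTokenizer.from_pretrained(teacher_name)
    teacher = MarianMTModel.from_pretrained(teacher_name)

    # Step 1: forward-translate monolingual source sentences to get pseudo-labels.
    monolingual_source = ["The model is sensitive to small input changes."]
    inputs = tokenizer(monolingual_source, return_tensors="pt", padding=True)
    pseudo_targets = tokenizer.batch_decode(
        teacher.generate(**inputs), skip_special_tokens=True
    )

    # Step 2: the synthetic pairs replace or augment the human references
    # when the student model is (re)trained.
    synthetic_pairs = list(zip(monolingual_source, pseudo_targets))
    print(synthetic_pairs)

In sequence-level knowledge distillation the student is typically a smaller model trained on a stronger teacher's outputs; in self-training the teacher and student share the same architecture and the student is retrained on its predecessor's translations.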


