Soft Alignment Objectives for Robust Adaptation in Machine Translation

11/29/2022
by Michal Štefánik, et al.

Domain adaptation allows generative language models to address specific flaws caused by the domain shift of their application. However, the traditional approach of adapting by further training on in-domain data rapidly weakens the model's ability to generalize to other domains, making open-ended deployments of adapted models prone to errors. This work introduces novel training objectives built upon the semantic similarity of the predicted tokens to the reference. Our results show that (1) avoiding the common assumption of a single correct prediction, by constructing the training target from the tokens' semantic similarity, can mitigate catastrophic forgetting during domain adaptation, while (2) preserving the quality of the adaptation, (3) with negligible additions to compute costs. From a broader perspective, objectives grounded in a soft token alignment pioneer the exploration of the middle ground between efficient but naive exact-match token-level objectives and expressive but computationally and resource-intensive sequential objectives.
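The core idea, replacing the one-hot training target with a distribution built from token similarity, can be illustrated with a minimal sketch. This is an assumption-laden toy, not the paper's actual objective: it derives soft targets from cosine similarity between token embeddings (the helper names, the temperature parameter, and the use of cosine similarity are all illustrative choices):

```python
import numpy as np

def soft_alignment_targets(embeddings, ref_id, temperature=0.1):
    """Build a soft target distribution over the vocabulary from the
    cosine similarity of each token embedding to the reference token's
    embedding, instead of a one-hot vector at ref_id."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = unit @ unit[ref_id]          # cosine similarity to the reference token
    logits = sims / temperature
    logits -= logits.max()              # numerical stability before exp
    q = np.exp(logits)
    return q / q.sum()

def soft_cross_entropy(log_probs, soft_targets):
    """Cross-entropy of the model's predicted log-probs against the
    soft targets; with a one-hot target this reduces to standard CE."""
    return -np.sum(soft_targets * log_probs)

# Toy example: 5-token vocabulary with 4-dimensional embeddings.
rng = np.random.default_rng(0)
E = rng.normal(size=(5, 4))
q = soft_alignment_targets(E, ref_id=2)
p = np.full(5, 0.2)                     # a uniform model prediction
loss = soft_cross_entropy(np.log(p), q)
```

Because every token similar to the reference receives some probability mass, near-synonyms are no longer penalized as hard errors, which is the intuition behind the mitigation of catastrophic forgetting described above.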


Related research

- Chunk-based Nearest Neighbor Machine Translation (05/24/2022): Semi-parametric models, which augment generation with retrieval, have le...
- Neural Machine Translation Models Can Learn to be Few-shot Learners (09/15/2023): The emergent ability of Large Language Models to use a small number of e...
- Token-level Adaptive Training for Neural Machine Translation (10/09/2020): There exists a token imbalance phenomenon in natural language as differe...
- Semantic-aware Message Broadcasting for Efficient Unsupervised Domain Adaptation (12/06/2022): Vision transformer has demonstrated great potential in abundant vision t...
- Cross-domain Detection Transformer based on Spatial-aware and Semantic-aware Token Alignment (06/01/2022): Detection transformers like DETR have recently shown promising performan...
- Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models (10/05/2020): Recent work has shown the importance of adaptation of broad-coverage con...
- Cost-Sensitive Training for Autoregressive Models (12/08/2019): Training autoregressive models to better predict under the test metric, ...
