NVIDIA NeMo Neural Machine Translation Systems for English-German and English-Russian News and Biomedical Tasks at WMT21

11/16/2021
by Sandeep Subramanian, et al.

This paper provides an overview of NVIDIA NeMo's neural machine translation systems for the constrained data track of the WMT21 News and Biomedical Shared Translation Tasks. Our news task submissions for English-German (En-De) and English-Russian (En-Ru) are built on top of a baseline transformer-based sequence-to-sequence model. Specifically, we use a combination of 1) checkpoint averaging, 2) model scaling, 3) data augmentation with backtranslation and knowledge distillation from right-to-left factorized models, 4) finetuning on test sets from previous years, 5) model ensembling, 6) shallow fusion decoding with transformer language models, and 7) noisy channel re-ranking. Additionally, our biomedical task submission for English-Russian uses a biomedically biased vocabulary and is trained from scratch on news task data, medically relevant text curated from the news task dataset, and biomedical data provided by the shared task. Our news system achieves a sacreBLEU score of 39.5 on the WMT'20 En-De test set, outperforming the best submission from last year's task, which scored 38.8. Our biomedical task Ru-En and En-Ru systems reach BLEU scores of 43.8 and 40.3, respectively, on the WMT'20 Biomedical Task test set, outperforming the previous year's best submissions.
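
Two of the techniques listed in the abstract, checkpoint averaging and noisy channel re-ranking, can be illustrated with a minimal Python sketch. This is not the NeMo implementation. It assumes checkpoints are plain dictionaries of flat parameter lists, and it assumes the commonly used noisy channel formulation in which each candidate is scored as a weighted sum of the forward model, reverse model, and language model log-probabilities. The function names, the dictionary keys, and the weights `reverse_weight` and `lm_weight` are hypothetical; the abstract does not state the values or exact scoring rule used in the paper.

```python
from typing import Dict, List, Sequence


def average_checkpoints(
    checkpoints: Sequence[Dict[str, List[float]]],
) -> Dict[str, List[float]]:
    """Element-wise mean of parameters that share a name across checkpoints.

    Assumption: every checkpoint holds the same parameter names, and each
    parameter is a flat list of floats of the same length.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    averaged: Dict[str, List[float]] = {}
    for name in checkpoints[0]:
        # zip(*...) groups the i-th value of this parameter from every checkpoint.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        averaged[name] = [sum(col) / len(checkpoints) for col in columns]
    return averaged


def noisy_channel_score(
    forward_lp: float,
    reverse_lp: float,
    lm_lp: float,
    reverse_weight: float = 1.0,  # hypothetical default, normally tuned on a dev set
    lm_weight: float = 1.0,       # hypothetical default, normally tuned on a dev set
) -> float:
    """Combine log p(y|x), log p(x|y), and log p(y) into one re-ranking score."""
    return forward_lp + reverse_weight * reverse_lp + lm_weight * lm_lp


def rerank(
    candidates: Sequence[Dict[str, float]],
    reverse_weight: float = 1.0,
    lm_weight: float = 1.0,
) -> List[Dict[str, float]]:
    """Sort candidate translations from highest to lowest combined score.

    Each candidate is a dict with the (hypothetical) keys
    "forward_lp", "reverse_lp", and "lm_lp".
    """
    return sorted(
        candidates,
        key=lambda c: noisy_channel_score(
            c["forward_lp"], c["reverse_lp"], c["lm_lp"], reverse_weight, lm_weight
        ),
        reverse=True,
    )
```

In this sketch, setting both weights to zero reduces re-ranking to ordering by the forward model alone, which makes it easy to see how much the reverse model and the language model change the final choice among candidates.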

research
05/06/2019

UFRGS Participation on the WMT Biomedical Translation Shared Task

This paper describes the machine translation systems developed by the Un...
research
08/06/2020

A Multilingual Neural Machine Translation Model for Biomedical Data

We release a multilingual neural machine translation model, which can be...
research
11/07/2019

Microsoft Research Asia's Systems for WMT19

We Microsoft Research Asia made submissions to 11 language directions in...
research
08/05/2021

WeChat Neural Machine Translation Systems for WMT21

This paper introduces WeChat AI's participation in WMT 2021 shared news ...
research
08/02/2017

The University of Edinburgh's Neural MT Systems for WMT17

This paper describes the University of Edinburgh's submissions to the WM...
research
03/19/2018

English-Catalan Neural Machine Translation in the Biomedical Domain through the cascade approach

This paper describes the methodology followed to build a neural machine ...
research
06/21/2019

CUNI System for the WMT19 Robustness Task

We present our submission to the WMT19 Robustness Task. Our baseline sys...
