Microsoft's Submission to the WMT2018 News Translation Task: How I Learned to Stop Worrying and Love the Data

09/01/2018
by Marcin Junczys-Dowmunt et al.

This paper describes the Microsoft submission to the WMT2018 news translation shared task. We participated in one language direction: English-German. Our system follows current best practice and combines state-of-the-art models with new data filtering (dual conditional cross-entropy filtering) and sentence weighting methods. We trained fairly standard Transformer-big models with an updated version of Edinburgh's training scheme for WMT2017 and experimented with different filtering schemes for Paracrawl. According to automatic metrics (BLEU), we reached the highest score for this subtask, with a margin of nearly 2 BLEU points over the next strongest system. Based on human evaluation, we ranked first among constrained systems. We believe this is mostly due to our data filtering/weighting regime.
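The abstract names dual conditional cross-entropy filtering but does not spell out the score. The following is a minimal sketch of how such a score could be computed, assuming the formulation in the companion filtering paper: two translation models trained in opposite directions each assign a word-normalized cross-entropy to a sentence pair, and the pair is penalized both for disagreement between the two values and for a high average. The function names and the per-token log-probability inputs are hypothetical and chosen only for illustration; in practice the log-probabilities would come from scoring the pair with the two models.

```python
import math
from typing import Sequence


def word_normalized_cross_entropy(token_logprobs: Sequence[float]) -> float:
    """Negative mean of per-token natural-log probabilities (hypothetical input)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return -sum(token_logprobs) / len(token_logprobs)


def dual_xent_score(
    fwd_token_logprobs: Sequence[float],
    bwd_token_logprobs: Sequence[float],
) -> float:
    """Score a sentence pair in (0, 1]; higher means more likely a clean pair.

    fwd_token_logprobs: log P(target token | source) from a source->target model.
    bwd_token_logprobs: log P(source token | target) from a target->source model.
    Assumed formulation: exp(-(|H_fwd - H_bwd| + 0.5 * (H_fwd + H_bwd))).
    """
    h_fwd = word_normalized_cross_entropy(fwd_token_logprobs)
    h_bwd = word_normalized_cross_entropy(bwd_token_logprobs)
    # The first term penalizes disagreement between the two directions,
    # the second penalizes pairs that both models find unlikely.
    penalty = abs(h_fwd - h_bwd) + 0.5 * (h_fwd + h_bwd)
    return math.exp(-penalty)


if __name__ == "__main__":
    # Toy log-probabilities, not real model output.
    agreeing = dual_xent_score([-1.0, -1.0], [-1.0, -1.0, -1.0])
    disagreeing = dual_xent_score([-0.5, -0.5], [-1.5, -1.5, -1.5])
    print(agreeing > disagreeing)
```

A corpus such as Paracrawl could then be filtered by keeping only pairs above a chosen threshold, or the score could be reused as a sentence weight during training; the threshold and the weighting scheme are design choices the abstract does not specify.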
