Data Processing Matters: SRPH-Konvergen AI's Machine Translation System for WMT'21

11/20/2021
by Lintang Sutawika et al.

In this paper, we describe the submission of the joint Samsung Research Philippines-Konvergen AI team for the WMT'21 Large Scale Multilingual Translation Task, Small Track 2. We submit a standard Seq2Seq Transformer model to the shared task without any training or architecture tricks, relying mainly on the strength of our data preprocessing techniques to boost performance. Our final submission model scored 22.92 average BLEU on the FLORES-101 devtest set and 22.97 average BLEU on the contest's hidden test set, ranking us sixth overall. Despite using only a standard Transformer, our model ranked first in Indonesian to Javanese, showing that data preprocessing matters as much as, if not more than, cutting-edge model architectures and training techniques.
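The abstract credits the model's performance to data preprocessing rather than modeling tricks, but does not spell out the pipeline. As a rough illustration only, the sketch below shows the kind of parallel-corpus filtering heuristics commonly applied before training a machine translation system (deduplication, length caps, and length-ratio filtering); the function name, thresholds, and sample sentences are assumptions for illustration, not the authors' actual code.

# Illustrative sketch of common MT data-cleaning heuristics.
# Not the SRPH-Konvergen AI pipeline; names and thresholds are hypothetical.

def clean_parallel_corpus(pairs, max_len=250, max_ratio=2.5):
    """Filter (source, target) sentence pairs with simple heuristics."""
    seen = set()
    for src, tgt in pairs:
        src, tgt = src.strip(), tgt.strip()
        # Drop empty lines and exact duplicate pairs.
        if not src or not tgt or (src, tgt) in seen:
            continue
        seen.add((src, tgt))
        n_src, n_tgt = len(src.split()), len(tgt.split())
        # Drop overly long sentences.
        if n_src > max_len or n_tgt > max_len:
            continue
        # Drop pairs with extreme length ratios, which are often misaligned.
        if max(n_src, n_tgt) / min(n_src, n_tgt) > max_ratio:
            continue
        yield src, tgt

if __name__ == "__main__":
    sample = [
        ("Selamat pagi.", "Sugeng enjing."),
        ("Selamat pagi.", "Sugeng enjing."),  # duplicate, dropped
        ("Halo", ""),                         # empty target, dropped
    ]
    for src, tgt in clean_parallel_corpus(sample):
        print(src, "|||", tgt)

Filters of this sort are cheap to run over millions of sentence pairs and tend to matter most in low-resource settings like Indonesian-Javanese, where noisy web-mined data makes up a large share of the corpus.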


Related research

The VolcTrans System for WMT22 Multilingual Machine Translation Task (10/20/2022)
This report describes our VolcTrans system for the WMT22 shared task on ...

VBD-MT Chinese-Vietnamese Translation Systems for VLSP 2022 (08/15/2023)
We present our systems that participated in the VLSP 2022 machine translation...

NAVER LABS Europe's Multilingual Speech Translation Systems for the IWSLT 2023 Low-Resource Track (06/13/2023)
This paper presents NAVER LABS Europe's systems for Tamasheq-French and ...

Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task (11/03/2021)
This report describes Microsoft's machine translation systems for the WM...

Unsupervised Translation of German–Lower Sorbian: Exploring Training and Novel Transfer Methods on a Low-Resource Language (09/24/2021)
This paper describes the methods behind the systems submitted by the Uni...

BUT-FIT at SemEval-2020 Task 4: Multilingual commonsense (08/17/2020)
This paper describes the work of the BUT-FIT team at SemEval 2020 Task 4 ...

TranSFormer: Slow-Fast Transformer for Machine Translation (05/26/2023)
Learning multiscale Transformer models has been evidenced as a viable ap...
