Machine Translation of Restaurant Reviews: New Corpus for Domain Adaptation and Robustness

10/31/2019
by Alexandre Berard, et al.

We share a French-English parallel corpus of Foursquare restaurant reviews (https://europe.naverlabs.com/research/natural-language-processing/machine-translation-of-restaurant-reviews) and define a new task to encourage research on Neural Machine Translation (NMT) robustness and domain adaptation, in a real-world scenario where better-quality MT would be greatly beneficial. We discuss the challenges of such user-generated content and train strong baseline models that build upon the latest techniques for MT robustness. We also perform an extensive evaluation, both automatic and human, that shows significant improvements over existing online systems. Finally, we propose task-specific metrics based on sentiment analysis or on the translation accuracy of domain-specific polysemous words.


