Improving Lexically Constrained Neural Machine Translation with Source-Conditioned Masked Span Prediction

05/12/2021
by Gyubok Lee, et al.

Generating accurate terminology is crucial to the practicality and reliability of neural machine translation (NMT) systems. To address this, lexically constrained NMT explores various methods to ensure that pre-specified words and phrases appear in the translations. In many cases, however, those methods are evaluated on general domain corpora, where the terms are mostly uni- and bi-grams (>98%). We instead consider a setup consisting of domain-specific corpora with much longer n-grams and highly specialized terms. To encourage span-level representations in generation, we additionally impose a source-sentence conditioned masked span prediction loss in the decoder and observe improvements in both terminology translation and BLEU scores. Experimental results on three domain-specific corpora in two language pairs demonstrate that the proposed training scheme can improve the performance of existing lexically constrained methods, whether they operate with or without a term dictionary at test time.
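
To make the idea of masked span prediction concrete, here is a minimal sketch of the data preparation step only: contiguous spans of target-side tokens are replaced with a mask symbol, and the original tokens at those positions become the prediction labels. It is not the authors' implementation. The function name `mask_spans`, the `[MASK]` symbol, the masking ratio, and the maximum span length are all assumptions chosen for illustration; the abstract does not specify them. In training, a decoder conditioned on the source sentence would presumably be asked to recover the labeled positions, with the loss computed only where a label is present.

```python
import random

MASK = "[MASK]"  # hypothetical mask symbol, not taken from the paper


def mask_spans(target_tokens, mask_ratio=0.5, max_span_len=3, seed=None):
    """Mask contiguous spans of target tokens (illustrative sketch).

    Returns (masked_tokens, labels). labels[i] holds the original token
    at each masked position and None everywhere else.
    """
    rng = random.Random(seed)
    n = len(target_tokens)
    masked = list(target_tokens)
    labels = [None] * n
    if n == 0:
        return masked, labels

    # Number of positions to mask: at least one, never more than n.
    budget = min(n, max(1, int(n * mask_ratio)))
    num_masked = 0
    while num_masked < budget:
        # Pick a span length that cannot exceed the remaining budget.
        span_len = min(rng.randint(1, max_span_len), budget - num_masked)
        start = rng.randint(0, n - span_len)
        for i in range(start, start + span_len):
            if labels[i] is None:  # skip positions that are already masked
                labels[i] = target_tokens[i]
                masked[i] = MASK
                num_masked += 1
    return masked, labels


tokens = "the patient received a low dose of acetylsalicylic acid".split()
masked, labels = mask_spans(tokens, mask_ratio=0.5, seed=0)
print(masked)
print(labels)
```

Positions where `labels` is `None` would be ignored by the loss, so the model is trained only to reconstruct the masked spans given the source sentence and the unmasked target context.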

research
06/07/2016

Incorporating Discrete Translation Lexicons into Neural Machine Translation

Neural machine translation (NMT) often makes mistakes in translating low...
research
10/13/2022

DICTDIS: Dictionary Constrained Disambiguation for Improved NMT

Domain-specific neural machine translation (NMT) systems (e.g., in educa...
research
06/07/2018

Multi-Source Neural Machine Translation with Missing Data

Multi-source translation is an approach to exploit multiple inputs (e.g....
research
04/17/2021

Sentence Alignment with Parallel Documents Helps Biomedical Machine Translation

The existing neural machine translation system has achieved near human-l...
research
10/24/2016

Bridging Neural Machine Translation and Bilingual Dictionaries

Neural Machine Translation (NMT) has become the new state-of-the-art in ...
research
10/11/2022

Improving Robustness of Retrieval Augmented Translation via Shuffling of Suggestions

Several recent studies have reported dramatic performance improvements i...
research
04/19/2019

Code-Switching for Enhancing NMT with Pre-Specified Translation

Leveraging user-provided translation to constrain NMT has practical sign...
