DICTDIS: Dictionary Constrained Disambiguation for Improved NMT

by   Ayush Maheshwari, et al.

Domain-specific neural machine translation (NMT) systems (e.g., in educational applications) are socially significant with the potential to help make information accessible to a diverse set of users in multilingual societies. It is desirable that such NMT systems be lexically constrained and draw from domain-specific dictionaries. Dictionaries could present multiple candidate translations for a source words/phrases on account of the polysemous nature of words. The onus is then on the NMT model to choose the contextually most appropriate candidate. Prior work has largely ignored this problem and focused on the single candidate setting where the target word or phrase is replaced by a single constraint. In this work we present DICTDIS, a lexically constrained NMT system that disambiguates between multiple candidate translations derived from dictionaries. We achieve this by augmenting training data with multiple dictionary candidates to actively encourage disambiguation during training. We demonstrate the utility of DICTDIS via extensive experiments on English-Hindi sentences in a variety of domains including news, finance, medicine and engineering. We obtain superior disambiguation performance on all domains with improved fluency in some domains of up to 4 BLEU points, when compared with existing approaches for lexically constrained and unconstrained NMT.


Bridging Neural Machine Translation and Bilingual Dictionaries

Neural Machine Translation (NMT) has become the new state-of-the-art in ...

Phrase Table as Recommendation Memory for Neural Machine Translation

Neural Machine Translation (NMT) has drawn much attention due to its pro...

Improving Lexically Constrained Neural Machine Translation with Source-Conditioned Masked Span Prediction

Generating accurate terminology is a crucial component for the practical...

Code-Switching for Enhancing NMT with Pre-Specified Translation

Leveraging user-provided translation to constrain NMT has practical sign...

Multi-Source Neural Machine Translation with Missing Data

Multi-source translation is an approach to exploit multiple inputs (e.g....

Specializing Multi-domain NMT via Penalizing Low Mutual Information

Multi-domain Neural Machine Translation (NMT) trains a single model with...

Please sign up or login with your details

Forgot password? Click here to reset