Context Models for OOV Word Translation in Low-Resource Languages

by Angli Liu, et al.

Out-of-vocabulary word translation is a major problem for the translation of low-resource languages that suffer from a lack of parallel training data. This paper evaluates the contributions of target-language context models towards the translation of OOV words, specifically in those cases where OOV translations are derived from external knowledge sources, such as dictionaries. We develop both neural and non-neural context models and evaluate them within both phrase-based and self-attention based neural machine translation systems. Our results show that neural language models that integrate additional context beyond the current sentence are the most effective in disambiguating possible OOV word translations. We present an efficient second-pass lattice-rescoring method for wide-context neural language models and demonstrate performance improvements over state-of-the-art self-attention based neural MT systems in five out of six low-resource language pairs.
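To make the disambiguation idea concrete, here is a minimal sketch of choosing among dictionary translations of an OOV word using target-language context that extends beyond the current sentence. It is not the paper's method: a smoothed unigram "cache" model stands in for the neural and non-neural context models the abstract mentions, and no lattice rescoring is performed. The function name `pick_oov_translation`, the smoothing parameter `alpha`, and the example data are all hypothetical.

```python
import math
from collections import Counter


def pick_oov_translation(candidates, context_tokens, alpha=1.0):
    """Pick the candidate translation best supported by the surrounding context.

    Assumption: a unigram cache model with add-alpha smoothing is used as a
    simple stand-in for a wide-context language model.
    """
    counts = Counter(context_tokens)
    # Vocabulary covers context words plus the candidates, so every
    # candidate receives a non-zero smoothed probability.
    vocab = set(context_tokens) | set(candidates)
    denom = len(context_tokens) + alpha * len(vocab)

    def score(word):
        # Smoothed log-probability of the word under the context cache.
        return math.log((counts[word] + alpha) / denom)

    return max(candidates, key=score)


# Hypothetical example: target-side context drawn from neighbouring sentences.
context = "they walked along the shore . the shore was rocky".split()
best = pick_oov_translation(["bank", "shore"], context)
print(best)
```

In this toy setting, a candidate that already occurs in the wider context scores higher than one that does not; a real system would replace the cache model with a trained language model and apply the scores while rescoring translation hypotheses.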



