Improving Character-based Decoding Using Target-Side Morphological Information for Neural Machine Translation

04/17/2018
by   Peyman Passban, et al.

Recently, neural machine translation (NMT) has emerged as a powerful alternative to conventional statistical approaches. However, its performance drops considerably in the presence of morphologically rich languages (MRLs): neural engines usually fail to cope with the large vocabularies and high out-of-vocabulary (OOV) word rates of MRLs, so existing word-based models are poorly suited to translating these languages. In this paper, we propose an extension to the state-of-the-art model of Chung et al. (2016). The extended model works at the character level and boosts the decoder with target-side morphological information. In our architecture, an additional morphology table is plugged into the model. Each time the decoder samples from the target vocabulary, the table sends auxiliary signals from the most relevant affixes to enrich the decoder's current state and constrain it toward better predictions. We evaluated our model on translation from English into three MRLs (German, Russian, and Turkish) and observed significant improvements.
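
The morphology-table lookup described in the abstract can be pictured roughly as follows. This is only a minimal sketch, not the authors' implementation: it assumes the "auxiliary signal" is an attention-style weighted sum of affix embeddings that is added to the decoder state. The abstract does not specify the scoring function or how the signal is combined with the state, so the names `MORPH_TABLE`, `morphology_signal`, and `enrich`, as well as the toy vectors, are all hypothetical.

```python
import math

# Hypothetical morphology table: affix -> embedding (toy Turkish affixes, toy values).
MORPH_TABLE = {
    "-lar": [1.0, 0.0, 0.0],
    "-ler": [0.0, 1.0, 0.0],
    "-de": [0.0, 0.0, 1.0],
}

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def morphology_signal(state, table):
    """Score every affix against the decoder state (assumed: dot product),
    then return the attention weights and the weighted sum of affix embeddings."""
    affixes = list(table)
    weights = softmax([dot(state, table[a]) for a in affixes])
    signal = [
        sum(w * table[a][i] for w, a in zip(weights, affixes))
        for i in range(len(state))
    ]
    return dict(zip(affixes, weights)), signal

def enrich(state, signal):
    """Combine the signal with the decoder state (assumed: element-wise sum)."""
    return [s + g for s, g in zip(state, signal)]

# One decoding step with a toy decoder state.
decoder_state = [2.0, 0.1, -1.0]
weights, signal = morphology_signal(decoder_state, MORPH_TABLE)
enriched_state = enrich(decoder_state, signal)
```

In this toy setup the affix whose embedding aligns best with the decoder state receives the largest weight, so it dominates the signal that is added back to the state before the next character is predicted.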

research
08/16/2016

An Efficient Character-Level Neural Machine Translation

Neural machine translation aims at building a single large neural networ...
research
07/31/2017

Linguistically Motivated Vocabulary Reduction for Neural Machine Translation from Turkish to English

The necessity of using a fixed-size word vocabulary in order to control ...
research
07/19/2017

Modeling Target-Side Inflection in Neural Machine Translation

NMT systems have problems with large vocabulary sizes. Byte-pair encodin...
research
11/06/2017

Synthetic and Natural Noise Both Break Neural Machine Translation

Character-based neural machine translation (NMT) models alleviate out-of...
research
03/25/2022

Modeling Target-Side Morphology in Neural Machine Translation: A Comparison of Strategies

Morphologically rich languages pose difficulties to machine translation....
research
10/30/2019

A Latent Morphology Model for Open-Vocabulary Neural Machine Translation

Translation into morphologically-rich languages challenges neural machin...
research
09/02/2021

How Suitable Are Subword Segmentation Strategies for Translating Non-Concatenative Morphology?

Data-driven subword segmentation has become the default strategy for ope...
