A Character-Level Decoder without Explicit Segmentation for Neural Machine Translation

03/19/2016
by Junyoung Chung, et al.

Existing machine translation systems, whether phrase-based or neural, have relied almost exclusively on word-level modelling with explicit segmentation. In this paper, we ask a fundamental question: can neural machine translation generate a character sequence without any explicit segmentation? To answer this question, we evaluate an attention-based encoder-decoder with a subword-level encoder and a character-level decoder on four language pairs (En-Cs, En-De, En-Ru and En-Fi) using the parallel corpora from WMT'15. Our experiments show that the models with a character-level decoder outperform the ones with a subword-level decoder on all four language pairs. Furthermore, the ensembles of neural models with a character-level decoder outperform the state-of-the-art non-neural machine translation systems on En-Cs, En-De and En-Fi and perform comparably on En-Ru.

research
10/10/2016

Fully Character-Level Neural Machine Translation without Explicit Segmentation

Most existing machine translation systems operate at the level of words,...
research
09/06/2018

Character-Aware Decoder for Neural Machine Translation

Standard neural machine translation (NMT) systems operate primarily on w...
research
09/06/2017

Towards Neural Machine Translation with Latent Tree Attention

Building models that take advantage of the hierarchical structure of lan...
research
10/20/2016

Neural Machine Translation with Characters and Hierarchical Encoding

Most existing Neural Machine Translation models use groups of characters...
research
06/13/2017

Plan, Attend, Generate: Character-level Neural Machine Translation with Planning in the Decoder

We investigate the integration of a planning mechanism into an encoder-d...
research
10/05/2019

How Transformer Revitalizes Character-based Neural Machine Translation: An Investigation on Japanese-Vietnamese Translation Systems

While translating between Chinese-centric languages, many works have dis...
research
08/12/2020

Approaching Neural Chinese Word Segmentation as a Low-Resource Machine Translation Task

Supervised Chinese word segmentation has been widely approached as seque...
