Fully Character-Level Neural Machine Translation without Explicit Segmentation

10/10/2016
by Jason Lee, et al.

Most existing machine translation systems operate at the level of words, relying on explicit segmentation to extract tokens. We introduce a neural machine translation (NMT) model that maps a source character sequence to a target character sequence without any segmentation. We employ a character-level convolutional network with max-pooling at the encoder to reduce the length of the source representation, allowing the model to be trained at a speed comparable to subword-level models while capturing local regularities. Our character-to-character model outperforms a recently proposed baseline with a subword-level encoder on WMT'15 DE-EN and CS-EN, and gives comparable performance on FI-EN and RU-EN. We then demonstrate that it is possible to share a single character-level encoder across multiple languages by training a model on a many-to-one translation task. In this multilingual setting, the character-level encoder significantly outperforms the subword-level encoder on all the language pairs. We observe that on CS-EN, FI-EN and RU-EN, the quality of the multilingual character-level translation even surpasses that of models trained specifically on each language pair alone, both in terms of BLEU score and human judgment.
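
To make the length-reduction idea concrete, below is a minimal sketch of a character-level encoder front end in the spirit of the abstract: character embeddings, a 1-D convolution over characters, and strided max-pooling over time to shorten the sequence. This is an illustrative assumption-based sketch, not the paper's exact architecture; the class name, PyTorch framework choice, and all hyperparameters (vocabulary size, dimensions, kernel size, pool stride) are hypothetical.

import torch
import torch.nn as nn

class CharConvEncoder(nn.Module):
    # Hypothetical hyperparameters for illustration only.
    def __init__(self, vocab_size=300, emb_dim=128, conv_dim=256,
                 kernel_size=5, pool_stride=5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Convolution over characters captures local regularities
        # (morpheme-like patterns) without explicit segmentation.
        self.conv = nn.Conv1d(emb_dim, conv_dim, kernel_size,
                              padding=kernel_size // 2)
        # Strided max-pooling shortens the sequence, which is what lets a
        # character-level model train at subword-comparable speed.
        self.pool = nn.MaxPool1d(pool_stride, stride=pool_stride)

    def forward(self, char_ids):          # (batch, src_len)
        x = self.embed(char_ids)          # (batch, src_len, emb_dim)
        x = x.transpose(1, 2)             # (batch, emb_dim, src_len)
        x = torch.relu(self.conv(x))      # (batch, conv_dim, src_len)
        x = self.pool(x)                  # (batch, conv_dim, src_len // pool_stride)
        return x.transpose(1, 2)          # (batch, reduced_len, conv_dim)

enc = CharConvEncoder()
chars = torch.randint(0, 300, (2, 100))   # batch of 2 sentences, 100 characters each
print(enc(chars).shape)                   # torch.Size([2, 20, 256]) -- 5x shorter

In the full model this pooled representation would feed a recurrent encoder-decoder with attention; the sketch stops at the pooled output because that is the component responsible for the training-speed claim in the abstract.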


Related research

03/19/2016
A Character-Level Decoder without Explicit Segmentation for Neural Machine Translation
The existing machine translation systems, whether phrase-based or neural...

09/10/2020
On Target Segmentation for Direct Speech Translation
Recent studies on direct speech translation show continuous improvements...

09/06/2017
Towards Neural Machine Translation with Latent Tree Attention
Building models that take advantage of the hierarchical structure of lan...

10/31/2016
Neural Machine Translation in Linear Time
We present a novel neural network for processing sequences. The ByteNet ...

06/13/2017
Plan, Attend, Generate: Character-level Neural Machine Translation with Planning in the Decoder
We investigate the integration of a planning mechanism into an encoder-d...

05/23/2022
Local Byte Fusion for Neural Machine Translation
Subword tokenization schemes are the dominant technique used in current ...

10/23/2022
Additive Interventions Yield Robust Multi-Domain Machine Translation Models
Additive interventions are a recently-proposed mechanism for controlling...
