Learning to Segment Inputs for NMT Favors Character-Level Processing

10/02/2018
by   Julia Kreutzer, et al.

Most modern neural machine translation (NMT) systems rely on presegmented inputs. Segmentation granularity determines the input and output sequence lengths, and hence the modeling depth, as well as the source and target vocabularies, which in turn determine model size, the computational cost of softmax normalization, and the handling of out-of-vocabulary words. Current practice, however, is to use static, heuristic-based segmentations that are fixed before NMT training, which raises the question of whether the chosen segmentation is optimal for the translation task. To overcome suboptimal segmentation choices, we present an algorithm for dynamic segmentation based on the Adaptive Computation Time algorithm (Graves, 2016) that is trainable end-to-end and driven by the NMT objective. In an evaluation on four translation tasks we find that, given the freedom to navigate between different segmentation levels, the model prefers to operate at (almost) the character level, providing support for purely character-level NMT models from a novel angle.
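The core mechanism the paper builds on, Adaptive Computation Time (Graves, 2016), lets a network decide per position how many pondering steps to take: halting probabilities are accumulated until they exceed 1 - epsilon, and the output is a halting-weighted mix of the intermediate states. The sketch below illustrates only that generic halting rule, assuming given per-step states and halting probabilities; `act_combine` and its parameters are illustrative names, not the authors' implementation.

```python
import numpy as np

def act_combine(states, halt_probs, eps=0.01):
    """Simplified ACT halting (Graves, 2016): combine per-step states.

    states:     (T, d) array of intermediate states for one position,
                one state per pondering step.
    halt_probs: length-T sequence of halting probabilities in [0, 1].
    Returns (output, n_steps): the halting-weighted state mixture and
    the number of pondering steps actually used.
    """
    total = 0.0
    weights = []
    for t, pt in enumerate(halt_probs):
        # Halt once cumulative probability would exceed 1 - eps
        # (or at the last step); the remainder 1 - total becomes
        # the final weight, so the weights always sum to 1.
        if total + pt >= 1.0 - eps or t == len(halt_probs) - 1:
            weights.append(1.0 - total)
            break
        weights.append(pt)
        total += pt
    w = np.array(weights)
    output = (w[:, None] * np.asarray(states)[: len(w)]).sum(axis=0)
    return output, len(w)
```

For example, with halting probabilities `[0.6, 0.5, 0.9]` the model halts after two steps (0.6 + 0.5 exceeds 1 - eps), and the output is 0.6 times the first state plus the 0.4 remainder times the second. In the paper's setting, a similar learned halting decision governs how many characters are merged per segment, so the segmentation itself is shaped by the NMT training objective.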

