Revisiting Character-Based Neural Machine Translation with Capacity and Compression

08/29/2018
by Colin Cherry, et al.

Translating characters instead of words or word-fragments has the potential to simplify the processing pipeline for neural machine translation (NMT), and improve results by eliminating hyper-parameters and manual feature engineering. However, it results in longer sequences in which each symbol contains less information, creating both modeling and computational challenges. In this paper, we show that the modeling problem can be solved by standard sequence-to-sequence architectures of sufficient depth, and that deep models operating at the character level outperform identical models operating over word fragments. This result implies that alternative architectures for handling character input are better viewed as methods for reducing computation time than as improved ways of modeling longer sequences. From this perspective, we evaluate several techniques for character-level NMT, verify that they do not match the performance of our deep character baseline model, and evaluate the performance versus computation time tradeoffs they offer. Within this framework, we also perform the first evaluation for NMT of conditional computation over time, in which the model learns which timesteps can be skipped, rather than having them be dictated by a fixed schedule specified before training begins.
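The abstract's final claim concerns conditional computation over time: the model learns which timesteps to skip rather than following a fixed schedule. The paper does not specify its architecture here, but the core idea can be sketched with a hypothetical recurrence in which a learned scalar gate decides, per timestep, whether to update the hidden state or carry it forward unchanged (all parameter names below are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def skip_rnn_step(h, x, params, threshold=0.5):
    """One step of a skip-gated recurrence (illustrative sketch only).

    A learned gate g decides whether this timestep is processed:
    if g falls below the threshold, the hidden state is copied
    forward and the (more expensive) update is skipped entirely.
    """
    W_g, U_g, b_g, W_h, U_h, b_h = params  # hypothetical parameter layout
    # Scalar gate: probability that this timestep carries enough
    # information to be worth processing.
    g = sigmoid(x @ W_g + h @ U_g + b_g)
    if g < threshold:
        return h, False          # skip: hidden state unchanged, no update cost
    # Process: standard tanh recurrence update.
    h_new = np.tanh(x @ W_h + h @ U_h + b_h)
    return h_new, True
```

At inference time, every skipped step saves the recurrent update, which is why the paper frames such techniques as trading translation quality for reduced computation on long character sequences.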


Related research

- 10/02/2018: Learning to Segment Inputs for NMT Favors Character-Level Processing. "Most modern neural machine translation (NMT) systems rely on presegmente..."
- 10/02/2018: Optimally Segmenting Inputs for NMT Shows Preference for Character-Level Processing. "Most modern neural machine translation (NMT) systems rely on presegmente..."
- 10/15/2019: On the Importance of Word Boundaries in Character-level Neural Machine Translation. "Neural Machine Translation (NMT) models generally perform translation us..."
- 05/22/2020: Character-level Transformer-based Neural Machine Translation. "Neural machine translation (NMT) is nowadays commonly applied at the sub..."
- 02/28/2023: Are Character-level Translations Worth the Wait? An Extensive Comparison of Character- and Subword-level Models for Machine Translation. "Pretrained large character-level language models have been recently revi..."
- 11/06/2020: Understanding Pure Character-Based Neural Machine Translation: The Case of Translating Finnish into English. "Recent work has shown that deeper character-based neural machine transla..."
