Character-level Transformer-based Neural Machine Translation

05/22/2020
by   Nikolay Banar, et al.
0

Neural machine translation (NMT) is nowadays commonly applied at the subword level, using byte-pair encoding. A promising alternative approach focuses on character-level translation, which simplifies processing pipelines in NMT considerably. This approach, however, must consider relatively longer sequences, rendering the training process prohibitively expensive. In this paper, we discuss a novel, Transformer-based approach, that we compare, both in speed and in quality to the Transformer at subword and character levels, as well as previously developed character-level models. We evaluate our models on 4 language pairs from WMT'15: DE-EN, CS-EN, FI-EN and RU-EN. The proposed novel architecture can be trained on a single GPU and is 34 character-level Transformer; still, the obtained results are at least on par with it. In addition, our proposed model outperforms the subword-level model in FI-EN and shows close results in CS-EN. To stimulate further research in this area and close the gap with subword-level NMT, we make all our code and models publicly available.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
08/08/2023

Character-level NMT and language similarity

We explore the effectiveness of character-level neural machine translati...
research
05/26/2023

TranSFormer: Slow-Fast Transformer for Machine Translation

Learning multiscale Transformer models has been evidenced as a viable ap...
research
08/17/2021

Learning C to x86 Translation: An Experiment in Neural Compilation

Deep learning has had a significant impact on many fields. Recently, cod...
research
09/30/2021

SCIMAT: Science and Mathematics Dataset

In this work, we announce a comprehensive well curated and opensource da...
research
08/29/2018

Revisiting Character-Based Neural Machine Translation with Capacity and Compression

Translating characters instead of words or word-fragments has the potent...
research
11/12/2019

Character-based NMT with Transformer

Character-based translation has several appealing advantages, but its pe...
research
05/26/2022

Dynamically Relative Position Encoding-Based Transformer for Automatic Code Edit

Adapting Deep Learning (DL) techniques to automate non-trivial coding ac...

Please sign up or login with your details

Forgot password? Click here to reset