Logographic Subword Model for Neural Machine Translation

09/07/2018
by Yihao Fang, et al.

A novel logographic subword model is proposed to reinterpret logograms as abstract subwords for neural machine translation. Our approach drastically reduces the size of an artificial neural network while maintaining BLEU scores comparable to those attained with the baseline RNN and CNN seq2seq models. The smaller model size also leads to shorter training and inference times. Experiments demonstrate that, in English-Chinese and Chinese-English translation tasks, the reductions in model size, training time, and inference time range from 11% to as much as 77%. Compared to previous subword models, abstract subwords can be applied to various logographic languages. Because most logographic languages are ancient and very low-resource, these advantages are highly desirable for archaeological computational linguistics applications such as a resource-limited, offline, hand-held Demotic-English translator.

research
03/11/2015

On Using Monolingual Corpora in Neural Machine Translation

Recent work on end-to-end neural network-based architectures for machine...
research
04/01/2021

Low-Resource Neural Machine Translation for Southern African Languages

Low-resource African languages have not fully benefited from the progres...
research
11/07/2019

SubCharacter Chinese-English Neural Machine Translation with Wubi encoding

Neural machine translation (NMT) is one of the best methods for understa...
research
09/30/2022

Blur the Linguistic Boundary: Interpreting Chinese Buddhist Sutra in English via Neural Machine Translation

Buddhism is an influential religion with a long-standing history and pro...
research
10/30/2018

Machine Translation between Vietnamese and English: an Empirical Study

Machine translation is shifting to an end-to-end approach based on deep ...
research
03/29/2021

Unsupervised Machine Translation On Dravidian Languages

Unsupervised neural machine translation (UNMT) is beneficial especially ...
research
02/17/2021

Sparsely Factored Neural Machine Translation

The standard approach to incorporate linguistic information to neural ma...