Log In Sign Up

Improving Character-level Japanese-Chinese Neural Machine Translation with Radicals as an Additional Input Feature

by   Jinyi Zhang, et al.

In recent years, Neural Machine Translation (NMT) has been proven to get impressive results. While some additional linguistic features of input words improve word-level NMT, any additional character features have not been used to improve character-level NMT so far. In this paper, we show that the radicals of Chinese characters (or kanji), as a character feature information, can be easily provide further improvements in the character-level NMT. In experiments on WAT2016 Japanese-Chinese scientific paper excerpt corpus (ASPEC-JP), we find that the proposed method improves the translation quality according to two aspects: perplexity and BLEU. The character-level NMT with the radical input feature's model got a state-of-the-art result of 40.61 BLEU points in the test set, which is an improvement of about 8.6 BLEU points over the best system on the WAT2016 Japanese-to-Chinese translation subtask with ASPEC-JP. The improvements over the character-level NMT with no additional input feature are up to about 1.5 and 1.4 BLEU points in the development-test set and the test set of the corpus, respectively.


page 1

page 2

page 3

page 4


Apply Chinese Radicals Into Neural Machine Translation: Deeper Than Character Level

In neural machine translation (NMT), researchers face the challenge of u...

Combining Word and Character Vector Representation on Neural Machine Translation

This paper describes combinations of word vector representation and char...

Master Thesis: Neural Sign Language Translation by Learning Tokenization

In this thesis, we propose a multitask learning based method to improve ...

Adabot: Fault-Tolerant Java Decompiler

Reverse Engineering(RE) has been a fundamental task in software engineer...

Improving the Robustness of Speech Translation

Although neural machine translation (NMT) has achieved impressive progre...

Inference-only sub-character decomposition improves translation of unseen logographic characters

Neural Machine Translation (NMT) on logographic source languages struggl...

A Tree-based Decoder for Neural Machine Translation

Recent advances in Neural Machine Translation (NMT) show that adding syn...