A Hybrid Morpheme-Word Representation for Machine Translation of Morphologically Rich Languages

11/19/2019
by Minh-Thang Luong, et al.

We propose a language-independent approach for improving statistical machine translation for morphologically rich languages. It uses a hybrid morpheme-word representation in which the basic unit of translation is the morpheme, but word boundaries are respected at all stages of the translation process. Our model extends the classic phrase-based model by means of (1) word boundary-aware morpheme-level phrase extraction, (2) minimum error-rate training for a morpheme-level translation model using word-level BLEU, and (3) joint scoring with morpheme- and word-level language models. Further improvements are achieved by combining our model with the classic one. An evaluation on English-to-Finnish translation using Europarl (714K sentence pairs; 15.5M English words) shows statistically significant improvements over the classic model, based on both BLEU and human judgments.
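To make the hybrid representation concrete, the following is a minimal sketch of one way a morpheme stream can keep word boundaries recoverable. It is not the authors' implementation: the `+` continuation marker, the function names, and the Finnish segmentation used in the usage note are all assumptions made for illustration.

```python
from typing import List

# Assumed convention (not from the paper): a trailing "+" on a morpheme
# means "the current word continues with the next token".
CONT = "+"


def to_morphemes(segmented_words: List[List[str]]) -> List[str]:
    """Flatten per-word morpheme lists into one token stream.

    Every morpheme except the last one of each word gets the CONT marker,
    so word boundaries stay recoverable from the stream itself.
    """
    tokens = []
    for morphs in segmented_words:
        for i, morph in enumerate(morphs):
            is_last = i == len(morphs) - 1
            tokens.append(morph if is_last else morph + CONT)
    return tokens


def to_words(tokens: List[str]) -> List[str]:
    """Rejoin a morpheme stream into words, e.g. before word-level scoring."""
    words, current = [], ""
    for tok in tokens:
        if tok.endswith(CONT):
            current += tok[: -len(CONT)]
        else:
            words.append(current + tok)
            current = ""
    if current:  # stream ended mid-word; keep the partial word
        words.append(current)
    return words


def words_spanned(tokens: List[str], start: int, end: int) -> int:
    """Count the (possibly partial) words touched by tokens[start:end]."""
    span = tokens[start:end]
    if not span:
        return 0
    complete = sum(1 for tok in span if not tok.endswith(CONT))
    trailing_partial = 1 if span[-1].endswith(CONT) else 0
    return complete + trailing_partial
```

For example, with the illustrative segmentation `[["talo", "i", "ssa"], ["on"]]`, `to_morphemes` yields `["talo+", "i+", "ssa", "on"]` and `to_words` restores `["taloissa", "on"]`. A helper like `words_spanned` lets a phrase-length limit be stated in words rather than morphemes, and rejoining morphemes with `to_words` is what allows a morpheme-level system to be scored with a word-level metric such as BLEU.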

research
10/24/2016

Reordering rules for English-Hindi SMT

Reordering is a preprocessing stage for Statistical Machine Translation ...
research
06/14/2016

Word Representation Models for Morphologically Rich Languages in Neural Machine Translation

Dealing with the complex word forms in morphologically rich languages is...
research
01/23/2014

Improving Statistical Machine Translation for a Resource-Poor Language Using Related Resource-Rich Languages

We propose a novel language-independent approach for improving machine t...
research
08/03/2016

To Swap or Not to Swap? Exploiting Dependency Word Pairs for Reordering in Statistical Machine Translation

Reordering poses a major challenge in machine translation (MT) between t...
research
09/17/2013

Exploiting Similarities among Languages for Machine Translation

Dictionaries and phrase tables are the basis of modern statistical machi...
research
04/28/2015

Lexical Translation Model Using a Deep Neural Network Architecture

In this paper we combine the advantages of a model using global source s...
research
03/09/2015

Context-Dependent Translation Selection Using Convolutional Neural Network

We propose a novel method for translation selection in statistical machi...