Augmenting Phrase Table by Employing Lexicons for Pivot-based SMT

12/01/2015
by   Yiming Cui, et al.

A pivot language is commonly used to address the data sparseness problem in machine translation, especially when no parallel data exists for a particular language pair. Combining source-to-pivot and pivot-to-target translation models induces a new translation model through the pivot language. However, errors in the two models may compound as noise, and the combined model may still suffer from a serious phrase sparsity problem. In this paper, we directly employ the word lexical model from the IBM models as an additional resource to augment the pivot phrase table. We also propose a phrase table pruning method that takes into account both source and target phrasal coverage. Experimental results show that our pruning method significantly outperforms the conventional one, which considers only source-side phrasal coverage. Furthermore, including the entries of the lexicon model increases phrase coverage, and we achieve improved results in Chinese-to-Japanese translation using English as the pivot language.
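To make the pivot idea concrete, the following is a minimal sketch of one common way to combine two phrase tables, often called triangulation, in which p(t|s) is approximated by summing p(p|s) * p(t|p) over shared pivot phrases. It also shows one plausible reading of lexicon augmentation: adding word pairs that the induced table lacks. The function names, the toy probabilities, and the rule of never overwriting existing entries are assumptions made for illustration; the abstract does not specify the exact procedure used in the paper.

```python
from collections import defaultdict


def triangulate(src_to_pvt, pvt_to_tgt):
    """Induce p(t|s) as the sum over pivot phrases of p(p|s) * p(t|p)."""
    induced = defaultdict(lambda: defaultdict(float))
    for src, pivots in src_to_pvt.items():
        for pvt, p_pvt_given_src in pivots.items():
            # Pivot phrases missing from the second table contribute nothing,
            # which is one source of the phrase sparsity the abstract mentions.
            for tgt, p_tgt_given_pvt in pvt_to_tgt.get(pvt, {}).items():
                induced[src][tgt] += p_pvt_given_src * p_tgt_given_pvt
    return {src: dict(tgts) for src, tgts in induced.items()}


def augment_with_lexicon(phrase_table, lexicon):
    """Add lexicon pairs absent from the table (assumed rule: never overwrite)."""
    merged = {src: dict(tgts) for src, tgts in phrase_table.items()}
    for src, targets in lexicon.items():
        for tgt, prob in targets.items():
            merged.setdefault(src, {}).setdefault(tgt, prob)
    return merged


# Toy Chinese-to-English and English-to-Japanese tables (hypothetical values).
zh_en = {"你好": {"hello": 0.7, "hi": 0.3}}
en_ja = {
    "hello": {"こんにちは": 0.9, "もしもし": 0.1},
    "hi": {"こんにちは": 0.6, "やあ": 0.4},
}
# Toy word lexicon (hypothetical); "谢谢" has no entry in the induced table.
zh_ja_lexicon = {"谢谢": {"ありがとう": 0.8}, "你好": {"こんにちは": 0.5}}

pivot_table = triangulate(zh_en, en_ja)
augmented_table = augment_with_lexicon(pivot_table, zh_ja_lexicon)
```

In this sketch the lexicon only fills gaps, so the source coverage of the table grows while scores already induced through the pivot stay unchanged. A real system would also have to combine several feature scores per phrase pair and prune the resulting table.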


