A Novel Approach to Dropped Pronoun Translation

04/21/2016
by   Longyue Wang, et al.
0

Dropped Pronouns (DP) in which pronouns are frequently dropped in the source language but should be retained in the target language are challenge in machine translation. In response to this problem, we propose a semi-supervised approach to recall possibly missing pronouns in the translation. Firstly, we build training data for DP generation in which the DPs are automatically labelled according to the alignment information from a parallel corpus. Secondly, we build a deep learning-based DP generator for input sentences in decoding when no corresponding references exist. More specifically, the generation is two-phase: (1) DP position detection, which is modeled as a sequential labelling task with recurrent neural networks; and (2) DP prediction, which employs a multilayer perceptron with rich features. Finally, we integrate the above outputs into our translation system to recall missing pronouns by both extracting rules from the DP-labelled training data and translating the DP-generated input sentences. Experimental results show that our approach achieves a significant improvement of 1.58 BLEU points in translation performance with 66

READ FULL TEXT

page 1

page 2

page 3

page 4

research
01/10/2018

Translating Pro-Drop Languages with Reconstruction Models

Pronouns are frequently omitted in pro-drop languages, such as Chinese, ...
research
10/15/2018

Learning to Jointly Translate and Predict Dropped Pronouns with a Shared Reconstruction Mechanism

Pronouns are frequently omitted in pro-drop languages, such as Chinese, ...
research
08/14/2019

On The Evaluation of Machine Translation Systems Trained With Back-Translation

Back-translation is a widely used data augmentation technique which leve...
research
04/07/2018

Guiding Neural Machine Translation with Retrieved Translation Pieces

One of the difficulties of neural machine translation (NMT) is the recal...
research
08/24/2021

The Word is Mightier than the Label: Learning without Pointillistic Labels using Data Programming

Most advanced supervised Machine Learning (ML) models rely on vast amoun...
research
10/20/2022

Dense Paraphrasing for Textual Enrichment

Understanding inferences and answering questions from text requires more...
research
10/24/2022

Dual-Pixel Raindrop Removal

Removing raindrops in images has been addressed as a significant task fo...

Please sign up or login with your details

Forgot password? Click here to reset