Zero-pronoun Data Augmentation for Japanese-to-English Translation

07/01/2021
by Ryokan Ri, et al.

In Japanese-to-English translation, zero pronouns in Japanese pose a challenge, since the model must infer the omitted pronoun and produce it on the English target side. Fully resolving a zero pronoun often requires discourse context, but in some cases the local context within a sentence gives clues to its identity. In this study, we propose a data augmentation method that provides additional training signals for the translation model to learn correlations between local context and zero pronouns. We show, through machine translation experiments in the conversational domain, that the proposed method significantly improves the accuracy of zero pronoun translation.
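
The abstract does not spell out the augmentation procedure. One plausible way to create such training signals, shown here purely as an illustrative assumption and not as the authors' method, is to delete an overt subject pronoun from the Japanese source while keeping the English target unchanged, so that the model has to recover the pronoun from the remaining local context. The pronoun list, the particle list, and the function names below are all hypothetical, and the character-level matching is a simplification that a real pipeline would likely replace with a morphological analyzer.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical pronoun and particle inventories (assumptions, not from the paper).
PRONOUNS = ("私", "僕", "俺", "あなた", "君")
PARTICLES = ("は", "が")

# Match a sentence-initial pronoun followed by a topic/subject particle
# and an optional Japanese comma.
_LEADING_PRONOUN = re.compile(
    "^(?:" + "|".join(PRONOUNS) + ")(?:" + "|".join(PARTICLES) + ")、?"
)


def drop_subject_pronoun(ja: str) -> Optional[str]:
    """Return `ja` without its leading pronoun phrase, or None if none was found."""
    stripped, count = _LEADING_PRONOUN.subn("", ja, count=1)
    if count == 0 or not stripped:
        return None
    return stripped


def augment(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep every original pair and add a pronoun-dropped copy where possible."""
    augmented = list(pairs)
    for ja, en in pairs:
        dropped = drop_subject_pronoun(ja)
        if dropped is not None:
            # The English side is left untouched: it still contains the pronoun.
            augmented.append((dropped, en))
    return augmented


if __name__ == "__main__":
    corpus = [
        ("私は学生です。", "I am a student."),
        ("雨が降っている。", "It is raining."),
    ]
    for ja, en in augment(corpus):
        print(ja, "->", en)
```

In this sketch the second sentence is left alone because its subject is not a pronoun, while the first yields an extra pair whose source lacks the overt pronoun but whose target still begins with "I".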


