A Simple and Effective Approach to Automatic Post-Editing with Transfer Learning

06/14/2019
by   Gonçalo M. Correia, et al.

Automatic post-editing (APE) seeks to automatically refine the output of a black-box machine translation (MT) system by learning from human post-edits. APE systems are usually trained by complementing human post-edited data with large amounts of artificial data generated through back-translation, a time-consuming process that is often no easier than training an MT system from scratch. In this paper, we propose an alternative in which we fine-tune pre-trained BERT models as both the encoder and the decoder of an APE system, exploring several parameter sharing strategies. Training on a dataset of only 23K sentences for 3 hours on a single GPU, we obtain results that are competitive with systems trained on 5M artificial sentences. When we add this artificial data, our method obtains state-of-the-art results.
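To make the idea of parameter sharing between a BERT-initialized encoder and decoder more concrete, the following is a minimal sketch that counts Transformer-layer parameters under a few sharing choices. It is not the authors' implementation: the abstract does not list the strategies explored, so the three switches (`share_self_attn`, `tie_context_attn`, `share_ffn`) and all function names are hypothetical. The layer sizes are assumed to follow the published BERT-base configuration (hidden size 768, feed-forward size 3072, 12 layers), and embeddings are left out of the count.

```python
# Assumed sizes, taken from the published BERT-base configuration.
HIDDEN = 768
FFN = 3072
LAYERS = 12


def linear(n_in, n_out):
    """Weights plus bias of one dense layer."""
    return n_in * n_out + n_out


def attention_params(hidden=HIDDEN):
    """Query, key, value and output projections of one attention block."""
    return 4 * linear(hidden, hidden)


def ffn_params(hidden=HIDDEN, ffn=FFN):
    """The two dense layers of the position-wise feed-forward block."""
    return linear(hidden, ffn) + linear(ffn, hidden)


def layer_norm_params(hidden=HIDDEN):
    """Scale and shift vectors of one layer normalization."""
    return 2 * hidden


def count_layer_params(share_self_attn=False, tie_context_attn=False,
                       share_ffn=False):
    """Count parameters in the encoder and decoder layers (no embeddings).

    Hypothetical switches, for illustration only:
      share_self_attn:  decoder self-attention reuses the encoder's weights
      tie_context_attn: decoder context attention reuses self-attention weights
      share_ffn:        decoder feed-forward block reuses the encoder's weights
    """
    encoder = attention_params() + ffn_params() + 2 * layer_norm_params()

    # Assumption: the decoder always keeps its own three layer norms.
    decoder = 3 * layer_norm_params()
    if not share_self_attn:
        decoder += attention_params()
    if not tie_context_attn:
        decoder += attention_params()
    if not share_ffn:
        decoder += ffn_params()

    return LAYERS * (encoder + decoder)


if __name__ == "__main__":
    for name, kwargs in [
        ("no sharing", {}),
        ("shared self-attention", {"share_self_attn": True}),
        ("everything shared or tied", {"share_self_attn": True,
                                       "tie_context_attn": True,
                                       "share_ffn": True}),
    ]:
        print(f"{name}: {count_layer_params(**kwargs):,} parameters")
```

Under these assumptions, each block that the decoder borrows from the encoder removes that block's parameters from every decoder layer, which is one reason sharing can be attractive when only a small post-edited dataset is available for fine-tuning.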


