Controlling Text Complexity in Neural Machine Translation

11/03/2019
by Sweta Agrawal, et al.

This work introduces a machine translation task in which the output is aimed at audiences with different levels of target-language proficiency. We collect a high-quality dataset of news articles available in English and Spanish, written for diverse grade levels, and propose a method to align segments across comparable bilingual articles. The resulting dataset makes it possible to train multi-task sequence-to-sequence models that translate Spanish into English targeted at an easier reading grade level than the original Spanish. We show that these multi-task models outperform pipeline approaches that translate and simplify text independently.
