Can Your Context-Aware MT System Pass the DiP Benchmark Tests? : Evaluation Benchmarks for Discourse Phenomena in Machine Translation

04/30/2020
by Prathyusha Jwalapuram, et al.

Although machine translation (MT) systems increasingly incorporate contextual information, evidence that this improves translation quality is sparse, especially for discourse phenomena. Popular metrics like BLEU are not expressive or sensitive enough to capture quality improvements or drops that are minor in size but significant in perception. We introduce first-of-their-kind MT benchmark datasets that aim to track and hail improvements across four main discourse phenomena: anaphora, lexical consistency, coherence and readability, and discourse connective translation. We also introduce evaluation methods for these tasks, and evaluate several baseline MT systems on the curated datasets. Surprisingly, we find that existing context-aware models do not improve discourse-related translations consistently across languages and phenomena.
