Evaluating and Improving the Coreference Capabilities of Machine Translation Models

02/16/2023
by Asaf Yehudai, et al.

Machine translation (MT) requires a wide range of linguistic capabilities, which current end-to-end models are expected to learn implicitly by observing aligned sentences in bilingual corpora. In this work, we ask: How well do MT models learn coreference resolution from implicit signal? To answer this question, we develop an evaluation methodology that derives coreference clusters from MT output and evaluates them without requiring annotations in the target language. We further evaluate several prominent open-source and commercial MT systems, translating from English to six target languages, and compare them to state-of-the-art coreference resolvers on three challenging benchmarks. Our results show that the monolingual resolvers greatly outperform MT models. Motivated by this result, we experiment with different methods for incorporating the output of coreference resolution models in MT, showing improvement over strong baselines.
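The idea of reading coreference decisions off a translation can be illustrated with a small sketch. It is not the authors' procedure. It assumes a target language that marks grammatical gender, a word alignment between source and target tokens, and per-token gender tags on the target side. The function name `implied_cluster`, the example sentence, the alignment, and the gender tags are all hypothetical. A source pronoun is linked to every candidate antecedent whose translation agrees in gender with the translated pronoun, and the resulting cluster is compared against the source-side gold cluster, so no target-language annotation is needed.

```python
def implied_cluster(pronoun_idx, candidate_idxs, alignment, target_gender):
    """Return the source-side cluster that the translation implies.

    alignment:     dict mapping source token index -> target token index
    target_gender: dict mapping target token index -> gender tag (e.g. "Fem")
    All inputs are illustrative assumptions, not the paper's data format.
    """
    cluster = {pronoun_idx}
    tgt_pron = alignment.get(pronoun_idx)
    pron_gender = target_gender.get(tgt_pron) if tgt_pron is not None else None
    if pron_gender is None:
        # Pronoun was dropped or carries no gender: nothing can be inferred.
        return cluster
    for cand in candidate_idxs:
        tgt_cand = alignment.get(cand)
        if tgt_cand is not None and target_gender.get(tgt_cand) == pron_gender:
            cluster.add(cand)
    return cluster


# Hypothetical example (English -> Spanish).
# Source: The(0) doctor(1) asked(2) the(3) nurse(4) to(5) help(6) her(7)
# Target: El(0) médico(1) le(2) pidió(3) a(4) la(5) enfermera(6) que(7) la(8) ayudara(9)
alignment = {1: 1, 4: 6, 7: 8}
target_gender = {1: "Masc", 6: "Fem", 8: "Fem"}

gold = {1, 7}  # assumed source-side annotation: "her" refers to the doctor
predicted = implied_cluster(7, [1, 4], alignment, target_gender)
is_correct = predicted == gold
```

In this made-up case the translation renders "doctor" as masculine and the pronoun as feminine, so the implied cluster links "her" to "nurse" and disagrees with the gold cluster. A candidate whose translation carries no gender tag is never linked, which keeps unalignable or unmarked tokens from producing spurious matches.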

research
02/06/2018

Système de traduction automatique statistique Anglais-Arabe

Machine translation (MT) is the process of translating text written in a...
research
06/27/2019

The Impact of Preprocessing on Arabic-English Statistical and Neural Machine Translation

Neural networks have become the state-of-the-art approach for machine tr...
research
09/05/2023

Automating Behavioral Testing in Machine Translation

Behavioral testing in NLP allows fine-grained evaluation of systems by e...
research
12/14/2016

Unsupervised Clustering of Commercial Domains for Adaptive Machine Translation

In this paper, we report on domain clustering in the ambit of an adaptiv...
research
10/24/2022

Bilingual Synchronization: Restoring Translational Relationships with Editing Operations

Machine Translation (MT) is usually viewed as a one-shot process that ge...
research
05/25/2022

Machine Translation Robustness to Natural Asemantic Variation

We introduce and formalize an under-studied linguistic phenomenon we cal...
research
05/17/2023

Bring More Attention to Syntactic Symmetry for Automatic Postediting of High-Quality Machine Translations

Automatic postediting (APE) is an automated process to refine a given ma...
