PreQuEL: Quality Estimation of Machine Translation Outputs in Advance

05/18/2022
by Shachar Don-Yehiya, et al.

We present the task of PreQuEL, Pre-(Quality-Estimation) Learning. A PreQuEL system predicts how well a given sentence will be translated, without recourse to the actual translation, thus eschewing unnecessary resource allocation when translation quality is bound to be low. PreQuEL can be defined relative to a given MT system (e.g., some industry service) or generally relative to the state of the art. From a theoretical perspective, PreQuEL places the focus on the source text, tracing properties, possibly linguistic features, that make a sentence harder to machine translate. We develop a baseline model for the task and analyze its performance. We also develop a data augmentation method (from parallel corpora) that improves results substantially. We show that this augmentation method can improve performance on the Quality-Estimation task as well. We investigate the properties of the input text that our model is sensitive to by testing it on challenge sets and different languages. We conclude that it is aware of syntactic and semantic distinctions, and that its predictions correlate with standard NLP features, even over-emphasizing their importance.


