Conformalizing Machine Translation Evaluation

06/09/2023
by Chrysoula Zerva, et al.

Several uncertainty estimation methods have been recently proposed for machine translation evaluation. While these methods can provide a useful indication of when not to trust model predictions, we show in this paper that the majority of them tend to underestimate model uncertainty, and as a result they often produce misleading confidence intervals that do not cover the ground truth. We propose as an alternative the use of conformal prediction, a distribution-free method to obtain confidence intervals with a theoretically established guarantee on coverage. First, we demonstrate that split conformal prediction can “correct” the confidence intervals of previous methods to yield a desired coverage level. Then, we highlight biases in estimated confidence intervals, both in terms of the translation language pairs and the quality of translations. We apply conditional conformal prediction techniques to obtain calibration subsets for each data subgroup, leading to equalized coverage.
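
To make the core idea concrete, here is a minimal sketch (not the authors' implementation) of split conformal prediction with absolute-residual nonconformity scores, plus a group-conditional variant that calibrates separately per subgroup (e.g. per language pair) to target equalized coverage. All function names, the choice of score, and the per-group strategy are illustrative assumptions.

```python
import numpy as np

def split_conformal_interval(cal_pred, cal_true, test_pred, alpha=0.1):
    """Split conformal prediction for regression-style quality scores.

    cal_pred, cal_true: model predictions and ground-truth scores on a
    held-out calibration set; test_pred: predictions to wrap in intervals;
    alpha: target miscoverage rate (0.1 gives nominal 90% coverage).
    """
    cal_pred = np.asarray(cal_pred, dtype=float)
    cal_true = np.asarray(cal_true, dtype=float)
    test_pred = np.asarray(test_pred, dtype=float)
    n = len(cal_pred)
    # Absolute-residual nonconformity scores on the calibration set.
    scores = np.abs(cal_true - cal_pred)
    # Finite-sample-corrected empirical quantile of the scores.
    q_level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    q_hat = np.quantile(scores, q_level, method="higher")
    return test_pred - q_hat, test_pred + q_hat


def group_conditional_intervals(cal_pred, cal_true, cal_group,
                                test_pred, test_group, alpha=0.1):
    """Group-conditional calibration: compute a separate quantile within
    each subgroup (assumes every test group also appears in calibration)."""
    cal_pred, cal_true = np.asarray(cal_pred), np.asarray(cal_true)
    cal_group, test_group = np.asarray(cal_group), np.asarray(test_group)
    test_pred = np.asarray(test_pred, dtype=float)
    lower = np.empty_like(test_pred)
    upper = np.empty_like(test_pred)
    for g in np.unique(test_group):
        cal_mask, test_mask = cal_group == g, test_group == g
        lo, hi = split_conformal_interval(
            cal_pred[cal_mask], cal_true[cal_mask],
            test_pred[test_mask], alpha=alpha,
        )
        lower[test_mask], upper[test_mask] = lo, hi
    return lower, upper
```

Given a calibration set of (prediction, human score) pairs from an MT evaluation model, split_conformal_interval returns intervals that cover the true score at roughly the nominal rate under exchangeability; the group-conditional variant trades smaller per-group calibration sets for coverage that holds within each group rather than only on average.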

Related research

- AutoNCP: Automated pipelines for accurate confidence intervals (06/24/2020)
  Successful application of machine learning models to real-world predicti...
- Evaluating Machine Translation Quality with Conformal Predictive Distributions (06/02/2023)
  This paper presents a new approach for assessing uncertainty in machine ...
- Simulation-Based Inference with WALDO: Perfectly Calibrated Confidence Regions Using Any Prediction or Posterior Estimation Algorithm (05/31/2022)
  The vast majority of modern machine learning targets prediction problems...
- Singhing with Confidence: Visualising the Performance of Confidence Structures (06/08/2021)
  Confidence intervals are an established means of portraying uncertainty ...
- Uncertainty quantification for wide-bin unfolding: one-at-a-time strict bounds and prior-optimized confidence intervals (11/01/2021)
  Unfolding is an ill-posed inverse problem in particle physics aiming to ...
- Statistical Bootstrapping for Uncertainty Estimation in Off-Policy Evaluation (07/27/2020)
  In reinforcement learning, it is typical to use the empirically observed...
- Rapid Approximate Aggregation with Distribution-Sensitive Interval Guarantees (08/10/2020)
  Aggregating data is fundamental to data analytics, data exploration, and...
