Spurious Correlations in Reference-Free Evaluation of Text Generation

04/21/2022
by Esin Durmus, et al.

Model-based, reference-free evaluation metrics have been proposed as a fast and cost-effective approach to evaluating Natural Language Generation (NLG) systems. Despite promising recent results, we find evidence that reference-free evaluation metrics for summarization and dialog generation may be relying on spurious correlations with measures such as word overlap, perplexity, and length. We further observe that, for text summarization, these metrics have high error rates when ranking current state-of-the-art abstractive summarization systems. We demonstrate that these errors can be mitigated by explicitly designing evaluation metrics to avoid spurious features in reference-free evaluation.
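
As an illustration of the kind of check the abstract describes, the following is a minimal sketch that measures how strongly a metric's scores track two of the named spurious features, summary length and word overlap with the source. It is not the authors' method: the function names (`word_overlap`, `spurious_report`), the whitespace tokenization, and the choice of Spearman rank correlation are all assumptions made for this example, and perplexity is left out because it would require a language model.

```python
from statistics import mean


def word_overlap(source: str, summary: str) -> float:
    """Fraction of summary tokens that also appear in the source (whitespace tokens)."""
    src = set(source.lower().split())
    toks = summary.lower().split()
    return sum(t in src for t in toks) / len(toks) if toks else 0.0


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def spurious_report(sources, summaries, metric_scores):
    """Correlate hypothetical metric scores with length and word overlap."""
    lengths = [len(s.split()) for s in summaries]
    overlaps = [word_overlap(a, b) for a, b in zip(sources, summaries)]
    return {
        "length": spearman(metric_scores, lengths),
        "word_overlap": spearman(metric_scores, overlaps),
    }
```

A correlation near 1 or -1 in such a report would suggest that the metric's ranking could largely be reproduced from the surface feature alone, which is the sort of dependence the abstract warns about; it does not by itself show that the metric is wrong.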
