SummScore: A Comprehensive Evaluation Metric for Summary Quality Based on Cross-Encoder

07/11/2022
by Wuhang Lin, et al.

Text summarization models are often trained to produce summaries that meet human quality requirements. However, existing evaluation metrics for summaries are only rough proxies for summary quality: they correlate poorly with human scoring and inhibit summary diversity. To solve these problems, we propose SummScore, a comprehensive metric for summary quality evaluation based on a Cross-Encoder. First, by adopting the original-summary measurement mode, which compares the semantics of the summary with those of the original text, SummScore avoids inhibiting summary diversity. With the help of a Cross-Encoder pre-trained for text matching, SummScore can effectively capture subtle semantic differences between summaries. Second, to improve comprehensiveness and interpretability, SummScore consists of four fine-grained submodels, which measure Coherence, Consistency, Fluency, and Relevance separately. We use multi-round semi-supervised training to improve the performance of our model on extremely limited annotated data. Extensive experiments show that SummScore significantly outperforms existing evaluation metrics in correlation with human scoring on all four dimensions. We also provide SummScore's quality evaluation results for 16 mainstream summarization models for later research.
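The evaluation setup described above can be illustrated with a minimal sketch: one scorer per dimension takes an (original text, summary) pair, and a metric is judged by its rank correlation with human scores. Everything named here is an assumption for illustration, not the authors' code: `PairScorer`, `score_summary`, and `spearman` are hypothetical helpers, and `toy_overlap_scorer` is a token-overlap placeholder standing in for a trained Cross-Encoder submodel. Spearman correlation is used as one common choice; the abstract does not say which correlation coefficient the paper reports.

```python
from typing import Callable, Dict, List, Sequence

# The four dimensions named in the abstract.
DIMENSIONS = ("coherence", "consistency", "fluency", "relevance")

# Hypothetical interface: a scorer maps (original text, summary) to a float.
PairScorer = Callable[[str, str], float]


def toy_overlap_scorer(document: str, summary: str) -> float:
    """Placeholder scorer (NOT a Cross-Encoder): the fraction of summary
    tokens that also occur in the original text."""
    doc_tokens = set(document.lower().split())
    summary_tokens = summary.lower().split()
    if not summary_tokens:
        return 0.0
    hits = sum(1 for tok in summary_tokens if tok in doc_tokens)
    return hits / len(summary_tokens)


def score_summary(document: str, summary: str,
                  scorers: Dict[str, PairScorer]) -> Dict[str, float]:
    """Score one summary against its original text, one value per dimension."""
    return {dim: scorers[dim](document, summary) for dim in DIMENSIONS}


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for constant input
    return cov / (var_x * var_y) ** 0.5
```

In use, one would build `scorers = {dim: some_model for dim in DIMENSIONS}`, call `score_summary` on each (original text, summary) pair, and then compute `spearman` per dimension between the metric's scores and the human annotations. Because each scorer sees the original text rather than a reference summary, a summary is not penalized merely for wording that differs from a reference.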

research
10/23/2020

Understanding the Extent to which Summarization Evaluation Metrics Measure the Information Quality of Summaries

Reference-based metrics such as ROUGE or BERTScore evaluate the content ...
research
04/08/2020

Asking and Answering Questions to Evaluate the Factual Consistency of Summaries

Practical applications of abstractive summarization models are limited b...
research
09/14/2022

How to Find Strong Summary Coherence Measures? A Toolbox and a Comparative Study for Summary Coherence Measure Evaluation

Automatically evaluating the coherence of summaries is of great signific...
research
03/19/2021

Play the Shannon Game With Language Models: A Human-Free Approach to Summary Evaluation

The goal of a summary is to concisely state the most important informati...
research
10/25/2022

Towards Interpretable Summary Evaluation via Allocation of Contextual Embeddings to Reference Text Topics

Despite extensive recent advances in summary generation models, evaluati...
research
09/02/2019

SumQE: a BERT-based Summary Quality Estimation Model

We propose SumQE, a novel Quality Estimation model for summarization bas...
research
12/22/2021

Consistency and Coherence from Points of Contextual Similarity

Factual consistency is one of important summary evaluation dimensions, e...