WIDAR – Weighted Input Document Augmented ROUGE

01/23/2022
by Raghav Jain, et al.

The task of automatic text summarization has gained a lot of traction due to recent advancements in machine learning techniques. However, evaluating the quality of a generated summary remains an open problem. The literature has widely adopted Recall-Oriented Understudy for Gisting Evaluation (ROUGE) as the standard evaluation metric for summarization. However, ROUGE has some long-established limitations, a major one being its dependence on the availability of a good-quality reference summary. In this work, we propose the metric WIDAR, which, in addition to the reference summary, also uses the input document to evaluate the quality of the generated summary. The proposed metric is versatile, since it is designed to adapt the evaluation score according to the quality of the reference summary. The proposed metric correlates better than ROUGE by 26%, 76%, 82%, and 15%, respectively, in coherence, consistency, fluency, and relevance on human judgement scores provided in the SummEval dataset. The proposed metric obtains results comparable to other state-of-the-art metrics while requiring a relatively short computational time.
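To make the idea of augmenting a reference-based score with the input document concrete, here is a minimal sketch. It is not the WIDAR formula, which the abstract does not give. It assumes a plain whitespace tokenizer, a unigram ROUGE-style recall, and a hypothetical fixed `ref_weight` that stands in for the paper's adaptation to reference quality. The names `rouge_n_recall` and `blended_score` are illustrative only.

```python
from collections import Counter


def rouge_n_recall(candidate: str, target: str, n: int = 1) -> float:
    """Share of the target's n-grams that also occur in the candidate (clipped counts)."""
    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()  # assumption: naive whitespace tokenization
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, tgt = ngrams(candidate), ngrams(target)
    total = sum(tgt.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in tgt.items())
    return overlap / total


def blended_score(summary: str, reference: str, document: str,
                  ref_weight: float = 0.5) -> float:
    """Hypothetical blend of a reference-based and a document-based overlap score.

    ref_weight is an illustrative constant; the abstract says WIDAR adapts to
    the quality of the reference summary but does not say how.
    """
    ref_score = rouge_n_recall(summary, reference)
    doc_score = rouge_n_recall(summary, document)
    return ref_weight * ref_score + (1.0 - ref_weight) * doc_score


if __name__ == "__main__":
    print(blended_score("the cat sat", "the cat sat down", "the cat sat on the mat"))
```

Lowering `ref_weight` shifts trust from the reference summary toward the input document, which is the direction one would want when the reference is of poor quality.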

research
06/26/2021

A Training-free and Reference-free Summarization Evaluation Metric via Centrality-weighted Relevance and Self-referenced Redundancy

In recent years, reference-based and supervised summarization evaluation...
research
05/13/2020

End-to-end Semantics-based Summary Quality Assessment for Single-document Summarization

ROUGE is the de facto criterion for summarization research. However, its...
research
11/22/2022

HaRiM^+: Evaluating Summary Quality with Hallucination Risk

One of the challenges of developing a summarization model arises from th...
research
12/22/2021

Consistency and Coherence from Points of Contextual Similarity

Factual consistency is one of important summary evaluation dimensions, e...
research
05/26/2023

UMSE: Unified Multi-scenario Summarization Evaluation

Summarization quality evaluation is a non-trivial task in text summariza...
research
06/16/2023

I Want This, Not That: Personalized Summarization of Scientific Scholarly Texts

In this paper, we present a proposal for an unsupervised algorithm, P-Su...
research
03/23/2021

SAFEval: Summarization Asks for Fact-based Evaluation

Summarization evaluation remains an open research problem: current metri...
