Answers Unite! Unsupervised Metrics for Reinforced Summarization Models

09/04/2019
by Thomas Scialom, et al.

Abstractive summarization approaches based on Reinforcement Learning (RL) have recently been proposed to overcome the limitations of classical likelihood maximization. RL makes it possible to consider complex, possibly non-differentiable, metrics that globally assess the quality and relevance of the generated outputs. ROUGE, the most widely used summarization metric, is known to suffer from a bias towards lexical similarity as well as from suboptimal accounting for the fluency and readability of the generated abstracts. We thus explore and propose alternative evaluation measures: the reported human-evaluation analysis shows that the proposed metrics, based on Question Answering, compare favorably to ROUGE, with the additional property of not requiring reference summaries. Training an RL-based model on these metrics leads to improvements (in terms of both human and automated metrics) over current approaches that use ROUGE as a reward.


research
04/01/2016

Revisiting Summarization Evaluation for Scientific Articles

Evaluation of text summarization approaches have been mostly based on me...
research
12/11/2019

Quality of syntactic implication of RL-based sentence summarization

Work on summarization has explored both reinforcement learning (RL) opti...
research
08/31/2019

Deep Reinforcement Learning with Distributional Semantic Rewards for Abstractive Summarization

Deep reinforcement learning (RL) has been a commonly-used strategy for t...
research
10/27/2022

Improving abstractive summarization with energy-based re-ranking

Current abstractive summarization systems present important weaknesses w...
research
03/23/2021

SAFEval: Summarization Asks for Fact-based Evaluation

Summarization evaluation remains an open research problem: current metri...
research
10/14/2021

MoFE: Mixture of Factual Experts for Controlling Hallucinations in Abstractive Summarization

Neural abstractive summarization models are susceptible to generating fa...
research
10/09/2020

Evaluating and Characterizing Human Rationales

Two main approaches for evaluating the quality of machine-generated rati...
