Question Answering as an Automatic Evaluation Metric for News Article Summarization

06/02/2019
by Matan Eyal, et al.

Recent work in automatic summarization and headline generation focuses on maximizing ROUGE scores for various news datasets. We present an alternative, extrinsic evaluation metric for this task: Answering Performance for Evaluation of Summaries (APES). APES leverages recent progress in reading comprehension to quantify a summary's ability to answer a set of manually created questions about central entities in the source article. We first analyze the strength of this metric by comparing it to known manual evaluation metrics. We then present an end-to-end neural abstractive model that maximizes APES while achieving competitive ROUGE scores.
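To make the metric concrete, below is a minimal sketch of an APES-style scorer, not the authors' implementation. It assumes the (question, gold answer) pairs targeting central entities already exist, uses an off-the-shelf extractive QA model via the Hugging Face pipeline API, and the function name, model choice, and exact-match comparison are all illustrative assumptions.

```python
# Sketch of an APES-style scorer: answer each question using only the
# candidate summary as context, then report the fraction answered correctly.
from transformers import pipeline

def apes_score(summary, qa_pairs,
               qa_model="distilbert-base-cased-distilled-squad"):
    """Fraction of (question, gold_answer) pairs a QA model answers
    correctly when reading the summary alone."""
    qa = pipeline("question-answering", model=qa_model)
    correct = 0
    for question, gold_answer in qa_pairs:
        prediction = qa(question=question, context=summary)["answer"]
        # Simple normalized exact match; the paper's matching rule may differ.
        if prediction.strip().lower() == gold_answer.strip().lower():
            correct += 1
    return correct / len(qa_pairs) if qa_pairs else 0.0

# Hypothetical usage: the question probes a central entity of the article.
summary = "Matan Eyal and colleagues propose APES, a QA-based summary metric."
print(apes_score(summary, [("Who proposed APES?", "Matan Eyal")]))
```

A higher score indicates the summary preserved enough entity-level information for the reader (here, a QA model) to recover the answers, which is the extrinsic property APES is designed to measure.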

Related research

05/07/2020
FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization
Neural abstractive summarization models are prone to generate content in...

05/10/2021
Improving Factual Consistency of Abstractive Summarization via Question Answering
A commonly observed problem with the state-of-the-art abstractive summar...

10/07/2020
MOCHA: A Dataset for Training and Evaluating Generative Reading Comprehension Metrics
Posing reading comprehension as a generation problem provides a great de...

01/28/2023
MQAG: Multiple-choice Question Answering and Generation for Assessing Information Consistency in Summarization
State-of-the-art summarization systems can generate highly fluent summar...

04/04/2019
Guiding Extractive Summarization with Question-Answering Rewards
Highlighting while reading is a natural behavior for people to track sal...

09/26/2019
Read, Attend and Comment: A Deep Architecture for Automatic News Comment Generation
Automatic news comment generation is beneficial for real applications bu...

05/03/2022
Quiz Design Task: Helping Teachers Create Quizzes with Automated Question Generation
Question generation (QGen) models are often evaluated with standardized ...
