Play the Shannon Game With Language Models: A Human-Free Approach to Summary Evaluation

03/19/2021
by Nicholas Egan, et al.

The goal of a summary is to concisely state the most important information in a document. With this principle in mind, we introduce new reference-free summary evaluation metrics that use a pretrained language model to estimate the information shared between a document and its summary. These metrics are a modern take on the Shannon Game, a method for summary quality scoring proposed decades ago, where we replace human annotators with language models. We also view these metrics as an extension of BLANC, a recently proposed approach to summary quality measurement based on the performance of a language model with and without the help of a summary. Using GPT-2, we empirically verify that the introduced metrics correlate with human judgement based on coverage, overall quality, and five summary dimensions.
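
The abstract does not give the metric formulas, so the following is only a minimal sketch of the underlying idea under stated assumptions: a summary is scored by how much it raises the log-likelihood a language model assigns to the document. To stay self-contained, the sketch swaps GPT-2 for a toy adaptive unigram model with add-alpha smoothing. The function names, the `vocab_size` and `alpha` parameters, and the exact score definition are hypothetical illustrations, not the authors' implementation.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenizer; a real setup would use the language model's own tokenizer.
    return text.lower().split()


def log_prob(doc_tokens, context_tokens, vocab_size=1000, alpha=1.0):
    """Log-probability of doc_tokens under a toy adaptive unigram model.

    The model is primed with context_tokens and updates its counts as it
    reads the document, so earlier tokens help predict later ones.
    ASSUMPTION: this stands in for a pretrained model such as GPT-2.
    """
    counts = Counter(context_tokens)
    total = len(context_tokens)
    lp = 0.0
    for tok in doc_tokens:
        # Add-alpha smoothed probability over an assumed fixed vocabulary size.
        p = (counts[tok] + alpha) / (total + alpha * vocab_size)
        lp += math.log(p)
        counts[tok] += 1
        total += 1
    return lp


def information_difference(document, summary):
    """Gain in document log-likelihood from conditioning on the summary.

    ASSUMPTION: score = log p(document | summary) - log p(document).
    Higher means the summary made the document easier to predict.
    """
    doc = tokenize(document)
    summ = tokenize(summary)
    return log_prob(doc, summ) - log_prob(doc, [])


if __name__ == "__main__":
    document = "the cat sat on the mat the cat slept"
    print(information_difference(document, "cat sat mat"))
    print(information_difference(document, "stocks fell sharply"))
```

In this toy setting, a summary that shares content words with the document gets a positive score, while an unrelated summary gets a slightly negative one, because it only dilutes the model's estimates. With a pretrained model, the same difference would be computed from token-level log-probabilities of the document with and without the summary as a prefix.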

02/23/2020

Fill in the BLANC: Human-free quality estimation of document summaries

We present BLANC, a new approach to the automatic estimation of document...
07/11/2022

SummScore: A Comprehensive Evaluation Metric for Summary Quality Based on Cross-Encoder

Text summarization models are often trained to produce summaries that me...
04/12/2021

Estimation of Summary-to-Text Inconsistency by Mismatched Embeddings

We propose a new reference-free summary quality evaluation measure, with...
12/20/2022

DocAsRef: A Pilot Empirical Study on Repurposing Reference-Based Summary Quality Metrics Reference-Freely

Summary quality assessment metrics have two categories: reference-based ...
03/23/2021

SAFEval: Summarization Asks for Fact-based Evaluation

Summarization evaluation remains an open research problem: current metri...
09/10/2020

Classification of descriptions and summary using multiple passes of statistical and natural language toolkits

This document describes a possible approach that can be used to check th...
03/16/2022

Less is More: Summary of Long Instructions is Better for Program Synthesis

Despite the success of large pre-trained language models (LMs) such as C...