
Play the Shannon Game With Language Models: A Human-Free Approach to Summary Evaluation

by Nicholas Egan, et al.

The goal of a summary is to concisely state the most important information in a document. With this principle in mind, we introduce new reference-free summary evaluation metrics that use a pretrained language model to estimate the information shared between a document and its summary. These metrics are a modern take on the Shannon Game, a method for summary quality scoring proposed decades ago, where we replace human annotators with language models. We also view these metrics as an extension of BLANC, a recently proposed approach to summary quality measurement based on the performance of a language model with and without the help of a summary. Using GPT-2, we empirically verify that the introduced metrics correlate with human judgement based on coverage, overall quality, and five summary dimensions.
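The core idea, scoring a summary by how much it improves a language model's ability to predict the document, can be sketched as follows. This is a minimal, self-contained illustration only: a toy add-one-smoothed unigram model (a hypothetical stand-in, not the paper's method) replaces the pretrained GPT-2 used in the paper, and the score is the gain in document log-likelihood when the summary is available as conditioning context.

```python
import math
from collections import Counter

def unigram_log_likelihood(tokens, context_tokens=()):
    # Toy stand-in for a pretrained LM: an add-one-smoothed unigram model
    # whose counts come from the conditioning context. (Hypothetical; the
    # paper estimates likelihoods with GPT-2.)
    vocab = set(tokens) | set(context_tokens)
    counts = Counter(context_tokens)
    total = sum(counts.values()) + len(vocab)  # add-one smoothing
    return sum(math.log((counts[t] + 1) / total) for t in tokens)

def shannon_score(document, summary):
    """Information the summary contributes about the document:
    log p(document | summary) - log p(document)."""
    doc_tokens = document.split()
    summary_tokens = tuple(summary.split())
    return (unigram_log_likelihood(doc_tokens, summary_tokens)
            - unigram_log_likelihood(doc_tokens))

doc = "the cat sat on the mat and the cat slept"
good_summary = "the cat sat on the mat"
bad_summary = "stocks rallied on earnings news"

# A summary sharing content with the document should score higher
# than an unrelated one.
print(shannon_score(doc, good_summary) > shannon_score(doc, bad_summary))
```

With a real language model, the two likelihood terms would instead be computed by conditioning generation on the summary (or not) and comparing the resulting document log-probabilities, which is the human-free analogue of the original Shannon Game.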




Fill in the BLANC: Human-free quality estimation of document summaries

We present BLANC, a new approach to the automatic estimation of document...

SummScore: A Comprehensive Evaluation Metric for Summary Quality Based on Cross-Encoder

Text summarization models are often trained to produce summaries that me...

Estimation of Summary-to-Text Inconsistency by Mismatched Embeddings

We propose a new reference-free summary quality evaluation measure, with...

DocAsRef: A Pilot Empirical Study on Repurposing Reference-Based Summary Quality Metrics Reference-Freely

Summary quality assessment metrics have two categories: reference-based ...

SAFEval: Summarization Asks for Fact-based Evaluation

Summarization evaluation remains an open research problem: current metri...

Classification of descriptions and summary using multiple passes of statistical and natural language toolkits

This document describes a possible approach that can be used to check th...

Less is More: Summary of Long Instructions is Better for Program Synthesis

Despite the success of large pre-trained language models (LMs) such as C...