Summarization is (Almost) Dead

09/18/2023
by   Xiao Pu, et al.
0

How well can large language models (LLMs) generate summaries? We develop new datasets and conduct human evaluation experiments to evaluate the zero-shot generation capability of LLMs across five distinct summarization tasks. Our findings indicate a clear preference among human evaluators for LLM-generated summaries over human-written summaries and summaries generated by fine-tuned models. Specifically, LLM-generated summaries exhibit better factual consistency and fewer instances of extrinsic hallucinations. Due to the satisfactory performance of LLMs in summarization tasks (even surpassing the benchmark of reference summaries), we believe that most conventional works in the field of text summarization are no longer necessary in the era of LLMs. However, we recognize that there are still some directions worth exploring, such as the creation of novel datasets with higher quality and more reliable evaluation methods.

READ FULL TEXT
research
09/26/2022

News Summarization and Evaluation in the Era of GPT-3

The recent success of zero- and few-shot prompting with models like GPT-...
research
01/31/2023

Benchmarking Large Language Models for News Summarization

Large language models (LLMs) have shown promise for automatic summarizat...
research
02/16/2023

Exploring the Limits of ChatGPT for Query or Aspect-based Text Summarization

Text summarization has been a crucial problem in natural language proces...
research
05/24/2023

Is Summary Useful or Not? An Extrinsic Human Evaluation of Text Summaries on Downstream Tasks

Research on automated text summarization relies heavily on human and aut...
research
03/30/2023

Comparing Abstractive Summaries Generated by ChatGPT to Real Summaries Through Blinded Reviewers and Text Classification Algorithms

Large Language Models (LLMs) have gathered significant attention due to ...
research
06/01/2023

Multi-Dimensional Evaluation of Text Summarization with In-Context Learning

Evaluation of natural language generation (NLG) is complex and multi-dim...
research
12/17/2022

RISE: Leveraging Retrieval Techniques for Summarization Evaluation

Evaluating automatically-generated text summaries is a challenging task....

Please sign up or login with your details

Forgot password? Click here to reset