Improving the Faithfulness of Abstractive Summarization via Entity Coverage Control

07/05/2022 · by Haopeng Zhang, et al.

Abstractive summarization systems that leverage pre-trained language models have achieved superior results on benchmark datasets. However, such models have been shown to be prone to hallucinating facts that are unfaithful to the input context. In this paper, we propose a method to remedy entity-level extrinsic hallucinations with Entity Coverage Control (ECC). We first compute entity coverage precision and prepend the corresponding control code to each training example, which implicitly guides the model to recognize faithful content during training. We further extend our method via intermediate fine-tuning on large but noisy data extracted from Wikipedia to unlock zero-shot summarization. Experimental results on three benchmark datasets (XSum, PubMed, and SAMSum), which span very different domains and styles, show that the proposed method yields more faithful and salient abstractive summaries in both supervised fine-tuning and zero-shot settings.
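The ECC preprocessing step can be illustrated with a minimal sketch. The code below assumes spaCy's en_core_web_sm model for entity recognition; the control-code tokens and the bucketing thresholds are illustrative assumptions, not the paper's exact values.

# Minimal sketch of the ECC preprocessing step described above.
# Assumptions (not from the paper): spaCy's en_core_web_sm model for NER,
# and illustrative control-code tokens / bucketing thresholds.
import spacy

nlp = spacy.load("en_core_web_sm")

def entity_coverage_precision(source: str, summary: str) -> float:
    """Fraction of summary entities that also appear in the source text."""
    summary_entities = {ent.text.lower() for ent in nlp(summary).ents}
    if not summary_entities:
        return 1.0  # no entities in the summary, so nothing can be hallucinated
    source_lower = source.lower()
    covered = sum(1 for ent in summary_entities if ent in source_lower)
    return covered / len(summary_entities)

def prepend_control_code(source: str, summary: str) -> str:
    """Prefix a training source with a control code reflecting its
    entity coverage precision, so the model learns to associate the
    code with more or less faithful target summaries."""
    precision = entity_coverage_precision(source, summary)
    if precision >= 0.9:        # thresholds are illustrative assumptions
        code = "<ecc-high>"
    elif precision >= 0.5:
        code = "<ecc-mid>"
    else:
        code = "<ecc-low>"
    return f"{code} {source}"

# At inference time, prepending the high-coverage code (<ecc-high>)
# steers generation toward entity-faithful summaries.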


