Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization

05/12/2022
by Prasetya Ajie Utama, et al.

Neural abstractive summarization models are prone to generate summaries which are factually inconsistent with their source documents. Previous work has introduced the task of recognizing such factual inconsistency as a downstream application of natural language inference (NLI). However, state-of-the-art NLI models perform poorly in this context due to their inability to generalize to the target task. In this work, we show that NLI models can be effective for this task when the training data is augmented with high-quality task-oriented examples. We introduce Falsesum, a data generation pipeline leveraging a controllable text generation model to perturb human-annotated summaries, introducing varying types of factual inconsistencies. Unlike previously introduced document-level NLI datasets, our generated dataset contains examples that are diverse and inconsistent yet plausible. We show that models trained on a Falsesum-augmented NLI dataset improve the state-of-the-art performance across four benchmarks for detecting factual inconsistency in summarization. The code to obtain the dataset is available online at https://github.com/joshbambrick/Falsesum
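The abstract frames inconsistency detection as document-level NLI: the source document is the premise and a summary is the hypothesis, with perturbed summaries supplying the non-entailed class. A minimal sketch of that data format is below; the entity-level perturbation and the function name are illustrative assumptions, not Falsesum's actual pipeline, which uses a controllable text generation model to produce the inconsistent summaries.

```python
# Sketch of the document-level NLI framing described in the abstract.
# A (document, gold summary) pair yields an entailed example; a perturbed,
# plausible-but-inconsistent summary yields a non-entailed one.
# make_nli_examples is a hypothetical helper, not part of Falsesum's code.

def make_nli_examples(document, gold_summary, perturbed_summary):
    """Pair a source document (premise) with a consistent and an
    inconsistent summary (hypotheses) to form two NLI examples."""
    return [
        {"premise": document, "hypothesis": gold_summary,
         "label": "entailment"},
        {"premise": document, "hypothesis": perturbed_summary,
         "label": "not_entailment"},
    ]

doc = "Acme Corp reported a 10% rise in quarterly profit on Tuesday."
gold = "Acme Corp's quarterly profit rose 10%."
# A simple hand-written perturbation for illustration: same surface
# form, contradicted fact (Falsesum generates such variants automatically).
perturbed = "Acme Corp's quarterly profit fell 10%."

examples = make_nli_examples(doc, gold, perturbed)
```

These synthetic pairs would then be mixed into a standard NLI training set, which is the augmentation strategy the paper evaluates.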



Related research

- Masked Summarization to Generate Factually Inconsistent Summaries for Improved Factual Consistency Checking (05/04/2022)
- SummaC: Re-Visiting NLI-based Models for Inconsistency Detection in Summarization (11/18/2021)
- Focus Attention: Promoting Faithfulness and Diversity in Summarization (05/25/2021)
- Multi-LexSum: Real-World Summaries of Civil Rights Lawsuits at Multiple Granularities (06/22/2022)
- Neural models for Factual Inconsistency Classification with Explanations (06/15/2023)
- RefSum: Refactoring Neural Summarization (04/15/2021)
- Exploiting Asymmetry for Synthetic Training Data Generation: SynthIE and the Case of Information Extraction (03/07/2023)
