Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization

by Prasetya Ajie Utama, et al.

Neural abstractive summarization models are prone to generating summaries that are factually inconsistent with their source documents. Previous work has introduced the task of recognizing such factual inconsistency as a downstream application of natural language inference (NLI). However, state-of-the-art NLI models perform poorly in this context because they fail to generalize to the target task. In this work, we show that NLI models can be effective for this task when the training data is augmented with high-quality, task-oriented examples. We introduce Falsesum, a data generation pipeline that leverages a controllable text generation model to perturb human-annotated summaries, introducing varying types of factual inconsistencies. Unlike previously introduced document-level NLI datasets, our generated dataset contains examples that are diverse and inconsistent yet plausible. We show that models trained on a Falsesum-augmented NLI dataset improve the state-of-the-art performance across four benchmarks for detecting factual inconsistency in summarization. The code to obtain the dataset is available online at https://github.com/joshbambrick/Falsesum
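
To make the framing concrete, the following is a minimal sketch of how document-level NLI examples might be assembled from a document and its reference summary. It is not the authors' implementation: the names `NLIExample`, `toy_perturb`, `build_examples`, and `augment` are hypothetical, the label strings are an assumption, and the simple word swap merely stands in for the controllable text generation model described in the abstract.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class NLIExample:
    premise: str      # the source document
    hypothesis: str   # a candidate summary
    label: str        # assumed label set: "entailment" / "non-entailment"


def toy_perturb(summary: str) -> str:
    """Hypothetical stand-in for a controllable generation model.

    It only swaps one word to create a plausible but inconsistent summary.
    """
    return summary.replace("rose", "fell")


def build_examples(
    document: str,
    reference_summary: str,
    perturb: Callable[[str], str],
) -> List[NLIExample]:
    """Pair a document with its reference summary and a perturbed summary."""
    examples = [NLIExample(document, reference_summary, "entailment")]
    perturbed = perturb(reference_summary)
    # Keep the negative example only if the perturbation changed the text.
    if perturbed != reference_summary:
        examples.append(NLIExample(document, perturbed, "non-entailment"))
    return examples


def augment(nli_data: List[NLIExample], generated: List[NLIExample]) -> List[NLIExample]:
    """Return the original NLI data followed by the generated examples."""
    return list(nli_data) + list(generated)


document = "Shares in the company rose 5% on Monday after strong earnings."
summary = "The company's shares rose after strong earnings."
pairs = build_examples(document, summary, toy_perturb)
for example in pairs:
    print(example.label, "|", example.hypothesis)
```

In this sketch the document serves as the premise and each summary as the hypothesis, so a standard NLI classifier could be trained on the output of `augment` without changing its input format.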




