Evaluating Factuality in Generation with Dependency-level Entailment

10/12/2020
by   Tanya Goyal, et al.

Despite significant progress in text generation models, a serious limitation is their tendency to produce text that is factually inconsistent with information in the input. Recent work has studied whether textual entailment systems can be used to identify factual errors; however, these sentence-level entailment models are trained to solve a different problem than generation filtering and they do not localize which part of a generation is non-factual. In this paper, we propose a new formulation of entailment that decomposes it at the level of dependency arcs. Rather than focusing on aggregate decisions, we instead ask whether the semantic relationship manifested by individual dependency arcs in the generated output is supported by the input. Human judgments on this task are difficult to obtain; we therefore propose a method to automatically create data based on existing entailment or paraphrase corpora. Experiments show that our dependency arc entailment model trained on this data can identify factual inconsistencies in paraphrasing and summarization better than sentence-level methods or those based on question generation, while additionally localizing the erroneous parts of the generation.
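The arc-level formulation described above can be illustrated with a small sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the `Arc` type, the `toy_arc_scorer` function (a crude word-overlap check standing in for a trained arc entailment model), the 0.5 threshold, and the rule that a generation counts as factual only when every arc is supported. The dependency arcs are written by hand instead of being produced by a parser.

```python
from typing import Callable, List, NamedTuple, Tuple


class Arc(NamedTuple):
    """One dependency arc of the generated sentence (hand-written here, not parsed)."""
    head: str
    relation: str
    child: str


def toy_arc_scorer(source: str, arc: Arc) -> float:
    """Hypothetical placeholder: 1.0 if both words of the arc occur in the source, else 0.0.

    A trained arc entailment model would go here; this lexical check is only a stand-in.
    """
    source_words = set(source.lower().split())
    both_present = arc.head.lower() in source_words and arc.child.lower() in source_words
    return 1.0 if both_present else 0.0


def localize_errors(
    source: str,
    arcs: List[Arc],
    scorer: Callable[[str, Arc], float] = toy_arc_scorer,
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> Tuple[bool, List[Arc]]:
    """Score each arc separately and return (all arcs supported?, unsupported arcs)."""
    unsupported = [arc for arc in arcs if scorer(source, arc) < threshold]
    return len(unsupported) == 0, unsupported


source = "the company reported a loss in the third quarter"
# Arcs for the generated sentence "the company reported a profit".
arcs = [
    Arc(head="reported", relation="nsubj", child="company"),
    Arc(head="reported", relation="dobj", child="profit"),
]

is_factual, flagged = localize_errors(source, arcs)
print(is_factual)  # False
print(flagged)     # [Arc(head='reported', relation='dobj', child='profit')]
```

The point of the sketch is the shape of the output: instead of a single sentence-level label, the caller gets back the specific arcs that the input does not support, which is what allows the error to be localized.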


research
05/31/2023

Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback

Despite the seeming success of contemporary grounded text generation sys...
research
11/21/2020

Evaluating Semantic Accuracy of Data-to-Text Generation with Natural Language Inference

A major challenge in evaluating data-to-text (D2T) generation is measuri...
research
03/20/2022

Entailment Relation Aware Paraphrase Generation

We introduce a new task of entailment relation aware paraphrase generati...
research
11/30/2022

Revisiting text decomposition methods for NLI-based factuality scoring of summaries

Scoring the factuality of a generated summary involves measuring the deg...
research
05/02/2020

Improving Truthfulness of Headline Generation

Most studies on abstractive summarization report ROUGE scores between s...
research
03/11/2021

ENTRUST: Argument Reframing with Language Models and Entailment

"Framing" involves the positive or negative presentation of an argument ...
research
04/20/2018

Acquisition of Phrase Correspondences using Natural Deduction Proofs

How to identify, extract, and use phrasal knowledge is a crucial problem...
