FRUIT: Faithfully Reflecting Updated Information in Text

12/16/2021
by   Robert L. Logan IV, et al.
0

Textual knowledge bases such as Wikipedia require considerable effort to keep up to date and consistent. While automated writing assistants could potentially ease this burden, the problem of suggesting edits grounded in external knowledge has been under-explored. In this paper, we introduce the novel generation task of *faithfully reflecting updated information in text*(FRUIT) where the goal is to update an existing article given new evidence. We release the FRUIT-WIKI dataset, a collection of over 170K distantly supervised data produced from pairs of Wikipedia snapshots, along with our data generation pipeline and a gold evaluation set of 914 instances whose edits are guaranteed to be supported by the evidence. We provide benchmark results for popular generation systems as well as EDIT5 – a T5-based approach tailored to editing we introduce that establishes the state of the art. Our analysis shows that developing models that can update articles faithfully requires new capabilities for neural generation models, and opens doors to many new applications.

READ FULL TEXT

page 8

page 14

page 15

page 16

research
05/13/2019

Towards Content Transfer through Grounded Text Generation

Recent work in neural generation has attracted significant interest in c...
research
08/11/2023

Taboo and Collaborative Knowledge Production: Evidence from Wikipedia

By definition, people are reticent or even unwilling to talk about taboo...
research
09/27/2022

EditEval: An Instruction-Based Benchmark for Text Improvements

Evaluation of text generation to date has primarily focused on content c...
research
07/20/2021

WikiGraphs: A Wikipedia Text - Knowledge Graph Paired Dataset

We present a new dataset of Wikipedia articles each paired with a knowle...
research
12/29/2020

Generating Wikipedia Article Sections from Diverse Data Sources

Datasets for data-to-text generation typically focus either on multi-dom...
research
05/05/2022

Entity Cloze By Date: What LMs Know About Unseen Entities

Language models (LMs) are typically trained once on a large-scale corpus...
research
05/22/2020

A Generative Approach to Titling and Clustering Wikipedia Sections

We evaluate the performance of transformer encoders with various decoder...

Please sign up or login with your details

Forgot password? Click here to reset