Cited Text Spans for Citation Text Generation

09/12/2023
by Xiangci Li, et al.

Automatic related work generation must ground its outputs in the content of the cited papers to avoid non-factual hallucinations, but due to the length of scientific documents, existing abstractive approaches have conditioned only on the cited paper abstracts. We demonstrate that the abstract is not always the most appropriate input for citation generation and that models trained in this way learn to hallucinate. We propose to condition instead on the cited text span (CTS) as an alternative to the abstract. Because manual CTS annotation is extremely time- and labor-intensive, we experiment with automatic, ROUGE-based labeling of candidate CTS sentences, achieving sufficiently strong performance to substitute for expensive human annotations. We also propose a human-in-the-loop, keyword-based CTS retrieval approach that makes generating citation texts grounded in the full text of cited papers both promising and practical.
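
To make the idea of ROUGE-based CTS labeling concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not say which ROUGE variant, tokenization, or selection threshold is used, so the unigram-overlap F1 score (a simplified ROUGE-1), the regex tokenizer, and the names `rouge1_f1`, `label_cts`, and `top_k` are all assumptions made for illustration. The sketch scores each sentence of a cited paper against a citation text and keeps the highest-scoring sentences as candidate CTS.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphanumeric tokens; the paper's tokenizer is not specified.
    return re.findall(r"[a-z0-9]+", text.lower())


def rouge1_f1(candidate, reference):
    # Simplified ROUGE-1: F1 over clipped unigram overlap (no stemming, no stopword removal).
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def label_cts(citation_text, cited_sentences, top_k=2):
    # Hypothetical labeling rule: rank cited-paper sentences by overlap with the
    # citation text and return the indices of the top_k sentences with non-zero score.
    scored = [(rouge1_f1(s, citation_text), i) for i, s in enumerate(cited_sentences)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in scored[:top_k] if score > 0]
```

In use, `label_cts` would be called once per citation with the sentence-split full text of the cited paper; the returned indices stand in for human CTS annotations. A human-in-the-loop, keyword-based variant could replace the citation text with user-supplied keywords as the query, though the abstract does not describe that procedure in detail.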

research
02/02/2020

Citation Text Generation

We introduce the task of citation text generation: given a pair of scien...
research
12/19/2022

CiteBench: A benchmark for Scientific Citation Text Generation

The publication rates are skyrocketing across many fields of science, an...
research
11/14/2022

Controllable Citation Text Generation

The aim of citation generation is usually to automatically generate a ci...
research
04/26/2021

Semantic Analysis for Automated Evaluation of the Potential Impact of Research Articles

Can the analysis of the semantics of words used in the text of a scienti...
research
05/07/2022

CORWA: A Citation-Oriented Related Work Annotation Dataset

Academic research is an exploratory activity to discover new solutions t...
research
09/04/2023

Automatic Scam-Baiting Using ChatGPT

Automatic scam-baiting is an online fraud countermeasure that involves a...
research
01/15/2023

Using citation networks to evaluate the impact of text length on the identification of relevant concepts

The identification of the most significant concepts in unstructured data...
