High-Precision Extraction of Emerging Concepts from Scientific Literature

06/11/2020
by Daniel King, et al.

Identifying new concepts in scientific literature can help power faceted search, scientific trend analysis, knowledge-base construction, and more, but current methods fall short. Manual identification cannot keep up with the torrent of new publications, while the precision of existing automatic techniques is too low for many applications. We present an unsupervised concept extraction method for scientific literature that achieves much higher precision than previous work. Our approach relies on a simple but novel intuition: each scientific concept is likely to be introduced or popularized by a single paper that is disproportionately cited by subsequent papers mentioning the concept. On a corpus of computer science papers from arXiv, we find that our method achieves a Precision@1000 of 99% and a substantially better precision-yield trade-off across the top 15,000 extractions. To stimulate research in this area, we release our code and data (https://github.com/allenai/ForeCite).
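The intuition in the abstract can be illustrated with a small sketch. The following is not the released ForeCite implementation: the `Paper` record, the substring-based phrase matching, and the plain citation-share score are all assumptions made for illustration, and the actual scoring function in the authors' code may differ. The sketch asks, for a candidate phrase, which paper containing it is cited by the largest share of later papers that also mention the phrase, and it includes a helper for the Precision@K metric named in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    """Hypothetical minimal paper record (not the ForeCite data schema)."""
    paper_id: str
    year: int
    text: str                      # e.g. title plus abstract
    cites: frozenset = frozenset() # ids of the papers this paper cites


def citation_share(phrase, candidate, corpus):
    """Share of later papers mentioning `phrase` that cite `candidate`.

    Assumption: a plain ratio stands in for whatever score the authors use.
    """
    needle = phrase.lower()
    later_mentions = [
        p for p in corpus
        if p.paper_id != candidate.paper_id
        and p.year >= candidate.year
        and needle in p.text.lower()
    ]
    if not later_mentions:
        return 0.0
    citing = sum(1 for p in later_mentions if candidate.paper_id in p.cites)
    return citing / len(later_mentions)


def best_introducing_paper(phrase, corpus):
    """Return (score, paper_id) for the best-scoring candidate paper."""
    needle = phrase.lower()
    candidates = [p for p in corpus if needle in p.text.lower()]
    scored = [(citation_share(phrase, c, corpus), c.paper_id) for c in candidates]
    return max(scored, default=(0.0, None))


def precision_at_k(ranked_phrases, correct_phrases, k):
    """Fraction of the top-k ranked phrases that are judged correct."""
    top = ranked_phrases[:k]
    if not top:
        return 0.0
    return sum(1 for phrase in top if phrase in correct_phrases) / len(top)
```

A phrase whose best candidate has a high citation share is a plausible concept under this sketch; ranking phrases by that score and checking the top of the list against human judgments gives a Precision@K figure of the kind reported above.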


research
01/25/2021

TDMSci: A Specialized Corpus for Scientific Literature Entity Tagging of Tasks Datasets and Metrics

Tasks, Datasets and Evaluation Metrics are important concepts for unders...
research
10/08/2020

Extracting a Knowledge Base of Mechanisms from COVID-19 Papers

The urgency of mitigating COVID-19 has spawned a large and diverse body ...
research
10/18/2022

Title detection: a novel approach to automatically finding retractions and other editorial notices in the scholarly literature

Despite being a key element in the process of disseminating scientific k...
research
05/14/2022

ACCoRD: A Multi-Document Approach to Generating Diverse Descriptions of Scientific Concepts

Systems that can automatically define unfamiliar terms hold the promise ...
research
10/05/2021

Using Elasticsearch for entity recognition in affiliation disambiguation

Automatic recognition of affiliations in the metadata of scholarly publi...
research
10/06/2017

Unsupervised Extraction of Representative Concepts from Scientific Literature

This paper studies the automated categorization and extraction of scient...
research
04/01/2023

From Zero to Hero: Convincing with Extremely Complicated Math

Becoming a (super) hero is almost every kid's dream. During their shelte...
