Enhancing Keyphrase Extraction from Academic Articles with their Reference Information

11/28/2021
by Chengzhi Zhang, et al.

With the development of Internet technology, information overload is becoming increasingly pronounced, and users spend considerable time finding the information they need. Keyphrases that concisely summarize a document help users quickly locate and understand it. For academic resources, most existing studies extract keyphrases from the title and abstract of a paper. We find that the titles of a paper's references also contain author-assigned keyphrases. This article therefore adds reference information to the source text and applies two typical unsupervised extraction methods (TF*IDF and TextRank), two representative traditional supervised learning algorithms (Naïve Bayes and Conditional Random Field), and a supervised deep learning model (BiLSTM-CRF) to analyze how reference information affects keyphrase extraction. The aim is to improve the quality of keyphrase recognition by expanding the source text. The experimental results show that reference information can increase the precision, recall, and F1 of automatic keyphrase extraction to a certain extent. This indicates that reference information is useful for keyphrase extraction from academic papers and offers a new direction for subsequent research on automatic keyphrase extraction.
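
To make the idea of expanding the source text concrete, the following is a minimal sketch of TF*IDF term scoring with and without reference titles appended. It is not the authors' implementation: the toy texts, the helper names (`tokenize`, `tfidf_scores`, `top_terms`), the smoothed IDF formula, and the restriction to single-word terms without stopword filtering are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (hypothetical helper)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_scores(target_tokens, corpus_token_lists):
    """Score each term of the target document by relative TF times a smoothed IDF."""
    n_docs = len(corpus_token_lists)
    doc_freq = Counter()
    for tokens in corpus_token_lists:
        doc_freq.update(set(tokens))  # count each term once per document
    term_freq = Counter(target_tokens)
    total = len(target_tokens)
    return {
        term: (count / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
        for term, count in term_freq.items()
    }


def top_terms(text, background, k):
    """Return the k highest-scoring terms of `text`, ties broken alphabetically."""
    tokens = tokenize(text)
    corpus = [tokenize(doc) for doc in background] + [tokens]
    scores = tfidf_scores(tokens, corpus)
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return ranked[:k]


# Toy data, invented for this sketch only.
title_and_abstract = "We study automatic extraction methods for academic papers"
reference_titles = [
    "Keyphrase extraction with TextRank",
    "Unsupervised keyphrase extraction from documents",
]
background = [
    "We study image classification methods for medical papers",
    "Academic search engines rank documents with link analysis",
]

# Baseline: title and abstract only.
base_top = top_terms(title_and_abstract, background, k=2)

# Expanded source text: title and abstract plus the reference titles.
expanded_text = title_and_abstract + " " + " ".join(reference_titles)
expanded_top = top_terms(expanded_text, background, k=2)

print("without references:", base_top)
print("with references:", expanded_top)
```

In this toy case the term "keyphrase" appears only in the reference titles, so it can be ranked only after the source text is expanded. A real system would score candidate phrases rather than single words and would filter stopwords before ranking.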

research
03/17/2023

A Benchmark of PDF Information Extraction Tools using a Multi-Task and Multi-Domain Evaluation Framework for Academic Documents

Extracting information from academic PDF documents is crucial for numero...
research
12/23/2021

LAME: Layout Aware Metadata Extraction Approach for Research Articles

The volume of academic literature, such as academic conference papers an...
research
11/28/2021

Enhancing Identification of Structure Function of Academic Articles Using Contextual Information

With the enrichment of literature resources, researchers are facing the ...
research
01/09/2022

Phocus: Picking Valuable Research from a Sea of Citations

The deluge of new papers has significantly blocked the development of ac...
research
10/21/2020

Using the Full-text Content of Academic Articles to Identify and Evaluate Algorithm Entities in the Domain of Natural Language Processing

In the era of big data, the advancement, improvement, and application of...
research
12/04/2018

Information Extraction Framework to Build Legislation Network

This paper concerns an Information Extraction process for building a dyn...
research
09/24/2018

WiRe57 : A Fine-Grained Benchmark for Open Information Extraction

We build a reference for the task of Open Information Extraction, on fiv...
