Using citation networks to evaluate the impact of text length on the identification of relevant concepts

01/15/2023
by Jorge A. V. Tohalino, et al.
Identifying the most significant concepts in unstructured data is critical in many practical applications. Although many methods have been proposed to extract the main topics of texts, few studies have probed how text length affects the performance of keyword extraction (KE) methods. In this study, we adopted a network-based approach to evaluate whether keywords extracted from paper abstracts are compatible with keywords extracted from full papers. We employed community detection methods to identify groups of related papers in citation networks, and then used these paper clusters to extract keywords from abstracts. Our results indicate that, although the community detection methods employed in our KE approach yielded similar levels of accuracy, a correlation analysis revealed that they produced distinct keyword lists for each abstract. All considered approaches, however, reached low accuracy. Surprisingly, text clustering approaches outperformed all citation-based methods. The findings suggest that using different sources of information to extract keywords can lead to significant differences in performance, an effect that can play an important role in applications relying on the identification of relevant concepts.
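
To make the pipeline in the abstract concrete, here is a minimal sketch of how paper clusters might be used to rank the words of an abstract, and how two keyword lists could be compared. It is not the authors' method. The function names (`tokenize`, `cluster_keywords`, `accuracy`, `jaccard`), the TF-IDF-like scoring rule, the precision-style definition of accuracy, and the toy data are all assumptions made for illustration. The community assignment is taken as given and could come from any community detection method applied to a citation network.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens of three or more letters (crude, illustrative preprocessing)."""
    return re.findall(r"[a-z]{3,}", text.lower())


def cluster_keywords(abstracts, communities, paper_id, top_k=3):
    """Rank the words of one abstract by their frequency in the paper's own
    cluster, discounted by how many clusters use the word (hypothetical
    TF-IDF-like score, not taken from the paper)."""
    # Pool the abstracts of each community into one bag of words.
    cluster_counts = {}
    for pid, text in abstracts.items():
        cluster_counts.setdefault(communities[pid], Counter()).update(tokenize(text))

    n_clusters = len(cluster_counts)
    own = cluster_counts[communities[paper_id]]
    scores = {}
    for word in set(tokenize(abstracts[paper_id])):
        # Number of clusters whose pooled text contains the word (at least 1).
        spread = sum(1 for counts in cluster_counts.values() if word in counts)
        scores[word] = own[word] * math.log(1 + n_clusters / spread)

    # Highest score first; alphabetical order breaks ties deterministically.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


def accuracy(predicted, reference):
    """Fraction of predicted keywords found in the reference list
    (an assumed definition; the paper's exact metric is not given here)."""
    if not predicted:
        return 0.0
    reference = set(reference)
    return sum(1 for w in predicted if w in reference) / len(predicted)


def jaccard(a, b):
    """Overlap between two keyword lists, treated as sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0


# Toy data (invented): four abstracts and an assumed community assignment.
abstracts = {
    "p1": "Community detection in citation networks",
    "p2": "Citation networks and community structure",
    "p3": "Keyword extraction from short texts",
    "p4": "Keyword extraction with graph methods",
}
communities = {"p1": 0, "p2": 0, "p3": 1, "p4": 1}

keywords_p1 = cluster_keywords(abstracts, communities, "p1")
```

Running `cluster_keywords` once per community assignment and comparing the resulting lists with `jaccard` gives a rough picture of the kind of disagreement between methods that the abstract reports, while `accuracy` against keywords taken from the full text mirrors the abstract-versus-full-paper comparison.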

