Using virtual edges to extract keywords from texts modeled as complex networks

05/04/2022
by Jorge A. V. Tohalino, et al.

Detecting keywords in texts is important for many text mining applications. Graph-based methods have been commonly used to automatically find the key concepts in texts; however, relevant information provided by embeddings has not been widely used to enrich the graph structure. Here we modeled texts as co-occurrence networks, where nodes are words and edges are established by either contextual or semantic similarity. We compared two embedding approaches (Word2vec and BERT) to check whether edges created via word embeddings can improve the quality of the keyword extraction method. We found that the use of virtual edges can indeed improve the discriminability of co-occurrence networks. The best performance was obtained when only a low percentage of virtual (embedding) edges was added. A comparative analysis of structural and dynamical network metrics revealed that degree, PageRank, and accessibility are the metrics with the best performance in the model enriched with virtual edges.
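To make the idea of virtual edges concrete, the following is a minimal sketch and not the authors' implementation. It builds a co-occurrence network from adjacent words, adds a small number of virtual edges between the most similar unlinked words, and ranks words by degree. The toy two-dimensional vectors stand in for Word2vec or BERT embeddings, and the function names, the `window` parameter, and the choice to express `fraction` relative to the number of co-occurrence edges are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def cooccurrence_edges(tokens, window=1):
    """Link each word to the words that follow it within `window` positions."""
    edges = set()
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                edges.add(frozenset((word, other)))
    return edges


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def add_virtual_edges(edges, embeddings, fraction):
    """Add the most similar unlinked word pairs as virtual edges.

    Assumption of this sketch: `fraction` is measured relative to the
    number of existing co-occurrence edges.
    """
    candidates = [
        (cosine(embeddings[a], embeddings[b]), a, b)
        for a, b in combinations(sorted(embeddings), 2)
        if frozenset((a, b)) not in edges
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    candidates.sort(key=lambda item: (-item[0], item[1], item[2]))
    n_new = round(fraction * len(edges))
    virtual = {frozenset((a, b)) for _, a, b in candidates[:n_new]}
    return edges | virtual, virtual


def degree_ranking(edges):
    """Rank words by degree (number of incident edges), highest first."""
    degree = {}
    for edge in edges:
        for node in edge:
            degree[node] = degree.get(node, 0) + 1
    return sorted(degree, key=lambda word: (-degree[word], word))


# Hypothetical toy input: a token sequence and made-up 2-D "embeddings".
tokens = ["graph", "methods", "extract", "keywords",
          "graph", "networks", "model", "texts"]
embeddings = {
    "graph": [1.0, 0.1], "networks": [0.9, 0.2], "methods": [0.2, 1.0],
    "extract": [0.1, 0.9], "keywords": [0.5, 0.5], "model": [0.8, 0.3],
    "texts": [0.3, 0.8],
}

cooc = cooccurrence_edges(tokens)
enriched, virtual = add_virtual_edges(cooc, embeddings, fraction=0.3)
ranking = degree_ranking(enriched)
print(ranking[:3])
```

Degree is used here only because it is the simplest of the metrics named in the abstract; PageRank or accessibility could be computed on the same enriched edge set.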

