Graph-Community Detection for Cross-Document Topic Segment Relationship Identification

06/13/2016
by Pedro Mota, et al.

In this paper, we propose a graph-community detection approach to identify cross-document relationships at the topic segment level. Given a set of related documents, we automatically find these relationships by clustering segments with similar content (topics). In this context, we study how different weighting mechanisms influence the discovery of word communities that relate to the different topics found in the documents. Finally, we test different mapping functions that assign topic segments to word communities, thereby determining which topic segments are considered equivalent. This task enables efficient multi-document browsing: when a user finds relevant content in one document, we can provide access to similar topics in other documents. We deploy our approach in two different scenarios. The first is an educational scenario in which equivalence relationships between learning materials need to be found. The second consists of a series of dialogs in a social context where students discuss commonplace topics. Results show that our proposed approach better discovered equivalence relationships in learning material documents and obtained close results in the social speech domain, where the best-performing approach was a clustering technique.
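The pipeline outlined in the abstract (word graph, word communities, segment-to-community mapping) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the edge weighting (number of segments in which two words co-occur), the community detection method (a deterministic label propagation variant), the mapping function (majority vote over a segment's words), and the toy segments and all function names.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_word_graph(segments):
    """Weighted word graph. Assumed weighting: the number of segments
    in which two words co-occur (one of many possible schemes)."""
    graph = defaultdict(Counter)
    for words in segments.values():
        for u, v in combinations(sorted(set(words)), 2):
            graph[u][v] += 1
            graph[v][u] += 1
    return graph


def label_propagation(graph, max_iters=20):
    """Assumed community detection method: each word repeatedly adopts the
    label with the largest total edge weight among its neighbours.
    Nodes are visited in sorted order and ties go to the alphabetically
    smallest label, so the result is deterministic."""
    labels = {word: word for word in graph}
    for _ in range(max_iters):
        changed = False
        for word in sorted(graph):
            scores = Counter()
            for neighbour, weight in graph[word].items():
                scores[labels[neighbour]] += weight
            best = max(sorted(scores), key=lambda label: scores[label])
            if best != labels[word]:
                labels[word] = best
                changed = True
        if not changed:
            break
    return labels


def assign_segments(segments, labels):
    """Assumed mapping function: a segment is assigned to the community
    that contains the most of its distinct words."""
    assignment = {}
    for seg_id, words in segments.items():
        votes = Counter(labels[w] for w in set(words) if w in labels)
        if votes:
            assignment[seg_id] = max(sorted(votes), key=lambda label: votes[label])
        else:
            assignment[seg_id] = None
    return assignment


# Hypothetical toy data: (document, segment index) -> content words.
segments = {
    ("doc1", 0): ["force", "mass", "acceleration"],
    ("doc1", 1): ["energy", "work", "heat"],
    ("doc2", 0): ["force", "mass", "newton"],
    ("doc2", 1): ["energy", "heat", "temperature"],
}

graph = build_word_graph(segments)
labels = label_propagation(graph)
assignment = assign_segments(segments, labels)

# Segments mapped to the same community are treated as equivalent.
equivalent = defaultdict(list)
for seg_id, community in assignment.items():
    equivalent[community].append(seg_id)

for community, seg_ids in sorted(equivalent.items()):
    print(community, sorted(seg_ids))
```

In this sketch, segments from different documents that land in the same word community are the ones a browsing interface could link together. Swapping the weighting inside `build_word_graph` or the voting rule inside `assign_segments` is where alternative weighting mechanisms and mapping functions would plug in.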


research
04/13/2023

G2T: A Simple but Effective Framework for Topic Modeling based on Pretrained Language Model and Community Detection

It has been reported that clustering-based topic models, which cluster h...
research
07/02/2020

A Novel Graph Based Clustering Approach to Document Topic Modeling

Clustering is the task of assigning a set of objects into groups so that...
research
09/29/2020

Neural Topic Modeling by Incorporating Document Relationship Graph

Graph Neural Networks (GNNs) that capture the relationships between grap...
research
12/03/2018

From the User to the Medium: Neural Profiling Across Web Communities

Online communities provide a unique way for individuals to access inform...
research
10/14/2021

Is Stance Detection Topic-Independent and Cross-topic Generalizable? – A Reproduction Study

Cross-topic stance detection is the task to automatically detect stances...
research
04/24/2021

Automatic Description Construction for Math Expression via Topic Relation Graph

Math expressions are important parts of scientific and educational docum...
research
08/03/2018

Content-driven, unsupervised clustering of news articles through multiscale graph partitioning

The explosion in the amount of news and journalistic content being gener...
