A novel sentence embedding based topic detection method for micro-blog

06/10/2020
by Cong Wan, et al.

Topic detection is a challenging task, especially when the exact number of topics is unknown. In this paper, we present a novel neural-network-based approach to detecting topics in micro-blog datasets. We use an unsupervised neural sentence embedding model to map blogs into an embedding space. Our model is a weighted power mean word embedding model in which the weights are computed by an attention mechanism. Experimental results show that our embedding method outperforms the baselines in sentence clustering. In addition, we propose an improved clustering algorithm, referred to as relationship-aware DBSCAN (RADBSCAN). It discovers topics in a micro-blog dataset, and the number of topics is determined by the characteristics of the dataset itself. Moreover, to reduce the sensitivity of the clustering to its parameters, we use the blog forwarding relationship as a bridge between two otherwise independent clusters. Finally, we validate our approach on a dataset from Sina micro-blog. The results show that our approach detects all the topics and extracts keywords for each topic.
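The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it names, not the authors' method. It assumes that a sentence embedding is formed by concatenating weighted power means of word vectors for several exponents `p`, that the attention weights are a softmax over per-word scores supplied by the caller, and that forwarding relationships are pairs of blog indices used to merge cluster labels that a density-based clustering step has already produced. All function names, the choice of exponents, and the merge rule are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw attention scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_power_mean(vectors: Sequence[Sequence[float]],
                        weights: Sequence[float], p: int) -> List[float]:
    """Per-dimension weighted power mean: (sum_i w_i * x_i^p)^(1/p).

    Assumption: p is restricted to odd positive integers so that the
    root is well defined for negative vector components.
    """
    if p < 1 or p % 2 == 0:
        raise ValueError("this sketch supports odd positive p only")
    out = []
    for d in range(len(vectors[0])):
        s = sum(w * (v[d] ** p) for v, w in zip(vectors, weights))
        # Signed root: keeps the sign of s, takes the p-th root of |s|.
        out.append(math.copysign(abs(s) ** (1.0 / p), s))
    return out


def embed_sentence(word_vectors: Sequence[Sequence[float]],
                   attention_scores: Sequence[float],
                   powers: Sequence[int] = (1, 3)) -> List[float]:
    """Concatenate weighted power means for each exponent in `powers`."""
    weights = softmax(attention_scores)
    embedding: List[float] = []
    for p in powers:
        embedding.extend(weighted_power_mean(word_vectors, weights, p))
    return embedding


def merge_by_forwarding(labels: Sequence[int],
                        forwards: Sequence[Tuple[int, int]]) -> List[int]:
    """Merge clusters linked by a forwarding pair (hypothetical rule).

    `labels` holds one cluster id per blog, with -1 meaning noise.
    `forwards` holds (blog_index, forwarded_blog_index) pairs.
    """
    parent = {c: c for c in set(labels) if c != -1}

    def find(c: int) -> int:
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for a, b in forwards:
        la, lb = labels[a], labels[b]
        if la == -1 or lb == -1:
            continue  # noise points do not bridge clusters in this sketch
        ra, rb = find(la), find(lb)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)
    return [find(c) if c != -1 else -1 for c in labels]
```

With equal attention scores and `p = 1` the embedding reduces to the plain average of the word vectors, which is a convenient sanity check. In the merge step, two clusters are joined as soon as one forwarding pair connects them; a stricter rule, such as requiring several pairs, would be an equally plausible reading of the abstract.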


