Hierarchical Latent Semantic Mapping for Automated Topic Generation

11/11/2015
by   Guorui Zhou, et al.
0

Much of information sits in an unprecedented amount of text data. Managing allocation of these large scale text data is an important problem for many areas. Topic modeling performs well in this problem. The traditional generative models (PLSA,LDA) are the state-of-the-art approaches in topic modeling and most recent research on topic generation has been focusing on improving or extending these models. However, results of traditional generative models are sensitive to the number of topics K, which must be specified manually. The problem of generating topics from corpus resembles community detection in networks. Many effective algorithms can automatically detect communities from networks without a manually specified number of the communities. Inspired by these algorithms, in this paper, we propose a novel method named Hierarchical Latent Semantic Mapping (HLSM), which automatically generates topics from corpus. HLSM calculates the association between each pair of words in the latent topic space, then constructs a unipartite network of words with this association and hierarchically generates topics from this network. We apply HLSM to several document collections and the experimental comparisons against several state-of-the-art approaches demonstrate the promising performance.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
01/25/2023

Improving the Inference of Topic Models via Infinite Latent State Replications

In text mining, topic models are a type of probabilistic generative mode...
research
10/18/2016

Modeling community structure and topics in dynamic text networks

The last decade has seen great progress in both dynamic network modeling...
research
04/13/2023

G2T: A Simple but Effective Framework for Topic Modeling based on Pretrained Language Model and Community Detection

It has been reported that clustering-based topic models, which cluster h...
research
07/29/2017

Topology Analysis of International Networks Based on Debates in the United Nations

In complex, high dimensional and unstructured data it is often difficult...
research
03/22/2022

On the Modeling and Simulation of Portfolio Allocation Schemes: an Approach based on Network Community Detection

We present a study on portfolio investments in financial applications. W...
research
08/05/2015

Progressive EM for Latent Tree Models and Hierarchical Topic Detection

Hierarchical latent tree analysis (HLTA) is recently proposed as a new m...
research
03/25/2021

Term-community-based topic detection with variable resolution

Network-based procedures for topic detection in huge text collections of...

Please sign up or login with your details

Forgot password? Click here to reset