Recurrent Coupled Topic Modeling over Sequential Documents

06/23/2021
by Jinjin Guo, et al.

Abundant sequential documents such as online archives, social media, and news feeds are updated in streams, where each chunk of documents carries smoothly evolving yet dependent topics. Such digital texts have attracted extensive research on dynamic topic modeling to infer hidden evolving topics and their temporal dependencies. However, most existing approaches focus on single-topic-thread evolution and ignore the fact that a current topic may be coupled with multiple relevant prior topics. In addition, these approaches suffer from intractable inference of the latent parameters, resulting in high computational cost and degraded performance. In this work, we assume that a current topic evolves from all prior topics with corresponding coupling weights, forming a multi-topic-thread evolution. Our method models the dependencies between evolving topics and thoroughly encodes their complex multi-couplings across time steps. To overcome the intractable inference challenge, we propose a new solution with a set of novel data augmentation techniques that successfully decomposes the multi-couplings between evolving topics. A fully conjugate model is thus obtained, guaranteeing the effectiveness and efficiency of the inference. A novel Gibbs sampler with a backward-forward filter algorithm efficiently learns the latent time-evolving parameters in closed form. In addition, the latent Indian Buffet Process (IBP) compound distribution is exploited to automatically infer the overall number of topics and to customize sparse topic proportions for each sequential document without bias. The proposed method is evaluated on both synthetic and real-world datasets against competitive baselines, demonstrating its superiority in terms of lower per-word perplexity, more coherent topics, and better document time prediction.
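The abstract describes the model only at a high level, so the following is a minimal NumPy sketch, for intuition only, of the two ideas it highlights: multi-topic-thread evolution, where each topic at time t is coupled to all topics at time t-1 through coupling weights, and IBP-style sparse per-document topic proportions. The variable names, distributions, and constants below (phi, W, theta, activation_prob, the Dirichlet concentrations) are illustrative assumptions and do not reproduce the authors' actual generative process or inference algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

V = 1000   # vocabulary size
K = 20     # number of topics (the paper infers this automatically via an IBP prior)
T = 5      # number of time steps

# Topic-word distributions at the first time step: phi[t][k] is topic k at time t.
phi = [rng.dirichlet(np.ones(V) * 0.1, size=K)]

for t in range(1, T):
    # Coupling weights: row k says how strongly topic k at time t depends on
    # each of the K topics at time t-1 (multi-topic-thread evolution).
    W = rng.dirichlet(np.ones(K) * 0.5, size=K)   # K x K, rows sum to 1
    prior_mix = W @ phi[t - 1]                    # each new topic mixes all prior topics
    # Draw the new topic-word distributions around their coupled prior mixtures.
    phi_t = np.array([rng.dirichlet(50.0 * prior_mix[k] + 1e-3) for k in range(K)])
    phi.append(phi_t)

# Sparse per-document topic proportions: an IBP-style binary on/off vector
# restricts each document to a small subset of the K topics.
activation_prob = 0.15                            # illustrative value, not from the paper
active = rng.random(K) < activation_prob
theta = np.zeros(K)
if active.any():
    theta[active] = rng.dirichlet(np.ones(active.sum()))
```

The actual method further relies on data augmentation to obtain a fully conjugate model and on a backward-forward filter inside the Gibbs sampler to learn the time-evolving parameters in closed form; none of that machinery is reflected in this sketch.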

Related research

03/15/2012 | Timeline: A Dynamic Hierarchical Dirichlet Process Model for Recovering Birth/Death and Evolution of Topics in Text Stream
Topic models have proven to be a useful tool for discovering latent stru...

09/29/2020 | Neural Topic Modeling with Cycle-Consistent Adversarial Training
Advances on deep generative models have attracted significant research i...

09/24/2018 | Streaming dynamic and distributed inference of latent geometric structures
We develop new models and algorithms for learning the temporal dynamics ...

11/15/2017 | Deep Temporal-Recurrent-Replicated-Softmax for Topical Trends over Time
Dynamic topic modeling facilitates the identification of topical trends ...

06/30/2018 | A Constrained Coupled Matrix-Tensor Factorization for Learning Time-evolving and Emerging Topics
Topic discovery has witnessed a significant growth as a field of data mi...

09/19/2018 | Modeling Online Discourse with Coupled Distributed Topics
In this paper, we propose a deep, globally normalized topic model that i...

08/20/2017 | Efficient Online Inference for Infinite Evolutionary Cluster models with Applications to Latent Social Event Discovery
The Recurrent Chinese Restaurant Process (RCRP) is a powerful statistica...
