Stochastic Divergence Minimization for Biterm Topic Model

05/01/2017
by Zhenghang Cui, et al.

With the emergence and rapid growth of social networks, a huge number of short texts have accumulated and need to be processed. Inferring the latent topics of collected short texts is useful for understanding their hidden structure and predicting new content. Unlike conventional topic models such as latent Dirichlet allocation (LDA), the biterm topic model (BTM) was recently proposed for short texts; it overcomes the sparseness of document-level word co-occurrences by directly modeling the generation process of word pairs. Stochastic inference algorithms based on collapsed Gibbs sampling (CGS) and collapsed variational inference have been proposed for BTM. However, they either incur high computational cost or rely on very crude estimation. In this work, we develop a stochastic divergence minimization inference algorithm for BTM that estimates latent topics more accurately in a scalable way. Experiments demonstrate the superiority of our proposed algorithm over existing inference algorithms.
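To make "word pairs" concrete, the following is a minimal sketch of how biterms could be extracted from tokenized short texts, assuming each whole short text is treated as one co-occurrence window. The function name `extract_biterms` and the toy documents are hypothetical, chosen only for illustration; this shows the data BTM models, not the inference algorithm proposed in the paper.

```python
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) in one short text.

    Assumption: the whole short text is a single co-occurrence window,
    so every pair of token positions yields one biterm.
    """
    # Sort each pair so that (a, b) and (b, a) map to the same biterm.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


# Hypothetical toy corpus of already-tokenized short texts.
docs = [
    ["apple", "phone", "launch"],
    ["phone", "battery"],
]

# BTM works on the corpus-level collection of biterms, not on
# per-document word counts.
biterms = [b for doc in docs for b in extract_biterms(doc)]
```

Pooling biterms across the whole corpus is what lets the model sidestep the sparse word co-occurrence statistics of any single short document.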


research
06/27/2012

Sparse Stochastic Inference for Latent Dirichlet allocation

We present a hybrid algorithm for Bayesian topic models that combines th...
research
07/19/2011

Using Variational Inference and MapReduce to Scale Topic Modeling

Latent Dirichlet Allocation (LDA) is a popular topic modeling technique ...
research
12/17/2014

Word Network Topic Model: A Simple but General Solution for Short and Imbalanced Texts

The short text has been the prevalent format for information of Internet...
research
12/10/2015

Scalable Modeling of Conversational-role based Self-presentation Characteristics in Large Online Forums

Online discussion forums are complex webs of overlapping subcommunities ...
research
10/27/2016

Geometric Dirichlet Means algorithm for topic inference

We propose a geometric algorithm for topic learning and inference that i...
research
04/13/2019

Short Text Topic Modeling Techniques, Applications, and Performance: A Survey

Analyzing short texts infers discriminative and coherent latent topics t...
research
07/04/2012

Mining Associated Text and Images with Dual-Wing Harmoniums

We propose a multi-wing harmonium model for mining multimedia data that ...
