Nonparametric Relational Topic Models through Dependent Gamma Processes

by Junyu Xuan, et al.
University of Technology Sydney
Shanghai University

Traditional relational topic models provide a way to discover the hidden topics in a document network. Many theoretical and practical tasks, such as dimensionality reduction, document clustering, and link prediction, benefit from this revealed knowledge. However, existing relational topic models assume that the number of hidden topics is known in advance, which is impractical in many real-world applications. To relax this assumption, we propose a nonparametric relational topic model in this paper. Instead of using fixed-dimensional probability distributions in its generative model, we use stochastic processes. Specifically, a gamma process is assigned to each document to represent the topic interest of that document. Although this method provides an elegant solution, it brings additional challenges when mathematically modeling the inherent network structure of a typical document network, i.e., two documents that are closer in the network tend to have more similar topics. Furthermore, we require that the topics are shared by all the documents. To resolve these challenges, we use a subsampling strategy to assign each document a different gamma process derived from the global gamma process, and the subsampling probabilities of documents are subject to a Markov Random Field constraint that inherits the document network structure. Through the designed posterior inference algorithm, we can discover the hidden topics and their number simultaneously. Experimental results on both synthetic and real-world network datasets demonstrate the model's capability to learn the hidden topics and, more importantly, the number of topics.
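The subsampling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' model or inference algorithm. It assumes a finite truncation of the global gamma process to `num_atoms` atoms, and it replaces the Markov Random Field constraint with a simple neighbor-averaging step that pulls the subsampling probabilities of linked documents toward each other. The function name, all parameters, and the smoothing rule are hypothetical choices made for illustration.

```python
import numpy as np


def sample_thinned_gamma_measures(adjacency, num_atoms=50, total_mass=5.0,
                                  smoothing_steps=3, seed=0):
    """Illustrative sketch: per-document measures thinned from shared atoms.

    adjacency: (D, D) symmetric 0/1 array describing the document network.
    Returns the global atom weights, the per-document subsampling
    probabilities, and the per-document (thinned) atom weights.
    """
    rng = np.random.default_rng(seed)
    adjacency = np.asarray(adjacency, dtype=float)
    num_docs = adjacency.shape[0]

    # Finite approximation of a global gamma process (assumption):
    # K atoms with independent Gamma(total_mass / K, 1) weights.
    global_weights = rng.gamma(shape=total_mass / num_atoms, scale=1.0,
                               size=num_atoms)

    # Start from independent subsampling probabilities per document and atom.
    probs = rng.beta(1.0, 1.0, size=(num_docs, num_atoms))

    # Stand-in for the MRF constraint (assumption, not the paper's form):
    # repeatedly mix each document's probabilities with its neighbors' mean.
    degree = adjacency.sum(axis=1, keepdims=True)
    for _ in range(smoothing_steps):
        neighbor_mean = adjacency @ probs / np.maximum(degree, 1.0)
        probs = np.where(degree > 0, 0.5 * probs + 0.5 * neighbor_mean, probs)

    # Thinning: each document keeps each shared atom with its own probability,
    # so all documents draw on the same topics but with different supports.
    keep = rng.random((num_docs, num_atoms)) < probs
    doc_weights = keep * global_weights
    return global_weights, probs, doc_weights


# Usage: a chain of three documents, 0 - 1 - 2.
chain = np.array([[0, 1, 0],
                  [1, 0, 1],
                  [0, 1, 0]])
global_weights, probs, doc_weights = sample_thinned_gamma_measures(chain)
```

Because every document keeps or drops atoms from the same global set, the topics are shared across documents, while linked documents end up with similar subsampling probabilities and therefore tend to keep similar subsets of topics.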




