Nonparametric Bayes Pachinko Allocation

06/20/2012
by   Wei Li, et al.
0

Recent advances in topic models have explored complicated structured distributions to represent topic correlation. For example, the pachinko allocation model (PAM) captures arbitrary, nested, and possibly sparse correlations between topics using a directed acyclic graph (DAG). While PAM provides more flexibility and greater expressive power than previous models like latent Dirichlet allocation (LDA), it is also more difficult to determine the appropriate topic structure for a specific dataset. In this paper, we propose a nonparametric Bayesian prior for PAM based on a variant of the hierarchical Dirichlet process (HDP). Although the HDP can capture topic correlations defined by nested data structure, it does not automatically discover such correlations from unstructured data. By assuming an HDP-based prior for PAM, we are able to learn both the number of topics and how the topics are correlated. We evaluate our model on synthetic and real-world text datasets, and show that nonparametric PAM achieves performance matching the best of PAM without manually tuning the number of topics.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/25/2020

Gaussian Hierarchical Latent Dirichlet Allocation: Bringing Polysemy Back

Topic models are widely used to discover the latent representation of a ...
research
01/19/2021

Analysis and tuning of hierarchical topic models based on Renyi entropy approach

Hierarchical topic modeling is a potentially powerful instrument for det...
research
10/05/2016

Decentralized Topic Modelling with Latent Dirichlet Allocation

Privacy preserving networks can be modelled as decentralized networks (e...
research
01/16/2015

Bayesian Nonparametrics in Topic Modeling: A Brief Tutorial

Using nonparametric methods has been increasingly explored in Bayesian h...
research
07/18/2017

Cooperative Hierarchical Dirichlet Processes: Superposition vs. Maximization

The cooperative hierarchical structure is a common and significant data ...
research
10/10/2019

Learning beyond Predefined Label Space via Bayesian Nonparametric Topic Modelling

In real world machine learning applications, testing data may contain so...
research
09/02/2020

Local-HDP: Interactive Open-Ended 3D Object Categorization

We introduce a non-parametric hierarchical Bayesian approach for open-en...

Please sign up or login with your details

Forgot password? Click here to reset