Spectral Methods for Correlated Topic Models

05/30/2016
by   Forough Arabshahi, et al.

In this paper, we propose guaranteed spectral methods for learning a broad range of topic models that generalize the popular Latent Dirichlet Allocation (LDA). LDA cannot incorporate arbitrary topic correlations; we overcome this limitation by assuming that the hidden topic proportions are drawn from a flexible class of Normalized Infinitely Divisible (NID) distributions. NID distributions are generated by normalizing a family of independent Infinitely Divisible (ID) random variables. The Dirichlet distribution is a special case obtained by normalizing a set of Gamma random variables. We prove that this flexible topic model class can be learned via spectral methods using only moments up to the third order, with (low-order) polynomial sample and computational complexity. The proof is based on a key new technique derived here that allows us to diagonalize the moments of the NID distribution through an efficient procedure that requires evaluating only univariate integrals, even though we are handling high-dimensional multivariate moments. To assess the performance of our proposed Latent NID topic model, we use two real datasets of articles collected from the New York Times and PubMed. Our experiments yield improved perplexity on both datasets compared with the baseline.
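The special case mentioned in the abstract, that a Dirichlet vector arises from normalizing independent Gamma random variables, can be checked numerically. The following is a minimal sketch, not the paper's algorithm: the function name `sample_normalized_gamma`, the parameter values, and the sample size are all assumptions chosen for illustration. It draws independent Gamma(alpha_i, 1) variables, normalizes each draw to sum to one, and compares the empirical mean with the known Dirichlet mean alpha_i / sum(alpha).

```python
import numpy as np


def sample_normalized_gamma(alpha, n_samples, seed=0):
    """Draw proportion vectors by normalizing independent Gamma(alpha_i, 1) variables.

    Hypothetical helper for illustration; normalizing independent Gamma
    variables with a common scale yields a Dirichlet(alpha) vector.
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    # One row per sample, one column per topic; shape parameters broadcast over rows.
    g = rng.gamma(shape=alpha, scale=1.0, size=(n_samples, alpha.size))
    # Normalize each row so the proportions sum to one.
    return g / g.sum(axis=1, keepdims=True)


# Illustrative parameters (assumed, not taken from the paper).
alpha = np.array([0.5, 1.0, 2.5])
h = sample_normalized_gamma(alpha, n_samples=200_000)

empirical_mean = h.mean(axis=0)
dirichlet_mean = alpha / alpha.sum()

print("empirical mean:", empirical_mean)
print("Dirichlet mean:", dirichlet_mean)
```

Replacing the Gamma draws with another independent, positive, infinitely divisible family would give a different member of the NID class; the sketch covers only the Dirichlet case named in the abstract.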

