Provable Algorithms for Inference in Topic Models

05/27/2016
by Sanjeev Arora, et al.

Recently, there has been considerable progress on designing algorithms with provable guarantees, typically using linear algebraic methods, for parameter learning in latent variable models. Designing provable algorithms for inference, however, has proven more challenging. Here we take a first step towards provable inference in topic models. We leverage a property of topic models that enables us to construct simple linear estimators for the unknown topic proportions. These estimators have small variance and consequently can work with short documents. They also correspond to finding an estimate around which the posterior is well-concentrated. We show lower bounds establishing that, for shorter documents, it can be information-theoretically impossible to find the hidden topics. Finally, we give empirical results demonstrating that our algorithm works on realistic topic models: it yields good solutions on synthetic data and runs in time comparable to a single iteration of Gibbs sampling.
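To make the idea of a linear estimator for topic proportions concrete, here is a minimal sketch. The abstract does not say how the estimator is constructed, so this sketch assumes the ordinary least-squares left inverse B = (A^T A)^{-1} A^T of a known topic-word matrix A; the paper's own construction, chosen for small variance, may well differ. The names `left_inverse`, `estimate_proportions`, and `A_TOY` are hypothetical, and the toy matrix is invented for illustration.

```python
import random

# Hypothetical toy topic-word matrix: 4 vocabulary words (rows), 2 topics (columns).
# Each column is a distribution over words and sums to 1.
A_TOY = [
    [0.5, 0.0],
    [0.3, 0.1],
    [0.2, 0.3],
    [0.0, 0.6],
]


def left_inverse(A):
    """Return B = (A^T A)^{-1} A^T for a V x K matrix A given as a list of rows.

    This least-squares left inverse is an assumption made for illustration;
    any matrix B with B A = I gives an unbiased linear estimator.
    """
    V, K = len(A), len(A[0])
    # Build the augmented matrix [A^T A | A^T], of size K x (K + V).
    aug = [
        [sum(A[w][i] * A[w][j] for w in range(V)) for j in range(K)]
        + [A[w][i] for w in range(V)]
        for i in range(K)
    ]
    # Gauss-Jordan elimination with partial pivoting turns it into [I | B].
    for col in range(K):
        pivot = max(range(col, K), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("topic-word matrix does not have full column rank")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(K):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[K:] for row in aug]


def estimate_proportions(B, word_ids, vocab_size):
    """Apply the linear estimator B to a document's empirical word frequencies."""
    n = len(word_ids)
    freq = [0.0] * vocab_size
    for w in word_ids:
        freq[w] += 1.0 / n
    return [sum(b * f for b, f in zip(row, freq)) for row in B]


if __name__ == "__main__":
    rng = random.Random(0)
    theta = [0.7, 0.3]  # hypothetical true topic proportions
    # Word distribution of the document is the mixture A @ theta.
    word_dist = [sum(a * t for a, t in zip(row, theta)) for row in A_TOY]
    doc = rng.choices(range(len(A_TOY)), weights=word_dist, k=200)
    print(estimate_proportions(left_inverse(A_TOY), doc, len(A_TOY)))
```

Because the expected word-frequency vector of a document is A times its topic proportions, any B with B A = I yields an unbiased estimate. The estimate is noisy for short documents and is not guaranteed to be nonnegative or to sum to one, which is why the choice of B matters for variance.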


