Model Selection for Topic Models via Spectral Decomposition

10/23/2014
by Dehua Cheng, et al.

Topic models have achieved significant success in analyzing large-scale text corpora. In practical applications, we are always confronted with the challenge of model selection, i.e., how to appropriately set the number of topics. Following recent advances in topic model inference via tensor decomposition, we make a first attempt to provide a theoretical analysis of model selection in latent Dirichlet allocation. Under mild conditions, we derive upper and lower bounds on the number of topics given a text collection of finite size. Experimental results demonstrate that our bounds are accurate and tight. Furthermore, using the Gaussian mixture model as an example, we show that our methodology can be easily generalized to model selection analysis for other latent models.

research
09/12/2009

A Nonconformity Approach to Model Selection for SVMs

We investigate the issue of model selection and the use of the nonconfor...
research
08/02/2012

Multidimensional Membership Mixture Models

We present the multidimensional membership mixture (M3) models where eve...
research
07/08/2013

Bridging Information Criteria and Parameter Shrinkage for Model Selection

Model selection based on classical information criteria, such as BIC, is...
research
12/10/2013

Guaranteed Model Order Estimation and Sample Complexity Bounds for LDA

The question of how to determine the number of independent latent factor...
research
03/01/2022

Topic Analysis for Text with Side Data

Although latent factor models (e.g., matrix factorization) obtain good p...
research
02/23/2023

Detecting Signs of Model Change with Continuous Model Selection Based on Descriptive Dimensionality

We address the issue of detecting changes of models that lie behind a da...
research
07/31/2017

Familia: An Open-Source Toolkit for Industrial Topic Modeling

Familia is an open-source toolkit for pragmatic topic modeling in indust...
