Concentrated Document Topic Model

02/06/2021
by   Hao Lei, et al.

We propose a Concentrated Document Topic Model (CDTM) for unsupervised text classification that produces a concentrated and sparse document-topic distribution. In particular, an exponential entropy penalty is imposed on the document-topic distribution: documents with diverse topic distributions are penalized more, while those with concentrated topics are penalized less. We apply the model to the benchmark NIPS dataset and observe more coherent topics and more concentrated and sparse document-topic distributions than with Latent Dirichlet Allocation (LDA).
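
To illustrate how such a penalty separates diverse from concentrated documents, here is a minimal sketch. It assumes the penalty takes the form exp(H(θ)), the exponential of the Shannon entropy of a document's topic distribution θ; the abstract does not give the exact formula, so this form, the function name `exponential_entropy`, and the example distributions are all hypothetical and not taken from the paper.

```python
import math


def exponential_entropy(theta):
    """Return exp(H(theta)), where H is the Shannon entropy in nats.

    Assumed form of the penalty, not the paper's stated definition.
    The value ranges from 1 (all mass on one topic) to K (uniform
    over K topics), so it acts like an effective number of topics.
    """
    total = sum(theta)
    if total <= 0:
        raise ValueError("theta must have positive total mass")
    # Normalize so the input does not have to sum exactly to 1.
    probs = [p / total for p in theta]
    # Zero-probability topics contribute nothing to the entropy.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return math.exp(entropy)


# Hypothetical document-topic distributions over four topics.
concentrated = [0.97, 0.01, 0.01, 0.01]
diverse = [0.25, 0.25, 0.25, 0.25]

print(exponential_entropy(concentrated))
print(exponential_entropy(diverse))
```

Under this assumed form, the uniform document receives the maximum penalty (the number of topics, here 4), while the nearly single-topic document receives a penalty close to 1, which matches the behavior described above.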


research
07/11/2012

The Author-Topic Model for Authors and Documents

We introduce the author-topic model, a generative model for documents th...
research
06/16/2017

An Automatic Approach for Document-level Topic Model Evaluation

Topic models jointly learn topics and document-level topic distribution....
research
01/22/2014

Parsimonious Topic Models with Salient Word Discovery

We propose a parsimonious topic model for text corpora. In related model...
research
07/24/2015

The Polylingual Labeled Topic Model

In this paper, we present the Polylingual Labeled Topic Model, a model w...
research
02/25/2010

Syntactic Topic Models

The syntactic topic model (STM) is a Bayesian nonparametric model of lan...
research
10/18/2021

Uncertainty-aware Topic Modeling Visualization

Topic modeling is a state-of-the-art technique for analyzing text corpor...
research
03/31/2021

Topic Scaling: A Joint Document Scaling – Topic Model Approach To Learn Time-Specific Topics

This paper proposes a new methodology to study sequential corpora by imp...
