Conceptualization Topic Modeling

04/07/2017
by Yi-Kun Tang, et al.

Topic modeling is widely used to discover abstract topics in text corpora. Most existing topic models assume a three-layer hierarchical Bayesian structure: each document is modeled as a probability distribution over topics, and each topic as a probability distribution over words. We argue that this assumption is not optimal. Intuitively, it is more reasonable to assume that each topic is a probability distribution over concepts and each concept is a probability distribution over words, i.e., to add a latent concept layer between the topic layer and the word layer of the traditional three-layer structure. In this paper, we test the proposed assumption by incorporating it into two representative topic models, which yields two novel topic models. Extensive experiments comparing the proposed models with their corresponding baselines show that the proposed models significantly outperform the baselines in both case studies and perplexity, which suggests that the new assumption is more reasonable than the traditional one.
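
The four-layer assumption described above (document, topic, concept, word) can be illustrated with a small generative sketch. This is not the authors' model: the abstract does not specify priors or inference, so the symmetric Dirichlet priors, the toy sizes, and every function name below are assumptions chosen only to make the layering concrete.

```python
import random

def sample_dirichlet(rng, alpha, size):
    """Draw from a symmetric Dirichlet by normalizing Gamma variates."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]

def sample_index(rng, probs):
    """Draw one index from a categorical distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_document(rng, doc_topics, topic_concepts, concept_words, length):
    """Generate word ids by walking document -> topic -> concept -> word."""
    words = []
    for _ in range(length):
        z = sample_index(rng, doc_topics)         # topic drawn from the document
        c = sample_index(rng, topic_concepts[z])  # concept drawn from the topic
        w = sample_index(rng, concept_words[c])   # word drawn from the concept
        words.append(w)
    return words

def make_toy_model(seed=0, n_topics=3, n_concepts=5, vocab_size=20):
    """Build toy distributions; all priors and sizes here are assumptions."""
    rng = random.Random(seed)
    topic_concepts = [sample_dirichlet(rng, 0.5, n_concepts) for _ in range(n_topics)]
    concept_words = [sample_dirichlet(rng, 0.5, vocab_size) for _ in range(n_concepts)]
    doc_topics = sample_dirichlet(rng, 0.5, n_topics)
    return rng, doc_topics, topic_concepts, concept_words

rng, doc_topics, topic_concepts, concept_words = make_toy_model()
document = generate_document(rng, doc_topics, topic_concepts, concept_words, length=50)
```

Dropping the concept step and letting each topic emit words directly recovers the traditional three-layer process, which is the comparison the abstract draws.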

research
11/02/2018

Dirichlet belief networks for topic structure learning

Recently, considerable research effort has been devoted to developing de...
research
04/26/2020

Neural Topic Modeling with Bidirectional Adversarial Training

Recent years have witnessed a surge of interests of using neural topic m...
research
10/23/2018

Topic representation: finding more representative words in topic models

The top word list, i.e., the top-M words with highest marginal probabili...
research
11/10/2017

Joint Sentiment/Topic Modeling on Text Data Using Boosted Restricted Boltzmann Machine

Recently by the development of the Internet and the Web, different types...
research
09/29/2019

Lifelong Neural Topic Learning in Contextualized Autoregressive Topic Models of Language via Informative Transfers

Topic models such as LDA, DocNADE, iDocNADEe have been popular in docume...
research
06/26/2018

Unveiling the semantic structure of text documents using paragraph-aware Topic Models

Classic Topic Models are built under the Bag Of Words assumption, in whi...
research
07/31/2017

Combining Thesaurus Knowledge and Probabilistic Topic Models

In this paper we present the approach of introducing thesaurus knowledge...