Context Reinforced Neural Topic Modeling over Short Texts

08/11/2020
by   Jiachun Feng, et al.

As one of the prevalent topic mining tools, neural topic modeling has attracted considerable interest owing to its efficient training and strong generalisation ability. However, because each short text provides little context, existing neural topic models may suffer from feature sparsity on such documents. To alleviate this issue, we propose a Context Reinforced Neural Topic Model (CRNTM), which has two main characteristics. First, by assuming that each short text covers only a few salient topics, CRNTM infers the topic of each word within a narrow range. Second, the model exploits pre-trained word embeddings by treating topics as multivariate Gaussian distributions or Gaussian mixture distributions in the embedding space. Extensive experiments on two benchmark datasets validate the effectiveness of the proposed model on both topic discovery and text classification.
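To make the second characteristic concrete, the sketch below shows one way a topic modeled as a multivariate Gaussian in the embedding space could yield a distribution over words: evaluate the Gaussian log-density at every word embedding and normalize over the vocabulary. This is only an illustration under stated assumptions, not the authors' implementation; the function name `gaussian_topic_word_dist`, the toy random embeddings, and the choice of softmax normalization are all hypothetical.

```python
import numpy as np

def gaussian_topic_word_dist(embeddings, mean, cov):
    """Hypothetical helper: p(word | topic) from a Gaussian topic in embedding space."""
    d = embeddings.shape[1]
    diff = embeddings - mean                      # (V, d) offsets from the topic mean
    # Solve a linear system rather than inverting the covariance explicitly.
    sol = np.linalg.solve(cov, diff.T).T          # (V, d)
    maha = np.einsum("vd,vd->v", diff, sol)       # squared Mahalanobis distances
    _, logdet = np.linalg.slogdet(cov)
    log_density = -0.5 * (maha + logdet + d * np.log(2 * np.pi))
    # Normalize over the vocabulary (softmax), shifted for numerical stability.
    log_density = log_density - log_density.max()
    weights = np.exp(log_density)
    return weights / weights.sum()

# Toy data standing in for pre-trained word embeddings (assumption).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(6, 3))   # 6 words, 3-dimensional embeddings
mean = embeddings[2]                   # centre the topic on word 2
cov = 0.5 * np.eye(3)                  # isotropic covariance for simplicity

beta = gaussian_topic_word_dist(embeddings, mean, cov)
```

Words whose embeddings lie close to the topic mean receive the most probability mass, which is how pre-trained embeddings can compensate for sparse co-occurrence counts in short texts. A Gaussian mixture topic could be sketched the same way by taking a weighted sum of several such component densities before normalizing.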


research
06/07/2023

Effective Neural Topic Modeling with Embedding Clustering Regularization

Topic models have been prevalent for decades with various applications. ...
research
09/11/2018

Topic Memory Networks for Short Text Classification

Many classification models work poorly on short texts due to data sparsi...
research
02/12/2015

Ordering-sensitive and Semantic-aware Topic Modeling

Topic modeling of textual corpora is an important and challenging proble...
research
10/22/2018

Sparsemax and Relaxed Wasserstein for Topic Sparsity

Topic sparsity refers to the observation that individual documents usual...
research
10/16/2017

Classifying Web Exploits with Topic Modeling

This short empirical paper investigates how well topic modeling and data...
research
10/09/2018

textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior

We address two challenges of probabilistic topic modelling in order to b...
research
12/17/2014

Word Network Topic Model: A Simple but General Solution for Short and Imbalanced Texts

The short text has been the prevalent format for information of Internet...
