Improving Contextualized Topic Models with Negative Sampling

03/27/2023
by   Suman Adhya, et al.
0

Topic modeling has emerged as a dominant method for exploring large document collections. Recent approaches to topic modeling use large contextualized language models and variational autoencoders. In this paper, we propose a negative sampling mechanism for a contextualized topic model to improve the quality of the generated topics. In particular, during model training, we perturb the generated document-topic vector and use a triplet loss to encourage the document reconstructed from the correct document-topic vector to be similar to the input document and dissimilar to the document reconstructed from the perturbed vector. Experiments for different topic counts on three publicly available benchmark datasets show that in most cases, our approach leads to an increase in topic coherence over that of the baselines. Our model also achieves very high topic diversity.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/16/2017

An Automatic Approach for Document-level Topic Model Evaluation

Topic models jointly learn topics and document-level topic distribution....
research
10/25/2021

Contrastive Learning for Neural Topic Model

Recent empirical studies show that adversarial topic models (ATM) can su...
research
09/02/2023

MPTopic: Improving topic modeling via Masked Permuted pre-training

Topic modeling is pivotal in discerning hidden semantic structures withi...
research
03/27/2023

Improving Neural Topic Models with Wasserstein Knowledge Distillation

Topic modeling is a dominant method for exploring document collections o...
research
10/05/2020

Improving Neural Topic Models using Knowledge Distillation

Topic models are often used to identify human-interpretable topics to he...
research
12/02/2020

TAN-NTM: Topic Attention Networks for Neural Topic Modeling

Topic models have been widely used to learn representations from text an...
research
07/24/2019

Topic Modeling with Wasserstein Autoencoders

We propose a novel neural topic model in the Wasserstein autoencoders (W...

Please sign up or login with your details

Forgot password? Click here to reset