TAN-NTM: Topic Attention Networks for Neural Topic Modeling

12/02/2020
by Madhur Panwar, et al.

Topic models have been widely used to learn representations from text and gain insight into document corpora. To perform topic discovery, existing neural models take a document's bag-of-words (BoW) representation as input, apply variational inference, and learn the topic-word distribution by reconstructing the BoW. Such methods have mainly focused on analysing the effect of enforcing suitable priors on the document distribution. However, little attention has been given to encoding improved document features that capture document semantics better. In this work, we propose a novel framework, TAN-NTM, which models a document as a sequence of tokens instead of a BoW at the input layer and processes it through an LSTM whose output is used to perform variational inference followed by BoW decoding. We apply attention on the LSTM outputs so that the model can attend to relevant words that convey topic-related cues. We hypothesise that attention is most effective when performed in a topic-guided manner and establish this empirically through ablations. We factor in the topic-word distribution to perform topic-aware attention, achieving state-of-the-art results with a 9-15% improvement over the scores of existing SOTA topic models on the NPMI coherence metric across four benchmark datasets: 20NewsGroup, Yelp, AGNews, and DBpedia. TAN-NTM also obtains better document classification accuracy owing to the improved document-topic features it learns. We show qualitatively that the attention mechanism enables unsupervised discovery of keywords. Motivated by this, we further show that our proposed framework achieves state-of-the-art performance on topic-aware supervised generation of keyphrases on the StackExchange and Weibo datasets.
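The abstract reports its headline result in terms of NPMI coherence. As a rough illustration of what that metric measures, the sketch below computes the average normalized pointwise mutual information over all word pairs in a topic. It assumes document-level co-occurrence counts on a small tokenized reference corpus; the function name `npmi_coherence`, the toy corpus, and the handling of the boundary cases are illustrative assumptions, not the evaluation setup used by the authors.

```python
import math
from itertools import combinations


def npmi_coherence(topic_words, documents):
    """Average NPMI over all word pairs of one topic.

    Assumption: probabilities are estimated as the fraction of documents
    in which the word (or word pair) appears at least once.
    """
    if len(topic_words) < 2:
        raise ValueError("need at least two topic words to form a pair")

    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents that contain every word in `words`.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w_i, w_j in combinations(topic_words, 2):
        p_i, p_j, p_ij = prob(w_i), prob(w_j), prob(w_i, w_j)
        if p_ij == 0.0:
            # Convention: words that never co-occur get the minimum score.
            scores.append(-1.0)
        elif p_ij == 1.0:
            # Both words occur in every document; use the limiting value
            # and avoid dividing by -log(1) = 0.
            scores.append(1.0)
        else:
            pmi = math.log(p_ij / (p_i * p_j))
            scores.append(pmi / -math.log(p_ij))
    return sum(scores) / len(scores)


# Toy reference corpus (hypothetical data, for illustration only).
corpus = [["cat", "dog"], ["cat", "dog"], ["car", "road"], ["car", "road"]]
coherent = npmi_coherence(["cat", "dog"], corpus)    # words always co-occur
incoherent = npmi_coherence(["cat", "car"], corpus)  # words never co-occur
```

Scores range from -1 (the words never appear together) through 0 (they co-occur as often as chance predicts) to 1 (they always appear together), so a higher average indicates a topic whose top words tend to occur in the same documents.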


10/22/2020

A Discrete Variational Recurrent Topic Model without the Reparametrization Trick

We show how to learn a neural topic model with discrete random variables...

01/10/2020

Inductive Document Network Embedding with Topic-Word Attention

Document network embedding aims at learning representations for a struct...

09/07/2018

Coherence-Aware Neural Topic Modeling

Topic models are evaluated based on their ability to describe documents ...

11/19/2015

Neural Variational Inference for Text Processing

Recent advances in neural variational inference have spawned a renaissan...

10/25/2021

Contrastive Learning for Neural Topic Model

Recent empirical studies show that adversarial topic models (ATM) can su...

08/11/2018

Document Informed Neural Autoregressive Topic Models

Context information around words helps in determining their actual meani...

07/05/2021

Is Automated Topic Model Evaluation Broken?: The Incoherence of Coherence

Topic model evaluation, like evaluation of other unsupervised methods, c...