Learning document embeddings along with their uncertainties

08/20/2019
by Santosh Kesiraju, et al.

Most text modelling techniques yield only point estimates of document embeddings and fail to capture the uncertainty of those estimates. These uncertainties give a notion of how well the embeddings represent a document. We present the Bayesian subspace multinomial model (Bayesian SMM), a generative log-linear model that learns to represent documents in the form of Gaussian distributions, thereby encoding the uncertainty in their covariances. Additionally, in the proposed Bayesian SMM, we address a commonly encountered problem of intractability that appears during variational inference in mixed-logit models. We also present a generative Gaussian linear classifier for topic identification that exploits the uncertainty in document embeddings. Our intrinsic evaluation using perplexity shows that the proposed Bayesian SMM fits the data better than a variational auto-encoder based document model. Our topic identification (topic ID) experiments on speech (Fisher) and text (20Newsgroups) corpora show that the proposed Bayesian SMM is robust to over-fitting on unseen test data. The topic ID results show that the proposed model is significantly better than variational auto-encoder based methods and achieves results similar to those of fully supervised discriminative models.
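
To make the idea of a classifier that exploits embedding uncertainty concrete, the following is a minimal sketch, not the paper's implementation. It assumes each document is summarized by a Gaussian with mean `doc_mean` and covariance `doc_cov`, and that each topic has a Gaussian with its own mean and a covariance `shared_cov` shared across topics. Under those assumptions, integrating out the uncertain embedding gives a class likelihood of N(doc_mean; class_mean, shared_cov + doc_cov), which is a standard Gaussian identity. All function and variable names here are hypothetical.

```python
import numpy as np

def log_gauss(x, mean, cov):
    """Log density of a multivariate Gaussian N(x; mean, cov)."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def class_log_posteriors(doc_mean, doc_cov, class_means, shared_cov, log_priors):
    """Log posterior over classes for one document with an uncertain embedding.

    Assumption (illustrative only): the document covariance is added to the
    shared class covariance, which is what marginalizing a Gaussian embedding
    under a Gaussian class model yields.
    """
    scores = np.array([
        log_gauss(doc_mean, mu, shared_cov + doc_cov) for mu in class_means
    ]) + log_priors
    scores = scores - scores.max()               # stabilize before exponentiating
    return scores - np.log(np.exp(scores).sum()) # normalize via log-sum-exp

# Toy setup: two topics in a 2-D embedding space, equal priors.
class_means = np.array([[-1.0, 0.0], [1.0, 0.0]])
shared_cov = np.eye(2)
log_priors = np.log(np.array([0.5, 0.5]))
doc_mean = np.array([0.5, 0.0])

# Same mean embedding, once with low and once with high uncertainty.
p_confident = np.exp(class_log_posteriors(
    doc_mean, 0.01 * np.eye(2), class_means, shared_cov, log_priors))
p_uncertain = np.exp(class_log_posteriors(
    doc_mean, 100.0 * np.eye(2), class_means, shared_cov, log_priors))
```

In this toy setup the two documents share the same mean embedding, but the one with the larger covariance receives a posterior closer to the prior. That is the qualitative behaviour one would want from a classifier that takes embedding uncertainty into account.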

research
07/02/2020

Bayesian multilingual topic model for zero-shot cross-lingual topic identification

This paper presents a Bayesian multilingual topic model for learning lan...
research
11/25/2019

Discovering topics with neural topic models built from PLSA assumptions

In this paper we present a model for unsupervised topic discovery in tex...
research
11/03/2016

Tensor Decomposition via Variational Auto-Encoder

Tensor decomposition is an important technique for capturing the high-or...
research
08/01/2017

SenGen: Sentence Generating Neural Variational Topic Model

We present a new topic model that generates documents by sampling a topi...
research
02/25/2020

Variational Inference and Bayesian CNNs for Uncertainty Estimation in Multi-Factorial Bone Age Prediction

Additionally to the extensive use in clinical medicine, biological age (...
research
09/15/2023

RVAE-EM: Generative speech dereverberation based on recurrent variational auto-encoder and convolutive transfer function

In indoor scenes, reverberation is a crucial factor in degrading the per...
research
12/06/2017

Noisy Natural Gradient as Variational Inference

Combining the flexibility of deep learning with Bayesian uncertainty est...
