SentenceMIM: A Latent Variable Language Model

02/18/2020
by Micha Livne, et al.

We introduce sentenceMIM, a probabilistic auto-encoder for language modelling, trained with Mutual Information Machine (MIM) learning. Previous attempts to learn variational auto-encoders (VAEs) for language data have had mixed success, with empirical performance well below state-of-the-art auto-regressive models; a key barrier is the occurrence of posterior collapse with VAEs. The recently proposed MIM framework encourages high mutual information between observations and latent variables, and is more robust against posterior collapse. This paper formulates a MIM model for text data, along with a corresponding learning algorithm. We demonstrate excellent perplexity (PPL) results on several datasets, and show that the framework learns a rich latent space, allowing for interpolation between sentences of different lengths with a fixed-dimensional latent representation. We also demonstrate the versatility of sentenceMIM by utilizing a trained model for question-answering, a transfer learning task, without fine-tuning. To the best of our knowledge, this is the first latent variable model (LVM) for text modelling that achieves competitive performance with non-LVM models.
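Two quantities mentioned in the abstract, perplexity and interpolation in a fixed-dimensional latent space, can be illustrated with a minimal sketch. The function names `perplexity` and `interpolate_latents` are hypothetical and not taken from the paper. Latent codes are assumed here to be plain lists of floats, and linear interpolation is an assumption, since the abstract does not specify the interpolation scheme. The sentenceMIM encoder and decoder themselves are not shown.

```python
import math
from typing import List, Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity: exp of the average negative log-likelihood per token.

    Assumes natural-log probabilities, one per token.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def interpolate_latents(
    z_a: Sequence[float], z_b: Sequence[float], steps: int
) -> List[List[float]]:
    """Linearly interpolate between two latent codes, endpoints included.

    Both codes must share the same dimensionality, which is what a
    fixed-dimensional latent representation provides regardless of the
    lengths of the original sentences.
    """
    if len(z_a) != len(z_b):
        raise ValueError("latent codes must have the same dimension")
    if steps < 2:
        raise ValueError("steps must be at least 2")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # runs from 0.0 to 1.0
        path.append([(1 - t) * a + t * b for a, b in zip(z_a, z_b)])
    return path
```

In a setup like the one the abstract describes, each intermediate code would be passed to a trained decoder to produce a sentence; that decoding step is outside this sketch.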

research
08/30/2019

Implicit Deep Latent Variable Models for Text Generation

Deep latent variable models (LVM) such as variational auto-encoder (VAE)...
research
06/12/2019

Improving Importance Weighted Auto-Encoders with Annealed Importance Sampling

Stochastic variational inference with an amortized inference model and t...
research
06/02/2020

Variational Mutual Information Maximization Framework for VAE Latent Codes with Continuous and Discrete Priors

Learning interpretable and disentangled representations of data is a key...
research
09/23/2016

Language as a Latent Variable: Discrete Generative Models for Sentence Compression

In this work we explore deep generative models of text in which the late...
research
01/20/2019

Modeling the Biological Pathology Continuum with HSIC-regularized Wasserstein Auto-encoders

A crucial challenge in image-based modeling of biomedical data is to ide...
research
08/18/2022

Improving Small Molecule Generation using Mutual Information Machine

We address the task of controlled generation of small molecules, which e...
