Latent-Variable Generative Models for Data-Efficient Text Classification

10/01/2019
by Xiaoan Ding, et al.

Generative classifiers offer potential advantages over their discriminative counterparts, namely in data efficiency, robustness to data shift and adversarial examples, and zero-shot learning (Ng and Jordan, 2002; Yogatama et al., 2017; Lewis and Fan, 2019). In this paper, we improve generative text classifiers by introducing discrete latent variables into the generative story, and we explore several graphical model configurations. We parameterize the distributions using standard neural architectures from conditional language modeling and learn by directly maximizing the log marginal likelihood with gradient-based optimization, which avoids the need for expectation-maximization. We empirically characterize the performance of our models on six text classification datasets. The choice of where to include the latent variable has a significant impact on performance, with the strongest results obtained when the latent variable serves as an auxiliary conditioning variable in the generation of the textual input. This model consistently outperforms both the standard generative classifier and the discriminative classifier in small-data settings. We analyze our model by using it for controlled generation, finding that the latent variable captures interpretable properties of the data, even with very small training sets.
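To make the marginalization concrete, the following is a minimal sketch of how a classifier with a discrete auxiliary latent variable c could score a text x, computing log p(x, y) = log p(y) + log Σ_c p(c) p(x | y, c) and predicting the label with the highest score. It is an illustration under stated assumptions, not the paper's implementation: the toy unigram word model stands in for the neural conditional language model, the priors over y and c are assumed independent, and all names (`log_joint`, `classify`, `WORD_PROBS`) and numbers are hypothetical.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_joint(tokens, log_prior_y, log_prior_c, log_word_probs):
    """Return log p(x, y) for every label y, summing out the latent c.

    log_word_probs[y][c][w] holds log p(w | y, c) for a toy unigram model
    (an assumption; the abstract describes neural language models).
    """
    scores = []
    for y, lp_y in enumerate(log_prior_y):
        # log p(c) + log p(x | y, c) for each latent value c
        per_c = [
            lp_c + sum(log_word_probs[y][c][t] for t in tokens)
            for c, lp_c in enumerate(log_prior_c)
        ]
        scores.append(lp_y + logsumexp(per_c))
    return scores


def classify(tokens, log_prior_y, log_prior_c, log_word_probs):
    """Predict the label y that maximizes log p(x, y)."""
    scores = log_joint(tokens, log_prior_y, log_prior_c, log_word_probs)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical parameters: 2 labels, 2 latent values, vocabulary of 3 word ids.
WORD_PROBS = [
    [[0.7, 0.2, 0.1], [0.2, 0.7, 0.1]],  # p(w | y=0, c=0), p(w | y=0, c=1)
    [[0.1, 0.2, 0.7], [0.1, 0.1, 0.8]],  # p(w | y=1, c=0), p(w | y=1, c=1)
]
LOG_WORD_PROBS = [[[math.log(p) for p in row] for row in label] for label in WORD_PROBS]
LOG_PRIOR_Y = [math.log(0.5), math.log(0.5)]
LOG_PRIOR_C = [math.log(0.5), math.log(0.5)]

if __name__ == "__main__":
    print(classify([0, 1, 0], LOG_PRIOR_Y, LOG_PRIOR_C, LOG_WORD_PROBS))
```

Because the latent variable takes only a small number of discrete values, the sum over c is computed exactly. In a differentiable framework the same quantity could be maximized directly by gradient ascent, which is the property the abstract points to when it says expectation-maximization is not needed.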
