Prediction-Constrained Training for Semi-Supervised Mixture and Topic Models

07/23/2017
by Michael C. Hughes, et al.

Supervisory signals have the potential to make low-dimensional data representations, like those learned by mixture and topic models, more interpretable and useful. We propose a framework for training latent variable models that explicitly balances two goals: recovery of faithful generative explanations of high-dimensional data, and accurate prediction of associated semantic labels. Existing approaches fail to achieve these goals due to an incomplete treatment of a fundamental asymmetry: the intended application is always predicting labels from data, not data from labels. Our prediction-constrained objective for training generative models coherently integrates loss-based supervisory signals while enabling effective semi-supervised learning from partially labeled data. We derive learning algorithms for semi-supervised mixture and topic models using stochastic gradient descent with automatic differentiation. We demonstrate improved prediction quality compared to several previous supervised topic models, achieving predictions competitive with high-dimensional logistic regression on text sentiment analysis and electronic health records tasks while simultaneously learning interpretable topics.
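
As a rough illustration of the balance described above, the following sketch evaluates a penalized objective for a one-dimensional Gaussian mixture: a generative term (negative log-likelihood of all data) plus a weighted prediction loss on the labeled subset only. It is a minimal sketch under stated assumptions, not the authors' implementation. The function names, the unit-variance mixture, the logistic loss, the use of cluster responsibilities as features, and the multiplier `lam` are all hypothetical choices made for illustration. Labels are predicted from data alone, which is one way to reflect the asymmetry the abstract emphasizes. The sketch only evaluates the objective; the gradient-based training mentioned in the abstract is not shown.

```python
import numpy as np

def log_joint(x, means, log_weights):
    # log[pi_k * N(x_n | mu_k, 1)] for every point n and component k
    # (unit variance is an assumption made to keep the sketch short)
    diff = x[:, None] - means[None, :]
    return log_weights[None, :] - 0.5 * diff ** 2 - 0.5 * np.log(2 * np.pi)

def logsumexp(a, axis):
    # numerically stable log(sum(exp(a))) along one axis
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def pc_objective(x, y, labeled, means, log_weights, w, b, lam):
    """Return (total, generative_term, prediction_term).

    x       : (N,) observations
    y       : (N,) binary labels in {0, 1}; entries at unlabeled points are ignored
    labeled : (N,) boolean mask marking which points have labels
    lam     : hypothetical weight on the prediction loss
    """
    lj = log_joint(x, means, log_weights)
    log_px = logsumexp(lj, axis=1)           # log p(x_n)
    resp = np.exp(lj - log_px[:, None])      # p(z_n = k | x_n), computed from x only
    logits = resp @ w + b                    # label score from data, never from y
    s = 2.0 * y - 1.0                        # map {0, 1} to {-1, +1}
    loss = np.logaddexp(0.0, -s * logits)    # logistic loss per point
    gen = -np.sum(log_px)                    # uses every point, labeled or not
    pred = np.sum(loss[labeled])             # uses labeled points only
    return gen + lam * pred, gen, pred
```

With `lam = 0` the objective reduces to the ordinary mixture-model likelihood; larger values of `lam` place more weight on label prediction. Unlabeled points still contribute to the generative term, which is what makes the setup semi-supervised.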

research
12/01/2017

Prediction-Constrained Topic Models for Antidepressant Recommendation

Supervisory signals can help topic models discover low-dimensional data ...
research
06/19/2019

Semi-supervised Logistic Learning Based on Exponential Tilt Mixture Models

Consider semi-supervised learning for classification, where both labeled...
research
11/15/2019

Prediction Focused Topic Models for Electronic Health Records

Electronic Health Record (EHR) data can be represented as discrete count...
research
04/07/2022

A Joint Learning Approach for Semi-supervised Neural Topic Modeling

Topic models are some of the most popular ways to represent textual data...
research
11/20/2019

On Universal Features for High-Dimensional Learning and Inference

We consider the problem of identifying universal low-dimensional feature...
research
10/13/2020

Controlling the Interaction Between Generation and Inference in Semi-Supervised Variational Autoencoders Using Importance Weighting

Even though Variational Autoencoders (VAEs) are widely used for semi-sup...
research
07/23/2020

Semi-supervised Learning From Demonstration Through Program Synthesis: An Inspection Robot Case Study

Semi-supervised learning improves the performance of supervised machine ...
