An empirical study of domain-agnostic semi-supervised learning via energy-based models: joint-training and pre-training

10/25/2020
by   Yunfu Song, et al.

A class of recent semi-supervised learning (SSL) methods relies heavily on domain-specific data augmentations. In contrast, generative SSL methods perform unsupervised learning with generative models, via either joint-training or pre-training, and are more appealing from the perspective of being domain-agnostic, since they do not inherently require data augmentations. Joint-training estimates the joint distribution of observations and labels, while pre-training is performed over observations only. Recently, energy-based models (EBMs) have achieved promising results for generative modeling. Joint-training via EBMs for SSL has been explored with encouraging results across different data modalities. In this paper, we make two contributions. First, we explore pre-training via EBMs for SSL and compare it to joint-training. Second, a suite of experiments is conducted over the domains of image classification and natural language labeling to give a realistic overall picture of the performance of EBM-based SSL methods. It is found that joint-training EBMs outperform pre-training EBMs marginally but nearly consistently.
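The distinction the abstract draws between joint-training (estimating p(x, y)) and pre-training (unsupervised learning over observations x only) can be illustrated with a toy discrete EBM. This sketch is not the paper's model; the energy table, objectives, and variable names are illustrative assumptions, chosen so the decomposition log p(x, y) = log p(x) + log p(y | x) is explicit.

```python
import numpy as np

# Toy discrete EBM: 4 observation values, 2 labels.
# E[x, y] is an energy table; p(x, y) = exp(-E[x, y]) / Z.
rng = np.random.default_rng(0)
E = rng.normal(size=(4, 2))

def log_joint(E):
    """log p(x, y) for every (x, y) under the EBM."""
    logits = -E
    log_Z = np.logaddexp.reduce(logits.ravel())  # normalizer over all (x, y)
    return logits - log_Z

def log_marginal(E):
    """log p(x), marginalizing out the label y."""
    return np.logaddexp.reduce(log_joint(E), axis=1)

def log_conditional(E):
    """log p(y | x) = log p(x, y) - log p(x): the induced classifier."""
    return log_joint(E) - log_marginal(E)[:, None]

# Joint-training objective on one labeled pair (x=1, y=0) plus one
# unlabeled observation (x=3): maximize log p(x, y) + log p(x).
joint_objective = log_joint(E)[1, 0] + log_marginal(E)[3]

# Pre-training uses only log p(x) for the unsupervised phase; the
# classifier p(y | x) is then fit separately on the labeled data.
pretrain_objective = log_marginal(E)[1] + log_marginal(E)[3]
```

For a discrete toy model the partition function Z is an exact sum; in the paper's setting the energies come from a neural network over continuous inputs, so both objectives require approximate training (e.g. MCMC-based gradient estimates), which the toy omits.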


