Taming VAEs

10/01/2018
by Danilo Jimenez Rezende et al.

In spite of remarkable progress in deep latent variable generative modeling, training still remains a challenge due to a combination of optimization and generalization issues. In practice, heuristic algorithms (such as hand-crafted annealing of KL terms) are often used to achieve the desired results, but such solutions are not robust to changes in model architecture or dataset. The best settings can vary dramatically from one problem to another, requiring expensive parameter sweeps for each new case. Here we build on the idea of training VAEs with additional constraints as a way to control their behaviour. We first present a detailed theoretical analysis of constrained VAEs, expanding our understanding of how these models work. We then introduce and analyze a practical algorithm termed Generalized ELBO with Constrained Optimization (GECO). The main advantage of GECO for the machine learning practitioner is a more intuitive, yet principled, process of tuning the loss: it involves defining a set of constraints, which typically have an explicit relation to the desired model performance, in contrast to tweaking abstract hyper-parameters that implicitly affect model behavior. Encouraging experimental results on several standard datasets indicate that GECO is a robust and effective tool for balancing reconstruction and compression constraints.
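To make the mechanism concrete, below is a minimal sketch of a GECO-style training step, assuming a reconstruction constraint of the form E[(x - x_hat)^2] <= kappa and a multiplicative update for the Lagrange multiplier. The tiny model, the tolerance kappa, and the rates alpha and nu are illustrative choices for this sketch, not the paper's exact architecture or settings.

```python
# Sketch of GECO: minimize KL(q(z|x) || p(z)) subject to a reconstruction
# constraint, with a Lagrange multiplier updated from a moving average of
# the constraint. Illustrative only; hyper-parameters are assumptions.
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU(),
                                 nn.Linear(256, 2 * z_dim))
        self.dec = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(),
                                 nn.Linear(256, x_dim))

    def forward(self, x):
        mu, log_var = self.enc(x).chunk(2, dim=-1)
        # Reparameterization trick: z = mu + sigma * eps.
        z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()
        x_hat = self.dec(z)
        # Analytic KL between q(z|x) = N(mu, sigma^2) and p(z) = N(0, I).
        kl = 0.5 * (mu.pow(2) + log_var.exp() - log_var - 1).sum(-1)
        return x_hat, kl

def geco_step(model, opt, x, lamb, c_ma, kappa=0.05, alpha=0.99, nu=0.01):
    x_hat, kl = model(x)
    # Constraint C(x): mean squared error should not exceed kappa.
    constraint = (x - x_hat).pow(2).mean() - kappa
    # Moving average of the constraint stabilizes the multiplier update.
    c_ma = alpha * c_ma + (1 - alpha) * constraint.detach()
    # Lagrangian: KL + lambda * constraint. The detach trick makes the
    # penalty's *value* equal c_ma while gradients flow through the
    # current batch's constraint.
    loss = kl.mean() + lamb * (constraint + (c_ma - constraint).detach())
    opt.zero_grad()
    loss.backward()
    opt.step()
    # Multiplicative update keeps lambda positive: it grows while the
    # constraint is violated and shrinks once it is satisfied.
    lamb = lamb * torch.exp(nu * c_ma)
    return lamb, c_ma

# Hypothetical usage with a DataLoader `loader` of flattened images:
# model = TinyVAE()
# opt = torch.optim.Adam(model.parameters(), lr=1e-3)
# lamb, c_ma = torch.tensor(1.0), torch.tensor(0.0)
# for x in loader:
#     lamb, c_ma = geco_step(model, opt, x, lamb, c_ma)
```

The practical appeal is visible in the signature: the practitioner sets kappa, a tolerance with a direct interpretation (acceptable reconstruction error), and the multiplier that would otherwise be a hand-tuned KL weight adjusts itself during training.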

Related Research

01/17/2022 · A Comparative study of Hyper-Parameter Optimization Tools
Most of the machine learning models have associated hyper-parameters alo...

04/05/2023 · Hyper-parameter Tuning for Adversarially Robust Models
This work focuses on the problem of hyper-parameter tuning (HPT) for rob...

08/09/2020 · Stability analysis for the Implicit-Explicit discretization of the Cahn-Hilliard equation
Implicit-Explicit methods have been widely used for the efficient numeri...

05/12/2019 · Budgeted Training: Rethinking Deep Neural Network Training Under Resource Constraints
In most practical settings and theoretical analysis, one assumes that a ...

07/15/2023 · Minimal Random Code Learning with Mean-KL Parameterization
This paper studies the qualitative behavior and robustness of two varian...

09/06/2022 · Annealing Optimization for Progressive Learning with Stochastic Approximation
In this work, we introduce a learning model designed to meet the needs o...

03/25/2019 · Cyclical Annealing Schedule: A Simple Approach to Mitigating KL Vanishing
Variational autoencoders (VAEs) with an auto-regressive decoder have bee...
