Bridging Discrete and Backpropagation: Straight-Through and Beyond

04/17/2023
by Liyuan Liu, et al.

Backpropagation, the cornerstone of deep learning, is limited to computing gradients for continuous variables. This limitation hinders research on problems involving discrete latent variables. To address this issue, we propose a novel approach for approximating the gradient of parameters involved in generating discrete latent variables. First, we examine the widely used Straight-Through (ST) heuristic and demonstrate that it works as a first-order approximation of the gradient. Guided by this analysis, we propose ReinMax, which integrates Heun's method, a second-order numerical method for solving ODEs, to approximate the gradient. ReinMax achieves second-order accuracy without requiring the Hessian or other second-order derivatives. We conduct experiments on structured output prediction and unsupervised generative modeling tasks. Our results show that ReinMax brings consistent improvements over the state of the art, including ST and Straight-Through Gumbel-Softmax. Implementations are released at https://github.com/microsoft/ReinMax.
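
For context, the Straight-Through heuristic analyzed above can be summarized in a few lines of PyTorch: the forward pass emits a hard one-hot sample, while the backward pass reuses the softmax probabilities as a surrogate gradient path, which is the first-order approximation discussed in the abstract. The sketch below is illustrative only and is not the released ReinMax implementation; the function name `straight_through_sample` is ours. As background, Heun's method, which ReinMax builds on, improves a first-order Euler step y_{n+1} = y_n + h f(y_n) by averaging the slopes at the current point and at the Euler-predicted endpoint: y_{n+1} = y_n + (h/2) (f(y_n) + f(y_n + h f(y_n))).

```python
# Minimal, illustrative sketch of the classic Straight-Through (ST) estimator
# (not the authors' ReinMax code). Forward pass: a discrete one-hot sample.
# Backward pass: gradients flow through the softmax probabilities instead.
import torch
import torch.nn.functional as F

def straight_through_sample(logits: torch.Tensor) -> torch.Tensor:
    probs = F.softmax(logits, dim=-1)                 # continuous distribution
    idx = torch.multinomial(probs, num_samples=1)     # draw a discrete sample
    hard = F.one_hot(idx.squeeze(-1), logits.size(-1)).to(probs)
    # Value equals `hard`; gradient w.r.t. `logits` comes only from `probs`.
    return hard + probs - probs.detach()
```

In the forward pass the returned tensor is exactly the one-hot sample, so downstream discrete computations are unchanged; in the backward pass the detached terms contribute no gradient, leaving the softmax Jacobian as the first-order surrogate that the paper analyzes and improves upon.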


Related research

09/07/2016 - Discrete Variational Autoencoders
Probabilistic models with discrete latent variables naturally capture da...

11/03/2016 - Categorical Reparameterization with Gumbel-Softmax
Categorical variables are a natural choice for representing discrete str...

08/24/2023 - A second-order length-preserving and unconditionally energy stable rotational discrete gradient method for Oseen-Frank gradient flows
We present a second-order strictly length-preserving and unconditionally...

07/03/2020 - Efficient Marginalization of Discrete and Structured Latent Variables via Sparsity
Training neural network models with discrete (categorical or structured)...

01/23/2020 - Information Compensation for Deep Conditional Generative Networks
In recent years, unsupervised/weakly-supervised conditional generative a...

09/09/2015 - Fast Second-Order Stochastic Backpropagation for Variational Inference
We propose a second-order (Hessian or Hessian-free) based optimization m...

11/04/2020 - EAdam Optimizer: How ε Impact Adam
Many adaptive optimization methods have been proposed and used in deep l...
