Learning Hard Alignments with Variational Inference

05/16/2017
by Dieterich Lawson, et al.

There has recently been significant interest in hard attention models for tasks such as object recognition, visual captioning and speech recognition. Hard attention can offer benefits over soft attention such as decreased computational cost, but training hard attention models can be difficult because of the discrete latent variables they introduce. Previous work used REINFORCE and Q-learning to approach these issues, but those methods can provide high-variance gradient estimates and be slow to train. In this paper, we tackle the problem of learning hard attention for a sequential task using variational inference methods, specifically the recently introduced VIMCO and NVIL. Furthermore, we propose a novel baseline that adapts VIMCO to this setting. We demonstrate our method on a phoneme recognition task in clean and noisy environments and show that our method outperforms REINFORCE, with the difference being greater for a more complicated task.
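The key ingredient referenced above is the VIMCO estimator: a multi-sample score-function gradient in which each sample's learning signal is centered by a leave-one-out baseline built from the other samples. The sketch below is only an illustration of that idea, not the authors' implementation; the function name and the NumPy/SciPy usage are assumptions made for this example. Given K log importance weights log w_k = log p(x, z_k) - log q(z_k | x), it computes the multi-sample bound estimate and the per-sample signals that multiply grad log q(z_j | x).

import numpy as np
from scipy.special import logsumexp

def vimco_learning_signals(log_w):
    """Per-sample VIMCO learning signals from K log importance weights.

    log_w: array of shape (K,), log_w[k] = log p(x, z_k) - log q(z_k | x).
    Returns (L_hat, signals), where L_hat estimates the K-sample bound
    log (1/K) sum_k w_k and signals[j] = L_hat - L_hat_{-j} is the
    leave-one-out-centered signal for sample j.
    """
    K = log_w.shape[0]
    L_hat = logsumexp(log_w) - np.log(K)  # log of the average weight

    signals = np.empty(K)
    for j in range(K):
        # Replace log w_j by the arithmetic mean of the other K-1 log weights
        # (the geometric mean in weight space), then recompute the bound.
        log_w_loo = log_w.copy()
        log_w_loo[j] = (log_w.sum() - log_w[j]) / (K - 1)
        L_hat_minus_j = logsumexp(log_w_loo) - np.log(K)
        signals[j] = L_hat - L_hat_minus_j
    return L_hat, signals

These per-sample leave-one-out baselines are what distinguish VIMCO from a plain REINFORCE-style estimator, which would center every sample with the same (constant or learned) baseline and typically yields higher-variance gradients.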

