Variational Bayesian Dropout

11/19/2018
by Yuhang Liu, et al.

Variational dropout (VD) is a generalization of Gaussian dropout that infers the posterior of the network weights under a log-uniform prior, so that the weights and the dropout rates are learned simultaneously. The log-uniform prior not only explains the regularization capacity of Gaussian dropout in network training, but also underpins the inference of this posterior. However, the log-uniform prior is an improper prior (i.e., its integral is infinite), which makes the posterior inference ill-posed and thus restricts the regularization performance of VD. To address this problem, we present a new generalization of Gaussian dropout, termed variational Bayesian dropout (VBD), which instead exploits a hierarchical prior on the network weights and infers a new joint posterior. Specifically, we implement the hierarchical prior as a zero-mean Gaussian distribution whose variance is sampled from a uniform hyper-prior. We then incorporate this prior into inferring the joint posterior over the network weights and the variance of the hierarchical prior, with which both network training and dropout-rate estimation can be cast as a single joint optimization problem. More importantly, the hierarchical prior is a proper prior, which makes the posterior inference well-posed. In addition, we show that the proposed VBD can be seamlessly applied to network compression. Experiments on both classification and network compression tasks demonstrate the superior performance of the proposed VBD in terms of regularizing network training.
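The abstract does not reproduce the derivation, but the Gaussian multiplicative-noise machinery that both VD and VBD build on can be sketched briefly. Below is a minimal PyTorch sketch of a linear layer with a learnable per-weight noise ratio alpha, trained with the local reparameterization trick. The layer name, initialization constants, and hyperparameters are illustrative assumptions, and the VBD-specific KL term induced by the hierarchical prior (w ~ N(0, sigma^2) with sigma^2 drawn from a uniform hyper-prior) is deliberately omitted, since its exact form is given in the paper rather than the abstract; this is not the authors' implementation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class VariationalDropoutLinear(nn.Module):
        """Linear layer with multiplicative Gaussian noise on the weights.

        A sketch of the layer family that VD/VBD generalizes; hypothetical
        names and constants, and no VBD-specific KL regularizer.
        """

        def __init__(self, in_features, out_features):
            super().__init__()
            self.weight = nn.Parameter(0.01 * torch.randn(out_features, in_features))
            self.bias = nn.Parameter(torch.zeros(out_features))
            # log of alpha_ij, the per-weight noise-to-signal ratio; the
            # corresponding dropout rate is p = alpha / (1 + alpha)
            self.log_alpha = nn.Parameter(torch.full((out_features, in_features), -3.0))

        def forward(self, x):
            mean = F.linear(x, self.weight, self.bias)
            if not self.training:
                return mean  # noise-free expectation at test time
            alpha = self.log_alpha.exp()
            # local reparameterization trick: sample the pre-activation,
            # whose variance under multiplicative Gaussian weight noise is
            # sum_i x_i^2 * alpha_ij * w_ij^2
            var = F.linear(x.pow(2), alpha * self.weight.pow(2))
            return mean + (var + 1e-8).sqrt() * torch.randn_like(mean)

Because alpha is learned per weight, weights whose alpha grows large contribute mostly noise and can be pruned, which is how layers of this kind are typically used for network compression.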


