Bayesian Hypernetworks

10/13/2017
by David Krueger, et al.

We propose Bayesian hypernetworks: a framework for approximate Bayesian inference in neural networks. A Bayesian hypernetwork, h, is a neural network which learns to transform a simple noise distribution, p(ϵ) = N(0,I), to a distribution q(θ) := q(h(ϵ)) over the parameters θ of another neural network (the "primary network"). We train q with variational inference, using an invertible h to enable efficient estimation of the variational lower bound on the posterior p(θ | D) via sampling. In contrast to most methods for Bayesian deep learning, Bayesian hypernets can represent a complex multimodal approximate posterior with correlations between parameters, while enabling cheap i.i.d. sampling of q(θ). We demonstrate these qualitative advantages of Bayesian hypernets, which also achieve competitive performance on a suite of tasks that demonstrate the advantage of estimating model uncertainty, including active learning and anomaly detection.
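
To make the role of invertibility concrete, the following is a minimal sketch of how a sample-based estimate of the variational lower bound can be formed. It is not the architecture from the paper: the invertible map h is assumed here to be a simple elementwise affine transform (which cannot capture the multimodality or correlations described above), the primary network is assumed to be a toy linear regressor with unit Gaussian noise and a standard normal prior, and all function names are hypothetical. The point it illustrates is only that, when h is invertible, log q(θ) follows from the change-of-variables formula, so each sampled θ contributes a term log p(D | θ) + log p(θ) - log q(θ).

```python
import numpy as np

def log_standard_normal(v):
    # Log density of N(0, I) at v, summed over the last axis.
    return -0.5 * np.sum(v ** 2 + np.log(2.0 * np.pi), axis=-1)

def hypernet_sample(mu, log_sigma, n_samples, rng):
    """Draw theta = h(eps) and return (theta, log q(theta)).

    Assumption: h is an elementwise affine map, a stand-in for a
    richer invertible hypernetwork.
    """
    eps = rng.standard_normal((n_samples, mu.shape[0]))  # eps ~ p(eps) = N(0, I)
    theta = mu + np.exp(log_sigma) * eps                 # invertible map h
    # Change of variables: log q(theta) = log p(eps) - log|det dh/d eps|
    log_q = log_standard_normal(eps) - np.sum(log_sigma)
    return theta, log_q

def elbo_estimate(mu, log_sigma, x, y, n_samples=64, seed=0):
    """Monte Carlo estimate of the variational lower bound."""
    rng = np.random.default_rng(seed)
    theta, log_q = hypernet_sample(mu, log_sigma, n_samples, rng)
    # Toy primary network (assumption): y ~ N(x @ theta, 1).
    preds = theta @ x.T                                  # (n_samples, n_data)
    log_lik = -0.5 * np.sum((y - preds) ** 2 + np.log(2.0 * np.pi), axis=-1)
    log_prior = log_standard_normal(theta)               # p(theta) = N(0, I)
    return np.mean(log_lik + log_prior - log_q)

if __name__ == "__main__":
    x = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    y = np.array([0.5, -1.0, -0.5])
    mu = np.array([0.5, -1.0])
    log_sigma = np.array([0.1, -0.3])
    print("ELBO estimate:", elbo_estimate(mu, log_sigma, x, y))
```

In practice mu and log_sigma (or the weights of a more expressive invertible h) would be optimized by gradient ascent on this estimate; the sketch only evaluates it for fixed values.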
