DeepAI AI Chat
Log In Sign Up

Rate Optimal Variational Bayesian Inference for Sparse DNN

by   Jincheng Bai, et al.
Purdue University

Sparse deep neural network (DNN) has drawn much attention in recent studies because it possesses a great approximation power and is easier to implement and store in practice comparing to fully connected DNN. In this work, we consider variational Bayesian inference, a computationally efficient alternative to Markov chain Monte Carlo method, on the sparse DNN modeling under spike-and-slab prior. Our theoretical investigation shows that, for any α-Hölder smooth function, the variational posterior distribution shares the (near-)optimal contraction property, and the variation inference leads to (near-)optimal generalization error, as long as the network architecture is properly tuned according to smoothness parameter α. Furthermore, an adaptive variational inference procedure is developed to automatically select optimal network structure even when α is unknown. Our result also applies to the case that the truth is instead a ReLU neural network, and certain contraction bound is obtained.


page 1

page 2

page 3

page 4


Generalization Error Bounds for Deep Variational Inference

Variational inference is becoming more and more popular for approximatin...

Convergence Rates of Variational Inference in Sparse Deep Learning

Variational inference is becoming more and more popular for approximatin...

Posterior Concentration for Sparse Deep Learning

Spike-and-Slab Deep Learning (SS-DL) is a fully Bayesian alternative to ...

What Are Bayesian Neural Network Posteriors Really Like?

The posterior over Bayesian neural network (BNN) parameters is extremely...

Efficient Variational Inference for Sparse Deep Learning with Theoretical Guarantee

Sparse deep learning aims to address the challenge of huge storage consu...

Deep Neural Networks as Point Estimates for Deep Gaussian Processes

Deep Gaussian processes (DGPs) have struggled for relevance in applicati...

Variational Bayes Deep Operator Network: A data-driven Bayesian solver for parametric differential equations

Neural network based data-driven operator learning schemes have shown tr...