DeepAI AI Chat
Log In Sign Up

Convergence Rates of Variational Inference in Sparse Deep Learning

Variational inference is becoming more and more popular for approximating intractable posterior distributions in Bayesian statistics and machine learning. Meanwhile, a few recent works have provided theoretical justification and new insights on deep neural networks for estimating smooth functions in usual settings such as nonparametric regression. In this paper, we show that variational inference for sparse deep learning retains the same generalization properties than exact Bayesian inference. In particular, we highlight the connection between estimation and approximation theories via the classical bias-variance trade-off and show that it leads to near-minimax rates of convergence for Hölder smooth functions. Additionally, we show that the model selection framework over the neural network architecture via ELBO maximization does not overfit and adaptively achieves the optimal rate of convergence.


page 1

page 2

page 3

page 4


Generalization Error Bounds for Deep Variational Inference

Variational inference is becoming more and more popular for approximatin...

Efficient Variational Inference for Sparse Deep Learning with Theoretical Guarantee

Sparse deep learning aims to address the challenge of huge storage consu...

Polynomial-Spline Neural Networks with Exact Integrals

Using neural networks to solve variational problems, and other scientifi...

Rate Optimal Variational Bayesian Inference for Sparse DNN

Sparse deep neural network (DNN) has drawn much attention in recent stud...

Measurement error models: from nonparametric methods to deep neural networks

The success of deep learning has inspired recent interests in applying n...

Loss convergence in a causal Bayesian neural network of retail firm performance

We extend the empirical results from the structural equation model (SEM)...

Consistency of Variational Bayes Inference for Estimation and Model Selection in Mixtures

Mixture models are widely used in Bayesian statistics and machine learni...