Bayesian Interpolation with Deep Linear Networks

12/29/2022
by Boris Hanin, et al.

This article concerns Bayesian inference using deep linear networks with output dimension one. In the interpolating (zero noise) regime we show that with Gaussian weight priors and MSE negative log-likelihood loss both the predictive posterior and the Bayesian model evidence can be written in closed form in terms of a class of meromorphic special functions called Meijer-G functions. These results are non-asymptotic and hold for any training dataset, network depth, and hidden layer widths, giving exact solutions to Bayesian interpolation using a deep Gaussian process with a Euclidean covariance at each layer. Through novel asymptotic expansions of Meijer-G functions, a rich new picture of the role of depth emerges. Specifically, we find that the posteriors in deep linear networks with data-independent priors are the same as in shallow networks with evidence-maximizing data-dependent priors. In this sense, deep linear networks make provably optimal predictions. We also prove that, starting from data-agnostic priors, Bayesian model evidence in wide networks is only maximized at infinite depth. This gives a principled reason to prefer deeper networks (at least in the linear case). Finally, our results show that with data-agnostic priors a novel notion of effective depth, given by (# hidden layers) × (# training data) / (network width), determines the Bayesian posterior in wide linear networks, giving rigorous new scaling laws for generalization error.
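The setting above can be sketched in a few lines. A minimal illustration (with hypothetical layer sizes, not the paper's experiments) of a deep linear network with output dimension one, Gaussian weight priors, and the abstract's effective-depth quantity:

```python
import numpy as np

rng = np.random.default_rng(0)

def deep_linear_net(x, weights):
    """Forward pass of a deep linear network: a product of weight matrices.

    The composition of linear maps is itself linear, which is what makes
    the Bayesian posterior tractable in closed form.
    """
    h = x
    for W in weights:
        h = W @ h
    return h

# Hypothetical sizes chosen for illustration only.
d_in, width, depth, n_train = 10, 100, 5, 20

# One Gaussian-prior weight matrix per layer; the final layer maps to
# output dimension one, as in the paper's setting.
dims = [d_in] + [width] * depth + [1]
weights = [rng.normal(0.0, 1.0 / np.sqrt(dims[i]), size=(dims[i + 1], dims[i]))
           for i in range(len(dims) - 1)]

x = rng.normal(size=d_in)
y = deep_linear_net(x, weights)  # scalar output, shape (1,)

# Effective depth from the abstract:
# (# hidden layers) * (# training data) / (network width).
effective_depth = depth * n_train / width
```

With these toy sizes, `effective_depth` is 5 × 20 / 100 = 1.0; the abstract's claim is that in wide linear networks this single combination, not depth or dataset size alone, controls the posterior.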


Related research

11/23/2021: Depth induces scale-averaging in overparameterized linear Bayesian neural networks
Inference in deep Bayesian neural networks is only fully understood in t...

04/23/2021: Exact priors of finite neural networks
Bayesian neural networks are theoretically well-understood only in the i...

07/23/2013: Generative, Fully Bayesian, Gaussian, Openset Pattern Classifier
This report works out the details of a closed-form, fully Bayesian, mult...

11/29/2019: Richer priors for infinitely wide multi-layer perceptrons
It is well-known that the distribution over functions induced through a ...

03/06/2023: Bayesian inference with finitely wide neural networks
The analytic inference, e.g. predictive distribution being in closed for...

10/11/2018: Bayesian neural networks increasingly sparsify their units with depth
We investigate deep Bayesian neural networks with Gaussian priors on the...

04/25/2020: Compromise-free Bayesian neural networks
We conduct a thorough analysis of the relationship between the out-of-sa...
