A theory of data variability in Neural Network Bayesian inference

07/31/2023
by Javed Lindner et al.

Bayesian inference and kernel methods are well established in machine learning. The neural network Gaussian process (NNGP) in particular provides a framework for studying neural networks in the limit of infinitely wide hidden layers by means of kernel and inference methods. Building on this limit, we provide a field-theoretic formalism that captures the generalization properties of infinitely wide networks. We systematically compute generalization properties of linear, non-linear, and deep non-linear networks for kernel matrices with heterogeneous entries. In contrast to currently employed spectral methods, we derive the generalization properties from the statistical properties of the input, elucidating the interplay of input dimensionality, training set size, and data variability. We show that data variability leads to a non-Gaussian action reminiscent of a (φ^3 + φ^4)-theory. Applying our formalism to a synthetic task and to MNIST, we obtain a homogeneous-kernel approximation of the learning curve as well as corrections due to data variability; these allow us to estimate the generalization properties and yield exact bounds on the learning curves in the limit of infinitely many training data points.
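To make the NNGP setting concrete, the following is a minimal sketch (not the paper's formalism) of Bayesian inference with an NNGP kernel. It uses the degree-1 arc-cosine kernel, which is the infinite-width kernel of a single ReLU hidden layer, and traces an empirical learning curve by varying the number of training points. The function names (nngp_relu_kernel, gp_posterior_mean), the linear teacher task, and all hyperparameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def nngp_relu_kernel(X1, X2, sigma_w=1.0, sigma_b=0.1):
    """Degree-1 arc-cosine kernel: the NNGP kernel of one ReLU
    hidden layer in the infinite-width limit (Cho & Saul, 2009)."""
    d = X1.shape[1]
    # Input-layer (pre-activation) covariances.
    k12 = sigma_b**2 + sigma_w**2 * (X1 @ X2.T) / d
    k11 = sigma_b**2 + sigma_w**2 * np.sum(X1**2, axis=1) / d
    k22 = sigma_b**2 + sigma_w**2 * np.sum(X2**2, axis=1) / d
    norm = np.sqrt(np.outer(k11, k22))
    theta = np.arccos(np.clip(k12 / norm, -1.0, 1.0))
    # ReLU arc-cosine formula for the post-activation covariance.
    return norm / (2 * np.pi) * (np.sin(theta) + (np.pi - theta) * np.cos(theta))

def gp_posterior_mean(K_tt, K_st, y_train, noise=1e-2):
    """Posterior mean of GP regression: K_st (K_tt + noise*I)^{-1} y."""
    alpha = np.linalg.solve(K_tt + noise * np.eye(len(y_train)), y_train)
    return K_st @ alpha

rng = np.random.default_rng(0)
d = 20                                  # input dimensionality
w_teacher = rng.standard_normal(d)      # hypothetical linear teacher
X_test = rng.standard_normal((500, d))
y_test = X_test @ w_teacher / np.sqrt(d)

# Empirical learning curve: test error vs. number of training points.
for n_train in [10, 50, 200, 1000]:
    X = rng.standard_normal((n_train, d))
    y = X @ w_teacher / np.sqrt(d)
    K_tt = nngp_relu_kernel(X, X)
    K_st = nngp_relu_kernel(X_test, X)
    mse = np.mean((gp_posterior_mean(K_tt, K_st, y) - y_test) ** 2)
    print(f"n_train={n_train:5d}  test MSE={mse:.4f}")
```

In the paper's terms, such a curve corresponds to the homogeneous-kernel approximation; variability of the kernel entries across data points is what drives the corrections the abstract refers to.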

Related research

Sparsity-depth Tradeoff in Infinitely Wide Deep Neural Networks (05/17/2023)
We investigate how sparse neural activity affects the generalization per...

Bayesian inference with finitely wide neural networks (03/06/2023)
The analytic inference, e.g. predictive distribution being in closed for...

Learning curves for Gaussian process regression with power-law priors and targets (10/23/2021)
We study the power-law asymptotics of learning curves for Gaussian proce...

Bayesian inference and neural estimation of acoustic wave propagation (05/28/2023)
In this work, we introduce a novel framework which combines physics and ...

Transformers Can Do Bayesian Inference (12/20/2021)
Currently, it is hard to reap the benefits of deep learning for Bayesian...

Why Size Matters: Feature Coding as Nystrom Sampling (01/15/2013)
Recently, the computer vision and machine learning community has been in...

Time-limited Balanced Truncation for Data Assimilation Problems (12/15/2022)
Balanced truncation is a well-established model order reduction method i...
