Scale Mixtures of Neural Network Gaussian Processes

07/03/2021
by Hyungi Lee, et al.

Recent works have revealed that infinitely-wide feed-forward or recurrent neural networks of any architecture correspond to Gaussian processes, referred to as Neural Network Gaussian Processes (NNGPs). While these works have significantly extended the class of neural networks converging to Gaussian processes, little attention has been paid to broadening the class of stochastic processes to which such neural networks converge. In this work, inspired by scale mixtures of Gaussian random variables, we propose a scale mixture of NNGPs, for which we introduce a prior distribution on the scale of the last-layer parameters. We show that simply introducing a scale prior on the last-layer parameters turns infinitely-wide neural networks of any architecture into a richer class of stochastic processes. In particular, certain scale priors yield heavy-tailed stochastic processes, and inverse gamma priors recover Student's t processes. We further analyze the distributions of neural networks initialized with our prior and trained with gradient descent, and obtain results analogous to those for NNGPs. Finally, we present a practical posterior-inference algorithm for the scale mixture of NNGPs and empirically demonstrate its usefulness on regression and classification tasks.
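To make the construction concrete, below is a minimal sketch in Python/NumPy, not the authors' implementation. It assumes a ReLU NNGP kernel (the arc-cosine kernel of a one-hidden-layer ReLU network with unit weight variance) as a stand-in for the NNGP kernel of a concrete architecture, and places an inverse gamma prior sigma^2 ~ InvGamma(nu/2, nu/2) on the last-layer scale; marginalizing over sigma^2 then yields a Student's t process with nu degrees of freedom, matching the special case noted in the abstract. The kernel choice, function names, and the closed-form conditioning step (standard multivariate-t conditioning, as in Student's t process regression, e.g. Shah et al., 2014) are illustrative assumptions, not details taken from the paper.

```python
# Sketch of a scale mixture of NNGPs: an inverse gamma prior on the
# last-layer scale turns the NNGP into a Student's t process.
import numpy as np
from scipy.stats import invgamma


def relu_nngp_kernel(X1, X2):
    """Arc-cosine kernel of order 1: NNGP kernel of a one-hidden-layer
    ReLU network with unit weight variance (illustrative choice)."""
    n1 = np.linalg.norm(X1, axis=1)
    n2 = np.linalg.norm(X2, axis=1)
    cos = np.clip((X1 @ X2.T) / (np.outer(n1, n2) + 1e-12), -1.0, 1.0)
    theta = np.arccos(cos)
    return np.outer(n1, n2) * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)


def sample_scale_mixture_prior(X, nu=3.0, n_samples=5, rng=None):
    """Draw sigma2 ~ InvGamma(nu/2, nu/2), then f | sigma2 ~ N(0, sigma2 * K).
    Marginally over sigma2, f follows a multivariate t with nu d.o.f."""
    rng = np.random.default_rng(rng)
    K = relu_nngp_kernel(X, X) + 1e-8 * np.eye(len(X))  # jitter for stability
    draws = []
    for _ in range(n_samples):
        sigma2 = invgamma.rvs(a=nu / 2, scale=nu / 2, random_state=rng)
        draws.append(rng.multivariate_normal(np.zeros(len(X)), sigma2 * K))
    return np.stack(draws)


def student_t_posterior(X, y, X_star, nu=3.0, noise=1e-4):
    """Closed-form predictive under the inverse gamma scale prior: a
    multivariate t with nu + n degrees of freedom (standard t conditioning,
    scale-matrix parameterization)."""
    n = len(X)
    Kxx = relu_nngp_kernel(X, X) + noise * np.eye(n)
    Ksx = relu_nngp_kernel(X_star, X)
    Kss = relu_nngp_kernel(X_star, X_star)
    alpha = np.linalg.solve(Kxx, y)       # Kxx^{-1} y
    mean = Ksx @ alpha
    beta = y @ alpha                      # y^T Kxx^{-1} y
    scale = (nu + beta) / (nu + n) * (Kss - Ksx @ np.linalg.solve(Kxx, Ksx.T))
    return mean, scale, nu + n            # location, scale matrix, d.o.f.


if __name__ == "__main__":
    X = np.linspace(-2.0, 2.0, 20).reshape(-1, 1)
    fs = sample_scale_mixture_prior(X, nu=3.0, rng=0)
    print(fs.shape)  # (5, 20): heavy-tailed prior function draws
```

Under this sketch, sample_scale_mixture_prior produces heavy-tailed prior draws, and student_t_posterior returns the location, scale matrix, and degrees of freedom of the predictive multivariate t; the paper's general scale priors need not admit such a closed form, which is what motivates its practical posterior-inference algorithm.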


