Non-Vacuous Generalisation Bounds for Shallow Neural Networks

02/03/2022
by Felix Biggs, et al.

We focus on a specific class of shallow neural networks with a single hidden layer, namely those with L_2-normalised data and either a sigmoid-shaped Gaussian error function ("erf") activation or a Gaussian Error Linear Unit (GELU) activation. For these networks, we derive new generalisation bounds through PAC-Bayesian theory; unlike most existing such bounds, they apply to neural networks with deterministic rather than randomised parameters. Our bounds are empirically non-vacuous when the network is trained with vanilla stochastic gradient descent on MNIST and Fashion-MNIST.
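The abstract fixes the class of predictors studied: a single hidden layer applied to L_2-normalised inputs, with an erf or GELU activation, trained by vanilla SGD. Below is a minimal sketch (not the authors' code) of such a network in PyTorch; the layer sizes, learning rate, and the synthetic batch standing in for MNIST are illustrative assumptions.

# Minimal sketch of the network class described in the abstract.
# Sizes, learning rate, and the synthetic data are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ShallowNet(nn.Module):
    def __init__(self, in_dim=784, hidden=100, out_dim=10, activation="erf"):
        super().__init__()
        self.hidden = nn.Linear(in_dim, hidden)
        self.output = nn.Linear(hidden, out_dim)
        # erf-shaped or GELU activation, as in the paper's setting
        self.act = torch.erf if activation == "erf" else F.gelu

    def forward(self, x):
        # L_2-normalise each input vector, matching the data assumption
        x = F.normalize(x.flatten(1), p=2, dim=1)
        return self.output(self.act(self.hidden(x)))

# One vanilla SGD step on a synthetic batch standing in for MNIST
model = ShallowNet()
opt = torch.optim.SGD(model.parameters(), lr=0.1)  # plain SGD, no momentum
x, y = torch.randn(64, 784), torch.randint(0, 10, (64,))
loss = F.cross_entropy(model(x), y)
opt.zero_grad()
loss.backward()
opt.step()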


research · 01/28/2022
Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks
We study the overparametrization bounds required for the global converge...

research · 11/10/2019
Symmetrical Gaussian Error Linear Units (SGELUs)
In this paper, a novel neural network activation function, called Symmet...

research · 12/21/2021
Risk bounds for aggregated shallow neural networks using Gaussian prior
Analysing statistical properties of neural networks is a central topic i...

research · 07/08/2021
On Margins and Derandomisation in PAC-Bayes
We develop a framework for derandomising PAC-Bayesian generalisation bou...

research · 10/09/2019
Nearly Minimal Over-Parametrization of Shallow Neural Networks
A recent line of work has shown that an overparametrized neural network ...

research · 07/11/2023
Fundamental limits of overparametrized shallow neural networks for supervised learning
We carry out an information-theoretical analysis of a two-layer neural n...

research · 06/15/2021
Predicting Unreliable Predictions by Shattering a Neural Network
Piecewise linear neural networks can be split into subfunctions, each wi...
