The Gaussian equivalence of generative models for learning with two-layer neural networks

06/25/2020
by   Sebastian Goldt, et al.

Understanding how data structure affects learning remains a key challenge for the theory of neural networks. Many theoretical works do not explicitly model training data, or assume that inputs are drawn independently from some factorised probability distribution. Here, we go beyond the simple i.i.d. modelling paradigm by studying neural networks trained on data drawn from structured generative models. We make three contributions. First, we establish rigorous conditions under which a class of generative models shares key statistical properties with an appropriately chosen Gaussian feature model. Second, we use this Gaussian equivalence theorem (GET) to derive a closed set of equations that describe the dynamics of two-layer neural networks trained using one-pass stochastic gradient descent on data drawn from a large class of generators. Third, we complement our theoretical results with experiments demonstrating how our theory applies to deep, pre-trained generative models.
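The training setup named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the single-layer tanh generator, the teacher that labels each sample from its latent vector, the network sizes, the learning rate, and every function name below are assumptions chosen for illustration. It shows only what "one-pass" means, namely that every SGD step uses a freshly generated sample that is never revisited.

```python
import math
import random


def make_generator(latent_dim, input_dim, rng):
    """Hypothetical one-layer generator: x = tanh(F c / sqrt(D)) for a latent vector c."""
    F = [[rng.gauss(0.0, 1.0) for _ in range(latent_dim)] for _ in range(input_dim)]
    scale = math.sqrt(latent_dim)

    def generate(c):
        return [math.tanh(sum(f * ci for f, ci in zip(row, c)) / scale) for row in F]

    return generate


def run_online_sgd(latent_dim=10, input_dim=30, hidden=2, steps=2000,
                   lr=0.1, eval_every=500, n_test=200, seed=0):
    """Train a two-layer tanh network with one-pass SGD; return test errors over time."""
    rng = random.Random(seed)
    generate = make_generator(latent_dim, input_dim, rng)
    # Assumed teacher: the label depends on the latent vector, not on x directly.
    teacher = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]

    def sample():
        c = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]  # Gaussian latent
        y = math.tanh(sum(a * ci for a, ci in zip(teacher, c)) / math.sqrt(latent_dim))
        return generate(c), y

    test_set = [sample() for _ in range(n_test)]
    # First-layer weights of the student; second-layer weights are fixed to 1.
    W = [[rng.gauss(0.0, 0.1) for _ in range(input_dim)] for _ in range(hidden)]
    norm = math.sqrt(input_dim)

    def forward(x):
        hs = [math.tanh(sum(w * xi for w, xi in zip(row, x)) / norm) for row in W]
        return hs, sum(hs)

    def test_error():
        return sum(0.5 * (forward(x)[1] - y) ** 2 for x, y in test_set) / n_test

    errors = [test_error()]
    for t in range(1, steps + 1):
        x, y = sample()                  # one-pass: a fresh sample at every step
        hs, out = forward(x)
        err = out - y
        for k, row in enumerate(W):
            # Gradient of 0.5 * err^2 with respect to row k of W.
            coeff = lr * err * (1.0 - hs[k] ** 2) / norm
            for i in range(input_dim):
                row[i] -= coeff * x[i]
        if t % eval_every == 0:
            errors.append(test_error())
    return errors


if __name__ == "__main__":
    print(run_online_sgd())
```

Calling `run_online_sgd()` returns the test error measured before training and after every `eval_every` steps. The sketch simulates the dynamics directly; it does not implement the closed set of equations or the Gaussian equivalence theorem described in the abstract.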

research
06/01/2021

The Gaussian equivalence of generative models for learning with shallow neural networks

Understanding the impact of data structure on the computational tractabi...
research
11/21/2022

Neural networks trained with SGD learn distributions of increasing complexity

The ability of deep neural networks to generalise well even when they in...
research
09/25/2019

Modelling the influence of data structure on learning in neural networks

The lack of crisp mathematical models that capture the structure of real...
research
06/04/2021

Learning Curves for SGD on Structured Features

The generalization performance of a machine learning algorithm such as a...
research
10/27/2020

A Bayesian Perspective on Training Speed and Model Selection

We take a Bayesian perspective to illustrate a connection between traini...
research
03/26/2018

A Provably Correct Algorithm for Deep Learning that Actually Works

We describe a layer-by-layer algorithm for training deep convolutional n...
research
05/26/2022

Gaussian Universality of Linear Classifiers with Random Labels in High-Dimension

While classical in many theoretical settings, the assumption of Gaussian...
