The Gaussian equivalence of generative models for learning with shallow neural networks

06/01/2021
by Bruno Loureiro, et al.

Understanding the impact of data structure on the computational tractability of learning is a key challenge for the theory of neural networks. Many theoretical works do not explicitly model training data, or assume that inputs are drawn component-wise independently from some simple probability distribution. Here, we go beyond this simple paradigm by studying the performance of neural networks trained on data drawn from pre-trained generative models. This is possible due to a Gaussian equivalence stating that the key metrics of interest, such as the training and test errors, can be fully captured by an appropriately chosen Gaussian model. We provide three strands of rigorous, analytical and numerical evidence corroborating this equivalence. First, we establish rigorous conditions for the Gaussian equivalence to hold in the case of single-layer generative models, as well as deterministic rates for convergence in distribution. Second, we leverage this equivalence to derive a closed set of equations describing the generalisation performance of two widely studied machine learning problems: two-layer neural networks trained using one-pass stochastic gradient descent, and full-batch pre-learned features or kernel methods. Finally, we perform experiments demonstrating how our theory applies to deep, pre-trained generative models. These results open a viable path to the theoretical study of machine learning models with realistic data.
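
To make the equivalence concrete, the following minimal NumPy sketch compares a learner trained on data from a single-layer generative model against the same learner trained on Gaussian data with matched mean and covariance. Everything here is an illustrative assumption rather than the paper's code: the tanh generator, the dimensions, the noiseless linear teacher, and the choice of ridge regression as the learner.

```python
# Minimal sketch of the Gaussian equivalence (illustrative assumptions only,
# not the paper's code): inputs x = tanh(W z) from a single-layer generative
# model vs. Gaussian inputs with matched first and second moments.
import numpy as np

rng = np.random.default_rng(0)
d, k, n_train, n_test = 200, 100, 2000, 2000

W = rng.standard_normal((d, k)) / np.sqrt(k)   # generator weights (assumed)
theta = rng.standard_normal(d) / np.sqrt(d)    # linear teacher (assumed)

def generative_inputs(n):
    z = rng.standard_normal((n, k))            # latent codes
    return np.tanh(z @ W.T)                    # single-layer generative model

# Estimate the first two moments of the generative distribution
X_mom = generative_inputs(20000)
mu, Sigma = X_mom.mean(axis=0), np.cov(X_mom, rowvar=False)
L = np.linalg.cholesky(Sigma + 1e-6 * np.eye(d))

def gaussian_inputs(n):
    # Equivalent Gaussian model: same mean and covariance as the generator
    return mu + rng.standard_normal((n, d)) @ L.T

def ridge_test_error(sample, lam=1e-2):
    Xtr, Xte = sample(n_train), sample(n_test)
    ytr, yte = Xtr @ theta, Xte @ theta        # noiseless linear teacher
    w = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(d), Xtr.T @ ytr)
    return np.mean((Xte @ w - yte) ** 2)

print("generative model :", ridge_test_error(generative_inputs))
print("matched Gaussian :", ridge_test_error(gaussian_inputs))
```

In regimes where the equivalence holds (the paper establishes rigorous conditions for single-layer generators such as this one), the two printed test errors agree up to finite-size fluctuations, so the Gaussian surrogate can stand in for the generative model when computing training and test errors.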


Related research

06/25/2020: The Gaussian equivalence of generative models for learning with two-layer neural networks
  Understanding the impact of data structure on learning in neural network...

05/01/2021: One-pass Stochastic Gradient Descent in Overparametrized Two-layer Neural Networks
  There has been a recent surge of interest in understanding the convergen...

11/21/2022: Neural networks trained with SGD learn distributions of increasing complexity
  The ability of deep neural networks to generalise well even when they in...

06/04/2021: Learning Curves for SGD on Structured Features
  The generalization performance of a machine learning algorithm such as a...

08/01/2023: An Exact Kernel Equivalence for Finite Classification Models
  We explore the equivalence between neural networks and kernel methods by...

06/10/2020: Deterministic Gaussian Averaged Neural Networks
  We present a deterministic method to compute the Gaussian average of neu...

03/10/2020: Frequency Bias in Neural Networks for Input of Non-Uniform Density
  Recent works have partly attributed the generalization ability of over-p...
