Learning Curves for SGD on Structured Features

06/04/2021
by   Blake Bordelon, et al.
0

The generalization performance of a machine learning algorithm such as a neural network depends in a non-trivial way on the structure of the data distribution. Models of generalization in machine learning theory often ignore the low-dimensional structure of natural signals, either by considering data-agnostic bounds or by studying the performance of the algorithm when trained on uncorrelated features. To analyze the influence of data structure on test loss dynamics, we study an exactly solveable model of stochastic gradient descent (SGD) which predicts test loss when training on features with arbitrary covariance structure. We solve the theory exactly for both Gaussian features and arbitrary features and we show that the simpler Gaussian model accurately predicts test loss of nonlinear random-feature models and deep neural networks trained with SGD on real datasets such as MNIST and CIFAR-10. We show that modeling the geometry of the data in the induced feature space is indeed crucial to accurately predict the test error throughout learning.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
06/28/2022

Studying Generalization Through Data Averaging

The generalization of machine learning models has a complex dependence o...
research
06/07/2022

Generalization Error Bounds for Deep Neural Networks Trained by SGD

Generalization error bounds for deep neural networks trained by stochast...
research
03/01/2023

Learning curves for deep structured Gaussian feature models

In recent years, significant attention in deep learning theory has been ...
research
06/07/2023

Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning

In this paper, we first present an explanation regarding the common occu...
research
06/01/2021

The Gaussian equivalence of generative models for learning with shallow neural networks

Understanding the impact of data structure on the computational tractabi...
research
06/25/2020

The Gaussian equivalence of generative models for learning with two-layer neural networks

Understanding the impact of data structure on learning in neural network...
research
06/20/2019

Data Cleansing for Models Trained with SGD

Data cleansing is a typical approach used to improve the accuracy of mac...

Please sign up or login with your details

Forgot password? Click here to reset