Learning Trajectories are Generalization Indicators

04/25/2023
by Jingwen Fu, et al.
This paper investigates the connection between the learning trajectories of Deep Neural Networks (DNNs) and their generalization capabilities when they are optimized with the widely used gradient descent and stochastic gradient descent algorithms. We construct a Linear Approximation Function to model the trajectory information and, based on it, propose a new generalization bound that incorporates richer trajectory information. The proposed bound depends on the complexity of the learning trajectory and on the ratio between the bias and the diversity of the training set. Experimental results indicate that the proposed method effectively captures the generalization trend across various training steps, learning rates, and label noise levels.

research
05/30/2019

Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks

We study the training and generalization of deep neural networks (DNNs) ...
research
02/05/2021

Learning While Dissipating Information: Understanding the Generalization Capability of SGLD

Understanding the generalization capability of learning algorithms is at...
research
06/09/2022

Trajectory-dependent Generalization Bounds for Deep Neural Networks via Fractional Brownian Motion

Despite being tremendously overparameterized, it is appreciated that dee...
research
10/01/2022

Behind the Scenes of Gradient Descent: A Trajectory Analysis via Basis Function Decomposition

This work analyzes the solution trajectory of gradient-based algorithms ...
research
09/25/2022

Stochastic Gradient Descent Captures How Children Learn About Physics

As children grow older, they develop an intuitive understanding of the p...
research
02/06/2022

Anticorrelated Noise Injection for Improved Generalization

Injecting artificial noise into gradient descent (GD) is commonly employ...
research
05/26/2022

Learning to Reason with Neural Networks: Generalization, Unseen Data and Boolean Measures

This paper considers the Pointer Value Retrieval (PVR) benchmark introdu...
