Investigating the locality of neural network training dynamics

11/01/2021
by Soham Dan, et al.

A fundamental quest in the theory of deep learning is to understand the properties of the trajectories in weight space that a learning algorithm takes. One such property that has recently been isolated is "local elasticity" (S_rel), which quantifies how the influence of a sampled data point propagates to the prediction at another data point. In this work, we perform a comprehensive study of local elasticity, providing new theoretical insights and more careful empirical evidence of this property in a variety of settings. First, specific to the classification setting, we propose a new definition of the original idea of S_rel. Via experiments on state-of-the-art neural networks trained on SVHN, CIFAR-10, and CIFAR-100, we demonstrate how our new S_rel detects the tendency of weight updates to preferentially change predictions within the same class as the sampled data point. Next, we demonstrate, via examples of neural networks doing regression, that the original S_rel reveals a two-phase behaviour: training proceeds via an initial elastic phase, during which S_rel changes rapidly, followed by an eventual inelastic phase, during which S_rel remains large. Lastly, we give multiple examples of learning via gradient flows for which one can obtain a closed-form expression for the original S_rel function. By studying the plots of these derived formulas, we give a theoretical demonstration of some of the experimentally detected properties of S_rel in the regression setting.
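To make the quantity concrete, below is a minimal NumPy sketch of one natural reading of relative local elasticity: after a single SGD step on a sampled point (x, y), S_rel(x, x') is taken as the ratio of the induced prediction change at another point x' to the change at x itself. The two-layer tanh network, squared loss, learning rate, and the helper names (predict, sgd_step, local_elasticity) are illustrative assumptions, not the paper's exact definition or experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny two-layer regression net: f(x) = a^T tanh(W x)
d, m = 5, 32
W = rng.normal(0, 1 / np.sqrt(d), size=(m, d))
a = rng.normal(0, 1 / np.sqrt(m), size=m)

def predict(W, a, x):
    return a @ np.tanh(W @ x)

def sgd_step(W, a, x, y, lr=0.1):
    """One SGD step on the squared loss (f(x) - y)^2 / 2."""
    h = np.tanh(W @ x)
    err = predict(W, a, x) - y
    grad_a = err * h
    grad_W = err * np.outer(a * (1 - h**2), x)
    return W - lr * grad_W, a - lr * grad_a

def local_elasticity(W, a, x, y, x_prime, lr=0.1):
    """Ratio of the prediction change at x' to the change at x,
    induced by a single SGD update on the sampled point (x, y)."""
    W2, a2 = sgd_step(W, a, x, y, lr)
    delta_xp = abs(predict(W2, a2, x_prime) - predict(W, a, x_prime))
    delta_x = abs(predict(W2, a2, x) - predict(W, a, x))
    return delta_xp / (delta_x + 1e-12)

x = rng.normal(size=d)
y = 1.0
x_near = x + 0.1 * rng.normal(size=d)   # a point close to x
x_far = rng.normal(size=d)              # an unrelated point

print("S_rel(x, x_near):", local_elasticity(W, a, x, y, x_near))
print("S_rel(x, x_far): ", local_elasticity(W, a, x, y, x_far))
```

On such a toy setup, one would expect the ratio to be larger for x_near than for x_far when the network's updates act locally, which is the intuition the paper's measurements probe at scale.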
