Magnitude and Angle Dynamics in Training Single ReLU Neurons

09/27/2022
by Sangmin Lee, et al.

To understand the learning dynamics of deep ReLU networks, we investigate the dynamical system of gradient flow w(t) by decomposing it into magnitude ‖w(t)‖ and angle ϕ(t) := π − θ(t) components. In particular, for multi-layer single ReLU neurons with a spherically symmetric data distribution and the square loss function, we provide upper and lower bounds on the magnitude and angle components that describe the dynamics of gradient flow. Using the obtained bounds, we conclude that small-scale initialization induces slow convergence for deep single ReLU neurons. Finally, by exploiting the relation between gradient flow and gradient descent, we extend our results to gradient descent. All theoretical results are verified by experiments.
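As an illustrative sketch (not the paper's actual experimental setup), the magnitude/angle decomposition can be observed numerically by training a single ReLU neuron with gradient descent on Gaussian (spherically symmetric) data generated by an assumed teacher neuron, tracking ‖w(t)‖ and the angle between w and the teacher direction. All hyperparameters and the teacher direction below are illustrative assumptions.

```python
import numpy as np

# Sketch: gradient descent on a single ReLU neuron, square loss,
# spherically symmetric (Gaussian) inputs. We track the magnitude
# ||w(t)|| and the angle theta(t) between w and a teacher direction v.
rng = np.random.default_rng(0)
d, n = 10, 2000
X = rng.standard_normal((n, d))        # spherically symmetric inputs
v = np.zeros(d); v[0] = 1.0            # assumed teacher direction
y = np.maximum(X @ v, 0.0)             # labels from a teacher ReLU neuron

w = 0.1 * rng.standard_normal(d)       # small-scale initialization
lr, steps = 0.05, 500
history = []                           # (loss, magnitude, angle) per step
for t in range(steps):
    z = X @ w
    pred = np.maximum(z, 0.0)
    loss = 0.5 * np.mean((pred - y) ** 2)
    # Gradient of the square loss; (z > 0) is the ReLU's active-set mask.
    grad = X.T @ ((pred - y) * (z > 0)) / n
    w -= lr * grad
    mag = np.linalg.norm(w)
    cos = np.clip(w @ v / (mag * np.linalg.norm(v)), -1.0, 1.0)
    history.append((loss, mag, np.arccos(cos)))

print(f"magnitude: {history[0][1]:.3f} -> {history[-1][1]:.3f}")
print(f"angle:     {history[0][2]:.3f} -> {history[-1][2]:.3f}")
```

Rerunning with an even smaller initialization scale (e.g. `0.01` instead of `0.1`) is a quick way to see the slower convergence that the bounds predict for small-scale initialization.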


Related research

02/11/2022  Support Vectors and Gradient Dynamics for Implicit Bias in ReLU Networks
  Understanding implicit bias of gradient descent has been an important go...

02/20/2023  Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
  We revisit the problem of learning a single neuron with ReLU activation ...

07/24/2023  Early Neuron Alignment in Two-layer ReLU Networks with Small Initialization
  This paper studies the problem of training a two-layer ReLU network for ...

11/16/2022  On the symmetries in the dynamics of wide two-layer neural networks
  We consider the idealized setting of gradient flow on the population ris...

06/18/2019  Gradient Dynamics of Shallow Univariate ReLU Networks
  We present a theoretical and empirical study of the gradient dynamics of...

06/02/2021  Learning a Single Neuron with Bias Using Gradient Descent
  We theoretically study the fundamental problem of learning a single neur...

08/03/2022  Gradient descent provably escapes saddle points in the training of shallow ReLU networks
  Dynamical systems theory has recently been applied in optimization to pr...
