When does gradient descent with logistic loss interpolate using deep networks with smoothed ReLU activations?

02/09/2021
by Niladri S. Chatterji, et al.

We establish conditions under which gradient descent applied to fixed-width deep networks drives the logistic loss to zero, and prove bounds on the rate of convergence. Our analysis applies to smoothed approximations of the ReLU, such as the Swish and the Huberized ReLU, which have been proposed in prior applied work. We provide two sufficient conditions for convergence. The first is simply a bound on the loss at initialization. The second is a data-separation condition used in prior analyses.
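As a rough illustration (not the paper's construction or parameter regime), the sketch below trains a fixed-width deep network with a smoothed-ReLU activation on the logistic loss using plain full-batch gradient descent, and prints the loss as it decreases. The Huberized ReLU shown here (quadratic near zero, linear beyond a threshold h) is one common smoothing and may not match the paper's exact parameterization; the toy data, layer widths, step size, and helper names (huberized_relu, forward, logistic_loss) are all illustrative assumptions.

```python
import torch

def huberized_relu(x, h=1.0):
    """One common smoothed ReLU: 0 for x <= 0, x^2/(2h) on [0, h], x - h/2 beyond h."""
    return torch.where(
        x <= 0, torch.zeros_like(x),
        torch.where(x <= h, x**2 / (2 * h), x - h / 2),
    )

def swish(x):
    """Swish / SiLU activation: x * sigmoid(x)."""
    return x * torch.sigmoid(x)

def forward(params, x, act):
    """Fixed-width fully connected network with scalar output."""
    for W in params[:-1]:
        x = act(x @ W.T)
    return (x @ params[-1].T).squeeze(-1)

def logistic_loss(scores, y):
    """log(1 + exp(-y * f(x))), averaged over the sample; labels y are +/-1."""
    return torch.nn.functional.softplus(-y * scores).mean()

torch.manual_seed(0)
n, d, width, depth = 50, 10, 64, 3          # illustrative sizes, not the paper's
X = torch.randn(n, d)
y = torch.sign(X[:, 0] + 0.1)               # a separable toy labeling rule

sizes = [d] + [width] * depth + [1]
params = []
for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
    W = torch.randn(fan_out, fan_in) / fan_in**0.5
    W.requires_grad_(True)
    params.append(W)

lr = 0.1
for step in range(2000):
    # swap swish for huberized_relu to use the other smoothed activation
    loss = logistic_loss(forward(params, X, swish), y)
    grads = torch.autograd.grad(loss, params)
    with torch.no_grad():                   # full-batch gradient descent update
        for W, g in zip(params, grads):
            W -= lr * g
    if step % 500 == 0:
        print(f"step {step:4d}  logistic loss {loss.item():.4f}")
```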


Related research

12/04/2020 · When does gradient descent with logistic loss find interpolating two-layer networks?
We study the training of finite-width two-layer smoothed ReLU networks f...

01/24/2021 · On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths
This paper studies the global convergence of gradient descent for deep R...

09/29/2022 · Restricted Strong Convexity of Deep Learning Models with Smooth Activations
We consider the problem of optimization of deep learning models with smo...

08/04/2022 · Agnostic Learning of General ReLU Activation Using Gradient Descent
We provide a convergence analysis of gradient descent for the problem of...

05/27/2020 · On the Convergence of Gradient Descent Training for Two-layer ReLU-networks in the Mean Field Regime
We describe a necessary and sufficient condition for the convergence to ...

06/14/2020 · Global Convergence of Sobolev Training for Overparametrized Neural Networks
Sobolev loss is used when training a network to approximate the values a...

08/03/2022 · Gradient descent provably escapes saddle points in the training of shallow ReLU networks
Dynamical systems theory has recently been applied in optimization to pr...
