Global Convergence of Gradient Descent for Deep Linear Residual Networks

11/02/2019
by Lei Wu, et al.

We analyze the global convergence of gradient descent for deep linear residual networks by proposing a new initialization: zero-asymmetric (ZAS) initialization. It is motivated by avoiding stable manifolds of saddle points. We prove that under the ZAS initialization, for an arbitrary target matrix, gradient descent converges to an ε-optimal point in O(L^3 log(1/ε)) iterations, which scales polynomially with the network depth L. Our result and the exp(Ω(L)) convergence time for the standard initialization (Xavier or near-identity) [Shamir, 2018] together demonstrate the importance of the residual structure and the initialization in the optimization for deep linear neural networks, especially when L is large.
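As a rough illustration of the setting (not the paper's exact construction), the sketch below trains a depth-L linear residual network f(x) = (I + W_L)⋯(I + W_1)x by plain gradient descent to match an arbitrary target matrix Φ, with every residual block initialized to zero so training starts at the identity map. The dimensions, step size, and iteration count are placeholders chosen here for illustration, and only the "zero" part of the ZAS scheme is modeled; the output-layer ("asymmetric") component described in the paper is omitted.

import numpy as np

# Illustrative sketch only: a depth-L linear residual network
#   f(x) = (I + W_L) ... (I + W_1) x
# trained by gradient descent on 0.5 * ||P - Phi||_F^2, where P is the
# end-to-end matrix and Phi is an arbitrary target. All residual blocks
# start at zero, so training starts at the identity map. The sizes and
# step size below are placeholders, not values from the paper.

def loss_and_grads(Ws, Phi):
    d, L = Phi.shape[0], len(Ws)
    pre = [np.eye(d)]                      # pre[i] = (I + W_i) ... (I + W_1)
    for W in Ws:
        pre.append((np.eye(d) + W) @ pre[-1])
    suf = [np.eye(d)] * (L + 1)            # suf[i] = (I + W_L) ... (I + W_{i+1})
    for i in range(L - 1, -1, -1):
        suf[i] = suf[i + 1] @ (np.eye(d) + Ws[i])
    R = pre[L] - Phi                       # residual of the end-to-end map
    grads = [suf[i + 1].T @ R @ pre[i].T for i in range(L)]
    return 0.5 * np.sum(R ** 2), grads

d, L, lr, steps = 5, 10, 2e-3, 5000
rng = np.random.default_rng(0)
Phi = rng.standard_normal((d, d))          # arbitrary target matrix
Ws = [np.zeros((d, d)) for _ in range(L)]  # zero-initialized residual blocks

for t in range(steps):
    loss, grads = loss_and_grads(Ws, Phi)
    Ws = [W - lr * g for W, g in zip(Ws, grads)]
    if t % 1000 == 0:
        print(f"step {t:5d}  loss {loss:.3e}")
print(f"final loss: {loss:.3e}")

The snippet only shows that the zero-initialized residual parameterization gives a well-behaved starting point for gradient descent on an arbitrary target; the O(L^3 log(1/ε)) iteration bound quoted above comes from the paper's analysis, not from this toy run.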

