A high-resolution dynamical view on momentum methods for over-parameterized neural networks

08/08/2022
by Xin Liu, et al.

In this paper, we present a convergence analysis of momentum methods for training a two-layer over-parameterized ReLU neural network, where the number of parameters is significantly larger than the number of training instances. Existing work on momentum methods shows that the heavy-ball method (HB) and Nesterov's accelerated gradient method (NAG) share the same limiting ordinary differential equation (ODE), which leads to identical convergence rates. From a high-resolution dynamical view, we show that HB and NAG differ in their convergence rates. In addition, our findings provide tighter upper bounds on convergence for the high-resolution ODEs of HB and NAG.
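To make the distinction between HB and NAG concrete, the sketch below writes out the two update rules in their common textbook form and runs them on a toy quadratic. It is an illustration only and is not taken from the paper: the function names, the step size `lr`, the momentum coefficient `beta`, and the quadratic objective are all assumptions, and the objective stands in for the two-layer ReLU network the paper actually studies. The only difference between the two methods here is where the gradient is evaluated.

```python
import numpy as np

def heavy_ball(grad, x0, lr, beta, steps):
    """Heavy-ball (HB): gradient taken at the current iterate x_k."""
    x_prev = x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x_next = x - lr * grad(x) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x

def nesterov(grad, x0, lr, beta, steps):
    """NAG: gradient taken at the extrapolated point y_k."""
    x_prev = x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        y = x + beta * (x - x_prev)   # look-ahead point
        x_next = y - lr * grad(y)
        x_prev, x = x, x_next
    return x

# Hypothetical toy problem: f(x) = 0.5 * x^T A x, so grad f(x) = A x.
A = np.diag([1.0, 10.0])
grad = lambda x: A @ x

x_hb = heavy_ball(grad, [1.0, 1.0], lr=0.05, beta=0.9, steps=300)
x_nag = nesterov(grad, [1.0, 1.0], lr=0.05, beta=0.9, steps=300)
print("HB  distance to optimum:", np.linalg.norm(x_hb))
print("NAG distance to optimum:", np.linalg.norm(x_nag))
```

Expanding the NAG step gives the HB step plus an extra term, `-lr * beta * (grad(y) - grad(x)) / beta` in effect a gradient-correction term proportional to the step size. That term vanishes in the low-resolution ODE limit, which is why the two methods share the same limiting ODE, and retaining it is what a high-resolution view can capture.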


