A high-resolution dynamical view on momentum methods for over-parameterized neural networks

08/08/2022

In this paper, we present a convergence analysis of momentum methods for training a two-layer over-parameterized ReLU neural network, where the number of parameters is significantly larger than the number of training instances. Existing works on momentum methods show that the heavy-ball method (HB) and Nesterov's accelerated method (NAG) share the same limiting ordinary differential equation (ODE), which leads to identical convergence rates. From a high-resolution dynamical view, we show that HB differs from NAG in terms of the convergence rate. In addition, our findings provide tighter upper bounds on convergence for the high-resolution ODEs of HB and NAG.
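
To make the distinction between the two methods concrete, the following is a minimal sketch of the textbook discrete updates of HB and NAG. It is not the paper's setting: it assumes a toy diagonal quadratic objective in place of the two-layer ReLU network, and the names `grad`, `heavy_ball_step`, `nesterov_step`, `run`, and `curvatures`, as well as the step size and momentum values in the usage note, are illustrative choices. The only difference between the two updates is where the gradient is evaluated: HB uses the current iterate, NAG uses the look-ahead point.

```python
# Toy objective (an assumption for illustration, not the paper's model):
#   f(x) = 0.5 * sum_i curvatures[i] * x[i]**2


def grad(x, curvatures):
    """Gradient of the toy quadratic at x."""
    return [c * xi for c, xi in zip(curvatures, x)]


def heavy_ball_step(x, x_prev, lr, beta, curvatures):
    """HB: x_next = x + beta * (x - x_prev) - lr * grad(x)."""
    g = grad(x, curvatures)
    return [xi + beta * (xi - xpi) - lr * gi
            for xi, xpi, gi in zip(x, x_prev, g)]


def nesterov_step(x, x_prev, lr, beta, curvatures):
    """NAG: y = x + beta * (x - x_prev); x_next = y - lr * grad(y)."""
    y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
    g = grad(y, curvatures)
    return [yi - lr * gi for yi, gi in zip(y, g)]


def run(step, x0, lr, beta, curvatures, num_steps):
    """Iterate `step` from x0, starting with zero momentum (x_prev = x0)."""
    x_prev, x = list(x0), list(x0)
    for _ in range(num_steps):
        # The right-hand side is evaluated first, so x_prev receives the old x.
        x, x_prev = step(x, x_prev, lr, beta, curvatures), x
    return x
```

For example, `run(heavy_ball_step, [1.0, 1.0], 0.1, 0.9, [1.0, 10.0], 2)` and the same call with `nesterov_step` agree after the first step, when the momentum term is zero, and differ from the second step onward. As the step size tends to zero the two updates coincide, which is consistent with the shared limiting ODE mentioned above.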
