On the convergence proof of AMSGrad and a new version

04/07/2019
by Tran Thi Phuong et al.

The adaptive moment estimation algorithm Adam (Kingma and Ba, ICLR 2015) is a popular optimizer in the training of deep neural networks. However, Reddi et al. (ICLR 2018) have recently shown that the convergence proof of Adam is problematic and proposed a variant of Adam called AMSGrad as a fix. In this paper, we show that the convergence proof of AMSGrad is also problematic, and we present various fixes for it, which include a new version of AMSGrad.
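
For reference, one step of the original AMSGrad of Reddi et al. (ICLR 2018) can be sketched as follows. This is a minimal NumPy sketch of the standard algorithm (Adam whose second-moment estimate is replaced by its element-wise running maximum), written with the usual constant hyperparameters and without bias correction; the function and argument names are illustrative, and this is not the new version proposed in the paper.

    import numpy as np

    def amsgrad_step(x, g, m, v, v_hat, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
        """One AMSGrad update: Adam, except the denominator uses the running
        maximum v_hat of the second-moment estimate, so the effective
        per-coordinate step size never increases."""
        m = beta1 * m + (1.0 - beta1) * g          # moving average of gradients
        v = beta2 * v + (1.0 - beta2) * g * g      # moving average of squared gradients
        v_hat = np.maximum(v_hat, v)               # the AMSGrad fix: element-wise running maximum
        x = x - lr * m / (np.sqrt(v_hat) + eps)    # parameter update
        return x, m, v, v_hat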

Related research

04/27/2018 - An improvement of the convergence proof of the ADAM-Optimizer
A common way to train neural networks is the Backpropagation. This algor...

05/19/2018 - Nostalgic Adam: Weighing more of the past gradients when designing the adaptive learning rate
First-order optimization methods have been playing a prominent role in d...

04/27/2023 - Convergence of Adam Under Relaxed Assumptions
In this paper, we provide a rigorous proof of convergence of the Adaptiv...

03/04/2019 - Optimistic Adaptive Acceleration for Optimization
We consider a new variant of AMSGrad. AMSGrad RKK18 is a popular adaptiv...

04/12/2017 - A Proof of Orthogonal Double Machine Learning with Z-Estimators
We consider two stage estimation with a non-parametric first stage and a...

03/09/2020 - Communication-Efficient Distributed SGD with Error-Feedback, Revisited
We show that the convergence proof of a recent algorithm called dist-EF-...

11/16/2021 - On Bock's Conjecture Regarding the Adam Optimizer
In 2014, Kingma and Ba published their Adam optimizer algorithm, togethe...
