Divergence of the ADAM algorithm with fixed-stepsize: a (very) simple example

08/01/2023
by Ph. L. Toint, et al.

A very simple unidimensional function with Lipschitz continuous gradient is constructed such that the ADAM algorithm with constant stepsize, started from the origin, diverges when applied to minimize this function in the absence of noise on the gradient. Divergence occurs irrespective of the choice of the algorithm's parameters.
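For reference, the iteration in question is the standard ADAM update with a fixed stepsize alpha and exact (noise-free) gradients. The sketch below shows that update in one dimension, applied to an ordinary quadratic rather than the specially constructed function from the paper, which is not reproduced in this abstract; the objective, the parameter values, and the use of the bias-corrected variant are illustrative assumptions, not the paper's setup.

    import math

    def adam_fixed_stepsize(grad, x0, alpha=0.01, beta1=0.9, beta2=0.999,
                            eps=1e-8, n_iters=2000):
        # ADAM with a constant stepsize alpha, one-dimensional case.
        x, m, v = x0, 0.0, 0.0
        for k in range(1, n_iters + 1):
            g = grad(x)                          # exact gradient (no noise)
            m = beta1 * m + (1 - beta1) * g      # first-moment estimate
            v = beta2 * v + (1 - beta2) * g * g  # second-moment estimate
            m_hat = m / (1 - beta1 ** k)         # bias corrections
            v_hat = v / (1 - beta2 ** k)
            x = x - alpha * m_hat / (math.sqrt(v_hat) + eps)
        return x

    # Illustrative objective only (not the paper's construction):
    # f(x) = (x - 3)^2 / 2, whose gradient x - 3 is Lipschitz continuous.
    x_final = adam_fixed_stepsize(lambda x: x - 3.0, x0=0.0)
    print(x_final)  # approaches the minimizer x = 3 on this benign example

On a benign quadratic like this the iteration behaves well; the point of the paper is that one can construct a comparably simple function, also with Lipschitz continuous gradient, on which the same fixed-stepsize iteration started from the origin diverges.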
