Convergence of a Normal Map-based Prox-SGD Method under the KL Inequality
In this paper, we present a novel stochastic normal map-based algorithm (norM-SGD) for nonconvex composite-type optimization problems and discuss its convergence properties. Using a time window-based strategy, we first analyze the global convergence behavior of norM-SGD and show that every accumulation point of the generated sequence of iterates {x^k}_k corresponds to a stationary point almost surely and in an expectation sense. The obtained results hold under standard assumptions and extend the more limited convergence guarantees of the basic proximal stochastic gradient method. In addition, based on the well-known Kurdyka-Łojasiewicz (KL) analysis framework, we provide novel point-wise convergence results for the iterates {x^k}_k and derive convergence rates that depend on the underlying KL exponent θ and the step size dynamics {α_k}_k. Specifically, for the popular step size scheme α_k = 𝒪(1/k^γ), γ ∈ (2/3,1], (almost sure) rates of the form ‖x^k − x^*‖ = 𝒪(1/k^p), p ∈ (0,1/2), can be established. The obtained rates are faster than related and existing convergence rates for SGD and improve on the non-asymptotic complexity bounds for norM-SGD.
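For readers unfamiliar with the construction, the following is a minimal sketch of a normal map-based stochastic update; the proximal parameter λ > 0, the stochastic gradient estimator g^k ≈ ∇f(x^k), and the displayed iteration are illustrative assumptions for a generic scheme of this type, not the paper's exact update rule. For the composite problem min_x f(x) + φ(x), with f smooth and φ convex, the normal map (in the sense of Robinson) is

    F^nor_λ(z) := ∇f(prox_{λφ}(z)) + (1/λ)(z − prox_{λφ}(z)),

and a generic stochastic normal map-based iteration reads

    x^k = prox_{λφ}(z^k),    z^{k+1} = z^k − α_k ( g^k + (1/λ)(z^k − x^k) ).

A point z^* with F^nor_λ(z^*) = 0 yields a stationary point x^* = prox_{λφ}(z^*) of the composite problem, which is the sense in which the accumulation points above are stationary.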