Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization
We present a unified framework to analyze the global convergence of Langevin dynamics based algorithms for nonconvex finite-sum optimization with n component functions. At the core of our analysis is a new decomposition scheme of the optimization error, under which we directly analyze the ergodicity of the numerical approximations of Langevin dynamics and prove sharp convergence rates. We establish the first global convergence guarantee of gradient Langevin dynamics (GLD) with iteration complexity O(1/ϵ · log(1/ϵ)). In addition, we improve the convergence rate of stochastic gradient Langevin dynamics (SGLD) to the "almost minimizer", which does not depend on the undesirable uniform spectral gap introduced in previous studies. Furthermore, we prove for the first time the global convergence guarantee of variance-reduced stochastic gradient Langevin dynamics (VR-SGLD), with iteration complexity O(m/(Bϵ^3) · log(1/ϵ)), where B is the mini-batch size and m is the length of the inner loop. We show that the gradient complexity of VR-SGLD is O(n^{1/2}/ϵ^{3/2} · log(1/ϵ)), which outperforms the O(n/ϵ · log(1/ϵ)) gradient complexity of GLD when the number of component functions satisfies n ≥ 1/ϵ. Our theoretical analysis sheds light on using Langevin dynamics based algorithms for nonconvex optimization with provable guarantees.
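To make the algorithms being analyzed concrete, the following is a minimal sketch of the GLD update and its stochastic-gradient variant (SGLD). The names here are assumptions for illustration, not the paper's code: `grad_f` returns the full gradient of the finite-sum objective, `grad_fi` the gradient of a single component function, and `eta`, `beta` denote a hypothetical step size and inverse temperature.

```python
import numpy as np

def gld_step(x, grad_f, eta, beta, rng):
    """One GLD iteration: a gradient step plus isotropic Gaussian noise
    scaled by sqrt(2*eta/beta)."""
    noise = rng.standard_normal(x.shape)
    return x - eta * grad_f(x) + np.sqrt(2.0 * eta / beta) * noise

def sgld_step(x, grad_fi, n, batch_size, eta, beta, rng):
    """One SGLD iteration: replace the full gradient with a mini-batch
    estimate over B = batch_size of the n component functions."""
    idx = rng.choice(n, size=batch_size, replace=False)
    stoch_grad = np.mean([grad_fi(x, i) for i in idx], axis=0)
    noise = rng.standard_normal(x.shape)
    return x - eta * stoch_grad + np.sqrt(2.0 * eta / beta) * noise
```

VR-SGLD follows the same noisy-update template but, as in the abstract, runs an inner loop of length m in which the mini-batch gradient is corrected by a periodically refreshed full-gradient snapshot to reduce its variance.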