A Hitting Time Analysis of Stochastic Gradient Langevin Dynamics

02/18/2017
by Yuchen Zhang, et al.

We study the Stochastic Gradient Langevin Dynamics (SGLD) algorithm for non-convex optimization. The algorithm performs stochastic gradient descent, injecting appropriately scaled Gaussian noise into the update at each step. We analyze the algorithm's hitting time to an arbitrary subset of the parameter space. Two results follow from our general theory. First, we prove that for empirical risk minimization, if the empirical risk is pointwise close to the (smooth) population risk, then the algorithm finds an approximate local minimum of the population risk in polynomial time, escaping suboptimal local minima that exist only in the empirical risk. Second, we show that SGLD improves on one of the best known learnability results for learning linear classifiers under the zero-one loss.
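To make the update rule and the notion of hitting time concrete, here is a minimal one-dimensional sketch. It is an illustration under stated assumptions, not the paper's implementation: it uses the common SGLD form x ← x − η·g(x) + sqrt(2η/ξ)·w with w ~ N(0, 1), where the step size η (`step_size`), the inverse temperature ξ (`inv_temp`), the toy objective f(x) = x²/2, the noisy gradient oracle, and the target set |x| < 0.5 are all hypothetical choices made for this example.

```python
import math
import random


def sgld_hitting_time(grad, x0, in_target, step_size, inv_temp, max_steps, rng):
    """Run 1-D SGLD from x0 and report when the iterate first enters a target set.

    Returns (k, x_k) for the first step k with in_target(x_k) true,
    or (None, x_last) if the target is not reached within max_steps.
    """
    x = x0
    if in_target(x):
        return 0, x
    # Standard deviation of the injected Gaussian noise (assumed scaling).
    noise_scale = math.sqrt(2.0 * step_size / inv_temp)
    for k in range(1, max_steps + 1):
        # Stochastic gradient step plus scaled Gaussian noise.
        x = x - step_size * grad(x) + noise_scale * rng.gauss(0.0, 1.0)
        if in_target(x):
            return k, x
    return None, x


rng = random.Random(0)


def noisy_grad(x):
    # Hypothetical stochastic gradient of f(x) = x^2 / 2: true gradient plus noise.
    return x + rng.gauss(0.0, 0.1)


def near_minimum(x):
    # Hypothetical target set: a neighborhood of the minimizer x = 0.
    return abs(x) < 0.5


k, x_hit = sgld_hitting_time(
    grad=noisy_grad,
    x0=3.0,
    in_target=near_minimum,
    step_size=0.05,
    inv_temp=100.0,
    max_steps=10_000,
    rng=rng,
)
print("hitting step:", k, "iterate:", x_hit)
```

The function stops at the first iterate inside the target set, which is what "hitting time" refers to; the abstract's results concern how this quantity scales for general subsets of the parameter space, which a single toy run like this does not demonstrate.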


