Statistical Inference with Stochastic Gradient Algorithms

by Jeffrey Negrea, et al.

Tuning of stochastic gradient algorithms (SGAs) for optimization and sampling is often based on heuristics and trial-and-error rather than generalizable theory. We address this theory–practice gap by characterizing the statistical asymptotics of SGAs via a joint step-size–sample-size scaling limit. We show that iterate averaging with a large fixed step size is robust to the choice of tuning parameters and asymptotically has covariance proportional to that of the MLE sampling distribution. We also prove a Bernstein–von Mises-like theorem to guide tuning, including for generalized posteriors that are robust to model misspecification. Numerical experiments validate our results in realistic finite-sample regimes.
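As a minimal illustration of the iterate-averaging idea in the abstract, the sketch below runs constant (large) step-size SGD on a synthetic linear-regression problem and maintains a Polyak–Ruppert running average of the iterates. The data, step size, and iteration counts are illustrative assumptions, not values from the paper; the point is only that the averaged iterate concentrates near the MLE even with a fixed step size.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic linear-regression data (not from the paper)
n, d = 2000, 3
theta_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(n, d))
y = X @ theta_true + rng.normal(size=n)

def sgd_iterate_average(step, n_iter=20000, batch=32):
    """Constant step-size SGD with Polyak-Ruppert iterate averaging."""
    theta = np.zeros(d)
    avg = np.zeros(d)
    for t in range(1, n_iter + 1):
        idx = rng.integers(0, n, size=batch)
        # Mini-batch gradient of the squared loss
        grad = X[idx].T @ (X[idx] @ theta - y[idx]) / batch
        theta -= step * grad
        avg += (theta - avg) / t  # running average of iterates
    return avg

est = sgd_iterate_average(step=0.1)
```

Individual iterates oscillate in a ball around the optimum whose size grows with the step size, but their average washes out that stationary noise, which is why a large fixed step size can remain usable for inference.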


