1 Introduction
Gradient-based Markov Chain Monte Carlo (MCMC) methods for computing Bayesian posteriors, which exploit geometric information, have become increasingly popular in recent years. Exact gradient evaluations, however, can be prohibitive even for moderately large data sets. Following the success of stochastic gradients in large-scale optimization, and inspired by the seminal work of Welling and Teh [1], many gradient-based MCMC algorithms have now been adapted to capitalize on fast but noisy gradient evaluations computed on minibatches of data [2]. Examples include stochastic gradient Langevin dynamics (SGLD) [1] and stochastic gradient Hamiltonian Monte Carlo [3], which have established themselves as popular choices for scalable MCMC sampling.

Complementary to data subsampling, which underlies the use of stochastic gradients, another strategy for scaling up MCMC is distributed computation, where the data are partitioned and distributed across a number of workers [13, 15, 18, 20]. This strategy is particularly relevant when the data set is too large to fit into the memory of a single machine. The need for distributed methods also arises in settings where the data set is originally collected in a distributed manner, and constraints due to, e.g., communication or privacy prevent it from being centralized in a single location. In this work, we focus on stochastic gradient MCMC in distributed settings and use distributed SGLD (DSGLD), introduced in [4], as our starting point.
While stochastic gradients reduce computational cost, and thus scale better, they intrinsically suffer from high variance, leading to poor mixing rates and slow convergence [7]. In distributed settings, minibatches are chosen within the individual data subsets held by workers, which intuitively amplifies the variance of stochastic gradients, especially if the distributed data sets are heterogeneous. However, no formal analysis of the stochastic gradient variance and its influence on convergence in distributed settings currently appears to be available, and we provide one in this paper. In serial settings, extensive effort has gone into analyzing the convergence of SGLD [6, 11, 5, 17, 19], which, due to its relatively straightforward formulation, is a good starting point for analyses and novel developments in stochastic gradient MCMC.
We further introduce a method that decreases variance and yields tighter convergence bounds. As its central mechanism, we define the notion of a conducive gradient, a zero-mean stochastic gradient constructed using a surrogate likelihood. The role of the conducive gradient is to provide a global estimate of the target posterior which, when added to an unbiased SGLD estimator, decreases variance and tightens convergence bounds.
Previous works [7, 5] have proposed strategies to alleviate the effect of high variance in SGLD in serial settings. The authors of [7] proposed two algorithms, SAGA-LD and SVRG-LD, both of which use previously evaluated gradients to approximate gradients for data points not visited in a given iteration. The first, SAGA-LD, requires maintaining a record of individual gradients for each data point. In the second, SVRG-LD, the gradient over the entire data set must be periodically evaluated. The recently proposed SGLD-CV [5] uses posterior mode estimates to build control variates, which are added to the gradient estimates to speed up convergence. For distributed settings, the algorithms proposed in [7] may be inadequate: SAGA-LD because of the high memory cost of storing gradients, and SVRG-LD because of the high computational cost of updating the full gradients. Like SGLD-CV, our algorithm can be seen as a control variate method [14]. However, applying SGLD-CV in distributed scenarios may incur significant communication overhead, as finding a mode of the posterior would require employing distributed optimization algorithms.
After introducing some notation and providing a brief review of serial and distributed SGLD in Section 2, we derive a convergence bound for DSGLD and contrast it with the analogous bound for serial SGLD in Section 3. We then introduce the concept of conducive gradients in Section 4, and use it to construct a novel gradient estimator, which is shown to improve on the convergence bound for DSGLD. In Section 5, we show experimental results. Finally, we summarize our conclusions in Section 6.
2 Background and notation
Let $X = \{x_1, \ldots, x_N\}$ be a data set of size $N$ and let $p(\theta \mid X) \propto p(\theta) \prod_{i=1}^{N} p(x_i \mid \theta)$ be the density of a posterior distribution from which we wish to draw samples. Langevin dynamics [12] is a family of MCMC methods which utilizes the gradient of the log-posterior,
$$\nabla_\theta \log p(\theta \mid X) = \nabla_\theta \log p(\theta) + \sum_{i=1}^{N} \nabla_\theta \log p(x_i \mid \theta),$$
to generate proposals in a Metropolis-Hastings sampling scheme. For large data sets, computing the gradient of the log-likelihood with respect to the entire data set becomes expensive. To mitigate this problem, stochastic gradient Langevin dynamics (SGLD) [1] uses stochastic gradients to approximate the full-data gradient.
Denoting by $B_t$ a minibatch of size $n$ drawn uniformly from the data, SGLD draws samples from the target distribution using a stochastic gradient update of the form

$$\theta_{t+1} = \theta_t + \frac{\epsilon_t}{2} \left( \nabla_\theta \log p(\theta_t) + \frac{N}{n} \sum_{i \in B_t} \nabla_\theta \log p(x_i \mid \theta_t) \right) + \eta_t, \qquad \eta_t \sim \mathcal{N}(0, \epsilon_t I), \quad (1)$$

with step size determined by $\epsilon_t$, and additional Gaussian noise $\eta_t$ injected into each step. The step size is annealed according to a schedule satisfying $\sum_{t=1}^{\infty} \epsilon_t = \infty$ and $\sum_{t=1}^{\infty} \epsilon_t^2 < \infty$. Note that as $\epsilon_t \to 0$, the Metropolis-Hastings acceptance rate goes asymptotically to one, and thus the accept-reject step is typically ignored in SGLD. While a proper annealing schedule yields an asymptotically exact algorithm, constant step sizes are often used in practice.
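As a concrete sketch, one SGLD step can be written in a few lines of NumPy. The callback names (`grad_log_prior`, `grad_log_lik`) and the conjugate-Gaussian example in the test are our illustrative assumptions, not the paper's experimental setup:

```python
import numpy as np

def sgld_step(theta, data, grad_log_prior, grad_log_lik, step_size, batch_size, rng):
    """One SGLD update (Welling & Teh, 2011): the minibatch gradient is scaled by
    N/n to estimate the full-data gradient, and Gaussian noise with variance equal
    to the step size is injected."""
    N = len(data)
    batch = data[rng.choice(N, size=batch_size, replace=False)]
    # Unbiased estimate of the gradient of the log-posterior.
    grad = grad_log_prior(theta) + (N / batch_size) * sum(grad_log_lik(x, theta) for x in batch)
    noise = rng.normal(0.0, np.sqrt(step_size), size=np.shape(theta))
    return theta + 0.5 * step_size * grad + noise
```

On a conjugate Gaussian model, averaging the post-burn-in samples should recover the analytical posterior mean, which is a quick sanity check for any implementation.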
2.1 Distributed SGLD
In this paper, our focus is on SGLD in distributed settings, where the data are partitioned into non-overlapping shards $X_1, \ldots, X_S$, such that $\bigcup_{s=1}^{S} X_s = X$. For simplicity, we will assume that each shard corresponds to a single worker. An adaptation of Equation (1) to distributed settings has been presented in [4]. The main idea is that in each iteration, a minibatch is sampled within a shard, say $X_{s_t}$, and the shard itself is sampled by a scheduler with probability $w_{s_t}$, with $\sum_{s=1}^{S} w_s = 1$ and $w_s > 0$ for all $s$. This results in the update

$$\theta_{t+1} = \theta_t + \frac{\epsilon_t}{2} \left( \nabla_\theta \log p(\theta_t) + \frac{N_{s_t}}{w_{s_t} n} \sum_{i \in B_t} \nabla_\theta \log p(x_i \mid \theta_t) \right) + \eta_t, \qquad \eta_t \sim \mathcal{N}(0, \epsilon_t I), \quad (2)$$
where $N_{s_t}$ denotes the size of the shard $X_{s_t}$ chosen at time $t$. Intuitively, if a minibatch of $n$ data points is chosen uniformly at random from $X_{s_t}$, then the factor $N_{s_t}/n$ scales the minibatch gradient to be an unbiased estimator of the gradient over shard $X_{s_t}$, while the factor $1/w_{s_t}$ further scales this gradient to be an unbiased estimator of the full-data gradient.

In order to reduce the overhead of workers communicating after every iteration, [4] further proposed a modified version where multiple update steps are taken within a shard before moving to another worker, at the cost of some loss in asymptotic accuracy. It is worth noting that, while the data are distributed across multiple workers, the above sampling procedure may still be understood as entirely serial. In practice, however, distributed settings are naturally amenable to running multiple chains in parallel.
3 Convergence of DSGLD
We begin by analyzing the convergence of DSGLD under the same framework used for the analysis of SGLD in [6] and subsequently adopted in [7], which directly ties convergence bounds to the variance of the gradient estimators. Besides certain regularity conditions adopted in these works, which we outline in the supplementary material, we make the following assumption:
Assumption 1
The gradient of the log-likelihood of each individual element within each shard is bounded, i.e., $\|\nabla_\theta \log p(x_i \mid \theta)\| \leq \sigma_s$ for all $x_i \in X_s$ and all $\theta$, for each shard $s$.
We then derive the following bound on the convergence, in mean squared error (MSE), of the Monte Carlo estimate of a smooth test function $\phi$ with respect to its true posterior expectation $\bar{\phi}$.
Theorem 3.1
Let the step size be fixed, $\epsilon_t = \epsilon$ for all $t$. Under standard regularity conditions and Assumption 1, the MSE of DSGLD for a smooth test function $\phi$ at time $T$ is bounded, for some constant $C$ independent of $T$ and $\epsilon$, in the following manner:
The bound in Theorem 3.1 (proved in the supplement) depends explicitly on the ratio between squared shard sizes and their selection probabilities. This matches the intuition that both shard sizes and their availability play a role in the convergence of DSGLD. Note also that the above bound for DSGLD generalizes previous results for SGLD [7]. More specifically, if we combine all shards into a single data set held by one worker, we recover the bound for SGLD:
4 Variance reduction using conducive gradients
In DSGLD, stochastic gradient updates are computed on minibatches sampled within local shards, which adds bias to the updates and increases variance globally. This is especially true if the shards are heterogeneous. To counteract this, we would like the local updates to make use of information across shards, without significantly increasing either computational cost or memory requirements.
Our strategy for achieving this goal is to augment the local updates with an auxiliary gradient computed on a tractable surrogate $\tilde{p}(\theta)$ for the full-data likelihood $p(X \mid \theta)$. We assume here that the surrogate factorizes over shards as $\tilde{p}(\theta) = \prod_{s=1}^{S} \tilde{p}_s(\theta)$, where each $\tilde{p}_s$ is itself a surrogate for $p(X_s \mid \theta)$, i.e., the likelihood with respect to the entire shard. Given a surrogate $\tilde{p}$, we define the conducive gradient with respect to shard $s$ as

$$g_s(\theta) = \nabla_\theta \log \tilde{p}(\theta) - \frac{1}{w_s} \nabla_\theta \log \tilde{p}_s(\theta).$$
The following result states that when the conducive gradient is added to the stochastic gradient in a DSGLD setting, the resulting estimator remains a valid estimator for the gradient of the full-data log-posterior.
Lemma 1
Assume $\log \tilde{p}_1, \ldots, \log \tilde{p}_S$ are Lipschitz continuous. Given a data set $X$ partitioned into shards $X_1, \ldots, X_S$, with respective sizes $N_1, \ldots, N_S$ and shard selection probabilities $w_1, \ldots, w_S$, the following gradient estimator,

$$G(\theta) = \nabla_\theta \log p(\theta) + \frac{N_{s_t}}{w_{s_t} n} \sum_{i \in B_t} \nabla_\theta \log p(x_i \mid \theta) + g_{s_t}(\theta),$$

is an unbiased estimator of $\nabla_\theta \log p(\theta \mid X)$ with finite variance.
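To make the zero-mean property of conducive gradients concrete, here is a sketch with hypothetical Gaussian surrogates for each shard; the Gaussian choice and all names are our illustration, not a prescribed part of the method:

```python
import numpy as np

# Hypothetical Gaussian surrogates: tilde_p_s(theta) = N(theta; mu_s, Sigma_s), so
# grad log tilde_p_s(theta) = -Sigma_s^{-1} (theta - mu_s). The full surrogate is
# their product, whose grad-log is the sum of the per-shard terms.
def conducive_gradient(theta, shard_idx, mus, precisions, shard_probs):
    """Conducive gradient for shard s: grad log tilde_p(theta) minus
    (1 / w_s) * grad log tilde_p_s(theta). Averaged over shard draws with
    probabilities w_s, the two terms cancel, so the estimator has zero mean."""
    grad_surrogate = sum(-P @ (theta - mu) for mu, P in zip(mus, precisions))
    grad_local = -precisions[shard_idx] @ (theta - mus[shard_idx])
    return grad_surrogate - grad_local / shard_probs[shard_idx]
```

Because the correction is zero-mean, adding it to the unbiased DSGLD gradient leaves the estimator unbiased while reshaping its variance.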
The conducive-gradient DSGLD update can now be written as

$$\theta_{t+1} = \theta_t + \frac{\epsilon_t}{2}\, G(\theta_t) + \eta_t, \qquad \eta_t \sim \mathcal{N}(0, \epsilon_t I). \quad (3)$$
Together with the setting and Assumption 1 previously adopted for the analysis of DSGLD, Lemma 2, which follows directly from the Lipschitz continuity of our surrogates, is sufficient for deriving a tighter bound for our method in Theorem 4.1.
Lemma 2
If are everywhere differentiable and Lipschitz continuous, then the average value of , taken over , is bounded by some , for each .
Theorem 4.1
Put simply, Theorem 4.1 (proved in the supplement) states that, through conducive gradients, tractable approximations to the local likelihood factors can be employed to improve distributed SGLD. Algorithm 1 describes our method, conducive gradient DSGLD (CG-DSGLD), for a single chain.
Remark 1 (Controlling exploration)
Note that conducive gradients can alternatively be written as

$$g_s(\theta) = \frac{1}{w_s} \nabla_\theta \log \frac{\tilde{p}(\theta)^{w_s}}{\tilde{p}_s(\theta)}, \quad (4)$$

making it explicit that these terms encourage the exploration of regions in which we believe, based on the approximations $\tilde{p}$ and $\tilde{p}_s$, the posterior density to be high but the density within shard $s$ to be low. We can explicitly control the extent of this exploration by multiplying the conducive gradient by a constant $\lambda \geq 0$ to obtain the modified gradient estimator:

$$G_\lambda(\theta) = \nabla_\theta \log p(\theta) + \frac{N_{s_t}}{w_{s_t} n} \sum_{i \in B_t} \nabla_\theta \log p(x_i \mid \theta) + \lambda\, g_{s_t}(\theta). \quad (5)$$
4.1 Choice of the surrogate
The key idea in choosing $\tilde{p}_s$ is to obtain an approximation of the shard likelihood with a parametric form that keeps the computation of the conducive gradient relatively inexpensive, i.e., such that it can be computed in a single gradient evaluation instead of iterating over all $N_s$ data points of shard $s$. Exponential family distributions are particularly convenient for this purpose, as they are closed under products, enabling $\nabla_\theta \log \tilde{p}(\theta)$ to be computed in a single gradient evaluation, which keeps the additional cost of our method negligible even when the number of shards is large.
In this work, we use a simulation-based approach to compute $\tilde{p}_s$: we first draw samples locally using SGLD, and then use the resulting samples to compute the parameters of an exponential family approximation. To avoid communication overhead, each $\tilde{p}_s$ can be computed independently in parallel for each of the data shards and then communicated to the coordinating server once, before the CG-DSGLD steps take place.
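A minimal sketch of this simulation-based construction, assuming a multivariate Gaussian fit to locally drawn samples; `fit_gaussian_surrogate` is a hypothetical helper name, not from the paper:

```python
import numpy as np

def fit_gaussian_surrogate(samples):
    """Fit a multivariate Gaussian to posterior samples drawn locally (e.g. via
    SGLD on one shard) and return grad log tilde_p_s, computable in O(1) per call
    instead of iterating over the N_s points of the shard."""
    mu = samples.mean(axis=0)
    precision = np.linalg.inv(np.cov(samples, rowvar=False))

    def grad_log_surrogate(theta):
        # gradient of log N(theta; mu, Sigma) = -Sigma^{-1} (theta - mu)
        return -precision @ (theta - mu)

    return mu, precision, grad_log_surrogate
```

Since the gradient of a Gaussian log-density is linear in the parameter, each conducive-gradient evaluation costs a single matrix-vector product, independent of the shard size.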
5 Experiments
In this section we demonstrate the performance of CG-DSGLD under different conditions. In Subsection 5.1, we provide a visual illustration of a pathology that afflicts DSGLD when the number of within-shard updates is increased. In Subsection 5.2, we consider inference in a Bayesian metric learning model, a non-conjugate model which can be used to learn metric matrices from similarity information in the data. Finally, in Subsection 5.3 we show how our method can be employed to learn Bayesian neural networks in a distributed fashion, and analyse how its behaviour changes as the amount of heterogeneity between shards increases, compared to DSGLD.
While these models are increasingly complex, we highlight that, using multivariate Gaussians as approximations to each of the within-shard likelihood functions, we were still able to obtain good results for CG-DSGLD. For the first set of experiments, we derive analytic forms for the approximations, which is possible due to the simplicity of the target model. For the remaining ones, we run SGLD independently on each shard and use the samples obtained to compute the mean vector and covariance matrix that parameterize each approximation.

All experiments were implemented using PyTorch (https://pytorch.org) and code will be publicly released upon publication of the manuscript.

5.1 Heterogeneous shards
An ideal sampling scheme would, in theory, update the chain state at a worker and immediately pass it to another worker for the next iteration. However, such a short communication cycle would result in a large overhead. To ameliorate the problem, [4] proposed making a number of chain updates before moving to another worker. However, the authors reported that as the number of iterations within each worker and shard increases, the algorithm tends to lose sample efficiency and effectively samples from a mixture of local posteriors instead of the true posterior. With heterogeneous data shards, the effect is particularly noticeable, as illustrated below.
5.1.1 Model:
In this experiment, we illustrate how information can be shared between shards through the approximations introduced in CG-DSGLD, to avoid the aforementioned pathology and achieve better performance. To this end, we consider inference for the mean vector of normally distributed data under a simple conjugate Gaussian model.
5.1.2 Setting:
We generate disjoint data subsets, each drawn from a Gaussian whose mean is uniformly sampled from a square. We then perform inference on the overall mean using the model above. We draw the same number of posterior samples using both DSGLD and CG-DSGLD, with fixed step size and minibatch size. The first samples were discarded as burn-in and the remaining ones were thinned by 100. The surrogate for each shard was set analytically.
5.1.3 Results:
Figure 1 shows the posterior samples as a function of the number of local updates the method takes before jumping to the next shard. For comparison, samples from the analytical posterior are shown in Figure 1(a). As the results show, the proposed method (CG-DSGLD) converges adequately to the true posterior, while DSGLD diverges towards a mixture of local approximations. Figure 1(b) compares convergence in terms of the number of posterior samples. While DSGLD with 100 local updates converges as fast as CG-DSGLD, it plateaus at a higher MSE. Note that, in contrast with DSGLD, CG-DSGLD is insensitive to the number of local updates in this experiment.
5.2 Metric learning
Given sets of similar and dissimilar pairs of vectors, metric learning concerns the task of learning a distance metric matrix $A$ such that the Mahalanobis distance $d_A(x_1, x_2) = (x_1 - x_2)^{\top} A\, (x_1 - x_2)$ is low for similar pairs and high for dissimilar ones.
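One common low-rank parameterization writes the metric as $A = W W^{\top}$, which guarantees positive semi-definiteness; the factor name `W` and this particular factorization are our illustration of the idea, not necessarily the exact construction of [21]:

```python
import numpy as np

def mahalanobis_sq(x1, x2, W):
    """Squared Mahalanobis distance under A = W @ W.T (W has d rows, K columns).
    Computed as ||W.T (x1 - x2)||^2 without forming A explicitly, which also
    makes nonnegativity evident."""
    diff = x1 - x2
    proj = W.T @ diff
    return float(proj @ proj)
```

Working with the factor rather than the full matrix reduces the cost per pair from O(d^2) to O(dK) when K < d.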
5.2.1 Model:
We consider inference in the Bayesian metric learning model proposed by Yang et al. [21], in which it is assumed that $A$ can be expressed in terms of its top eigenvectors. The likelihood function for each pair from the similar or dissimilar set is given by a function of the Mahalanobis distance, where the pair label equals one if the pair is similar and zero otherwise. While this construction is enough to guarantee that $A$ is a distance metric, the constraint is further relaxed and a diagonal Gaussian prior is placed on the eigenvector weights.
5.2.2 Setting:
We devised a data set for metric learning based on the Spoken Letter Recognition (isolet) data (https://archive.ics.uci.edu/ml/datasets/isolet), which encompasses 7797 examples split among 26 classes. We created pairs of similar and dissimilar vectors using the labels of the isolet examples, i.e., samples are considered similar if they belong to the same class and dissimilar otherwise. We split these pairs into data shards of identical size, such that there is no overlap in the classes used, to create the sets of similar and dissimilar pairs in each shard. Additionally, we created another thousand pairs, equally split between similar and dissimilar examples, which we held out for testing.
We ran both DSGLD and CG-DSGLD with constant step size and minibatch size. The conducive terms are computed independently by drawing three thousand samples from a density proportional to each local likelihood using SGLD and imposing a multivariate normal approximation.
5.2.3 Results:
Figure 3 shows results in terms of average log-likelihood as a function of the number of samples. The curves show the average of ten repetitions of the experiment with different random seeds. The results show that CG-DSGLD converges faster than DSGLD while also achieving better performance, both on observed samples and on held-out data.
5.3 Bayesian neural networks
In this experiment, we assess the performance of our method for posterior inference in a deep Multi-Layer Perceptron (MLP) for: 1) evenly distributed data; and 2) different levels of heterogeneity between shards.
5.3.1 Model:
We consider an MLP with three hidden layers. All hidden nodes are equipped with the rectified linear unit (ReLU) activation function, and a softmax function is applied to the output of the network. Since we employ this network for a classification task, we use the cross-entropy loss function.
5.3.2 Homogeneous setting:
We perform posterior inference for the SUSY binary classification data set (https://archive.ics.uci.edu/ml/datasets/SUSY), which comprises five million samples with 18 covariates each. We split the data set into equally sized shards and set aside an additional shard for evaluation purposes.

We ran CG-DSGLD and DSGLD, with even shard selection probabilities, for several rounds of communication, between which shard-local updates take place. For both methods, we adopt a fixed step size and use minibatches of fixed size. The first samples were discarded for each method and the remaining ones were thinned by two. Similarly to the experiments in Subsection 5.2, we computed the conducive terms by drawing independently from densities proportional to the local likelihoods and imposing diagonal multivariate normal approximations based on these samples.
5.3.3 Results:
Figure 4 reports the results in terms of average log-likelihood, evaluated on both observed and held-out data, as a function of the number of samples drawn. Values reported correspond to the average over ten rounds of experiments, each with a different set of data points. Note that in this case CG-DSGLD converged to clearly better average log-likelihood values.
5.3.4 Heterogeneous setting:
To simulate different degrees of heterogeneity, we create a series of label-imbalanced data shards based on the SUSY data. We control the amount of heterogeneity by drawing the proportion of positive samples in each shard from a symmetric Beta distribution. For large values of its parameter, the data shards are approximately balanced; for small values, half of the shards tend to have mostly positive labels and the other half mostly negative labels, enforcing diversity between shards. All shards comprise the same number of samples, and an additional balanced shard is held out for evaluation. All remaining implementation details for CG-DSGLD and DSGLD are kept the same as in the homogeneous setting.

5.3.5 Results:
Figure 5 shows results in terms of average log-likelihood, evaluated on held-out data, for both values of the Beta parameter. Note that in the case where most of the shards are balanced, CG-DSGLD and DSGLD exhibit similar behaviour. However, for more heterogeneous shards, CG-DSGLD clearly outperforms DSGLD.
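The label-imbalanced shard construction described in Subsection 5.3.4 can be sketched as follows; the function name and parameter values are illustrative, not the paper's exact settings:

```python
import numpy as np

def make_imbalanced_shards(y, num_shards, shard_size, beta_param, rng):
    """Build label-imbalanced shards: the positive-class proportion of each shard
    is drawn from Beta(b, b). Small b pushes shards toward all-positive or
    all-negative; large b keeps them roughly balanced."""
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    shards = []
    for _ in range(num_shards):
        p = rng.beta(beta_param, beta_param)
        n_pos = int(round(p * shard_size))
        idx = np.concatenate([rng.choice(pos, n_pos, replace=False),
                              rng.choice(neg, shard_size - n_pos, replace=False)])
        rng.shuffle(idx)
        shards.append(idx)
    return shards
```

A symmetric Beta makes the imbalance knob one-dimensional: its variance shrinks as the parameter grows, so a single scalar interpolates between near-balanced and strongly polarized shards.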
6 Conclusion
We proposed CG-DSGLD, a novel method which incorporates coarse information about the likelihood contribution of each data shard to improve the convergence of distributed SGLD. Experiments show that our method outperforms DSGLD, especially in cases where there is considerable variety across data shards, causing their likelihood contributions to be significantly distinct.
Our method can be seen as a variance reduction strategy for DSGLD, and we provided a theoretical analysis of its impact on convergence. To the best of our knowledge, we also present the first discussion of the convergence of SGLD in distributed settings. In contrast to existing variance-reduction strategies for SGLD, given suitable surrogates, such as exponential family distributions, CG-DSGLD can simultaneously be made efficient both in terms of memory and computation.
We leave open the possibility of employing more expressive or computationally cheaper surrogates, such as nonparametric methods or variational approximations, respectively.
Appendix 0.A Background on convergence analysis for SGLD
Let be the functional that solves the Poisson equation . Assume it is bounded up to its third-order derivative by a function , such that , with denoting the kth-order derivative. Assume as well that the expectation of w.r.t. is bounded and that is smooth, such that , for some . Under the regularity conditions above, [6] showed the following result.
Theorem 0.A.1 (See [6])
Let the stochastic gradient be an unbiased estimate of the gradient of the unnormalized negative log-posterior, with a fixed step size across iterations. Under the assumptions above, for a smooth test function, the MSE of SGLD at time T is bounded, for some constant independent of T, in the following manner:
(6a) 
To the regularity conditions mentioned above, we add for further analysis that for some .
Appendix 0.B Theorem 1: Convergence of DSGLD
Here, we follow the approach of [6], later adopted by [7]. Thus, we focus on bounding the variance of the DSGLD gradient estimator. We define auxiliary notation to avoid congestion in the following. For some , we have:
(8a)  
(8b)  
(8c)  
(8d)  
(8e)  
(8f)  
(8g)  
(8h) 
In the above, the inner expectation is taken w.r.t. a minibatch of size n with elements drawn with replacement and equal probability from the selected shard. To advance from Equation (8c) to (8d), we use the law of iterated expectations and the fact that the variance of a sum of zero-mean independent terms is the sum of their variances. To advance from Equation (8d) to (8e), we apply the bound from Assumption 1. Substituting Equation (8h) into Equation (7a) yields the desired result.
Appendix 0.C Lemma 1: Unbiasedness and finite variance
Recall that for the DSGLD update [4] we have that:
Furthermore, for conducive gradients we have that:
Thus, the CG-DSGLD gradient estimator, the sum of the DSGLD estimator and the conducive gradient, is unbiased.
A sufficient condition for the DSGLD estimator to have finite variance is that the unnormalized log-posterior is Lipschitz continuous. Similarly, since the surrogate log-densities are also Lipschitz continuous, their first derivatives are bounded, so the conducive gradient is a combination of bounded functions and has finite variance. Thus, their sum, the CG-DSGLD estimator, has finite variance.
Appendix 0.D Theorem 2: Convergence of CGDSGLD
Since conducive gradients have zero mean, the bound derived in Proposition 1 also holds. We are left with the task of proving the alternative bound. We again define auxiliary notation to avoid congestion in the following.
(9a)  
(9b)  
(9c)  
(9d)  
(9e)  
(9f)  
(9g) 
We proceed from Equation (9b) to (9c) using the law of iterated expectations and the fact that the variance of a sum of zero-mean independent terms is the sum of their variances. To advance from Equation (9c) to (9d), we apply the bound from Assumption 1. The last line is obtained using Lemma 2. The desired result is obtained by taking the minimum of the bound derived here and the one in Equation (7a).
Appendix 0.E Additional experiments
0.e.1 Linear regression
In this set of experiments, we compare vanilla DSGLD with our method, CG-DSGLD, on real data sets. We apply CG-DSGLD to Bayesian linear regression, studying the behavior of the test MSE over the posterior samples obtained.
0.e.1.1 Model
The inputs of our model are feature–response pairs. The likelihood of each output, given its input vector and the regression weights, is Gaussian, and we place a Gaussian prior on the weights.
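The log-posterior gradient used by the samplers in this model has a simple closed form; the variance values below are illustrative placeholders, not the paper's settings:

```python
import numpy as np

def grad_log_posterior_linreg(beta, X, y, noise_var=1.0, prior_var=10.0):
    """Gradient of the log-posterior for Bayesian linear regression with Gaussian
    likelihood y_i ~ N(x_i^T beta, noise_var) and prior beta ~ N(0, prior_var * I).
    (noise_var and prior_var are assumed values for illustration.)"""
    residual = y - X @ beta
    return X.T @ residual / noise_var - beta / prior_var
```

In the distributed setting, the likelihood term would be computed on a minibatch from the current shard and rescaled as in the DSGLD update; the prior term is always evaluated exactly.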
0.e.1.2 Setting
We ran experiments on four data sets from the UCI repository (https://archive.ics.uci.edu/ml/index.html): Concrete, Noise, Conductivity, and Localization. We normalized the data and partitioned each data set into training (80%) and test (20%) sets. We report the cumulative averages of the test MSE and its variance. In all experiments, both DSGLD and CG-DSGLD use the same hyperparameters. We sample from disjoint data shards over multiple rounds, each with a fixed number of iterations per round, fixed step size, and fixed minibatch size. All shards are chosen with the same probability. We discard the first ten thousand samples as burn-in and thin the remaining ones by a hundred.
0.e.1.3 Results
Figure 6 shows the cumulative test MSE and its variance. We can see that CG-DSGLD converges faster and exhibits lower variance in these experiments.
0.e.2 Logistic regression
0.e.2.1 Model
0.e.2.2 Setting
We carry out posterior inference for the Magic and Credit data sets from the UCI repository (https://archive.ics.uci.edu/ml/index.html).
We compare the proposed CG-DSGLD to the earlier DSGLD, both taking a hundred shard-local chain updates, with fixed batch size and step size. The data are normalized so that the features have mean zero and variance one. We set the likelihood surrogates as Gaussian approximations computed from samples obtained by running SGLD independently with each of the local likelihoods as a target. Using normal approximations allows us to compute the gradient of the log-surrogate in a single gradient evaluation, at a negligible, approximately O(1), cost. The data are partitioned into equally weighted shards. We discard the first three thousand samples from each chain and thin the remaining ones. Experiments were repeated several times, in each of which a thousand data points were held out for testing.
0.e.2.3 Results
Figure 7 shows that our method performs better than DSGLD in terms of cumulative average accuracy, achieving faster convergence.
References

[1] Max Welling and Yee W. Teh. Bayesian learning via stochastic gradient Langevin dynamics. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), pages 681–688, 2011.
 [2] Yi-An Ma, Tianqi Chen, and Emily Fox. A complete recipe for stochastic gradient MCMC. In Advances in Neural Information Processing Systems, pages 2917–2925, 2015.
 [3] Tianqi Chen, Emily B. Fox, and Carlos Guestrin. Stochastic gradient Hamiltonian Monte Carlo. In Eric P. Xing and Tony Jebara, editors, Proceedings of the 31st International Conference on Machine Learning, volume 32 of Proceedings of Machine Learning Research, pages 1683–1691. PMLR, 2014.
 [4] Sungjin Ahn, Babak Shahbaba, and Max Welling. Distributed stochastic gradient MCMC. In Eric P. Xing and Tony Jebara, editors, Proceedings of the 31st International Conference on Machine Learning, volume 32 of Proceedings of Machine Learning Research, pages 1044–1052. PMLR, 2014.
 [5] Jack Baker, Paul Fearnhead, Emily B. Fox, and Christopher Nemeth. Control variates for stochastic gradient MCMC. Statistics and Computing, 29(3):599–615, 2019.
 [6] Changyou Chen, Nan Ding, and Lawrence Carin. On the convergence of stochastic gradient MCMC algorithms with highorder integrators. In Proceedings of the 28th International Conference on Neural Information Processing Systems  Volume 2, NIPS’15, pages 2278–2286, Cambridge, MA, USA, 2015. MIT Press.
 [7] Kumar Avinava Dubey, Sashank J. Reddi, Sinead A. Williamson, Barnabás Póczos, Alexander J. Smola, and Eric P. Xing. Variance reduction in stochastic gradient Langevin dynamics. In Proceedings of the International Conference on Neural Information Processing Systems, pages 1154–1162, 2016.
 [8] Jakub Konečný, H. Brendan McMahan, Felix X. Yu, Peter Richtarik, Ananda Theertha Suresh, and Dave Bacon. Federated learning: Strategies for improving communication efficiency. In NIPS Workshop on Private MultiParty Machine Learning, 2016.

[9] Chunyuan Li, Changyou Chen, Yunchen Pu, Ricardo Henao, and Lawrence Carin. Communication-efficient stochastic gradient MCMC for neural networks. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence (AAAI). AAAI Press, 2019.
 [10] Zhize Li, Tianyi Zhang, Shuyu Cheng, Jun Zhu, and Jian Li. Stochastic gradient Hamiltonian Monte Carlo with variance reduction for Bayesian inference. arXiv preprint arXiv:1803.11159, 2019.
 [11] Tigran Nagapetyan, Andrew B. Duncan, Leonard Hasenclever, Sebastian J. Vollmer, Lukasz Szpruch, and Konstantinos Zygalakis. The true cost of stochastic gradient Langevin dynamics. arXiv preprint arXiv:1706.02692, 2017.
 [12] Radford Neal. MCMC using Hamiltonian dynamics. In Steve Brooks, Andrew Gelman, Galin Jones, and XiaoLi Meng, editors, Handbook of Markov Chain Monte Carlo. Chapman and Hall/CRC, New York, 2011.
 [13] Willie Neiswanger, Chong Wang, and Eric P. Xing. Asymptotically exact, embarrassingly parallel MCMC. In Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence, UAI’14, pages 623–632, Arlington, Virginia, United States, 2014. AUAI Press.
 [14] B. D. Ripley. Stochastic Simulation. Wiley, Hoboken, 2009.
 [15] Steven L Scott, Alexander W Blocker, Fernando V Bonassi, Hugh A Chipman, Edward I George, and Robert E McCulloch. Bayes and big data: The consensus Monte Carlo algorithm. International Journal of Management Science and Engineering Management, 11(2):78–88, 2016.
 [16] Umut Şimşekli, Hazal Koptagel, Hakan Güldaş, A Taylan Cemgil, Figen Öztoprak, and Ş İlker Birbil. Parallel stochastic gradient Markov Chain Monte Carlo for matrix factorisation models. arXiv preprint arXiv:1506.01418, 2015.
 [17] Yee Whye Teh, Alexandre H Thiery, and Sebastian J Vollmer. Consistency and fluctuations for stochastic gradient Langevin dynamics. Journal of Machine Learning Research, 17(1):193–225, 2016.
 [18] Alexander Terenin, Daniel Simpson, and David Draper. Asynchronous Gibbs sampling. arXiv preprint arXiv:1509.08999, 2015.
 [19] Sebastian J Vollmer, Konstantinos C Zygalakis, and Yee Whye Teh. Exploration of the (non) asymptotic bias and variance of stochastic gradient Langevin dynamics. Journal of Machine Learning Research, 17(1):5504–5548, 2016.
 [20] Xiangyu Wang, Fangjian Guo, Katherine A. Heller, and David B. Dunson. Parallelizing MCMC with random partition trees. In Proceedings of the 28th International Conference on Neural Information Processing Systems, NIPS’15, pages 451–459, Cambridge, MA, USA, 2015. MIT Press.
 [21] Liu Yang, Rong Jin, and Rahul Sukthankar. Bayesian active distance metric learning. In Proceedings of the Twenty-Third Conference on Uncertainty in Artificial Intelligence, UAI'07, pages 442–449, Arlington, Virginia, USA, 2007. AUAI Press.
 [22] Yuan Yang, Jianfei Chen, and Jun Zhu. Distributing the stochastic gradient sampler for large-scale LDA. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '16, pages 1975–1984, New York, NY, USA, 2016. ACM.