Analysis of SGD with Biased Gradient Estimators

07/31/2020
by Ahmad Ajalloeian, et al.

We analyze the complexity of biased stochastic gradient descent (SGD) methods, where individual updates are corrupted by deterministic, i.e., biased, error terms. We derive convergence results for smooth (non-convex) functions and give improved rates under the Polyak-Łojasiewicz condition. We quantify how the magnitude of the bias impacts the attainable accuracy and the convergence rates. Our framework covers many applications where biased gradient updates are either the only ones available or are preferred over unbiased ones for performance reasons. For instance, in distributed learning, biased gradient compression techniques such as top-k compression have been proposed as a tool to alleviate the communication bottleneck, while in derivative-free optimization only biased gradient estimators can be queried. We discuss a few guiding examples that show the broad applicability of our analysis.
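A minimal sketch (not taken from the paper) of the kind of biased update the abstract describes: plain SGD where each step uses a top-k sparsified gradient, so the error in the update direction is deterministic rather than zero-mean noise. The objective, step size, k, and iteration count below are illustrative assumptions.

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v, zero the rest (a biased compressor)."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def grad(x):
    """Gradient of a simple smooth test objective f(x) = 0.5 * ||x||^2."""
    return x

def biased_sgd(x0, step=0.1, k=2, iters=100):
    """SGD with a systematically biased gradient estimate (top-k compression)."""
    x = x0.copy()
    for _ in range(iters):
        g = top_k(grad(x), k)   # biased estimate: deterministic compression error
        x -= step * g           # standard SGD step along the biased direction
    return x

print(biased_sgd(np.array([1.0, -2.0, 0.5, 3.0])))
```

A derivative-free finite-difference estimator could be plugged in place of `grad` in the same loop; it, too, yields a biased estimate of the true gradient, which is the other application highlighted above.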

