Analysis of SGD with Biased Gradient Estimators

07/31/2020
by Ahmad Ajalloeian, et al.

We analyze the complexity of biased stochastic gradient methods (SGD), where individual updates are corrupted by deterministic, i.e., biased, error terms. We derive convergence results for smooth (non-convex) functions and give improved rates under the Polyak-Łojasiewicz condition. We quantify how the magnitude of the bias impacts the attainable accuracy and the convergence rate. Our framework covers many applications where only biased gradient updates are available, or where biased updates are preferred over unbiased ones for performance reasons. For instance, in distributed learning, biased gradient compression techniques such as top-k compression have been proposed as a tool to alleviate the communication bottleneck, and in derivative-free optimization, only biased gradient estimators can be queried. We discuss a few guiding examples that show the broad applicability of our analysis.
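To make the guiding example concrete, here is a minimal sketch (not the paper's implementation) of SGD driven by a biased gradient estimator: a top-k compressor that keeps only the k largest-magnitude gradient coordinates and zeroes the rest. The function names and the quadratic test objective are illustrative choices, not from the paper.

```python
import numpy as np

def top_k(g, k):
    """Keep the k largest-magnitude entries of g, zero the rest.

    This is a biased compressor: E[top_k(g)] != g in general.
    """
    out = np.zeros_like(g)
    idx = np.argpartition(np.abs(g), -k)[-k:]  # indices of k largest |g_i|
    out[idx] = g[idx]
    return out

def biased_sgd(grad_fn, x0, lr=0.1, k=2, steps=200):
    """Plain SGD where each update uses the compressed (biased) gradient."""
    x = x0.astype(float).copy()
    for _ in range(steps):
        x -= lr * top_k(grad_fn(x), k)
    return x

# Usage: minimize the smooth quadratic f(x) = 0.5 * ||x||^2, whose
# gradient is simply x. Despite the bias, the iterates approach the
# minimizer, consistent with the convergence results described above.
x_star = biased_sgd(lambda x: x, np.array([1.0, -2.0, 3.0, 0.5]))
print(np.linalg.norm(x_star))  # small residual norm
```

On this objective the compressor shrinks the two largest coordinates each step, cycling through all of them, so the full iterate still converges; in general the bias magnitude limits the attainable accuracy, as the analysis quantifies.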


