Finite Precision Stochastic Optimization -- Accounting for the Bias

08/22/2019
by   Prathamesh Mayekar, et al.

We consider first-order stochastic optimization where the oracle must quantize each subgradient estimate to $r$ bits. We treat two oracle models: the first where the Euclidean norm of the oracle output is almost surely bounded, and the second where it is mean square bounded. Prior work in this setting assumes the availability of unbiased quantizers. While this assumption is valid for almost surely bounded oracles, it does not hold for the standard setting of mean square bounded oracles, and the resulting bias can dramatically affect the convergence rate. We analyze the performance of standard quantizers from prior work in combination with projected stochastic gradient descent for both oracle models and present two new adaptive quantizers that outperform the existing ones. Specifically, for almost surely bounded oracles, we first establish a lower bound on the precision needed to attain the standard convergence rate of $T^{-\frac 12}$ for optimizing convex functions over a $d$-dimensional domain. Our proposed Rotated Adaptive Tetra-iterated Quantizer (RATQ) is only a factor of $O(\log \log \log^\ast d)$ away from this lower bound. For mean square bounded oracles, we show that a state-of-the-art Rotated Uniform Quantizer (RUQ) from prior work needs at least $\Omega(d\log T)$ bits to achieve the convergence rate of $T^{-\frac 12}$, under any optimization protocol. In contrast, our proposed Rotated Adaptive Quantizer (RAQ) outperforms RUQ in this setting and attains a convergence rate of $T^{-\frac 12}$ using a precision of only $O(d\log\log T)$. Finally, for mean square bounded oracles in the communication-starved regime, where the precision $r$ is fixed to a constant independent of $T$, we show that RUQ cannot attain a convergence rate better than $T^{-\frac 14}$ for any $r$, while RAQ attains rates arbitrarily close to $T^{-\frac 12}$ as $r$ increases.
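To make the quantized-oracle setting concrete, here is a minimal sketch of projected SGD driven by a coordinate-wise $r$-bit uniform quantizer with clipping. This is an illustrative assumption, not the paper's RATQ, RAQ, or RUQ constructions: the names `uniform_quantize` and `projected_quantized_sgd`, the clipping range `B`, and the step size `eta` are all hypothetical. The randomized rounding is unbiased inside the clipping range; the clipping step is exactly where bias can enter when the oracle output is only mean square bounded rather than almost surely bounded.

```python
# Hypothetical sketch: projected SGD with an r-bit-per-coordinate uniform
# quantizer and clipping. Not the paper's RATQ/RAQ/RUQ quantizers.
import numpy as np

def uniform_quantize(g, r, B):
    """Stochastically quantize each coordinate of g to one of 2^r levels in [-B, B].

    Coordinates outside [-B, B] are clipped; this clipping is the source of
    bias for oracles that are only mean square bounded.
    """
    k = 2 ** r - 1                        # number of quantization intervals
    g = np.clip(g, -B, B)
    scaled = (g + B) / (2 * B) * k        # map each coordinate to [0, k]
    low = np.floor(scaled)
    prob_up = scaled - low                # randomized rounding: unbiased within [-B, B]
    levels = low + (np.random.rand(*g.shape) < prob_up)
    return levels / k * (2 * B) - B       # map quantization levels back to [-B, B]

def projected_quantized_sgd(grad_oracle, project, x0, T, eta, r, B):
    """Projected SGD where every subgradient estimate is quantized to r bits per coordinate."""
    x = np.array(x0, dtype=float)
    avg = np.zeros_like(x)
    for t in range(1, T + 1):
        g_hat = uniform_quantize(grad_oracle(x), r, B)
        x = project(x - eta * g_hat)
        avg += (x - avg) / t              # running average of the iterates
    return avg

# Usage example: noisy gradients of a quadratic, constrained to the unit ball.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grad = lambda x: (x - 0.5) + rng.normal(scale=0.1, size=x.shape)
    proj = lambda x: x / max(1.0, np.linalg.norm(x))
    x_bar = projected_quantized_sgd(grad, proj, x0=np.zeros(5), T=2000,
                                    eta=0.05, r=4, B=2.0)
    print(x_bar)
```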


