Explore Aggressively, Update Conservatively: Stochastic Extragradient Methods with Variable Stepsize Scaling

03/23/2020
by Yu-Guan Hsieh, et al.

Owing to their stability and convergence speed, extragradient methods have become a staple for solving large-scale saddle-point problems in machine learning. The basic premise of these algorithms is the use of an extrapolation step before performing an update; thanks to this exploration step, extragradient methods overcome many of the non-convergence issues that plague gradient descent/ascent schemes. However, as we show in this paper, running vanilla extragradient with stochastic gradients may jeopardize its convergence, even in simple bilinear models. To overcome this failure, we investigate a double-stepsize extragradient algorithm in which the exploration step evolves on a more aggressive timescale than the update step. We show that this modification allows the method to converge even with stochastic gradients, and we derive sharp convergence rates under an error bound condition.
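To make the two-timescale idea concrete, here is a minimal NumPy sketch of a double-stepsize stochastic extragradient loop on a noisy bilinear saddle-point problem. The oracle, the function names, and the stepsize exponents (1/4 for exploration, 3/4 for update) are illustrative assumptions for this sketch, not the paper's exact algorithmic choices; the only feature it is meant to show is that the exploration stepsize gamma_t decays more slowly than the update stepsize eta_t.

```python
import numpy as np

def double_stepsize_extragradient(oracle, x0, n_iter=20_000,
                                  gamma0=1.0, eta0=0.1):
    """Double-stepsize stochastic extragradient (illustrative sketch).

    `oracle(x)` is assumed to return a noisy sample of the operator V(x),
    e.g. the gradient field of a saddle-point problem. The schedules
    below are hypothetical, chosen only so that gamma_t >> eta_t
    asymptotically (aggressive exploration, conservative update).
    """
    x = np.asarray(x0, dtype=float)
    for t in range(1, n_iter + 1):
        gamma_t = gamma0 / t ** 0.25   # aggressive exploration stepsize
        eta_t = eta0 / t ** 0.75       # conservative update stepsize
        x_lead = x - gamma_t * oracle(x)   # extrapolation (exploration) step
        x = x - eta_t * oracle(x_lead)     # update from the leading point
    return x

# Toy bilinear saddle point min_u max_v u*v, the kind of model where
# vanilla stochastic extragradient with a single stepsize can fail.
rng = np.random.default_rng(0)

def noisy_bilinear_oracle(z, sigma=1.0):
    u, v = z
    # Operator V(u, v) = (v, -u) for descent in u and ascent in v,
    # perturbed by Gaussian noise to mimic stochastic gradients.
    return np.array([v, -u]) + sigma * rng.standard_normal(2)

z_last = double_stepsize_extragradient(noisy_bilinear_oracle, x0=[1.0, 1.0])
print(z_last)  # should end up near the saddle point (0, 0)
```

The design point is the separation of timescales: the extrapolation point keeps exploring with the slowly decaying gamma_t, while the actual iterate moves conservatively with the faster-decaying eta_t, which is what restores convergence under noise.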

Related research

01/30/2022
SRKCD: a stabilized Runge-Kutta method for stochastic optimization
We introduce a family of stochastic optimization methods based on the Ru...

05/23/2018
Predictive Local Smoothness for Stochastic Gradient Methods
Stochastic gradient methods are dominant in nonconvex optimization espec...

07/07/2018
Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile
Owing to their connection with generative adversarial networks (GANs), s...

08/20/2022
Adam Can Converge Without Any Modification on Update Rules
Ever since Reddi et al. 2018 pointed out the divergence issue of Adam, m...

11/13/2021
Bolstering Stochastic Gradient Descent with Model Building
Stochastic gradient descent method and its variants constitute the core ...

03/28/2019
Block stochastic gradient descent for large-scale tomographic reconstruction in a parallel network
Iterative algorithms have many advantages for linear tomographic image r...
