On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms

06/13/2022
by   Lam M. Nguyen, et al.

The stochastic gradient descent (SGD) algorithm is the method of choice in many machine learning tasks thanks to its scalability and efficiency in dealing with large-scale problems. In this paper, we focus on the shuffling version of SGD, which matches mainstream practical heuristics. We show convergence to a global solution of shuffling SGD for a class of non-convex functions under over-parameterized settings. Our analysis employs more relaxed non-convex assumptions than the previous literature. Nevertheless, we maintain the same computational complexity that shuffling SGD achieves in the general convex setting.
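To make the algorithmic object of study concrete, the following is a minimal sketch of shuffling-type SGD (random reshuffling): each epoch draws a fresh permutation of the samples and performs one incremental gradient step per sample. The toy over-parameterized least-squares objective and all function names here are illustrative assumptions, not the paper's exact setting or notation.

```python
# Minimal sketch of shuffling-type SGD (random reshuffling).
# The toy least-squares problem and the names grad_i / shuffling_sgd are
# illustrative assumptions, not the paper's exact formulation.
import numpy as np

rng = np.random.default_rng(0)

# Toy over-parameterized data: more parameters (d) than samples (n).
n, d = 20, 100
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def grad_i(w, i):
    """Gradient of the i-th component loss f_i(w) = 0.5 * (a_i^T w - b_i)^2."""
    return (A[i] @ w - b[i]) * A[i]

def shuffling_sgd(w0, epochs=50, lr=0.05):
    """One full pass per epoch over a fresh random permutation of the samples."""
    w = w0.copy()
    for _ in range(epochs):
        perm = rng.permutation(n)   # reshuffle at the start of each epoch
        for i in perm:              # visit every sample exactly once per epoch
            w -= lr * grad_i(w, i)
    return w

w = shuffling_sgd(np.zeros(d))
print("final training loss:", 0.5 * np.mean((A @ w - b) ** 2))
```

The key difference from plain SGD is the sampling scheme: instead of drawing indices with replacement, each epoch processes every sample exactly once in a random order, which is the behavior of standard data-loader shuffling in practice.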


