Understanding the Impact of Model Incoherence on Convergence of Incremental SGD with Random Reshuffle

07/07/2020
by Shaocong Ma, et al.

Although SGD with random reshuffle has been widely used in machine learning applications, there is limited understanding of how model characteristics affect the convergence of the algorithm. In this work, we introduce model incoherence to characterize the diversity of model characteristics and study its impact on the convergence of SGD with random reshuffle under weak strong convexity. Specifically, minimizer incoherence measures the discrepancy between the global minimizers of each sample loss and those of the total loss, and it determines the convergence error of SGD with random reshuffle. In particular, we show that under full minimizer coherence, the variable sequence generated by SGD with random reshuffle converges to a certain global minimizer of the total loss. The other measure, curvature incoherence, captures the quality of the condition numbers of the sample losses and determines the convergence rate of SGD. With model incoherence, our results show that SGD achieves a faster convergence rate and a smaller convergence error under random reshuffle than under random sampling, and hence provide justification for the superior practical performance of SGD with random reshuffle.
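To make the difference between the two sampling schemes concrete, below is a minimal sketch (not the paper's algorithm or experiments) of incremental SGD on an average loss f(x) = (1/n) * sum_i f_i(x): one run uses random reshuffle (a fresh permutation of the n samples each epoch) and one uses random sampling (n draws with replacement per epoch). The least-squares losses, step size, and realizable data are illustrative assumptions; realizability makes every sample loss share a global minimizer, i.e., full minimizer coherence holds.

```python
# Minimal sketch: incremental SGD with random reshuffle vs. random sampling.
# The quadratic losses, step size, and problem sizes are illustrative
# assumptions, not the paper's setup.
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 5
A = rng.standard_normal((n, d))
x_star = rng.standard_normal(d)
b = A @ x_star  # realizable data: every sample loss is minimized at x_star

def sgd(reshuffle, epochs=100, lr=0.05):
    """Incremental SGD on f(x) = (1/n) * sum_i 0.5 * (a_i @ x - b_i)**2."""
    x = np.zeros(d)
    for _ in range(epochs):
        # Random reshuffle: each epoch visits every sample exactly once, in a
        # fresh random order. Random sampling: n i.i.d. draws with replacement,
        # so some samples repeat and others are skipped within an epoch.
        idx = rng.permutation(n) if reshuffle else rng.integers(0, n, size=n)
        for i in idx:
            x -= lr * (A[i] @ x - b[i]) * A[i]  # gradient step on sample loss f_i
    return x

total_loss = lambda x: 0.5 * np.mean((A @ x - b) ** 2)
print("random reshuffle:", total_loss(sgd(reshuffle=True)))
print("random sampling: ", total_loss(sgd(reshuffle=False)))
```

In this fully coherent setting the convergence error vanishes and both variants converge to the shared minimizer; the abstract's claim concerns the general incoherent case, in which random reshuffle attains a smaller convergence error and a faster rate than with-replacement sampling.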


Related research

04/02/2023
Fast Convergence of Random Reshuffling under Over-Parameterization and the Polyak-Łojasiewicz Condition
Modern machine learning models are often over-parameterized and as a res...

06/12/2018
Convergence of SGD in Learning ReLU Models with Separable Data
We consider the binary classification problem in which the objective fun...

06/12/2020
Adaptive Gradient Methods Can Be Provably Faster than SGD after Finite Epochs
Adaptive gradient methods have attracted much attention of machine learn...

11/13/2019
Exponential Convergence Rates of Classification Errors on Learning with SGD and Random Features
Although kernel methods are widely used in many learning problems, they ...

09/17/2023
Global Convergence of SGD For Logistic Loss on Two Layer Neural Nets
In this note, we demonstrate a first-of-its-kind provable convergence of...

03/31/2022
Data Sampling Affects the Complexity of Online SGD over Dependent Data
Conventional machine learning applications typically assume that data sa...

02/03/2022
Characterizing & Finding Good Data Orderings for Fast Convergence of Sequential Gradient Methods
While SGD, which samples from the data with replacement, is widely studie...
