Stochasticity helps to navigate rough landscapes: comparing gradient-descent-based algorithms in the phase retrieval problem

03/08/2021
by Francesca Mignacco, et al.

In this paper we investigate how gradient-based algorithms such as gradient descent, (multi-pass) stochastic gradient descent, its persistent variant, and the Langevin algorithm navigate non-convex loss landscapes, and which of them reaches the best generalization error at limited sample complexity. We consider the loss landscape of the high-dimensional phase retrieval problem as a prototypical highly non-convex example. We observe that for phase retrieval the stochastic variants of gradient descent reach perfect generalization in regions of the control parameters where gradient descent does not. We apply dynamical mean-field theory from statistical physics to characterize analytically the full trajectories of these algorithms in their continuous-time limit, with a warm start, and for large system sizes. We further unveil several intriguing properties of the landscape and the algorithms, for example that gradient descent can obtain better generalization from less informed initializations.
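To make the setting concrete, the following is a minimal NumPy sketch of a phase retrieval loss and of one discrete update step that covers full-batch gradient descent, mini-batch stochastic gradient descent, and a Langevin-style noisy step. It is an illustration under stated assumptions, not the authors' implementation: the quartic loss, the spherical normalization |w|^2 = dim, the Bernoulli mini-batch mask, the warm-start construction, and all function and parameter names (`make_data`, `loss_and_grad`, `warm_start`, `step`, `batch_fraction`, `temperature`) are hypothetical choices made here. The paper analyzes the continuous-time limit, whereas this sketch uses a plain finite step size, and the persistent variant of SGD is not shown.

```python
import numpy as np

def make_data(n_samples, dim, rng):
    """Gaussian inputs and phaseless labels y = (x . w_star / sqrt(dim))**2."""
    w_star = rng.standard_normal(dim)
    w_star *= np.sqrt(dim) / np.linalg.norm(w_star)  # assumed norm: |w_star|^2 = dim
    X = rng.standard_normal((n_samples, dim))
    y = (X @ w_star / np.sqrt(dim)) ** 2
    return X, y, w_star

def loss_and_grad(w, X, y, mask=None):
    """Assumed quartic loss 0.25 * sum_i mask_i * (h_i**2 - y_i)**2 and its gradient."""
    dim = X.shape[1]
    if mask is None:
        mask = np.ones(len(y))  # full batch
    h = X @ w / np.sqrt(dim)    # pre-activations
    r = h ** 2 - y              # residuals on the squared outputs
    loss = 0.25 * np.sum(mask * r ** 2)
    grad = X.T @ (mask * r * h) / np.sqrt(dim)
    return loss, grad

def warm_start(w_star, overlap, rng):
    """Initial point on the sphere with a prescribed overlap (w0 . w_star) / dim."""
    dim = len(w_star)
    noise = rng.standard_normal(dim)
    noise -= (noise @ w_star) / dim * w_star       # remove the component along w_star
    noise *= np.sqrt(dim) / np.linalg.norm(noise)  # rescale to |noise|^2 = dim
    return overlap * w_star + np.sqrt(1.0 - overlap ** 2) * noise

def step(w, X, y, lr, rng, batch_fraction=1.0, temperature=0.0):
    """One update: batch_fraction=1 gives gradient descent, <1 gives mini-batch SGD,
    temperature>0 adds Langevin-style Gaussian noise. The result is rescaled to the sphere."""
    n, dim = X.shape
    mask = (rng.random(n) < batch_fraction).astype(float)  # Bernoulli sample selection
    _, grad = loss_and_grad(w, X, y, mask)
    w = w - lr * grad
    if temperature > 0:
        w = w + np.sqrt(2.0 * temperature * lr) * rng.standard_normal(dim)
    return w * np.sqrt(dim) / np.linalg.norm(w)
```

A trajectory can be produced by calling `step` repeatedly from `warm_start(w_star, overlap, rng)` and tracking the overlap `w @ w_star / dim`; its absolute value is the relevant quantity, since the labels do not distinguish `w_star` from `-w_star`.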

research
06/10/2020

Dynamical mean-field theory for stochastic gradient descent in Gaussian mixture classification

We analyze in a closed form the learning dynamics of stochastic gradient...
research
05/29/2019

How to iron out rough landscapes and get optimal performances: Replicated Gradient Descent and its application to tensor PCA

In many high-dimensional estimation problems the main task consists in m...
research
06/17/2022

Landscape Learning for Neural Network Inversion

Many machine learning methods operate by inverting a neural network at i...
research
01/31/2020

Learning Unitaries by Gradient Descent

We study the hardness of learning unitary transformations by performing ...
research
12/20/2014

Explorations on high dimensional landscapes

Finding minima of a real valued non-convex function over a high dimensio...
research
11/29/2019

A note on Douglas-Rachford, subgradients, and phase retrieval

The properties of gradient techniques for the phase retrieval problem ha...
research
01/25/2022

On Uniform Boundedness Properties of SGD and its Momentum Variants

A theoretical, and potentially also practical, problem with stochastic g...