Stability and Generalization of Learning Algorithms that Converge to Global Optima

10/23/2017
by Zachary Charles, et al.

We establish novel generalization bounds for learning algorithms that converge to global minima. We do so by deriving black-box stability results that only depend on the convergence of a learning algorithm and the geometry around the minimizers of the loss function. The results are shown for nonconvex loss functions satisfying the Polyak-Łojasiewicz (PL) and the quadratic growth (QG) conditions. We further show that these conditions arise for some neural networks with linear activations. We use our black-box results to establish the stability of optimization algorithms such as stochastic gradient descent (SGD), gradient descent (GD), randomized coordinate descent (RCD), and the stochastic variance reduced gradient method (SVRG), in both the PL and the strongly convex setting. Our results match or improve state-of-the-art generalization bounds and can easily be extended to similar optimization algorithms. Finally, we show that although our results imply comparable stability for SGD and GD in the PL setting, there exist simple neural networks with multiple local minima where SGD is stable but GD is not.
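For reference, the two geometric conditions and the stability notion mentioned in the abstract can be sketched as follows; the constant \mu > 0, the projection w_p, and the loss \ell below are generic notation, not necessarily the paper's own. A differentiable loss f with minimum value f^* satisfies the PL condition if

\[
  \tfrac{1}{2}\,\|\nabla f(w)\|^{2} \;\ge\; \mu\,\bigl(f(w) - f^{*}\bigr) \qquad \text{for all } w,
\]

and the QG condition if

\[
  f(w) - f^{*} \;\ge\; \tfrac{\mu}{2}\,\|w - w_{p}\|^{2} \qquad \text{for all } w,
\]

where w_p denotes the projection of w onto the set of global minimizers. An algorithm A is \epsilon-uniformly stable if, for any two training sets S and S' differing in a single example,

\[
  \sup_{z}\,\bigl|\ell\bigl(A(S); z\bigr) - \ell\bigl(A(S'); z\bigr)\bigr| \;\le\; \epsilon,
\]

and uniform stability of this kind yields a generalization bound of the same order. The black-box results in the paper bound this \epsilon using only an algorithm's rate of convergence to a global minimizer together with the PL or QG geometry around that minimizer.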
