Connecting Optimization and Generalization via Gradient Flow Path Length

02/22/2022
by Fusheng Liu, et al.

Optimization and generalization are two essential aspects of machine learning. In this paper, we propose a framework that connects optimization with generalization by analyzing the generalization error in terms of the length of the optimization trajectory traced by gradient flow up to convergence. Through this approach, we show that, with a proper initialization, gradient flow converges along a short path whose length admits an explicit estimate. This estimate induces a length-based generalization bound, showing that short optimization paths at convergence are associated with good generalization, which matches our numerical results. The framework applies to a broad range of settings; for example, we use it to obtain generalization estimates for three distinct machine learning models: underdetermined ℓ_p linear regression, kernel regression, and overparameterized two-layer ReLU neural networks.
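To make the central quantity concrete, here is a minimal sketch (not the paper's method) of how one might measure the length of an optimization trajectory numerically: gradient flow is approximated by small-step gradient descent on an underdetermined least-squares problem, and the Euclidean lengths of the parameter updates are accumulated. The problem sizes, step size, and iteration count are illustrative assumptions.

```python
import numpy as np

# Hypothetical illustration: approximate gradient flow with small-step
# gradient descent on an underdetermined least-squares problem, and
# accumulate the Euclidean length of the parameter trajectory.

rng = np.random.default_rng(0)
n, d = 20, 100                          # n < d: underdetermined regression
X = rng.standard_normal((n, d)) / np.sqrt(d)
y = rng.standard_normal(n)

theta = np.zeros(d)                     # initialization at the origin
eta = 0.05                              # small step size mimics gradient flow
path_length = 0.0

for _ in range(5000):
    grad = X.T @ (X @ theta - y)        # gradient of 0.5 * ||X theta - y||^2
    step = -eta * grad
    path_length += np.linalg.norm(step) # length of this trajectory segment
    theta += step

final_loss = 0.5 * np.sum((X @ theta - y) ** 2)
print(f"final loss: {final_loss:.2e}")
print(f"trajectory length: {path_length:.3f}")
print(f"distance from initialization: {np.linalg.norm(theta):.3f}")
```

By the triangle inequality the trajectory length is always at least the straight-line distance from the initialization to the converged solution; when the two are close, the path is nearly straight, which is the "short path" regime the abstract associates with good generalization.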
