Towards Understanding Generalization via Decomposing Excess Risk Dynamics

06/11/2021
by Jiaye Teng, et al.

Generalization is one of the central issues in machine learning. However, traditional tools such as uniform convergence cannot fully explain generalization, since they may yield vacuous bounds even in the overparameterized linear regression regime. An alternative is to analyze the generalization dynamics and derive algorithm-dependent bounds, e.g., via stability. Unfortunately, stability-based bounds still fall far short of explaining the remarkable generalization ability of neural networks because of their coarse-grained treatment of signal and noise. Inspired by the observation that neural networks converge slowly when fitting noise, we propose decomposing the excess risk dynamics and applying a stability-based bound only to the variance part (which measures how the model performs on pure noise). We provide two applications of the framework: a linear case (overparameterized linear regression with gradient descent) and a non-linear case (matrix recovery with gradient flow). Under the decomposition framework, the new bound accords better with theoretical and empirical evidence than both the stability-based bound and the uniform convergence bound.
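To make the decomposition concrete, below is a minimal numerical sketch of the linear case, assuming a squared-loss objective and zero initialization; the dimensions, step size, and the helper gd_param_error are illustrative choices, not the paper's exact construction. Under these assumptions the gradient descent iterate is linear in the labels, so a run on noisy labels splits exactly into a run on the clean signal (the bias dynamics) plus a run on pure noise (the variance dynamics), which is what allows a stability argument to be applied to the variance term alone.

```python
# Sketch: decomposing excess risk dynamics in overparameterized linear
# regression trained with gradient descent. All constants are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, d, sigma = 50, 200, 0.5   # overparameterized regime: d > n
lr, steps = 1.0, 500         # plain gradient descent on mean squared loss

X = rng.normal(size=(n, d)) / np.sqrt(d)
beta_star = rng.normal(size=d)
noise = sigma * rng.normal(size=n)
y = X @ beta_star + noise    # noisy labels = clean signal + pure noise

def gd_param_error(targets, beta_ref):
    """Run GD from zero init on squared loss against `targets`, tracking
    ||beta_t - beta_ref||^2 as a proxy for that risk component."""
    beta = np.zeros(d)
    errs = []
    for _ in range(steps):
        beta -= lr * X.T @ (X @ beta - targets) / n
        errs.append(np.sum((beta - beta_ref) ** 2))
    return np.array(errs)

# GD on squared loss from zero init is linear in the labels, so the run
# on y is the sum of the two component runs below: the bias part tracks
# how fast the clean signal is learned, and the variance part tracks how
# much of the iterate is devoted to fitting pure noise.
bias_path = gd_param_error(X @ beta_star, beta_star)  # bias dynamics
var_path = gd_param_error(noise, np.zeros(d))         # variance dynamics

print("bias part (start -> end):    ", bias_path[0], "->", bias_path[-1])
print("variance part (start -> end):", var_path[0], "->", var_path[-1])
```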


Related research

- Generalization Error Bounds for Optimization Algorithms via Stability (09/27/2016): Many machine learning tasks can be formulated as Regularized Empirical R...
- Connecting Optimization and Generalization via Gradient Flow Path Length (02/22/2022): Optimization and generalization are two essential aspects of machine lea...
- Understanding the Role of Adversarial Regularization in Supervised Learning (10/01/2020): Despite numerous attempts sought to provide empirical evidence of advers...
- An Exponential Efron-Stein Inequality for Lq Stable Learning Rules (03/12/2019): There is accumulating evidence in the literature that stability of learn...
- Relaxing the Feature Covariance Assumption: Time-Variant Bounds for Benign Overfitting in Linear Regression (02/12/2022): Benign overfitting demonstrates that overparameterized models can perfor...
- Explaining generalization in deep learning: progress and fundamental limits (10/17/2021): This dissertation studies a fundamental open challenge in deep learning ...
- Lower Bounds on the Generalization Error of Nonlinear Learning Models (03/26/2021): We study in this paper lower bounds for the generalization error of mode...
