Linear Stability Hypothesis and Rank Stratification for Nonlinear Models

11/21/2022
by Yaoyu Zhang, et al.

Models with nonlinear architectures/parameterizations, such as deep neural networks (DNNs), are well known for their mysteriously good generalization performance at overparameterization. In this work, we tackle this mystery from a novel perspective: the transition of target recovery/fitting accuracy as a function of the training data size. We propose a rank stratification for general nonlinear models that uncovers a model rank as an "effective size of parameters" for each function in the function space of the corresponding model. Moreover, we establish a linear stability theory proving that a target function almost surely becomes linearly stable when the training data size equals its model rank. Supported by our experiments, we propose a linear stability hypothesis: linearly stable functions are preferred by nonlinear training. Taken together, these results show that the model rank of a target function predicts the minimal training data size needed for its successful recovery. Specifically, for the matrix factorization model and for DNNs with fully connected or convolutional architectures, our rank stratification shows that the model rank of specific target functions can be far lower than the number of model parameters. This predicts the target recovery capability of these nonlinear models even at heavy overparameterization, as demonstrated quantitatively by our experiments. Overall, our work provides a unified framework with quantitative predictive power for understanding the mysterious target recovery behavior of general nonlinear models at overparameterization.
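To make the idea concrete, here is a minimal sketch of how one might estimate a model rank numerically for the matrix factorization model f(A, B) = A Bᵀ, reading "effective size of parameters" as the rank of the Jacobian of all output entries with respect to the parameters at a point realizing the target. This is an illustrative interpretation, not code from the paper; the function name and example values are assumptions.

```python
import numpy as np

def matrix_factorization_model_rank(A, B):
    """Estimate the model rank at the function f = A @ B.T as the rank of the
    Jacobian of all output entries with respect to the parameters (A, B).

    A: (m, k) factor, B: (n, k) factor; the realized target is the m x n matrix A @ B.T.
    """
    m, k = A.shape
    n, _ = B.shape
    # Jacobian of each output entry f_{ij} = sum_l A[i, l] * B[j, l]
    # with respect to the flattened parameters (A, B).
    J = np.zeros((m * n, (m + n) * k))
    for i in range(m):
        for j in range(n):
            row = i * n + j
            # d f_{ij} / d A[i, :] = B[j, :]
            J[row, i * k:(i + 1) * k] = B[j]
            # d f_{ij} / d B[j, :] = A[i, :]
            J[row, m * k + j * k: m * k + (j + 1) * k] = A[i]
    return np.linalg.matrix_rank(J)

# Example: a rank-1 target u v^T realized inside an overparameterized (k = 5) factorization.
rng = np.random.default_rng(0)
m, n, k = 10, 10, 5
u, v = rng.standard_normal((m, 1)), rng.standard_normal((n, 1))
A = np.hstack([u, np.zeros((m, k - 1))])  # parameters realizing the rank-1 target
B = np.hstack([v, np.zeros((n, k - 1))])
print(matrix_factorization_model_rank(A, B))  # 19, far below the (m + n) * k = 100 parameters
```

For this rank-1 target the computed rank is m + n - 1 = 19, well below the 100 trainable parameters, consistent with the abstract's claim that the model rank of specific targets can be far lower than the parameter count.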


