Quadratic models for understanding neural network dynamics

05/24/2022
by Libin Zhu et al.

In this work, we propose using quadratic models as a tool for understanding the properties of wide neural networks in both optimization and generalization. We show analytically that certain deep learning phenomena, such as the "catapult phase" from [Lewkowycz et al. 2020], which cannot be captured by linear models, are manifested in quadratic models of shallow ReLU networks. Furthermore, our empirical results indicate that the behaviour of quadratic models parallels that of neural networks in generalization, especially in the large-learning-rate regime. We expect quadratic models to serve as a useful tool for the analysis of neural networks.
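
As a rough illustration of the optimization side of this claim, the sketch below (an illustrative toy, not code or the exact quadratic model from the paper) runs gradient descent on the two-parameter model f(u, v) = u*v with a single training point (x, y) = (1, 0), a standard warm-up for the catapult phase in [Lewkowycz et al. 2020]. The model is quadratic in its parameters: above the critical learning rate 2/lambda0 (where lambda0 is the tangent kernel at initialization) its loss first spikes and then converges, while the corresponding linearized (NTK) model simply diverges. The initialization and learning rate here are illustrative assumptions.

    # Minimal sketch (assumed toy setup, not the paper's construction):
    # catapult behaviour of a quadratic-in-parameters model vs. its linearization.
    import numpy as np

    def train_quadratic(u, v, lr, steps=200):
        """GD on f = u * v (quadratic in its parameters), loss 0.5 * f^2, target y = 0."""
        losses = []
        for _ in range(steps):
            f = u * v
            losses.append(0.5 * f ** 2)
            # exact gradients of 0.5 * (u*v)^2 with respect to u and v
            u, v = u - lr * f * v, v - lr * f * u
        return np.array(losses)

    def train_linearized(u, v, lr, steps=200):
        """GD on the linearization of f = u * v around initialization (the NTK / linear model)."""
        u0, v0 = u, v
        losses = []
        for _ in range(steps):
            f_lin = u0 * v0 + v0 * (u - u0) + u0 * (v - v0)
            losses.append(0.5 * f_lin ** 2)
            # the Jacobian (v0, u0) stays frozen at its initial value
            u, v = u - lr * f_lin * v0, v - lr * f_lin * u0
        return np.array(losses)

    u0, v0 = 0.1, 2.0              # small initial output; tangent kernel lambda0 = u0^2 + v0^2
    lam0 = u0 ** 2 + v0 ** 2
    lr = 2.5 / lam0                # above the critical rate 2/lambda0, below ~4/lambda0

    quad = train_quadratic(u0, v0, lr)
    lin = train_linearized(u0, v0, lr)
    print(f"lr = {lr:.3f}, critical 2/lambda0 = {2 / lam0:.3f}")
    print(f"quadratic model:  loss start {quad[0]:.4f}, peak {quad.max():.4f}, final {quad[-1]:.2e}")
    print(f"linearized model: final loss {lin[-1]:.2e} (diverges once lr > 2/lambda0)")

Running this, the quadratic model's loss rises by roughly an order of magnitude before decaying to zero (the tangent kernel u^2 + v^2 shrinks until the dynamics re-enter the stable regime), whereas the linearized model's loss grows without bound at the same learning rate.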

Related research

01/18/2023 - Catapult Dynamics and Phase Transitions in Quadratic Nets
Neural networks trained with gradient descent can undergo non-trivial ph...

10/03/2019 - Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks
Recent theoretical work has established connections between over-paramet...

06/08/2022 - Neural Collapse: A Review on Modelling Principles and Generalization
With a recent observation of the "Neural Collapse (NC)" phenomena by Pap...

09/22/2021 - Robust Generalization of Quadratic Neural Networks via Function Identification
A key challenge facing deep learning is that neural networks are often n...

06/04/2019 - Towards Task and Architecture-Independent Generalization Gap Predictors
Can we use deep learning to predict when deep learning works? Our result...

02/15/2021 - A generalized quadratic loss for SVM and Deep Neural Networks
We consider some supervised binary classification tasks and a regression...

09/26/2019 - The Implicit Bias of Depth: How Incremental Learning Drives Generalization
A leading hypothesis for the surprising generalization of neural network...
