The Implicit Bias of Gradient Descent on Generalized Gated Linear Networks

02/05/2022
by Samuel Lippl, et al.

Understanding the asymptotic behavior of gradient-descent training of deep neural networks is essential for revealing inductive biases and improving network performance. We derive the infinite-time training limit of a mathematically tractable class of deep nonlinear neural networks, gated linear networks (GLNs), and generalize these results to gated networks described by general homogeneous polynomials. We study the implications of our results, focusing first on two-layer GLNs. We then apply our theoretical predictions to GLNs trained on MNIST and show how architectural constraints and the implicit bias of gradient descent affect performance. Finally, we show that our theory captures a substantial portion of the inductive bias of ReLU networks. By making the inductive bias explicit, our framework is poised to inform the development of more efficient, biologically plausible, and robust learning algorithms.
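For readers unfamiliar with the architecture, the sketch below illustrates the key property the paper exploits: in a GLN, each unit's gate is a fixed function of the input rather than of the trainable weights, so for any fixed gating pattern the network is linear in each weight matrix and, with a trained readout, homogeneous in its parameters as a whole. This is a minimal illustrative sketch, not the authors' code; the halfspace gating, dimensions, and variable names (V, W, a) are assumptions chosen for clarity.

```python
import numpy as np

# Minimal two-layer gated linear network (GLN) sketch.
# Assumption: each hidden unit's gate is a fixed halfspace indicator of the
# input x, so the gates do not depend on the trainable weights W or a.
rng = np.random.default_rng(0)
d, h = 8, 16                       # input dimension, hidden width (illustrative)

V = rng.normal(size=(h, d))        # fixed gating hyperplanes (not trained)
W = 0.1 * rng.normal(size=(h, d))  # trainable per-unit linear weights
a = 0.1 * rng.normal(size=h)       # trainable linear readout

def gln_forward(x):
    gates = (V @ x > 0).astype(float)  # gating pattern: a function of x only
    hidden = gates * (W @ x)           # gated linear units: g_i(x) * <w_i, x>
    return a @ hidden                  # degree-2 homogeneous in (W, a)

x = rng.normal(size=d)
print(gln_forward(x))
```

Because the output is homogeneous in the parameters, known results on the implicit bias of gradient descent for homogeneous models (convergence in direction to a max-margin solution under exponential-type classification losses) apply; the abstract's generalization to networks described by homogeneous polynomials places GLNs in exactly this setting.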

research 11/22/2021
Depth Without the Magic: Inductive Bias of Natural Gradient Descent
In gradient descent, changing how we parametrize the model can lead to d...

research 10/18/2022
Disentangling the Predictive Variance of Deep Ensembles through the Neural Tangent Kernel
Identifying unfamiliar inputs, also known as out-of-distribution (OOD) d...

research 07/07/2020
Gradient Descent Converges to Ridgelet Spectrum
Deep learning achieves a high generalization performance in practice, de...

research 07/21/2022
The Neural Race Reduction: Dynamics of Abstraction in Gated Networks
Our theoretical understanding of deep learning has not kept pace with it...

research 10/31/2022
Globally Gated Deep Linear Networks
Recently proposed Gated Linear Networks present a tractable nonlinear ne...

research 10/19/2020
Parameter Norm Growth During Training of Transformers
The capacity of neural networks like the widely adopted transformer is k...

research 10/24/2020
Inductive Bias of Gradient Descent for Exponentially Weight Normalized Smooth Homogeneous Neural Nets
We analyze the inductive bias of gradient descent for weight normalized ...
