Training invariances and the low-rank phenomenon: beyond linear networks

01/28/2022
by Thien Le, et al.

The implicit bias induced by the training of neural networks has become a topic of rigorous study. In the limit of gradient flow, and for gradient descent with an appropriate step size, it has been shown that when a deep linear network is trained with logistic or exponential loss on linearly separable data, the weights converge to rank-1 matrices. In this paper, we extend this theoretical result to the much wider class of nonlinear ReLU-activated feedforward networks containing fully-connected layers and skip connections. To the best of our knowledge, this is the first time the low-rank phenomenon has been proven rigorously for these architectures, and it reflects empirical results in the literature. The proof relies on specific local training invariances, sometimes referred to as alignment, which we show hold for a wide class of ReLU architectures, together with a decomposition of the network into a multilinear function and another ReLU network whose weights are constant under a certain parameter directional convergence.
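The rank-1 convergence for deep linear networks can be checked numerically. Below is a minimal sketch (not the authors' code) that trains a three-layer linear network with logistic loss on synthetic linearly separable data by plain gradient descent, then reports the ratio of the second to the first singular value of each weight matrix; the theory predicts this ratio shrinks toward zero as training proceeds. All dimensions and hyperparameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linearly separable data: labels are the sign of a fixed direction.
n, d = 200, 10
X = rng.standard_normal((n, d))
y = np.sign(X @ rng.standard_normal(d))  # +/-1 labels, separable by construction

# Deep *linear* network f(x) = W3 W2 W1 x (widths are arbitrary choices).
Ws = [0.1 * rng.standard_normal(s) for s in [(16, d), (16, 16), (1, 16)]]

lr, steps = 0.2, 20000
for _ in range(steps):
    P = Ws[2] @ Ws[1] @ Ws[0]                     # end-to-end linear map, (1, d)
    z = (P @ X.T).ravel()                         # network outputs
    # Logistic loss l(z) = log(1 + exp(-y z)); dl/dz = -y * sigmoid(-y z).
    g = (-y / (1.0 + np.exp(np.clip(y * z, -30, 30))))[None, :] / n
    dP = g @ X                                    # gradient w.r.t. the product P
    grads = [(Ws[2] @ Ws[1]).T @ dP,              # chain rule through the product
             Ws[2].T @ dP @ Ws[0].T,
             dP @ (Ws[1] @ Ws[0]).T]
    Ws = [W - lr * G for W, G in zip(Ws, grads)]

# Rank-1 convergence: the top singular value should dominate all others.
for k, W in enumerate(Ws, 1):
    s = np.linalg.svd(W, compute_uv=False)
    ratio = s[1] / s[0] if len(s) > 1 else 0.0
    print(f"layer {k}: sigma1 = {s[0]:.3f}, sigma2/sigma1 = {ratio:.2e}")
```

With separable data the logistic loss drives the weight norms to grow, and gradient descent aligns each layer with a single direction; on a run like this the printed sigma2/sigma1 ratios should be small and keep shrinking with more training steps.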

