Balancedness and Alignment are Unlikely in Linear Neural Networks

We study the invariance properties of alignment in linear neural networks trained with gradient descent. Alignment of weight matrices is a form of implicit regularization, and previous work has studied this phenomenon in fully connected networks with one-dimensional outputs. For such networks, we prove that there exists an initialization under which adjacent layers remain aligned throughout training for any real-valued loss function. We then define alignment for fully connected networks with multidimensional outputs and prove that it generally cannot be an invariant for such networks under the squared loss. Moreover, we characterize the datasets for which alignment is possible. Next, we analyze networks with layer constraints, such as convolutional networks. In particular, we prove that gradient descent is equivalent to projected gradient descent, and show that alignment is impossible given sufficiently large datasets. Importantly, since our definition of alignment is a relaxation of balancedness, our negative results extend to balancedness as well.
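To make the negative result for multidimensional outputs more concrete, the following is a minimal sketch and not the paper's construction or proof. It assumes the definition of balancedness common in the earlier literature for a two-layer linear network f(x) = W2 W1 x, namely W2^T W2 = W1 W1^T; the abstract itself does not state this definition. The helper names (`matmul`, `balancedness_gap`, `gd_step`), the toy data, and the step size are all hypothetical choices made for illustration. The sketch starts from a perfectly balanced pair of layers, takes one gradient descent step on the squared loss, and measures how far the layers have moved from balancedness.

```python
# Pure-Python sketch: does one gradient descent step preserve balancedness?
# Assumption (not from the abstract): balancedness means W2^T W2 == W1 W1^T.

def matmul(A, B):
    """Matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def frobenius(A):
    return sum(x * x for row in A for x in row) ** 0.5

def balancedness_gap(W1, W2):
    """Frobenius norm of W2^T W2 - W1 W1^T (zero iff the layers are balanced)."""
    return frobenius(sub(matmul(transpose(W2), W2), matmul(W1, transpose(W1))))

def gd_step(W1, W2, X, Y, lr):
    """One gradient descent step on the loss 0.5 * ||W2 W1 X - Y||_F^2."""
    residual = sub(matmul(W2, matmul(W1, X)), Y)
    G = matmul(residual, transpose(X))      # gradient w.r.t. the product W2 W1
    grad_W1 = matmul(transpose(W2), G)      # chain rule through W2
    grad_W2 = matmul(G, transpose(W1))      # chain rule through W1
    new_W1 = [[w - lr * g for w, g in zip(rw, rg)] for rw, rg in zip(W1, grad_W1)]
    new_W2 = [[w - lr * g for w, g in zip(rw, rg)] for rw, rg in zip(W2, grad_W2)]
    return new_W1, new_W2

# Toy problem with 2-dimensional inputs and 2-dimensional outputs.
identity = [[1.0, 0.0], [0.0, 1.0]]
X = identity                      # two training inputs, stored as columns
Y = [[1.0, 1.0], [0.0, 1.0]]      # targets, stored as columns
W1, W2 = identity, identity       # balanced initialization
lr = 0.1

gap_before = balancedness_gap(W1, W2)
W1_next, W2_next = gd_step(W1, W2, X, Y, lr)
gap_after = balancedness_gap(W1_next, W2_next)

print("gap before step:", gap_before)
print("gap after step: ", gap_after)
```

In this toy case the gap is zero at initialization and becomes lr² · √2 after one step. The first-order terms in the step size cancel, and the remaining second-order term vanishes only when the gradient matrix `G` satisfies G^T G = G G^T, which the chosen targets violate. The example checks balancedness only. It says nothing directly about the weaker alignment property that the abstract defines.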
