Implicit Bias of Linear Equivariant Networks

10/12/2021
by Hannah Lawrence, et al.

Group equivariant convolutional neural networks (G-CNNs) are generalizations of convolutional neural networks (CNNs) which excel in a wide range of scientific and technical applications by explicitly encoding group symmetries, such as rotations and permutations, in their architectures. Although the success of G-CNNs is driven by the explicit symmetry bias of their convolutional architecture, a recent line of work has proposed that the implicit bias of training algorithms on a particular parameterization (or architecture) is key to understanding generalization for overparameterized neural nets. In this context, we show that L-layer full-width linear G-CNNs trained via gradient descent in a binary classification task converge to solutions with low-rank Fourier matrix coefficients, regularized by the 2/L-Schatten matrix norm. Our work strictly generalizes previous analysis on the implicit bias of linear CNNs to linear G-CNNs over all finite groups, including the challenging setting of non-commutative symmetry groups (such as permutations). We validate our theorems via experiments on a variety of groups and empirically explore more realistic nonlinear networks, which locally capture similar regularization patterns. Finally, we provide intuitive interpretations of our Fourier space implicit regularization results in real space via uncertainty principles.
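To make the Fourier-space regularizer concrete, below is a minimal illustrative sketch (not code from the paper) of the 2/L-Schatten quasi-norm evaluated on the group Fourier coefficients of a linear predictor. For the cyclic group Z_n the group Fourier transform reduces to the ordinary DFT and every Fourier coefficient is a 1x1 matrix; the function name, the layer count L, and the random predictor beta are assumptions made for the example.

```python
import numpy as np

def schatten_quasi_norm(mats, p):
    """Schatten-p quasi-norm of a collection of (possibly 1x1) Fourier
    coefficient matrices: (sum over all singular values sigma of sigma**p)**(1/p).
    For p = 2/L with L > 2 layers, p < 1 and this is a quasi-norm whose
    minimization promotes low-rank (sparse-spectrum) solutions."""
    total = sum(np.sum(np.linalg.svd(M, compute_uv=False) ** p) for M in mats)
    return total ** (1.0 / p)

# Illustrative example for the cyclic group Z_n, where the group Fourier
# transform is the ordinary DFT and each Fourier coefficient is a 1x1 matrix.
L = 3                      # number of layers (assumed for illustration)
n = 8                      # group size
beta = np.random.randn(n)  # stand-in for the end-to-end linear predictor

fourier_coeffs = [np.array([[c]]) for c in np.fft.fft(beta)]
print(schatten_quasi_norm(fourier_coeffs, p=2.0 / L))
```

As L grows, p = 2/L shrinks, so minimizing this quantity increasingly favors predictors whose Fourier coefficient matrices have few nonzero singular values, i.e. low rank, which is the regularization pattern the abstract describes. For non-commutative groups the coefficients become genuine matrices (one per irreducible representation), but the same function applies unchanged.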


Related research

Universal Approximation Theorem for Equivariant Maps by Group CNNs (12/27/2020)
Group symmetry is inherent in a wide variety of data distributions. Data...

Implicit Neural Convolutional Kernels for Steerable CNNs (12/12/2022)
Steerable convolutional neural networks (CNNs) provide a general framewo...

Learning Equivariances and Partial Equivariances from Data (10/19/2021)
Group equivariant Convolutional Neural Networks (G-CNNs) constrain featu...

Lattice gauge symmetry in neural networks (11/08/2021)
We review a novel neural network architecture called lattice gauge equiv...

Explicitly Bayesian Regularizations in Deep Learning (10/22/2019)
Generalization is essential for deep learning. In contrast to previous w...

A General Framework For Proving The Equivariant Strong Lottery Ticket Hypothesis (06/09/2022)
The Strong Lottery Ticket Hypothesis (SLTH) stipulates the existence of ...

Inductive Bias of Multi-Channel Linear Convolutional Networks with Bounded Weight Norm (02/24/2021)
We study the function space characterization of the inductive bias resul...
