
Implicit Bias of Linear Equivariant Networks

by Hannah Lawrence et al.

Group equivariant convolutional neural networks (G-CNNs) are generalizations of convolutional neural networks (CNNs) which excel in a wide range of scientific and technical applications by explicitly encoding group symmetries, such as rotations and permutations, in their architectures. Although the success of G-CNNs is driven by the explicit symmetry bias of their convolutional architecture, a recent line of work has proposed that the implicit bias of training algorithms on a particular parameterization (or architecture) is key to understanding generalization for overparameterized neural networks. In this context, we show that L-layer full-width linear G-CNNs trained via gradient descent on a binary classification task converge to solutions with low-rank Fourier matrix coefficients, regularized by the 2/L-Schatten matrix norm. Our work strictly generalizes previous analysis on the implicit bias of linear CNNs to linear G-CNNs over all finite groups, including the challenging setting of non-commutative symmetry groups (such as permutations). We validate our theorems via experiments on a variety of groups and empirically explore more realistic nonlinear networks, which locally capture similar regularization patterns. Finally, we provide intuitive interpretations of our Fourier-space implicit regularization results in real space via uncertainty principles.
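To make the regularizer concrete, the sketch below computes a Schatten-type penalty (sum of p-th powers of singular values with p = 2/L) over the Fourier blocks of a linear map. For the cyclic group (an ordinary circular CNN), every Fourier "matrix" coefficient is a 1x1 scalar, so the penalty reduces to an l_{2/L} quasi-norm on the DFT of the end-to-end filter, recovering the earlier linear-CNN result the abstract refers to. The depth L, the example filter, and the exact normalization of the penalty are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def schatten_penalty(blocks, p):
    """Sum of p-th powers of singular values over a list of Fourier
    blocks -- one common form of a Schatten-p quasi-norm penalty
    (normalization here is an assumption for illustration)."""
    return sum(np.sum(np.linalg.svd(B, compute_uv=False) ** p)
               for B in blocks)

# For the cyclic group Z_n, the group Fourier transform is the DFT and
# each Fourier coefficient is a 1x1 block, so low Schatten-2/L penalty
# means the end-to-end filter is sparse in frequency.
L = 3                                  # hypothetical network depth
w = np.array([1.0, 0.5, 0.0, 0.25])   # hypothetical end-to-end filter
blocks = [np.array([[c]]) for c in np.fft.fft(w)]
penalty = schatten_penalty(blocks, 2.0 / L)
```

For non-commutative groups the blocks are genuine matrices (one per irreducible representation), and the same function applies unchanged, which is where the low-rank structure of the Fourier coefficients becomes visible.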

