Understanding the Covariance Structure of Convolutional Filters

10/07/2022
by Asher Trockman et al.

Neural network weights are typically initialized at random from univariate distributions, controlling just the variance of individual weights even in highly-structured operations like convolutions. Recent ViT-inspired convolutional networks such as ConvMixer and ConvNeXt use large-kernel depthwise convolutions whose learned filters have notable structure; this presents an opportunity to study their empirical covariances. In this work, we first observe that such learned filters have highly-structured covariance matrices, and moreover, we find that covariances calculated from small networks may be used to effectively initialize a variety of larger networks of different depths, widths, patch sizes, and kernel sizes, indicating a degree of model-independence to the covariance structure. Motivated by these findings, we then propose a learning-free multivariate initialization scheme for convolutional filters using a simple, closed-form construction of their covariance. Models using our initialization outperform those using traditional univariate initializations, and typically meet or exceed the performance of those initialized from the covariances of learned filters; in some cases, this improvement can be achieved without training the depthwise convolutional filters at all.
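The abstract does not spell out the paper's closed-form covariance, but the core idea of a multivariate filter initialization can be sketched. Below is a minimal Python/NumPy sketch that draws each depthwise filter from N(0, Σ), using a generic squared-exponential (RBF) covariance over kernel positions as a stand-in for the authors' construction; the function names, the length_scale parameter, and the 1/k fan-in rescaling are illustrative assumptions, not the paper's method.

```python
import numpy as np

def rbf_filter_covariance(k, length_scale=2.0):
    """Covariance over the k*k taps of a filter: correlation decays with
    the squared Euclidean distance between kernel positions (RBF kernel)."""
    coords = np.stack(
        np.meshgrid(np.arange(k), np.arange(k), indexing="ij"), axis=-1
    ).reshape(-1, 2)
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * length_scale**2))  # (k*k, k*k), PSD

def sample_filters(n_filters, k, length_scale=2.0, seed=0):
    """Draw each filter from N(0, Sigma) via a Cholesky factor of Sigma."""
    rng = np.random.default_rng(seed)
    sigma = rbf_filter_covariance(k, length_scale)
    L = np.linalg.cholesky(sigma + 1e-8 * np.eye(k * k))  # jitter for stability
    z = rng.standard_normal((n_filters, k * k))
    w = z @ L.T  # rows now have covariance L @ L.T = Sigma
    # Rescale so each weight's variance matches a 1/fan_in rule (fan_in = k*k
    # for a depthwise filter), analogous to standard univariate schemes.
    w *= 1.0 / k
    return w.reshape(n_filters, k, k)

filters = sample_filters(n_filters=256, k=9)
print(filters.shape)  # (256, 9, 9)
```

Filters drawn this way can be copied into a depthwise convolution's weight tensor in place of the default univariate initialization; the RBF choice here simply encodes the intuition from the abstract that learned filters have structured covariance, with nearby taps more strongly correlated than distant ones.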


