Reparameterization through Spatial Gradient Scaling

03/05/2023
by Alexander Detkov, et al.

Reparameterization aims to improve the generalization of deep neural networks by transforming convolutional layers into equivalent multi-branched structures during training. However, how reparameterization changes and benefits the learning process of neural networks remains poorly understood. In this paper, we present a novel spatial gradient scaling method to redistribute learning focus among weights in convolutional networks. We prove that spatial gradient scaling achieves the same learning dynamics as a branched reparameterization, yet without introducing structural changes into the network. We further propose an analytical approach that dynamically learns scalings for each convolutional layer based on the spatial characteristics of its input feature map, gauged by mutual information. Experiments on CIFAR-10, CIFAR-100, and ImageNet show that, without searching for reparameterized structures, our proposed scaling method outperforms state-of-the-art reparameterization strategies at a lower computational cost.
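The core mechanism described above, rescaling each spatial position of a convolutional weight's gradient rather than adding branches, can be sketched with a gradient hook in PyTorch. This is a minimal illustration only: the 3x3 scaling values below are arbitrary, not the mutual-information-based scalings the paper learns per layer.

```python
import torch
import torch.nn as nn

# A plain 3x3 convolution; spatial gradient scaling leaves its
# structure untouched (no extra branches are added).
conv = nn.Conv2d(3, 8, kernel_size=3, padding=1, bias=False)

# One scaling factor per spatial kernel position. These values are
# illustrative (center emphasized, borders de-emphasized); the paper
# instead derives them from the input feature map's spatial statistics.
scaling = torch.tensor([[0.5, 0.8, 0.5],
                        [0.8, 1.0, 0.8],
                        [0.5, 0.8, 0.5]])

# Multiply the weight gradient elementwise by the spatial scaling;
# broadcasting applies the same 3x3 map to every (out, in) filter pair.
conv.weight.register_hook(lambda g: g * scaling)

x = torch.randn(1, 3, 16, 16)
conv(x).sum().backward()
print(conv.weight.grad.shape)  # torch.Size([8, 3, 3, 3])
```

Because the scaling acts only on gradients, the forward pass (and thus inference cost) is identical to the unmodified network, which is the claimed advantage over structural reparameterization.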


