On Implicit Filter Level Sparsity in Convolutional Neural Networks

11/29/2018
by Dushyant Mehta, et al.

We investigate the filter-level sparsity that emerges in convolutional neural networks (CNNs) which employ Batch Normalization and ReLU activations, and which are trained with adaptive gradient descent techniques and L2 regularization (or weight decay). We conduct an extensive experimental study, casting our initial findings into hypotheses and conclusions about the mechanisms underlying the emergent filter-level sparsity. This study offers new insight into the performance gap observed in practice between adaptive and non-adaptive gradient descent methods. Further, our analysis of the effect of training strategies and hyperparameters on the sparsity leads to practical suggestions for designing CNN training strategies, enabling us to explore the trade-offs between feature selectivity, network capacity, and generalization performance. Lastly, we show that the implicit sparsity can be harnessed for neural network speedup on par with, or better than, explicit sparsification/pruning approaches, without requiring any modifications to the typical training pipeline.

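Because the training pipeline is left unchanged, the emergent sparsity can be read off directly from the learned parameters. Below is a minimal PyTorch sketch (not the authors' code; the architecture, optimizer settings, and the 1e-3 threshold are illustrative assumptions) that counts convolutional filters whose BatchNorm scale (gamma) has collapsed toward zero, which is one way to quantify the filter-level sparsity described above.

```python
# Minimal sketch: measure emergent filter-level sparsity in a Conv-BN-ReLU network
# trained with an adaptive optimizer and L2 regularization (weight decay).
# Architecture, hyperparameters, and the sparsity threshold are illustrative assumptions.
import torch
import torch.nn as nn


def conv_bn_relu(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


model = nn.Sequential(
    conv_bn_relu(3, 64),
    conv_bn_relu(64, 128),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(128, 10),
)

# Adaptive gradient descent with L2 regularization (weight decay): the setting
# in which the paper reports the strongest emergent filter-level sparsity.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)


@torch.no_grad()
def filter_sparsity(model, threshold=1e-3):
    """Fraction of filters whose BatchNorm scale |gamma| falls below `threshold`.

    Filters with near-zero gamma produce (nearly) constant outputs, so they can
    be pruned after training with little impact on accuracy.
    """
    total, inactive = 0, 0
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            gamma = m.weight.abs()
            total += gamma.numel()
            inactive += (gamma < threshold).sum().item()
    return inactive / max(total, 1)


# After training, e.g.:
# print(f"filter-level sparsity: {filter_sparsity(model):.1%}")
```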
