Take it in your stride: Do we need striding in CNNs?

12/07/2017
by Chen Kong, et al.

Since their inception, CNNs have utilized some type of striding operator to reduce the overlap of receptive fields and spatial dimensions. Although it has clear heuristic motivations (e.g., lowering the number of parameters to learn), the mathematical role of striding within CNN learning remains unclear. This paper offers a novel and mathematically rigorous perspective on the role of the striding operator within modern CNNs. Specifically, we demonstrate theoretically that a CNN that incorporates striding can always be represented by an equivalent non-striding CNN with more, smaller filters. Through this equivalence we are then able to characterize striding as an additional mechanism for parameter sharing among channels, thus reducing training complexity. Finally, the framework presented in this paper offers a new mathematical perspective on the role of striding which we hope will facilitate and simplify future theoretical analysis of CNNs.
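A minimal numerical sketch of the kind of equivalence the abstract describes (this is an illustrative special case, not the paper's exact construction): a stride-2, 2x2 convolution over one channel can be reproduced by a stride-1, 1x1 convolution applied to a space-to-depth (pixel-unshuffled) rearrangement of the input, i.e. by a non-striding layer with more, smaller filters. All tensor shapes and names below are assumptions chosen for the example.

import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 1, 8, 8)            # one-channel 8x8 input
w = torch.randn(16, 1, 2, 2)           # 16 filters, 2x2 kernel

# Strided layer: 2x2 kernel, stride 2.
y_strided = F.conv2d(x, w, stride=2)   # shape (1, 16, 4, 4)

# Equivalent non-strided layer: rearrange each non-overlapping 2x2 block
# into 4 channels, then apply 1x1 filters whose weights are the flattened
# 2x2 kernels (more filters per channel, smaller spatial support).
x_rearranged = F.pixel_unshuffle(x, downscale_factor=2)   # (1, 4, 4, 4)
w_1x1 = w.reshape(16, 4, 1, 1)                            # 1x1 filters
y_unstrided = F.conv2d(x_rearranged, w_1x1, stride=1)

print(torch.allclose(y_strided, y_unstrided, atol=1e-6))  # True

The check prints True because each output pixel of the strided convolution is a weighted sum over one 2x2 input block, and pixel-unshuffling simply stacks those four block entries into channels so the same weighted sum becomes a 1x1 convolution across channels.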

