Learning strides in convolutional neural networks

02/03/2022
by Rachid Riad, et al.

Convolutional neural networks typically contain several downsampling operators, such as strided convolutions or pooling layers, that progressively reduce the resolution of intermediate representations. This provides some shift-invariance while reducing the computational complexity of the whole architecture. A critical hyperparameter of such layers is their stride: the integer factor of downsampling. As strides are not differentiable, finding the best configuration requires either cross-validation or discrete optimization (e.g. architecture search), both of which rapidly become prohibitive as the search space grows exponentially with the number of downsampling layers. Hence, exploring this search space by gradient descent would allow finding better configurations at a lower computational cost. This work introduces DiffStride, the first downsampling layer with learnable strides. Our layer learns the size of a cropping mask in the Fourier domain, which effectively performs resizing in a differentiable way. Experiments on audio and image classification show the generality and effectiveness of our solution: we use DiffStride as a drop-in replacement for standard downsampling layers and outperform them. In particular, we show that introducing our layer into a ResNet-18 architecture maintains consistently high performance on CIFAR10, CIFAR100 and ImageNet even when training starts from poor random stride configurations. Moreover, formulating strides as learnable variables allows us to introduce a regularization term that controls the computational complexity of the architecture. We show how this regularization allows trading off accuracy for efficiency on ImageNet.
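
The core idea of resizing by cropping in the Fourier domain can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the linear-ramp mask shape, and the `smoothness` parameter are assumptions made for illustration, and NumPy provides no automatic differentiation. The point is only that a real-valued `stride` controls the mask width continuously, so in an autodiff framework a gradient could reach it through the mask values.

```python
import numpy as np


def soft_mask(n, stride, smoothness=2.0):
    """1D low-pass mask over centered frequency bins, with values in [0, 1].

    Assumed form (illustrative only): full weight up to a cutoff of
    n / (2 * stride) bins from DC, then a linear ramp down to zero
    over `smoothness` bins.
    """
    dist = np.abs(np.arange(n) - n // 2)  # distance from DC after fftshift
    cutoff = n / (2.0 * stride)
    return np.clip((cutoff + smoothness - dist) / smoothness, 0.0, 1.0)


def fourier_downsample(x, stride, smoothness=2.0):
    """Downsample a 2D real array by masking and cropping its spectrum."""
    h, w = x.shape
    spec = np.fft.fftshift(np.fft.fft2(x))  # move DC to the center

    mask_h = soft_mask(h, stride, smoothness)
    mask_w = soft_mask(w, stride, smoothness)
    spec = spec * np.outer(mask_h, mask_w)  # separable 2D mask

    # Crop to the bins where the mask is nonzero. The ramp widens the kept
    # band, so the output is somewhat larger than n / stride per axis.
    spec = spec[mask_h > 0][:, mask_w > 0]
    out_h, out_w = spec.shape

    y = np.fft.ifft2(np.fft.ifftshift(spec))
    # Rescale so that amplitudes (e.g. the mean) match the input.
    return np.real(y) * (out_h * out_w) / (h * w)


if __name__ == "__main__":
    image = np.random.default_rng(0).standard_normal((32, 32))
    for s in (1.0, 2.0, 2.5):
        print(s, fourier_downsample(image, s).shape)
```

With `stride=1.0` the mask equals one everywhere and the input is returned unchanged; larger, possibly non-integer, strides keep a narrower band of frequencies and therefore yield a smaller output.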

