DNArch: Learning Convolutional Neural Architectures by Backpropagation

02/10/2023
by David W. Romero, et al.

We present Differentiable Neural Architectures (DNArch), a method that jointly learns the weights and the architecture of Convolutional Neural Networks (CNNs) by backpropagation. In particular, DNArch allows learning (i) the size of convolutional kernels at each layer, (ii) the number of channels at each layer, (iii) the position and values of downsampling layers, and (iv) the depth of the network. To this end, DNArch views neural architectures as continuous multidimensional entities and uses learnable differentiable masks along each dimension to control their size. Unlike existing methods, DNArch is not limited to a predefined set of possible neural components, but can instead discover entire CNN architectures across all combinations of kernel sizes, widths, depths, and downsampling. Empirically, DNArch finds performant CNN architectures for several classification and dense prediction tasks on both sequential and image data. When combined with a loss term that considers network complexity, DNArch finds powerful architectures that respect a predefined computational budget.
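Because the masks are learnable and differentiable, gradients from the task loss flow into the architecture itself. Below is a minimal PyTorch sketch of this idea; the sigmoid-window parameterization, the class names (SoftSizeMask, MaskedConv1d), and the cost proxy in complexity() are illustrative assumptions, not the paper's exact formulation.

```python
# Illustrative sketch only: DNArch's actual masks and cost model differ in detail.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftSizeMask(nn.Module):
    """Differentiable mask over one architectural dimension. A learnable
    'radius' controls how many positions are effectively active."""

    def __init__(self, max_size, init_radius, temperature=10.0, symmetric=True):
        super().__init__()
        self.radius = nn.Parameter(torch.tensor(init_radius))  # learned by backprop
        self.temperature = temperature
        if symmetric:
            # kernel axis: the active window grows outward from the center
            coords = torch.linspace(-1.0, 1.0, max_size).abs()
        else:
            # channel axis: the active window grows from index 0
            coords = torch.linspace(0.0, 1.0, max_size)
        self.register_buffer("coords", coords)

    def forward(self):
        # Smooth step: ~1 inside the radius, ~0 outside, differentiable everywhere.
        return torch.sigmoid(self.temperature * (self.radius - self.coords))


class MaskedConv1d(nn.Module):
    """Conv layer whose effective kernel size and width are set by masks."""

    def __init__(self, in_ch, max_out_ch, max_kernel=31):
        super().__init__()
        self.conv = nn.Conv1d(in_ch, max_out_ch, max_kernel, padding=max_kernel // 2)
        self.kernel_mask = SoftSizeMask(max_kernel, init_radius=0.3)
        self.channel_mask = SoftSizeMask(max_out_ch, init_radius=0.5, symmetric=False)

    def forward(self, x):
        w = self.conv.weight * self.kernel_mask()            # mask kernel positions
        y = F.conv1d(x, w, self.conv.bias, padding=self.conv.padding[0])
        return y * self.channel_mask().view(1, -1, 1)        # mask output channels

    def complexity(self):
        # Soft proxy for the layer's cost: active kernel taps x active channels.
        return self.kernel_mask().sum() * self.channel_mask().sum()
```

A complexity-aware objective along the lines of the abstract's budget term could then look like this (the penalty weight 1e-4 is an arbitrary placeholder):

```python
layer = MaskedConv1d(in_ch=3, max_out_ch=64)
x = torch.randn(8, 3, 128)
task_loss = layer(x).pow(2).mean()        # stand-in for a real task loss
budget_loss = 1e-4 * layer.complexity()   # penalize cost to respect a budget
(task_loss + budget_loss).backward()      # weights and masks trained jointly
```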

Related research

06/07/2022 · Towards a General Purpose CNN for Long Range Dependencies in ND
The use of Convolutional Neural Networks (CNNs) is widespread in Deep Le...

10/15/2021 · FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes
When designing Convolutional Neural Networks (CNNs), one must select the...

10/14/2020 · Differentiable Implicit Layers
In this paper, we introduce an efficient backpropagation scheme for non-...

06/11/2020 · Multigrid-in-Channels Architectures for Wide Convolutional Neural Networks
We present a multigrid approach that combats the quadratic growth of the...

09/15/2017 · Towards CNN map representation and compression for camera relocalisation
This paper presents a study on the use of Convolutional Neural Networks ...

02/03/2022 · Learning strides in convolutional neural networks
Convolutional neural networks typically contain several downsampling ope...

06/01/2018 · Targeted Kernel Networks: Faster Convolutions with Attentive Regularization
We propose Attentive Regularization (AR), a method to constrain the acti...
