Learning Implicitly Recurrent CNNs Through Parameter Sharing

02/26/2019
by   Pedro Savarese, et al.
0

We introduce a parameter sharing scheme, in which different layers of a convolutional neural network (CNN) are defined by a learned linear combination of parameter tensors from a global bank of templates. Restricting the number of templates yields a flexible hybridization of traditional CNNs and recurrent networks. Compared to traditional CNNs, we demonstrate substantial parameter savings on standard image classification tasks, while maintaining accuracy. Our simple parameter sharing scheme, though defined via soft weights, in practice often yields trained networks with near strict recurrent structure; with negligible side effects, they convert into networks with actual loops. Training these networks thus implicitly involves discovery of suitable recurrent architectures. Though considering only the design aspect of recurrent links, our trained networks achieve accuracy competitive with those built using state-of-the-art neural architecture search (NAS) procedures. Our hybridization of recurrent and convolutional networks may also represent a beneficial architectural bias. Specifically, on synthetic tasks which are algorithmic in nature, our hybrid networks both train faster and extrapolate better to test examples outside the span of the training set.

READ FULL TEXT

page 8

page 12

page 13

research
05/28/2019

RecNets: Channel-wise Recurrent Convolutional Neural Networks

In this paper, we introduce Channel-wise recurrent convolutional neural ...
research
08/31/2019

HM-NAS: Efficient Neural Architecture Search via Hierarchical Masking

The use of automatic methods, often referred to as Neural Architecture S...
research
07/29/2019

Particle Swarm Optimisation for Evolving Deep Neural Networks for Image Classification by Evolving and Stacking Transferable Blocks

Deep Convolutional Neural Networks (CNNs) have been widely used in image...
research
07/15/2021

Recurrent Parameter Generators

We present a generic method for recurrently using the same parameters fo...
research
11/11/2022

Metaphors We Learn By

Gradient based learning using error back-propagation (“backprop”) is a w...
research
03/29/2021

Rethinking Neural Operations for Diverse Tasks

An important goal of neural architecture search (NAS) is to automate-awa...
research
10/27/2022

Deepening Neural Networks Implicitly and Locally via Recurrent Attention Strategy

More and more empirical and theoretical evidence shows that deepening ne...

Please sign up or login with your details

Forgot password? Click here to reset