ShaResNet: reducing residual network parameter number by sharing weights

02/28/2017
by Alexandre Boulch, et al.

Deep Residual Networks have reached the state of the art in many image processing tasks such as image classification. However, the cost of a gain in accuracy, in terms of depth and memory, is prohibitive: it requires a larger number of residual blocks, up to double the initial count. To tackle this problem, we propose in this paper a way to reduce the redundant information in such networks. We share the weights of convolutional layers between residual blocks operating at the same spatial scale, so the signal flows multiple times through the same convolutional layer. The resulting architecture, called ShaResNet, contains both block-specific layers and shared layers. ShaResNets are trained in exactly the same fashion as commonly used residual networks. We show, on the one hand, that they are almost as accurate as their sequential counterparts while involving fewer parameters, and on the other hand, that they are more accurate than a residual network with the same number of parameters. For example, a 152-layer-deep residual network can be reduced to 106 convolutional layers, i.e. a parameter reduction of 39%, while losing less than 0.2% accuracy on ImageNet.
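The sharing scheme described above is straightforward to express in code. Below is a minimal PyTorch sketch of one spatial stage in which every residual block reuses a single shared 3x3 convolution while keeping its own block-specific convolution and batch normalization; the module name SharedResidualStage and the exact block layout are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn as nn


class SharedResidualStage(nn.Module):
    """One spatial scale of a ShaResNet-style network (illustrative sketch).

    All blocks in the stage reuse a single 3x3 convolution (the shared
    layer); each block keeps its own batch-norm layers and its own
    block-specific 3x3 convolution, so only part of the weights is shared.
    """

    def __init__(self, channels: int, num_blocks: int):
        super().__init__()
        # One convolution shared by every block at this spatial scale.
        self.shared_conv = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        # Block-specific layers: each block has its own first conv and norms.
        self.block_convs = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False)
            for _ in range(num_blocks)
        )
        self.bn1 = nn.ModuleList(nn.BatchNorm2d(channels) for _ in range(num_blocks))
        self.bn2 = nn.ModuleList(nn.BatchNorm2d(channels) for _ in range(num_blocks))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for conv, bn1, bn2 in zip(self.block_convs, self.bn1, self.bn2):
            residual = x
            out = self.relu(bn1(conv(x)))
            # The signal passes through the same shared convolution in
            # every block of the stage.
            out = bn2(self.shared_conv(out))
            x = self.relu(out + residual)
        return x


if __name__ == "__main__":
    stage = SharedResidualStage(channels=64, num_blocks=4)
    y = stage(torch.randn(1, 64, 32, 32))
    print(y.shape)  # torch.Size([1, 64, 32, 32])
```

Because the shared convolution appears once in the parameter list but is applied in every block, a stage of this form carries roughly half the convolutional parameters of a plain residual stage of the same depth, which is the source of the parameter reduction reported above.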


Related research

Wide Residual Networks (05/23/2016)
Deep residual networks were shown to be able to scale up to thousands of...

Network Implosion: Effective Model Compression for ResNets via Static Layer Pruning and Retraining (06/10/2019)
Residual Networks with convolutional layers are widely used in the field...

When Residual Learning Meets Dense Aggregation: Rethinking the Aggregation of Deep Neural Networks (04/19/2020)
Various architectures (such as GoogLeNets, ResNets, and DenseNets) have...

Deep Residual Compensation Convolutional Network without Backpropagation (01/27/2023)
PCANet and its variants provided good accuracy results for classificatio...

Deep Collaborative Learning for Visual Recognition (03/03/2017)
Deep neural networks are playing an important role in state-of-the-art v...

Residual Shuffle-Exchange Networks for Fast Processing of Long Sequences (04/06/2020)
Attention is a commonly used mechanism in sequence processing, but it is...

Hyperplane Arrangements of Trained ConvNets Are Biased (03/17/2020)
We investigate the geometric properties of the functions learned by trai...
