Sharing Residual Units Through Collective Tensor Factorization in Deep Neural Networks

03/07/2017
by Chen Yunpeng, et al.

Residual units are widely used to alleviate optimization difficulties when building deep neural networks. However, the performance gain does not adequately compensate for the increase in model size, indicating low parameter efficiency in these residual units. In this work, we first revisit the residual function in several variants of residual units and demonstrate that these residual functions can be explained within a unified framework based on generalized block term decomposition. Then, based on this new explanation, we propose a new architecture, the Collective Residual Unit (CRU), which enhances the parameter efficiency of deep neural networks through collective tensor factorization. CRU enables knowledge sharing across different residual units through shared factors. Experimental results show that the proposed CRU Network achieves outstanding parameter efficiency, reaching classification performance comparable to ResNet-200 at the model size of ResNet-50. By building a deeper network with CRU, we achieve state-of-the-art single-model classification accuracy on the ImageNet-1k and Places365-Standard benchmark datasets. (Code and trained models are available on GitHub.)
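The core idea of sharing factors across residual units can be illustrated with a minimal sketch. This is not the paper's exact generalized block term decomposition (which operates on convolutional weight tensors); it is a simplified low-rank analogue in which each unit's weight matrix is the product of a factor shared by all units and a small unit-specific core, showing how sharing reduces the parameter count. All names and dimensions here are illustrative assumptions:

```python
import numpy as np

# Simplified sketch of collective factorization: each residual unit's
# effective weight W_i (d_out x d_in) is reconstructed as S @ C_i,
# where S is a factor shared across all units and C_i is a small
# unit-specific core. Dimensions are arbitrary illustration values.
rng = np.random.default_rng(0)
d_out, d_in, rank, num_units = 256, 256, 32, 4

S = rng.standard_normal((d_out, rank))                      # shared factor
cores = [rng.standard_normal((rank, d_in)) for _ in range(num_units)]

def unit_weight(i):
    """Reconstruct the effective weight matrix of residual unit i."""
    return S @ cores[i]

# Parameter counts: independent units vs. units with a shared factor.
independent = num_units * d_out * d_in                      # 262144
shared = S.size + sum(c.size for c in cores)                # 40960
print(independent, shared)
```

With a shared factor, the four units cost 40,960 parameters instead of 262,144, while each still produces a full 256x256 effective weight; the paper's CRU applies this sharing principle to the tensor factors inside residual functions.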


