EraseReLU: A Simple Way to Ease the Training of Deep Convolution Neural Networks

09/22/2017
by   Xuanyi Dong, et al.
0

For most state-of-the-art architectures, Rectified Linear Unit (ReLU) becomes a standard component accompanied with each layer. Although ReLU can ease the network training to an extent, the character of blocking negative values may suppress the propagation of useful information and leads to the difficulty of optimizing very deep Convolutional Neural Networks (CNNs). Moreover, stacking layers with nonlinear activations is hard to approximate the intrinsic linear transformations between feature representations. In this paper, we investigate the effect of erasing ReLUs of certain layers and apply it to various representative architectures following deterministic rules. It can ease the optimization and improve the generalization performance for very deep CNN models. We find two key factors being essential to the performance improvement: 1) the location where ReLU should be erased inside the basic module; 2) the proportion of basic modules to erase ReLU; We show that erasing the last ReLU layer of all basic modules in a network usually yields improved performance. In experiments, our approach successfully improves the performance of various representative architectures, and we report the improved results on SVHN, CIFAR-10/100, and ImageNet. Moreover, we achieve competitive single-model performance on CIFAR-100 with 16.53 state-of-the-art.

READ FULL TEXT
research
05/30/2016

Parametric Exponential Linear Unit for Deep Convolutional Neural Networks

The activation function is an important component in Convolutional Neura...
research
11/18/2015

ACDC: A Structured Efficient Linear Layer

The linear layer is one of the most pervasive modules in deep learning r...
research
09/19/2017

Training Better CNNs Requires to Rethink ReLU

With the rapid development of Deep Convolutional Neural Networks (DCNNs)...
research
06/18/2023

Learn to Enhance the Negative Information in Convolutional Neural Network

This paper proposes a learnable nonlinear activation mechanism specifica...
research
10/23/2021

Parametric Variational Linear Units (PVLUs) in Deep Convolutional Networks

The Rectified Linear Unit is currently a state-of-the-art activation fun...
research
04/02/2018

Average Biased ReLU Based CNN Descriptor for Improved Face Retrieval

The convolutional neural networks (CNN) like AlexNet, GoogleNet, VGGNet,...
research
04/20/2023

DeepReShape: Redesigning Neural Networks for Efficient Private Inference

The increasing demand for privacy and security has driven the advancemen...

Please sign up or login with your details

Forgot password? Click here to reset