Preprint: Norm Loss: An efficient yet effective regularization method for deep neural networks

03/11/2021
by Theodoros Georgiou, et al.

Convolutional neural network training can suffer from diverse issues such as exploding or vanishing gradients, scaling-based weight-space symmetry, and covariate shift. To address these issues, researchers have developed weight regularization and activation normalization methods. In this work we propose a soft weight-regularization method based on the Oblique manifold. The proposed method uses a loss function that pushes each weight vector to have a norm close to one, i.e., the weight matrix is smoothly steered toward the so-called Oblique manifold. We evaluate our method on the popular CIFAR-10, CIFAR-100, and ImageNet 2012 datasets using two state-of-the-art architectures, ResNet and Wide ResNet. Our method introduces negligible computational overhead, and the results show that it is competitive with the state of the art and in some cases superior to it. Additionally, the results are less sensitive to hyperparameter settings such as batch size and regularization factor.
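The core idea lends itself to a compact implementation. Below is a minimal PyTorch sketch of such a norm loss, assuming a per-filter penalty of the form factor * sum_i (1 - ||w_i||_2)^2 summed over all convolutional and fully connected layers; the exact penalty form, the set of regularized layers, and the factor value are illustrative assumptions here, not the paper's verbatim definition.

import torch.nn as nn

def norm_loss(model: nn.Module, factor: float = 1e-4):
    """Soft penalty steering every weight vector toward unit L2 norm.

    Sketch of the abstract's idea; the penalty form and `factor`
    default are assumptions, not the paper's exact formulation.
    """
    penalty = 0.0
    for module in model.modules():
        if isinstance(module, (nn.Conv2d, nn.Linear)):
            # One row per output filter/neuron: shape (out_features, fan_in).
            w = module.weight.reshape(module.weight.shape[0], -1)
            norms = w.norm(dim=1)  # per-filter L2 norms
            penalty = penalty + ((1.0 - norms) ** 2).sum()
    return factor * penalty

In training, this term would simply be added to the task loss, e.g. loss = criterion(logits, targets) + norm_loss(model), leaving the optimizer, learning-rate schedule, and initialization untouched, which is consistent with the negligible overhead claimed above.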

