Preprint: Norm Loss: An efficient yet effective regularization method for deep neural networks

by Theodoros Georgiou, et al.

Convolutional neural network training can suffer from diverse issues such as exploding or vanishing gradients, scaling-based weight space symmetry, and covariate shift. To address these issues, researchers have developed weight regularization methods and activation normalization methods. In this work we propose a weight soft-regularization method based on the Oblique manifold. The proposed method uses a loss function that pushes each weight vector to have a norm close to one, i.e. the weight matrix is smoothly steered toward the so-called Oblique manifold. We evaluate our method on the popular CIFAR-10, CIFAR-100 and ImageNet 2012 datasets using two state-of-the-art architectures, namely ResNet and Wide ResNet. Our method introduces negligible computational overhead, and the results show that it is competitive with the state of the art and in some cases superior to it. Additionally, the results are less sensitive to hyperparameter settings such as batch size and regularization factor.
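
The idea of a loss that pushes each weight vector toward unit norm can be illustrated with a minimal sketch. It assumes one plausible form of the penalty, the sum over weight vectors of the squared deviation of the Euclidean norm from one; the exact formulation in the paper may differ. The function names (`norm_loss`, `norm_loss_grad`, `gradient_step`) and the learning rate are hypothetical and chosen only for illustration.

```python
import math

def norm_loss(weight_rows):
    """Assumed penalty: sum over rows of (1 - ||w||_2)^2."""
    total = 0.0
    for row in weight_rows:
        norm = math.sqrt(sum(w * w for w in row))
        total += (1.0 - norm) ** 2
    return total

def norm_loss_grad(weight_rows, eps=1e-12):
    """Gradient of the assumed penalty: -2 * (1 - ||w||) * w / ||w|| per row."""
    grads = []
    for row in weight_rows:
        norm = math.sqrt(sum(w * w for w in row))
        # eps guards against division by zero for an all-zero row
        scale = -2.0 * (1.0 - norm) / max(norm, eps)
        grads.append([scale * w for w in row])
    return grads

def gradient_step(weight_rows, lr=0.1):
    """One plain gradient-descent step on the penalty alone (no task loss)."""
    grads = norm_loss_grad(weight_rows)
    return [
        [w - lr * g for w, g in zip(row, grad_row)]
        for row, grad_row in zip(weight_rows, grads)
    ]

# Each inner list stands for one flattened weight vector (e.g. one filter).
weights = [[3.0, 4.0], [0.6, 0.8]]   # row norms: 5 and 1
before = norm_loss(weights)
after = norm_loss(gradient_step(weights))
print(before, after)                 # the penalty shrinks after the step
```

In actual training such a term would be added to the task loss, scaled by a regularization factor, so the weights are nudged toward unit norm rather than projected onto it; this is what makes the constraint "soft".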





Projection Based Weight Normalization for Deep Neural Networks

Optimizing deep neural networks (DNNs) often suffers from the ill-condit...

Training Thinner and Deeper Neural Networks: Jumpstart Regularization

Neural networks are more expressive when they have multiple layers. In t...

ShakeDrop regularization

This paper proposes a powerful regularization method named ShakeDrop reg...

Adaptive Regularization of Labels

Recently, a variety of regularization techniques have been widely applie...

MMA Regularization: Decorrelating Weights of Neural Networks by Maximizing the Minimal Angles

The strong correlation between neurons or filters can significantly weak...

CBP: Backpropagation with constraint on weight precision using a pseudo-Lagrange multiplier method

Backward propagation of errors (backpropagation) is a method to minimize...

Regularizing CNNs with Locally Constrained Decorrelations

Regularization is key for deep learning since it allows training more co...