A Novel, Scale-Invariant, Differentiable, Efficient, Scalable Regularizer

01/18/2023
by Hovig Tigran Bayandorian, et al.

L_p-norm regularization schemes such as L_0-, L_1-, and L_2-norm regularization, and L_p-norm-based regularization techniques such as weight decay and group LASSO, compute a quantity that depends on model weights considered in isolation from one another. This paper describes a novel regularizer that is not based on an L_p-norm. In contrast with L_p-norm-based regularization, this regularizer is concerned with the spatial arrangement of weights within a weight matrix. The regularizer is an additive term in the loss function; it is differentiable, simple and fast to compute, and scale-invariant, requires only a trivial amount of additional memory, and is easily parallelized. Empirically, this method yields approximately an order-of-magnitude improvement in the number of nonzero model parameters at a given level of accuracy.
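The abstract does not specify the proposed regularizer itself, so the sketch below only illustrates the baseline it contrasts against: an element-wise L_p penalty added to the training loss, in which each weight contributes independently of the others. The names `lp_penalty`, `lam`, and the commented training-step usage are illustrative assumptions, not part of the paper.

```python
# Minimal sketch (not the paper's method): an element-wise L_p penalty of the
# kind the abstract contrasts against, written as an additive term on the loss.
import torch


def lp_penalty(params, p=1):
    """Sum of |w|^p over all weights.

    Each weight contributes to the penalty in isolation from every other
    weight, which is exactly the property the proposed regularizer departs
    from by instead considering the spatial arrangement of weights.
    """
    return sum(w.abs().pow(p).sum() for w in params)


# Hypothetical usage inside a training step, with `model`, `task_loss`,
# inputs `x`, targets `y`, and regularization strength `lam` assumed:
# loss = task_loss(model(x), y) + lam * lp_penalty(model.parameters(), p=1)
```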


