DeepHoyer: Learning Sparser Neural Network with Differentiable Scale-Invariant Sparsity Measures

08/27/2019
by Huanrui Yang, et al.

In seeking sparse and efficient neural network models, many prior works have enforced L1 or L0 regularizers to encourage weight sparsity during training. The L0 regularizer measures parameter sparsity directly and is invariant to the scaling of parameter values, but it provides no useful gradients and therefore requires complex optimization techniques. The L1 regularizer is differentiable almost everywhere and can be easily optimized with gradient descent, yet it is not scale-invariant: it applies the same shrinkage rate to all parameters, which is inefficient at increasing sparsity. Inspired by the Hoyer measure (the ratio between the L1 and L2 norms) used in traditional compressed sensing problems, we present DeepHoyer, a set of sparsity-inducing regularizers that are both differentiable almost everywhere and scale-invariant. Our experiments show that enforcing DeepHoyer regularizers produces even sparser neural network models than previous works at the same accuracy level. We also show that DeepHoyer can be applied to both element-wise and structural pruning.
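The abstract's key property, a sparsity penalty that is both scale-invariant and differentiable almost everywhere, can be illustrated with a short sketch. The snippet below implements one Hoyer-style penalty (the squared L1 norm divided by the squared L2 norm of a weight tensor) and adds it to a training loss in PyTorch. The function names `hoyer_square` and `regularized_loss`, the `eps` stabilizer, and the `reg_strength` value are illustrative assumptions, not the paper's exact formulation or hyperparameters.

```python
# Minimal sketch of a Hoyer-style sparsity penalty, assuming the
# element-wise form (sum |w|)^2 / sum w^2; names and values are illustrative.
import torch


def hoyer_square(w: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Scale-invariant sparsity measure: (sum |w|)^2 / sum w^2.

    Multiplying w by a nonzero constant leaves the value unchanged,
    and the expression is differentiable almost everywhere.
    """
    return w.abs().sum() ** 2 / (w.pow(2).sum() + eps)


def regularized_loss(model: torch.nn.Module,
                     task_loss: torch.Tensor,
                     reg_strength: float = 1e-4) -> torch.Tensor:
    """Add the Hoyer-style penalty of every weight matrix to the task loss."""
    reg = sum(hoyer_square(p) for p in model.parameters() if p.dim() > 1)
    return task_loss + reg_strength * reg
```

Because the ratio is unchanged when a weight tensor is rescaled, the penalty pushes small weights toward zero without uniformly shrinking large ones, which is the behavior the abstract contrasts with plain L1 regularization.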


Related research

08/09/2019  Group Pruning using a Bounded-Lp norm for Group Gating and Regularization
Deep neural networks achieve state-of-the-art results on several tasks w...

05/27/2022  Spartan: Differentiable Sparsity via Regularized Transportation
We present Spartan, a method for training sparse neural network models w...

10/14/2022  Neural Network Compression by Joint Sparsity Promotion and Redundancy Reduction
Compression of convolutional neural network models has recently been dom...

01/18/2023  A Novel, Scale-Invariant, Differentiable, Efficient, Scalable Regularizer
L_p-norm regularization schemes such as L_0, L_1, and L_2-norm regulariz...

08/07/2020  Improve Generalization and Robustness of Neural Networks via Weight Scale Shifting Invariant Regularizations
Using weight decay to penalize the L2 norms of weights in neural network...

05/20/2018  Wasserstein regularization for sparse multi-task regression
Two important elements have driven recent innovation in the field of reg...

06/06/2011  Bayesian and L1 Approaches to Sparse Unsupervised Learning
The use of L1 regularisation for sparse learning has generated immense r...
