Hierarchical Adaptive Lasso: Learning Sparse Neural Networks with Shrinkage via Single Stage Training

by Skyler Seto, et al.

Deep neural networks achieve state-of-the-art performance in a variety of tasks, but this performance is closely tied to model size. Sparsity is one approach to limiting model size. Modern techniques for inducing sparsity in neural networks are (1) network pruning, an iterative procedure in which a model is initialized with a previous run's weights, trained, and hard-thresholded; (2) single-stage training with a sparsity-inducing penalty (usually based on the Lasso); and (3) training a binary mask jointly with the weights of the network. In this work, we study different sparsity-inducing penalties from the perspective of Bayesian hierarchical models, with the goal of designing penalties that perform well without retraining subnetworks in isolation. With this motivation, we present a novel penalty called Hierarchical Adaptive Lasso (HALO), which learns to adaptively sparsify the weights of a given network via trainable parameters, without learning a mask. When used to train over-parametrized networks, our penalty yields small subnetworks with high accuracy (winning tickets) even when the subnetworks are not trained in isolation. Empirically, on the CIFAR-100 dataset, we find that HALO learns highly sparse networks (only 5% of the parameters) with gains of approximately 2% and 4% in performance over state-of-the-art magnitude pruning methods at the same level of sparsity.
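To make the idea of an adaptive, trainable per-weight penalty concrete, here is a minimal NumPy sketch of an adaptive-Lasso-style regularizer with learnable per-weight scales. This is an illustration of the general technique the abstract describes, not the exact HALO penalty from the paper: the function name, the log-space parametrization of the scales, and the quadratic regularizer on the scale parameters are all assumptions made for this sketch.

```python
import numpy as np

def adaptive_lasso_penalty(weights, log_scales, reg=1e-2):
    """Adaptive L1 penalty with trainable per-weight shrinkage strengths.

    weights    : array of network weights.
    log_scales : trainable parameters, one per weight; the effective
                 shrinkage strength is exp(log_scales), which keeps it
                 positive. A larger scale shrinks that weight harder.
    reg        : hypothetical regularizer on the scale parameters,
                 discouraging them from drifting without bound.
    """
    scales = np.exp(log_scales)  # positive per-weight shrinkage strengths
    l1_term = np.sum(scales * np.abs(weights))
    scale_term = reg * np.sum(log_scales ** 2)
    return float(l1_term + scale_term)

# With all log_scales at zero and reg=0, this reduces to the plain Lasso
# penalty sum(|w_i|); training the scales lets the model shrink
# unimportant weights aggressively while sparing important ones.
w = np.array([1.0, -2.0, 0.5])
plain = adaptive_lasso_penalty(w, np.zeros_like(w), reg=0.0)      # -> 3.5
adaptive = adaptive_lasso_penalty(w, np.log([2.0, 0.1, 2.0]))     # shrinks w[0], w[2] harder
```

In a real training loop, both the weights and the scale parameters would be updated by gradient descent on the task loss plus this penalty, which is what distinguishes this one-stage approach from iterative prune-and-retrain pipelines.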

