Structured Bayesian Compression for Deep Neural Networks Based on the Turbo-VBI Approach

02/21/2023
by Chengyu Xia et al.

With the growth of neural network size, model compression has attracted increasing interest in recent research. As one of the most common techniques, pruning has been studied for a long time. By exploiting the structured sparsity of the neural network, existing methods can prune neurons instead of individual weights. However, in most existing pruning methods, surviving neurons are randomly connected in the neural network without any structure, and the non-zero weights within each neuron are also randomly distributed. Such an irregular sparse structure can cause high control overhead and irregular memory access on hardware, and can even increase the computational complexity of the neural network. In this paper, we propose a three-layer hierarchical prior that promotes a more regular sparse structure during pruning. The proposed prior achieves both per-neuron weight-level structured sparsity and neuron-level structured sparsity. We derive an efficient Turbo variational Bayesian inference (Turbo-VBI) algorithm to solve the resulting model compression problem under the proposed prior. The Turbo-VBI algorithm has low complexity and supports more general priors than existing model compression algorithms. Simulation results show that the proposed algorithm promotes a more regular structure in the pruned neural networks while achieving better performance in terms of compression rate and inference accuracy compared with the baselines.
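To make the distinction between irregular and regular sparse structure concrete, the sketch below is a minimal toy illustration, not the paper's Turbo-VBI procedure: it contrasts unstructured pruning (zeroing individual weights, leaving survivors scattered) with neuron-level pruning (zeroing whole rows of a layer's weight matrix, leaving a regular block structure). The magnitude-based thresholds and all function names are illustrative assumptions.

```python
import numpy as np

def unstructured_prune(W, keep_ratio=0.1):
    """Zero individual weights with the smallest magnitudes.
    Surviving weights end up scattered irregularly across the matrix."""
    threshold = np.quantile(np.abs(W), 1.0 - keep_ratio)
    return np.where(np.abs(W) >= threshold, W, 0.0)

def neuron_level_prune(W, keep_ratio=0.1):
    """Zero entire output neurons (rows) with the smallest L2 norms.
    Surviving weights form regular blocks that hardware can exploit."""
    row_norms = np.linalg.norm(W, axis=1)
    n_keep = max(1, int(round(keep_ratio * W.shape[0])))
    keep_rows = np.argsort(row_norms)[-n_keep:]
    mask = np.zeros_like(W)
    mask[keep_rows, :] = 1.0
    return W * mask

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))  # toy layer: 8 output neurons, 16 inputs each
print(np.count_nonzero(unstructured_prune(W)))   # similar number of survivors ...
print(np.count_nonzero(neuron_level_prune(W)))   # ... but concentrated in whole rows
```

At comparable sparsity levels, the two masks differ only in where the non-zeros sit, which is exactly the structural regularity that the proposed hierarchical prior is designed to encourage during Bayesian pruning.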


