A Surrogate Lagrangian Relaxation-based Model Compression for Deep Neural Networks

12/18/2020
by   Deniz Gurevin, et al.

Network pruning is a widely used technique to reduce the computation cost and model size of deep neural networks. However, the typical three-stage pipeline, i.e., training, pruning, and retraining (fine-tuning), significantly increases the overall training time. For instance, the retraining process can take up to 80 epochs for ResNet-18 on ImageNet, i.e., about 70% of the original training epochs. In this paper, we develop a systematic weight-pruning optimization approach based on Surrogate Lagrangian Relaxation (SLR), which is tailored to overcome difficulties caused by the discrete nature of the weight-pruning problem while ensuring fast convergence. We decompose the weight-pruning problem into subproblems, which are coordinated by updating Lagrangian multipliers; convergence is then accelerated by using quadratic penalty terms. We evaluate the proposed method on image classification tasks, i.e., ResNet-18, ResNet-50, and VGG-16 using ImageNet and CIFAR-10, as well as object detection tasks, i.e., YOLOv3 and YOLOv3-tiny using COCO 2014, PointPillars using KITTI 2017, and Ultra-Fast-Lane-Detection using the TuSimple lane detection dataset. Numerical testing results demonstrate that our SLR-based weight-pruning optimization approach achieves high model accuracy even at the hard-pruning stage, without retraining for many epochs: on the PointPillars object detection model on the KITTI dataset, we achieve a 9.44x compression rate by retraining for only 3 epochs with less than 1% accuracy loss. As the compression rate increases, SLR starts to perform better than ADMM and the accuracy gap between them widens; SLR achieves 15.2% better accuracy than ADMM on PointPillars after pruning under 9.49x compression. Given a limited budget of retraining epochs, our approach quickly recovers the model accuracy.
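For readers who want a concrete picture of the decomposition sketched in the abstract, the snippet below is a minimal PyTorch-style coordination round in the spirit of SLR/ADMM-based pruning: the weight subproblem takes gradient steps on the task loss plus a quadratic penalty pulling each layer toward its sparse auxiliary copy, the auxiliary subproblem is a magnitude-based projection onto the sparsity constraint, and the Lagrangian multipliers are updated on the constraint residual with a decaying step size (the key point where SLR departs from a fixed-penalty ADMM update). The helper names (project_to_sparsity, slr_round), the keep ratio, and the step-size handling are illustrative assumptions, not the authors' released implementation.

```python
import torch


def project_to_sparsity(tensor, keep_ratio):
    """Keep only the largest-magnitude entries of `tensor`
    (illustrative projection onto a per-layer sparsity constraint)."""
    k = max(1, int(keep_ratio * tensor.numel()))
    threshold = torch.topk(tensor.abs().flatten(), k).values.min()
    return tensor * (tensor.abs() >= threshold).float()


def slr_round(model, loss_fn, data_loader, Z, U, rho, step_size,
              keep_ratio=0.1, lr=1e-3, device="cpu"):
    """One coordination round of an SLR-style decomposition (sketch).

    Z: dict of sparse auxiliary copies, keyed by parameter name.
    U: dict of scaled Lagrangian multipliers, keyed by parameter name.
    """
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)

    # Subproblem 1: minimize task loss + quadratic penalty over the weights W.
    for inputs, targets in data_loader:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        for name, W in model.named_parameters():
            if name in Z:
                # Penalty pulls W toward its sparse copy Z (shifted by U).
                loss = loss + (rho / 2) * torch.norm(W - Z[name] + U[name]) ** 2
        loss.backward()
        optimizer.step()

    with torch.no_grad():
        for name, W in model.named_parameters():
            if name in Z:
                # Subproblem 2: project W + U onto the sparsity constraint.
                Z[name] = project_to_sparsity(W + U[name], keep_ratio)
                # Multiplier update on the residual W - Z. SLR uses a decaying
                # step size chosen to satisfy its convergence conditions,
                # whereas plain ADMM would use the fixed penalty weight rho.
                U[name] = U[name] + step_size * (W - Z[name])
    return Z, U
```

A driver loop would initialize Z[name] = project_to_sparsity(W.detach().clone(), keep_ratio) and U[name] = torch.zeros_like(W) for each prunable layer, call slr_round for a handful of coordination rounds with a shrinking step_size, and finish with a hard-pruning step that fixes the final masks before a short fine-tuning run.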


