Practical Network Acceleration with Tiny Sets

02/16/2022
by Guo-Hua Wang, et al.

Network compression is effective in accelerating the inference of deep neural networks, but it often requires finetuning with all the training data to recover from the resulting accuracy loss. This is impractical in some applications due to data privacy concerns or constraints on the compression time budget. To deal with these issues, we propose a method named PRACTISE, which accelerates networks using only tiny sets of training images. By considering both the pruned and the unpruned parts of a compressed model, PRACTISE alleviates layer-wise error accumulation, the main drawback of previous methods. Furthermore, existing methods are confined to a few compression schemes, achieve limited latency speedup, and are unstable. In contrast, PRACTISE is stable, fast to train, versatile in handling various compression schemes, and achieves low latency. We also argue that dropping entire blocks is a better compression scheme than existing ones when only a tiny training set is available. Extensive experiments demonstrate that PRACTISE yields much higher accuracy and more stable models than state-of-the-art methods.
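To make the block-dropping idea concrete, the following is a minimal sketch, not the authors' PRACTISE implementation: it removes whole residual blocks from a torchvision ResNet-50 and then recovers accuracy by finetuning on a tiny image set while mimicking the original model's outputs. The model choice, the dropped block indices, the random stand-in images, and all hyper-parameters are illustrative assumptions.

```python
# Sketch of block dropping with tiny-set recovery (illustrative, not the paper's code).
import torch
import torch.nn as nn
import torchvision

original = torchvision.models.resnet50()
pruned = torchvision.models.resnet50()
pruned.load_state_dict(original.state_dict())

# Drop two residual blocks inside stage 3 (indices chosen for illustration only).
# Within a stage, every block after the first preserves the tensor shape, so an
# identity mapping is a drop-in replacement and directly reduces inference latency.
for idx in (2, 4):
    pruned.layer3[idx] = nn.Identity()

# Recover accuracy on a tiny set by mimicking the original model's outputs.
tiny_images = torch.randn(64, 3, 224, 224)   # stands in for a few dozen real images
original.eval()
pruned.train()
optimizer = torch.optim.SGD(pruned.parameters(), lr=1e-3, momentum=0.9)

for step in range(100):                       # a short recovery schedule
    batch = tiny_images[torch.randint(0, len(tiny_images), (16,))]
    with torch.no_grad():
        target = original(batch)              # teacher outputs, no labels needed
    loss = nn.functional.mse_loss(pruned(batch), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The paper itself chooses which blocks to drop and how to finetune the remaining (unpruned) layers far more carefully; the sketch only conveys why replacing whole blocks with identities gives a real latency reduction that can be recovered with a small amount of data.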

Related research

03/02/2023 - Practical Network Acceleration with Tiny Sets: Hypothesis, Theory, and Algorithm
Due to data privacy issues, accelerating networks with tiny training set...

11/18/2020 - Layer-Wise Data-Free CNN Compression
We present an efficient method for compressing a trained neural network ...

11/21/2019 - Few Shot Network Compression via Cross Distillation
Model compression has been widely adopted to obtain light-weighted deep ...

11/23/2022 - Pruned Lightweight Encoders for Computer Vision
Latency-critical computer vision systems, such as autonomous driving or ...

01/07/2022 - Compressing Models with Few Samples: Mimicking then Replacing
Few-sample compression aims to compress a big redundant model into a sma...

02/16/2023 - THC: Accelerating Distributed Deep Learning Using Tensor Homomorphic Compression
Deep neural networks (DNNs) are the de-facto standard for essential use ...

08/20/2020 - Compression with wildcards: All exact, or all minimal hitting sets
Our main objective is the COMPRESSED enumeration (based on wildcards) of...
