Universal Deep Neural Network Compression

02/07/2018
by Yoojin Choi, et al.

Compression of deep neural networks (DNNs) for memory- and computation-efficient compact feature representations has become a critical problem, particularly for the deployment of DNNs on resource-limited platforms. In this paper, we investigate lossy compression of DNNs by weight quantization and lossless source coding for memory-efficient inference. Whereas previous work addressed non-universal scalar quantization and entropy coding of DNN weights, we introduce, for the first time, universal DNN compression by universal vector quantization and universal source coding. In particular, we examine universal randomized lattice quantization of DNNs, which randomizes the DNN weights by uniform random dithering before lattice quantization and can perform near-optimally on any source without relying on knowledge of its probability distribution. Entropy coding schemes such as Huffman codes require prior calculation of source statistics, which is computationally costly; instead, we employ universal lossless source coding schemes such as variants of Lempel-Ziv-Welch or the Burrows-Wheeler transform. Finally, we present methods for fine-tuning vector-quantized DNNs to recover the performance loss after quantization. Our experimental results show that the proposed universal DNN compression scheme achieves compression ratios of 124.80, 47.10, and 42.46 for LeNet-5, the 32-layer ResNet, and AlexNet, respectively.
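As a rough illustration of the two ideas in the abstract, the sketch below shows one-dimensional randomized lattice quantization (a uniform dither added before rounding to a uniform grid, and subtracted again at reconstruction) followed by universal lossless coding of the quantized indices with bzip2, which is based on the Burrows-Wheeler transform. This is a minimal sketch, not the authors' implementation: the function names, the step size `delta`, and the use of Python's `bz2` module are illustrative assumptions.

```python
import bz2
import numpy as np

def randomized_lattice_quantize(weights, delta, rng):
    """Dithered uniform (1-D lattice) quantization of a weight array.

    A uniform dither in [-delta/2, delta/2) is added before rounding and
    subtracted after reconstruction, so the quantization error becomes
    independent of the source distribution -- the property that makes the
    scheme 'universal'. In practice the encoder and decoder would share
    the dither via a common pseudo-random seed, so it is never stored.
    """
    dither = rng.uniform(-delta / 2, delta / 2, size=weights.shape)
    indices = np.round((weights + dither) / delta).astype(np.int32)
    reconstructed = indices * delta - dither
    return indices, reconstructed

# Toy example: quantize synthetic "weights", then compress the integer
# indices with a universal lossless coder (no source statistics needed,
# unlike Huffman coding).
rng = np.random.default_rng(0)
weights = rng.normal(scale=0.1, size=10_000).astype(np.float32)

indices, w_hat = randomized_lattice_quantize(weights, delta=0.02, rng=rng)
payload = bz2.compress(indices.tobytes())

print("mean squared error:", float(np.mean((weights - w_hat) ** 2)))
print("compressed bytes:", len(payload), "of", indices.nbytes)
```

In a full pipeline along the lines the abstract describes, the quantized network would additionally be fine-tuned to recover the accuracy lost to quantization before the indices are entropy-coded.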


