Transform Quantization for CNN Compression

09/02/2020
by Sean I. Young, et al.

In this paper, we compress convolutional neural network (CNN) weights post-training via transform quantization. Previous CNN quantization techniques tend either to ignore the joint statistics of weights and activations, which produces sub-optimal CNN performance at a given quantization bit-rate, or to consider these joint statistics only during training, which does not facilitate efficient compression of already trained CNN models. We optimally transform (decorrelate) and quantize the weights post-training using a rate-distortion framework to improve compression at any given quantization bit-rate. Transform quantization unifies quantization and dimensionality reduction (decorrelation) techniques in a single framework to facilitate low bit-rate compression of CNNs and efficient inference in the transform domain. We first introduce a theory of rate and distortion for CNN quantization and pose optimum quantization as a rate-distortion optimization problem. We then show that this problem can be solved by optimal bit-depth allocation following decorrelation with the optimal End-to-end Learned Transform (ELT) that we derive in this paper. Experiments demonstrate that transform quantization advances the state of the art in CNN compression in both retrained and non-retrained quantization scenarios. In particular, we find that transform quantization with retraining can compress CNN models such as AlexNet, ResNet and DenseNet to very low bit-rates (1-2 bits).
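The pipeline described in the abstract (decorrelate, allocate bit-depths, quantize) can be illustrated with a minimal NumPy sketch. This is not the paper's method. It assumes a plain Karhunen-Loève (PCA) transform computed from the weights alone as a stand-in for the ELT, so it ignores the activation statistics the paper takes into account. It also assumes the classic high-rate bit-allocation rule and a simple uniform quantizer. All function names are hypothetical.

```python
import numpy as np

def klt_basis(W):
    """Orthonormal basis that decorrelates the rows of W.
    Assumption: a plain KLT/PCA of the weights, not the paper's ELT."""
    cov = W @ W.T / W.shape[1]
    _, U = np.linalg.eigh(cov)  # columns of U are orthonormal eigenvectors
    return U

def allocate_bits(energies, avg_bits):
    """Classic high-rate rule: b_i = avg + 0.5 * log2(energy_i / geometric mean).
    Rounding and clipping at zero mean the realized average can differ from avg_bits."""
    log_e = np.log2(np.maximum(energies, 1e-12))
    bits = avg_bits + 0.5 * (log_e - log_e.mean())
    return np.clip(np.round(bits), 0, None).astype(int)

def quantize_uniform(x, bits):
    """Uniform quantizer over the range of x, reconstructing at cell centers.
    A bit-depth of zero drops the component entirely."""
    if bits == 0:
        return np.zeros_like(x)
    lo, hi = x.min(), x.max()
    if hi == lo:
        return x.copy()
    n_cells = 2 ** int(bits)
    step = (hi - lo) / n_cells
    idx = np.clip(np.floor((x - lo) / step), 0, n_cells - 1)
    return lo + (idx + 0.5) * step

def transform_quantize(W, avg_bits):
    """Decorrelate the rows of W, quantize each component at its own bit-depth,
    then map back to the original weight domain."""
    U = klt_basis(W)
    C = U.T @ W                              # one row per decorrelated component
    bits = allocate_bits(np.mean(C ** 2, axis=1), avg_bits)
    Cq = np.stack([quantize_uniform(row, b) for row, b in zip(C, bits)])
    return U @ Cq, bits
```

Calling `transform_quantize(W, 2)` on a weight matrix `W` with correlated rows returns the reconstructed weights and the per-component bit-depths. High-energy components receive more bits, while low-energy ones may receive none, which is how quantization and dimensionality reduction end up in the same framework.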

