Deep Neural Network Compression with Single and Multiple Level Quantization

03/06/2018
by Yuhui Xu, et al.

Network quantization is an effective solution for compressing deep neural networks for practical deployment. Existing methods, however, do not sufficiently exploit depth information when generating low-bit compressed networks. In this paper, we propose two novel network quantization approaches: single-level network quantization (SLQ) for high-bit quantization and multi-level network quantization (MLQ) for extremely low-bit (ternary) quantization. We are the first to consider network quantization at both the width level and the depth level. At the width level, the parameters are divided into two parts: one part is quantized while the other is re-trained to compensate for the quantization loss; SLQ leverages the distribution of the parameters to improve this partition. At the depth level, we introduce incremental layer compensation, which quantizes the layers iteratively and reduces the quantization loss at each iteration. The proposed approaches are validated with extensive experiments on state-of-the-art networks including AlexNet, VGG-16, GoogLeNet, and ResNet-18. Both SLQ and MLQ achieve impressive results.
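The abstract only sketches how the width-level partition works. As a rough illustration, the following NumPy snippet shows a minimal iterative partition-and-retrain loop under my own assumptions: the function names (quantize, partition_and_retrain, retrain_step), the uniform level set, and the magnitude-based grouping are illustrative placeholders, not the authors' method (per the abstract, SLQ groups by the parameter distribution instead).

```python
import numpy as np

def quantize(w, levels):
    """Snap each weight in the 1-D array w to its nearest value in levels."""
    idx = np.argmin(np.abs(w[:, None] - levels[None, :]), axis=1)
    return levels[idx]

def partition_and_retrain(weights, levels, retrain_step, rounds=4):
    """Hypothetical sketch of the width-level idea: each round freezes the
    largest-magnitude still-free weights at their nearest quantization level,
    then calls retrain_step so the remaining full-precision weights can
    absorb the quantization error."""
    w = weights.copy()
    frozen = np.zeros(w.size, dtype=bool)
    for r in range(1, rounds + 1):
        target = int(round(w.size * r / rounds))   # cumulative count to freeze
        free = np.flatnonzero(~frozen)
        order = np.argsort(-np.abs(w[free]))       # largest magnitude first
        pick = free[order[: target - int(frozen.sum())]]
        w[pick] = quantize(w[pick], levels)
        frozen[pick] = True
        w = retrain_step(w, frozen)                # must leave frozen entries intact
    return w

# Toy usage with a no-op stand-in for gradient-based retraining.
rng = np.random.default_rng(0)
w0 = rng.normal(scale=0.3, size=1000)
levels = np.array([-0.5, -0.25, 0.0, 0.25, 0.5])
wq = partition_and_retrain(w0, levels, retrain_step=lambda w, frozen: w)
assert np.isin(wq, levels).all()
```

In practice, retrain_step would run a few gradient steps that update only the un-frozen weights, so the remaining full-precision parameters can absorb the error introduced by each freezing round.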

Related research

- DNQ: Dynamic Network Quantization (12/06/2018)
  Network quantization is an effective method for the deployment of neural...
- Data-free mixed-precision quantization using novel sensitivity metric (03/18/2021)
  Post-training quantization is a representative technique for compressing...
- All at Once Network Quantization via Collaborative Knowledge Transfer (03/02/2021)
  Network quantization has rapidly become one of the most widely used meth...
- A Little Bit More: Bitplane-Wise Bit-Depth Recovery (05/03/2020)
  Imaging sensors digitize incoming scene light at a dynamic range of 10–1...
- Adaptive Quantization for Deep Neural Network (12/04/2017)
  In recent years Deep Neural Networks (DNNs) have been rapidly developed...
- MultiQuant: A Novel Multi-Branch Topology Method for Arbitrary Bit-width Network Quantization (05/14/2023)
  Arbitrary bit-width network quantization has received significant attent...
