Weight Normalization based Quantization for Deep Neural Network Compression

07/01/2019
by   Wen-Pu Cai, et al.

With the development of deep neural networks, network models have grown larger and larger, making model compression an urgent need for deploying them on mobile or embedded devices. Model quantization is a representative model compression technique. Although many quantization methods have been proposed, most of them suffer from high quantization error caused by the long-tail distribution of network weights. In this paper, we propose a novel quantization method, called weight normalization based quantization (WNQ), for model compression. WNQ adopts weight normalization to avoid the long-tail distribution of network weights, thereby reducing the quantization error. Experiments on CIFAR-100 and ImageNet show that WNQ outperforms other baselines and achieves state-of-the-art performance.
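The abstract does not spell out which normalization or quantizer WNQ uses, but the core idea (normalize the weights so the long tail no longer dominates the quantization range, then quantize) can be illustrated with a minimal sketch. The zero-mean/unit-variance normalization, the 3-sigma clipping threshold, and the symmetric uniform quantizer below are all assumptions for illustration, not the paper's actual method:

```python
import numpy as np

def wnq_quantize(weights: np.ndarray, num_bits: int = 4) -> np.ndarray:
    """Illustrative weight-normalization-based quantization.

    NOTE: the paper's abstract only says weights are normalized before
    quantization to suppress the long-tail distribution; the specific
    normalization (zero mean, unit variance), the 3-sigma clipping
    threshold, and the symmetric uniform quantizer here are assumed.
    """
    # Per-layer normalization: zero mean, unit standard deviation.
    mu = weights.mean()
    sigma = weights.std() + 1e-8
    w_norm = (weights - mu) / sigma

    # Clip the residual tail so a few outlier weights do not stretch
    # the quantization range over the densely populated bulk.
    t = 3.0  # assumed clipping threshold
    w_clip = np.clip(w_norm, -t, t)

    # Symmetric uniform quantization: 2^(num_bits-1) - 1 positive
    # levels, plus zero and the mirrored negative levels.
    step = t / (2 ** (num_bits - 1) - 1)
    w_q = np.round(w_clip / step) * step

    # Map back to the original scale for inference.
    return w_q * sigma + mu


# Example: even at 4 bits, the mean quantization error on a
# heavy-tailed weight tensor stays small.
w = np.random.laplace(scale=0.05, size=10_000).astype(np.float32)
err = np.abs(wnq_quantize(w, num_bits=4) - w).mean()
print(f"mean absolute quantization error: {err:.5f}")
```

The clipping step is where the long-tail intuition shows up: without it, a handful of extreme weights would force a coarse step size onto the region around zero where most weights live, which is exactly the quantization error the abstract attributes to long-tailed distributions.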


Related research

10/20/2018 · Differentiable Fine-grained Quantization for Deep Neural Network Compression
Neural networks have shown great performance in cognitive tasks. When de...

02/05/2019 · Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization
Quantization of neural networks has become common practice, driven by th...

04/17/2020 · Quantization Guided JPEG Artifact Correction
The JPEG image compression algorithm is the most popular method of image...

04/21/2023 · Picking Up Quantization Steps for Compressed Image Classification
The sensitivity of deep neural networks to compressed images hinders the...

11/13/2018 · Iteratively Training Look-Up Tables for Network Quantization
Operating deep neural networks on devices with limited resources require...

07/16/2019 · An Inter-Layer Weight Prediction and Quantization for Deep Neural Networks based on a Smoothly Varying Weight Hypothesis
Network compression for deep neural networks has become an important par...

04/26/2023 · Guaranteed Quantization Error Computation for Neural Network Model Compression
Neural network model compression techniques can address the computation ...
