Quantized Neural Networks: Characterization and Holistic Optimization

05/31/2020
by Yoonho Boo, et al.

Quantized deep neural networks (QDNNs) are necessary for low-power, high-throughput, and embedded applications. Previous studies have mostly focused on developing optimization methods for quantizing a given model. However, quantization sensitivity depends on the model architecture, so model selection needs to be part of the QDNN design process. Moreover, weight quantization and activation quantization behave quite differently. This study proposes a holistic approach to QDNN optimization that encompasses both QDNN training methods and quantization-friendly architecture design. Synthesized data are used to visualize the effects of weight and activation quantization. The results indicate that deeper models are more sensitive to activation quantization, while wider models are more resilient to both weight and activation quantization. These findings can provide insight into better optimization of QDNNs.
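To make the weight/activation quantization the abstract discusses concrete, here is a minimal sketch of a uniform symmetric quantizer, a common baseline in QDNN work; the paper's exact quantization scheme may differ, and the function name and example values are illustrative only:

```python
import numpy as np

def uniform_quantize(x, num_bits=2):
    """Uniform symmetric quantization to num_bits.

    Maps values onto an evenly spaced integer grid, then scales back
    to floating point (so-called "fake quantization", as used when
    simulating QDNNs during training).
    """
    levels = 2 ** (num_bits - 1) - 1        # e.g. 1 for signed 2-bit
    max_abs = np.max(np.abs(x))
    scale = max_abs / levels if max_abs > 0 else 1.0
    q = np.round(x / scale)                  # snap to integer grid
    q = np.clip(q, -levels, levels)          # saturate to the representable range
    return q * scale                         # dequantize back to float

# Illustrative example: 2-bit quantization of a small weight tensor
w = np.array([0.9, -0.45, 0.1, -0.02])
wq = uniform_quantize(w, num_bits=2)
```

At 2 bits the quantizer can only represent three distinct values (-scale, 0, +scale), which is why quantization sensitivity, and hence architecture choice, matters so much at low precision.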


Related research

Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN) (07/17/2018)
Deep learning algorithms achieve high classification accuracy at the exp...

Overcoming Oscillations in Quantization-Aware Training (03/21/2022)
When training neural networks with simulated quantization, we observe th...

Intriguing Properties of Quantization at Scale (05/30/2023)
Emergent properties have been widely adopted as a term to describe behav...

SQWA: Stochastic Quantized Weight Averaging for Improving the Generalization Capability of Low-Precision Deep Neural Networks (02/02/2020)
Designing a deep neural network (DNN) with good generalization capabilit...

Table-Based Neural Units: Fully Quantizing Networks for Multiply-Free Inference (06/11/2019)
In this work, we propose to quantize all parts of standard classificatio...

MWQ: Multiscale Wavelet Quantized Neural Networks (03/09/2021)
Model quantization can reduce the model size and computational latency, ...

Exposing Hardware Building Blocks to Machine Learning Frameworks (04/10/2020)
There are a plethora of applications that demand high throughput and low...
