Automatic Network Adaptation for Ultra-Low Uniform-Precision Quantization

12/21/2022
by Seongmin Park, et al.

Uniform-precision neural network quantization has gained popularity because it simplifies the densely packed arithmetic units needed for high computing capability. However, it ignores the heterogeneous sensitivity of individual layers to quantization error, resulting in sub-optimal inference accuracy. This work proposes a novel neural architecture search, called neural channel expansion, that adjusts the network structure to alleviate the accuracy degradation caused by ultra-low uniform-precision quantization. The proposed method selectively expands channels for quantization-sensitive layers while satisfying hardware constraints (e.g., FLOPs, PARAMs). Based on in-depth analysis and experiments, we demonstrate that the proposed method can adapt the channels of several popular networks to achieve superior 2-bit quantization accuracy on CIFAR10 and ImageNet. In particular, we achieve the best-to-date Top-1/Top-5 accuracy for 2-bit ResNet50 with smaller FLOPs and parameter size.
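
The channel-expansion idea (selectively widening quantization-sensitive layers under a FLOPs/parameter budget) can be pictured with a DARTS-style differentiable search. The sketch below is not the authors' code; SearchableConv, the candidate width multipliers, and expected_flops are hypothetical names, and the soft FLOPs penalty is just one plausible way to encode the hardware constraint.

```python
# Minimal sketch (assumed, not the paper's implementation): each layer learns a
# softmax over a few candidate channel widths; a FLOPs penalty steers the search
# toward the hardware budget.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SearchableConv(nn.Module):
    """Conv layer whose output width is chosen among candidate multipliers."""

    def __init__(self, in_ch, base_out_ch, multipliers=(1.0, 1.25, 1.5)):
        super().__init__()
        self.candidates = nn.ModuleList()
        self.out_chs = []
        for m in multipliers:
            out_ch = int(base_out_ch * m)
            self.out_chs.append(out_ch)
            self.candidates.append(nn.Conv2d(in_ch, out_ch, 3, padding=1))
        # Architecture parameters: a softmax over them weights the candidates.
        self.alpha = nn.Parameter(torch.zeros(len(multipliers)))
        self.max_out = max(self.out_chs)

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        outs = []
        for w, conv, out_ch in zip(weights, self.candidates, self.out_chs):
            y = conv(x)
            # Zero-pad narrower candidates so the weighted sum is well defined.
            if out_ch < self.max_out:
                y = F.pad(y, (0, 0, 0, 0, 0, self.max_out - out_ch))
            outs.append(w * y)
        return sum(outs)

    def expected_flops(self, spatial):
        """Softmax-weighted FLOPs estimate (3x3 kernel -> 9 MACs per element)."""
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * conv.in_channels * out_ch * 9 * spatial
                   for w, conv, out_ch in zip(weights, self.candidates, self.out_chs))


# Usage sketch: joint loss = task loss + lambda * max(0, expected FLOPs - budget).
layer = SearchableConv(in_ch=16, base_out_ch=32)
x = torch.randn(2, 16, 8, 8)
task_loss = layer(x).abs().mean()          # stand-in for the real training loss
flops = layer.expected_flops(spatial=8 * 8)
budget, lam = 3.0e5, 1e-7                  # illustrative budget and penalty weight
loss = task_loss + lam * torch.clamp(flops - budget, min=0)
loss.backward()
```

After such a search converges, each layer would keep only the candidate width with the largest architecture weight and then be retrained with 2-bit quantization; the paper's actual search and training procedure may differ.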

Related research

12/24/2018 · Precision Highway for Ultra Low-Precision Quantization
Neural network quantization has an inherent problem called accumulated q...

03/18/2021 · Data-free mixed-precision quantization using novel sensitivity metric
Post-training quantization is a representative technique for compressing...

01/01/2020 · ZeroQ: A Novel Zero Shot Quantization Framework
Quantization is a promising approach for reducing the inference time and...

07/15/2020 · Finding Non-Uniform Quantization Schemes using Multi-Task Gaussian Processes
We propose a novel method for neural network quantization that casts the...

11/29/2021 · Mixed Precision DNN Quantization for Overlapped Speech Separation and Recognition
Recognition of overlapped speech has been a highly challenging task to d...

08/11/2020 · PROFIT: A Novel Training Method for sub-4-bit MobileNet Models
4-bit and lower precision mobile models are required due to the ever-inc...

08/22/2020 · One Weight Bitwidth to Rule Them All
Weight quantization for deep ConvNets has shown promising results for ap...
