PROFIT: A Novel Training Method for sub-4-bit MobileNet Models

08/11/2020
by Eunhyeok Park, et al.

4-bit and lower precision mobile models are required due to the ever-increasing demand for better energy efficiency in mobile devices. In this work, we report that the activation instability induced by weight quantization (AIWQ) is the key obstacle to sub-4-bit quantization of mobile networks. To alleviate the AIWQ problem, we propose a novel training method called PROgressive-Freezing Iterative Training (PROFIT), which attempts to freeze the layers whose weights are affected more strongly by the instability problem than the other layers. We also propose a differentiable and unified quantization method (DuQ) and a negative padding idea to support asymmetric activation functions such as h-swish. We evaluate the proposed methods by quantizing MobileNet-v1, v2, and v3 on ImageNet and report that 4-bit quantization offers comparable accuracy (within 1.48% top-1 accuracy). In the ablation study of the 3-bit quantization of MobileNet-v3, our proposed method outperforms the state-of-the-art method by a large margin, 12.86% top-1 accuracy.
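The abstract describes simulated (fake) low-bit quantization and a freezing criterion based on how strongly weight quantization perturbs each layer. The sketch below is a minimal, hypothetical illustration of both ideas in NumPy; `fake_quantize` and `aiwq_score` are assumed names, and PROFIT's actual DuQ quantizer learns its clipping range during training rather than using the tensor min/max as here.

```python
import numpy as np

def fake_quantize(x, n_bits=4):
    """Uniformly quantize x to n_bits levels and dequantize back
    (simulated quantization, as used in quantization-aware training).
    Simplified: clipping range is the tensor's min/max, not learned."""
    qmax = 2 ** n_bits - 1
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / qmax if hi > lo else 1.0
    q = np.round((x - lo) / scale)          # integer grid in [0, qmax]
    return q * scale + lo                   # dequantized values

def aiwq_score(w, x, n_bits=4):
    """Hypothetical per-layer instability proxy: how much the layer's
    activations move when its weights are quantized. Layers with the
    largest score would be frozen first in a PROFIT-style schedule."""
    full = x @ w
    quant = x @ fake_quantize(w, n_bits)
    return float(np.mean((full - quant) ** 2))
```

A progressive-freezing loop would then repeatedly rank layers by `aiwq_score`, freeze the top-scoring ones, and continue fine-tuning the rest; this sketch only shows the measurement, not the training schedule.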


research
12/24/2018

Precision Highway for Ultra Low-Precision Quantization

Neural network quantization has an inherent problem called accumulated q...
research
03/21/2022

Overcoming Oscillations in Quantization-Aware Training

When training neural networks with simulated quantization, we observe th...
research
01/19/2022

Q-ViT: Fully Differentiable Quantization for Vision Transformer

In this paper, we propose a fully differentiable quantization method for...
research
04/20/2020

LSQ+: Improving low-bit quantization through learnable offsets and better initialization

Unlike ReLU, newer activation functions (like Swish, H-swish, Mish) that...
research
12/21/2022

Automatic Network Adaptation for Ultra-Low Uniform-Precision Quantization

Uniform-precision neural network quantization has gained popularity sinc...
research
02/23/2018

Loss-aware Weight Quantization of Deep Networks

The huge size of deep networks hinders their use in small computing devi...
research
10/22/2019

Zero-Crossing Precoding With Maximum Distance to the Decision Threshold for Channels With 1-Bit Quantization and Oversampling

Low-resolution devices are promising for systems that demand low energy ...
