FAT: Learning Low-Bitwidth Parametric Representation via Frequency-Aware Transformation

02/15/2021
by   Chaofan Tao, et al.
0

Learning convolutional neural networks (CNNs) with low bitwidth is challenging because performance may drop significantly after quantization. Prior arts often discretize the network weights by carefully tuning hyper-parameters of quantization (e.g. non-uniform stepsize and layer-wise bitwidths), which are complicated and sub-optimal because the full-precision and low-precision models have a large discrepancy. This work presents a novel quantization pipeline, Frequency-Aware Transformation (FAT), which has several appealing benefits. (1) Rather than designing complicated quantizers like existing works, FAT learns to transform network weights in the frequency domain before quantization, making them more amenable to training in low bitwidth. (2) With FAT, CNNs can be easily trained in low precision using simple standard quantizers without tedious hyper-parameter tuning. Theoretical analysis shows that FAT improves both uniform and non-uniform quantizers. (3) FAT can be easily plugged into many CNN architectures. When training ResNet-18 and MobileNet-V2 in 4 bits, FAT plus a simple rounding operation already achieves 70.5 outperforming recent state-of-the-art by reducing 54.9X and 45.7X computations against full-precision models. We hope FAT provides a novel perspective for model quantization. Code is available at <https://github.com/ChaofanTao/FAT_Quantization>.

READ FULL TEXT
research
04/29/2018

UNIQ: Uniform Noise Injection for the Quantization of Neural Networks

We present a novel method for training deep neural network amenable to i...
research
04/29/2018

UNIQ: Uniform Noise Injection for non-uniform Quantization of neural networks

We present a novel method for training a neural network amenable to infe...
research
02/10/2017

Incremental Network Quantization: Towards Lossless CNNs with Low-Precision Weights

This paper presents incremental network quantization (INQ), a novel meth...
research
06/04/2021

Differentiable Dynamic Quantization with Mixed Precision and Adaptive Resolution

Model quantization is challenging due to many tedious hyper-parameters s...
research
03/23/2021

ReCU: Reviving the Dead Weights in Binary Neural Networks

Binary neural networks (BNNs) have received increasing attention due to ...
research
04/20/2021

Differentiable Model Compression via Pseudo Quantization Noise

We propose to add independent pseudo quantization noise to model paramet...
research
02/08/2020

Soft Threshold Weight Reparameterization for Learnable Sparsity

Sparsity in Deep Neural Networks (DNNs) is studied extensively with the ...

Please sign up or login with your details

Forgot password? Click here to reset