Make RepVGG Greater Again: A Quantization-aware Approach

12/03/2022
by Xiangxiang Chu, et al.

The tradeoff between performance and inference speed is critical for practical applications. Architecture reparameterization achieves better tradeoffs, and it is becoming an increasingly popular ingredient in modern convolutional neural networks. Nonetheless, its quantization performance is usually too poor to deploy (e.g., more than a 20% top-1 accuracy drop when INT8 inference is desired). In this paper, we dive into the underlying mechanism of this failure: the original design inevitably enlarges quantization error. We propose a simple, robust, and effective remedy to obtain a quantization-friendly structure that also enjoys the benefits of reparameterization. Our method greatly bridges the gap between INT8 and FP32 accuracy for RepVGG. Without bells and whistles, the top-1 accuracy drop on ImageNet is reduced to within 2% by standard post-training quantization.
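
For context, the reparameterization in question folds a RepVGG-style block's three parallel branches (a 3x3 conv + BN, a 1x1 conv + BN, and an identity BN shortcut) into a single 3x3 convolution for inference. Below is a minimal sketch of that merging step, assuming a PyTorch implementation; `fuse_conv_bn` and `merge_branches` are illustrative helper names, not the authors' code.

```python
# Minimal sketch of RepVGG-style structural reparameterization (PyTorch
# assumed): fold each branch's BatchNorm into its conv, then sum the three
# branches into a single 3x3 kernel. Helper names here are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

def fuse_conv_bn(conv_weight, bn):
    """Fold a trained BatchNorm into the preceding (bias-free) conv."""
    std = (bn.running_var + bn.eps).sqrt()
    scale = bn.weight / std                      # per-output-channel scale
    weight = conv_weight * scale.reshape(-1, 1, 1, 1)
    bias = bn.bias - bn.running_mean * scale
    return weight, bias

def merge_branches(conv3x3, bn3, conv1x1, bn1, bn_id, channels):
    """Collapse the 3x3, 1x1, and identity branches into one 3x3 conv."""
    w3, b3 = fuse_conv_bn(conv3x3.weight, bn3)
    # Zero-pad the 1x1 kernel to 3x3 so the kernels can be summed.
    w1, b1 = fuse_conv_bn(F.pad(conv1x1.weight, [1, 1, 1, 1]), bn1)
    # The identity shortcut is an implicit 3x3 conv whose kernel is a
    # per-channel Dirac delta.
    w_id = torch.zeros(channels, channels, 3, 3)
    for i in range(channels):
        w_id[i, i, 1, 1] = 1.0
    w_id, b_id = fuse_conv_bn(w_id, bn_id)
    # Convolution is additive in its kernel, so the branch sum collapses.
    return w3 + w1 + w_id, b3 + b1 + b_id

# Usage: build the single deploy-time conv for a 64-channel block.
C = 64
with torch.no_grad():
    w, b = merge_branches(
        nn.Conv2d(C, C, 3, padding=1, bias=False), nn.BatchNorm2d(C),
        nn.Conv2d(C, C, 1, bias=False), nn.BatchNorm2d(C),
        nn.BatchNorm2d(C), channels=C,
    )
    deploy = nn.Conv2d(C, C, 3, padding=1)       # bias=True by default
    deploy.weight.copy_(w)
    deploy.bias.copy_(b)
```

This merging step is where the failure mode analyzed in the paper arises: summing heterogeneous branches and folding in BN statistics reshapes the distribution of the fused weights, which, per the abstract, inevitably enlarges the error of standard INT8 post-training quantization.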


