Q-DETR: An Efficient Low-Bit Quantized Detection Transformer

04/01/2023
by Sheng Xu, et al.

The recent detection transformer (DETR) has advanced object detection, but its application on resource-constrained devices requires massive computation and memory resources. Quantization stands out as a solution by representing the network with low-bit parameters and operations. However, low-bit quantized DETR (Q-DETR) suffers a significant performance drop under existing quantization methods. Through empirical analyses, we find that the bottleneck of Q-DETR lies in the distortion of query information. This paper addresses the problem with a distribution rectification distillation (DRD) method. We formulate DRD as a bi-level optimization problem, derived by generalizing the information bottleneck (IB) principle to the learning of Q-DETR. At the inner level, we conduct a distribution alignment for the queries to maximize the self-information entropy. At the upper level, we introduce a new foreground-aware query matching scheme to effectively transfer teacher information to the distillation-desired features, minimizing the conditional information entropy. Extensive experimental results show that our method performs much better than prior arts. For example, the 4-bit Q-DETR can theoretically accelerate DETR with a ResNet-50 backbone by 6.6x and achieve 39.4 AP, with only a 2.6 AP gap to its real-valued counterpart on the COCO dataset.
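To make the "low-bit parameters and operations" concrete, below is a minimal PyTorch sketch of symmetric uniform fake quantization with a straight-through estimator, applied to a linear layer at 4 bits. The names FakeQuantize and QuantLinear and the per-tensor max-scaling choice are illustrative assumptions for this sketch, not the authors' released Q-DETR implementation.

```python
import torch
import torch.nn as nn


class FakeQuantize(torch.autograd.Function):
    """Symmetric uniform fake quantization with a straight-through estimator (STE)."""

    @staticmethod
    def forward(ctx, x, num_bits):
        qmax = 2 ** (num_bits - 1) - 1                  # e.g. 7 for 4-bit
        scale = x.abs().max().clamp(min=1e-8) / qmax    # per-tensor scale (assumption)
        q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
        return q * scale                                # dequantized ("fake-quant") value

    @staticmethod
    def backward(ctx, grad_output):
        # STE: gradients flow through the rounding as if it were the identity.
        return grad_output, None


class QuantLinear(nn.Linear):
    """Linear layer whose weights and inputs are fake-quantized to a low bit-width."""

    def __init__(self, in_features, out_features, num_bits=4, bias=True):
        super().__init__(in_features, out_features, bias=bias)
        self.num_bits = num_bits

    def forward(self, x):
        w_q = FakeQuantize.apply(self.weight, self.num_bits)
        x_q = FakeQuantize.apply(x, self.num_bits)
        return nn.functional.linear(x_q, w_q, self.bias)


# Usage: layer = QuantLinear(256, 256, num_bits=4); y = layer(torch.randn(8, 256))
```

The bi-level DRD objective can likewise be pictured, very roughly, as two terms on the decoder queries: an entropy-maximization term (inner level) and a foreground-weighted matching term to the teacher (upper level). The sketch below is a hypothetical illustration in that spirit, under a Gaussian assumption for the entropy term; drd_style_loss, fg_score, and beta are assumed names and do not reproduce the paper's exact formulation.

```python
def drd_style_loss(student_q, teacher_q, fg_score, beta=1.0):
    """Toy two-term objective in the spirit of DRD (illustrative, not the paper's math).

    student_q, teacher_q: (num_queries, dim) decoder query features.
    fg_score: (num_queries,) foreground confidence in [0, 1] weighting the matching.
    """
    # Inner level (illustrative): push up the self-information entropy of student
    # queries; under a Gaussian assumption, H is proportional to 0.5 * log(2*pi*e*var).
    var = student_q.var(dim=0) + 1e-6
    entropy_term = 0.5 * torch.log(2 * torch.pi * torch.e * var).mean()

    # Upper level (illustrative): foreground-weighted matching to teacher queries,
    # a proxy for reducing the conditional information entropy given the teacher.
    match_term = (fg_score.unsqueeze(-1) * (student_q - teacher_q) ** 2).mean()

    return match_term - beta * entropy_term
```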

Related research

- 10/13/2022 | Q-ViT: Accurate and Fully Quantized Low-bit Vision Transformer
  The large pre-trained vision transformers (ViTs) have demonstrated remar...
- 07/20/2023 | Quantized Feature Distillation for Network Quantization
  Neural network quantization aims to accelerate and trim full-precision n...
- 10/07/2022 | IDa-Det: An Information Discrepancy-aware Distillation for 1-bit Detectors
  Knowledge distillation (KD) has been proven to be useful for training co...
- 03/22/2021 | n-hot: Efficient bit-level sparsity for powers-of-two neural network quantization
  Powers-of-two (PoT) quantization reduces the number of bit operations of...
- 07/14/2020 | AQD: Towards Accurate Quantized Object Detection
  Network quantization aims to lower the bitwidth of weights and activatio...
- 10/25/2021 | Instance-Conditional Knowledge Distillation for Object Detection
  Despite the success of Knowledge Distillation (KD) on image classificati...
- 02/10/2021 | Impact of Bit Allocation Strategies on Machine Learning Performance in Rate Limited Systems
  Intelligent entities such as self-driving vehicles, with their data bein...
