Structural Knowledge Distillation for Object Detection

11/23/2022
by Philip de Rijk, et al.

Knowledge Distillation (KD) is a well-known training paradigm for deep neural networks in which knowledge acquired by a large teacher model is transferred to a small student. KD has proven to be an effective technique for significantly improving the student's performance on various tasks, including object detection. In object detection, KD techniques mostly rely on guidance at the intermediate feature level, typically implemented by minimizing an ℓp-norm distance between teacher and student activations during training. In this paper, we propose a replacement for the pixel-wise independent ℓp-norm based on the structural similarity index (SSIM). By taking additional contrast and structural cues into account, the loss formulation considers feature importance, correlation, and spatial dependence in the feature space. Extensive experiments on MSCOCO demonstrate the effectiveness of our method across different training schemes and architectures. Our method adds little computational overhead, is straightforward to implement, and significantly outperforms the standard ℓp-norms. Moreover, it outperforms more complex state-of-the-art KD methods that use attention-based sampling mechanisms, including a +3.5 AP gain for a Faster R-CNN R-50 over a vanilla model.
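To make the core idea concrete, the sketch below shows what such an SSIM-based feature-distillation loss could look like in PyTorch. This is a minimal illustration under stated assumptions, not the authors' reference implementation: the name ssim_distill_loss, the 11x11 uniform window, and the stability constants c1/c2 are illustrative choices, and in practice a 1x1 adapter layer is commonly used to match the student's channel count to the teacher's.

```python
# Minimal sketch of an SSIM-based feature-distillation loss in PyTorch.
# Assumptions (not from the paper): a uniform 11x11 window, the common
# SSIM constants, and teacher/student feature maps of identical shape.
import torch
import torch.nn.functional as F


def ssim_distill_loss(f_s: torch.Tensor, f_t: torch.Tensor,
                      window: int = 11,
                      c1: float = 0.01 ** 2,
                      c2: float = 0.03 ** 2) -> torch.Tensor:
    """Return 1 - SSIM between student (f_s) and teacher (f_t) feature
    maps of shape (N, C, H, W), averaged over all positions."""
    pad = window // 2

    def local_mean(x: torch.Tensor) -> torch.Tensor:
        # Sliding-window mean; count_include_pad=False keeps border
        # statistics unbiased by the zero padding.
        return F.avg_pool2d(x, window, stride=1, padding=pad,
                            count_include_pad=False)

    mu_s, mu_t = local_mean(f_s), local_mean(f_t)
    # Local variances and covariance: E[x^2] - E[x]^2, E[xy] - E[x]E[y].
    var_s = local_mean(f_s * f_s) - mu_s ** 2
    var_t = local_mean(f_t * f_t) - mu_t ** 2
    cov = local_mean(f_s * f_t) - mu_s * mu_t

    # SSIM combines luminance (means), contrast (variances) and
    # structure (covariance), unlike a pixel-wise independent lp-norm
    # that compares activations at each position in isolation.
    ssim = ((2 * mu_s * mu_t + c1) * (2 * cov + c2)) / (
        (mu_s ** 2 + mu_t ** 2 + c1) * (var_s + var_t + c2))
    return (1.0 - ssim).mean()


# Hypothetical usage with FPN-level features of matching shape:
f_student = torch.randn(2, 256, 64, 64, requires_grad=True)
f_teacher = torch.randn(2, 256, 64, 64)
loss = ssim_distill_loss(f_student, f_teacher.detach())
loss.backward()
```

Since the constants c1 and c2 originally assume a bounded dynamic range for images, feature maps would typically be normalized (or the constants rescaled) to match the feature statistics; the exact choice is a design decision not fixed by this sketch.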


Related research

06/20/2019 · GAN-Knowledge Distillation for one-stage Object Detection
Convolutional neural networks have a significant improvement in the accu...

01/31/2023 · AMD: Adaptive Masked Distillation for Object Detection
As a general model compression paradigm, feature-based knowledge distill...

04/10/2019 · Relational Knowledge Distillation
Knowledge distillation aims at transferring knowledge acquired in one mo...

11/17/2022 · DETRDistill: A Universal Knowledge Distillation Framework for DETR-families
Transformer-based detectors (DETRs) have attracted great attention due t...

02/11/2023 · Dual Relation Knowledge Distillation for Object Detection
Knowledge distillation is an effective method for model compression. How...

07/24/2023 · A Good Student is Cooperative and Reliable: CNN-Transformer Collaborative Learning for Semantic Segmentation
In this paper, we strive to answer the question "how to collaboratively ...
