Learning Efficient Detector with Semi-supervised Adaptive Distillation

01/02/2019
by Shitao Tang, et al.

Knowledge Distillation (KD) has been used for model compression in image classification, but few studies have applied it to single-stage object detectors. The focal loss work shows that the accumulated errors of easily-classified samples dominate the overall loss during training; the same problem arises when applying KD to the detection task. For KD, the teacher-defined hard samples matter far more than the others. We propose the Adaptive Distillation Loss (ADL) to address this issue by adaptively mimicking the teacher's logits, paying more attention to two types of hard samples: hard-to-learn samples, which the teacher predicts with low certainty, and hard-to-mimic samples, for which there is a large gap between the teacher's and the student's predictions. ADL enlarges the distillation loss for hard-to-learn and hard-to-mimic samples and reduces it for the dominant easy samples, enabling distillation to work on a single-stage detector for the first time, even when the student and the teacher are identical. Moreover, ADL is effective in both the supervised and the semi-supervised setting, even when the labeled data and unlabeled data come from different distributions. For distillation on unlabeled data, ADL outperforms existing data distillation, which simply uses hard targets, allowing the student detector to surpass its teacher. On the COCO database, Semi-supervised Adaptive Distillation (SAD) enables a student detector with a ResNet-50 backbone to surpass its teacher with a ResNet-101 backbone, while the student has half the teacher's computational cost. The code is available at https://github.com/Tangshitao/Semi-supervised-Adaptive-Distillation
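To make the adaptive weighting concrete, here is a minimal PyTorch sketch of a distillation loss in this spirit. It is an illustration under stated assumptions, not the authors' exact formulation: the function name, the focal-style weight (1 - e^{-(KL + beta*H)})^gamma combining the two hardness signals, and the hyperparameters beta, gamma, and temperature are chosen for exposition; consult the linked repository for the paper's implementation.

```python
import torch
import torch.nn.functional as F

def adaptive_distillation_loss(student_logits, teacher_logits,
                               beta=1.5, gamma=2.0, temperature=1.0):
    """Sketch of an adaptive distillation loss for dense detector logits.

    The per-sample KL divergence between teacher and student is re-weighted
    so that hard-to-learn samples (high teacher entropy, i.e. low certainty)
    and hard-to-mimic samples (large teacher-student gap) contribute more,
    while the many easy samples are down-weighted.
    """
    # Soften logits; shapes: (num_samples, num_classes).
    log_p_s = F.log_softmax(student_logits / temperature, dim=-1)
    p_t = F.softmax(teacher_logits / temperature, dim=-1)

    # Per-sample KL(teacher || student): the "hard-to-mimic" signal.
    kl = F.kl_div(log_p_s, p_t, reduction='none').sum(dim=-1)

    # Per-sample teacher entropy: the "hard-to-learn" signal.
    entropy = -(p_t * torch.log(p_t.clamp_min(1e-12))).sum(dim=-1)

    # Focal-style adaptive weight (an assumption of this sketch): near 0
    # for easy samples (small KL, low entropy), approaching 1 for hard
    # ones. beta trades off the two hardness signals; gamma controls how
    # aggressively easy samples are suppressed.
    adw = (1.0 - torch.exp(-(kl + beta * entropy))) ** gamma

    # Detaching the weight is one design choice: gradients then flow only
    # through the KL term, with the weight acting as a per-sample scale.
    return (adw.detach() * kl).mean()

# Usage on dummy data, e.g. 4096 anchors over 81 classes:
student_logits = torch.randn(4096, 81)
teacher_logits = torch.randn(4096, 81)
loss = adaptive_distillation_loss(student_logits, teacher_logits)
```

Note how the weight reproduces the behavior the abstract describes: when the student already matches a confident teacher, both KL and entropy are small and the sample is effectively ignored, so the loss concentrates on the teacher-defined hard samples instead of the dominant easy ones.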

