Pixel-Wise Contrastive Distillation

11/01/2022
by Junqiang Huang, et al.

We present the first pixel-level self-supervised distillation framework designed for dense prediction tasks. Our approach, called Pixel-Wise Contrastive Distillation (PCD), distills knowledge by attracting corresponding pixels in the student's and teacher's output feature maps. This pixel-to-pixel distillation requires preserving the spatial information of the teacher's output. We propose a SpatialAdaptor, which adapts the teacher's well-trained projection/prediction head, originally used to encode vectorized features, to process 2D feature maps. The SpatialAdaptor enables more informative pixel-level distillation, yielding a better student for dense prediction tasks. Moreover, given the limited effective receptive fields of small models, we employ a plug-in multi-head self-attention module to explicitly relate the pixels of the student's feature maps. Overall, PCD outperforms previous self-supervised distillation methods on various dense prediction tasks. A ResNet-18 backbone distilled by PCD achieves 37.4 AP^bbox and 34.0 AP^mask with a Mask R-CNN detector on the COCO dataset, making it the first pre-training method to surpass its supervised pre-trained counterpart.
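To make the core idea concrete, here is a minimal sketch of a pixel-wise contrastive distillation loss in PyTorch. It is written from the abstract's description, not the authors' released code: the function name, tensor shapes, temperature, and the choice of using all other in-batch pixels as negatives are illustrative assumptions. Corresponding student/teacher pixels are treated as positive pairs and pulled together, while every other pixel serves as a negative.

```python
# Minimal sketch of a pixel-wise contrastive distillation loss.
# Assumes student and teacher feature maps share the same spatial size (B, C, H, W);
# temperature and negative sampling are illustrative, not the paper's exact recipe.
import torch
import torch.nn.functional as F

def pixel_contrastive_distill_loss(student_feat, teacher_feat, tau=0.2):
    """InfoNCE over pixels: the corresponding teacher pixel is the positive,
    all other teacher pixels in the batch act as negatives."""
    B, C, H, W = student_feat.shape
    # Flatten spatial dimensions to (B*H*W, C) and L2-normalize each pixel vector.
    s = F.normalize(student_feat.permute(0, 2, 3, 1).reshape(-1, C), dim=1)
    t = F.normalize(teacher_feat.permute(0, 2, 3, 1).reshape(-1, C), dim=1)
    logits = s @ t.t() / tau                       # (N, N) pixel-to-pixel similarities
    labels = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, labels)         # attract matching pixels, repel the rest

# Usage (shapes only): the teacher branch is detached so gradients flow to the student alone.
# loss = pixel_contrastive_distill_loss(student_proj(s_map), teacher_proj(t_map).detach())
```

Note that this sketch presumes the teacher's projection head already outputs a 2D feature map; in the paper that is the role of the SpatialAdaptor, which adapts a head trained on vectorized features so it can process spatial maps without discarding location information.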

Related research

03/23/2023 · A Simple and Generic Framework for Feature Distillation via Channel-wise Transformation
Knowledge distillation is a popular technique for transferring the knowl...

06/28/2023 · Hybrid Distillation: Connecting Masked Autoencoders with Contrastive Learners
Representation learning has been evolving from traditional supervised tr...

11/14/2022 · Information-guided pixel augmentation for pixel-wise contrastive learning
Contrastive learning (CL) is a form of self-supervised learning and has ...

08/16/2021 · FaPN: Feature-aligned Pyramid Network for Dense Image Prediction
Recent advancements in deep neural networks have made remarkable leap-fo...

01/13/2022 · SimReg: Regression as a Simple Yet Effective Tool for Self-supervised Knowledge Distillation
Feature regression is a simple way to distill large neural network model...

03/02/2022 · SEA: Bridging the Gap Between One- and Two-stage Detector Distillation via SEmantic-aware Alignment
We revisit the one- and two-stage detector distillation tasks and presen...

10/02/2022 · Pixel-global Self-supervised Learning with Uncertainty-aware Context Stabilizer
We developed a novel SSL approach to capture global consistency and pixe...
