Towards a Smaller Student: Capacity Dynamic Distillation for Efficient Image Retrieval

03/16/2023
by Yi Xie, et al.

Previous knowledge distillation based efficient image retrieval methods employ a lightweight network as the student model for fast inference. However, the lightweight student model lacks adequate representation capacity for effective knowledge imitation during the most critical early training period, causing final performance degeneration. To tackle this issue, we propose a Capacity Dynamic Distillation framework, which constructs a student model with editable representation capacity. Specifically, the student model starts as a heavy model so it can fruitfully learn the distilled knowledge in the early training epochs, and it is gradually compressed as training proceeds. To dynamically adjust the model capacity, our framework inserts a learnable convolutional layer within each residual block of the student model as a channel importance indicator. The indicator is optimized simultaneously by the image retrieval loss and the compression loss, and a retrieval-guided gradient resetting mechanism is proposed to release the gradient conflict. Extensive experiments show that our method has superior inference speed and accuracy, e.g., on the VeRi-776 dataset, given ResNet101 as the teacher, our method saves 67.13% FLOPs (around 24.13%) without sacrificing accuracy (around 2.11% gain).
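The channel importance indicator described in the abstract can be pictured with a short sketch. The following is a minimal PyTorch illustration, not the authors' released code: a learnable 1x1 convolution inserted into a residual block, an L1-style compression penalty on its weights, and one plausible reading of the retrieval-guided gradient reset. All module names, the identity initialization, and the conflict rule are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class IndicatedResidualBlock(nn.Module):
    """Basic residual block with a learnable 1x1 conv acting as a channel
    importance indicator (illustrative; not the authors' implementation)."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)
        # Channel importance indicator, initialized to the identity mapping so the
        # block starts at full capacity; channels whose indicator weights are driven
        # toward zero by the compression loss become candidates for pruning.
        self.indicator = nn.Conv2d(channels, channels, kernel_size=1, bias=False)
        with torch.no_grad():
            self.indicator.weight.copy_(
                torch.eye(channels).view(channels, channels, 1, 1))

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = self.indicator(out)  # rescale/mix channels by learned importance
        return self.relu(out + x)

def compression_loss(model, weight=1e-4):
    """One plausible compression objective: an L1 penalty that pushes indicator
    weights toward zero (the paper's exact formulation may differ)."""
    penalty = sum(m.indicator.weight.abs().sum()
                  for m in model.modules() if isinstance(m, IndicatedResidualBlock))
    return weight * penalty

def retrieval_guided_reset(grad_retrieval, grad_compression):
    """Hypothetical reading of the retrieval-guided gradient reset: where the two
    losses push an indicator weight in opposite directions, drop the compression
    gradient and keep only the retrieval gradient."""
    conflict = (grad_retrieval * grad_compression) < 0
    combined = grad_retrieval + grad_compression
    combined[conflict] = grad_retrieval[conflict]
    return combined
```

After training, channels whose indicator weights sit near zero can be pruned and the 1x1 indicator folded into the adjacent convolution; this is how such per-channel indicators typically translate into the reported FLOP savings.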



