Impact of a DCT-driven Loss in Attention-based Knowledge-Distillation for Scene Recognition

Knowledge Distillation (KD) is a strategy for defining a set of transferability pathways to improve the efficiency of Convolutional Neural Networks. Feature-based Knowledge Distillation is a subfield of KD that relies on intermediate network representations, either unaltered or depth-reduced via maximum activation maps, as the source knowledge. In this paper, we propose and analyse the use of a 2D frequency transform, the Discrete Cosine Transform (DCT), on the activation maps before transferring them. We hypothesize that, by using global image cues rather than per-pixel estimates, this strategy enhances knowledge transferability in tasks such as scene recognition, which is defined by strong spatial and contextual relationships between multiple and varied concepts. To validate the proposed method, an extensive evaluation of the state of the art in scene recognition is presented. Experimental results provide strong evidence that the proposed strategy enables the student network to better focus on the relevant image areas learnt by the teacher network, leading to more descriptive features and higher transferred performance than other state-of-the-art alternatives. We publicly release the training and evaluation framework used throughout this paper at http://www-vpu.eps.uam.es/publications/DCTBasedKDForSceneRecognition.
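As a concrete illustration, below is a minimal PyTorch sketch of a DCT-based feature-distillation loss in the spirit of the proposed method: channel depth is reduced with a maximum activation map, both maps are mapped to the 2D frequency domain with an orthonormal DCT-II, and the resulting spectra are compared. The names (dct_kd_loss, dct_2d) and the choice of an MSE objective are illustrative assumptions, not the authors' exact implementation.

```python
import math
import torch
import torch.nn.functional as F

def dct_matrix(n: int, device=None) -> torch.Tensor:
    """Orthonormal DCT-II basis matrix of size (n, n)."""
    k = torch.arange(n, dtype=torch.float32, device=device)
    # basis[k, j] = cos(pi * (2j + 1) * k / (2n))
    basis = torch.cos(math.pi * (2.0 * k[None, :] + 1.0) * k[:, None] / (2.0 * n))
    basis[0] /= math.sqrt(2.0)          # orthonormal scaling of the DC row
    return basis * math.sqrt(2.0 / n)

def dct_2d(x: torch.Tensor) -> torch.Tensor:
    """Apply a separable 2D DCT-II over the last two dimensions (..., H, W)."""
    d_h = dct_matrix(x.shape[-2], x.device)
    d_w = dct_matrix(x.shape[-1], x.device)
    return d_h @ x @ d_w.T              # D_H · X · D_W^T

def dct_kd_loss(student_feat: torch.Tensor, teacher_feat: torch.Tensor) -> torch.Tensor:
    """Distillation loss between DCT spectra of maximum activation maps.

    student_feat, teacher_feat: (B, C, H, W) intermediate activations,
    assumed here to share the same spatial size (interpolate beforehand
    otherwise). Depth is reduced with a channel-wise max, then the two
    resulting maps are compared in the frequency domain.
    """
    s_map = student_feat.amax(dim=1)    # (B, H, W) maximum activation map
    t_map = teacher_feat.amax(dim=1)
    return F.mse_loss(dct_2d(s_map), dct_2d(t_map))
```

For example, given activations of shape (8, 256, 14, 14) from matching teacher and student stages, dct_kd_loss(s, t) returns a scalar that can be added to the task loss with a weighting factor; comparing spectra rather than raw pixels emphasizes the global spatial structure of the attended regions.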
