[Re] Distilling Knowledge via Knowledge Review

05/18/2022
by Apoorva Verma, et al.

This effort aims to reproduce the results of experiments and analyze the robustness of the review framework for knowledge distillation introduced in the CVPR '21 paper 'Distilling Knowledge via Knowledge Review' by Chen et al. Previous works in knowledge distillation only studied connection paths between the same levels of the student and the teacher; cross-level connection paths had not been considered. Chen et al. propose a new residual learning framework to train a single student layer using multiple teacher layers. They also design a novel fusion module to condense feature maps across levels and a loss function to compare feature information stored across different levels to improve performance. In this work, we consistently verify the improvements in test accuracy across student models reported in the original paper and study the effectiveness of the newly introduced modules by conducting ablation studies and new experiments.
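For readers unfamiliar with the review mechanism, the snippet below is a minimal PyTorch sketch of the idea summarized above: deeper student features are progressively fused with shallower ones and the fused result is compared against the teacher's feature map at each level. The simple additive fusion stands in for the paper's attention-based fusion module, the pyramid-pooled L2 distance approximates its hierarchical context loss, and the assumption that all feature maps share the same channel count is ours for brevity; this is an illustrative sketch, not the authors' implementation.

```python
# Illustrative sketch of a review-style distillation loss.
# Assumptions (ours, not the authors'): all feature maps have been projected
# to the same channel count, and a plain residual addition replaces the
# paper's attention-based fusion module.
import torch
import torch.nn.functional as F


def pyramid_l2(student_feat, teacher_feat, levels=(4, 2, 1)):
    """L2 distance over a spatial pyramid, approximating the paper's
    hierarchical context loss."""
    loss = F.mse_loss(student_feat, teacher_feat)
    for size in levels:
        s = F.adaptive_avg_pool2d(student_feat, size)
        t = F.adaptive_avg_pool2d(teacher_feat, size)
        loss = loss + F.mse_loss(s, t)
    return loss


def review_kd_loss(student_feats, teacher_feats):
    """Fuse student features from the deepest level to the shallowest and
    compare each fused map with the teacher feature at that level.

    Both lists are ordered shallow -> deep and assumed to share channels."""
    loss = 0.0
    fused = None
    for s, t in zip(reversed(student_feats), reversed(teacher_feats)):
        if fused is None:
            fused = s
        else:
            # Upsample the previously fused (deeper) feature and add it in,
            # so each student level also carries information from deeper ones.
            fused = s + F.interpolate(fused, size=s.shape[-2:], mode="nearest")
        loss = loss + pyramid_l2(fused, t)
    return loss


if __name__ == "__main__":
    # Toy usage: three feature levels with matching channel counts.
    student = [torch.randn(2, 16, r, r) for r in (32, 16, 8)]
    teacher = [torch.randn(2, 16, r, r) for r in (32, 16, 8)]
    print(review_kd_loss(student, teacher).item())
```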

