Augmenting Knowledge Distillation With Peer-To-Peer Mutual Learning For Model Compression

10/21/2021
by   Usma Niyaz, et al.

Knowledge distillation (KD) is an effective model compression technique in which a compact student network is taught to mimic the behavior of a complex, highly trained teacher network. Mutual Learning (ML) offers an alternative strategy in which multiple simple student networks benefit from sharing knowledge with one another, even in the absence of a powerful but static teacher network. Motivated by these findings, we propose a single-teacher, multi-student framework that leverages both KD and ML to achieve better performance. Furthermore, an online distillation strategy is used to train the teacher and students simultaneously. To evaluate the proposed approach, extensive experiments were conducted with three different versions of teacher-student networks on benchmark biomedical classification (MSI vs. MSS) and object detection (polyp detection) tasks. The ensemble of student networks trained in the proposed manner achieved better results than ensembles of students trained with KD or ML alone, establishing the benefit of augmenting knowledge transfer from teacher to students with peer-to-peer learning between students.
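The following is a minimal sketch, not the authors' implementation, of how such a combined objective could look: each student is trained with (i) cross-entropy on the ground truth, (ii) a distillation term from the teacher (KD), and (iii) a mutual-learning term from its peers (ML), while the teacher is updated in the same loop (online distillation). The temperature T, the loss weights alpha and beta, and the model/optimizer setup are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def soft_kl(student_logits, target_logits, T=4.0):
    """KL divergence between temperature-softened distributions (assumed T=4)."""
    p = F.log_softmax(student_logits / T, dim=1)
    q = F.softmax(target_logits / T, dim=1)
    return F.kl_div(p, q, reduction="batchmean") * (T * T)

def student_loss(student_logits, peer_logits, teacher_logits, labels,
                 alpha=0.5, beta=0.5):
    """Supervised loss + teacher-to-student KD + peer-to-peer ML (weights assumed)."""
    ce = F.cross_entropy(student_logits, labels)
    kd = soft_kl(student_logits, teacher_logits.detach())
    ml = torch.stack([soft_kl(student_logits, p.detach())
                      for p in peer_logits]).mean()
    return ce + alpha * kd + beta * ml

def train_step(teacher, students, optimizer, images, labels):
    """One online-distillation step: teacher and all students updated together."""
    teacher_logits = teacher(images)
    student_logits = [s(images) for s in students]

    loss = F.cross_entropy(teacher_logits, labels)  # teacher keeps learning online
    for i, logits in enumerate(student_logits):
        peers = [l for j, l in enumerate(student_logits) if j != i]
        loss = loss + student_loss(logits, peers, teacher_logits, labels)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```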
