Knowledge Distillation for Adaptive MRI Prostate Segmentation Based on Limit-Trained Multi-Teacher Models

03/16/2023
by   Eddardaa Ben Loussaief, et al.

The performance of deep models on numerous medical tasks has improved considerably in recent years. These models are often adept learners, yet their intricate architectures and high computational complexity make them difficult to deploy in clinical settings, particularly on resource-limited devices. To address this issue, Knowledge Distillation (KD) has been proposed as a compression and acceleration technique. KD is an efficient learning strategy that transfers knowledge from a burdensome model (i.e., the teacher) to a lightweight model (i.e., the student), yielding a compact model with few parameters while preserving the teacher's performance. In this work, we develop a KD-based deep model for prostate MRI segmentation that combines feature-based distillation with Kullback-Leibler divergence, Lovasz, and Dice losses. We further demonstrate its effectiveness with two compression procedures: 1) distilling knowledge to a student model from a single well-trained teacher, and 2) since most medical applications have small datasets, training multiple teachers, each on a small set of images, and learning an adaptive student model that stays as close to the teachers as possible while achieving the desired accuracy and fast inference time. Extensive experiments on a public multi-site prostate tumor dataset show that the proposed adaptive KD strategy improves the dice similarity score over 9 well-established baseline models.
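As a rough illustration of the distillation objective described in the abstract, the PyTorch-style sketch below combines a temperature-softened Kullback-Leibler term with a soft Dice term and averages the outputs of several teachers. The temperature, loss weighting, and uniform teacher averaging are illustrative assumptions, not the authors' exact formulation; the feature-based distillation and Lovasz components are not reproduced here.

```python
# Minimal sketch of a KD loss for segmentation, assuming logits of shape
# (N, C, H, W) and one-hot targets of the same shape. T, alpha, and the
# uniform teacher average are illustrative choices, not the paper's setup.
import torch
import torch.nn.functional as F


def soft_dice_loss(probs, target, eps=1e-6):
    # Soft multi-class Dice loss (target is one-hot, same shape as probs).
    dims = (0, 2, 3)
    intersection = (probs * target).sum(dims)
    cardinality = probs.sum(dims) + target.sum(dims)
    return 1.0 - ((2.0 * intersection + eps) / (cardinality + eps)).mean()


def kd_segmentation_loss(student_logits, teacher_logits, target, T=4.0, alpha=0.5):
    # KL divergence between temperature-softened teacher and student
    # distributions, computed per pixel and averaged over batch and space.
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="none",
    ).sum(dim=1).mean() * (T * T)
    # Supervised segmentation term on the ground-truth masks.
    seg = soft_dice_loss(F.softmax(student_logits, dim=1), target)
    return alpha * kd + (1.0 - alpha) * seg


def ensemble_teacher_logits(teachers, x):
    # Multi-teacher variant: each teacher is trained on a small image subset;
    # here their logits are simply averaged before distillation.
    with torch.no_grad():
        return torch.stack([t(x) for t in teachers], dim=0).mean(dim=0)
```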

