Be Your Own Best Competitor! Multi-Branched Adversarial Knowledge Transfer

by   Mahdi Ghorbani, et al.

Deep neural network architectures have attained remarkable improvements in scene understanding tasks. Utilizing an efficient model is one of the most important constraints for limited-resource devices. Recently, several compression methods have been proposed to diminish the heavy computational burden and memory consumption. Among them, the pruning and quantizing methods exhibit a critical drop in performances by compressing the model parameters. While the knowledge distillation methods improve the performance of compact models by focusing on training lightweight networks with the supervision of cumbersome networks. In the proposed method, the knowledge distillation has been performed within the network by constructing multiple branches over the primary stream of the model, known as the self-distillation method. Therefore, the ensemble of sub-neural network models has been proposed to transfer the knowledge among themselves with the knowledge distillation policies as well as an adversarial learning strategy. Hence, The proposed ensemble of sub-models is trained against a discriminator model adversarially. Besides, their knowledge is transferred within the ensemble by four different loss functions. The proposed method has been devoted to both lightweight image classification and encoder-decoder architectures to boost the performance of small and compact models without incurring extra computational overhead at the inference process. Extensive experimental results on the main challenging datasets show that the proposed network outperforms the primary model in terms of accuracy at the same number of parameters and computational cost. The obtained results show that the proposed model has achieved significant improvement over earlier ideas of self-distillation methods. The effectiveness of the proposed models has also been illustrated in the encoder-decoder model.


page 1

page 4

page 5

page 7

page 10


Efficient training of lightweight neural networks using Online Self-Acquired Knowledge Distillation

Knowledge Distillation has been established as a highly promising approa...

AI-KD: Adversarial learning and Implicit regularization for self-Knowledge Distillation

We present a novel adversarial penalized self-knowledge distillation met...

Ensemble Knowledge Distillation for Learning Improved and Efficient Networks

Ensemble models comprising of deep Convolutional Neural Networks (CNN) h...

Channel Planting for Deep Neural Networks using Knowledge Distillation

In recent years, deeper and wider neural networks have shown excellent p...

A Lightweight Approach for Network Intrusion Detection based on Self-Knowledge Distillation

Network Intrusion Detection (NID) works as a kernel technology for the s...

Compressing Facial Makeup Transfer Networks by Collaborative Distillation and Kernel Decomposition

Although the facial makeup transfer network has achieved high-quality pe...

Joint Architecture and Knowledge Distillation in Convolutional Neural Network for Offline Handwritten Chinese Text Recognition

The technique of distillation helps transform cumbersome neural network ...

Please sign up or login with your details

Forgot password? Click here to reset