Be Your Own Best Competitor! Multi-Branched Adversarial Knowledge Transfer

by Mahdi Ghorbani, et al.

Deep neural network architectures have achieved remarkable improvements in scene understanding tasks. On resource-limited devices, model efficiency is one of the most important constraints. Recently, several compression methods have been proposed to reduce the heavy computational burden and memory consumption. Among them, pruning and quantization methods compress the model parameters but exhibit a critical drop in performance, whereas knowledge distillation methods improve the performance of compact models by training lightweight networks under the supervision of cumbersome networks. In the proposed method, knowledge distillation is performed within the network by constructing multiple branches over the primary stream of the model, an approach known as self-distillation. The resulting ensemble of sub-network models transfers knowledge among its members through knowledge distillation policies as well as an adversarial learning strategy: the ensemble of sub-models is trained adversarially against a discriminator model, and their knowledge is transferred within the ensemble by four different loss functions. The proposed method is applied to both lightweight image classification and encoder-decoder architectures to boost the performance of small and compact models without incurring extra computational overhead at inference. Extensive experimental results on the main challenging datasets show that the proposed network outperforms the primary model in accuracy at the same number of parameters and computational cost. The results also show a significant improvement over earlier self-distillation methods. The effectiveness of the proposed models is also illustrated on the encoder-decoder model.
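
As a rough illustration of the self-distillation idea described in the abstract, the sketch below computes a per-sample loss in which auxiliary branches are supervised both by the ground-truth label and by the temperature-softened output of the primary stream. This is a generic soft-target distillation term, not the paper's formulation: the abstract does not specify its four loss functions, and the adversarial (discriminator) term is omitted here. The function names, the choice of the primary stream as teacher, and the `temperature` and `alpha` values are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_entropy(logits, label):
    """Standard cross-entropy of the logits against an integer class label."""
    return -math.log(softmax(logits)[label] + 1e-12)


def self_distillation_loss(branch_logits, main_logits, label,
                           temperature=3.0, alpha=0.5):
    """Hypothetical per-sample loss (an assumption, not the paper's loss).

    The primary stream is trained on the label; each auxiliary branch is
    trained on the label and on the softened primary-stream distribution.
    """
    teacher = softmax(main_logits, temperature)
    loss = cross_entropy(main_logits, label)
    for logits in branch_logits:
        student = softmax(logits, temperature)
        # The T^2 factor is the usual rescaling of the soft-target term.
        kd = (temperature ** 2) * kl_divergence(teacher, student)
        loss += (1.0 - alpha) * cross_entropy(logits, label) + alpha * kd
    return loss


if __name__ == "__main__":
    main = [2.0, 0.5, -1.0]  # made-up primary-stream logits
    branches = [[1.5, 0.7, -0.5], [0.2, 0.1, 0.0]]  # made-up branch logits
    print(self_distillation_loss(branches, main, label=0))
```

In this sketch the distillation term vanishes when a branch reproduces the primary stream's logits exactly, and it grows as the branch's softened prediction drifts away from them. At inference only the primary stream would be evaluated, which is consistent with the abstract's claim of no extra inference cost.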





Efficient training of lightweight neural networks using Online Self-Acquired Knowledge Distillation

Knowledge Distillation has been established as a highly promising approa...

Ensemble Knowledge Distillation for Learning Improved and Efficient Networks

Ensemble models comprising of deep Convolutional Neural Networks (CNN) h...

Channel Planting for Deep Neural Networks using Knowledge Distillation

In recent years, deeper and wider neural networks have shown excellent p...

Follow Your Path: a Progressive Method for Knowledge Distillation

Deep neural networks often have a huge number of parameters, which posts...

On the Orthogonality of Knowledge Distillation with Other Techniques: From an Ensemble Perspective

To put a state-of-the-art neural network to practical use, it is necessa...

Embedded Self-Distillation in Compact Multi-Branch Ensemble Network for Remote Sensing Scene Classification

Remote sensing (RS) image scene classification task faces many challenge...

Compressing Facial Makeup Transfer Networks by Collaborative Distillation and Kernel Decomposition

Although the facial makeup transfer network has achieved high-quality pe...