Boosting Residual Networks with Group Knowledge

08/26/2023
by Shengji Tang, et al.

Recent research has interpreted residual networks from a new perspective: as implicit ensemble models. From this view, previous methods such as stochastic depth and stimulative training have further improved the performance of residual networks by sampling and training their subnets. However, both apply the same supervision to all subnets regardless of capacity and neglect the valuable knowledge that subnets generate during training. In this manuscript, we mitigate the significant knowledge distillation gap caused by using a single kind of supervision and advocate leveraging the subnets themselves to provide diverse knowledge. Based on this motivation, we propose a group-knowledge-based training framework for boosting the performance of residual networks. Specifically, we implicitly divide all subnets into hierarchical groups via subnet-in-subnet sampling, aggregate the knowledge of the different subnets in each group during training, and exploit upper-level group knowledge to supervise lower-level subnet groups. We also develop a subnet sampling strategy that naturally samples larger subnets, which we find more helpful than smaller ones for boosting the performance of hierarchical groups. Compared with typical subnet training and other methods, our method achieves the best efficiency-performance trade-offs on multiple datasets and network structures. The code will be released soon.
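
To make the described scheme concrete, below is a minimal PyTorch-style sketch of hierarchical subnet groups in which larger (upper-level) subnets supervise smaller (lower-level) ones via distillation, with sampling biased toward larger subnets. The toy ResidualMLP model, the sample_nested_subnets rule, and all loss weights are illustrative assumptions for exposition, not the authors' released implementation.

```python
# Sketch of group-knowledge style subnet training (assumed PyTorch setup).
# Model, sampling rule, and hyperparameters are illustrative placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ResidualMLP(nn.Module):
    """Toy residual network whose blocks can be skipped to form subnets."""
    def __init__(self, dim=64, num_blocks=8, num_classes=10):
        super().__init__()
        self.stem = nn.Linear(32, dim)
        self.blocks = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            for _ in range(num_blocks)
        ])
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x, keep_mask=None):
        # keep_mask[i] == True keeps block i; False skips it (identity path only).
        h = self.stem(x)
        for i, block in enumerate(self.blocks):
            if keep_mask is None or keep_mask[i]:
                h = h + block(h)
        return self.head(h)


def sample_nested_subnets(num_blocks, levels=3, min_keep=0.5):
    """Subnet-in-subnet sampling: each lower-level subnet is a subset of the one above.
    Keep ratios are drawn from [min_keep, 1], which biases sampling toward larger subnets."""
    masks = [torch.ones(num_blocks, dtype=torch.bool)]  # top level: the full network
    for _ in range(levels - 1):
        parent = masks[-1]
        keep_prob = min_keep + (1 - min_keep) * torch.rand(1).item()
        child = parent & (torch.rand(num_blocks) < keep_prob)
        masks.append(child)
    return masks  # ordered from largest (full net) to smallest


def group_knowledge_step(model, x, y, optimizer, temperature=4.0, kd_weight=1.0):
    """One training step: upper-level (larger) subnets supervise lower-level ones."""
    masks = sample_nested_subnets(len(model.blocks))
    logits = [model(x, m) for m in masks]

    # Hard-label loss on the full network only.
    loss = F.cross_entropy(logits[0], y)

    # Each lower-level subnet is distilled from the aggregated (averaged) logits
    # of all subnets above it in the hierarchy.
    for level in range(1, len(logits)):
        group_logits = torch.stack(logits[:level]).mean(dim=0).detach()
        soft_target = F.softmax(group_logits / temperature, dim=-1)
        log_student = F.log_softmax(logits[level] / temperature, dim=-1)
        loss = loss + kd_weight * temperature ** 2 * F.kl_div(
            log_student, soft_target, reduction="batchmean")

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    model = ResidualMLP()
    opt = torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.9)
    x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
    print(group_knowledge_step(model, x, y, opt))
```

In this sketch, averaging the logits of all larger subnets stands in for "group knowledge aggregation", and detaching the averaged logits keeps the distillation signal one-directional (upper levels supervise lower levels); how the paper actually aggregates and weights this knowledge may differ.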

Related research

05/04/2023  Stimulative Training++: Go Beyond The Performance Limits of Residual Networks
  Residual networks have shown great success and become indispensable in r...

06/01/2022  ORC: Network Group-based Knowledge Distillation using Online Role Change
  In knowledge distillation, since a single, omnipotent teacher network ca...

10/09/2022  Stimulative Training of Residual Networks: A Social Psychology Perspective of Loafing
  Residual networks have shown great success and become indispensable in t...

03/22/2022  Channel Self-Supervision for Online Knowledge Distillation
  Recently, researchers have shown an increased interest in the online kno...

03/26/2021  Hands-on Guidance for Distilling Object Detectors
  Knowledge distillation can lead to deploy-friendly networks against the ...

02/23/2023  Practical Knowledge Distillation: Using DNNs to Beat DNNs
  For tabular data sets, we explore data and model distillation, as well a...

11/28/2022  A Boosting Approach to Constructing an Ensemble Stack
  An approach to evolutionary ensemble learning for classification is prop...
