Alignahead: Online Cross-Layer Knowledge Extraction on Graph Neural Networks

05/05/2022
by Jiongyu Guo, et al.

Existing knowledge distillation methods for graph neural networks (GNNs) are almost all offline: the student model extracts knowledge from a powerful pre-trained teacher model to improve its performance. However, a pre-trained teacher is not always accessible, due to training cost, privacy concerns, etc. In this paper, we propose a novel online knowledge distillation framework to resolve this problem. Specifically, each student GNN learns the extracted local structure from a simultaneously trained counterpart in an alternating training procedure. We further develop a cross-layer distillation strategy that aligns one student layer ahead, with a layer at a different depth of the other student model, which in principle spreads the structure information over all layers. Experimental results on five datasets, including PPI, Coauthor-CS/Physics, and Amazon-Computer/Photo, demonstrate that student performance is consistently boosted in our collaborative training framework without the supervision of a pre-trained teacher model. In addition, we find that the alignahead technique accelerates model convergence, and its effectiveness generally improves as the number of students in training increases. Code is available at https://github.com/GuoJY-eatsTG/Alignahead
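To make the cross-layer idea concrete, below is a minimal sketch in PyTorch of online distillation between two student GNNs, where layer l of one student is aligned with layer l + 1 of its peer. The dense-adjacency propagation, the pairwise-similarity structure matrix, and the names StudentGCN, structure_matrix, and alignahead_loss are illustrative assumptions for this sketch, not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class StudentGCN(nn.Module):
    """A small GCN-style student that also exposes its per-layer hidden features."""

    def __init__(self, in_dim, hid_dim, out_dim, num_layers=3):
        super().__init__()
        dims = [in_dim] + [hid_dim] * (num_layers - 1) + [out_dim]
        self.layers = nn.ModuleList(
            [nn.Linear(dims[i], dims[i + 1]) for i in range(num_layers)]
        )

    def forward(self, x, adj):
        hiddens = []
        for i, layer in enumerate(self.layers):
            x = adj @ layer(x)                      # toy dense-adjacency propagation
            if i < len(self.layers) - 1:
                x = F.relu(x)
            hiddens.append(x)
        return x, hiddens


def structure_matrix(h, tau=1.0):
    """Row-normalised pairwise similarity: a crude stand-in for the local
    structure extracted at one layer."""
    sim = F.cosine_similarity(h.unsqueeze(1), h.unsqueeze(0), dim=-1)
    return F.softmax(sim / tau, dim=-1)


def alignahead_loss(hiddens_student, hiddens_peer):
    """Align layer l of one student with layer l + 1 of its peer (cyclically),
    so structure information can spread across depths over iterations."""
    loss = 0.0
    num_layers = len(hiddens_student)
    for l in range(num_layers):
        p = structure_matrix(hiddens_student[l])
        q = structure_matrix(hiddens_peer[(l + 1) % num_layers])
        loss = loss + F.kl_div(p.log(), q, reduction="batchmean")
    return loss / num_layers


# Alternating online training of two students on a toy graph (no teacher).
n, in_dim, num_classes = 64, 16, 4
x = torch.randn(n, in_dim)
adj = (torch.rand(n, n) < 0.1).float()
adj = torch.clamp(adj + adj.t() + torch.eye(n), max=1.0)   # symmetric + self-loops
adj = adj / adj.sum(dim=-1, keepdim=True)                  # row-normalise
y = torch.randint(0, num_classes, (n,))

students = [StudentGCN(in_dim, 32, num_classes) for _ in range(2)]
optimizers = [torch.optim.Adam(s.parameters(), lr=1e-2) for s in students]

for step in range(100):
    for i, (student, opt) in enumerate(zip(students, optimizers)):
        peer = students[1 - i]
        logits, h_student = student(x, adj)
        with torch.no_grad():                       # peer acts as a fixed target
            _, h_peer = peer(x, adj)
        loss = F.cross_entropy(logits, y) + 0.5 * alignahead_loss(h_student, h_peer)
        opt.zero_grad()
        loss.backward()
        opt.step()
```

Each student is updated in turn against its peer's frozen representations, mirroring the alternating, teacher-free procedure described above; the paper's actual structure-alignment loss and hyperparameters may differ.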


Related research

10/25/2022 · Online Cross-Layer Knowledge Distillation on Graph Neural Networks with Deep Supervision
08/18/2020 · Knowledge Transfer via Dense Cross-Layer Mutual-Distillation
10/29/2021 · Model Fusion of Heterogeneous Neural Networks via Cross-Layer Alignment
03/04/2021 · Extract the Knowledge of Graph Neural Networks and Go Beyond it: An Effective Knowledge Distillation Framework
05/23/2023 · NORM: Knowledge Distillation via N-to-One Representation Matching
09/06/2023 · Knowledge Distillation Layer that Lets the Student Decide
03/23/2020 · Distilling Knowledge from Graph Convolutional Networks
