Distill2Vec: Dynamic Graph Representation Learning with Knowledge Distillation

11/11/2020
by Stefanos Antaris, et al.

Dynamic graph representation learning strategies rely on different neural architectures to capture graph evolution over time. However, these architectures require a large number of trainable parameters and suffer from high online inference latency, as several model parameters must be updated whenever new data arrive online. In this study we propose Distill2Vec, a knowledge distillation strategy that trains a compact model with a small number of trainable parameters, so as to reduce online inference latency while maintaining high model accuracy. We design a distillation loss function based on the Kullback-Leibler divergence to transfer the knowledge acquired by a teacher model, trained on offline data, to a small-size student model for online data. Our experiments with publicly available datasets show the superiority of our proposed model over several state-of-the-art approaches, with relative gains of up to 5%. Moreover, we demonstrate the effectiveness of our knowledge distillation strategy in terms of the number of required parameters, where Distill2Vec achieves a compression ratio of up to 7:100 compared with baseline approaches. For reproduction purposes, our implementation is publicly available at https://stefanosantaris.github.io/Distill2Vec.
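The core mechanism described above is a teacher-student distillation loss based on the Kullback-Leibler divergence. Below is a minimal PyTorch sketch of such a loss, assuming the offline-trained teacher and the compact student produce comparable output scores; the temperature, the weighting factor alpha, and the toy tensors are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a KL-divergence-based distillation loss in the spirit of a
# teacher-student setup. Names and hyperparameters (temperature, alpha, the toy
# logits) are illustrative assumptions, not values from Distill2Vec itself.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between the softened teacher and student distributions."""
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # 'batchmean' matches the mathematical definition of KL divergence;
    # the T^2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2

# Toy usage: distill a (frozen) teacher's predictions into a small student.
teacher_logits = torch.randn(32, 16)                      # scores from the offline teacher
student_logits = torch.randn(32, 16, requires_grad=True)  # scores from the online student
task_loss = torch.tensor(0.0)                             # placeholder for the student's own task loss
alpha = 0.5                                               # balances distillation vs. task loss
loss = alpha * distillation_loss(student_logits, teacher_logits) + (1 - alpha) * task_loss
loss.backward()
```

Softening both distributions with a temperature above 1 exposes more of the teacher's relative preferences to the student, which is what lets a much smaller model approximate the teacher's behavior on online data.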

