Learning Generalizable Models for Vehicle Routing Problems via Knowledge Distillation

10/14/2022
by Jieyi Bi, et al.

Recent neural methods for vehicle routing problems typically train and test the deep models on the same instance distribution (e.g., uniform). To tackle the resulting cross-distribution generalization concerns, we bring knowledge distillation to this field and propose an Adaptive Multi-Distribution Knowledge Distillation (AMDKD) scheme for learning more generalizable deep models. In particular, AMDKD leverages the varied knowledge of multiple teachers, each trained on an exemplar distribution, to yield a lightweight yet generalist student model. Meanwhile, we equip AMDKD with an adaptive strategy that lets the student concentrate on difficult distributions, so as to absorb hard-to-master knowledge more effectively. Extensive experimental results show that, compared with the baseline neural methods, AMDKD achieves competitive results on both unseen in-distribution and out-of-distribution instances, whether randomly synthesized or drawn from benchmark datasets (i.e., TSPLIB and CVRPLIB). Notably, AMDKD is generic and consumes fewer computational resources at inference.
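For intuition, below is a minimal PyTorch sketch of the adaptive multi-teacher distillation idea. It is not the authors' implementation: the real AMDKD distills attention-based routing models, whereas the teachers here are untrained MLP placeholders; the exponential-moving-average re-weighting is a simplified stand-in for the paper's adaptive strategy; and the names `make_policy`, `sample_instances`, and the two exemplar distributions are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Tiny MLP stand-ins for the attention-based routing policies used in the
# paper; they exist only so this sketch runs end to end.
def make_policy(n_nodes=20, hidden=64):
    return nn.Sequential(
        nn.Linear(2 * n_nodes, hidden), nn.ReLU(),
        nn.Linear(hidden, n_nodes),  # logits over candidate next nodes
    )

def sample_instances(dist, batch=32, n_nodes=20):
    """Synthesize 2-D node coordinates from a named exemplar distribution."""
    if dist == "uniform":
        x = torch.rand(batch, n_nodes, 2)
    else:  # "cluster": nodes scattered around a few random centers
        centers = torch.rand(batch, 3, 2)
        idx = torch.randint(0, 3, (batch, n_nodes))
        x = centers.gather(1, idx.unsqueeze(-1).expand(-1, -1, 2))
        x = (x + 0.05 * torch.randn_like(x)).clamp(0.0, 1.0)
    return x.flatten(1)

dists = ["uniform", "cluster"]
teachers = {d: make_policy() for d in dists}  # in practice: pretrained, frozen
for t in teachers.values():
    t.eval()
student = make_policy()
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
ema = torch.ones(len(dists))  # running distillation loss per distribution

for step in range(200):
    # Adaptive strategy (simplified): sample exemplar distributions with
    # probability increasing in the student's recent loss on them, so the
    # student concentrates on distributions it has not yet mastered.
    weights = F.softmax(ema, dim=0)
    i = torch.multinomial(weights, 1).item()
    x = sample_instances(dists[i])
    with torch.no_grad():
        t_logits = teachers[dists[i]](x)
    s_logits = student(x)
    # Distill by matching the student's policy to the teacher's soft targets.
    loss = F.kl_div(F.log_softmax(s_logits, dim=-1),
                    F.softmax(t_logits, dim=-1), reduction="batchmean")
    opt.zero_grad()
    loss.backward()
    opt.step()
    ema[i] = 0.9 * ema[i] + 0.1 * loss.item()
```

The design point mirrored here is that the adaptivity lives in the sampling weights rather than the loss itself: distributions the student handles poorly are visited more often, rather than merely up-weighted within a fixed batch mix.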


