Unifying Heterogeneous Classifiers with Distillation

04/12/2019
by Jayakorn Vongkulbhisal, et al.

In this paper, we study the problem of unifying knowledge from a set of classifiers with different architectures and target classes into a single classifier, given only a generic set of unlabelled data. We call this problem Unifying Heterogeneous Classifiers (UHC). This problem is motivated by scenarios where data is collected from multiple sources, but the sources cannot share their data, e.g., due to privacy concerns, and only privately trained models can be shared. In addition, each source may not be able to gather data for all classes due to limited data availability, and may not be able to train the same classification model due to different computational resources. To tackle this problem, we propose a generalisation of knowledge distillation to merge heterogeneous classifiers (HCs). We derive a probabilistic relation between the outputs of HCs and the probability distribution over all classes. Based on this relation, we propose two classes of methods, based on cross-entropy minimisation and matrix factorisation, which allow us to estimate soft labels over all classes from unlabelled samples and use them in lieu of ground-truth labels to train a unified classifier. Our extensive experiments on the ImageNet, LSUN, and Places365 datasets show that our approaches significantly outperform a naive extension of distillation and can achieve almost the same accuracy as classifiers trained in a centralised, supervised manner.
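The core idea, estimating soft labels over the union of all classes from teachers that each cover only a subset of classes, and then distilling those labels into one student, can be illustrated with a small sketch. The snippet below is an illustrative simplification, not the authors' exact cross-entropy or matrix-factorisation estimators; the function names (softmax, combine_soft_labels) and the gradient-descent fit are assumptions introduced here for clarity.

```python
# Illustrative sketch (not the paper's exact algorithm): estimate a
# distribution over ALL classes for one unlabelled sample such that, when
# restricted and renormalised to each teacher's class subset, it matches
# that teacher's output. This is loosely in the spirit of the paper's
# cross-entropy-based estimator.
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def combine_soft_labels(teacher_probs, teacher_classes, num_classes,
                        iters=200, lr=0.5):
    """Fit logits over all classes by gradient descent on the sum of
    cross-entropies between each teacher's output and the candidate
    distribution renormalised over that teacher's classes."""
    logits = np.zeros(num_classes)
    for _ in range(iters):
        p = softmax(logits)
        grad = np.zeros(num_classes)
        for probs, classes in zip(teacher_probs, teacher_classes):
            sub = p[classes]
            q = sub / sub.sum()          # p renormalised over this teacher's classes
            g = np.zeros(num_classes)
            g[classes] = q - probs       # gradient of CE(teacher, q) w.r.t. logits
            grad += g
        logits -= lr * grad
    return softmax(logits)

# Two heterogeneous teachers over overlapping subsets of a 5-class problem.
teacher_classes = [np.array([0, 1, 2]), np.array([2, 3, 4])]
teacher_probs = [softmax(np.array([2.0, 0.5, 0.1])),
                 softmax(np.array([1.5, 0.2, 0.1]))]
soft_label = combine_soft_labels(teacher_probs, teacher_classes, num_classes=5)
print(soft_label)  # soft label over all 5 classes for this unlabelled sample
```

As described in the abstract, soft labels estimated this way over the unlabelled set would then be used in place of ground-truth labels to train the unified classifier with a standard distillation-style loss.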


