Knowledge Distillation Meets Open-Set Semi-Supervised Learning

05/13/2022
by Jing Yang et al.

Existing knowledge distillation methods mostly focus on distilling the teacher's predictions and intermediate activations. However, the structured representation, arguably one of the most critical ingredients of deep models, is largely overlooked. In this work, we propose a novel method (SRD) dedicated to distilling representational knowledge semantically from a pretrained teacher to a target student. The key idea is to leverage the teacher's classifier as a semantic critic for evaluating the representations of both teacher and student, and to distill semantic knowledge with high-order structured information over all feature dimensions. This is accomplished by introducing the notion of cross-network logits, computed by passing the student's representation into the teacher's classifier. Further, viewing the set of seen classes as a basis of the semantic space from a combinatorial perspective, we scale to unseen classes, enabling effective exploitation of widely available, arbitrary unlabeled training data. At the problem level, this establishes an interesting connection between knowledge distillation and open-set semi-supervised learning (SSL). Extensive experiments show that our SRD significantly outperforms previous state-of-the-art knowledge distillation methods on both coarse object classification and fine-grained face recognition tasks, as well as on the less studied yet practically crucial problem of binary network distillation. Under the more realistic open-set SSL settings we introduce, we reveal that knowledge distillation is generally more effective than existing Out-Of-Distribution (OOD) sample detection, and that our proposed SRD is superior to both previous distillation and SSL competitors. The source code is available at <https://github.com/jingyang2017/SRD_ossl>.
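To make the cross-network logit idea concrete, below is a minimal PyTorch-style sketch, not the authors' released implementation: it assumes a frozen teacher classifier, a hypothetical linear projection to bridge the student and teacher feature dimensions, and a standard temperature-scaled KL objective. All module names and hyperparameters are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the official SRD implementation):
# the student's representation is passed through the frozen teacher classifier,
# and the resulting "cross-network logits" are aligned with the teacher's own logits.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossNetworkLogitLoss(nn.Module):
    def __init__(self, student_dim, teacher_dim, teacher_classifier, temperature=4.0):
        super().__init__()
        # Project the student feature to the teacher's feature dimensionality
        # so it can be consumed by the teacher's classifier (assumed linear bridge).
        self.proj = nn.Linear(student_dim, teacher_dim)
        self.teacher_classifier = teacher_classifier
        for p in self.teacher_classifier.parameters():
            p.requires_grad = False  # the teacher's classifier acts as a fixed semantic critic
        self.T = temperature

    def forward(self, student_feat, teacher_logits):
        # Cross-network logits: student representation evaluated by the teacher's classifier.
        cross_logits = self.teacher_classifier(self.proj(student_feat))
        # Soft alignment of the two logit distributions (standard temperature-scaled KL).
        loss = F.kl_div(
            F.log_softmax(cross_logits / self.T, dim=1),
            F.softmax(teacher_logits / self.T, dim=1),
            reduction="batchmean",
        ) * (self.T ** 2)
        return loss
```

Because this objective requires no ground-truth labels, it can in principle be applied to unlabeled, potentially open-set samples as well, which is the connection to open-set SSL highlighted in the abstract.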


Related research

- Spot-adaptive Knowledge Distillation (05/05/2022): Knowledge distillation (KD) has become a well established paradigm for c...
- It's All in the Head: Representation Knowledge Distillation through Classifier Sharing (01/18/2022): Representation knowledge distillation aims at transferring rich informat...
- Distilling Knowledge via Knowledge Review (04/19/2021): Knowledge distillation transfers knowledge from the teacher network to t...
- Cluster-aware Semi-supervised Learning: Relational Knowledge Distillation Provably Learns Clustering (07/20/2023): Despite the empirical success and practical significance of (relational)...
- Directed Acyclic Graph Factorization Machines for CTR Prediction via Knowledge Distillation (11/21/2022): With the growth of high-dimensional sparse data in web-scale recommender...
- An Interpretable Neuron Embedding for Static Knowledge Distillation (11/14/2022): Although deep neural networks have shown well-performance in various tas...
- Robust Re-Identification by Multiple Views Knowledge Distillation (07/08/2020): To achieve robustness in Re-Identification, standard methods leverage tr...
