Black-box Few-shot Knowledge Distillation

07/25/2022
by   Dang Nguyen, et al.

Knowledge distillation (KD) is an efficient approach to transfer the knowledge from a large "teacher" network to a smaller "student" network. Traditional KD methods require a large amount of labeled training data and a white-box teacher (i.e., one whose parameters are accessible) to train a good student. However, these resources are not always available in real-world applications. The distillation process often takes place at an external party's side, where we do not have access to much data and the teacher does not disclose its parameters due to security and privacy concerns. To overcome these challenges, we propose a black-box few-shot KD method that trains the student with few unlabeled training samples and a black-box teacher. Our main idea is to expand the training set by generating a diverse set of out-of-distribution synthetic images using MixUp and a conditional variational auto-encoder. These synthetic images, along with their labels obtained from the teacher, are used to train the student. Extensive experiments show that our method significantly outperforms recent state-of-the-art (SOTA) few/zero-shot KD methods on image classification tasks. The code and models are available at: https://github.com/nphdang/FS-BBT
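The abstract describes a three-step pipeline: expand the few unlabeled images into synthetic ones, label the synthetic images by querying the black-box teacher, and train the student on the resulting pairs. The PyTorch sketch below illustrates only the MixUp branch of this idea under assumed hyperparameters and toy models; the conditional-VAE generator, the architectures, and the training schedule here are placeholders, not the authors' FS-BBT implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def mixup_images(images: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """Create synthetic (out-of-distribution) images as convex combinations
    of random pairs drawn from the small unlabeled set."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    perm = torch.randperm(images.size(0))
    return lam * images + (1.0 - lam) * images[perm]

@torch.no_grad()
def query_black_box_teacher(teacher: nn.Module, images: torch.Tensor) -> torch.Tensor:
    """The teacher is black-box: we only observe its predicted probabilities,
    never its parameters or gradients."""
    return F.softmax(teacher(images), dim=1)

def distill_step(student: nn.Module, teacher: nn.Module,
                 few_shot_images: torch.Tensor,
                 optimizer: torch.optim.Optimizer) -> float:
    """One distillation step: expand the data, label it with the teacher,
    and fit the student to the teacher's soft labels."""
    synthetic = mixup_images(few_shot_images)
    soft_labels = query_black_box_teacher(teacher, synthetic)
    log_probs = F.log_softmax(student(synthetic), dim=1)
    loss = F.kl_div(log_probs, soft_labels, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    # Stand-in models and random data, purely for illustration.
    teacher = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
    student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
    optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
    few_shot_images = torch.rand(16, 3, 32, 32)  # the "few unlabeled samples"
    for _ in range(10):
        loss = distill_step(student, teacher, few_shot_images, optimizer)
    print(f"distillation loss: {loss:.4f}")
```

The conditional-VAE branch mentioned in the abstract, which the paper uses alongside MixUp to obtain diverse synthetic images, is omitted here for brevity.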

