Few-Shot Learning of Compact Models via Task-Specific Meta Distillation

10/18/2022
by Yong Wu, et al.

We consider a new problem: few-shot learning of compact models. Meta-learning is a popular approach to few-shot learning. Previous work in meta-learning typically assumes that the model architecture used during meta-training is the same as the one used for final deployment. In this paper, we challenge this basic assumption. For final deployment, we often need the model to be small, but small models usually do not have enough capacity to adapt effectively to new tasks. Meanwhile, we often have access to a large dataset and extensive computing power during meta-training, since meta-training is typically performed on a server. We propose task-specific meta distillation, which simultaneously learns two models during meta-training: a large teacher model and a small student model. Given a new task during meta-testing, the teacher model is first adapted to this task; the adapted teacher model is then used to guide the adaptation of the student model. The adapted student model is used for final deployment. We demonstrate the effectiveness of our approach on few-shot image classification using model-agnostic meta-learning (MAML). Our proposed method outperforms other alternatives on several benchmark datasets.
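The meta-testing procedure described in the abstract (adapt the teacher first, then let the adapted teacher guide the student's adaptation) can be illustrated with a toy sketch. This is not the paper's implementation. It makes several assumptions: both models are linear softmax classifiers, the "small" student simply sees fewer input features, zero weights stand in for the meta-learned initializations, and the student's loss is a hypothetical mix (weight `alpha`) of label cross-entropy and cross-entropy against the adapted teacher's predictions. The joint meta-training of the two initializations is omitted entirely.

```python
import math

def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(weights, x):
    """Class probabilities of a linear softmax model (one weight row per class)."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])

def adapt(init, inputs, targets, lr=0.5, steps=20):
    """Inner-loop adaptation: gradient steps on cross-entropy against `targets`,
    which may be one-hot labels or soft distributions."""
    weights = [row[:] for row in init]  # leave the initialization untouched
    n = len(inputs)
    for _ in range(steps):
        grads = [[0.0] * len(row) for row in weights]
        for x, t in zip(inputs, targets):
            p = predict(weights, x)
            for c, grad_row in enumerate(grads):
                for j in range(len(grad_row)):
                    # d(cross-entropy)/d(weights[c][j]) = (p_c - t_c) * x_j
                    grad_row[j] += (p[c] - t[c]) * x[j] / n
        for row, grad_row in zip(weights, grads):
            for j in range(len(row)):
                row[j] -= lr * grad_row[j]
    return weights

def adapt_with_teacher(teacher_init, student_init, inputs, labels, alpha=0.5):
    """Adapt the teacher on the support set, then adapt the student toward a
    mix of the true labels and the adapted teacher's predictions."""
    num_classes = len(teacher_init)
    student_dim = len(student_init[0])
    one_hot = [[1.0 if c == y else 0.0 for c in range(num_classes)] for y in labels]
    teacher = adapt(teacher_init, inputs, one_hot)
    soft = [predict(teacher, x) for x in inputs]
    # Cross-entropy is linear in its target, so mixing the targets equals
    # alpha * CE(labels, p) + (1 - alpha) * CE(teacher, p).
    mixed = [[alpha * h + (1 - alpha) * s for h, s in zip(hard, soft_row)]
             for hard, soft_row in zip(one_hot, soft)]
    small_inputs = [x[:student_dim] for x in inputs]  # student sees fewer features
    student = adapt(student_init, small_inputs, mixed)
    return teacher, student

def classify(weights, x):
    probs = predict(weights, x)
    return max(range(len(probs)), key=lambda c: probs[c])

# Toy 2-way, 2-shot task: the class is mostly determined by the first feature.
support_x = [[1.0, 0.5, 0.2, -0.1], [0.8, -0.3, 0.1, 0.4],
             [-1.0, 0.2, -0.3, 0.1], [-0.9, -0.4, 0.2, -0.2]]
support_y = [0, 0, 1, 1]
teacher_init = [[0.0] * 4 for _ in range(2)]  # stand-in for a meta-learned init
student_init = [[0.0] * 2 for _ in range(2)]  # smaller model: 2 features only

teacher, student = adapt_with_teacher(teacher_init, student_init, support_x, support_y)
student_predictions = [classify(student, x[:2]) for x in support_x]
print(student_predictions)
```

Only `student` would be kept for deployment in this sketch; the teacher is needed just long enough to produce the soft targets for the new task.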

