HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning

01/11/2022
by Andrey Zhmoginov, et al.

In this work we propose HyperTransformer, a transformer-based model for few-shot learning that generates the weights of a convolutional neural network (CNN) directly from support samples. Because the dependence of the small generated CNN on a specific task is encoded by a high-capacity transformer, we effectively decouple the complexity of the large task space from the complexity of individual tasks. Our method is particularly effective for small target CNN architectures, where learning a fixed, universal, task-independent embedding is not optimal and better performance is attained when information about the task can modulate all model parameters. For larger models, we find that generating the last layer alone yields results competitive with or better than those of state-of-the-art methods while remaining end-to-end differentiable. Finally, we extend our approach to a semi-supervised regime that uses unlabeled samples in the support set to further improve few-shot performance.
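To make the idea of generating only the last layer from support samples more concrete, here is a toy NumPy sketch. It is not the authors' architecture: it assumes a single untrained self-attention layer with random parameters, support tokens formed by concatenating an embedding with a one-hot label, and per-class mean pooling of the output tokens. The names `generate_last_layer`, `classify`, and `params` are hypothetical and used only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def generate_last_layer(support_emb, support_labels, n_classes, params):
    """Map a labeled support set to the weights of a linear classifier head.

    Assumption: one self-attention layer stands in for the transformer;
    a real weight generator would be deeper and trained end to end.
    """
    one_hot = np.eye(n_classes)[support_labels]              # (n, n_classes)
    tokens = np.concatenate([support_emb, one_hot], axis=1)  # (n, d + n_classes)
    q = tokens @ params["Wq"]                                # (n, d)
    k = tokens @ params["Wk"]                                # (n, d)
    v = tokens @ params["Wv"]                                # (n, d)
    attn = softmax(q @ k.T / np.sqrt(q.shape[1]), axis=-1)   # (n, n)
    out = attn @ v                                           # (n, d)
    # Pool the output tokens per class to get one weight vector per class.
    counts = one_hot.sum(axis=0)[:, None]                    # (n_classes, 1)
    return (one_hot.T @ out) / np.maximum(counts, 1)         # (n_classes, d)

def classify(query_emb, weights):
    # The generated weights act as the final linear layer on query embeddings.
    return (query_emb @ weights.T).argmax(axis=1)

rng = np.random.default_rng(0)
d, n_classes, shots = 8, 3, 5
# Random, untrained generator parameters (illustration only).
params = {name: rng.normal(scale=0.1, size=(d + n_classes, d))
          for name in ("Wq", "Wk", "Wv")}
support_emb = rng.normal(size=(n_classes * shots, d))
support_labels = np.repeat(np.arange(n_classes), shots)

weights = generate_last_layer(support_emb, support_labels, n_classes, params)
preds = classify(rng.normal(size=(4, d)), weights)
print(weights.shape, preds.shape)
```

Every step in this sketch is a differentiable array operation, so in a trained version the generator parameters could be optimized through the query loss. Because the parameters here are random, the predictions themselves carry no meaning.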


