Log In Sign Up

On Measuring the Intrinsic Few-Shot Hardness of Datasets

by   Xinran Zhao, et al.

While advances in pre-training have led to dramatic improvements in few-shot learning of NLP tasks, there is limited understanding of what drives successful few-shot adaptation in datasets. In particular, given a new dataset and a pre-trained model, what properties of the dataset make it few-shot learnable and are these properties independent of the specific adaptation techniques used? We consider an extensive set of recent few-shot learning methods, and show that their performance across a large number of datasets is highly correlated, showing that few-shot hardness may be intrinsic to datasets, for a given pre-trained model. To estimate intrinsic few-shot hardness, we then propose a simple and lightweight metric called "Spread" that captures the intuition that few-shot learning is made possible by exploiting feature-space invariances between training and test samples. Our metric better accounts for few-shot hardness compared to existing notions of hardness, and is  8-100x faster to compute.


page 2

page 8

page 9


Few-shot learning using pre-training and shots, enriched by pre-trained samples

We use the EMNIST dataset of handwritten digits to test a simple approac...

A Baseline for Few-Shot Image Classification

Fine-tuning a deep network trained with the standard cross-entropy loss ...

Explore the Power of Dropout on Few-shot Learning

The generalization power of the pre-trained model is the key for few-sho...

ReMP: Rectified Metric Propagation for Few-Shot Learning

Few-shot learning features the capability of generalizing from a few exa...

On the Transferability of Visual Features in Generalized Zero-Shot Learning

Generalized Zero-Shot Learning (GZSL) aims to train a classifier that ca...

Few-shot Adaptation Works with UnpredicTable Data

Prior work on language models (LMs) shows that training on a large numbe...

Unsupervised Pronoun Resolution via Masked Noun-Phrase Prediction

In this work, we propose Masked Noun-Phrase Prediction (MNPP), a pre-tra...

Code Repositories


This is the github repo for EMNLP 2022 paper "On Measuring the Intrinsic Few-Shot Hardness of Datasets".

view repo