Maximum Entropy Models for Fast Adaptation

06/30/2020
by   Samarth Sinha, et al.

Deep neural networks have shown great promise on a variety of downstream tasks, but their ability to adapt to new data and tasks remains a challenging problem. A model's ability to perform few-shot adaptation to a novel task is important for the scalability and deployment of machine learning models. Recent work has shown that the learned features in a neural network follow a normal distribution [41], which results in a strong prior on the downstream task. This implicit overfitting to data from the training tasks limits the ability to generalize and adapt to unseen tasks at test time, and highlights the importance of learning task-agnostic representations from data. In this paper, we propose a regularization scheme that places a max-entropy prior on the learned features of a neural network, so that the extracted features make minimal assumptions about the training data. We evaluate our method on adaptation to unseen tasks by performing experiments in 4 distinct settings, and find that it compares favourably against multiple strong baselines across all of these experiments.
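The abstract does not spell out the regularizer, but the idea can be sketched under the Gaussian-feature assumption it cites: the differential entropy of a Gaussian is 0.5·log det(covariance) plus a constant, so a penalty that is the negative log-determinant of the batch feature covariance pushes features toward higher entropy. The function below is an illustrative sketch of that principle, not the paper's actual implementation; the name `max_entropy_penalty` and the `eps` jitter term are our own choices.

```python
import numpy as np

def max_entropy_penalty(features, eps=1e-5):
    """Negative Gaussian differential entropy of a batch of features.

    Under a Gaussian assumption, differential entropy equals
    0.5 * log det(covariance) up to an additive constant, so adding
    this penalty to the training loss encourages high-entropy
    (less concentrated) feature distributions.

    features: array of shape (batch_size, feature_dim)
    eps: jitter added to the diagonal for numerical stability
    """
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (features.shape[0] - 1)
    cov = cov + eps * np.eye(cov.shape[0])
    # slogdet is more stable than log(det(...)) for near-singular covariances
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * logdet
```

Because the penalty is the negative log-determinant, it shrinks as the feature distribution spreads out: a batch of high-variance features incurs a lower penalty than a tightly concentrated one.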


