Improving Meta-Learning Generalization with Activation-Based Early-Stopping

08/03/2022
by Simon Guiroy, et al.

Meta-learning algorithms for few-shot learning aim to train neural networks capable of generalizing to novel tasks using only a few examples. Early-stopping is critical for performance, halting model training when it reaches optimal generalization to the new task distribution. Early-stopping mechanisms in meta-learning typically rely on measuring model performance on labeled examples from a meta-validation set drawn from the training (source) dataset. This is problematic in few-shot transfer learning settings, where the meta-test set comes from a different target dataset (out-of-distribution) and can have a large distributional shift from the meta-validation set. In this work, we propose Activation-Based Early-Stopping (ABE), an alternative to validation-based early-stopping for meta-learning. Specifically, we analyze the evolution, during meta-training, of the neural activations at each hidden layer, on a small set of unlabeled support examples from a single task of the target task distribution, as this constitutes minimal and justifiably accessible information from the target problem. Our experiments show that simple, label-agnostic statistics on the activations offer an effective way to estimate how target generalization evolves over time. At each hidden layer, we characterize the activation distributions by their first- and second-order moments, which we further summarize along the feature dimensions, resulting in a compact yet intuitive characterization in a four-dimensional space. Detecting when, over training time, and at which layer, the target activation trajectory diverges from that of the source data allows us to perform early-stopping and improve generalization in a large array of few-shot transfer learning settings, across different algorithms, source datasets, and target datasets.
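The abstract is specific enough to sketch the statistics it describes. The following is a minimal PyTorch sketch, not the authors' implementation: it assumes the four-dimensional summary is (mean and std of the per-feature activation means, mean and std of the per-feature activation stds), and it swaps in a simple patience-based test (per-layer distance between target and source summaries has grown for several consecutive checkpoints) for the paper's divergence detection. All names (`layer_stats`, `activation_trajectory`, `DivergenceEarlyStopper`) are hypothetical.

```python
import torch
import torch.nn as nn

def layer_stats(acts: torch.Tensor) -> torch.Tensor:
    """Collapse one layer's activations [N, D] into a 4-dim summary.

    Assumed reading of the paper: first/second moments over the N
    examples, each then summarized (mean, std) along the D features.
    """
    mu, sigma = acts.mean(dim=0), acts.std(dim=0)  # per-feature moments, shape [D]
    return torch.stack([mu.mean(), mu.std(), sigma.mean(), sigma.std()])

@torch.no_grad()
def activation_trajectory(model: nn.Module, x: torch.Tensor, layers):
    """One point of the trajectory: the 4-dim summary at each hidden layer."""
    captured = []
    hooks = [m.register_forward_hook(
                 lambda _m, _in, out: captured.append(layer_stats(out.flatten(1))))
             for m in layers]  # flatten(1): treat conv feature maps as flat features
    model(x)
    for h in hooks:
        h.remove()
    return captured  # list of [4] tensors, one per layer

class DivergenceEarlyStopper:
    """Hypothetical stopping rule: stop once, at any layer, the distance
    between the target and source summaries has grown for `patience`
    consecutive checkpoints, i.e. the two trajectories are drifting apart."""

    def __init__(self, patience: int = 3):
        self.patience, self.prev, self.streaks = patience, None, None

    def should_stop(self, src_stats, tgt_stats) -> bool:
        dists = torch.stack([torch.dist(s, t)
                             for s, t in zip(src_stats, tgt_stats)])
        if self.prev is None:
            self.prev, self.streaks = dists, torch.zeros_like(dists)
            return False
        self.streaks = torch.where(dists > self.prev,
                                   self.streaks + 1,
                                   torch.zeros_like(dists))
        self.prev = dists
        return bool((self.streaks >= self.patience).any())
```

A meta-training loop would call `activation_trajectory` at each checkpoint, once on a source meta-validation batch and once on the unlabeled target support set (with `model.eval()` so normalization statistics stay frozen), feed both summaries to `should_stop`, and restore the checkpoint preceding the detected divergence.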
