Prioritized training on points that are learnable, worth learning, and not yet learned

07/06/2021
by   Sören Mindermann, et al.

We introduce Goldilocks Selection, a technique for faster model training that selects a sequence of training points that are "just right". We propose an information-theoretic acquisition function – the reducible validation loss – and compute it with a small proxy model – GoldiProx – to efficiently choose training points that maximize information about a validation set. We show that the "hard" (e.g. high-loss) points usually selected in the optimization literature are typically noisy, while the "easy" (e.g. low-noise) points often prioritized for curriculum learning confer less information. Further, points with uncertain labels, typically targeted by active learning, tend to be less relevant to the task. In contrast, Goldilocks Selection chooses points that are "just right" and empirically outperforms all three approaches. Moreover, the selected sequence transfers to other architectures, so practitioners can share and reuse it without recreating it.
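The selection idea can be sketched in a few lines. The intuition: a point's loss under the current model measures how "not yet learned" it is, but high loss alone also flags noisy or irrelevant points. Subtracting the loss a small proxy model (trained on held-out data) assigns to the same point discounts that irreducible part, leaving an estimate of reducible loss. This is a minimal illustrative sketch, not the paper's implementation; the function name, the array-based interface, and the assumption that per-point losses are precomputed are all ours.

```python
import numpy as np

def select_batch(train_losses, proxy_losses, k):
    """Rank candidate points by reducible loss and return the top-k indices.

    train_losses: per-point loss under the model being trained
                  (high = not yet learned, but possibly just noisy).
    proxy_losses: per-point loss under a small proxy model trained on
                  held-out data (high = irreducible, e.g. label noise).
    The difference is large only for points that are learnable,
    informative, and not yet learned.
    """
    reducible = np.asarray(train_losses) - np.asarray(proxy_losses)
    # Sort descending and keep the k points with highest reducible loss.
    return np.argsort(reducible)[::-1][:k]
```

For example, a mislabeled point has high loss under both models, so its reducible loss is small and it is skipped, while a clean, unlearned point keeps a large gap and is selected.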

Related research

- 06/14/2022 · Prioritized Training on Points that are Learnable, Worth Learning, and Not Yet Learnt
- 06/22/2021 · A Practical Unified Notation for Information-Theoretic Quantities in ML
- 06/26/2019 · Selection Via Proxy: Efficient Data Selection For Deep Learning
- 08/01/2022 · Unifying Approaches in Data Subset Selection via Fisher Information and Information-Theoretic Quantities
- 03/09/2021 · Active Testing: Sample-Efficient Model Evaluation
- 04/29/2021 · Selecting the Points for Training using Graph Centrality
- 04/10/2022 · Information-theoretic Online Memory Selection for Continual Learning
