A Study on the Predictability of Sample Learning Consistency

07/07/2022
by Alain Raymond-Saez, et al.

Curriculum Learning is a powerful training method that can yield faster and better training in some settings. It requires, however, a notion of which examples are difficult and which are easy, which is not always trivial to provide. A recent metric called C-Score acts as a proxy for example difficulty by relating it to learning consistency. Unfortunately, this metric is quite compute-intensive to obtain, which limits its applicability to alternative datasets. In this work, we train models through different methods to predict C-Score for CIFAR-100 and CIFAR-10. We find, however, that these models generalize poorly, both within the same distribution and out of distribution. This suggests that C-Score is not defined by the individual characteristics of each sample but rather by other factors. We hypothesize that a sample's relation to its neighbours, in particular how many of them share its label, can help explain C-Scores. We plan to explore this in future work.
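To make the neighbourhood hypothesis concrete, here is a minimal sketch of one way such a score could be computed: for each sample, the fraction of its k nearest neighbours (in some embedding space) that share its label. This is an illustration of the idea, not the paper's method; the helper name knn_label_agreement, the choice of k, the use of scikit-learn, and the random stand-in data are all assumptions.

import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_label_agreement(features: np.ndarray, labels: np.ndarray, k: int = 10) -> np.ndarray:
    """Return, per sample, the fraction of its k nearest neighbours
    (excluding the sample itself) that carry the same label."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(features)
    _, idx = nn.kneighbors(features)        # idx[:, 0] is the sample itself
    neighbour_labels = labels[idx[:, 1:]]   # shape: (n_samples, k)
    return (neighbour_labels == labels[:, None]).mean(axis=1)

# Random data standing in for (embedded) CIFAR images:
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 64))     # hypothetical 64-d embeddings
y = rng.integers(0, 10, size=1000)  # hypothetical labels for 10 classes
scores = knn_label_agreement(X, y, k=10)
print(scores[:5])

Under this hypothesis, samples whose neighbours mostly share their label would receive high scores (consistently learned), while samples surrounded by differently-labelled neighbours would receive low ones.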

