Does Learning Require Memorization? A Short Tale about a Long Tail

by Vitaly Feldman, et al.

State-of-the-art results on image recognition tasks are achieved using over-parameterized learning algorithms that (nearly) perfectly fit the training set. This phenomenon is referred to as data interpolation or, informally, as memorization of the training data. The question of why such algorithms generalize well to unseen data is not adequately addressed by the standard theoretical frameworks and, as a result, significant theoretical and experimental effort has been devoted to understanding the properties of such algorithms. We provide a simple and generic model for prediction problems in which interpolating the dataset is necessary for achieving close-to-optimal generalization error. The model is motivated and supported by the results of several recent empirical works. In our model, data is sampled from a mixture of subpopulations and the frequencies of these subpopulations are chosen from some prior. The model makes it possible to quantify the effect of not fitting the training data on the generalization performance of the learned classifier and demonstrates that memorization is necessary whenever frequencies are long-tailed. Image and text data are known to follow such distributions and therefore our results establish a formal link between these empirical phenomena. To the best of our knowledge, this is the first general framework that demonstrates statistical benefits of plain memorization for learning. Our results also have concrete implications for the cost of ensuring differential privacy in learning.
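The mixture-of-subpopulations idea in the abstract can be illustrated with a small simulation. The sketch below is not the paper's construction: it assumes a fixed Zipf-like frequency vector in place of frequencies drawn from a prior, and it assumes (purely for illustration) that a learner which refuses to fit a subpopulation seen exactly once in training mispredicts that entire subpopulation. The function names `zipf_frequencies` and `singleton_mass` and all parameter values are hypothetical.

```python
import random
from collections import Counter


def zipf_frequencies(num_subpops, exponent=1.0):
    """Return normalized Zipf-like (long-tailed) subpopulation frequencies.

    Assumption: a fixed power law stands in for frequencies drawn from a prior.
    """
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_subpops + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def singleton_mass(freqs, n_samples, rng):
    """Sample n_samples subpopulation indices according to freqs.

    Returns a pair: the number of subpopulations that appear exactly once
    in the sample, and the total true frequency of those subpopulations.
    Under the illustrative assumption stated in the lead-in, the second
    value is the test error added by not fitting singleton examples.
    """
    draws = rng.choices(range(len(freqs)), weights=freqs, k=n_samples)
    counts = Counter(draws)
    singletons = [s for s, c in counts.items() if c == 1]
    return len(singletons), sum(freqs[s] for s in singletons)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable

    # Long-tailed case: many rare subpopulations.
    long_tail = zipf_frequencies(5000)
    print("long tail:", singleton_mass(long_tail, 1000, rng))

    # Contrast: a few equally frequent subpopulations.
    uniform = [0.1] * 10
    print("uniform:  ", singleton_mass(uniform, 1000, rng))
```

In the long-tailed setting a sizable number of subpopulations are represented by a single training example, so ignoring those examples has a measurable cost; with a handful of equally frequent subpopulations, singletons essentially never occur and fitting them is irrelevant.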



What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation

Deep learning algorithms are well-known to have a propensity for fitting...

Adaptive Statistical Learning with Bayesian Differential Privacy

In statistical learning, a dataset is often partitioned into two parts: ...

Towards Interpreting and Mitigating Shortcut Learning Behavior of NLU models

Recent studies indicate that NLU models are prone to rely on shortcut fe...

Reconstructing Training Data from Trained Neural Networks

Understanding to what extent neural networks memorize training data is a...

Privacy-preserving Prediction

Ensuring differential privacy of models learned from sensitive user data...

Privacy Preserving Recalibration under Domain Shift

Classifiers deployed in high-stakes real-world applications must output ...

Towards Differential Relational Privacy and its use in Question Answering

Memorization of the relation between entities in a dataset can lead to p...