Unifying Grokking and Double Descent

03/10/2023
by Xander Davies, et al.

A principled understanding of generalization in deep learning may require unifying disparate observations under a single conceptual framework. Previous work has studied grokking, a training dynamic in which a sustained period of near-perfect training performance and near-chance test performance is eventually followed by generalization, as well as the superficially similar phenomenon of double descent. To date, these phenomena have been studied in isolation. We hypothesize that grokking and double descent can be understood as instances of the same learning dynamics within a framework of pattern learning speeds. We propose that this framework also applies when varying model capacity instead of optimization steps, and we provide the first demonstration of model-wise grokking.
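As an illustration of the grokking dynamic described above, the sketch below shows the kind of setup in which it is commonly reported: a small network trained on modular addition with a limited training fraction and strong weight decay, logging train and test accuracy so that a long plateau of near-perfect training accuracy with near-chance test accuracy, followed by late generalization, can be observed. This is a minimal sketch, not the authors' code; the AddMLP architecture, modulus, split fraction, and hyperparameters are illustrative assumptions rather than values from the paper.

```python
# Minimal grokking-style experiment sketch (illustrative assumptions, not the paper's setup):
# train a small MLP on (a + b) mod P with a small training split and strong weight decay,
# and log train/test accuracy to watch for delayed generalization.
import torch
import torch.nn as nn

P = 97                      # modulus for the modular addition task (assumed)
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# Build the full (a, b) -> (a + b) mod P dataset and split off a small training fraction.
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))   # shape (P*P, 2)
labels = (pairs[:, 0] + pairs[:, 1]) % P
perm = torch.randperm(len(pairs))
n_train = int(0.3 * len(pairs))                                  # assumed split fraction
train_idx, test_idx = perm[:n_train], perm[n_train:]

class AddMLP(nn.Module):
    """Embed both operands, concatenate, and classify the sum mod P (hypothetical model)."""
    def __init__(self, d=128):
        super().__init__()
        self.embed = nn.Embedding(P, d)
        self.net = nn.Sequential(nn.Linear(2 * d, 256), nn.ReLU(), nn.Linear(256, P))

    def forward(self, x):
        e = self.embed(x)                        # (batch, 2, d)
        return self.net(e.flatten(start_dim=1))  # (batch, P) logits

model = AddMLP().to(DEVICE)
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)  # strong weight decay
loss_fn = nn.CrossEntropyLoss()

def accuracy(idx):
    """Accuracy of the current model on the examples indexed by idx."""
    with torch.no_grad():
        logits = model(pairs[idx].to(DEVICE))
        return (logits.argmax(-1) == labels[idx].to(DEVICE)).float().mean().item()

for step in range(50_000):   # long run: test accuracy may only rise long after train accuracy saturates
    batch = train_idx[torch.randint(len(train_idx), (512,))]
    logits = model(pairs[batch].to(DEVICE))
    loss = loss_fn(logits, labels[batch].to(DEVICE))
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        print(f"step {step:6d}  train acc {accuracy(train_idx):.3f}  test acc {accuracy(test_idx):.3f}")
```

In runs of this kind, the gap between the two logged curves is what makes grokking visible: training accuracy saturates early while test accuracy stays near 1/P for many steps before climbing.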


Related research

12/06/2021: Multi-scale Feature Learning Dynamics: Insights for Double Descent
12/04/2019: Deep Double Descent: Where Bigger Models and More Data Hurt
08/26/2021: When and how epochwise double descent happens
02/26/2023: Can we avoid Double Descent in Deep Neural Networks?
07/27/2021: On the Role of Optimization in Double Descent: A Least Squares Study
05/31/2022: VC Theoretical Explanation of Double Descent
06/07/2021: Double Descent and Other Interpolation Phenomena in GANs
