
Grokking phase transitions in learning local rules with gradient descent

by Bojan Žunkovič et al.

We discuss two solvable grokking (generalisation beyond overfitting) models in a rule-learning scenario. We show that grokking is a phase transition and find exact analytic expressions for the critical exponents, the grokking probability, and the grokking-time distribution. Further, we introduce a tensor-network map that connects the proposed grokking setup with standard (perceptron) statistical learning theory and show that grokking is a consequence of the locality of the teacher model. As an example, we analyse the cellular-automata learning task: we numerically determine the critical exponent and the grokking-time distributions and compare them with the predictions of the proposed grokking model. Finally, we numerically analyse the connection between structure formation and grokking.
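To make the rule-learning setup concrete, the sketch below generates teacher data from an elementary cellular automaton: each cell's next state is a *local* function of its 3-cell neighbourhood, which is exactly the kind of local teacher rule the abstract refers to. This is an illustrative minimal example (the specific rule number, lattice size, and data shapes are assumptions, not taken from the paper):

```python
import numpy as np

def ca_step(state, rule=30):
    """One step of an elementary cellular automaton (periodic boundaries).

    Each cell's next value is the teacher's local rule applied to its
    3-cell neighbourhood; `rule` is the Wolfram rule number (0..255).
    """
    left = np.roll(state, 1)
    right = np.roll(state, -1)
    idx = 4 * left + 2 * state + right   # encode neighbourhood as 0..7
    table = (rule >> np.arange(8)) & 1   # the rule's 8-entry lookup table
    return table[idx]

# Teacher data for the rule-learning task: random binary configurations
# and their one-step images under the local rule.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(1000, 64))
Y = np.array([ca_step(x) for x in X])
```

A student network trained on `(X, Y)` pairs would then be monitored for the delayed jump in test accuracy (grokking) that the paper analyses.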




Related articles:

- Emergence of a finite-size-scaling function in the supervised learning of the Ising phase transition
- Exact Phase Transitions in Deep Learning
- Modeling self-organizing traffic lights with elementary cellular automata
- Phase Transitions in Community Detection: A Solvable Toy Model
- The Hidden-Manifold Hopfield Model and a learning phase transition
- Exact enumeration of satisfiable 2-SAT formulae
- Resolving Molecular Contributions of Ion Channel Noise to Interspike Interval Variability through Stochastic Shielding