Grokking phase transitions in learning local rules with gradient descent

10/26/2022
by Bojan Žunkovič, et al.

We discuss two solvable grokking (generalisation beyond overfitting) models in a rule learning scenario. We show that grokking is a phase transition and find exact analytic expressions for the critical exponents, grokking probability, and grokking time distribution. Further, we introduce a tensor-network map that connects the proposed grokking setup with the standard (perceptron) statistical learning theory and show that grokking is a consequence of the locality of the teacher model. As an example, we analyse the cellular automata learning task, numerically determine the critical exponent and the grokking time distributions and compare them with the prediction of the proposed grokking model. Finally, we numerically analyse the connection between structure formation and grokking.
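To make the notion of a local teacher concrete, the following is a minimal sketch of how a cellular-automaton rule-learning dataset might be generated. It is an illustration under stated assumptions, not the authors' setup: the use of elementary (three-cell neighbourhood) automata, periodic boundaries, the choice of the centre cell as the label, and the helper names `eca_step` and `make_dataset` are all hypothetical.

```python
import numpy as np


def eca_step(state: np.ndarray, rule: int) -> np.ndarray:
    """Apply one step of an elementary cellular automaton (periodic boundaries).

    Assumption: Wolfram rule numbering, where bit k of `rule` gives the output
    for the neighbourhood whose bits (left, centre, right) encode the integer k.
    """
    table = np.array([(rule >> k) & 1 for k in range(8)], dtype=np.uint8)
    left = np.roll(state, 1)    # left[i] == state[i - 1]
    right = np.roll(state, -1)  # right[i] == state[i + 1]
    idx = 4 * left + 2 * state + right
    return table[idx]


def make_dataset(n_samples: int, n_cells: int, rule: int, seed: int = 0):
    """Random bit strings labelled by a local teacher (hypothetical choice).

    The label is the value of the centre cell after one automaton step, so it
    depends only on three neighbouring input bits.
    """
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=(n_samples, n_cells), dtype=np.uint8)
    centre = n_cells // 2
    y = np.array([eca_step(row, rule)[centre] for row in x], dtype=np.uint8)
    return x, y


if __name__ == "__main__":
    x, y = make_dataset(n_samples=8, n_cells=11, rule=30)
    for row, label in zip(x, y):
        print("".join(map(str, row)), "->", label)
```

In this sketch the label ignores every input bit outside the three-cell window around the centre, which is the sense in which the teacher is local; a student trained with gradient descent on such data would have to discover which inputs are relevant.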

